In [1]:
from introdl.utils import wrap_print_text

print = wrap_print_text(print) # overload print to wrap text

## OpenAI API Demonstration

In this notebook we demonstrate how to access ChatGPT models programmatically through OpenAI's Python API.  If you want to experiment with the API, you'll need to sign up for an API account and pay for some credits.  It's really cheap, so I encourage you to play around a bit.  The prices for recent models are given per 1 million tokens in the table below.

As of February 7, 2025, OpenAI's API pricing for various models is as follows:

| Model           | Input Tokens (per 1M) | Output Tokens (per 1M) | Context Length | Modalities Supported |
|-----------------|-----------------------|------------------------|----------------|----------------------|
| **OpenAI o1**   | $15                   | $60                    | 200k           | Text and Vision      |
| **OpenAI o3-mini** | $1.10               | $4.40                  | 200k           | Text                 |
| **GPT-4o**      | $2.50                 | $10                    | 128k           | Text and Vision      |
| **GPT-4o mini** | $0.15                 | $0.60                  | 128k           | Text and Vision      |

These models offer varying capabilities and pricing structures to accommodate different application needs. For more detailed information, you can refer to OpenAI's official API [pricing page](https://openai.com/api/pricing/). 
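
To get a feel for what these prices mean in practice, here is a minimal sketch that estimates the cost of a single request from the table above. The `PRICES_PER_MILLION` dictionary and the `estimate_cost` helper are illustrative names, not part of the OpenAI library, and the prices are the February 7, 2025 figures, which may have changed since.

```python
# Prices in US dollars per 1 million tokens, copied from the table above
# (as of February 7, 2025; check the pricing page for current values).
PRICES_PER_MILLION = {
    "o1": {"input": 15.00, "output": 60.00},
    "o3-mini": {"input": 1.10, "output": 4.40},
    "gpt-4o": {"input": 2.50, "output": 10.00},
    "gpt-4o-mini": {"input": 0.15, "output": 0.60},
}


def estimate_cost(model, input_tokens, output_tokens):
    """Return the estimated cost in dollars for one request (hypothetical helper)."""
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Example: a 1,000-token prompt with a 500-token reply
for model in PRICES_PER_MILLION:
    print(f"{model}: ${estimate_cost(model, 1_000, 500):.6f}")
```

With these numbers, the example request costs a small fraction of a cent on GPT-4o mini and less than five cents even on o1.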

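We'll be learning more about tokens in the coming weeks, but for now think of them as the small chunks of text (whole words, pieces of words, or punctuation) that the model reads and writes, each represented by an integer ID.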

The approximate ratio of words to tokens in English text varies depending on the complexity and style of the text. However, a commonly used estimate is:

- **1 word ≈ 1.3 to 1.5 tokens** for general English text.

This means that for every **1,000 words**, you can expect **1,300 to 1,500 tokens**. 
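
The rule of thumb above is simple arithmetic, so it can be sketched in a couple of lines. The `estimate_token_range` function is a hypothetical helper for illustration, and the 1.3 and 1.5 ratios are only the rough estimates quoted above, not exact values.

```python
def estimate_token_range(word_count, low_ratio=1.3, high_ratio=1.5):
    """Return a rough (low, high) token estimate for general English text."""
    return round(word_count * low_ratio), round(word_count * high_ratio)


low, high = estimate_token_range(1_000)
print(f"1,000 words is roughly {low:,} to {high:,} tokens")
```

For an exact count you would tokenize the actual text, since the true ratio depends on the content.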

### Factors affecting the ratio:

1. **Shorter words (e.g., "a", "is", "the")** tend to be single tokens.
2. **Longer words (e.g., "transformative", "neuroscientific")** may be split into multiple tokens.
3. **Punctuation and special characters** (e.g., `!`, `?`, `--`) often count as separate tokens.
4. **Code, URLs, and non-standard text** typically have a higher token-to-word ratio.

For OpenAI models like GPT, you can test this with `tiktoken`:

In [2]:
import tiktoken

encoding = tiktoken.encoding_for_model("gpt-4o")
text = "This is a simple example sentence."
tokens = encoding.encode(text)
print(f"Word count: {len(text.split())}, Token count: {len(tokens)}")


Word count: 6, Token count: 7


### Understanding the System Prompt and the User Prompt

When interacting with the OpenAI API, two key types of prompts influence the model's responses: **the system prompt** and **the user prompt**.

1. **System Prompt**  
   - The system prompt is a message that sets the behavior and tone of the AI before the conversation begins.
   - It provides instructions about how the AI should respond throughout the session.
   - Example:  
     ```json
     {"role": "system", "content": "You are a helpful and concise assistant."}
     ```
   - This helps guide the AI’s responses consistently across different user inputs.

2. **User Prompt**  
   - The user prompt is the actual message sent by the user to request information or perform a task.
   - This is the main input that drives the AI’s response.
   - Example:  
     ```json
     {"role": "user", "content": "Explain black holes in simple terms."}
     ```
   - The AI will generate a response based on both the user’s request and the system prompt’s instructions.

The system prompt **shapes** how the AI responds, while the user prompt **directs** the AI on what to answer.

Unlike a chatbot such as ChatGPT, the OpenAI API (like other large language model APIs) is stateless: it doesn't remember our previous interactions.  That means we need to send the system prompt and user prompt to the model with every request.  If we're creating our own chatbot, we'll also need to send the conversation history each time.
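
Because the API is stateless, a chatbot has to rebuild the full message list on every turn. The sketch below shows one way to do that with plain Python lists. No API call is made here: `build_messages` is a hypothetical helper, and the `reply` string is a placeholder standing in for a real model response.

```python
def build_messages(system_prompt, history, user_prompt):
    """Assemble the full message list to send with a single request."""
    return (
        [{"role": "system", "content": system_prompt}]
        + history
        + [{"role": "user", "content": user_prompt}]
    )


system_prompt = "You are a helpful assistant."
history = []  # earlier user/assistant turns, oldest first

# First turn: only the system prompt and the new user prompt are sent
first_question = "Name one planet."
messages = build_messages(system_prompt, history, first_question)
# ... send `messages` to the model here and capture its reply ...
reply = "Mars."  # placeholder, not real model output

# Record both sides of the exchange so the next request can include them
history.append({"role": "user", "content": first_question})
history.append({"role": "assistant", "content": reply})

# Second turn: the model only "remembers" the first turn because we resend it
messages = build_messages(system_prompt, history, "How far away is it?")
for message in messages:
    print(message["role"], "->", message["content"])
```

Since the history is resent with every request, the number of input tokens (and therefore the cost) grows as the conversation gets longer.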

### Accessing the OpenAI API

To access the OpenAI API you'll need to create an account and buy some credits.  Once your account is set up, you'll need to create an API key.  You generally don't want to share that key in a document such as this one, so you can set it as an environment variable on your system instead.  If you want to play with the API in this class, you can add your API key to the file Lessons/Course_Tools/api_keys.env.  Copy it there without quotes and then run the cell below as we've done in other notebooks.  `config_paths_keys` checks your environment variables for `OPENAI_API_KEY` and, if it is not present, sets the value from `api_keys.env`.

In [4]:
from introdl.utils import config_paths_keys

config_paths_keys();

# or you could do this
# import os
# os.environ["OPENAI_API_KEY"] = "your-api-key-here"
# but you're encouraged to use the env file so the key never appears in the notebook

MODELS_PATH=C:\Users\bagge\My Drive\Python_Projects\DS776_Develop_Project\models
DATA_PATH=C:\Users\bagge\My Drive\Python_Projects\DS776_Develop_Project\data
TORCH_HOME=C:\Users\bagge\My Drive\Python_Projects\DS776_Develop_Project\downloads
HF_HOME=C:\Users\bagge\My Drive\Python_Projects\DS776_Develop_Project\downloads
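
The "environment variable first, env file second" behavior described above can be sketched as follows. This is not the actual implementation of `config_paths_keys`; `load_key_from_env_file` is a hypothetical helper, and it assumes the env file holds simple `KEY=VALUE` lines.

```python
import os
from pathlib import Path


def load_key_from_env_file(env_path, name="OPENAI_API_KEY"):
    """Set `name` from a KEY=VALUE file unless it is already in the environment.

    Returns "environment", "file", or "missing" to report where the key came from.
    """
    if os.environ.get(name):
        return "environment"  # an existing variable takes priority
    path = Path(env_path)
    if not path.exists():
        return "missing"
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blank lines, comments, and malformed lines
        key, value = line.split("=", 1)
        if key.strip() == name:
            os.environ[name] = value.strip()
            return "file"
    return "missing"


# Report the source only, never the key itself
source = load_key_from_env_file("Lessons/Course_Tools/api_keys.env")
print(f"OPENAI_API_KEY source: {source}")
```

Printing where the key came from, and not the key, keeps the secret out of the notebook output.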


In [5]:
import os
from openai import OpenAI
import pandas as pd

client = OpenAI()

# Define a system prompt and a user prompt
system_prompt = "You are a helpful assistant."
user_prompt = "What are three interesting facts about space?"

# Call GPT-4o-mini with the latest API format
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt}
    ],
    max_tokens=None  # no explicit cap (the model's own output limit still applies); set an integer to limit length
)

# Display the response
print("GPT-4o-mini Response:")
print(response.choices[0].message.content)



GPT-4o-mini Response:
Here are three interesting facts about space:

1. **The Universe is Expanding**: One of the most fascinating discoveries in
modern astronomy is that the universe is expanding. Observations by astronomers,
notably those of Edwin Hubble in the 1920s, showed that galaxies are moving away
from us in all directions. This implies that the universe was once much smaller
and denser, leading to the Big Bang theory, which explains the origin of the
universe.

2. **Time Dilation**: According to Einstein's theory of relativity, time is not
absolute; it can be affected by gravity and speed. This phenomenon, known as
time dilation, means that time moves slower in stronger gravitational fields or
at higher speeds. For instance, astronauts on the International Space Station
age slightly slower than people on Earth due to their high orbital velocity and
the weaker gravitational field at that altitude.

3. **It’s Mostly Empty**: Despite the vastness of space, most of it is
incredib