## Introduction

In this notebook, we'll explore how to use OpenAI's Large Language Models (LLMs) through the OpenAI API.

The **OpenAI API** gives developers access to state-of-the-art LLMs from Python code.

Currently, OpenAI has 2 flagship models:
1. **GPT-4o** - the most capable model, with stronger reasoning.
2. **GPT-4o Mini** - the cheapest and fastest model, but less "smart".

You should use GPT-4o when:
- You need strong reasoning (logical or analytical tasks).
- You build AI solutions with AI agents.
- Slower responses are not a problem.

Otherwise, GPT-4o Mini is probably a better choice.

In this tutorial, we'll use only GPT-4o Mini. But I'll show you how to use GPT-4o too.

Using the models through the API is quite straightforward. It also has advantages over tools such as ChatGPT:
- Access to model parameters.
- Access to the system prompt.
- Ability to connect models.

So it gives you more customization and control.

In this notebook, we'll go through the following topics:
- Using GPT-4o and GPT-4o Mini via OpenAI API.
- The importance of the system prompt.
- Streaming responses.
- A detailed explanation of tokens.
- Practical applications of temperature.

And more!

To successfully run the notebook, you need to install several packages:
- **OpenAI API**: `openai` - the library to use OpenAI models via API calls.
- **Python Dotenv**: `python-dotenv` - to load secret variables from the `.env` file.
- **Tiktoken**: `tiktoken` - for counting tokens.

To install them, run the following command in your terminal:

```bash
$ pip install openai python-dotenv tiktoken
```
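
Before moving on, it can help to confirm that the packages are importable in the current environment. The sketch below is only an illustration: `missing_modules` is a hypothetical helper name, not part of any of these libraries. Note that `python-dotenv` is imported under the name `dotenv`.

```python
from importlib.util import find_spec


def missing_modules(module_names):
    """Return the names from module_names that cannot be found for import."""
    return [name for name in module_names if find_spec(name) is None]


# Import names of the three packages installed above.
# python-dotenv is imported under the name "dotenv".
required = ["openai", "dotenv", "tiktoken"]

missing = missing_modules(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```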

OK, let's move on to the coding part!


### Loading API keys

To make OpenAI API calls, we need a secret key.

I usually save the key in a `.env` file. Here's how it looks:

`OPENAI_API_KEY=sk-proj-your-actual-key-here`

*Note: I show you step-by-step how to do it in [this article](https://medium.com/ai-advances/how-to-start-your-first-ai-project-with-python-and-openai-api-ae116627a2e7?sk=d63a5157f7124d4501229a2a4b51079c)*.

Then, I load it using the `python-dotenv` library like this:

In [1]:
from dotenv import load_dotenv

load_dotenv()

True
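
To double-check that the key itself is available, we can look it up in the environment without printing its value. This is a minimal sketch: `api_key_status` is a hypothetical helper name, and the optional `env` parameter exists only so the function can be tried with a plain dictionary.

```python
import os


def api_key_status(name="OPENAI_API_KEY", env=None):
    """Report whether a variable is set, without revealing its value."""
    env = os.environ if env is None else env
    value = env.get(name, "")
    if not value:
        return f"{name} is not set"
    return f"{name} is set ({len(value)} characters)"


# Checks the real environment of the running notebook.
print(api_key_status())
```

If this reports that the key is not set, check that the `.env` file sits in the folder you run the notebook from and that the variable name matches exactly.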

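### Initialize the OpenAI client

To work with the OpenAI API, we need an instance of the `OpenAI` class. By default, the client reads the key from the `OPENAI_API_KEY` environment variable, which is why we loaded the `.env` file first. The common practice is to create it this way: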

`client = OpenAI()`

In [2]:
from openai import OpenAI

client = OpenAI()

### Test with the simplest completion

Let's run this simple code to check that everything works correctly:

In [3]:
completion = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is the capital of Poland?"}]
)

response = completion.choices[0].message.content
print(response)

The capital of Poland is Warsaw.


Awesome! We just saw the GPT-4o Mini response!

This means we can successfully make calls to the OpenAI API.

If you want to switch to GPT-4o, set `model="gpt-4o"`. Here's how:

In [7]:
completion = client.chat.completions.create(
    model="gpt-4o", # change the model here
    messages=[{"role": "user", "content": "What is the capital of Poland?"}]
)

response = completion.choices[0].message.content
print(response)

The capital of Poland is Warsaw.


Here are [all models](https://platform.openai.com/docs/models) available through the OpenAI API.

Now, let's have a closer look at the `completion`.

### Showing the `completion`

To see the response, we had to "dig" into `completion.choices[0].message.content`.

But the completion itself is a `ChatCompletion` object.

Let's have a look.

In [4]:
print(completion)

ChatCompletion(id='chatcmpl-A0lqcWTfnlX0SIeu2CyvA1ptx9tlM', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='The capital of Poland is Warsaw.', refusal=None, role='assistant', function_call=None, tool_calls=None))], created=1724747290, model='gpt-4o-mini-2024-07-18', object='chat.completion', service_tier=None, system_fingerprint='fp_48196bc67a', usage=CompletionUsage(completion_tokens=7, prompt_tokens=16, total_tokens=23))


We can see it's a `ChatCompletion` object returned by the OpenAI library.

But let's print it in a nicer way.

First, we need a helper function for that.

*Note: The function is here only to display the `ChatCompletion` object in a readable way. It has nothing to do with AI itself.*

Helper function:

In [5]:
import json

def serialize_completion(completion):
    if isinstance(completion, dict):
        return {key: serialize_completion(value) for key, value in completion.items()}
    elif isinstance(completion, list):
        return [serialize_completion(item) for item in completion]
    elif hasattr(completion, '__dict__'):
        return serialize_completion(vars(completion))
    else:
        return completion
    
def print_chat_completion(response_dict):
    formatted_json = json.dumps(response_dict, indent=4)
    print(formatted_json)
    
def serialize_and_print_completion(completion):
    completion_json = serialize_completion(completion)
    print_chat_completion(completion_json)

Printing the completion:

In [6]:
serialize_and_print_completion(completion)

{
    "id": "chatcmpl-A0lqcWTfnlX0SIeu2CyvA1ptx9tlM",
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "logprobs": null,
            "message": {
                "content": "The capital of Poland is Warsaw.",
                "refusal": null,
                "role": "assistant",
                "function_call": null,
                "tool_calls": null
            }
        }
    ],
    "created": 1724747290,
    "model": "gpt-4o-mini-2024-07-18",
    "object": "chat.completion",
    "service_tier": null,
    "system_fingerprint": "fp_48196bc67a",
    "usage": {
        "completion_tokens": 7,
        "prompt_tokens": 16,
        "total_tokens": 23
    }
}


We can see the `ChatCompletion` object holds more information, such as:
- The creation time of the response.
- The specific model we used.
- The token usage.

And more.
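
These fields can also be read directly as attributes, the same way we read `choices[0].message.content`. The sketch below uses a stand-in object built with `SimpleNamespace` from the values printed above, so it runs without an API call. With a real response, you would pass `completion` instead. The `summarize_completion` helper is a hypothetical name used only for illustration.

```python
from datetime import datetime, timezone
from types import SimpleNamespace


def summarize_completion(completion):
    """Collect a few metadata fields from a ChatCompletion-like object."""
    return {
        # `created` is a Unix timestamp in seconds; convert it to a UTC datetime.
        "created": datetime.fromtimestamp(completion.created, tz=timezone.utc),
        "model": completion.model,
        "prompt_tokens": completion.usage.prompt_tokens,
        "completion_tokens": completion.usage.completion_tokens,
        "total_tokens": completion.usage.total_tokens,
    }


# Stand-in for a real response, filled with the values shown in the output above.
example = SimpleNamespace(
    created=1724747290,
    model="gpt-4o-mini-2024-07-18",
    usage=SimpleNamespace(completion_tokens=7, prompt_tokens=16, total_tokens=23),
)

for field, value in summarize_completion(example).items():
    print(f"{field}: {value}")
```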

## Explaining message roles

As you may have noticed, the `messages` parameter is a list of dictionaries. In our example it was:

```python
messages=[{"role": "user", "content": "What is the capital of Poland?"}]
```

Each item consists of 2 key/value pairs: `role` and `content`.

**Role** - defines who the "author" of the message is.

We've got 3 roles:
1. *User* - that's you.
2. *Assistant* - that's the AI model.
3. *System* - the main instruction that the AI model keeps in mind throughout the entire conversation.

**Content** - the actual text of the message.

Here's a great visual to picture that:

<img src="images/system2.png" alt="systemImage" width=500 />

*([Image source](https://www.deeplearning.ai/short-courses/chatgpt-prompt-engineering-for-developers/))*
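
To make the three roles concrete, here is a sketch of a short multi-turn conversation expressed as a `messages` list. The conversation text and the `add_message` helper are made up for illustration. The helper only guards against typos in the role name.

```python
VALID_ROLES = {"system", "user", "assistant"}


def add_message(messages, role, content):
    """Append one message, rejecting roles other than the three described above."""
    if role not in VALID_ROLES:
        raise ValueError(f"Unknown role: {role!r}")
    messages.append({"role": role, "content": content})
    return messages


messages = []
add_message(messages, "system", "You are a concise geography tutor.")
add_message(messages, "user", "What is the capital of Poland?")
add_message(messages, "assistant", "The capital of Poland is Warsaw.")
add_message(messages, "user", "And what is its population?")

# Print the conversation with the roles aligned in a column.
for message in messages:
    print(f"{message['role']:>9}: {message['content']}")
```

Passing the whole list as `messages` gives the model the earlier turns as context, so the last question can refer back to "its" without repeating the city name.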

### System Message

The system message sets the behavior of the AI model (the assistant).

AI models always keep this message "on top". Even during long conversations, assistants remember the system prompt very well. It's like whispering the same message in the model's ear all the time.

Here are examples of how you can use the system prompt:
- Specify the output format.
- Define the assistant's personality.
- Set context for the conversation.
- Define constraints and limitations.
- Provide instructions on how to respond.
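
As a sketch of the first use case, the helper below sends a system prompt together with a user message, using the same `client.chat.completions.create` call as earlier. The `ask` function name and the prompt text are illustrative assumptions. The example call is left commented out because it needs a configured client and API key.

```python
def ask(client, system_prompt, user_prompt, model="gpt-4o-mini"):
    """Send a system prompt plus one user message and return the reply text."""
    completion = client.chat.completions.create(
        model=model,
        messages=[
            # The system message comes first and sets the assistant's behavior.
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    )
    return completion.choices[0].message.content


# Example usage, assuming `client = OpenAI()` from the earlier cell:
# print(ask(client, "Answer with a single word.", "What is the capital of Poland?"))
```

Keeping the system prompt as a separate argument makes it easy to try different behaviors while asking the same question.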