# Exercise 3 - Introduction to OpenAI API and prompting
OpenAI makes its models available via a [REST API](https://platform.openai.com/docs/api-reference) and multiple SDKs.
In this exercise, we will explore how to use the Python SDK to interact with these models.

Before we can start, we must create an instance of the OpenAI client.
We can do this by providing the `api_key` and `azure_endpoint` to the `AzureOpenAI` class.

If you are using GitHub Codespaces, you will already have access to the OpenAI secrets and can proceed to run the code below.


In [None]:
from llm_in_production.openai_utils import get_openai_client
import dotenv
import os

# This reads the .env file in your project and transforms its content into env variables.
# This way you don't have to hard code your secrets.
dotenv.load_dotenv()
# Here we create the client.
client = get_openai_client()


However, if you are not using GitHub Codespaces, you must have an `.env` file at the root of your project before you can run the code above.
There should be an example of this file named `.env.example`.
Copy this file and rename it to `.env`.
Then, ask your instructor for any missing secrets.
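
To check quickly whether your `.env` file is complete, a small sketch like the following may help. It assumes that the three model name variables used by the cells in this notebook are the ones you need; `REQUIRED_VARIABLES` and `find_missing_variables` are hypothetical names chosen for illustration, and your project may require additional secrets (such as the API key and endpoint) that are not listed here.

```python
import os

# Variable names taken from the cells in this notebook.
# Assumption: your .env may define more than these.
REQUIRED_VARIABLES = [
    "GPT_35_TURBO_INSTRUCT_MODEL_NAME",
    "GPT_35_CHAT_MODEL_NAME",
    "GPT_4_MODEL_NAME",
]


def find_missing_variables(names, environ):
    """Return the names that are absent or empty in the given mapping."""
    return [name for name in names if not environ.get(name)]


missing = find_missing_variables(REQUIRED_VARIABLES, os.environ)
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All model name variables are set.")
```

Run it after `dotenv.load_dotenv()` so that the values from your `.env` file are visible in `os.environ`.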



## Introduction to the OpenAI API
OpenAI offers two types of text generation: text completion and chat completion.


### Completion/Instruction-following API
This API is most similar to the GPT-2 API we saw in the previous text generation exercise.
This API works as follows:
- You specify the text completion model (e.g. the now-deprecated `babbage` model) or instruction-following model (e.g. `gpt-35-turbo-instruct`) via the `model` parameter.
- You specify the text to be completed via the `prompt` parameter.
- Optionally, you specify some stop words via the `stop` parameter. As soon as the model generates one of them, generation stops, and the stop word itself is not included in the returned text.
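
To build some intuition for stop words, here is a minimal sketch that mimics their effect on an already generated string. It is only an illustration: the real API stops generating as soon as a stop word appears rather than trimming the text afterwards, and `truncate_at_stop` is a hypothetical helper, not part of the OpenAI SDK. The sample text is made up.

```python
def truncate_at_stop(generated, stop_words):
    """Cut `generated` just before the earliest occurrence of any stop word."""
    cut = len(generated)
    for word in stop_words:
        index = generated.find(word)
        if index != -1:
            cut = min(cut, index)
    return generated[:cut]


# Made-up example of what a model might produce without stop words:
# it answers, then keeps going and invents the next turns of the dialogue.
raw = " How about tacos or a stir-fry?\nHuman: Thanks!\nAI: You're welcome."

print(truncate_at_stop(raw, ["AI:", "Human:", "####"]))
```

Without stop words, a completion model tends to continue the dialogue on its own, which is why the prompt format and the stop words are usually designed together.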

The response contains a lot more information about the generation process. However, we are mainly interested in the generated text, as shown below.

In [None]:
text = """
Lines starting with 'AI:' are the things the AI assistant says.
Lines starting with 'Human:' are the things the human says.
Lines starting with '####' mean that the conversation is over and a new one is starting. Things discussed in previous conversations are not remembered.

Human: Give me some ideas for dinner tonight.
AI:""".strip() # feel free to change this text to see what happens.

stop_words = ["AI:", "Human:", "####"]
n = 100  # maximum number of tokens to generate

completion = client.completions.create(
    model=os.environ["GPT_35_TURBO_INSTRUCT_MODEL_NAME"],
    prompt=text,
    stop=stop_words,
    max_tokens=n
)
print(completion.choices[0].text)

<mark>**Question:**</mark> **Look at what happens when we instruct the model how to act. Why is this the case?**

<br>  
<details>
    
  <summary><span style="color:blue">Show answer</span></summary>
  
This instruction has triggered one of GPT's input guardrails.

Input guardrails aim to prevent inappropriate content getting to the LLM in the first place - some common use cases are:

- Topical guardrails: Identify when a user asks an off-topic question and give them advice on what topics the LLM can help them with.

- Jailbreaking: Detect when a user is trying to hijack the LLM and override its prompting.

- Prompt injection: Pick up instances of prompt injection where users try to hide malicious code that will be executed in any downstream functions the LLM executes.

All of these measures are preventative, running either before or in parallel with the LLM, and triggering your application to behave differently if one of these criteria is met.

    
</details>    

In [None]:
# text = """
# You are a helpful AI assistant that helps people with their daily tasks.
# Lines starting with 'AI:' are the things the AI assistant says.
# Lines starting with 'Human:' are the things the human says.
# Lines starting with '####' mean that the conversation is over and a new one is starting. Things discussed in previous conversations are not remembered.

# Human: Give me some ideas for dinner tonight.
# AI:""".strip() # feel free to change this text to see what happens.

# completion = client.completions.create(
#     model=os.environ["GPT_35_TURBO_INSTRUCT_MODEL_NAME"],
#     prompt=text,
#     stop=stop_words,
#     max_tokens=n
# )
# print(completion.choices[0].text)

### Chat API
Besides text generation, OpenAI also offers APIs around its chat models, such as ChatGPT.
As you might expect, these models want their inputs in chat format.
This changes the function call slightly.
Instead of text, we input messages.
A message consists of `content`, which is the text of the message, and a `role`.
The role has to be one of the following values:
- `system`: This contains the system instructions the model should follow during the conversation.
- `user`: This means that the message is something the user said.
- `assistant`: This means that the message is something the assistant/model said.
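
As a small sketch of this message format, the helper below builds message dictionaries and rejects any role that is not in the list above. `VALID_ROLES` and `make_message` are hypothetical names for illustration and are not part of the OpenAI SDK; the SDK simply expects a list of such dictionaries.

```python
# The three roles described above.
VALID_ROLES = {"system", "user", "assistant"}


def make_message(role, content):
    """Build one chat message, checking the role against the list above."""
    if role not in VALID_ROLES:
        raise ValueError(f"Unknown role: {role!r}")
    return {"role": role, "content": content}


# A conversation is an ordered list of messages, oldest first.
messages = [
    make_message("system", "You are a ChefBot that provides recipes."),
    make_message("user", "Give me a recipe for a pasta dish."),
]
print(messages)
```

A list built this way can be passed as the `messages` argument of `client.chat.completions.create`.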

Feel free to try it out in the cell below. We are using a GPT-3.5 Turbo model: a fast, inexpensive model for chat tasks.

In [None]:
completion = client.chat.completions.create(
    messages=[
        {
            "role":  "system",
            "content": "You are a ChefBot that provides recipes.",
        },
        {
            "role": "user",
            "content": "Give me a recipe for a pasta dish.",
        },
        {
            "role": "assistant",
            "content": "Sure, do you have any allergies?",
        },
        {
            "role": "user",
            "content": "No I don't",
        },
    ],
    model=os.environ["GPT_35_CHAT_MODEL_NAME"],
)
print(completion.choices[0].message.content)

OpenAI have now released GPT-4o mini: a compact, more cost-effective version of OpenAI’s GPT-4o model, designed to replace GPT-3.5 Turbo. It excels in textual comprehension, coding, and math, outperforming GPT-3.5 on benchmarks like MMLU and HumanEval. With a larger 128,000-token context window, it’s highly effective for handling extensive text documents and also has enhanced safety features.

Compare the response you get when using the GPT-4o mini model.

In [None]:
completion = client.chat.completions.create(
    messages=[
        {
            "role":  "system",
            "content": "You are a ChefBot that provides recipes.",
        },
        {
            "role": "user",
            "content": "Give me a recipe for a pasta dish.",
        },
        {
            "role": "assistant",
            "content": "Sure, do you have any allergies?",
        },
        {
            "role": "user",
            "content": "No I don't",
        },
    ],
    model=os.environ["GPT_4_MODEL_NAME"],
)
print(completion.choices[0].message.content)


# Exercise 3a - base model vs instruction fine-tuned model
The OpenAI API gives us access to the base models of GPT-3. 

The base models are not yet fine-tuned to follow instructions or a chat format. So let's explore how much of an impact this instruction fine-tuning has on the steerability of large language models.

The goal is to get both the base model and the instruction fine-tuned model to tell a joke. The only thing you are allowed to change is the prompt text.

### Instruction fine-tuned model
Here, we use the instruction fine-tuned model named [gpt-35-turbo-instruct](https://platform.openai.com/docs/models/gpt-3-5), which is a text completion version of ChatGPT.

Remember, the goal is to design a prompt that generates a new and funny joke.

In [None]:
# YOUR CODE HERE START
# YOUR CODE HERE END

completion = client.completions.create(
    model=os.environ["GPT_35_TURBO_INSTRUCT_MODEL_NAME"],
    prompt=text,  # Define this variable
)
print(completion.choices[0].text)

### Chat fine-tuned model 3.5
Here, we use a chat fine-tuned model named [gpt-35-turbo](https://platform.openai.com/docs/models/gpt-3-5), which was the model ChatGPT originally used.
With this model, you have to format your prompt as a list of message objects `{'role': system|user|assistant, 'content': ...}`. 

Remember, the goal is to design a prompt that generates a new and funny joke.


In [None]:
# YOUR CODE HERE START
# YOUR CODE HERE END

completion = client.chat.completions.create(
    messages=messages, # Define this variable
    model=os.environ["GPT_35_CHAT_MODEL_NAME"],
)
print(completion.choices[0].message.content)

### Chat fine-tuned model 4o-mini
Here, we use a chat fine-tuned model named [gpt-4o-mini](https://platform.openai.com/docs/models/gpt-4o-mini), which is the model ChatGPT currently uses.
With this model, you have to format your prompt as a list of message objects `{'role': system|user|assistant, 'content': ...}`. 

Remember, the goal is to design a prompt that generates a new and funny joke.


In [None]:
# YOUR CODE HERE START
# YOUR CODE HERE END

completion = client.chat.completions.create(
    messages=messages, # Define this variable
    model=os.environ["GPT_4_MODEL_NAME"],
)
print(completion.choices[0].message.content)

## Optional: Exercise 3b - a CLI-based chatbot
Let's make a CLI-based version of ChatGPT.
Below, you will find some boilerplate code that takes care of asking the user for input and printing the conversation.
However, a few things are still missing:
- The content of the initial system prompt.
- The API call to `client.chat.completions.create` which leads to a response from the chatbot based on the conversation so far.


Can you finish it?

<mark>Note: The text box should appear at the top of your IDE (VS Code) window.</mark>

In [None]:
import time

# The initial system prompt.
messages = [
    {
        "role":  "system",
        # YOUR CODE HERE START
        # YOUR CODE HERE END
    }
]

while True:
    # Get the user's input.
    user_message = input('User:').strip()
    
    # Check whether to continue.
    if len(user_message) == 0 or user_message == "exit":
        print("exiting...")
        break
    # Remember the user's response.
    messages.append({
        "role": "user",
        "content": user_message,
    })

    # Get the response from the GPT model.
    # YOUR CODE HERE START
    # YOUR CODE HERE END
    # Remember the assistant's response.
    messages.append({
        "role": "assistant",
        "content": assistant_message,
    })

    # Print the user input and response.
    print("User:", user_message)
    print("AI:", assistant_message)

    # We need to wait a little bit of time for the text to render.
    time.sleep(0.5) 

---