<img src="../Assets/Images/Day 1 Header.png">

# Welcome to Day 1

Today we will:

- Introduce LLMs
- Understand the available OpenAI LLMs
- Look at the OpenAI Chat Messages (Prompt) Structure
- Study Chat Completions API Parameters
- Decode the Chat Completions Response Object
- Learn how to Stream responses
- Look at JSON Response format
- Understand Reproducibility
- Learn about managing Tokens


## Brief Introduction to Large Language Models

<span style="font-size: 20px; color: orange"><b>Generative AI, and LLMs specifically, is a General Purpose Technology that is useful for a variety of applications</b></span>

<span style="font-size: 16px;"><i>"LLMs can be, generally, thought of as a next word prediction model"</i></span>

<span style="font-size: 16px; color: blue"><b>What is an LLM?</b></span>

- LLMs are __machine learning models__ that have learned from __massive datasets__ of human-generated content, finding statistical patterns to replicate human-like abilities.

- __Foundation models__, also known as base models, have been trained on trillions of words for weeks or months using extensive compute power. These models have __billions of parameters__, which represent their memory and enable sophisticated tasks.

- __Interacting with LLMs differs from traditional programming paradigms. Instead of formalized code syntax, you provide natural language prompts to the models__.

- When you pass a __prompt__ to the model, it predicts the next words and generates a __completion__. This process is known as __inference__.
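
To make the prompt, completion and inference vocabulary concrete, here is a toy sketch of next-word prediction. It is only an illustration: it counts which word follows which in a tiny made-up corpus, whereas a real LLM uses a neural network over tokens. The names `train_bigram_model` and `complete` and the corpus are hypothetical.

```python
from collections import Counter, defaultdict

def train_bigram_model(text):
    """Count, for every word, which words follow it in the training text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following

def complete(model, prompt, max_words=5):
    """Inference: repeatedly append the most frequent next word to the prompt."""
    words = prompt.lower().split()
    if not words:
        return ""
    for _ in range(max_words):
        candidates = model.get(words[-1])
        if not candidates:  # the last word was never seen during training
            break
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)

# Hypothetical miniature "dataset"
corpus = "the cat sat on the mat . the cat ate the fish ."
model = train_bigram_model(corpus)
print(complete(model, "the", max_words=3))
```

The prompt is the starting text, the returned string is the completion, and the loop inside `complete` plays the role of inference.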


<span style="font-size: 16px; color: blue"> <b>Prompts, Completions and Inference!</b></span>

<img src="../Assets/Images/LLM Inference.png">

__Recommendation__ : An excellent course "[Generative AI with Large Language Models](https://www.deeplearning.ai/courses/generative-ai-with-llms/)" is offered by DeepLearning.AI and AWS via Coursera.

I've prepared notes on this course that you can download:

<a href="https://abhinavkimothi.gumroad.com/l/GenAILLM"><img src="../Assets/Images/Coursera Course Notes.png"></a>

## Available OpenAI LLMs

<span style="font-size: 16px; color: blue"> <b>GPT-4</b></span>

__<u>(Production)</u>__

| Name | Context Window | Cut-off date | Snapshot |
|---|---|---|---|
| __gpt-4__ | 8,192 tokens | Up to Sep 2021 | gpt-4-0613 |
| __gpt-4-32k__ | 32,768 tokens | Up to Sep 2021 | gpt-4-32k-0613 |

<b><u>(Preview)</u></b>

| Name | Context Window | Cut-off date | Snapshot |
|---|---|---|---|
| __gpt-4-turbo-preview__ | 128,000 tokens | Up to Dec 2023 | gpt-4-1106-preview |
| __gpt-4-vision-preview__ | 128,000 tokens | Up to Apr 2023 | gpt-4-1106-vision-preview |

_Note : gpt-4-1106-vision-preview is a multi-modal model that works on both text and images_

---

<span style="font-size: 16px; color: blue"> <b>GPT-3.5</b></span>


| Name | Context Window | Cut-off date | Snapshot |
|---|---|---|---|
| __gpt-3.5-turbo__ | 16,385 tokens | Up to Sep 2021 | gpt-3.5-turbo-1106 |
| __gpt-3.5-turbo-instruct__ | 4,096 tokens | Up to Sep 2021 | |

---

For more details, visit the official documentation -> https://platform.openai.com/docs/models

---

<span style="font-size: 14px; color: orange">__IMP__ : __"model"__ is passed as a parameter in the chat completions API</span>

---


## OpenAI Chat Messages (Prompt) Structure

<span style="font-size: 16px;"><u>Role</u></span> : OpenAI allows for *three* roles/personas - 
1. __System__ : The overarching constraints/definitions/instructions of the system that the LLM should "remember"
2. __User__ : Any instruction a user wants to pass to the LLM
3. __Assistant__ : The response from the LLM

<span style="font-size: 16px;"><u>Content</u></span> : Any message or "prompt" from these personas is passed as "Content"



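__Why is this important?__ : The API keeps no memory between calls. Tagging every message with a role lets you send the earlier turns back to the model, which is how conversation history is maintained.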

---

<span style="font-size: 14px; color: orange">__IMP__ : __"role"__ and __"content"__ are passed as a dictionary in the __"messages"__ parameter in the chat completions API </span>
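
As a minimal sketch of that structure, the snippet below builds a `messages` list out of role/content dictionaries. The helper `make_message` and the `ALLOWED_ROLES` set are hypothetical conveniences that mirror the three roles described above; they are not part of the OpenAI library.

```python
# The three roles described in this notebook (assumption: no other roles are needed here)
ALLOWED_ROLES = {"system", "user", "assistant"}

def make_message(role, content):
    """Return one chat message as a dictionary with "role" and "content" keys."""
    if role not in ALLOWED_ROLES:
        raise ValueError(f"Unknown role: {role!r}")
    return {"role": role, "content": content}

# A list of such dictionaries is what goes into the "messages" parameter
messages = [
    make_message("system", "You are a helpful assistant."),
    make_message("user", "What is an LLM?"),
]
print(messages)
```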

---

## Let's try!

### Recall from Day 0

Have you installed the required Python packages?

_pip install -r requirements.txt_

If not, run the cell below after removing the triple single quotes (''') from the beginning and the end.

Alternatively, go to {Home}/OpenAI-API-Explorer/Readme.md and follow the instructions.

---

In [1]:
'''import os

# Get the current working directory
current_directory = os.getcwd()

# Extract the directory name from the path
current_directory_name = os.path.basename(current_directory)

# Check if the directory name is 'Notebooks'
if current_directory_name != 'Notebooks':
    notebooks_directory = os.path.join(current_directory, 'Notebooks')
    
    # Check if 'Notebooks' directory exists
    if os.path.exists(notebooks_directory) and os.path.isdir(notebooks_directory):
        # Change the current working directory to 'Notebooks'
        os.chdir(notebooks_directory)
        print("Changed to 'Notebooks' directory.")
        %pip install -r ../requirements.txt --quiet
        print("Requirements Installed")
    else:
        print("Please go to {Home}/OpenAI-API-Explorer/Readme.md and follow the instructions")
else:
    print("Already in 'Notebooks' directory.")
    %pip install -r ../requirements.txt --quiet
    print("Requirements Installed")
'''


In [2]:
#### Import Libraries ####
from openai import OpenAI #OpenAI Client

In [3]:
from dotenv import load_dotenv
import os

load_dotenv()

openai_api_key=os.getenv("OPENAI_API_KEY")

client = OpenAI(api_key=openai_api_key)

### API Call

Now let's call the Chat Completions API. 

Let's talk to the LLM about the beautiful game of Cricket.

Remember "System" is the overarching instruction that you can give to the LLM. Here let's ask the LLM to be a helpful assistant who has a knowledge of cricket.

Then we will ask it a question about cricket.

In [11]:
response = client.chat.completions.create(
  model="gpt-3.5-turbo",
  messages=[
    {"role": "system", "content": "You are a helpful assistant knowledgeable in the field of Cricket."}, ### The overarching instruction
    {"role": "user", "content": "When did Australia win their first Cricket World Cup?"} ### Our Question
  ]
)

Note that we used the gpt-3.5-turbo model. This is the same model used in the free version of ChatGPT.

Let's look at the response that the LLM generated. We'll look at the response object later in detail. The response text is found under choices->message->content in the response object

In [12]:
import textwrap
print(textwrap.fill(response.choices[0].message.content, 150))

Australia won their first Cricket World Cup in 1987 when they co-hosted the tournament with India and Pakistan. They defeated England in the final to
claim their maiden title.


Great! We got a response from the LLM. The response also seems fairly accurate. 

Remember that there is no guarantee that the response will be factually correct.

Now, let's ask a follow up question - "How much did they score?"

This question implies that we are chatting with the LLM and the LLM has the context of the previously asked questions and responses. To the API we will have to pass all of the previous prompts and responses.

In [13]:


response = client.chat.completions.create(
  model="gpt-3.5-turbo",
  messages=[
    {"role": "system", "content": "You are a helpful assistant knowledgeable in the field of Cricket."}, ### The overarching instruction
    {"role": "user", "content": "When did Australia win their first Cricket World Cup?"}, ### Our Question
    {"role": "assistant", "content": "Australia won their first Cricket World Cup in the year 1987. They defeated England in the final to clinch their maiden title in the tournament."}, ### LLM Response
    {"role": "user", "content": "How much did they score?"} ### Our followup question
  ]
)


print(textwrap.fill(response.choices[0].message.content,150))

In the final of the 1987 Cricket World Cup, Australia scored 253/5 in their allotted 50 overs. England, in response, managed to score 246/8, resulting
in Australia winning the match by 7 runs and clinching their first World Cup title.


Great, we can see that the model considers all the previous context while answering the latest question.
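
The pattern of resending every earlier turn can be sketched as a small helper. This is an illustration under assumptions: `ask` and `echo_complete` are hypothetical names, and `echo_complete` stands in for the real API call so that the sketch runs without a key.

```python
def ask(history, question, complete):
    """Append the user's question, fetch a reply, and store the reply in the history."""
    history.append({"role": "user", "content": question})
    reply = complete(history)  # the full history is sent on every call
    history.append({"role": "assistant", "content": reply})
    return reply

def echo_complete(messages):
    # Hypothetical stand-in for the API. With the client defined earlier you could use:
    # client.chat.completions.create(model="gpt-3.5-turbo", messages=messages).choices[0].message.content
    return f"(reply to: {messages[-1]['content']})"

history = [{"role": "system", "content": "You are a helpful assistant knowledgeable in the field of Cricket."}]
ask(history, "When did Australia win their first Cricket World Cup?", echo_complete)
ask(history, "How much did they score?", echo_complete)
print([m["role"] for m in history])
```

Because the history grows with every turn, the number of prompt tokens sent to the model grows as well.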

Now that we understand how prompts and responses work, let's dive deeper into the API parameters.

## Chat Completions API Parameters

<span style="font-size: 20px; color: orange"><b>"model"</b> and <b>"messages"</b> are the two required API parameters</span>

<span style="font-size: 16px; color: blue"> There are several other optional parameters that help configure the response</span>

---

__n__ : Number of responses (choices) you want the LLM to generate for the same input

__max_tokens__ : Maximum number of tokens the model may generate in the completion. Prompt tokens are not counted against this limit, but the prompt and the completion together must still fit within the model's context window

__temperature__ : Temperature controls the "randomness" of the responses. Higher values increase the randomness; lower values make the output more focused and closer to deterministic *(value between 0 and 2)*

__top_p__ : Nucleus sampling. The model samples only from the smallest set of tokens whose cumulative probability reaches top_p *(value between 0 and 1)*


<img src="../Assets/Images/Temperature - Top P.png">

<span style="font-size: 14px; color: orange"> __IMP__ : It is recommended to configure either one of "temperature" and "top_p" but not both</span>
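
To see what these two knobs do to a probability distribution, here is a small pure-Python sketch. It is an illustration of the general technique, not OpenAI's implementation: the function names and the toy scores are hypothetical, and the sketch rejects a temperature of 0 even though the API accepts it.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; a lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0 in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p=0.9):
    """Keep the smallest set of tokens whose cumulative probability reaches top_p, then renormalize."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    kept, cumulative = [], 0.0
    for index, p in ranked:
        kept.append((index, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    return {index: p / total for index, p in kept}

toy_logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
for t in (0.2, 1.0, 2.0):
    print(t, [round(p, 3) for p in softmax_with_temperature(toy_logits, t)])
print(top_p_filter(softmax_with_temperature(toy_logits), top_p=0.8))
```

Lowering the temperature concentrates probability on the top token, while lowering top_p removes the least likely tokens from consideration altogether.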


__frequency_penalty__ : Penalize new tokens based on their existing frequency in the text so far *(Value between -2 and 2)*

__presence_penalty__ : Penalize new tokens based on whether they appear in the text so far *(Value between -2 and 2)*

__logprobs__ : Flag to return log probability of the generated tokens *(True/False)*

__logit_bias__ : Parameter to control the presence of particular tokens in the output *(Value between -100 and 100)*

__response_format__ : Response of the model can be requested in a particular format *(Currently : JSON and Text)*

__seed__ : Beta feature for reproducible outputs (setting a seed value may produce the same output repeatedly)

__stop__ : End of Sequence tokens that will stop the generation

__stream__ : To receive partial message deltas *(True/False)*

__user__ : ID representing end user (This helps OpenAI detect abuse. May be mandatory for higher rate limits)

__tools__ : used in function calling

__tool_choice__ : used in function calling

---
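
The two penalty parameters listed above are easier to grasp with numbers. As I understand OpenAI's documentation, a token's score is reduced by its count so far times frequency_penalty, plus a one-time presence_penalty if it has appeared at all. The sketch below applies that rule to made-up scores; `apply_penalties` and the sample data are hypothetical.

```python
from collections import Counter

def apply_penalties(logits, generated_tokens, frequency_penalty=0.0, presence_penalty=0.0):
    """Lower the score of tokens that already appear in the generated text."""
    counts = Counter(generated_tokens)
    adjusted = {}
    for token, logit in logits.items():
        count = counts.get(token, 0)
        appeared = 1.0 if count > 0 else 0.0
        adjusted[token] = logit - count * frequency_penalty - appeared * presence_penalty
    return adjusted

# Hypothetical scores: "cat" has been generated twice already, "dog" never
scores = {"cat": 1.0, "dog": 1.0}
print(apply_penalties(scores, ["cat", "cat"], frequency_penalty=0.5, presence_penalty=0.2))
```

With positive penalties the repeated token ends up with a lower score than the unused one; negative values would push the other way and encourage repetition.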

Let's create a function for calling the API

In [14]:
def gpt_call(model:str="gpt-3.5-turbo",prompt:str="Have I provided any input",n:int=1,max_tokens:int=100,temperature:float=0.5,presence_penalty:float=0):


    response = client.chat.completions.create(
    model=model,
    messages=[
       {"role": "user", "content": prompt}
    ],
    max_tokens=max_tokens,
    temperature=temperature,
    presence_penalty=presence_penalty,
    n=n
    )
    


    if len(response.choices)>1:
        output=''
        for i in range(0,len(response.choices)):
            output+='\n\n-- n = '+str(i+1)+' ------\n\n'+str(response.choices[i].message.content)
    else:
        output=response.choices[0].message.content
        
    

    return output

In [18]:
print(gpt_call(prompt="Write a title for a workshop on openai API",n=2,temperature=1,model="gpt-3.5-turbo",max_tokens=100,presence_penalty=0))



-- n = 1 ------

"Unlocking the Potential of AI: A Deep Dive into the OpenAI API"

-- n = 2 ------

"Exploring the Potential of OpenAI: A Hands-On Workshop on Harnessing the Power of AI Technology"


__Notice what happens when the temperature value is set to zero__

__Try changing different parameter values__

In [19]:
print(gpt_call(prompt="Write a title for a workshop on openai API",n=3,temperature=0,model="gpt-3.5-turbo",max_tokens=100,presence_penalty=0))



-- n = 1 ------

"Unlocking the Power of OpenAI: A Hands-On Workshop on Harnessing the OpenAI API"

-- n = 2 ------

"Unlocking the Power of OpenAI: A Hands-On Workshop on Harnessing the OpenAI API"

-- n = 3 ------

"Unlocking the Power of OpenAI: A Hands-On Workshop on Harnessing the OpenAI API"


In [17]:
import gradio as gr

def gpt_call_ui(model,prompt,n,max_tokens,temperature):
    # UI components may return floats, while the API expects integers for n and max_tokens
    return gpt_call(model=model,prompt=prompt,n=int(n),max_tokens=int(max_tokens),temperature=temperature)

model=gr.Radio(["gpt-4","gpt-3.5-turbo"], value="gpt-3.5-turbo", label="Select Model")
n=gr.Radio([1,2,3], value=1, label="Number of Responses")
max_tokens=gr.Slider(minimum=10, maximum=500, step=1, label="Maximum Tokens")
temperature=gr.Slider(minimum=0.0, maximum=1.0, label="Temperature")
prompt=gr.Text(label="Prompt")


iface = gr.Interface(fn=gpt_call_ui, inputs=[model,prompt,n,max_tokens,temperature], outputs="text")
iface.launch()




Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.




In [20]:
iface.close()

Closing server running on port: 7860


## Decoding the Chat Completions Response Object

In [21]:
response = client.chat.completions.create(
  model="gpt-3.5-turbo",
  messages=[
    {"role": "system", "content": "You are a helpful assistant knowledgeable in the field of Cricket."},
    {"role": "user", "content": "When did Australia win their first Cricket World Cup?"},
    {"role": "assistant", "content": "Australia won their first Cricket World Cup in the year 1987. They defeated England in the final to clinch their maiden title in the tournament."},
    {"role": "user", "content": "How much did they score?"}
  ]
)

In [22]:
print((response.model_dump_json(indent=5)))


{
     "id": "chatcmpl-9IiBFB7wmCrGo4RZguowGSWjDr2L5",
     "choices": [
          {
               "finish_reason": "stop",
               "index": 0,
               "logprobs": null,
               "message": {
                    "content": "In the final of the 1987 Cricket World Cup, Australia scored 253 runs for the loss of 5 wickets in their allocated 50 overs. England, in response, could only manage 246 runs, thus Australia won the match by 7 runs.",
                    "role": "assistant",
                    "function_call": null,
                    "tool_calls": null
               }
          }
     ],
     "created": 1714246761,
     "model": "gpt-3.5-turbo-0125",
     "object": "chat.completion",
     "system_fingerprint": "fp_3b956da36b",
     "usage": {
          "completion_tokens": 54,
          "prompt_tokens": 77,
          "total_tokens": 131
     }
}


<img src="../Assets/Images/Chat Completion Object.png" >
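
As a small sketch of how to read this object, the snippet below pulls out the fields you will use most often. It works on a plain dictionary that copies the values printed above (with a real response, `response.model_dump()` should give an equivalent dictionary); the helper name `summarize_completion` is hypothetical.

```python
# Dictionary mirroring the response printed above (unused null fields omitted)
completion = {
    "id": "chatcmpl-9IiBFB7wmCrGo4RZguowGSWjDr2L5",
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "In the final of the 1987 Cricket World Cup, Australia scored 253 runs for the loss of 5 wickets in their allocated 50 overs. England, in response, could only manage 246 runs, thus Australia won the match by 7 runs.",
            },
        }
    ],
    "model": "gpt-3.5-turbo-0125",
    "usage": {"completion_tokens": 54, "prompt_tokens": 77, "total_tokens": 131},
}

def summarize_completion(data):
    """Collect the most commonly used fields of a chat completion."""
    first_choice = data["choices"][0]
    return {
        "text": first_choice["message"]["content"],
        "finish_reason": first_choice["finish_reason"],  # "stop" means the model finished on its own
        "model": data["model"],
        "total_tokens": data["usage"]["total_tokens"],
    }

summary = summarize_completion(completion)
print(summary["finish_reason"], summary["model"], summary["total_tokens"])
```

Note that `total_tokens` is simply `prompt_tokens` plus `completion_tokens`, which is the number you are billed for.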


## Streaming Responses

In [23]:
response = client.chat.completions.create(
  model="gpt-3.5-turbo",
  messages=[
    {"role": "system", "content": "You are a helpful assistant knowledgeable in the field of Cricket."},
    {"role": "user", "content": "When did Australia win their first Cricket World Cup?"},
    {"role": "assistant", "content": "Australia won their first Cricket World Cup in the year 1987. They defeated England in the final to clinch their maiden title in the tournament."},
    {"role": "user", "content": "How much did they score?"}
  ],
  stream=True
)

for chunk in response:
  print(chunk.choices[0].delta.content)


In
 the
 final
 of
 the
 
198
7
 Cricket
 World
 Cup
,
 Australia
 scored
 
253
/
5
 in
 their
 
50
 overs
.
 England
,
 in
 response
,
 could
 only
 manage
 
246
/
8
,
 thereby
 giving
 Australia
 a
 comfortable
 victory
 in
 the
 match
.
None
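
Each chunk carries only a small piece of text in `delta.content`, and the final chunk carries `None`, which is why the output above ends with "None". A minimal sketch for stitching the pieces back together follows. `join_stream` and `fake_chunk` are hypothetical helpers, and the fake chunks only imitate the shape of the real ones so that the sketch runs without an API key.

```python
from types import SimpleNamespace

def join_stream(chunks):
    """Concatenate the text deltas of a streamed chat completion into one string."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:
            continue
        piece = chunk.choices[0].delta.content
        if piece:  # skips the final chunk, whose content is None
            parts.append(piece)
    return "".join(parts)

def fake_chunk(text):
    # Stand-in shaped like a streamed chunk: chunk.choices[0].delta.content
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])

demo = [fake_chunk(t) for t in ["In", " the", " final", None]]
print(join_stream(demo))
```

To display text as it arrives instead, printing each piece with `end=""` keeps the tokens on one line.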


## JSON Mode

In [24]:
prompt="generate the entire text for a blog on a cricket match. \"Title\" is a catchy and attractive title of the blog. The \"Heading\" is the heading for each point in the blog and the \"Body\" is the text for that heading.Output in a json structure"

In [25]:
response = client.chat.completions.create(
  model="gpt-3.5-turbo",
  messages=[
        {"role": "user", "content": prompt}
  ],
  response_format={ "type": "json_object" }
)

print((response.choices[0].message.content))

{
  "Title": "Thrilling Cricket Match: A Recap of the Exciting Game",
  "Heading": [
    {
      "Heading": "Introduction",
      "Body": "The cricket match between Team A and Team B was nothing short of a rollercoaster ride for both the players and the fans. The game was filled with excitement, suspense, and unforgettable moments that kept everyone on the edge of their seats throughout."
    },
    {
      "Heading": "Innings Recap",
      "Body": "Team A won the toss and elected to bat first. They got off to a great start with their openers scoring quick runs. However, Team B fought back with some excellent bowling and fielding, restricting Team A to a decent total. In response, Team B got off to a shaky start but managed to steady their innings through some solid partnerships in the middle order."
    },
    {
      "Heading": "Key Performances",
      "Body": "One of the standout performances of the match was the century scored by Team A's captain, who played a captain's knock unde

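In JSON mode the reply is still a string, so it has to be parsed before use. If generation is cut short, for example because the token limit was reached, the string may not be complete JSON. The sketch below is one hedged way to handle both cases; `parse_json_reply` is a hypothetical helper and the sample strings are made up.

```python
import json

def parse_json_reply(text):
    """Parse a JSON-mode reply; return None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None

complete_reply = '{"Title": "A Thrilling Match", "Heading": []}'
cut_off_reply = '{"Title": "A Thrilling Match", "Heading": [{"Heading": "Intro'

print(parse_json_reply(complete_reply)["Title"])
print(parse_json_reply(cut_off_reply))
```

Checking `finish_reason` on the response before parsing is a useful companion check, since a value other than "stop" suggests the output may be incomplete.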
## Reproducibility of Outputs

In [26]:
response = client.chat.completions.create(
  model="gpt-3.5-turbo",
  messages=[
    {"role": "system", "content": "You are a helpful assistant knowledgeable in the field of Cricket."},
    {"role": "user", "content": "When did Australia win their first Cricket World Cup?"}
  ],
  seed=42
)

print(response.choices[0].message.content)

Australia won their first Cricket World Cup in 1987. They defeated England in the final held at Eden Gardens in Kolkata, India.
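
Setting a seed makes repeated outputs more likely to match, but it is best effort rather than a guarantee. A common check is to compare `system_fingerprint` between two runs, since a changed fingerprint indicates a backend change that can alter outputs even with the same seed. The sketch below uses stand-in objects shaped like responses; `compare_runs` and `fake_response` are hypothetical names, and the second fingerprint is made up.

```python
from types import SimpleNamespace

def compare_runs(first, second):
    """Report whether two responses share a backend fingerprint and the same text."""
    return {
        "same_fingerprint": first.system_fingerprint == second.system_fingerprint,
        "same_text": first.choices[0].message.content == second.choices[0].message.content,
    }

def fake_response(text, fingerprint):
    # Stand-in shaped like a chat completion response
    message = SimpleNamespace(content=text)
    return SimpleNamespace(system_fingerprint=fingerprint, choices=[SimpleNamespace(message=message)])

run_a = fake_response("Australia won in 1987.", "fp_3b956da36b")
run_b = fake_response("Australia won in 1987.", "fp_3b956da36b")
run_c = fake_response("Australia first won in 1987.", "fp_hypothetical")

print(compare_runs(run_a, run_b))
print(compare_runs(run_a, run_c))
```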


## Tokens

<span style="font-size: 20px; color: orange"><b>Tokens are the fundamental units of NLP</b></span>

<span style="font-size: 16px; color: blue"><b>These units are typically words, punctuation marks, or other meaningful substrings that make up the text</b></span>

Counting the number of tokens becomes important because - 
- The number of tokens determines the amount of computation required, and hence the cost you incur in terms of both money and latency
- Context Window or the maximum number of tokens an LLM can process in one go is limited


In [28]:
import tiktoken

In [29]:
def num_tokens_from_string(string: str, encoding_name: str = "cl100k_base") -> int:
    """Count the number of tokens in a text string using tiktoken.

    string: the text string to tokenize
    encoding_name: the tiktoken encoding to use (cl100k_base is the encoding used by the gpt-3.5-turbo and gpt-4 models)
    returns: the number of tokens in the text string
    """
    encoding = tiktoken.get_encoding(encoding_name)  # Initialize the encoding
    return len(encoding.encode(string))  # Number of tokens in the text string

In [37]:
input_text="Hello! how are you?"

print(f"The total number of tokens in '{input_text}' are : {num_tokens_from_string(input_text)}")

The total number of tokens in 'Hello! how are you?' are : 6


In [38]:
with open("../Assets/Data/alice_in_wonderland.txt") as f:
    AliceInWonderland = f.read()

print(f"The total number of tokens in Alice In Wonderland are : {num_tokens_from_string(AliceInWonderland)}")

The total number of tokens in Alice In Wonderland are : 38680


__PRICING__

| Model | Prompt | Response |
|---|---|---|
| __gpt-3.5-turbo-0125__ | $0.50 / 1M tokens | $1.50 / 1M tokens |
| __gpt-4__ | $30.00 / 1M tokens | $60.00 / 1M tokens |
| __gpt-4-turbo__ | $10.00 / 1M tokens | $30.00 / 1M tokens |
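
Combining these prices with token counts gives a quick cost estimate. The sketch below is plain arithmetic over the figures listed above, which may change over time; `PRICES_PER_MILLION` and `estimate_cost` are hypothetical names. The token counts are the ones seen earlier in this notebook: 77 prompt and 54 completion tokens for the cricket answer, and 38,680 tokens for Alice in Wonderland.

```python
# Prices in USD per 1M tokens, copied from the pricing list above
PRICES_PER_MILLION = {
    "gpt-3.5-turbo-0125": {"prompt": 0.50, "response": 1.50},
    "gpt-4": {"prompt": 30.00, "response": 60.00},
    "gpt-4-turbo": {"prompt": 10.00, "response": 30.00},
}

def estimate_cost(model, prompt_tokens, response_tokens=0):
    """Estimate the cost in USD of one API call from its token counts."""
    price = PRICES_PER_MILLION[model]
    return (prompt_tokens * price["prompt"] + response_tokens * price["response"]) / 1_000_000

# The cricket follow-up answer: 77 prompt tokens and 54 completion tokens
print(f"${estimate_cost('gpt-3.5-turbo-0125', 77, 54):.7f}")

# Sending all of Alice in Wonderland (38,680 tokens) as a prompt
print(f"${estimate_cost('gpt-4-turbo', 38680):.4f}")
```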

Congratulations! We're at the end of Day 1!

Hopefully, we are now fairly confident about using the Chat Completions API and generating text with OpenAI models.



<img src="../Assets/Images/That’s all for the day.png">

# About



<img src="../Assets/Images/Profile_AK.png" width=100> 

#### Hi! I'm Abhinav! A data science and AI professional with over 15 years in the industry. Passionate about AI advancements, I constantly explore emerging technologies to push the boundaries and create positive impacts in the world. Let’s build the future, together!

<span style="font-size: 20px; color: orange"><b>Connect with me!</b></span>


[![GitHub followers](https://img.shields.io/badge/Github-000000?style=for-the-badge&logo=github&logoColor=black&color=orange)](https://github.com/abhinav-kimothi)
[![LinkedIn](https://img.shields.io/badge/LinkedIn-000000?style=for-the-badge&logo=linkedin&logoColor=orange&color=black)](https://www.linkedin.com/comm/mynetwork/discovery-see-all?usecase=PEOPLE_FOLLOWS&followMember=abhinav-kimothi)
[![Medium](https://img.shields.io/badge/Medium-000000?style=for-the-badge&logo=medium&logoColor=black&color=orange)](https://medium.com/@abhinavkimothi)
[![Insta](https://img.shields.io/badge/Instagram-000000?style=for-the-badge&logo=instagram&logoColor=orange&color=black)](https://www.instagram.com/akaiworks/)
[![Mail](https://img.shields.io/badge/email-000000?style=for-the-badge&logo=gmail&logoColor=black&color=orange)](mailto:abhinav.kimothi.ds@gmail.com)
[![X](https://img.shields.io/badge/Follow-000000?style=for-the-badge&logo=X&logoColor=orange&color=black)](https://twitter.com/abhinav_kimothi)
[![Linktree](https://img.shields.io/badge/Linktree-000000?style=for-the-badge&logo=linktree&logoColor=black&color=orange)](https://linktr.ee/abhinavkimothi)
[![Gumroad](https://img.shields.io/badge/Gumroad-000000?style=for-the-badge&logo=gumroad&logoColor=orange&color=black)](https://abhinavkimothi.gumroad.com/)


<span style="font-size: 20px; color: orange"><b>You can also book a time-slot with me</b></span>

[![Static Badge](https://img.shields.io/badge/Resume%20Review%20(DS/AI/ML)%20(30%20min)-000000?style=for-the-badge&logo=googlecalendar&logoColor=blue&color=black)](https://topmate.io/abhinav_kimothi/544382)
[![Static Badge](https://img.shields.io/badge/AIML%20Learning%20Path%20(30%20min)-000000?style=for-the-badge&logo=googlecalendar&logoColor=black&color=blue)](https://topmate.io/abhinav_kimothi/544380)
[![Static Badge](https://img.shields.io/badge/Generative%20AI%20Consulting%20(60%20min)-000000?style=for-the-badge&logo=googlecalendar&logoColor=blue&color=black)](https://topmate.io/abhinav_kimothi/544379)


<span style="font-size: 20px; color: orange"><b>Also, read my ebooks for more on Generative AI!</b></span>



<a href="https://abhinavkimothi.gumroad.com/l/GenAILLM">
    <img src="https://public-files.gumroad.com/jsdnnne2gnhu61f6hrdprwx2255i" width=150>
</a><a href="https://abhinavkimothi.gumroad.com/l/RAG">
    <img src="https://public-files.gumroad.com/v17k9tp2fnbbtg8iwoxt4m3xgivq" width=150>
</a><a href="https://abhinavkimothi.gumroad.com/l/GenAITaxonomy">
    <img src="https://public-files.gumroad.com/a730ysxb7a928bb5xkz6fuqabaqp" width=150>
</a>



