# L1 Language Models, the Chat Format and Tokens

In [2]:
!pip install tiktoken

Collecting tiktoken
  Downloading tiktoken-0.7.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB)
[K     |████████████████████████████████| 1.1 MB 667 kB/s eta 0:00:01
[?25hCollecting regex>=2022.1.18
  Downloading regex-2024.5.15-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (774 kB)
[K     |████████████████████████████████| 774 kB 1.9 MB/s eta 0:00:01
Installing collected packages: regex, tiktoken
  Attempting uninstall: regex
    Found existing installation: regex 2021.8.3
    Uninstalling regex-2021.8.3:
      Successfully uninstalled regex-2021.8.3
Successfully installed regex-2024.5.15 tiktoken-0.7.0


In [3]:
import os
import openai
import tiktoken
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

openai.api_key  = os.environ['OPENAI_API_KEY']

In [4]:
def get_completion(prompt, model="gpt-3.5-turbo"):
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0, # this is the degree of randomness of the model's output 
    )
    return response.choices[0].message["content"]

### Prompt the model and get a completion

In [5]:
response = get_completion("What is the capital of France?")

In [6]:
print(response)

The capital of France is Paris.


### Tokens
- Why does it give the wrong answer? Because it does not predict the next word but the next token!
- Why it works? Because it tokenize and better see the individual letter.

In [7]:
response = get_completion("Take the letters in lollipop \
and reverse them")
print(response)

pilpolol


In [8]:
response = get_completion("""Take the letters in \
l-o-l-l-i-p-o-p and reverse them""")

In [9]:
response

'p-o-p-i-l-l-o-l'

### Helper function (chat format)

In [10]:
def get_completion_from_messages(messages, 
                                 model="gpt-3.5-turbo", 
                                 temperature=0, 
                                 max_tokens=500):
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=temperature, # this is the degree of randomness of the model's output
        max_tokens=max_tokens, # the maximum number of tokens the model can ouptut 
    )
    return response.choices[0].message["content"]

In [11]:
messages =  [  
{'role':'system', 
 'content':"""You are an assistant who\
 responds in the style of Dr Seuss."""},    
{'role':'user', 
 'content':"""write me a very short poem\
 about a happy carrot"""},  
] 
response = get_completion_from_messages(messages, temperature=1)
print(response)

Oh, the happy carrot in the ground so deep,
With its orange hue, it makes me leap!
In the garden, it grows so fine,
Bringing joy with each veggie vine.
So crunchy, so sweet, it's always a delight,
The happy carrot, shining bright!


In [12]:
# length
messages =  [  
{'role':'system',
 'content':'All your responses must be \
one sentence long.'},    
{'role':'user',
 'content':'write me a story about a happy carrot'},  
] 
response = get_completion_from_messages(messages, temperature =1)
print(response)

Once upon a time, a carrot named Carl discovered his true purpose in life was to bring joy to the world with his bright orange smile.


In [13]:
# combined
messages =  [  
{'role':'system',
 'content':"""You are an assistant who \
responds in the style of Dr Seuss. \
All your responses must be one sentence long."""},    
{'role':'user',
 'content':"""write me a story about a happy carrot"""},
] 
response = get_completion_from_messages(messages, 
                                        temperature =1)
print(response)

In a garden so bright, a carrot took flight, spreading joy and delight, all day and all night.


In [14]:
def get_completion_and_token_count(messages, 
                                   model="gpt-3.5-turbo", 
                                   temperature=0, 
                                   max_tokens=500):
    
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=temperature, 
        max_tokens=max_tokens,
    )
    
    content = response.choices[0].message["content"]
    
    token_dict = {
'prompt_tokens':response['usage']['prompt_tokens'],
'completion_tokens':response['usage']['completion_tokens'],
'total_tokens':response['usage']['total_tokens'],
    }

    return content, token_dict

In [15]:
messages = [
{'role':'system', 
 'content':"""You are an assistant who responds\
 in the style of Dr Seuss."""},    
{'role':'user',
 'content':"""write me a very short poem \ 
 about a happy carrot"""},  
] 
response, token_dict = get_completion_and_token_count(messages)

In [16]:
print(response)

In a garden so bright, a carrot did grow,
With a smile on its face, a happy little glow.
It danced in the sun, and wiggled its toes,
Oh, what a joyous veggie, everyone knows!


In [17]:
print(token_dict)

{'prompt_tokens': 37, 'completion_tokens': 48, 'total_tokens': 85}
