# L2 Language Models, the Chat Format and Tokens

**How does an LLM wor?**

There are 2 major components 
- Text generation process
- Supervised learning 

Each sentence in converted in a sequence of training examples where the model tries to predict the next work. 
The problem with this approach is that the model is only looking at one side of the sentence. 

Hence, we have models such as BERT which has a better performance as it uses the concept of 

[**BERT MODEL**](https://towardsdatascience.com/bert-explained-state-of-the-art-language-model-for-nlp-f8b21a9b6270)

At a high level during the training process, BERT has been trained on 2 different tasks. 
- Next Sentence Prediction (NSP) given 2 sentences A and B, we BERT is trained to predict which sentence comes before or after BERT
- Masked Language Model (MLM) 15% of the words in each sequence are replaced with a [MASK] token. The model tries to predict the masks. hence it learns about the context 


MLM teaches BERT to understand relationships between words — NSP teaches BERT to understand longer-term dependencies across sentences.

[**XLNet model**](https://huggingface.co/docs/transformers/model_doc/xlnet)

BERT neglects dependency between the masked positions and suffers from a pretrain-finetune discrepancy.

XLNet introduces permutation language modeling, where all tokens are predicted but in random order.
XLNet is a generalized autoregressive pretraining method that 
- (1) enables learning bidirectional contexts by **maximizing the expected log likelihood of a sequence wrt all possible permutations** of the factorization order and 
- (2) overcomes the limitations of BERT of data corruption thanks to its **autoregressive formulation.**

XLNet maximizes the expected log likelihood of a sequence w.r.t. all possible permutations
of the factorization order.

[*Click here*](https://towardsdatascience.com/bert-roberta-distilbert-xlnet-which-one-to-use-3d5ab82ba5f8) here compare BERT, RoBERTa, DistilBERT, XLNet

**Two Types of Large Language Models (LLMs)**

- **Base LLM** repeatedly redicts the word based on text training data to complete a sentence. 
eg. <br><br>
    One upon a time... (prompt) <br>
    (the model writes a complete story)
    
    What is the capital of France? (prompt) <br>
    What is France's largest city? (Model)<br>
    What is the currency of France? 
    
    
- **Instruction Tuned LLM** tries to follow instructions. eg. <br>
    What is the capital of France? (prompt) <br>
    The capital of France is Paris. (Model)<br>

**Getting from a Base LLM to an instruction tuned LLM:**
- Train a Base LLM on a lot of data.
- Further train the model:
    - Fine-tune on examples of where the output follows an input instruction
    - Obtain human-ratings of the quality of different LLM outputs, on criteria such as whether it is helpful, honest and harmless
    - Tune LLM to increase probability that it generates the more highly rated outputs (using RLHF: Reinforcement Learning from Human Feedback)

## Setup
#### Load the API key and relevant Python libaries.
In this course, we've provided some code that loads the OpenAI API key for you.

In [4]:
# ! pip install openai

In [2]:
import os
import openai
import tiktoken
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

openai.api_key  = os.environ['OPENAI_API_KEY']

#### helper function
This may look familiar if you took the earlier course "ChatGPT Prompt Engineering for Developers" Course

In [3]:
def get_completion(prompt, model="gpt-3.5-turbo"):
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0,
    )
    return response.choices[0].message["content"]

## Prompt the model and get a completion

In [4]:
response = get_completion("What is the capital of France?")

In [5]:
print(response)

The capital of France is Paris.


## Tokens

In [6]:
response = get_completion("Take the letters in lollipop \
and reverse them")
print(response)

ppilolol


"lollipop" in reverse should be "popillol"

** Why is LLM failing at this seemingly simple task?**

Because the model repeatedly predicts the next token.<br>
The model breaks the word into the following token: lol-li-pop <br>
(as these are the commonly occouring tokens)

**Limitation on the number of tokes** <br>
gpt3.5-turbo has a limitation of ~4000 tokens (input + context + output)


Now, lets f ask a question which fixes this isses 

In [7]:
response = get_completion("""Take the letters in \
l-o-l-l-i-p-o-p and reverse them""")

In [8]:
response

'p-o-p-i-l-l-o-l'

## Helper function (chat format)
Here's the helper function we'll use in this course.

In [9]:
def get_completion_from_messages(messages, 
                                 model="gpt-3.5-turbo", 
                                 temperature=0, 
                                 max_tokens=500):
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=temperature, # this is the degree of randomness of the model's output
        max_tokens=max_tokens, # the maximum number of tokens the model can ouptut 
    )
    return response.choices[0].message["content"]

In [10]:
messages =  [  
{'role':'system', 
 'content':"""You are an assistant who\
 responds in the style of Dr Seuss."""},    
{'role':'user', 
 'content':"""write me a very short poem\
 about a happy carrot"""},  
] 
response = get_completion_from_messages(messages, temperature=1)
print(response)

Oh, the happy carrot is a sight to see,
So orange and bright, it fills me with glee!
With its leafy green top and cozy orange base,
It's a veggie that's sure to put a smile on your face!


In [11]:
# length
messages =  [  
{'role':'system',
 'content':'All your responses must be \
one sentence long.'},    
{'role':'user',
 'content':'write me a story about a happy carrot'},  
] 
response = get_completion_from_messages(messages, temperature =1)
print(response)

Once there was a happy carrot named Carl who loved the sunshine and the rain, and he grew up to be big and healthy, which made all the other vegetables in the garden very proud.


In [12]:
# combined
messages =  [  
{'role':'system',
 'content':"""You are an assistant who \
responds in the style of Dr Seuss. \
All your responses must be one sentence long."""},    
{'role':'user',
 'content':"""write me a story about a happy carrot"""},
] 
response = get_completion_from_messages(messages, 
                                        temperature =1)
print(response)

There once was a carrot named Larry, who was so happy he never felt weary.


In [13]:
def get_completion_and_token_count(messages, 
                                   model="gpt-3.5-turbo", 
                                   temperature=0, 
                                   max_tokens=500):
    
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=temperature, 
        max_tokens=max_tokens,
    )
    
    content = response.choices[0].message["content"]
    
    token_dict = {
'prompt_tokens':response['usage']['prompt_tokens'],
'completion_tokens':response['usage']['completion_tokens'],
'total_tokens':response['usage']['total_tokens'],
    }

    return content, token_dict

In [14]:
messages = [
{'role':'system', 
 'content':"""You are an assistant who responds\
 in the style of Dr Seuss."""},    
{'role':'user',
 'content':"""write me a very short poem \ 
 about a happy carrot"""},  
] 
response, token_dict = get_completion_and_token_count(messages)

In [15]:
print(response)

Oh, the happy carrot, so bright and so bold,
With a smile on its face, and a story untold.
It grew in the garden, with sun and with rain,
And now it's so happy, it can't help but exclaim!


In [16]:
print(token_dict)

{'prompt_tokens': 39, 'completion_tokens': 52, 'total_tokens': 91}


#### Notes on using the OpenAI API outside of this classroom

To install the OpenAI Python library:
```
!pip install openai
```

The library needs to be configured with your account's secret key, which is available on the [website](https://platform.openai.com/account/api-keys). 

You can either set it as the `OPENAI_API_KEY` environment variable before using the library:
 ```
 !export OPENAI_API_KEY='sk-...'
 ```

Or, set `openai.api_key` to its value:

```
import openai
openai.api_key = "sk-..."
```

#### A note about the backslash
- In the course, we are using a backslash `\` to make the text fit on the screen without inserting newline '\n' characters.
- GPT-3 isn't really affected whether you insert newline characters or not.  But when working with LLMs in general, you may consider whether newline characters in your prompt may affect the model's performance.