# Chapter 2 Language Model, Questioning Paradigm and Token

- [1. Environment Configuration](#1.-Environment-Configuration)
- [1.1 Loading the API key and some Python libraries](#1.1-Loading-the-API-key-and-some-Python-libraries)
- [1.2 Helper function](#1.2-Helper-function)
- [2. Try to ask the model questions and get results](#2.-Try-to-ask-the-model-questions-and-get-results)
- [3. Tokens](#3.-Tokens)
- [4. Helper function (questioning paradigm)](#4.-Helper-function-(questioning-paradigm))

In this chapter, we explain how large language models (LLMs) work, how they are trained, and how details such as the tokenizer affect their output. We also introduce the LLM chat format, which lets you specify system messages and user messages, and show how to take advantage of it.

## 1. Environment Configuration

### 1.1 Loading the API key and some Python libraries
This course provides some code to load the OpenAI API key.

In [None]:
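# The helper functions below call openai.ChatCompletion, which was removed in openai 1.0, so an earlier version is pinned
!pip install "openai<1.0"
!pip install langchain
!pip install --upgrade tiktoken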

In [1]:
import os
import openai
# import tiktoken  # Not used later. If you are interested in how it is used, see this article: https://zhuanlan.zhihu.com/p/629776230

# from dotenv import load_dotenv, find_dotenv
# _ = load_dotenv(find_dotenv()) # Read the local .env file. (This approach is recommended going forward: keep the key in the .env file to protect it.)

openai.api_key  = 'sk-***' # Replace with your own key
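As an alternative to hard-coding the key, the sketch below reads it from an environment variable and masks it before printing. The function names `read_api_key` and `mask_key` and the variable name `OPENAI_API_KEY` are assumptions made for this illustration, not part of the course code.

```python
import os


def read_api_key(env=None, var_name="OPENAI_API_KEY"):
    """Return the API key stored under var_name, or raise if it is missing.

    `env` defaults to os.environ; passing a plain dict makes this easy to test.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first.")
    return key


def mask_key(key, visible=3):
    """Hide everything except the first `visible` characters of the key."""
    if len(key) <= visible:
        return "*" * len(key)
    return key[:visible] + "*" * (len(key) - visible)


# Example with a fake key held in a dict instead of the real environment
fake_env = {"OPENAI_API_KEY": "sk-abcdef"}
print(mask_key(read_api_key(fake_env)))
```

With a real key exported in the shell, `openai.api_key = read_api_key()` would take the place of the hard-coded assignment.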

### 1.2 Helper function
If you have taken the "ChatGPT Prompt Engineering for Developers" course, this will look familiar.
Call this function with a prompt, and it returns the corresponding completion.

In [2]:
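# Written following the official documentation: https://platform.openai.com/overview

def get_completion(prompt, model="gpt-3.5-turbo"):
    """
    Generate a chat reply with an OpenAI model.

    Parameters:
    prompt: the user's input, i.e. the prompt for the chat.
    model: the model to use, "gpt-3.5-turbo" by default.
    """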
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0,
    )
    return response.choices[0].message["content"] # The reply generated by the model

## 2. Try to ask the model questions and get results

LLMs can be built with supervised learning, in which the model learns by repeatedly predicting the next word. Given a large body of text, tens of billions of words or more, you can create a massive training set: start with a sentence or a piece of text, and repeatedly ask the language model to learn to predict the next word.
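To make the idea of a next-word training set concrete, here is a minimal sketch that turns one sentence into (context, next word) pairs. The function name `next_word_examples` and the sample sentence are chosen for illustration, and splitting on whitespace is a simplification of what real training pipelines do.

```python
def next_word_examples(text):
    """Turn a piece of text into (context, next_word) training pairs."""
    words = text.split()
    # Each prefix of the sentence becomes an input; the following word is the label
    return [(" ".join(words[:i]), words[i]) for i in range(1, len(words))]


examples = next_word_examples("My favorite food is a bagel")
for context, target in examples:
    print(f"{context!r} -> {target!r}")
```

A six-word sentence yields five training pairs, which is why a large text corpus produces such a large training set.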

There are two main types of LLMs: Base LLMs and the increasingly popular Instruction Tuned LLMs. A Base LLM is trained by repeatedly predicting the next word, so given a prompt like "Once upon a time there was a unicorn," it might predict word by word and complete a story about the unicorn living in a magical forest with its unicorn friends.

The downside is that if you give it a prompt like "What is the capital of China?", its training data likely contains a list of quiz questions about China from the Internet. It might then continue the prompt with "What is the largest city in China? What is the population of China?" and so on. But you just want to know the capital of China, not a list of further questions. An Instruction Tuned LLM, by contrast, tries to follow the prompt and answers "The capital of China is Beijing."

So how do you turn a Base LLM into an Instruction Tuned LLM such as ChatGPT? First, you train the Base LLM on a lot of data, hundreds of billions of words or even more. This can take months on large supercomputing systems. After training the base language model, you fine-tune it on a smaller set of examples in which the output follows the instructions in the input. For example, you can ask contractors to help you write many examples of instructions together with good responses to those instructions. This creates a training set for fine-tuning, from which the model learns to predict the next word when following instructions.

Later, to improve the quality of the model's output, a common approach is to have humans rate many different outputs on criteria such as whether they are useful, truthful, and harmless. You can then further tune the language model to increase the probability of generating highly rated outputs. This is often done with reinforcement learning from human feedback (RLHF). Compared with the months it may take to train a base language model, going from a base language model to an instruction-tuned one may take only days, using a much smaller dataset and far fewer computing resources.

In [13]:
response = get_completion("What is the capital of China?")
print(response)

The capital of China is Beijing.


In [14]:
response = get_completion("中国的首都是哪里？")
print(response)

中国的首都是北京。


## 3. Tokens

So far we have described an LLM as predicting one word at a time, but there is a more important technical detail: **`an LLM does not actually repeatedly predict the next word; it repeatedly predicts the next token`**. When an LLM receives input, it converts it into a sequence of tokens, where each token represents a common sequence of characters. For example, in the sentence "Learning new things is fun!", each word is converted into a single token, whereas in a sentence with less common words, such as "Prompting as powerful developer tool", the word "prompting" is split into three tokens: "prom", "pt", and "ing".
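The splitting behaviour can be imitated with a toy tokenizer. The sketch below uses greedy longest-match against a small hand-picked vocabulary; both the vocabulary `TOY_VOCAB` and the matching rule are assumptions for illustration only. Real tokenizers such as the one used by gpt-3.5-turbo learn their vocabulary from data and will not match this exactly.

```python
# Hypothetical vocabulary, chosen so that the example from the text works out
TOY_VOCAB = {"learning", "new", "things", "is", "fun", "prom", "pt", "ing"}


def toy_tokenize(word, vocab=TOY_VOCAB):
    """Split a word into the longest vocabulary pieces, scanning left to right."""
    word = word.lower()
    tokens = []
    start = 0
    while start < len(word):
        # Try the longest candidate first, then shrink it
        for end in range(len(word), start, -1):
            if word[start:end] in vocab:
                tokens.append(word[start:end])
                start = end
                break
        else:
            # No vocabulary entry matches: fall back to a single character
            tokens.append(word[start])
            start += 1
    return tokens


print(toy_tokenize("learning"))
print(toy_tokenize("prompting"))
```

A common word that is in the vocabulary stays whole, while a less common word falls apart into several pieces.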

When you ask ChatGPT to reverse the letters of "lollipop", it has difficulty producing the correct order because the tokenizer breaks "lollipop" into three tokens: "l", "oll", and "ipop". You can help ChatGPT recognize each letter and output them correctly by adding hyphens or spaces between the letters, so that the tokenizer breaks each letter into a separate token.

In [15]:
# The prompt is kept in English here to better show the effect
# Note that the reversed result here is wrong. Andrew Ng uses this example to explain how tokenization works.
response = get_completion("Take the letters in lollipop \
and reverse them")
print(response)

The reversed letters of "lollipop" are "pillipol".


"lollipop" in reverse should be "popillol"

In [17]:
response = get_completion("""Take the letters in \
l-o-l-l-i-p-o-p and reverse them""")

print(response)

p-o-p-i-l-l-o-l
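The hyphenated prompt can be built programmatically, and the expected answer computed locally to check the model's reply. This is a small sketch; the helper names `spell_out` and `reverse_spelled` are made up for the example.

```python
def spell_out(word, sep="-"):
    """Insert a separator between letters so each letter tends to become its own token."""
    return sep.join(word)


def reverse_spelled(word, sep="-"):
    """The answer we expect from the model for the spelled-out prompt."""
    return sep.join(reversed(word))


word = "lollipop"
prompt = f"Take the letters in {spell_out(word)} and reverse them"
print(prompt)
print(reverse_spelled(word))
```

Comparing the model's reply with `reverse_spelled(word)` gives a quick way to confirm whether the reversal is correct.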


![Tokens.png](../../figures/Tokens.png)

For English input, a token generally corresponds to about 4 characters or three quarters of a word; for Chinese input, a token generally corresponds to one word or half a word.

Different models have different token limits. Note that the limit applies to the sum of the input prompt tokens and the output completion tokens, so the longer the input prompt, the lower the upper bound on the completion.

The token limit of gpt-3.5-turbo is 4096.
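The arithmetic behind the shared limit can be sketched as follows. The estimate uses the rough rule of about 4 characters per English token mentioned above, so it is only an approximation; the function names and the constant `CONTEXT_LIMIT` are assumptions for this example.

```python
import math

CONTEXT_LIMIT = 4096  # the limit quoted in the text for gpt-3.5-turbo


def estimate_english_tokens(text, chars_per_token=4):
    """Rough token estimate for English text (about 4 characters per token)."""
    return math.ceil(len(text) / chars_per_token)


def max_completion_tokens(prompt_tokens, limit=CONTEXT_LIMIT):
    """Tokens left for the completion once the prompt has used its share."""
    return max(limit - prompt_tokens, 0)


prompt = "Take the letters in lollipop and reverse them"
used = estimate_english_tokens(prompt)
print(used, max_completion_tokens(used))
```

For an exact count rather than an estimate, the `usage` field returned by the API reports the real numbers.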

## 4. Helper function (questioning paradigm)
The following are the helper functions used in the course.
The figure below shows the chat format (questioning paradigm) provided by OpenAI. Next, Andrew Ng demonstrates how to use it to ask better questions.
![Chat-format.png](../../figures/Chat-format.png)

The system message specifies the rules for the model, such as its persona and answering criteria; the user message carries the specific instruction for the model to complete; and the assistant message is the model's reply.
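A message list in this format can be assembled with a small helper. The sketch below is one possible way to do it; `build_messages`, `VALID_ROLES`, and the `history` parameter are hypothetical names introduced for the example.

```python
VALID_ROLES = {"system", "user", "assistant"}


def build_messages(system, user, history=()):
    """Build a chat message list: system rules, optional earlier turns, then the new user turn.

    `history` is an iterable of (role, content) pairs from earlier in the conversation.
    """
    messages = [{"role": "system", "content": system}]
    for role, content in history:
        if role not in VALID_ROLES:
            raise ValueError(f"Unknown role: {role!r}")
        messages.append({"role": role, "content": content})
    messages.append({"role": "user", "content": user})
    return messages


messages = build_messages(
    system="All your responses must be one sentence long.",
    user="write me a story about a happy carrot",
)
print(messages)
```

The resulting list has the same shape as the `messages` argument passed to `openai.ChatCompletion.create` in the cells of this chapter.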

In [5]:
def get_completion_from_messages(messages, 
                                 model="gpt-3.5-turbo", 
                                 temperature=0, 
                                 max_tokens=500):
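    '''
    A custom wrapper for calling OpenAI GPT-3.5 that supports more parameters.

    Parameters:
    messages: a list of messages; each message is a dict with a role and content. The role can be 'system', 'user', or 'assistant'; the content is that role's message.
    model: the model to call, gpt-3.5-turbo (ChatGPT) by default; users with beta access can choose gpt-4.
    temperature: controls the randomness of the model's output. The default of 0 makes the output highly deterministic; raising it makes the output more random.
    max_tokens: the maximum number of tokens in the model's output.
    '''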
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=temperature, # Controls the randomness of the model's output
        max_tokens=max_tokens, # Maximum number of tokens in the model's output
    )
    return response.choices[0].message["content"]

In [6]:
messages =  [  
{'role':'system', 
 'content':"""You are an assistant who\
 responds in the style of Dr Seuss."""},    
{'role':'user', 
 'content':"""write me a very short poem\
 about a happy carrot"""},  
] 
response = get_completion_from_messages(messages, temperature=1)
print(response)

In a garden so bright, a carrot would sprout,
With a cheery orange hue, without a doubt.
With a leafy green top, it danced in the breeze,
A happy carrot, so eager to please.

It grew in the soil, oh so deep and grand,
Stretching its roots, reaching far and expand.
With a joyful smile, it soaked up the sun,
Growing tall and strong, its journey begun.

Days turned to weeks, as it grew day and night,
Round and plump, it was quite a delight.
With every raindrop that fell from above,
The carrot grew sweeter, spreading more love.

At last, the day came when it was time to eat,
With a grin on my face, I took a seat.
I chopped and I sliced, so grateful, you see,
For this happy carrot, bringing joy to me.

So let us remember, when times may get tough,
A happy carrot's journey, it's enough.
For even in darkness, there's always delight,
Just like a carrot, shining so bright.


In [18]:
messages =  [  
{'role':'system', 
 'content':'你是一个助理， 并以 Seuss 苏斯博士的风格作出回答。'},    
{'role':'user', 
 'content':'就快乐的小鲸鱼为主题给我写一首短诗'},  
] 
response = get_completion_from_messages(messages, temperature=1)
print(response)

在海洋的深处，有一只小鲸鱼，
她快乐又聪明，灵感从不匮乏。
她游遍五大洲，探索未知的秘密，
用歌声传递喜悦，令人心旷神怡。

她跃出海面，高高的飞翔，
尾巴抽空着水花，像梦幻般的画。
她和海豚一起，跳跃在太阳下，
与海洋中的生命，在欢乐中共舞。

她喜欢和海龟一起，缓缓漫游，
看美丽的珊瑚，和色彩鲜艳的鱼群。
她欢迎每个新朋友，无论大或小，
因为在她眼中，每个人都独特而珍贵。

她知道快乐是如此简单，如此宝贵，
在每个时刻中，她都努力传达幸福的表情。
所以当你感到疲惫，沮丧或者低落，
想起小鲸鱼的快乐，让你心中再次充满鲜活。


In [7]:
# length
messages =  [  
{'role':'system',
 'content':'All your responses must be \
one sentence long.'},    
{'role':'user',
 'content':'write me a story about a happy carrot'},  
] 
response = get_completion_from_messages(messages, temperature =1)
print(response)

Once upon a time, there was a cheerful carrot named Charlie who always brightened everyone's day with his vibrant orange color and contagious laughter.


In [19]:
# Length Control
messages =  [  
{'role':'system',
 'content':'你的所有答复只能是一句话'},    
{'role':'user',
 'content':'写一个关于快乐的小鲸鱼的故事'},  
] 
response = get_completion_from_messages(messages, temperature =1)
print(response)

在追随波浪的起伏中，小鲸鱼快乐地跳跃着，因为它知道游泳的真正乐趣不仅仅在目的地，而是在于享受整个旅程。


In [8]:
# combined
messages =  [  
{'role':'system',
 'content':"""You are an assistant who \
responds in the style of Dr Seuss. \
All your responses must be one sentence long."""},    
{'role':'user',
 'content':"""write me a story about a happy carrot"""},
] 
response = get_completion_from_messages(messages, 
                                        temperature =1)
print(response)

Once upon a time, there was a carrot so happy and bright, it danced and sang from morning till night.


In [20]:
# Combination of the above
messages =  [  
{'role':'system',
 'content':'你是一个助理， 并以 Seuss 苏斯博士的风格作出回答，只回答一句话'},    
{'role':'user',
 'content':'写一个关于快乐的小鲸鱼的故事'},
] 
response = get_completion_from_messages(messages, temperature =1)
print(response)

在蓝色的大海里，有一只小鲸鱼，无忧无虑，快乐游泳，一切因快乐而变得光辉。


In [9]:
def get_completion_and_token_count(messages, 
                                   model="gpt-3.5-turbo", 
                                   temperature=0, 
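                                   max_tokens=500):
    """
    Generate a chat reply with an OpenAI GPT-3.5 model and return the reply together with the number of tokens used.

    Parameters:
    messages: the list of chat messages.
    model: the name of the model to use, "gpt-3.5-turbo" by default.
    temperature: controls the randomness of the reply. The larger the value, the more random the reply. Defaults to 0.
    max_tokens: the maximum number of tokens in the reply. Defaults to 500.

    Returns:
    content: the content of the generated reply.
    token_dict: a dict with 'prompt_tokens', 'completion_tokens', and 'total_tokens', giving the token count of the prompt, of the generated reply, and of both combined.
    """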
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=temperature, 
        max_tokens=max_tokens,
    )

    content = response.choices[0].message["content"]
    
    token_dict = {
        'prompt_tokens': response['usage']['prompt_tokens'],
        'completion_tokens': response['usage']['completion_tokens'],
        'total_tokens': response['usage']['total_tokens'],
    }

    return content, token_dict

In [24]:
messages = [
{'role':'system', 
 'content':"""You are an assistant who responds\
 in the style of Dr Seuss."""},    
{'role':'user',
 'content':"""write me a very short poem\
 about a happy carrot"""},  
] 
response, token_dict = get_completion_and_token_count(messages)
print(response)

In a garden so bright, with colors so cheery,
There lived a carrot, oh so merry!
With a vibrant orange hue, and a leafy green top,
This happy carrot just couldn't stop.

It danced in the breeze, with a joyful sway,
Spreading happiness throughout the day.
With a smile so wide, and eyes full of glee,
This carrot was as happy as can be.

It loved the sunshine, and the rain's gentle touch,
Growing tall and strong, oh so much!
From the earth it sprouted, reaching for the sky,
A happy carrot, oh my, oh my!

So if you're feeling down, just remember this tale,
Of a carrot so happy, it'll never fail.
Find joy in the little things, and let your heart sing,
Just like that carrot, oh what joy it will bring!


In [22]:
messages =  [  
{'role':'system', 
 'content':'你是一个助理， 并以 Seuss 苏斯博士的风格作出回答。'},    
{'role':'user', 
 'content':'就快乐的小鲸鱼为主题给我写一首短诗'},  
] 
response, token_dict = get_completion_and_token_count(messages)
print(response)

In [25]:
print(token_dict)

{'prompt_tokens': 37, 'completion_tokens': 173, 'total_tokens': 210}
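The three counts are related: the total is the prompt count plus the completion count. When an application makes several calls, the dictionaries can be summed to track overall usage. The sketch below assumes a hypothetical helper `add_usage`; the first dictionary is the output shown above.

```python
USAGE_KEYS = ("prompt_tokens", "completion_tokens", "total_tokens")


def add_usage(running_total, usage):
    """Return a new dict with `usage` added to the running totals."""
    return {key: running_total.get(key, 0) + usage[key] for key in USAGE_KEYS}


# Usage reported for the carrot poem above
first_call = {"prompt_tokens": 37, "completion_tokens": 173, "total_tokens": 210}

# The total is the sum of the prompt and completion counts
assert first_call["prompt_tokens"] + first_call["completion_tokens"] == first_call["total_tokens"]

running = add_usage({}, first_call)
running = add_usage(running, first_call)  # pretend the same call was made twice
print(running)
```

Keeping a running total like this makes it easier to see how much of a token budget a session has consumed.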


Finally, we believe that the revolutionary impact of prompting on AI application development is still underappreciated. In the traditional supervised machine learning workflow, if you want to build a classifier that labels restaurant reviews as positive or negative, you first need to collect a large amount of labeled data, perhaps hundreds of examples, which may take weeks or even a month. Then you need to train a model on this data: find a suitable open source model, then tune and evaluate it, which may take days, weeks, or even months. Finally, you may need to use a cloud service to deploy the model, uploading it to the cloud and getting it running before you can finally call it. The whole process usually takes a team months to complete.

In contrast, with the prompt-based machine learning approach, when you have a text application, you only need to provide a simple prompt. This may take only a few minutes, or at most a few hours if you need several iterations to arrive at an effective prompt. Within a few days (in practice usually a few hours), you can run the model through API calls and start using it. Once you reach this step, it takes only minutes or hours to start calling the model for inference. As a result, applications that previously might have taken six months or even a year to build can now be built with prompting in minutes, hours, or at most days. This approach is dramatically changing how quickly AI applications can be built.
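As a rough illustration of the prompt-based workflow, the restaurant review classifier mentioned earlier could start from something like the sketch below. The system prompt wording and the helper names `build_review_messages` and `parse_label` are assumptions for this example; a real prompt would likely need a few iterations.

```python
LABELS = ("positive", "negative")


def build_review_messages(review):
    """Build chat messages asking the model to classify one restaurant review."""
    system = (
        "You classify restaurant reviews. "
        "Reply with exactly one word: positive or negative."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": review},
    ]


def parse_label(reply):
    """Normalize the model's reply to one of LABELS, or None if it is something else."""
    cleaned = reply.strip().strip(".!").lower()
    return cleaned if cleaned in LABELS else None


messages = build_review_messages("The noodles were cold and the service was slow.")
print(messages)
print(parse_label(" Negative.\n"))
```

Passing `messages` to `get_completion_from_messages` and running the reply through `parse_label` would complete the classifier, with no labeled dataset or model training involved.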

It is important to note that this approach works for many unstructured data applications, especially text applications and, increasingly, vision applications, although the vision technology is still maturing. It does not work for structured data applications, that is, machine learning applications that process large numbers of values in Excel spreadsheets. For the applications it does suit, however, AI components can be built quickly, and this is changing the workflow for building the entire system. Building the entire system may still take days, weeks, or longer, but at least this part can be done faster.

In the next chapter, we'll show how to use these components to evaluate the input of a customer service assistant.
This will be part of a more complete example in this course of building a customer service assistant for an online retailer.