# Prompt Engineering for Chat Models

**[ Please use m5.large to run this notebook ]**

## Elements of a Prompt: 
- **Instruction**: a specific task or instruction you want the model to perform. 
- **Context**: external information or additional context that can steer the model to better responses. 
- **Input Data**: the input or question we want the model to respond to. 
- **Output Indicator**: the type or format of the output.
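
As a rough illustration of how the four elements fit together, here is a minimal sketch that joins them into one prompt string. The helper name `build_prompt` and the sample texts are made up for this example; real prompts often order or label the elements differently.

```python
def build_prompt(instruction, context="", input_data="", output_indicator=""):
    """Join the non-empty prompt elements, separated by blank lines."""
    parts = [instruction, context, input_data, output_indicator]
    return "\n\n".join(part.strip() for part in parts if part.strip())


prompt = build_prompt(
    instruction="Classify the sentiment of the review as positive or negative.",
    context="Reviews come from an online bookstore.",
    input_data="Review: The plot dragged, but the ending was worth it.",
    output_indicator="Sentiment:",
)
print(prompt)
```

Not every prompt needs all four elements, so the helper simply skips any that are left empty.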

## Constructing a prompt

Chat models have been trained with very different formats for converting conversations into a single tokenizable string. Using a format different from the format a model was trained with will usually cause severe, silent performance degradation, so matching the format used during training is extremely important! 

In a chat context, rather than continuing a single string of text (as is the case with a standard language model), the model instead continues a conversation that consists of one or more messages, each of which includes a role, like “user” or “assistant”, as well as message text.

### 1. OpenAI
#### Chat Markup Language (ChatML) - Transition to `messages` format
An increasingly common use case for LLMs is chat. With OpenAI's chat models, the conversation is sent as a list of `messages` rather than as a single prompt string.

https://help.openai.com/en/articles/7042661-chatgpt-api-transition-guide<br>
https://platform.openai.com/docs/guides/text-generation/chat-completions-api



- Instead of sending a single string as your prompt, you send a list of messages as your input. 
- Each message in the list has two properties: role and content. 
    - The 'role' can take one of three values: 'system', 'user', or 'assistant'.
    - The 'content' contains the text of the message from that role. 
- The system message gives high-level instructions for the whole conversation.
- The messages are processed in the order they appear in the list, and the assistant responds accordingly. 

Example 
```python
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who won the world series in 2020?"},
    {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
    {"role": "user", "content": "Where was it played?"}
]
```
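
The rules above can be sketched in plain Python. The snippet below is an illustrative sketch, not part of any OpenAI SDK: `validate_messages` and `ALLOWED_ROLES` are hypothetical names, and the checks cover only the three roles and two properties described in this section. It also shows how a multi-turn conversation grows by appending each new turn to the same list.

```python
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_messages(messages):
    """Check that every message has a known role and string content."""
    for index, message in enumerate(messages):
        if "role" not in message or "content" not in message:
            raise ValueError(f"message {index} needs both 'role' and 'content'")
        if message["role"] not in ALLOWED_ROLES:
            raise ValueError(f"message {index} has unknown role {message['role']!r}")
        if not isinstance(message["content"], str):
            raise ValueError(f"message {index} content must be a string")
    return messages


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who won the world series in 2020?"},
]

# After the model replies, append its answer and the next question so the
# full history is sent again with the following request.
messages.append({"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."})
messages.append({"role": "user", "content": "Where was it played?"})

validate_messages(messages)
print([message["role"] for message in messages])
```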

### 2. Claude 
#### Human: / Assistant: formatting
https://docs.anthropic.com/claude/docs/constructing-a-prompt

*It is technically allowed by the API to include text before the first \n\nHuman:; this is sometimes called a "System Prompt". However, Claude does not currently attend to information in this location as strongly or as accurately as it does to text within the conversational turns. It's generally best to put all critical information and instructions in the post-\n\nHuman: part of the prompt.*

Example
```
Human: We want to de-identify some text by removing all personally identifiable information from this text so that it can be shared safely with external contractors.

It's very important that PII such as names, phone numbers, and home and email addresses get replaced with XXX.

Here is the text, inside <text></text> XML tags.
<text>
{{TEXT}}
</text>

Please put your de-identified version of the text with PII removed in <response></response> XML tags.

Assistant:
```
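
To relate this format to the `messages` list used elsewhere in this notebook, here is a minimal sketch that flattens user and assistant turns into a single `Human:` / `Assistant:` string. The function name `to_human_assistant_prompt` is hypothetical, and the sketch assumes the text-completion style format shown above, where each turn starts with `\n\nHuman:` or `\n\nAssistant:` and the prompt ends with an open `Assistant:` turn.

```python
ROLE_LABELS = {"user": "Human", "assistant": "Assistant"}


def to_human_assistant_prompt(messages):
    """Flatten user/assistant messages into a Human:/Assistant: prompt string."""
    parts = []
    for message in messages:
        label = ROLE_LABELS.get(message["role"])
        if label is None:
            raise ValueError(f"unsupported role: {message['role']!r}")
        parts.append(f"\n\n{label}: {message['content']}")
    # End with an open Assistant turn so the model writes the next reply.
    return "".join(parts) + "\n\nAssistant:"


prompt = to_human_assistant_prompt([
    {"role": "user", "content": "Please replace all PII in <text>Call Ann at 555-0100</text> with XXX."},
])
print(prompt)
```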

### 3. Llama2 
#### Prompt Template with special tokens
https://huggingface.co/blog/llama2

``` 
 <s>[INST] <<SYS>>
 {{ system_prompt }}
 <</SYS>>
 
 {{ user_msg_1 }} [/INST] {{ model_answer_1 }} </s><s>[INST] {{ user_msg_2 }} [/INST]
```


Example
```
<s>[INST] <<SYS>>
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe.  Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.

If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
<</SYS>>

There's a llama in my garden 😱 What should I do? [/INST]
```
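
The template above can be filled in by hand with a few lines of Python. This is a sketch only: `build_llama2_prompt` is a hypothetical helper that follows the template in this section, writing `<s>` and `</s>` as literal text. It assumes the string is later tokenized without adding a second beginning-of-sequence token.

```python
def build_llama2_prompt(system_prompt, turns):
    """Build a Llama 2 chat prompt.

    turns is a list of (user_msg, model_answer) pairs. Use None as the
    answer of the last pair to leave that turn open for the model.
    """
    prompt = ""
    for index, (user_msg, model_answer) in enumerate(turns):
        if index == 0 and system_prompt:
            # The system prompt is wrapped in <<SYS>> tags inside the first user turn.
            user_msg = f"<<SYS>>\n{system_prompt}\n<</SYS>>\n\n{user_msg}"
        prompt += f"<s>[INST] {user_msg} [/INST]"
        if model_answer is not None:
            prompt += f" {model_answer} </s>"
    return prompt


prompt = build_llama2_prompt(
    "You are a helpful, respectful and honest assistant.",
    [("There's a llama in my garden 😱 What should I do?", None)],
)
print(prompt)
```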

### 4. How About Other LLMs?

## Templates for Chat Models
https://huggingface.co/docs/transformers/main/en/chat_templating<br>
https://huggingface.co/blog/chat-templates

Hugging Face tokenizers now have a `chat_template` attribute that can be used to save the chat format the model was trained with. 

Chat templates are part of the tokenizer. They specify how to convert conversations, represented as lists of messages, into a single tokenizable string in the format that the model expects.

`chat_template` is normally stored in `tokenizer_config.json`, so it is worth checking whether that file defines a `chat_template` for your model.

Here is an example of tokenizer_config.json from **meta-llama/Llama-2-7b-chat-hf**
```json
{
  "add_bos_token": true,
  "add_eos_token": false,
  "bos_token": {
    "__type": "AddedToken",
    "content": "<s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "chat_template": "{% if messages[0]['role'] == 'system' %}{% set loop_messages = messages[1:] %}{% set system_message = messages[0]['content'] %}{% else %}{% set loop_messages = messages %}{% set system_message = false %}{% endif %}{% for message in loop_messages %}{% if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}{{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}{% endif %}{% if loop.index0 == 0 and system_message != false %}{% set content = '<<SYS>>\\n' + system_message + '\\n<</SYS>>\\n\\n' + message['content'] %}{% else %}{% set content = message['content'] %}{% endif %}{% if message['role'] == 'user' %}{{ bos_token + '[INST] ' + content.strip() + ' [/INST]' }}{% elif message['role'] == 'assistant' %}{{ ' '  + content.strip() + ' ' + eos_token }}{% endif %}{% endfor %}",
  "clean_up_tokenization_spaces": false,
  "eos_token": {
    "__type": "AddedToken",
    "content": "</s>",
      .
      .
      .
}
```

You can also find other useful information in `tokenizer_config.json`, `config.json`, `generation_config.json`, and similar files, for example:
- **bos_token**: beginning of sentence token
- **eos_token**: end of sentence token
- **unk_token**: unknown token

### Generate Prompt from messages format with Hugging Face

In [2]:
!pip install -Uq torch transformers tiktoken sentencepiece



In [3]:
print("Input Hugging Face Model ID")
print("---------------------------")
print("Example: meta-llama/Llama-2-7b-chat-hf")
print("https://huggingface.co/models?pipeline_tag=text-generation")
print("")
HF_MODEL_ID = input()
print("")
print(f"HF_MODEL_ID = {HF_MODEL_ID}")

Input Hugging Face Model ID
---------------------------
Example: meta-llama/Llama-2-7b-chat-hf
https://huggingface.co/models?pipeline_tag=text-generation



 meta-llama/Llama-2-7b-chat-hf



HF_MODEL_ID = meta-llama/Llama-2-7b-chat-hf


In [4]:
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(HF_MODEL_ID, trust_remote_code=True)

messages = [
    { "role": "system", "content": "You are a teacher. You explain things for six-year old kids." },
    { "role": "user", "content": "What is the difference between a llama and an alpaca?" },
    #{ "role": "user", "content": "Hello, how are you?" },
    #{ "role": "assistant", "content": "I'm doing great. How can I help you today?" },
    #{ "role": "user", "content": "I'd like to show off how chat templating works!" }
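]
# Render the messages with the model's chat template and return token IDs
tokenized_chat = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")
# Decode the token IDs back to text to inspect the exact prompt the model will see
prompt = tokenizer.decode(tokenized_chat[0])
print("")
print(prompt)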


<s> [INST] <<SYS>>
You are a teacher. You explain things for six-year old kids.
<</SYS>>

What is the difference between a llama and an alpaca? [/INST]


In [32]:
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(HF_MODEL_ID, trust_remote_code=True)

messages = [
    { "role": "system", "content": """AnyCompany has a database with a table named sales_data containing sales records. The table has following columns:
- date (YYYY-MM-DD)
- product_id
- price
- units_sold
""" },
    { "role": "user", "content": """Can you generate SQL queries for the below: 
- Identify the top 5 best selling products by total sales for the year 2023
- Calculate the monthly average sales for the year 2023
""" },
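]
# Render the messages with the model's chat template and return token IDs
tokenized_chat = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")
# Decode the token IDs back to text to inspect the exact prompt the model will see
prompt = tokenizer.decode(tokenized_chat[0])
print("")
print(prompt)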


No chat template is defined for this tokenizer - using the default template for the CodeLlamaTokenizerFast class. If the default is not appropriate for your model, please set `tokenizer.chat_template` to an appropriate template. See https://huggingface.co/docs/transformers/main/chat_templating for more information.




<s> [INST] <<SYS>>
AnyCompany has a database with a table named sales_data containing sales records. The table has following columns:
- date (YYYY-MM-DD)
- product_id
- price
- units_sold

<</SYS>>

Can you generate SQL queries for the below: 
- Identify the top 5 best selling products by total sales for the year 2023
- Calculate the monthly average sales for the year 2023 [/INST]


### Other References

- OpenAI Prompt Engineering: https://platform.openai.com/docs/guides/prompt-engineering
- LLM Prompting Guide: https://huggingface.co/docs/transformers/tasks/prompting
- Learn Prompting: https://learnprompting.org/docs/intro
- Prompt Engineering Guide: https://www.promptingguide.ai/