# How to format inputs to ChatGPT models

ChatGPT is powered by `gpt-3.5-turbo`.

You can build your own applications with `gpt-3.5-turbo` using the OpenAI API.

Chat models take a series of messages as input, and return an AI-written message as output.

This guide illustrates the chat format with a few example API calls.

## 1. Import the openai library

In [4]:
# if needed, install the OpenAI Python library
# (the examples below call openai.ChatCompletion, which exists only in pre-1.0 releases, so pin a 0.x version)
!pip install "openai<1.0"

In [10]:
# import the OpenAI Python library for calling the OpenAI API
import openai

## 2. An example chat API call

A chat API call has two required inputs:
- `model`: the name of the model you want to use (e.g., `gpt-3.5-turbo`)
- `messages`: a list of message objects, where each object has at least two fields:
    - `role`: the role of the messenger (either `system`, `user`, or `assistant`)
    - `content`: the content of the message (e.g., `Write me a beautiful poem`)

Typically, a conversation will start with a system message, followed by alternating user and assistant messages, but you are not required to follow this format.
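The message structure described above can be checked locally before a request is sent. The following is a minimal sketch: `find_message_problems` and `ALLOWED_ROLES` are hypothetical names introduced for illustration, not part of the `openai` library, and the check covers only the `role` and `content` fields named in this guide.

```python
# Roles listed in this guide; an assumption that no other roles are needed here
ALLOWED_ROLES = {"system", "user", "assistant"}


def find_message_problems(messages):
    """Return a list of human-readable problems; an empty list means the format looks fine."""
    problems = []
    for i, message in enumerate(messages):
        if not isinstance(message, dict):
            problems.append(f"message {i}: not a dict")
            continue
        if message.get("role") not in ALLOWED_ROLES:
            problems.append(f"message {i}: unexpected role {message.get('role')!r}")
        if not isinstance(message.get("content"), str):
            problems.append(f"message {i}: content must be a string")
    return problems


conversation = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Knock knock."},
]
print(find_message_problems(conversation))
```

Returning a list of problems rather than raising on the first one makes it easier to report every malformed message in a long conversation at once.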

Let's look at an example chat API call to see how the chat format works in practice.

In [11]:
#@title openai.api_key
# read the API key from an environment variable rather than hard-coding it in the notebook
import os
openai.api_key = os.environ["OPENAI_API_KEY"]

In [12]:
# Example OpenAI Python library request
MODEL = "gpt-3.5-turbo"
response = openai.ChatCompletion.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Knock knock."},
        {"role": "assistant", "content": "Who's there?"},
        {"role": "user", "content": "Orange."},
    ],
    temperature=0,
)

response

<OpenAIObject chat.completion id=chatcmpl-78v2K5WxPFs1n5goyReDyqh2dsH9W at 0x7f128eb29e00> JSON: {
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "Orange who?",
        "role": "assistant"
      }
    }
  ],
  "created": 1682360228,
  "id": "chatcmpl-78v2K5WxPFs1n5goyReDyqh2dsH9W",
  "model": "gpt-3.5-turbo-0301",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 3,
    "prompt_tokens": 39,
    "total_tokens": 42
  }
}

As you can see, the response object has a few fields:
- `id`: the ID of the request
- `object`: the type of object returned (e.g., `chat.completion`)
- `created`: the timestamp of the request
- `model`: the full name of the model used to generate the response
- `usage`: the number of tokens used to generate the replies, counting prompt, completion, and total
- `choices`: a list of completion objects (only one, unless you set `n` greater than 1)
    - `message`: the message object generated by the model, with `role` and `content`
    - `finish_reason`: the reason the model stopped generating text (either `stop`, or `length` if `max_tokens` limit was reached)
    - `index`: the index of the completion in the list of choices
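As a rough illustration of how these fields might be read together, the sketch below pulls the reply text, a truncation flag, and the token count out of a response. `summarize_response` is a hypothetical helper, and `example_response` is a plain dict holding the values printed above, so no API call is made.

```python
def summarize_response(response):
    """Collect the reply text, a truncation flag, and total token usage from a chat response."""
    choice = response["choices"][0]
    return {
        "reply": choice["message"]["content"],
        # finish_reason is "length" when the max_tokens limit cut the reply short
        "truncated": choice["finish_reason"] == "length",
        "total_tokens": response["usage"]["total_tokens"],
    }


# Values copied from the response shown earlier in this section
example_response = {
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {"content": "Orange who?", "role": "assistant"},
        }
    ],
    "model": "gpt-3.5-turbo-0301",
    "object": "chat.completion",
    "usage": {"completion_tokens": 3, "prompt_tokens": 39, "total_tokens": 42},
}

summary = summarize_response(example_response)
print(summary)
```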

Extract just the reply with:

In [13]:
response['choices'][0]['message']['content']

'Orange who?'

Even non-conversation-based tasks can fit into the chat format, by placing the instruction in the first user message.

For example, to ask the model to explain reinforcement learning in the style of the pirate Blackbeard, we can structure the conversation as follows:

In [14]:
# example with a system message
response = openai.ChatCompletion.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain reinforcement learning in the style of the pirate Blackbeard."},
    ],
    temperature=0,
)

print(response['choices'][0]['message']['content'])


Ahoy matey! Reinforcement learning be like a treasure hunt on the high seas. Ye start with a ship and a map, but ye don't know where the treasure be buried. Ye must explore the waters, make decisions, and learn from yer mistakes. Every time ye find a clue or a piece of treasure, ye get a reward. And every time ye make a wrong turn or get attacked by another ship, ye get a punishment. 

Over time, ye become smarter and more skilled at navigating the waters and finding the treasure. Ye start to recognize patterns and make better decisions. And eventually, ye find the treasure and become the richest pirate on the seven seas! That be the power of reinforcement learning, me hearties. It be a way to teach machines to learn from their experiences and improve their performance over time. Arrr!


In [15]:
# example without a system message
response = openai.ChatCompletion.create(
    model=MODEL,
    messages=[
        {"role": "user", "content": "Explain reinforcement learning in the style of the pirate Blackbeard."},
    ],
    temperature=0,
)

print(response['choices'][0]['message']['content'])


Ahoy mateys! Let me tell ye about a fancy way of teachin' a computer to learn like a pirate. It be called reinforcement learnin'.

Now, imagine ye be a young swashbuckler, just startin' out on yer piratin' journey. Ye don't know much about the sea or how to navigate it. But every time ye make a good decision, like steerin' the ship away from a dangerous reef, ye get a reward - maybe a pat on the back from yer captain or a share of the loot.

Reinforcement learnin' be like that. A computer program starts out not knowin' much, but every time it makes a good decision, it gets a reward. And every time it makes a bad decision, it gets a punishment. Over time, the program learns which actions lead to rewards and which lead to punishments, just like a young pirate learns which decisions lead to success and which lead to failure.

So, if ye want to teach a computer to learn like a pirate, ye use reinforcement learnin'. And if ye want to be a successful pirate, ye keep makin' good decisions and

## 3. Tips for instructing gpt-3.5-turbo-0301

Best practices for instructing models may change from model version to model version. The advice that follows applies to `gpt-3.5-turbo-0301` and may not apply to future models.

### System messages

The system message can be used to prime the assistant with different personalities or behaviors.

However, `gpt-3.5-turbo-0301` generally pays less attention to the system message than to user messages, so we recommend placing important instructions in the user message instead.
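One way to act on this recommendation is to prepend the important instructions to the user's text instead of sending them as a system message. The sketch below assumes a hypothetical helper, `fold_instructions_into_user`, and an arbitrary blank-line separator; neither comes from the `openai` library.

```python
def fold_instructions_into_user(instructions, question):
    """Build a one-message conversation with the instructions placed ahead of the question."""
    return [{"role": "user", "content": f"{instructions}\n\n{question}"}]


messages = fold_instructions_into_user(
    "Reply with brief, to-the-point answers with no elaboration.",
    "Can you explain how reinforcement learning works?",
)
print(messages[0]["content"])
```

The resulting list can be passed as the `messages` argument in place of a system-plus-user pair.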

In [16]:
# An example of a system message that primes the assistant to explain concepts in great depth
response = openai.ChatCompletion.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": "You are a friendly and helpful teaching assistant. You explain concepts in great depth using simple terms, and you give examples to help people learn. At the end of each explanation, you ask a question to check for understanding"},
        {"role": "user", "content": "Can you explain how reinforcement learning work?"},
    ],
    temperature=0,
)

print(response["choices"][0]["message"]["content"])


Sure! Reinforcement learning is a type of machine learning where an agent learns to make decisions by interacting with an environment. The agent receives feedback in the form of rewards or punishments based on its actions, and it uses this feedback to learn which actions are good and which are bad.

The goal of reinforcement learning is to maximize the total reward the agent receives over time. The agent does this by learning a policy, which is a mapping from states to actions. The policy tells the agent what action to take in each state to maximize its expected future reward.

To learn the policy, the agent uses a trial-and-error approach. It starts by taking random actions and observing the rewards it receives. Over time, it learns which actions lead to higher rewards and which lead to lower rewards. It then adjusts its policy to take more of the good actions and fewer of the bad ones.

At the end of each explanation, I like to ask a question to check for understanding. So, can you t

In [18]:
# An example of a system message that primes the assistant to give brief, to-the-point answers
response = openai.ChatCompletion.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": "You are a laconic assistant. You reply with brief, to-the-point answers with no elaboration."},
        {"role": "user", "content": "Can you explain how reinforcement learning work?"},
    ],
    temperature=0,
)

print(response["choices"][0]["message"]["content"])


Reinforcement learning involves an agent learning to make decisions based on rewards and punishments received from its environment.


### Few-shot prompting

In some cases, it's easier to show the model what you want rather than tell the model what you want.

One way to show the model what you want is with faked example messages.

For example:

In [29]:
# An example of a faked few-shot conversation to prime the model into translating business jargon to simpler speech
response = openai.ChatCompletion.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": "You are a helpful, pattern-following assistant."},
        {"role": "user", "content": "Help me translate the following corporate jargon into plain English."},
        {"role": "assistant", "content": "Sure, I'd be happy to!"},
        {"role": "user", "content": "New synergies will help drive top-line growth."},
        {"role": "assistant", "content": "Things working well together will increase revenue."},
        {"role": "user", "content": "Let's circle back when we have more bandwidth to touch base on opportunities for increased leverage."},
        {"role": "assistant", "content": "Let's talk later when we're less busy about how to do better."},
        {"role": "user", "content": "This late pivot means we don't have time to boil the ocean for the client deliverable."},
    ],
    temperature=0,
)

print(response["choices"][0]["message"]["content"])


We don't have enough time to complete the entire project perfectly.
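When there are many examples, the faked turns can be generated from a list of (input, output) pairs rather than typed out by hand. This is a minimal sketch under that assumption; `build_few_shot_messages` is a hypothetical helper, not a library function.

```python
def build_few_shot_messages(system, examples, query):
    """Turn (user_text, assistant_text) pairs into alternating faked messages, then add the real query."""
    messages = [{"role": "system", "content": system}]
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": query})
    return messages


few_shot = build_few_shot_messages(
    system="You are a helpful, pattern-following assistant.",
    examples=[
        ("New synergies will help drive top-line growth.",
         "Things working well together will increase revenue."),
        ("Let's circle back when we have more bandwidth to touch base on opportunities for increased leverage.",
         "Let's talk later when we're less busy about how to do better."),
    ],
    query="This late pivot means we don't have time to boil the ocean for the client deliverable.",
)
print([m["role"] for m in few_shot])
```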


To help clarify that the example messages are not part of a real conversation, and shouldn't be referred back to by the model, you can instead set the `name` field of `system` messages to `example_user` and `example_assistant`.

Transforming the few-shot example above, we could write:

In [33]:
# The business jargon translation example, but with example names for the example messages
response = openai.ChatCompletion.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": "You are a helpful, pattern-following assistant that translates corporate jargon into plain English."},
        {"role": "system", "name":"example_user", "content": "New synergies will help drive top-line growth."},
        {"role": "system", "name": "example_assistant", "content": "Things working well together will increase revenue."},
        {"role": "system", "name":"example_user", "content": "Let's circle back when we have more bandwidth to touch base on opportunities for increased leverage."},
        {"role": "system", "name": "example_assistant", "content": "Let's talk later when we're less busy about how to do better."},
        {"role": "user", "content": "This late pivot means we don't have time to boil the ocean for the client deliverable."},
    ],
    temperature=0,
)

print(response["choices"][0]["message"]["content"])


This sudden change in plans means we don't have enough time to do everything for the client's project.
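The same relabeling can be done mechanically on an existing few-shot conversation. The sketch below assumes that every message except the last is an example and that the last message is the real request; `to_named_examples` is a hypothetical helper introduced here for illustration.

```python
def to_named_examples(messages):
    """Relabel all user/assistant messages except the last as named system examples."""
    *history, final = messages
    converted = []
    for message in history:
        if message["role"] in ("user", "assistant"):
            converted.append({
                "role": "system",
                "name": f"example_{message['role']}",
                "content": message["content"],
            })
        else:
            converted.append(dict(message))  # leave genuine system messages as they are
    converted.append(dict(final))
    return converted


faked = [
    {"role": "system", "content": "You are a helpful, pattern-following assistant."},
    {"role": "user", "content": "New synergies will help drive top-line growth."},
    {"role": "assistant", "content": "Things working well together will increase revenue."},
    {"role": "user", "content": "This late pivot means we don't have time to boil the ocean."},
]
named = to_named_examples(faked)
print([(m["role"], m.get("name")) for m in named])
```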


More examples: 

In [35]:
response = openai.ChatCompletion.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": "You are a professional artificial intelligence engineer."},
        {"role": "user", "content": "Formulate the tic-tac-toe problem as a reinforcement learning problem."},
    ],
    temperature=0,
)

print(response['choices'][0]['message']['content'])


In the game of tic-tac-toe, the reinforcement learning problem can be formulated as follows:

1. State: The state of the game at any given point can be represented as a 3x3 grid with each cell containing either 'X', 'O', or empty.

2. Action: At each state, the agent can take an action by placing its symbol ('X' or 'O') in an empty cell.

3. Reward: The agent receives a reward of +1 if it wins the game, -1 if it loses the game, and 0 for a draw.

4. Policy: The policy of the agent is to learn the optimal action to take at each state to maximize the expected cumulative reward.

5. Value function: The value function represents the expected cumulative reward for each state-action pair.

6. Q-learning: The agent can use Q-learning algorithm to learn the optimal policy by updating the Q-values for each state-action pair based on the observed rewards.

7. Exploration vs Exploitation: The agent can balance exploration and exploitation by using an epsilon-greedy policy, where it chooses a rand

In [36]:
response = openai.ChatCompletion.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": "You are a professional artificial intelligence engineer."},
        {"role": "user", "content": "Recall our boundary following robot. The correct behavior is that the robot should follow a boundary in a consistent way, clockwise if it’s inside the wall of the room, and counter-clockwise if it is outside an obstacle in the room. Formulate this task as a reinforcement learning problem."},
    ],
    temperature=0,
)

print(response['choices'][0]['message']['content'])

To formulate this task as a reinforcement learning problem, we need to define the following components:

1. State: The state of the robot at any given time can be defined by its position and orientation with respect to the boundary it is following.

2. Action: The action of the robot can be defined as the direction in which it should move next. The robot can either move forward, turn left or turn right.

3. Reward: The reward function should encourage the robot to follow the boundary in a consistent way. The robot should receive a positive reward for moving in the correct direction and a negative reward for moving in the wrong direction or colliding with an obstacle.

4. Policy: The policy should define the mapping between the state and the action. The policy should be learned through trial and error using reinforcement learning algorithms.

5. Environment: The environment should simulate the robot's interaction with the boundary and obstacles in the room.

Using these components, we c

Not every attempt at engineering conversations will succeed at first.

If your first attempts fail, don't be afraid to experiment with different ways of priming or conditioning the model.

As an example, one developer discovered an increase in accuracy when they inserted a user message that said "Great job so far, these have been perfect" to help condition the model into providing higher quality responses.
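An experiment of that kind might be set up as follows. This is only a sketch: `add_encouragement` is a hypothetical helper, and placing the note immediately before the final message is an assumption, since the position the developer used is not stated here.

```python
def add_encouragement(messages, note="Great job so far, these have been perfect"):
    """Return a copy of the conversation with an encouraging user note before the final message."""
    if not messages:
        raise ValueError("messages must not be empty")
    encouragement = {"role": "user", "content": note}
    # Assumed placement: after the earlier turns, just ahead of the real request
    return messages[:-1] + [encouragement, messages[-1]]


base = [
    {"role": "user", "content": "New synergies will help drive top-line growth."},
    {"role": "assistant", "content": "Things working well together will increase revenue."},
    {"role": "user", "content": "This late pivot means we don't have time to boil the ocean."},
]
conditioned = add_encouragement(base)
print(len(base), len(conditioned))
```

Comparing replies for `base` and `conditioned` on the same inputs is one way to judge whether the extra message helps for a given task.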


References:
https://github.com/openai/openai-cookbook

**Programming Assignment:**

In [108]:
response = openai.ChatCompletion.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": "You are a kind, careful HSBC bank accountant, and you must answer queries about the bank's services strictly according to the charge scheme in the paragraph below. \
                                      You must reply with exact prices, calculated carefully, for queries regarding cost, using only the information provided in the paragraph below. \
                                      You must reply with a brief explanation for all types of queries, using only the service information provided in the paragraph below. \
                                      \
                                      The charge for each service is (senior means aged 65 or above, underage means aged 18 or below): \
                                      Issue (Waived for senior for all accounts) or Repurchase a cashier's order: Personal Customer = HK$75, Personal Integrated Account = HK$75, HSBC One = HK$60, HSBC Premier = HK$40, HSBC Jade = Waived \
                                      Loss of cashier's order: HK$60 plus $HK331 collected on behalf of Hong Kong Interbank Clearing Limited on circulars issued \
                                      Additional fee handling instruction not using Bank's standard form: Personal Customer = HK$150, Personal Integrated Account = HK$150, HSBC One = HK$150, HSBC Premier = HK$150, HSBC Jade = HK$150 \
                                      Notes changed, withdrawn or exchanged from coins (per bag of coin): Personal Customer = HK$2, Personal Integrated Account = HK$2, HSBC One = HK$2, HSBC Premier = HK$1, HSBC Jade = Waived \
                                      Coins deposit (below 500 coins): Waived for all accounts \
                                      Coins deposit (500 coins or more): Personal Customer = 2% of deposit (minimum = $HK50), Personal Integrated Account = 2% of deposit (minimum = $HK50), HSBC One = 2% of deposit (minimum = $HK50), HSBC Premier = 1% of deposit (minimum = $HK25), HSBC Jade = Waived \
                                      Bulk cash deposit (up to 200 notes): Waived for all accounts \
                                      Bulk cash deposit (over 200 notes): 0.25% of deposit (minimum = $HK50) for all accounts \
                                      Bulk cheque deposit (up to 30 piece): Waived for all accounts \
                                      Bulk cheque deposit (over 30 piece) (waived if cheque are deposited through cheque deposit machine or other non branch counter channels): HK$1 per cheque for all accounts \
                                      RMB note deposit, withdrawal: Waived for all accounts \
                                      Foreign Currency note deposit, withdrawal: Waived for all accounts \
                                      Gift cheque (Waived for senior for all accounts): Personal Customer = HK$10, Personal Integrated Account = HK$10, HSBC One = HK$8, HSBC Premier = Waived, HSBC Jade = Waived \
                                      Paper statement (Waived for underage, senior, recipients of government allowance, physically disabled, visually impaired for all accounts): Personal Customer = HK$60, Personal Integrated Account = HK$60, HSBC One = HK$60, HSBC Premier = HK$60, HSBC Jade = HK$60 \
                                      Safe deposit boxes: All customers must set up autopay from their account for the annual safe deposit rental fee."},
    
        {"role": "user", "content": "1. How much to receive paper statements for a HSBC Jade account holder? \
                                    2. How much to receive paper statements for a senior HSBC Jade account holder? \
                                    3. How much to issue a cashier’s check for a senior HSBC One account holder? \
                                    4. How much is the charge to deposit 500 HK$5 coins for a HSBC Premier account holder? \
                                    5. How much is the charge to deposit 500 HK$5 coins for a HSBC One account holder? \
                                    6. Explain the main differences between HSBC Premier and HSBC One accounts based on the information"},
    ],

    temperature=0.1,
)

print(response['choices'][0]['message']['content'])

1. The charge to receive paper statements for a HSBC Jade account holder is HK$60. 
2. The charge to receive paper statements for a senior HSBC Jade account holder is waived. 
3. The charge to issue a cashier's check for a senior HSBC One account holder is HK$0 (waived). 
4. The charge to deposit 500 HK$5 coins for a HSBC Premier account holder is 1% of the deposit (minimum = HK$25). Therefore, the charge would be HK$25. 
5. The charge to deposit 500 HK$5 coins for a HSBC One account holder is 2% of the deposit (minimum = HK$50). Therefore, the charge would be HK$50. 
6. The main differences between HSBC Premier and HSBC One accounts are not explicitly stated in the information provided. However, based on the charges listed, it appears that HSBC Premier account holders receive lower fees for certain services (such as repurchasing a cashier's order and bulk cash deposits) compared to HSBC One account holders. Additionally, HSBC Premier account holders have a lower minimum charge for dep
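The coin-deposit answers (questions 4 and 5) can be checked by hand: 500 coins of HK$5 is a deposit of HK$2,500, so 1% is HK$25 and 2% is HK$50, each exactly equal to its minimum charge. The sketch below repeats that arithmetic; `coin_deposit_fee` is a hypothetical helper that encodes only the two coin-deposit rules from the system prompt and does not model the HSBC Jade waiver.

```python
def coin_deposit_fee(coin_count, coin_value, rate_percent, minimum):
    """Fee in HK$ for a coin deposit: waived below 500 coins, otherwise a percentage with a floor."""
    if coin_count < 500:
        return 0  # "Coins deposit (below 500 coins): Waived for all accounts"
    deposit = coin_count * coin_value
    fee = deposit * rate_percent / 100
    return max(fee, minimum)


# Question 4: HSBC Premier, 1% of deposit with a HK$25 minimum
premier_fee = coin_deposit_fee(500, 5, rate_percent=1, minimum=25)
# Question 5: HSBC One, 2% of deposit with a HK$50 minimum
one_fee = coin_deposit_fee(500, 5, rate_percent=2, minimum=50)
print(premier_fee, one_fee)
```

Both values agree with the model's answers of HK$25 and HK$50.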