#ChatGPT Prompting Guidelines

###There are broadly two types of LLMs

####Base LLM : Predicts next word, based on text training data often trained on large amount of data found on the internet

####Instruction Tuned LLM : Tries to follow instructions as it is Fine-tuned on instructions and good attempts at following those instructions and then Reinforcement Learning with Human Feedback(RHLF)

##Guidelines for Prompting

Two Key Principles for writing prompts:

1. Write clear and specific instructions
2. Give the model time to think

In [1]:
!pip install openai



In [14]:
from openai import OpenAI
import os

In [13]:
from google.colab import userdata

os.environ['OPENAI_API_KEY'] = userdata.get('openai')

In [21]:
api_key = os.getenv('OPENAI_API_KEY')
client = OpenAI(api_key=api_key)

In [22]:
def get_completion(prompt, model='gpt-3.5-turbo'):
  messages = [{'role':'user','content':prompt}]
  response = client.chat.completions.create(
      model=model,
      messages=messages,
      temperature=0,
  )
  return response.choices[0].message.content

##Principle 1 : Write clear and specific instructions|

##Tactics


###Tactic 1: Use delimiters to clearly indicate distinct parts of the input
- Delimiters can be anything like: ```, ---,""", < >, `<tag> </tag>`, `:`

In [24]:
text = """
You should express what you want a model to do by
providing instructions that are as clear and
specific as you can possibly make them.
This will guide the model towards the desired output,
and reduce the chances of receiving irrelevant
or incorrect responses. Don't confuse writing a
clear prompt with writing a short prompt.
In many cases, longer prompts provide more clarity
and context for the model, which can lead to
more detailed and relevant outputs.
"""

prompt = f"""
Summarize the text delimited by triple backticks
into a single sentence.

```{text}```
"""

response = get_completion(prompt)
print(response)

It is important to provide clear and specific instructions to guide a model towards the desired output and reduce the chances of receiving irrelevant or incorrect responses, even if it means writing longer prompts for more clarity and context.


Using Delimiters helps to avoid prompt injections(conflicting instructions to the model)

###Tactic 2: Ask for structured output

HTML, JSON

In [26]:
prompt = f"""
Generate a list of three made-up book titles along
with their authors and genres.
Provide them in JSON format with the following keys:
book_id, title, author, genre.
"""
response = get_completion(prompt)
print(response)

[
    {
        "book_id": 1,
        "title": "The Midnight Garden",
        "author": "Elena Rivers",
        "genre": "Fantasy"
    },
    {
        "book_id": 2,
        "title": "Echoes of the Past",
        "author": "Nathan Black",
        "genre": "Mystery"
    },
    {
        "book_id": 3,
        "title": "Whispers in the Wind",
        "author": "Samantha Reed",
        "genre": "Romance"
    }
]


###Tactic 3 : Ask the model to check whether conditions are satisfied

Check assumptions required to do the task

In [27]:
text_1 = f"""
Making a cup of tea is easy! First, you need to get some
water boiling. While that's happening,
grab a cup and put a tea bag in it. Once the water is
hot enough, just pour it over the tea bag.
Let it sit for a bit so the tea can steep. After a
few minutes, take out the tea bag. If you
like, you can add some sugar or milk to taste.
And that's it! You've got yourself a delicious
cup of tea to enjoy.
"""
prompt = f"""
You will be provided with text delimited by triple quotes.
If it contains a sequence of instructions,
re-write those instructions in the following format:

Step 1 - ...
Step 2 - …
…
Step N - …

If the text does not contain a sequence of instructions,
then simply write \"No steps provided.\"

\"\"\"{text_1}\"\"\"
"""
response = get_completion(prompt)
print("Completion for Text 1:")
print(response)

Completion for Text 1:

Step 1 - Get some water boiling.
Step 2 - Grab a cup and put a tea bag in it.
Step 3 - Pour the hot water over the tea bag.
Step 4 - Let the tea steep for a few minutes.
Step 5 - Take out the tea bag.
Step 6 - Add sugar or milk to taste.
Step 7 - Enjoy your delicious cup of tea.


In [28]:
text_2 = f"""
The sun is shining brightly today, and the birds are
singing. It's a beautiful day to go for a
walk in the park. The flowers are blooming, and the
trees are swaying gently in the breeze. People
are out and about, enjoying the lovely weather.
Some are having picnics, while others are playing
games or simply relaxing on the grass. It's a
perfect day to spend time outdoors and appreciate the
beauty of nature.
"""
prompt = f"""
You will be provided with text delimited by triple quotes.
If it contains a sequence of instructions,
re-write those instructions in the following format:

Step 1 - ...
Step 2 - …
…
Step N - …

If the text does not contain a sequence of instructions,
then simply write \"No steps provided.\"

\"\"\"{text_2}\"\"\"
"""
response = get_completion(prompt)
print("Completion for Text 2:")
print(response)

Completion for Text 2:
No steps provided.


###Tactic 4 : "Few-shot" prompting

Give successful examples of completing the tasks and then ask the model to perform the task

In [29]:
prompt = f"""
Your task is to answer in a consistent style.

<child>: Teach me about patience.

<grandparent>: The river that carves the deepest
valley flows from a modest spring; the
grandest symphony originates from a single note;
the most intricate tapestry begins with a solitary thread.

<child>: Teach me about resilience.
"""
response = get_completion(prompt)
print(response)

<grandparent>: Resilience is like a tree that bends in the storm but does not break. It is the ability to bounce back from adversity, to persevere in the face of challenges, and to find strength in difficult times. Just as a tree grows stronger with each storm it weathers, so too can we grow stronger through resilience.


The second principle is to give the model time to think.
If a model is making reasoning errors by
rushing to an incorrect conclusion, you should try reframing the query
to request a chain or series of relevant reasoning
before the model provides its final answer. Another way to think about
this is that if you give a model a task that's
too complex for it to do in a short amount
of time or in a small number of words, it
may make up a guess which is likely to be incorrect. And
you know, this would happen for a person too. If
you ask someone to complete a complex math
question without time to work out the answer first, they
would also likely make a mistake. So, in these situations, you
can instruct the model to think longer about a problem, which
means it's spending more computational effort on
the task.

##Principle 2 : Give the model time to think


###Tactic 1 : Specify the steps required to complete the task

In [30]:
text = f"""
In a charming village, siblings Jack and Jill set out on
a quest to fetch water from a hilltop
well. As they climbed, singing joyfully, misfortune
struck—Jack tripped on a stone and tumbled
down the hill, with Jill following suit.
Though slightly battered, the pair returned home to
comforting embraces. Despite the mishap,
their adventurous spirits remained undimmed, and they
continued exploring with delight.
"""
# example 1
prompt_1 = f"""
Perform the following actions:
1 - Summarize the following text delimited by triple
backticks with 1 sentence.
2 - Translate the summary into French.
3 - List each name in the French summary.
4 - Output a json object that contains the following
keys: french_summary, num_names.

Separate your answers with line breaks.

Text:
```{text}```
"""
response = get_completion(prompt_1)
print("Completion for prompt 1:")
print(response)

Completion for prompt 1:
1 - Jack and Jill go on a quest to fetch water, but encounter misfortune along the way, yet remain adventurous.

2 - Jack et Jill partent en quête d'eau, mais rencontrent des malheurs en chemin, tout en restant aventureux.

3 - Jack, Jill

4 - 
{
  "french_summary": "Jack et Jill partent en quête d'eau, mais rencontrent des malheurs en chemin, tout en restant aventureux.",
  "num_names": 2
}


Ask a specified format

In [31]:
prompt_2 = f"""
Your task is to perform the following actions:
1 - Summarize the following text delimited by
  <> with 1 sentence.
2 - Translate the summary into French.
3 - List each name in the French summary.
4 - Output a json object that contains the
  following keys: french_summary, num_names.

Use the following format:
Text: <text to summarize>
Summary: <summary>
Translation: <summary translation>
Names: <list of names in summary>
Output JSON: <json with summary and num_names>

Text: <{text}>
"""
response = get_completion(prompt_2)
print("\nCompletion for prompt 2:")
print(response)


Completion for prompt 2:
Summary: Jack and Jill, two siblings, go on a quest to fetch water from a hilltop well but encounter misfortune along the way.

Translation: Jack et Jill, deux frères et sœurs, partent en quête d'eau d'un puits au sommet d'une colline mais rencontrent des malheurs en chemin.

Names: Jack, Jill

Output JSON: 
{
  "french_summary": "Jack et Jill, deux frères et sœurs, partent en quête d'eau d'un puits au sommet d'une colline mais rencontrent des malheurs en chemin.",
  "num_names": 2
}


The next tactic is to instruct the model to work out its own
solution before rushing to a conclusion. And again, sometimes
we get better results when we kind of explicitly
instruct the models to reason out its own solution
before coming to a conclusion. And this is kind of
the same idea that we were discussing about
giving the model time to actually work things
out before just kind of saying if an
answer is correct or not, in the same way that a person would.

###Tactic 2: Instruct the model to work out its own solution before rushing to a conclusion

In [33]:
prompt = f"""
Determine if the student's solution is correct or not.

Question:
I'm building a solar power installation and I need
 help working out the financials.
- Land costs $100 / square foot
- I can buy solar panels for $250 / square foot
- I negotiated a contract for maintenance that will cost
me a flat $100k per year, and an additional $10 / square
foot
What is the total cost for the first year of operations
as a function of the number of square feet.

Student's Solution:
Let x be the size of the installation in square feet.
Costs:
1. Land cost: 100x
2. Solar panel cost: 250x
3. Maintenance cost: 100,000 + 100x
Total cost: 100x + 250x + 100,000 + 100x = 450x + 100,000
"""
response = get_completion(prompt)
print(response)

The student's solution is correct. The total cost for the first year of operations as a function of the number of square feet is indeed 450x + 100,000.


The student solution above is wrong because it just skimmed and arrived at the same conclusion. We can fix this by instructing the model to work out its own solution first

In [35]:
prompt = f"""
Your task is to determine if the student's solution
is correct or not.
To solve the problem do the following:
- First, work out your own solution to the problem including the final total.
- Then compare your solution to the student's solution
and evaluate if the student's solution is correct or not.
Don't decide if the student's solution is correct until
you have done the problem yourself.

Use the following format:
Question:
```
question here
```
Student's solution:
```
student's solution here
```
Actual solution:
```
steps to work out the solution and your solution here
```
Is the student's solution the same as actual solution
just calculated:
```
yes or no
```
Student grade:
```
correct or incorrect
```

Question:
```
I'm building a solar power installation and I need help
working out the financials.
- Land costs $100 / square foot
- I can buy solar panels for $250 / square foot
- I negotiated a contract for maintenance that will cost
me a flat $100k per year, and an additional $10 / square
foot
What is the total cost for the first year of operations
as a function of the number of square feet.
```
Student's solution:
```
Let x be the size of the installation in square feet.
Costs:
1. Land cost: 100x
2. Solar panel cost: 250x
3. Maintenance cost: 100,000 + 100x
Total cost: 100x + 250x + 100,000 + 100x = 450x + 100,000
```
Actual solution:
"""
response = get_completion(prompt)
print(response)

To calculate the total cost for the first year of operations, we need to add up the costs of land, solar panels, and maintenance.

Given:
- Land cost: $100 / square foot
- Solar panel cost: $250 / square foot
- Maintenance cost: $100,000 flat + $10 / square foot

Let x be the size of the installation in square feet.

Costs:
1. Land cost: $100x
2. Solar panel cost: $250x
3. Maintenance cost: $100,000 + $10x

Total cost: $100x + $250x + $100,000 + $10x = $360x + $100,000

Therefore, the total cost for the first year of operations as a function of the number of square feet is $360x + $100,000.

Is the student's solution the same as the actual solution just calculated:
```
No
```
Student grade:
```
Incorrect
```


This is an example of how asking the model to do a calculation itself and breaking
down the task into steps to give the
model more time to think can help you
get more accurate responses.

##Model Limitations: Hallucinations

It's really important to keep these in
mind while you're kind of developing applications with large language models.
So, even though the language model has been exposed to
a vast amount of knowledge during its training process,
it has not perfectly memorized the information
it's seen, and so, it doesn't know the boundary of
its knowledge very well. This means that it might
try to answer questions about obscure topics and can
make things up that sound plausible but are not actually true. And
we call these fabricated ideas hallucinations.

In [36]:
prompt = f"""
Tell me about AeroGlide UltraSlim Smart Toothbrush by Boie
"""
response = get_completion(prompt)
print(response)

The AeroGlide UltraSlim Smart Toothbrush by Boie is a high-tech toothbrush designed to provide a superior cleaning experience. It features ultra-soft bristles that are gentle on the gums and teeth, while still effectively removing plaque and debris. The toothbrush also has a slim design that makes it easy to maneuver and reach all areas of the mouth.

One of the standout features of the AeroGlide UltraSlim Smart Toothbrush is its smart technology. It connects to a mobile app that tracks your brushing habits and provides personalized recommendations for improving your oral hygiene routine. The app also includes a timer to ensure you are brushing for the recommended two minutes.

The toothbrush is made from durable, antimicrobial materials that resist bacteria growth and can be easily cleaned and sanitized. It is also eco-friendly, as the brush head is replaceable and the handle is made from recyclable materials.

Overall, the AeroGlide UltraSlim Smart Toothbrush by Boie is a sleek and i

- Boie is a real company, the product name is not real.

This is an example of
where the model confabulates a description of a
made-up product name from a real toothbrush company.

The tactic to reduce hallucinations, in the
case that you want the model to kind of
generate answers based on a text, is to ask
the model to first find any relevant quotes from the text and
then ask it to use those quotes to kind of answer questions.
And kind of having a way to trace the answer back
to the source document is often pretty helpful
to kind of reduce these hallucinations.