In [1]:
# Copyright 2023 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

In [2]:
# Create a SERPAPI_API_KEY for free at https://serpapi.com/ to use (among others) the Google Search API
import os
os.environ["SERPAPI_API_KEY"] = "YOUR_API_KEY_HERE"  # replace with your own key

# Prompt Design - Best Practices




## Overview

This notebook covers the essentials of prompt engineering, including some best practices.

Learn more about prompt design in the [official documentation](https://cloud.google.com/vertex-ai/docs/generative-ai/text/text-overview).

### Objective

In this notebook, you learn best practices around prompt engineering -- how to design prompts to improve the quality of your responses.

This notebook covers the following best practices for prompt engineering:

- Be concise
- Be specific and well-defined
- Ask one task at a time
- Turn generative tasks into classification tasks
- Improve response quality by including examples

### Costs
This tutorial uses billable components of Google Cloud:

* Vertex AI Generative AI Studio

Learn about [Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing),
and use the [Pricing Calculator](https://cloud.google.com/products/calculator/)
to generate a cost estimate based on your projected usage.

### Install Vertex AI SDK

In [3]:
!pip install "shapely<2.0.0"
!pip install google-cloud-aiplatform --upgrade --user



**Colab only:** Uncomment the following cell to restart the kernel, or use the button to restart it. For Vertex AI Workbench, you can restart the kernel using the button at the top.

In [2]:
# # Automatically restart kernel after installs so that your environment can access the new packages
# import IPython

# app = IPython.Application.instance()
# app.kernel.do_shutdown(True)

### Authenticating your notebook environment
* If you are using **Colab** to run this notebook, uncomment the cell below and continue.
* If you are using **Vertex AI Workbench**, check out the setup instructions [here](https://github.com/GoogleCloudPlatform/generative-ai/tree/main/setup-env).

In [None]:
# from google.colab import auth
# auth.authenticate_user()

### Import libraries

**Colab only:** Uncomment the following cell to initialize the Vertex AI SDK. For Vertex AI Workbench, you don't need to run this.  

In [None]:
# import vertexai

# PROJECT_ID = "[your-project-id]"  # @param {type:"string"}
# vertexai.init(project=PROJECT_ID, location="us-central1")

In [10]:
from vertexai.language_models import TextGenerationModel

### Load model

In [11]:
generation_model = TextGenerationModel.from_pretrained("text-bison@001")

### Model Settings
Use these two settings to control the length of your output:

**Output Length:** This is the max number of tokens that can be generated in the response. The higher the limit, the more likely the generated response is to veer off into unexpected territory. Consider shortening the output length for a more predictable response.

**Add stop sequence:** This tells the model when to stop generating content. For example, prompting the model with “The quick brown fox jumps over the ” and entering a full stop (.) for the stop sequence will tell the model to stop generating text once it reaches the end of the first sentence, regardless of the output length limit - in this case, the output will likely be "lazy dog." Use this where you have a couple of examples, but would like the output to follow a certain pattern. For example, you can tell the model to generate lists that have no more than 10 items by adding "11" as a stop sequence.
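
To make the two length controls concrete, here is a minimal sketch that simulates them on a list of already-generated tokens. It does not call the model; `apply_limits` is a hypothetical helper written for this illustration, and it assumes the stop sequence itself is dropped from the output (whether it is kept can differ between APIs).

```python
def apply_limits(tokens, max_output_tokens, stop_sequences=()):
    """Simulate how an output-length cap and stop sequences cut a generation short."""
    kept = []
    # The length cap simply limits how many tokens are ever considered.
    for token in tokens[:max_output_tokens]:
        kept.append(token)
        text = "".join(kept)
        # As soon as any stop sequence appears, cut the text right before it.
        for stop in stop_sequences:
            index = text.find(stop)
            if index != -1:
                return text[:index]
    return "".join(kept)


tokens = ["lazy", " dog", ".", " The", " end"]
print(apply_limits(tokens, max_output_tokens=10, stop_sequences=["."]))
print(apply_limits(tokens, max_output_tokens=2))
```

The first call stops at the full stop even though the length limit would allow more tokens; the second call is cut off by the length limit alone.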

Use these three settings to control how random the output is:

**Temperature:** Helps adjust how creative or conservative the model should be. It adjusts the probabilities of the predicted words, on a scale between 0 and 1. Lower temperatures (<0.5) are good for prompts that require higher likelihood of accuracy - a temperature of 0 is deterministic, where the probability of the most likely outcome (token) is 1, and the rest are 0. Use a higher temperature (>0.5) to generate more creative, "fun" results. We will use 0.1 as a standard best practice.

**topK:** Tells the model to pick the next token from the top k tokens in its list, sorted by probability. A topK value of 40 tells the model to pick from the top 40 tokens, and exclude the rest.

**topP:** Randomly samples from the top tokens based on the sum of their probabilities. In MakerSuite, the default topP is 0.95, meaning the model will pick from tokens whose probabilities add up to 95%, and exclude the bottom 5%.

You can play around with these settings depending on your needs and use case.
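
The sketch below illustrates how temperature, topK and topP could reshape a next-token distribution. It is a toy model of the idea, not the sampler the service actually runs: the logits are made up, the function names are hypothetical, and the order of the filters (temperature, then topK, then topP) is an assumption.

```python
import math
import random


def next_token_distribution(logits, temperature=1.0, top_k=None, top_p=None):
    """Return renormalized token probabilities after temperature, top-k and top-p filtering."""
    if temperature == 0:
        # Temperature 0 is treated as greedy decoding: all mass on the most likely token.
        best = max(logits, key=logits.get)
        return {best: 1.0}

    # Softmax over temperature-scaled logits (subtract the max for numerical stability).
    scaled = {token: logit / temperature for token, logit in logits.items()}
    peak = max(scaled.values())
    weights = {token: math.exp(value - peak) for token, value in scaled.items()}
    total = sum(weights.values())
    ranked = sorted(
        ((token, weight / total) for token, weight in weights.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )

    if top_k is not None:
        # Keep only the k most probable tokens.
        ranked = ranked[:top_k]

    if top_p is not None:
        # Keep the smallest prefix whose cumulative probability reaches top_p.
        kept, cumulative = [], 0.0
        for token, probability in ranked:
            kept.append((token, probability))
            cumulative += probability
            if cumulative >= top_p:
                break
        ranked = kept

    norm = sum(probability for _, probability in ranked)
    return {token: probability / norm for token, probability in ranked}


def sample_token(logits, rng=random, **settings):
    """Draw one token from the filtered distribution."""
    distribution = next_token_distribution(logits, **settings)
    tokens = list(distribution)
    return rng.choices(tokens, weights=[distribution[t] for t in tokens], k=1)[0]


toy_logits = {"dog": 2.0, "cat": 1.0, "fox": 0.0, "emu": -1.0}  # made-up scores
print(next_token_distribution(toy_logits, temperature=0.5, top_k=2))
print(sample_token(toy_logits, temperature=0))
```

Lowering the temperature concentrates probability on the top token, while topK and topP remove the tail of unlikely tokens before sampling.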

## Prompt engineering best practices (a lot more at the end!)

Prompt engineering is all about how to design your prompts so that the response is what you were indeed hoping to see.

The idea of using "unfancy" prompts is to minimize the noise in your prompt to reduce the possibility of the LLM misinterpreting the intent of the prompt. Below are a few guidelines on how to engineer "unfancy" prompts.

In this section, you'll cover the following best practices when engineering prompts:

* Be concise
* Be specific, and well-defined
* Ask one task at a time
* Improve response quality by including examples
* Turn generative tasks to classification tasks to improve safety

### Be concise

🛑 Not recommended. The prompt below is unnecessarily verbose.

In [39]:
prompt = "What do you think could be a good name for a flower shop that specializes in selling bouquets of dried flowers more than fresh flowers? Thank you!"

print(generation_model.predict(prompt=prompt, max_output_tokens=1024, temperature=0.1).text)

* **Dried & Lovely**
* **Dried Blooms**
* **Dried Florals**
* **Dried Arrangements**
* **Preserved Petals**
* **Forever Flowers**
* **Timeless Blooms**
* **Dried Flower Boutique**
* **Dried Flower Shop**


✅ Recommended. The prompt below is to the point and concise.

In [40]:
prompt = "Suggest a name for a flower shop that sells bouquets of dried flowers"

print(generation_model.predict(prompt=prompt, max_output_tokens=1024, temperature=0.1).text)

* **Dried Blooms**
* **Preserved Petals**
* **Forever Flowers**
* **Dried & Wild**
* **Naturally Beautiful**
* **Rustic Blooms**
* **Whimsical Bouquets**
* **Sentimental Arrangements**
* **One-of-a-kind Flowers**


### Be specific, and well-defined

Suppose that you want to brainstorm creative ways to describe Earth.

🛑 Not recommended. The prompt below is too generic.

In [41]:
prompt = "Tell me about Earth"

print(generation_model.predict(prompt=prompt, max_output_tokens=1024, temperature=0.1).text)

Earth is the third planet from the Sun, and the only astronomical object known to harbor life. It is the densest and fifth-largest of the eight planets in the Solar System. Earth's atmosphere is composed of 78% nitrogen, 21% oxygen, 1% other gases, and water vapor. The Earth's surface is divided into several tectonic plates that move around the planet's surface. The Earth's interior is divided into a solid inner core, a liquid outer core, and a mantle. The Earth's magnetic field is generated by the motion of the liquid outer core. The Earth's orbit around the Sun takes 365.256 days, or one year. The Earth's axis is tilted at an angle of 23.5 degrees, which causes the seasons. The Earth's rotation period is 24 hours, or one day.


✅ Recommended. The prompt below is specific and well-defined.

In [42]:
prompt = "Generate a list of ways that makes Earth unique compared to other planets"

print(generation_model.predict(prompt=prompt, max_output_tokens=1024, temperature=0.1).text)

* **Earth is the only planet known to support life.** This is due to a number of factors, including its distance from the sun, its atmosphere, and its water.
* **Earth has a large moon.** The moon is thought to have played a role in the development of life on Earth by stabilizing the planet's rotation and providing a source of tides.
* **Earth has a relatively thin atmosphere.** This atmosphere protects the planet from harmful radiation from the sun and helps to keep the planet's temperature at a level that is conducive to life.
* **Earth has a large amount of water.** Water is essential for life, and Earth has more water than any other planet in the solar system.
* **Earth has a variety of landforms.** These landforms provide a variety of habitats for life, and they also help to regulate the planet's climate.
* **Earth has a complex and diverse ecosystem.** This ecosystem is made up of a variety of organisms that interact with each other and with their environment. The ecosystem is es

### Ask one task at a time

🛑 Not recommended. The prompt below has two parts to the question that could be asked separately.

In [43]:
prompt = "What's the best method of boiling water and why is the sky blue?"

print(generation_model.predict(prompt=prompt, max_output_tokens=1024, temperature=0.1).text)

The best method of boiling water is to use a kettle. A kettle is a pot with a spout that is designed specifically for boiling water. It is typically made of metal, and has a lid that fits snugly to prevent steam from escaping. To boil water in a kettle, fill the kettle with water and place it on the stove. Turn the heat to high and wait for the water to boil. Once the water is boiling, turn off the heat and carefully pour the boiling water into a cup or mug.

The sky is blue because of a phenomenon called Rayleigh scattering. Rayleigh scattering is the scattering of light by particles that are much smaller than the wavelength of light. In the case of the sky, the particles that are responsible for Rayleigh scattering are molecules of nitrogen and oxygen. These molecules are so small that they are much smaller than the wavelength of visible light. When sunlight hits these molecules, it is scattered in all directions. However, the blue light is scattered more than the other colors of lig

✅ Recommended. The prompts below each ask one task at a time.

In [44]:
prompt = "What's the best method of boiling water?"

print(generation_model.predict(prompt=prompt, max_output_tokens=1024, temperature=0.1).text)

The best method of boiling water is to use a kettle. A kettle is a pot with a spout that is designed specifically for boiling water. Kettles are typically made of metal, and they have a lid that helps to keep the heat in. The spout allows you to pour the water easily without spilling it.

To boil water in a kettle, fill the kettle with water and place it on the stove. Turn the heat to high and wait for the water to boil. Once the water is boiling, turn off the heat and carefully pour the water into a cup or mug.

Boiling water in a kettle is the best method because it is quick and easy. Kettles are also relatively inexpensive, and they can be found at most stores.


In [45]:
prompt = "Why is the sky blue?"

print(generation_model.predict(prompt=prompt, max_output_tokens=1024, temperature=0.1).text)

The sky is blue because of a phenomenon called Rayleigh scattering. This is the scattering of light by particles that are smaller than the wavelength of light. In the case of the sky, the particles that are doing the scattering are molecules of nitrogen and oxygen.

When sunlight hits these molecules, the blue light is scattered more than the other colors. This is because the blue light has a shorter wavelength than the other colors. The red light, orange light, and yellow light are not scattered as much, so they travel in a straight line to our eyes. This is why we see the sky as blue.

The amount of scattering depends on the wavelength of light and the size of the particles. The shorter the wavelength of light, the more it is scattered. The smaller the particles, the more they scatter light.

This is why the sky is blue during the day. However, at sunrise and sunset, the sunlight has to travel through more of the atmosphere to reach our eyes. This means that more of the blue light is

### Watch out for hallucinations

Although LLMs have been trained on a large amount of data, their memorization capabilities are limited, so they can generate text containing statements that are not grounded in truth or reality. These responses are often referred to as "hallucinations". Note that simply prompting the LLM to provide a citation isn't a fix for this problem, as there are instances of LLMs providing false or inaccurate citations. Dealing with hallucinations is a fundamental challenge of LLMs and an ongoing research area, so it is important to be aware that LLMs may give you confident, correct-sounding statements that are in fact incorrect.

Note that if you intend to use LLMs for creative use cases, hallucinating could actually be quite useful.

Try a prompt like the one below repeatedly. You may notice that sometimes it will confidently, but inaccurately, say "The first elephant to visit the moon was Luna".

In [46]:
prompt = "Who was the first elephant to visit the moon?"

print(generation_model.predict(prompt=prompt, max_output_tokens=1024, temperature=0.1).text)

According to the article, the first elephant to visit the moon was Luna.


### Turn generative tasks into classification tasks to reduce output variability

#### Generative tasks lead to higher output variability

The prompt below results in an open-ended response, which is useful for brainstorming, but the response is highly variable.

In [47]:
prompt = "I'm a high school student. Recommend me a programming activity to improve my skills."

print(generation_model.predict(prompt=prompt, max_output_tokens=1024, temperature=0.1).text)

* **Write a program to solve a problem you're interested in.** This could be anything from a game to a tool to help you with your studies. The important thing is that you're interested in the problem and that you're motivated to solve it.
* **Take a programming course.** There are many online and offline courses available, so you can find one that fits your schedule and learning style.
* **Join a programming community.** There are many online and offline communities where you can connect with other programmers and learn from each other.
* **Read programming books and articles.** There is a wealth of information available online and in libraries about programming. Reading about different programming languages and techniques can help you improve your skills.
* **Practice, practice, practice!** The best way to improve your programming skills is to practice as much as you can. The more you code, the better you'll become.


#### Classification tasks reduce output variability

The prompt below results in a choice and may be useful if you want the output to be easier to control.

In [48]:
prompt = """I'm a high school student. Which of these activities do you suggest and why:
a) learn Python
b) learn Javascript
c) learn Fortran
"""

print(generation_model.predict(prompt=prompt, max_output_tokens=1024, temperature=0.1).text)

I would suggest learning Python. Python is a general-purpose programming language that is easy to learn and has a wide range of applications. It is used for web development, data science, machine learning, and many other things. Python is also a popular language for beginners, as it has a large community of support and resources available.


### Improve response quality by including examples

Another way to improve response quality is to add examples in your prompt. The LLM learns in-context from the examples on how to respond. Typically, one to five examples (shots) are enough to improve the quality of responses. Including too many examples can cause the model to over-fit the data and reduce the quality of responses.

Similar to classical model training, the quality and distribution of the examples is very important. Pick examples that are representative of the scenarios that you need the model to learn, and keep the distribution of the examples (e.g. number of examples per class in the case of classification) aligned with your actual distribution.
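
As a rough illustration of assembling such a prompt, the helper below builds a few-shot prompt from labelled examples and reports how many examples each class has, so the distribution can be checked before sending. The function names, the `Tweet`/`Sentiment` labels and the example data are assumptions made for this sketch.

```python
from collections import Counter


def build_few_shot_prompt(instruction, examples, query,
                          input_label="Tweet", output_label="Sentiment"):
    """Assemble instruction, labelled examples, and the unanswered query into one prompt."""
    lines = [instruction, ""]
    for text, label in examples:
        lines += [f"{input_label}: {text}", f"{output_label}: {label}", ""]
    # The final item has no label, so the model completes it.
    lines += [f"{input_label}: {query}", f"{output_label}:"]
    return "\n".join(lines)


def label_counts(examples):
    """Count examples per class to check the distribution of shots."""
    return Counter(label for _, label in examples)


examples = [  # made-up examples, one per class
    ("I loved it!", "positive"),
    ("It was fine.", "neutral"),
    ("Awful.", "negative"),
]
print(label_counts(examples))
print(build_few_shot_prompt("Classify the sentiment.", examples, "Super boring"))
```

The resulting string can be passed as `prompt` to `generation_model.predict` in the same way as the other prompts in this notebook.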

#### Zero-shot prompt

Below is an example of zero-shot prompting, where you don't provide any examples to the LLM within the prompt itself.

In [19]:
prompt = """I'll be staying in London for 72 hours, in Pimlico. It'll be near the
end of my trip, so I'll be kinda broke. I love hot curries, and hate fish. I
love live music and hate tourist traps. I want to make sure I see the Arnolfini
Portrait before leaving. I'll be using the Underground to travel. Please
recommend an itinerary with detailed travel routes and suggestions for meals.
"""

print(generation_model.predict(prompt=prompt, max_output_tokens=1024, temperature=0.1).text)

Day 1:
* Morning: Start your day with a hearty breakfast at the Cinnamon Club, a highly-rated Indian restaurant in Pimlico. Be sure to try the chicken tikka masala or the lamb vindaloo.
* Afternoon: After breakfast, take the Underground to the National Gallery. This world-renowned art museum is home to some of the most famous paintings in the world, including the Arnolfini Portrait by Jan van Eyck.
* Evening: For dinner, head to the Jazz Cafe in Camden Town. This legendary music venue has been hosting live jazz since the 1960s.

Day 2:
* Morning: Start your day with a walk through Hyde Park. This beautiful park is home to a variety of attractions, including the Serpentine Lake, the Diana Memorial Fountain, and the Speaker's Corner.
* Afternoon: After your walk, take the Underground to the British Museum. This massive museum is home to over 8 million artifacts from all over the world.
* Evening: For dinner, head to the Dishoom in Covent Garden. This popular Indian restaurant is known fo

Note the difference between telling the model what not to do ("Don't suggest fish restaurants"), and offering detailed, straightforward language about personal preferences.

While it's generally best practice to be as direct as possible, it's interesting to observe how easily the model can infer a request for budget restaurant options from colloquial language like "kinda broke," while also avoiding recommendations for fish and chip shops.

#### System prompt

We have seen an example of a zero-shot prompt. A system prompt is a specific type of prompt that provides the model with additional instructions on how to generate the output. It contains suggestions about the style, format, or content of the output, whereas a normal prompt tends to be more open-ended.


In [50]:
prompt = """Classify movie reviews as positive, neutral or negative. Only return the label in uppercase.
Review: “Her” is a disturbing study revealing the direction humanity is headed if AI is allowed to keep evolving, unchecked. It's so disturbing I couldn't watch it.
Sentiment:
"""

print(generation_model.predict(prompt=prompt, max_output_tokens=1024, temperature=0.1).text)

NEGATIVE


System prompts can be useful for generating output that meets specific requirements. The name system prompt actually stands for "providing an additional task to the system". For example, you could use a system prompt to generate a code snippet that is compatible with a specific programming language, or you could use a system prompt to return a certain structure.

In [51]:
prompt = """Classify movie reviews as positive, neutral or negative. Return valid JSON:
Review: “Her” is a disturbing study revealing the direction humanity is headed if AI is allowed to keep evolving, unchecked. It's so disturbing I couldn't watch it.

Schema:
MOVIE:
{
"sentiment": String POSITIVE | NEGATIVE | NEUTRAL,
"name": String
}

{
"movie_reviews": [MOVIE]
}

JSON Response:
```json
```

"""

print(generation_model.predict(prompt=prompt, max_output_tokens=1024, temperature=0.1).text)

```json
{
  "movie_reviews": [
    {
      "sentiment": "NEGATIVE",
      "name": "Her"
    }
  ]
}
```


There are some benefits to returning JSON objects from a prompt. First, in a real-world application I don't need to create this JSON format manually. Second, prompting for a JSON format forces the model to create a structure, which can limit hallucinations.
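
In an application, the returned text still has to be parsed and checked. The following is a minimal sketch, assuming the model wraps its answer in a fenced `json` block as in the output above; `parse_json_response` and `validate_reviews` are hypothetical helper names, and the allowed labels come from the schema in the prompt.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way to keep this example readable
ALLOWED_SENTIMENTS = {"POSITIVE", "NEGATIVE", "NEUTRAL"}


def parse_json_response(text):
    """Extract and parse JSON from a model response, with or without a code fence."""
    pattern = FENCE + r"(?:json)?\s*(.*?)" + FENCE
    match = re.search(pattern, text, re.DOTALL)
    payload = match.group(1) if match else text
    return json.loads(payload)


def validate_reviews(data):
    """Raise ValueError if any review uses a sentiment label outside the schema."""
    for review in data["movie_reviews"]:
        if review["sentiment"] not in ALLOWED_SENTIMENTS:
            raise ValueError(f"Unexpected sentiment: {review['sentiment']}")
    return data


raw = FENCE + 'json\n{"movie_reviews": [{"sentiment": "NEGATIVE", "name": "Her"}]}\n' + FENCE
reviews = validate_reviews(parse_json_response(raw))
print(reviews["movie_reviews"][0]["sentiment"])
```

Validating the label after parsing catches responses that are well-formed JSON but do not follow the schema.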

#### One-shot prompt

Below is an example of one-shot prompting, where you provide one example to the LLM within the prompt to give some guidance on what type of response you want.

In [52]:
prompt = """Decide whether a Tweet's sentiment is positive, neutral, or negative.

Tweet: I loved the new YouTube video you made!
Sentiment: positive

Tweet: That was awful. Super boring 😠
Sentiment:
"""

print(generation_model.predict(prompt=prompt, max_output_tokens=1024, temperature=0.1).text)

negative


#### Few-shot prompt

While large language models demonstrate remarkable zero-shot capabilities, they still fall short on more complex tasks in the zero-shot setting. Few-shot prompting is a technique that enables in-context learning: we provide demonstrations in the prompt to steer the model toward better performance. The demonstrations serve as conditioning for subsequent examples where we would like the model to generate a response.

In [20]:
prompt = """A "whatpu" is a small, furry animal native to Tanzania. An example of a sentence that uses
the word whatpu is:
We were traveling in Africa and we saw these very cute whatpus.
To do a "farduddle" means to jump up and down really fast. An example of a sentence that uses
the word farduddle is:
"""

print(generation_model.predict(prompt=prompt, max_output_tokens=1024, temperature=0.1).text)

The kids were farduddling around the room, laughing and having fun.


### Role prompting

Role prompting is a technique in prompt engineering that involves assigning a specific role to the generative AI model. This can help the model to generate more relevant and informative output, as the model can craft its responses to the specific role that it has been assigned.

For example, you could role prompt a generative AI model to be a book editor, a kindergarten teacher, or a motivational speaker. Once the model has been assigned a role, you can then give it prompts that are specific to that role.


In [22]:
prompt = """Write a review of [pizza place].

"""

print(generation_model.predict(prompt=prompt, max_output_tokens=1024, temperature=0.1).text)

[Pizza place] is a great place to get pizza. The pizza is delicious and the service is excellent. The staff is friendly and attentive, and they always make sure that your pizza is cooked to perfection. I highly recommend [pizza place] to anyone who loves pizza.

One of the things that I love about [pizza place] is the variety of pizzas that they offer. They have everything from classic pepperoni pizza to more unique pizzas like the Hawaiian pizza and the Buffalo chicken pizza. No matter what your taste, you're sure to find a pizza that you love at [pizza place].

The service at [pizza place] is also excellent. The staff is always friendly and attentive, and they always make sure that your pizza is cooked to perfection. They're also very quick, and your pizza will usually be ready in just a few minutes.

Overall, I highly recommend [pizza place]. It's a great place to get delicious pizza and excellent service.


In [21]:
prompt = """You are a food critic. Write a review of [random pizza place].

"""

print(generation_model.predict(prompt=prompt, max_output_tokens=1024, temperature=0.1).text)

I've been to a lot of pizza places in my time, but [random pizza place] is by far the best. The pizza is made with fresh, high-quality ingredients, and the crust is perfectly crispy and chewy. The sauce is tangy and flavorful, and the cheese is gooey and melted. I've tried a variety of pizzas from [random pizza place], and they've all been delicious. My personal favorite is the pepperoni pizza, but the cheesesteak pizza is also amazing.

The service at [random pizza place] is also top-notch. The staff is friendly and attentive, and they always make sure that your food is cooked to perfection. They also have a great selection of beers and wines to choose from.

If you're looking for a truly memorable pizza experience, then I highly recommend [random pizza place]. You won't be disappointed.

Here are some specific details about my experience at [random pizza place]:

* The atmosphere is casual and relaxed.
* The service is friendly and attentive.
* The menu has a wide variety of pizzas t

Defining a tone and perspective for an AI model gives it a blueprint of the tone, style, and focused expertise you’re looking for to improve the quality, relevance, and effectiveness of your output. 

Here are some styles you can choose from which I find effective:
Confrontational, Descriptive, Direct, Formal, Humorous, Influential, Informal, Inspirational, Persuasive
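
To keep role and style consistent across prompts, the pieces can be combined with a small template. This is only a sketch: `build_role_prompt` is a hypothetical helper, and the style list simply mirrors the one above.

```python
STYLES = {
    "confrontational", "descriptive", "direct", "formal", "humorous",
    "influential", "informal", "inspirational", "persuasive",
}


def _with_article(phrase):
    """Prefix a phrase with 'a' or 'an' based on its first letter."""
    article = "an" if phrase[0].lower() in "aeiou" else "a"
    return f"{article} {phrase}"


def build_role_prompt(role, task, style=None):
    """Build a prompt that assigns a role and, optionally, a writing style."""
    if style is not None and style.lower() not in STYLES:
        raise ValueError(f"Unknown style: {style}")
    what = f"{style.lower()} {task}" if style else task
    return f"You are {_with_article(role)}. Write {_with_article(what)}.\n"


print(build_role_prompt("food critic", "review of [random pizza place]", "Humorous"))
```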


In [23]:
prompt = """You are a food critic. Write a humorous review of [random pizza place].

"""

print(generation_model.predict(prompt=prompt, max_output_tokens=1024, temperature=0.1).text)

I've been to a lot of pizza places in my time, but I've never seen one quite like [random pizza place]. The first thing you notice when you walk in is the smell. It's not a bad smell, exactly, but it's definitely...unique.

The pizza itself is pretty good, but it's not the kind of pizza you'd want to eat every day. It's a little on the greasy side, and the sauce is a bit too sweet. But the crust is really good, and the toppings are fresh.

The service at [random pizza place] is also pretty good. The staff is friendly and attentive, and they're always happy to make recommendations.

Overall, I'd say [random pizza place] is a pretty good place to get pizza. It's not the best pizza I've ever had, but it's definitely worth a try.

But here's the thing: [random pizza place] is also a really weird place. I mean, they have a whole section of their menu dedicated to pizzas with toppings like "peanut butter and jelly" and "Cheetos and hot sauce." And they have a drink called "The Screaming Goat

### Chain of Thought

Chain of Thought (CoT) [6] prompting is a technique which was first introduced in January 2022 for improving the reasoning capabilities of LLMs by letting them write intermediate reasoning steps. This helps the LLM understand the problem more deeply, and generate more accurate answers. You can combine it with few-shot prompting to get better results on more complex tasks that require reasoning before responding.


In [None]:
# Let's first look at the example without CoT

prompt = """When I was 3 years old, my partner was 3 times my age. Now, I am 20 years old. How old is my partner?

"""

print(generation_model.predict(prompt=prompt, max_output_tokens=1024, temperature=0.1).text)

In [58]:
# Now, let's add CoT

prompt = """ When I was 3 years old, my partner was 3 times my age. Now, I am 20 years old. How old is my partner? Let's think step by step.
"""

print(generation_model.predict(prompt=prompt, max_output_tokens=1024, temperature=0.1).text)

When I was 3 years old, my partner was 3 * 3 = 9 years old. Now, my partner is 20 + 9 = 29 years old.
The final answer: 29.
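
Because this puzzle has a checkable answer, it is worth verifying the model's reasoning steps rather than trusting them. The sketch below works the arithmetic directly (the function name is hypothetical): the partner was 3 * 3 = 9 when I was 3, so the age gap is 9 - 3 = 6, and the partner is now 20 + 6 = 26. The transcript above adds the partner's earlier age (9) instead of the age gap, which is how it arrives at 29.

```python
def partner_age_now(my_age_then, multiple, my_age_now):
    """Solve the age puzzle: the age gap stays constant over time."""
    partner_age_then = my_age_then * multiple
    age_gap = partner_age_then - my_age_then
    return my_age_now + age_gap


print(partner_age_now(my_age_then=3, multiple=3, my_age_now=20))  # prints 26
```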


Chain of Thought prompting can be very powerful when combined with a single-shot example:

In [24]:
# Now, let's combine CoT with a one-shot example

prompt = """ Q: When my brother was 4 years old, I was double his age. Now I am 40 years old. How old is my brother? Let's think step by step.
A: When my brother was 4 years old, I was 2 * 4 = 8 years old. That's an age difference of 4 years and I am older. Now I am 40 years old, so my brother is 40 - 4 = 36 years old. The answer is 36.
Q: When I was 3 years old, my partner was 3 times my age. Now, I am 20 years old. How old is my partner? Let's think step by step.
A:
"""
print(generation_model.predict(prompt=prompt, max_output_tokens=1024, temperature=0.1).text)

When I was 3 years old, my partner was 3 * 3 = 9 years old. Now I am 20 years old, so my partner is 20 + 9 = 29 years old. The answer is 29.


### Tree of Thoughts

Now that we are familiar with Chain of Thought, let's review Tree of Thoughts (ToT), which was first introduced in May 2023. It generalizes the concept of CoT prompting because it allows LLMs to explore multiple different reasoning paths simultaneously, rather than just following a single linear chain of thought.

![Tree of Thoughts](https://www.promptingguide.ai/_next/image?url=%2F_next%2Fstatic%2Fmedia%2FTOT.3b13bc5e.png&w=3840&q=75)



In [25]:
prompt = """The question is: "When I was 3 years old, my partner was 3 times my age. Now, I am 20 years old. How old is my partner?"

Now imagine 3 different experts are answering this question.
All experts will write down the steps of their thinking, by calculating the age difference back then to calculate the current age of the partner. If any expert realizes they're wrong at any point then they leave. 
The correct answer is the answer that has been answered the most.

"""

print(generation_model.predict(prompt=prompt, max_output_tokens=1024, temperature=0.1).text)

**Expert 1:**

When I was 3 years old, my partner was 3 times my age, so my partner was 3 * 3 = 9 years old. Now, I am 20 years old, so my partner is 20 - 9 = 11 years older than me. Therefore, my partner is 20 + 11 = 31 years old.

**Expert 2:**

When I was 3 years old, my partner was 3 times my age, so my partner was 3 * 3 = 9 years old. Now, I am 20 years old, so my partner is 20 - 9 = 11 years older than me. However, my partner cannot be 11 years old because they were already 9 years old when I was 3 years old. Therefore, my partner must be 11 + 9 = 20 years old.

**Expert 3:**

When I was 3 years old, my partner was 3 times my age, so my partner was 3 * 3 = 9 years old. Now, I am 20 years old, so my partner is 20 - 9 = 11 years older than me. However, my partner cannot be 11 years old because they were already 9 years old when I was 3 years old. Therefore, my partner must be 11 + 9 = 20 years old.

**The correct answer is 20 years old.**
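
The voting rule in the prompt ("the answer that has been answered the most") can also be applied outside the model. The sketch below is a hypothetical helper that takes the last number in each expert's text as that expert's final answer and returns the most common one. Note that a majority is not a correctness check: in the transcript above two of three experts agree on 20, while working the arithmetic directly (age gap 9 - 3 = 6, so 20 + 6) gives 26.

```python
import re
from collections import Counter


def majority_answer(expert_outputs):
    """Return the most frequent final number across expert answers, or None if there is none."""
    finals = []
    for text in expert_outputs:
        numbers = re.findall(r"\d+", text)
        if numbers:
            # Assumption: the last number mentioned is the expert's final answer.
            finals.append(int(numbers[-1]))
    if not finals:
        return None
    return Counter(finals).most_common(1)[0][0]


experts = [
    "Therefore, my partner is 20 + 11 = 31 years old.",
    "Therefore, my partner must be 11 + 9 = 20 years old.",
    "Therefore, my partner must be 11 + 9 = 20 years old.",
]
print(majority_answer(experts))  # prints 20
```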


### Reason and Act

Reason and Act (ReAct) [8] prompting was first introduced in October 2022 and is a paradigm for enabling LLMs to solve complex tasks using natural language reasoning. It is designed for tasks where the LLM is allowed to perform certain actions, such as interacting with external APIs to retrieve information. 
ReAct mimics how humans operate in the real world, as we reason verbally and can take actions to gain information. ReAct performs well against other prompt engineering approaches in a variety of domains.

ReAct prompting works by combining reasoning and acting into a thought-action loop. The LLM first reasons about the problem and generates a plan of action. It then performs the actions in the plan and observes the results. The LLM then uses the observations to update its reasoning and generate a new plan of action. This process continues until the LLM reaches a solution to the problem.
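
The thought-action loop described above can be sketched in a few lines. This is a toy illustration, not LangChain's or Vertex AI's implementation: `scripted_llm` is a stand-in that returns canned steps instead of calling a model, `calculator` is a minimal hypothetical tool, and the `Action:` / `Action Input:` / `Final Answer:` text format is an assumption chosen for the example.

```python
import re


def calculator(expression):
    """Toy tool: evaluate an 'a ^ b' power expression."""
    base, exponent = (float(part) for part in expression.split("^"))
    return str(base ** exponent)


def scripted_llm(transcript):
    """Stand-in for a real model: picks the next step from what has been observed so far."""
    if "Observation:" not in transcript:
        return "Thought: I need to compute a power.\nAction: Calculator\nAction Input: 2 ^ 10"
    observed = transcript.rsplit("Observation: ", 1)[1].strip()
    return f"Thought: I now know the result.\nFinal Answer: {observed}"


def react_loop(llm, tools, question, max_steps=5):
    """Alternate reasoning and tool calls until the model emits a final answer."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript)  # reason about the problem so far
        transcript += step + "\n"
        final = re.search(r"Final Answer:\s*(.*)", step)
        if final:
            return final.group(1).strip(), transcript
        action = re.search(r"Action:\s*(.*)\nAction Input:\s*(.*)", step)
        if action is None:
            break  # the model neither acted nor answered
        tool_name, tool_input = action.group(1).strip(), action.group(2).strip()
        observation = tools[tool_name](tool_input)  # act, then observe the result
        transcript += f"Observation: {observation}\n"
    return None, transcript


answer, transcript = react_loop(
    scripted_llm, {"Calculator": calculator}, "What is 2 raised to the 10th power?"
)
print(transcript)
print(answer)
```

The `max_steps` cap is a design choice in this sketch to keep the loop from running forever if the model never produces a final answer.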


In [34]:
%pip install langchain
%pip install google-search-results
%pip install numexpr

Note: you may need to restart the kernel to use updated packages.
Note: you may need to restart the kernel to use updated packages.
Collecting numexpr
  Downloading numexpr-2.8.7-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (8.7 kB)
Downloading numexpr-2.8.7-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (384 kB)
Installing collected packages: numexpr
Successfully installed numexpr-2.8.7
Note: you may need to restart the kernel to use updated packages.


In [35]:
from langchain.agents import AgentType, initialize_agent, load_tools
from langchain.llms import VertexAI

prompt = "Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?"

# A low temperature keeps the agent's reasoning steps close to deterministic.
llm = VertexAI(temperature=0.1)
# "serpapi" provides web search (needs SERPAPI_API_KEY); "llm-math" provides a calculator.
tools = load_tools(["serpapi", "llm-math"], llm=llm)

# verbose=True prints every Thought / Action / Observation step of the loop.
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
agent.run(prompt)




> Entering new AgentExecutor chain...
Leo DiCaprio's current girlfriend is Camila Morrone.
Action: Search
Action Input: Camila Morrone age
Observation: 26 years
Thought: 26 ^ 0.43 = 13.49
Action: Calculator
Action Input: 26 ^ 0.43
Observation: Answer: 4.059182145592686
Thought: The answer is 4.059182145592686
Final Answer: 4.059182145592686

> Finished chain.


'4.059182145592686'
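In the trace above, the model's own Thought step guessed 13.49 before the Calculator tool returned 4.059182145592686. A quick check in plain Python, using only the two numbers from the trace, shows why delegating arithmetic to a tool matters:

```python
age = 26         # observation returned by the Search tool in the trace above
exponent = 0.43

result = age ** exponent
model_guess = 13.49  # value the model proposed in its Thought step

print(f"{age} ** {exponent} = {result:.4f}")
print(f"the model's unaided guess was off by {abs(model_guess - result):.2f}")
```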

### Automatic Prompt Engineering

At this point you might realize that writing a prompt can be complex. Wouldn't it be possible to automate this (write a prompt to write prompts) and take out natural language ambiguity? There is a method for that: Automatic Prompt Engineering (APE). This method, described in a 2023 research paper, not only alleviates the need for human input but also enhances the model’s performance in various tasks.

The idea is as follows:

Pick a task for which you want to automate crafting and optimizing the prompt.

In our example, we are building a shopping assistant for a retail company. We want to figure out all the various ways customers could phrase their order for buying a new coat.

Write the prompt that will generate the instruction candidates. You can do this through an LLM, but there are also various hubs and collections of prompts that you can find online. In this example, I am using Text-Bison-32k to generate 10 instructions.


In [36]:
prompt = """We have a retail webshop, and to train a chatbot we need various ways to order: "Coat, black, puffer jacket size M". Generate 10 variants that use different wording but keep the same meaning.
"""

print(generation_model.predict(prompt=prompt, max_output_tokens=1024).text)

1. I would like to order a black puffer jacket in size M.
2. Can you please order me a black puffer jacket in size M?
3. I'm looking for a black puffer jacket in size M. Can you help me find one?
4. I need a black puffer jacket in size M. Can you order it for me?
5. I'm interested in buying a black puffer jacket in size M. Can you help me with that?
6. I would like to purchase a black puffer jacket in size M. Can you help me with that?
7. I'm looking to buy a black puffer jacket in size M. Can you help me find one?
8. I'm in the market for a black puffer jacket in size M. Can you help me find one?
9. I'm looking to order a black puffer jacket in size M. Can you help me with that?
10. I'm looking to purchase a black puffer jacket in size M. Can you help me with that?
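To work with the generated variants as individual candidates, the numbered text first has to become a list. The helper below is a small sketch of that step; `parse_numbered_list` is a hypothetical name, and the sample string reuses the first three lines of the output above rather than a live model call.

```python
import re
from typing import List


def parse_numbered_list(text: str) -> List[str]:
    """Extract the items of a '1. ...' style numbered list, dropping exact duplicates."""
    items: List[str] = []
    for line in text.splitlines():
        match = re.match(r"\s*\d+[.)]\s+(.*\S)", line)
        if match and match.group(1) not in items:
            items.append(match.group(1))
    return items


sample_output = """1. I would like to order a black puffer jacket in size M.
2. Can you please order me a black puffer jacket in size M?
3. I'm looking for a black puffer jacket in size M. Can you help me find one?"""

candidates = parse_numbered_list(sample_output)
for candidate in candidates:
    print(candidate)
```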


### Step-back prompting

As we have seen with most published prompting techniques, Large Language Models (LLMs) need guidance when a query demands intricate, multi-step reasoning, and decomposition is a key component of solving complex requests.

Supervision with step-by-step verification is a promising remedy for improving the correctness of intermediate reasoning steps.

The best-known prompting technique for decomposition is chain-of-thought (CoT) reasoning. The study that introduced Step-Back Prompting compares it to CoT prompting.

The text below shows a complete example of Step-Back Prompting with the original question, the step-back question, principles, and the prompt for the final answer to be generated by the LLM.

In [37]:
# Generate a step-back question from a few examples

prompt = """ You are an expert at world knowledge. 
Your task is to step back and paraphrase a question to a more generic 
step-back question, which is easier to answer. 

Here are a few examples:
Original Question: Which position did Knox Cunningham hold from May 1955 to Apr 1956?
Stepback Question: Which positions have Knox Cunningham held in his career?

Original Question: Who was the spouse of Anna Karina from 1968 to 1974?
Stepback Question: Who were the spouses of Anna Karina?

Original Question: Which team did Thierry Audel play for from 2007 to 2008?
Stepback Question: Which teams did Thierry Audel play for in his career?

Question: Was donald trump president when Palm was announced?
"""

print(generation_model.predict(prompt=prompt, max_output_tokens=1024, temperature=0.1).text)

Stepback Question: Who was the president of the United States when Palm was announced?
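Once a step-back question has been generated, it is typically used to retrieve extra context alongside the original question before the final answer is produced. The sketch below shows that flow under stated assumptions: `answer_with_step_back` and the three stub functions are hypothetical, and the stubs stand in for the LLM and the search tool so that the example runs without API keys.

```python
from typing import Callable


def answer_with_step_back(question: str,
                          make_step_back: Callable[[str], str],
                          retrieve: Callable[[str], str],
                          generate: Callable[[str], str]) -> str:
    """Retrieve context for both the original and the step-back question, then answer."""
    step_back_question = make_step_back(question)
    normal_context = retrieve(question)
    step_back_context = retrieve(step_back_question)
    prompt = (
        "Answer the question. Use the context below if it is relevant.\n\n"
        f"{normal_context}\n{step_back_context}\n\n"
        f"Original Question: {question}\nAnswer:"
    )
    return generate(prompt)


# Hypothetical stubs standing in for the LLM and the search tool.
def fake_step_back(question: str) -> str:
    return "Which products has the company announced, and when?"


def fake_retrieve(query: str) -> str:
    return f"[search results for: {query}]"


def echo_generate(prompt: str) -> str:
    return prompt  # return the final prompt so it can be inspected


final_prompt = answer_with_step_back(
    "Was product X announced before 2020?", fake_step_back, fake_retrieve, echo_generate
)
print(final_prompt)
```

Passing the three steps in as callables keeps the flow testable; in a notebook they would be replaced by calls to the model and the search wrapper.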


In [38]:
from langchain.chat_models import ChatVertexAI
from langchain.prompts import ChatPromptTemplate, FewShotChatMessagePromptTemplate
from langchain.schema.output_parser import StrOutputParser
from langchain.schema.runnable import RunnableLambda

In [39]:
question = "did Google Bard exist while Trump was president? Answer with yes or no and provide your explanation."


In [40]:
from langchain.utilities import SerpAPIWrapper

search = SerpAPIWrapper()

def retriever(query):
    return search.run(query)

response_prompt_template = """You are an expert of world knowledge. I am going to ask you a question. Your response should be comprehensive and not contradicted with the following context if they are relevant. Otherwise, ignore them if they are not relevant.

{normal_context}

Original Question: {question}
Answer:"""
response_prompt = ChatPromptTemplate.from_template(response_prompt_template)

######################################

chain = {
    # Retrieve context using the normal question
    "normal_context": RunnableLambda(lambda x: x['question']) | retriever,
    # Pass on the question
    "question": lambda x: x["question"]
} | response_prompt | ChatVertexAI(temperature=0) | StrOutputParser()

######################################

chain.invoke({"question": question})

" Yes. \n\nGoogle Bard was announced by Sundar Pichai, the CEO of Alphabet Inc., on May 11, 2022. Donald Trump's presidency ended on January 20, 2021. \n"

### Other best practices:

#### Emphasis through prompting:
**Emphasis through Capitalization:**
Use capitalization to stress important points or instructions. This can help guide the model's responses and maintain a certain focus. For instance, "Remember to KEEP DETAILS CONFIDENTIAL".

**Extreme Emphasis:** 
You can use exaggerations or hyperbolic language to amplify the importance of your prompt or instructions. For instance, instead of saying "Make sure it's clear," you might say, "Your explanation should be as clear as a summer's day, absolutely impossible to misinterpret. Every single word must ooze clarity!"

**Emphasis through Repetition:**
Reiteration of key phrases or ideas can effectively guide the AI model's responses. For example, "remember to be creative... keep in mind to be creative... don't forget to be creative..."

#### Prompt modularization:
**Modularize instruction:**
Break down complex tasks into a sequence of simpler prompts, facilitating more interactive and manageable conversations. Ex: Instead of having many instructions in 1 prompt, modularize the prompts and have 1 prompt per instruction. Then, based on the user’s input, you can choose which prompt to process.

**Prompt chaining:** 
This is a technique to complete complex tasks by chaining together multiple prompts. The output of one prompt is used as the input for the next prompt, and so on. For example, first convert the file format to csv, then do operation x, then do operation y on the new dataset and so on.
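A minimal sketch of prompt chaining, assuming each template has a single `{input}` slot. `run_prompt_chain` and `fake_llm` are hypothetical names, and the fake model only records the prompts it receives so the example runs offline.

```python
from typing import Callable, List


def run_prompt_chain(templates: List[str], llm: Callable[[str], str], initial_input: str) -> str:
    """Feed the output of each prompt into the {input} slot of the next template."""
    current = initial_input
    for template in templates:
        current = llm(template.format(input=current))
    return current


calls: List[str] = []


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call; it records the prompt it was given."""
    calls.append(prompt)
    return f"[output of step {len(calls)}]"


result = run_prompt_chain(
    [
        "Convert the following order notes to CSV with the columns item,color,size:\n{input}",
        "Count the rows in this CSV and report the total:\n{input}",
    ],
    fake_llm,
    "Coat, black, puffer jacket size M",
)
print(result)
print(calls[1])
```

Templates that themselves contain literal braces (for example JSON snippets) would need those braces doubled for `str.format`.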

**Aggregation technique:**
This is a technique to perform certain tasks on specific portions of the data and then aggregate them all to produce the final output. Ex: Do operation x on the first part of the data, do operation y on the rest of the data and aggregate the results. 
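The aggregation technique can be sketched as a split, process, and combine flow. The version below is a simplified assumption: it applies the same instruction to every part, and `aggregate_over_parts` and `counting_llm` are hypothetical names, with the fake model replaced by a deterministic function for illustration.

```python
from typing import Callable, List


def split_into_parts(records: List[str], part_size: int) -> List[List[str]]:
    """Split the records into consecutive parts of at most part_size items."""
    return [records[i:i + part_size] for i in range(0, len(records), part_size)]


def aggregate_over_parts(records: List[str], llm: Callable[[str], str], part_size: int = 2) -> str:
    """Run one prompt per part of the data, then one prompt that combines the partial results."""
    partials = [
        llm("Summarize these records:\n" + "\n".join(part))
        for part in split_into_parts(records, part_size)
    ]
    return llm("Combine these partial summaries into one:\n" + "\n".join(partials))


def counting_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model: reports how many lines follow the instruction."""
    body = prompt.split("\n")[1:]
    return f"{len(body)} line(s)"


summary = aggregate_over_parts(["r1", "r2", "r3", "r4", "r5"], counting_llm, part_size=2)
print(summary)
```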

#### Trial and error:
**Iterative prompting:** 
Continuously refine and adjust prompts to improve their clarity, effectiveness, and quality. By progressively adapting a prompt based on feedback, you ensure that responses meet the desired criteria. This is especially useful in application development and content generation.

**Starting over:**
Sometimes a prompt is not working well and a few iterations do not fix it. In that case, start from scratch with a different prompt: check which sections of the previous prompt help and which don’t, keep the sections that “work” (make the model behave the way you want), and build on those.

**Modular testing:**
When a prompt is not working, split the task into sub-tasks and run only the portion of the prompt that achieves a given sub-task. This helps you identify where it fails so you can fix the prompt accordingly.

#### Reinforcement techniques:
**Redundancy Technique:**
Similar to emphasis through capitalization or repetition, use synonyms or different phrasings of the same instruction to reinforce the direction of the prompt. Ex: Instead of repeating ‘make sure to..’, say the same thing several times, replacing ‘make sure’ with ‘ensure’, ‘guarantee’, ‘verify’, and so on.

**Negative Instructions:**
Explicitly mention what you don't want in the response. This is useful when the model tends to produce certain unwanted outputs for specific prompts. Ex: ‘Don’t output any sources.’

**Self-reference:**
Use the AI's self-awareness to your advantage by instructing it to evaluate or check its own responses before producing them. For example, "Make sure your response is concise and does not include unnecessary details".

#### Prompt formatting techniques:
**Structure:**
Having a structured prompt helps the model better understand the task. Start by defining its role. Ex: “Your role is to read the input data and structure it in the following format”. Then tell it what the input data is, what the desired format is, and how to produce it.

**Use constraints:** 
Constraints can be used to limit the scope of the model's output. For example, you could constrain the model to generate a line of no more than 10 words, or to output no data other than the input data.

**Use delimiters:** 
Delimiters in prompt engineering provide clear distinctions within your text, effectively separating instructions from content. This practice helps prevent misunderstandings, allows you greater control over AI responses, and ensures consistent results, even when handling complex tasks.
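The three formatting techniques above (structure, constraints, delimiters) can be combined in one small prompt builder. This is a sketch under assumptions: `build_structured_prompt` is a hypothetical helper, and the XML-style tag is just one possible delimiter.

```python
from typing import List


def build_structured_prompt(role: str, input_description: str, output_format: str,
                            constraints: List[str], content: str, tag: str = "input") -> str:
    """Assemble role, input description, format, and constraints, then the delimited content."""
    closing = f"</{tag}>"
    if closing in content:
        # Content that contains the delimiter could be mistaken for instructions.
        raise ValueError(f"content must not contain the delimiter {closing}")
    parts = [
        f"Your role is to {role}.",
        f"Input: {input_description}",
        f"Output format: {output_format}",
    ]
    parts += [f"Constraint: {constraint}" for constraint in constraints]
    parts += [f"<{tag}>", content, closing]
    return "\n".join(parts)


structured_prompt = build_structured_prompt(
    role="read the input data and structure it in the requested format",
    input_description="a free-text customer order between <input> tags",
    output_format="one CSV line with the columns item,color,size",
    constraints=["Use no more than 10 words.", "Do not output anything other than the CSV line."],
    content="Coat, black, puffer jacket size M",
)
print(structured_prompt)
```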

#### Use LLMs (ex: Bard) to build prompts: 
**Refining Prompts:**
By providing an initial prompt and the output you received, you can ask the LLM to refine the prompt to achieve a more desirable output. The LLM can provide insight into what might be improved and suggest changes.

**Generating New Prompts:**
If you provide the LLM with a set of requirements, desired output format, and examples of "requirement, prompt" tuples, it can generate a new prompt based on the given requirements. This could be useful when creating prompts for new scenarios or tasks.

**Feedback:** 
Once you have a prompt, you can ask the LLM for feedback. This can help you identify areas where the prompt could be improved. The LLM can also point out gaps in your prompt engineering, based on your prompt and what you want to achieve.

#### Miscellaneous:
**Prioritization:** 
Clearly communicate to the AI model what content or information is most critical. This approach helps in guiding the AI's focus and ensures the most important topics or issues are addressed accurately.

**Use examples:**
Providing examples of the desired output can also be helpful. They help the model understand what you are looking for and generate more accurate results. Sometimes you can describe what you want so well that you won’t need an example, but giving the model more information generally helps, as long as you stay within the context window.

**Preferred Output Format:**
Specify your desired output format. If a CSV format is more useful than plain text, instruct the AI model to produce the output in that format from the outset. This strategy can save time and effort on manual conversions later. 
In our experience, the model processes CSV and JSON input better than plain-text (txt) files. 
Prompting it to first convert the txt file to CSV or JSON yields a better result. 
Writing your prompt in JSON format, or requesting the output in JSON or CSV format, also tends to yield more accurate results. 
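One practical benefit of requesting JSON is that the reply can be parsed directly. The sketch below assumes the reply contains a single JSON object, possibly surrounded by extra text; `extract_json_object` is a hypothetical helper and the sample reply is hand-written, not model output.

```python
import json


def extract_json_object(reply: str) -> dict:
    """Parse the JSON object found between the first '{' and the last '}' of a reply."""
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in reply")
    return json.loads(reply[start:end + 1])


sample_reply = 'Here is the order:\n{"item": "puffer jacket", "color": "black", "size": "M"}\nAnything else?'
order = extract_json_object(sample_reply)
print(order["item"], order["color"], order["size"])
```

If the reply is not valid JSON, `json.loads` raises `json.JSONDecodeError`, which is a useful signal to retry or tighten the prompt.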

**Use a prompt library:**
There are a number of online libraries that contain pre-made prompts for a variety of tasks. We have a prompt library in Vertex AI → Generative AI Studio → Language → prompt examples. This can be a helpful resource if you’d like a starting point or have a very common task. 

**Nice words:**
Using words like ‘please’, ‘great job!’, ‘you are the best!’ … can increase the model’s compliance rate. Such phrasing appears to raise the ‘confidence level’ of the model and positively reinforces its behavior, which tends to increase accuracy for simple tasks.

**Lastly and most important: Be Creative:**
This isn’t even the tip of the iceberg - the more creative you are, the better you can make the model behave the way you want. For example, instead of saying "provide deep insights," you can write: “Ensure that you provide deep insights... Provide very very deep insights. I told you to give deep insights!!!!! Do not write anything that is not deeply insightful! Make sure to check this requirement is 100% met every time you respond. It’s better that you don’t return anything than returning something that’s not very very insightful!!!”. If you have a long prompt, you may also consider adding the same statement in different places - this helps the model prioritize that task (aka sandwich technique).
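The sandwich technique mentioned above amounts to placing the same key instruction before and after a long prompt body. A minimal sketch, with `sandwich_prompt` as a hypothetical helper name:

```python
def sandwich_prompt(key_instruction: str, body: str) -> str:
    """Place the key instruction both before and after the long prompt body."""
    return f"{key_instruction}\n\n{body}\n\n{key_instruction}"


long_body = "Here is the quarterly sales report to analyze:\n(report text goes here)"
sandwiched = sandwich_prompt("Provide deep insights only.", long_body)
print(sandwiched)
```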



## Exercises

Try out https://gandalf.lakera.ai/

Cannot find the password? Tips: https://medium.com/the-abcs-of-ai/gandalfs-challenge-mastering-prompt-engineering-for-ai-success-fd777be2aa0b

Check out https://github.com/GoogleCloudPlatform/generative-ai.git for many more notebooks and exercises.