<p style="padding: 10px; border: 1px solid black;">
<img src="images/mlu-logo.png" alt="drawing" width="400"/> <br/>

# <a name="0">MLU Getting Started with Bedrock and Prompt Engineering</a>
## <a name="0">Prompt Engineering</a>

<br>
    
---    
### Important Notes

1. Before running this notebook, make sure you have already granted access to the Bedrock model(s) used here, by following the instructions in the `Lab Setup` session at the course portal.

2. The Titan models undergo regular updates. Therefore, the returned results in this notebook may be different from the video recordings provided by MLU for this course.

---
    
In this notebook, we will explain the concept of prompt engineering using some common use cases. 

__What is prompt engineering?__

We can improve the quality of the generated responses by constructing (or engineering) the prompts in different ways. We call this process __prompt engineering__. This is usually an __iterative process__, and it can take a few attempts to find the prompt that works best for your problem. 

We can list a few suggestions to help you construct better prompts:
* __Write clear and specific instructions.__
* __Highlight or specify the part of the prompt that the model should act on.__
* __Add details or restrictions to your prompt.__
* __If the problem requires executing multiple steps, instruct the model to follow a step-by-step approach.__

To compare candidate prompts, you may need to run some experiments and measure the quality of the generated outputs.
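
The suggestions above can be sketched as a small helper that keeps the instruction, the extra restrictions, and the text to act on in separate, clearly labeled parts. The name `build_prompt` and its parameters are illustrative assumptions, not part of Bedrock or boto3; the helper only formats strings.

```python
def build_prompt(instruction, text=None, constraints=()):
    """Assemble a prompt from an instruction, optional constraints, and optional text.

    Hypothetical helper for illustration; it only formats strings.
    """
    parts = [instruction.strip()]
    # Restrictions such as length or format limits go right after the instruction.
    parts.extend(c.strip() for c in constraints)
    prompt = " ".join(parts)
    if text is not None:
        # The "Text:" label marks the part of the prompt the model should act on.
        prompt += "\nText: " + text.strip()
    return prompt


example_prompt = build_prompt(
    "Summarize the following text.",
    text="In 2021, we reached 85% renewable energy across our business.",
    constraints=["Use one sentence."],
)
print(example_prompt)
```

Keeping the pieces separate makes it easier to change one thing at a time (for example, only the constraint) when iterating on a prompt.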


__Example Problems:__

We will focus on some common ML tasks with the __Amazon Titan Text G1 - Premier__ model. We instruct the model through the prompt messages, so pay attention to how we construct those messages. 

These are the tasks we cover:
* __Text summarization__
* __Question Answering__
* __Text Generation__
* __In-context learning: Zero-shot, one-shot, few-shot learning__
* __Chain of thought concept__

-----


We access the Bedrock service through boto3 by providing the service name and region name.

In [None]:
import json, boto3

session = boto3.session.Session()

bedrock_inference = session.client(
    service_name="bedrock-runtime",
    region_name=session.region_name,
)

Let's specify the API parameters. We will use the __Amazon Titan Text G1 - Premier__ model.

In [None]:
def send_prompt(prompt_data, temperature=0.0, top_p=0.5, max_token_count=1000):

    body = json.dumps(
        {
            "inputText": prompt_data,
            "textGenerationConfig": {
                "temperature": temperature,
                "topP": top_p,
                "maxTokenCount": max_token_count,
            },
        }
    )

    modelId = "amazon.titan-text-premier-v1:0"

    accept = "application/json"
    contentType = "application/json"

    response = bedrock_inference.invoke_model(
        body=body, modelId=modelId, accept=accept, contentType=contentType
    )

    response_body = json.loads(response["body"].read())

    return response_body["results"][0]["outputText"]
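
To make the request and response shapes used by `send_prompt` easier to see, here is an offline sketch that builds the same request body and parses a response dictionary without calling Bedrock. The helper names are assumptions, and `sample_response` is a hand-written placeholder that only mirrors the keys `send_prompt` reads (`results`, `outputText`); it is not real model output.

```python
import json


def build_titan_body(prompt_data, temperature=0.0, top_p=0.5, max_token_count=1000):
    """Return the JSON request body that send_prompt passes to invoke_model."""
    return json.dumps(
        {
            "inputText": prompt_data,
            "textGenerationConfig": {
                "temperature": temperature,
                "topP": top_p,
                "maxTokenCount": max_token_count,
            },
        }
    )


def extract_output_text(response_body):
    """Pull the generated text out of a parsed response, as send_prompt does."""
    return response_body["results"][0]["outputText"]


body = build_titan_body("How many months are there in a year?")
print(body)

# Hand-written placeholder, NOT real model output.
sample_response = {"results": [{"outputText": "<generated text goes here>"}]}
print(extract_output_text(sample_response))
```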

### Example problems

### 1. Text summarization:

With text summarization, the main purpose is to create a shorter version of a given text while preserving the relevant information in it. 

We use the following text from the [sustainability section](https://sustainability.aboutamazon.com/environment/renewable-energy) of [about.amazon.com](https://www.aboutamazon.com/).


<p style="font-size:12pt;">
    "In 2021, we reached 85% renewable energy across our business. Our first solar projects in South Africa and the United Arab Emirates came online, and we announced new projects in Singapore, Japan, Australia, and China. Our projects in South Africa and Japan are the first corporate-backed, utility-scale solar farms in these countries. We also announced two new offshore wind projects in Europe, including our largest renewable energy project to date. As of December 2021, we had enabled more than 3.5 gigawatts of renewable energy in Europe through 80 projects, making Amazon the largest purchaser of renewable energy in Europe."
</p>

Let's start with the first summarization example below. We pass this text as well as the instruction to summarize it. The instruction part of the prompt becomes __The following is a text about Amazon. Summarize this:__

In [None]:
prompt_data = """The following is a text about Amazon. Summarize this:  \
Text: In 2021, we reached 85% renewable energy across our business. \
Our first solar projects in South Africa and the United Arab Emirates \
came online, and we announced new projects in Singapore, Japan, \
Australia, and China. Our projects in South Africa and Japan are \
the first corporate-backed, utility-scale solar farms in these \
countries. We also announced two new offshore wind projects in \
Europe, including our largest renewable energy project to date. \
As of December 2021, we had enabled more than 3.5 gigawatts of \
renewable energy in Europe through 80 projects, making Amazon \
the largest purchaser of renewable energy in Europe."""

print(send_prompt(prompt_data, temperature=0.0))

Nice. This text is shorter. We can set the desired length of the summary by adding more constraints to the instructions. Let's create a one-sentence summary of this text. The instruction part of the prompt becomes the following: __Summarize it in one sentence.__

In [None]:
prompt_data = """The following is a text about Amazon. Summarize it in one sentence. \
Text: In 2021, we reached 85% renewable energy across our business. \
Our first solar projects in South Africa and the United Arab Emirates \
came online, and we announced new projects in Singapore, Japan, \
Australia, and China. Our projects in South Africa and Japan are \
the first corporate-backed, utility-scale solar farms in these \
countries. We also announced two new offshore wind projects in \
Europe, including our largest renewable energy project to date. \
As of December 2021, we had enabled more than 3.5 gigawatts of \
renewable energy in Europe through 80 projects, making Amazon \
the largest purchaser of renewable energy in Europe."""

print(send_prompt(prompt_data, temperature=0.0))

Nice! The model generated a one-sentence summary.
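
One detail worth knowing about the multi-line prompt strings in this notebook: inside a triple-quoted string, a backslash at the end of a line removes the line break without adding a space, so the words on either side are joined exactly as written. The sketch below shows the effect and one alternative, adjacent string literals inside parentheses, where each trailing space is visible. The variable names are illustrative.

```python
# A backslash at the end of a line joins the two lines with nothing in between.
joined_without_space = """across our business.\
Our first solar projects"""

# Keeping a space before the backslash preserves the word boundary.
joined_with_space = """across our business. \
Our first solar projects"""

# Adjacent string literals inside parentheses are concatenated by Python,
# which makes every trailing space explicit.
joined_literals = (
    "across our business. "
    "Our first solar projects"
)

print(repr(joined_without_space))
print(repr(joined_with_space))
print(joined_with_space == joined_literals)
```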

---

### 2. Question Answering:

In a question answering problem, a machine learning model answers questions using some provided context. Here, as context, we will use the previous text about Amazon's sustainability efforts. 

The first question is asking about the names of the countries mentioned in the text. 

The instruction section of the prompt is __What are the names of the countries in the following text?__

In [None]:
prompt_data = """What are the names of the countries in the following text? \
Text: In 2021, we reached 85% renewable energy across our business. \
Our first solar projects in South Africa and the United Arab Emirates \
came online, and we announced new projects in Singapore, Japan, \
Australia, and China. Our projects in South Africa and Japan are \
the first corporate-backed, utility-scale solar farms in these \
countries. We also announced two new offshore wind projects in \
Europe, including our largest renewable energy project to date. \
As of December 2021, we had enabled more than 3.5 gigawatts of \
renewable energy in Europe through 80 projects, making Amazon \
the largest purchaser of renewable energy in Europe."""

print(send_prompt(prompt_data, temperature=0.0))

Nice. We get all of the geographical places mentioned in the text. 

Let's try to learn something specific from the document. For example, the number of gigawatts of renewable energy that Amazon enabled in Europe. 

The instruction section of the prompt is __How many gigawatts of energy did Amazon enable in Europe according to the following text?__

In [None]:
prompt_data = """How many gigawatts of energy did Amazon enable in \
Europe according to the following text? \
Text: In 2021, we reached 85% renewable energy across our business. \
Our first solar projects in South Africa and the United Arab Emirates \
came online, and we announced new projects in Singapore, Japan, \
Australia, and China. Our projects in South Africa and Japan are \
the first corporate-backed, utility-scale solar farms in these \
countries. We also announced two new offshore wind projects in \
Europe, including our largest renewable energy project to date. \
As of December 2021, we had enabled more than 3.5 gigawatts of \
renewable energy in Europe through 80 projects, making Amazon \
the largest purchaser of renewable energy in Europe."""

print(send_prompt(prompt_data, temperature=0.0))

Nice. We were able to extract that information and return it in the answer.

Let's try another example, this time without context. Context information may not be necessary for some questions. For example, we can ask some general questions like below.

__How many months are there in a year?__

In [None]:
prompt_data = """How many months are there in a year?"""

print(send_prompt(prompt_data, temperature=0.0))

__How many meters are in a mile?__

In [None]:
prompt_data = """How many meters are in a mile?"""

print(send_prompt(prompt_data, temperature=0.0))

__What is the result when you add up 2 and 9?__

In [None]:
prompt_data = """What is the result when you add up 2 and 9?"""

print(send_prompt(prompt_data, temperature=0.0))

The answer is correct.

---

### 3. Text Generation:

Text generation is one of the common use cases for Large Language Models. The main purpose is to generate high-quality text from a given input. We will cover a few examples here.

__Customer service example:__

Let's start with a customer feedback example. Assume we want to write an email to a customer who had some problems with a product that they purchased.

__Write an email response from Any company customer service \
based on the following email that was received from a customer__

__Customer email: "I am not happy with this product. I had a difficult \
time setting it up correctly because the instructions do not cover all \
the details. Even after the correct setup, it stopped working after \
a week."__

In [None]:
prompt_data = """Write an email response from Any company customer service \
based on the following email that was received from a customer

Customer email: "I am not happy with this product. I had a difficult \
time setting it up correctly because the instructions do not cover all \
the details. Even after the correct setup, it stopped working after \
a week." """

print(send_prompt(prompt_data, temperature=0.0))

Nice! The generated text asks the customer to provide more details to resolve the issue.

__Generating product descriptions:__

We can use generative AI to write creative product descriptions for our products. In the example below, we create three product descriptions for a pair of sunglasses.

__Product: Sunglasses.  \
Keywords: polarized, style, comfort, UV protection. \
List three different product descriptions \
for the product listed above using \
at least two of the provided keywords.__

In [None]:
prompt_data = """Product: Sunglasses.  \
Keywords: polarized, style, comfort, UV protection. \
List three different product descriptions \
for the product listed above using \
at least two of the provided keywords."""

print(send_prompt(prompt_data, temperature=0.0))
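
Because the prompt asks for at least two of the listed keywords in each description, a simple check can flag outputs that miss the constraint. This is a minimal sketch: `find_keywords` is a hypothetical helper, it uses plain case-insensitive substring matching, and the sample description is hand-written for illustration rather than produced by the model.

```python
def find_keywords(description, keywords):
    """Return the keywords that appear in the description (case-insensitive)."""
    lowered = description.lower()
    return [kw for kw in keywords if kw.lower() in lowered]


keywords = ["polarized", "style", "comfort", "UV protection"]

# Hand-written placeholder text, NOT real model output.
sample_description = "Polarized lenses with full UV protection for bright days."

found = find_keywords(sample_description, keywords)
print(found, len(found) >= 2)
```

Substring matching is deliberately crude (it would also count "style" inside "stylesheet"), so treat it as a first-pass filter rather than a full evaluation.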

---

### 4. In-context learning: 

As pre-trained large language models learn from large and diverse data sources, they tend to build a holistic view of languages and text. This allows them to learn from input-output pairs that are presented in the input text. 

In this section, we will explain this __"in-context"__ learning capability with some examples. Depending on the number of examples presented to the model, we can use zero-shot, one-shot or few-shot learning. We start with the most extreme case, where no examples are presented to the model. This is called __"zero-shot learning"__.

#### Zero-shot learning:
Assume the model is given a translation task and an input word.

__Translate the following word from English to Spanish \
word: cat \
translation:__

In [None]:
prompt_data = """Translate the following word from English to Spanish \
word: cat \
translation: """

print(send_prompt(prompt_data, temperature=0.0))

Correctly translated to Spanish. Let's try something different in the next one.

#### One-shot learning:
We can give the model one example and let it learn from the example to solve a problem. Below, we provide an example sentence about a cat and the model completes the second sentence about a car in a similar way.

__Answer the last question \
question: what is a cat? \
answer: cat is an animal \
\##  \
last question: what is a car?\
answer: car is__

In [None]:
prompt_data = """Answer the last question \
question: what is a cat? \
answer: cat is an animal \
## \
last question: what is a car? \
answer: car is """

print(send_prompt(prompt_data, temperature=0.0))

It worked very well. 

#### Few-shot learning:
We can give the model multiple examples to learn from. Providing more examples can help the model produce more accurate results. Let's also change the style of the example answers by adding some __negation__ to them.

__Answer the last question \
question: what is a car? \
answer: car is not an animal \
\## \
question: what is a cat? \
answer: cat is not a vehicle \
\## \
last question: what is a shoe? \
answer: shoe is__

In [None]:
prompt_data = """Answer the last question
question: what is a car?
answer: car is not an animal
##
question: what is a cat?
answer: cat is not a vehicle
##
last question: what is a shoe?
answer: shoe is """

print(send_prompt(prompt_data, temperature=0.0))

The response picked up the overall style very well. Notice that it starts with "not".

We can increase the __temperature__ (and, here, also raise __top_p__ to 1.0) to get more varied responses. Let's try that below.

In [None]:
prompt_data = """Answer the last question
question: what is a car?
answer: car is not an animal
##
question: what is a cat?
answer: cat is not a vehicle
##
last question: what is a shoe?
answer: shoe is """

print(send_prompt(prompt_data, top_p=1.0, temperature=0.85))
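
To build intuition for what `temperature` and `top_p` change, here is a toy sketch of temperature scaling followed by nucleus (top-p) filtering over a made-up next-token distribution. It follows the common textbook formulation; how Titan applies these parameters internally is not covered in this notebook, so treat the function names, the token scores, and the greedy handling of temperature 0 as assumptions.

```python
import math


def apply_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    if temperature == 0:
        # Assumed convention: temperature 0 means always pick the top token.
        best = max(logits, key=logits.get)
        return {tok: 1.0 if tok == best else 0.0 for tok in logits}
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def top_p_filter(probs, top_p):
    """Keep the smallest set of top tokens whose cumulative probability reaches top_p."""
    kept, cumulative = {}, 0.0
    for tok, p in sorted(probs.items(), key=lambda item: item[1], reverse=True):
        kept[tok] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}


# Made-up scores for three candidate next tokens.
toy_logits = {"not": 2.0, "a": 1.0, "footwear": 0.5}

greedy = top_p_filter(apply_temperature(toy_logits, 0.0), 0.5)
varied = top_p_filter(apply_temperature(toy_logits, 0.85), 1.0)
print(greedy)
print(varied)
```

With temperature 0 only the top-scoring token survives, which is why repeated runs give the same text. With a higher temperature and `top_p=1.0`, all three tokens keep some probability, so sampling can produce different continuations.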

Let's try one more example. This time we remove the instruction and try to complete the last sentence.

__question: what is a cat? \
answer: cat is a domesticated wild animal that belongs to the Felidae family. \
\##  \
question: what is a car? \
answer: car is a vehicle with wheels that is used for transportation. \
\##  \
question: what is a shoe? \
answer: shoe is__

In [None]:
prompt_data = """
question: what is a cat?
answer: cat is a domesticated wild animal that belongs to the Felidae family.
##
question: what is a car?
answer: car is a vehicle with wheels that is used for transportation.
##
question: what is a shoe?
answer: shoe is """

print(send_prompt(prompt_data, temperature=0.0))

It worked again. The model nicely followed the provided pattern.
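
The zero-shot, one-shot, and few-shot prompts in this section differ only in how many worked examples precede the final question. A minimal sketch of a builder that produces that layout follows; `build_few_shot_prompt` and its arguments are illustrative names, and the output format imitates the prompts above rather than any format required by Bedrock.

```python
def build_few_shot_prompt(examples, question, answer_prefix="", instruction=None):
    """Format (question, answer) examples followed by the question to complete.

    An empty `examples` list gives a zero-shot prompt, one pair gives one-shot,
    and several pairs give few-shot.
    """
    lines = []
    if instruction:
        lines.append(instruction)
    for example_question, example_answer in examples:
        lines.append(f"question: {example_question}")
        lines.append(f"answer: {example_answer}")
        lines.append("##")  # separator between examples, as in the prompts above
    lines.append(f"question: {question}")
    lines.append(f"answer: {answer_prefix}")
    return "\n".join(lines)


few_shot_prompt = build_few_shot_prompt(
    examples=[
        ("what is a car?", "car is not an animal"),
        ("what is a cat?", "cat is not a vehicle"),
    ],
    question="what is a shoe?",
    answer_prefix="shoe is ",
)
print(few_shot_prompt)
```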


---

### 5. Chain of thought concept: 


The chain of thought concept breaks a problem down into a series of intermediate reasoning steps. Prompting a model to work through these steps has significantly improved the quality of the outputs generated by Large Language Models. 

Let's start with a simple problem. Although many problems may seem easy to us, some may be challenging to LLMs if they require solving intermediate steps before giving the final answer.

Here is the question.

__Answer the following question.__

__Question: When I was 16, my sister was half of my age.__ \
__Now, I’m 42. How old is my sister now?__

__Answer:__

In [None]:
prompt_data = """Answer the following question.

Question: When I was 16, my sister was half of my age. \
Now, I’m 42. How old is my sister now?

Answer: """

print(send_prompt(prompt_data, temperature=0.0))

The answer is __incorrect__! This is not a big surprise. Many Large Language Models make these types of mistakes. In this case, the model skipped a few steps to solve the problem.
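
For reference, the intermediate steps the question requires can be written out explicitly. The sketch below is only a check of the arithmetic, with a hypothetical function name: the age gap is worked out at the earlier moment and then subtracted from the current age.

```python
def sister_age_now(my_age_then, my_age_now):
    """Solve 'when I was X, my sister was half my age; now I am Y'."""
    sister_age_then = my_age_then / 2          # step 1: sister's age back then
    age_gap = my_age_then - sister_age_then    # step 2: the gap never changes
    return my_age_now - age_gap                # step 3: apply the gap today


print(sister_age_now(16, 42))
```

With the numbers in the question, the sister was 8 when the speaker was 16, so the gap is 8 years and the correct answer is 34.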

Let's try another idea. As we have seen in the __in-context__ learning topic, LLMs tend to learn from the provided inputs and apply what they learn to other problems. Here, we will first provide the step-by-step solution for the same problem with different numbers and then ask the model to solve the original problem. 

__Answer the following question:__ 

__Question: When I was 10, my sister was half of my age.__ \
__Now, I’m 70. How old is my sister now?__

__Answer: When I was 10 years old, my sister was half of my age.__ \
__So, the age of the sister at that time = 10/2 = 5__ \
__This implies that the sister is 5 years younger.__ \
__Now, when I’m 70 years and age of sister = 70 - 5__ \
__Age of sister = 65.__

__Question: When I was 16, my sister was half of my age.__ \
__Now I’m 42. How old is my sister now?__

__Answer:__

In [None]:
prompt_data = """Answer the following question.

Question: When I was 10, my sister was half of my age. \
Now, I’m 70. How old is my sister now?

Answer: When I was 10 years old, my sister was half of my age. \
So, the age of the sister at that time = 10/2 = 5 \
This implies that the sister is 5 years younger. \
Now, when I’m 70 years and age of sister = 70 - 5 \
Age of sister = 65.

Question: When I was 16, my sister was half of my age. \
Now I’m 42. How old is my sister now?

Answer: """

print(send_prompt(prompt_data, temperature=0.0))

The model followed the given example and applied the same steps to solve the problem.

# Thank you!
