# Advanced Prompt Engineering for Meta Llama 3
---

> *This notebook should work well with the **`Python 3`** kernel in SageMaker Studio in the **`us-west-2`** region.*

## What is Prompt Engineering?
Prompt engineering is an emerging discipline focused on developing and optimizing prompts to efficiently adapt language models to various tasks. By using prompt engineering techniques, you can often get better responses from foundation models (FMs) without spending effort and resources on fine-tuning them.

There are many advanced prompt engineering techniques. This notebook covers:
1. **Few-shot examples**: This technique involves providing the large language model with a few examples (typically 3 to 5) of the desired task or output. It helps the model pick up patterns from limited data, making it particularly useful when extensive datasets are unavailable. Few-shot prompting can enable the model to generalize effectively from minimal input.
2. **Chain-of-Thought (CoT)**: This technique encourages the language model to articulate its reasoning step by step, which can improve transparency and understanding of the model's thought and decision-making process. By breaking down complex problems into more manageable parts, CoT prompting can increase problem-solving accuracy.
3. **Self-consistency**: This technique improves the performance and consistency of language models by generating multiple outputs for the same input and selecting the most consistent answer among them.
4. **ReAct (Reasoning and Acting)**: This technique combines reasoning with action by having the model generate both logical reasoning steps and specific actions to take based on those steps. This allows for more dynamic interactions where the model can adapt its responses based on ongoing interactions.
5. **Prompt chaining**: This technique involves linking (hence, chaining) multiple prompts together to create a coherent, consistent flow of interactions. By referencing previous inputs or outputs, prompt chaining encourages deeper interactions and provides more consistent responses from the language model.


### Implementation
In this notebook, we show how to implement various prompt engineering techniques with the **Meta Llama 3** foundation models, using the **Amazon Bedrock API** through the `Boto3` client.


<div class="alert alert-block alert-info">
    <b>Remark</b>: If you are a first-time user, please make sure you have enabled access to the Llama models on the Amazon Bedrock console.
</div>

### Set up
---

**Note**: You can ignore the error message during the `pip install`.

In [1]:
%pip install --upgrade --quiet boto3

Note: you may need to restart the kernel to use updated packages.


In [2]:
import boto3
from botocore.exceptions import ClientError
import ipywidgets as widgets
import json
import time

If you want to use another Meta foundation model, here are the available `model_id` values:

- Llama 3.1 8B: `'meta.llama3-1-8b-instruct-v1:0'`
- Llama 3.1 70B: `'meta.llama3-1-70b-instruct-v1:0'`
- Llama 3.1 405B: `'meta.llama3-1-405b-instruct-v1:0'`
- Llama 3 8B: `'meta.llama3-8b-instruct-v1:0'`
- Llama 3 70B: `'meta.llama3-70b-instruct-v1:0'`

In [3]:
model_selection = widgets.Dropdown(
    options=[
        'meta.llama3-1-8b-instruct-v1:0',
        'meta.llama3-1-70b-instruct-v1:0',
        'meta.llama3-1-405b-instruct-v1:0',
        'meta.llama3-8b-instruct-v1:0',
        'meta.llama3-70b-instruct-v1:0'
    ],
    value='meta.llama3-8b-instruct-v1:0',
    description='Select model:',
    disabled=False,
)
model_selection

Dropdown(description='Select model:', index=3, options=('meta.llama3-1-8b-instruct-v1:0', 'meta.llama3-1-70b-i…

In [4]:
model_id = model_selection.value
print('You selected: {}'.format(model_id))

You selected: meta.llama3-8b-instruct-v1:0


In [5]:
boto_session = boto3.session.Session()
bedrock_runtime_client = boto_session.client(
    service_name='bedrock-runtime',
    region_name=boto_session.region_name,
)

For this notebook, we will use the `converse` API of the Boto3 client.

For more details, please refer to the Boto3 documentation [here](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-runtime/client/converse.html) and the user guide [here](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-meta.html).

In [6]:
def generate_text(
    model_id: str,
    user_message: str,
    max_tokens: int=1024,
    temperature: float=0.2,
    top_p: float=0.8,
    verbose: bool=False
) -> dict:
    output_dict = dict()
    message_payload = [{
        'role': 'user',
        'content': [{
            'text': user_message
        }],
    }]
    inference_config = {
        'maxTokens': max_tokens,
        'temperature': temperature,
        'topP': top_p,
    }
    if verbose:
        print(' -- You are using: {}'.format(model_id))
        print(' -- Here is the inference configuration:\n{}'.format(json.dumps(inference_config, indent=2)))

    try:
        resp = bedrock_runtime_client.converse(
            modelId=model_id,
            messages=message_payload,
            inferenceConfig=inference_config,
            additionalModelRequestFields={}
        )
        output_dict['text'] = resp['output']['message']['content'][0]['text'].strip()
        output_dict['usage'] = resp['usage']
        output_dict['metrics'] = resp['metrics']

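    except Exception as e:
        # Exception already covers botocore's ClientError. On failure the
        # returned dict stays empty, so callers should check for the 'text' key.
        print('Error: cannot invoke {model_id}. Reason: {error}'.format(model_id=model_id, error=e))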

    return output_dict
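
The `generate_text` helper above sends a single user message. As a minimal sketch (not part of the original notebook), the request can also be assembled separately from the call, which makes it easy to inspect and to add an optional system prompt. `build_converse_request` and its `system_prompt` parameter are hypothetical names introduced here; the `system` field as a list of text blocks follows the Converse request shape, but please check the Boto3 documentation for your model before relying on it.

```python
from typing import Optional


def build_converse_request(
    model_id: str,
    user_message: str,
    system_prompt: Optional[str] = None,
    max_tokens: int = 1024,
    temperature: float = 0.2,
    top_p: float = 0.8,
) -> dict:
    """Assemble keyword arguments for a Converse call (hypothetical helper)."""
    request = {
        'modelId': model_id,
        'messages': [{'role': 'user', 'content': [{'text': user_message}]}],
        'inferenceConfig': {
            'maxTokens': max_tokens,
            'temperature': temperature,
            'topP': top_p,
        },
    }
    if system_prompt:
        # Assumption: system prompts are passed as a list of text blocks.
        request['system'] = [{'text': system_prompt}]
    return request


request = build_converse_request(
    'meta.llama3-8b-instruct-v1:0',
    'This movie is excellent.',
    system_prompt='You rate movie comments from 1 to 5 stars.',
)
print(request['system'])
```

With a client in hand, the request could then be sent as `bedrock_runtime_client.converse(**request)`.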



## Few-shot prompting
---
This technique involves providing the language model with a **few** specific examples of the desired task or output, which generally results in more accurate and consistent responses. The user supplies a few **input-output** pairs that illustrate the desired response.



First, let's look at plain (zero-shot) prompting for a movie sentiment use case.

In [7]:
user_prompt = '''Please rate this comment from the scale of 1 to 5 stars.
Here is the comment: This movie is excellent.
'''

resp = generate_text(model_id, user_prompt)
print(resp['text'])

I would rate this comment 2 stars out of 5. The comment is very brief and lacks specific details or reasons why the movie is excellent. It's a very general and vague statement that doesn't provide much insight or value to others. A more detailed and thoughtful comment would be more helpful and deserving of a higher rating.


Let's continue with the movie sentiment use case, this time adding a few example ratings that show how to rate a comment.

In [8]:
user_prompt = '''Please rate this comment from the scale of 1 to 5 stars
Probably the best movie of my life!: 5 star
I'm ok with this movie: 3 star
This movie is waste of time: 1 star

Here is the comment: This movie is excellent.
'''
resp = generate_text(model_id, user_prompt)
print(resp['text'])

Based on the scale provided, I would rate the comment "This movie is excellent." as 5 stars. The language used is very positive, indicating a high level of enthusiasm and praise for the movie. The comment is similar in tone to "Probably the best movie of my life!", which is also rated as 5 stars.


As the output shows, the language model now rates the sentiment of the comment itself, in line with the examples, and gives a more accurate and consistent rating.
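
When the examples live in a list rather than in a hand-written string, the few-shot prompt can be assembled programmatically. The sketch below is an illustration only; `build_few_shot_prompt` is a hypothetical helper name, and the layout simply mirrors the prompt used in the cell above.

```python
from typing import List, Tuple


def build_few_shot_prompt(
    instruction: str,
    examples: List[Tuple[str, str]],
    query: str,
) -> str:
    """Build a few-shot prompt from (input, output) example pairs."""
    lines = [instruction]
    for comment, rating in examples:
        # One "input: output" demonstration per line.
        lines.append('{}: {}'.format(comment, rating))
    lines.append('')  # blank line separating the examples from the query
    lines.append('Here is the comment: {}'.format(query))
    return '\n'.join(lines) + '\n'


examples = [
    ('Probably the best movie of my life!', '5 star'),
    ("I'm ok with this movie", '3 star'),
    ('This movie is waste of time', '1 star'),
]
few_shot_prompt = build_few_shot_prompt(
    'Please rate this comment from the scale of 1 to 5 stars',
    examples,
    'This movie is excellent.',
)
print(few_shot_prompt)
```

The resulting string can be passed to `generate_text(model_id, few_shot_prompt)` in the same way as the hand-written prompt.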

## Chain-of-Thought (CoT)
---
This technique prompts the language model to articulate its reasoning or to follow specific steps given in the prompt, which can improve transparency, help guide its thinking, and produce a more coherent and relevant response. By breaking down complex problems into more manageable steps, CoT prompting can improve problem-solving accuracy.

Let's start with a generic prompt.

In [9]:
user_prompt = '''Can you describe Tokyo Tower to your audience?
'''

# Let's generate 2 examples for comparison
for i in range(2):
    print('Round {}'.format(i+1))
    resp = generate_text(model_id, user_prompt, max_tokens=1024, temperature=.5)
    print(resp['text'])
    print(' ----- ')

Round 1
Tokyo Tower! It's an iconic landmark and a must-visit attraction in Tokyo, Japan. Completed in 1958, Tokyo Tower stands at an impressive 332.6 meters (1,091 feet) tall, making it one of the tallest structures in the world at the time of its completion.

The tower's design is a nod to the Eiffel Tower in Paris, with a similar lattice-like structure. However, Tokyo Tower has a unique, octagonal shape, with a distinctive white and orange color scheme. The tower's height is divided into three sections: the first section is a square base, the second section is a circular middle section, and the top section is a spherical observation deck.

Tokyo Tower is not only an engineering marvel but also a cultural icon. It was built to commemorate Japan's post-war reconstruction and has since become a symbol of Tokyo and Japan. The tower is also a popular spot for tourists and locals alike, offering breathtaking views of the city from its observation decks.

Visitors can take elevators to the

Now let's test with CoT prompting techniques.

In [10]:
user_prompt = '''Can you describe Tokyo Tower to your audience? Begin with:
1. What is it
2. When it was built, and how long it took them to build
3. Why it was built
4. End it with why should I visit this place, throw in fun facts to engage with the audiences.
'''

for i in range(2):
    print('Round {}'.format(i+1))
    resp = generate_text(model_id, user_prompt, max_tokens=1024, temperature=.5)
    print(resp['text'])
    print('-----\n\n')

Round 1
Tokyo Tower!

**What is it?**
Tokyo Tower is a iconic communications tower located in the heart of Tokyo, Japan. Standing at an impressive 332.6 meters (1,091 feet) tall, it's one of the tallest towers in the world when it was first built in 1958. The tower is a broadcasting tower, designed to transmit radio and television signals to the city.

**When was it built, and how long did it take?**
Construction on Tokyo Tower began in 1957 and took approximately two years to complete, with a total of 3,400 tons of steel used in its construction. The tower was officially opened on December 23, 1958.

**Why was it built?**
Tokyo Tower was built to serve as a broadcasting tower for Tokyo's growing radio and television industry. At the time, Japan was experiencing rapid economic growth, and the government wanted to create a symbol of Tokyo's modernization and technological advancements. The tower was also designed to be a iconic landmark, showcasing Japan's engineering and architectural 

As you can see from the responses, the language model answers in a more structured and consistent way when the prompt lays out the steps. This gives us more control over the language model's output.
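
If the same step structure is reused across many questions, the numbered steps can be kept in a list and rendered into the prompt. This is a small sketch under that assumption; `build_stepwise_prompt` is a hypothetical helper, and the wording follows the Tokyo Tower prompt above.

```python
from typing import List


def build_stepwise_prompt(question: str, steps: List[str]) -> str:
    """Append a numbered list of steps that the model should follow in order."""
    numbered = [
        '{}. {}'.format(index, step)
        for index, step in enumerate(steps, start=1)
    ]
    return '{} Begin with:\n{}\n'.format(question, '\n'.join(numbered))


steps = [
    'What is it',
    'When it was built, and how long it took them to build',
    'Why it was built',
    'End it with why should I visit this place, throw in fun facts to engage with the audiences.',
]
cot_prompt = build_stepwise_prompt(
    'Can you describe Tokyo Tower to your audience?',
    steps,
)
print(cot_prompt)
```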

## Self-consistency 
---
Language models are probabilistic, so even with the Chain-of-Thought technique, a single generation might produce an incorrect or inconsistent result. Self-consistency samples several generations for the same prompt and selects the most frequent answer, which improves the consistency of what the user sees.
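
The selection step can be sketched as a simple majority vote. This is an illustrative sketch, not the notebook's implementation: `self_consistent_answer` is a hypothetical helper, and `sample_fn` stands in for a call such as `lambda: generate_text(model_id, prompt, temperature=0.7)['text']`. It assumes each sample has already been reduced to a short, comparable answer; long free-form responses would need an extraction step first.

```python
from collections import Counter
from typing import Callable, List, Tuple


def self_consistent_answer(
    sample_fn: Callable[[], str],
    n_samples: int = 5,
) -> Tuple[str, List[str]]:
    """Sample several answers and return the most frequent one."""
    # Light normalization so that 'Yes' and 'yes ' count as the same answer.
    answers = [sample_fn().strip().lower() for _ in range(n_samples)]
    winner, _count = Counter(answers).most_common(1)[0]
    return winner, answers


# Canned answers standing in for repeated model calls (illustration only).
canned = iter(['5 stars', '4 stars', '5 stars', '5 stars', '3 stars'])
winner, answers = self_consistent_answer(lambda: next(canned), n_samples=5)
print(winner)
```

A sampling temperature above zero is needed for the generations to differ; with near-deterministic settings the vote adds little.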

Let's examine an example for a customer support use case.

In [11]:
user_prompt = '''
If a customer have ordered a laptop but received a tablet instead.
What are the steps you will take?
'''
for i in range(2):
    print('Round {}'.format(i+1))
    resp = generate_text(model_id, user_prompt)
    print(resp['text'])
    print('-----\n\n')
    time.sleep(2)

Round 1
A common customer service conundrum! Here are the steps I would take if a customer ordered a laptop but received a tablet instead:

1. **Acknowledge and apologize**: Respond promptly to the customer's concern, acknowledging their issue and apologizing for the mistake. This sets a positive tone for the resolution process.

Example: "Dear [Customer], thank you for reaching out to us about the issue with your recent order. I apologize for the mistake and assure you that we're here to help resolve it as quickly as possible."

2. **Verify the order and product**: Confirm the customer's order details, including the product ordered and the product received. This ensures accuracy and helps identify the root cause of the issue.

Example: "Can you please confirm your order number and the product you received? I'd like to verify the details to better assist you."

3. **Offer a solution**: Depending on the situation, I would offer one of the following solutions:
	* **Replace the product**:

In [12]:
user_prompt = '''
If a customer have ordered a laptop but received a tablet instead.
The support agent should first acknowledge the mistake,
guide the customer on how to return the tablet,
and ensure that the laptop is shipped promptly.
'''
for i in range(2):
    print('Round {}'.format(i+1))
    resp = generate_text(model_id, user_prompt)
    print(resp['text'])
    print('-----\n\n')
    time.sleep(2)

Round 1
Here's an example of how the support agent could handle the situation:

**Initial Response**

"Hello [Customer Name], thank you for reaching out to us about the issue with your recent order. I apologize for the mistake - it appears that you received a tablet instead of the laptop you ordered. I'm so sorry for the inconvenience this has caused.

Can you please confirm the order number and the tablet's serial number so I can look into this further? Additionally, would you be available to return the tablet to us at your earliest convenience?"

**Return Process**

"Thank you for providing the necessary information. I've gone ahead and initiated a return label for the tablet. You should receive an email with the return shipping instructions shortly. Please make sure to package the tablet securely and include all original accessories before shipping it back to us.

Once we receive the tablet, we'll process a refund for the full amount of the order. If you have any questions or concer

## ReAct (Reasoning and Acting)
---
This technique combines both reasoning and action planning within language models. This allows the models to generate both reasoning traces and task-specific actions, enhancing the models' ability to interact with external environments and adapt their responses accordingly. As a result, this technique can significantly improve the accuracy and reliability of models' responses.
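
To make the reasoning-and-acting cycle concrete, here is a minimal sketch of a ReAct-style loop. Everything in it is an assumption for illustration: the `Thought` / `Action: tool[input]` / `Observation` / `Final Answer` line format is one common convention rather than something this notebook defines, `react_loop` and the `lookup` tool are hypothetical, and `scripted_llm` is a stand-in for a real model call such as `generate_text`.

```python
import re
from typing import Callable, Dict

ACTION_PATTERN = re.compile(r'^Action:\s*(\w+)\[(.*)\]\s*$', re.MULTILINE)
FINAL_PATTERN = re.compile(r'^Final Answer:\s*(.+)$', re.MULTILINE)


def react_loop(
    llm: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    question: str,
    max_steps: int = 5,
) -> str:
    """Alternate model reasoning with tool calls until a final answer appears."""
    transcript = 'Question: {}\n'.format(question)
    for _ in range(max_steps):
        step = llm(transcript)
        transcript += step + '\n'
        final = FINAL_PATTERN.search(step)
        if final:
            return final.group(1).strip()
        action = ACTION_PATTERN.search(step)
        if action is None:
            break  # the model neither acted nor answered
        tool_name, tool_input = action.group(1), action.group(2)
        tool = tools.get(tool_name)
        observation = tool(tool_input) if tool else 'Unknown tool: {}'.format(tool_name)
        # Feed the tool result back so the next reasoning step can use it.
        transcript += 'Observation: {}\n'.format(observation)
    return 'No final answer within {} steps.'.format(max_steps)


def scripted_llm(transcript: str) -> str:
    """Stand-in model: acts once, then answers after seeing an observation."""
    if 'Observation:' not in transcript:
        return 'Thought: I need market information.\nAction: lookup[market conditions]'
    return 'Thought: I have what I need.\nFinal Answer: Start with a market study.'


tools = {'lookup': lambda query: 'placeholder result for "{}"'.format(query)}
answer = react_loop(scripted_llm, tools, 'How should we expand into a new market?')
print(answer)
```

The examples that follow in this section use a single prompt without external tools, so they show only the reasoning-then-plan part of the pattern.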


In [13]:
user_prompt = '''
What should tech company do if it wants to expand its operations into Malaysia?
'''

resp = generate_text(model_id, user_prompt,)
print(resp['text'])

If a tech company wants to expand its operations into Malaysia, here are some steps it can consider:

1. Research and understand the market:
	* Study the Malaysian tech industry, its growth prospects, and the competitive landscape.
	* Identify potential customers, partners, and competitors.
	* Analyze the regulatory environment, tax laws, and labor laws.
2. Register a company:
	* Register a subsidiary or a branch office in Malaysia, depending on the company's structure and goals.
	* Obtain necessary licenses and permits, such as a business license, tax registration, and employment permit.
3. Set up a local presence:
	* Rent or lease office space in a suitable location, such as Kuala Lumpur, Penang, or Cyberjaya.
	* Hire local staff, including a country manager, sales and marketing team, and technical support staff.
	* Establish a local bank account and obtain a local currency (MYR) account.
4. Comply with regulations:
	* Register with the Malaysian Communications and Multimedia Commiss

In [14]:
user_prompt = '''
A tech company is considering expanding its operations into Malaysia. 
First, gather information about the market conditions in that country, 
then evaluate potential risks and benefits, and finally suggest an action plan.
'''
resp = generate_text(model_id, user_prompt,)
print(resp['text'])

**Market Conditions in Malaysia**

Malaysia is a rapidly growing economy with a population of over 32 million people. The country has a strong and diverse economy, with a mix of manufacturing, services, and agriculture sectors. Here are some key market conditions to consider:

1. **Economic Growth**: Malaysia has a GDP growth rate of around 4.5%, making it one of the fastest-growing economies in Southeast Asia.
2. **Labor Costs**: Labor costs in Malaysia are relatively low compared to other developed countries, with an average monthly salary of around $400-$500.
3. **Infrastructure**: Malaysia has a well-developed infrastructure, with modern airports, seaports, and transportation networks.
4. **Language**: The official language is Malay, but English is widely spoken, making it easier for foreign companies to operate.
5. **Regulations**: Malaysia has a relatively business-friendly regulatory environment, with a low tax rate of 24% and a simple process for setting up a business.
6. **Mar

## Prompt chaining
---
This technique involves breaking down a complex task into smaller, manageable subtasks, where the output of one prompt serves as the input for the next one. It can improve the reliability and performance of the language model, because the model processes information in a more structured manner and focuses on one specific task at a time.
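
The pattern can be sketched as a small loop that feeds each output into the next prompt template. This is a sketch under stated assumptions: `run_chain` and the `{previous}` placeholder are hypothetical names, and `fake_llm` stands in for a real call such as `lambda p: generate_text(model_id, p)['text']`.

```python
from typing import Callable, List


def run_chain(
    llm: Callable[[str], str],
    templates: List[str],
    initial_input: str = '',
) -> List[str]:
    """Run prompts in order, passing each output into the next template."""
    outputs = []
    previous = initial_input
    for template in templates:
        # '{previous}' is replaced by the output of the preceding step.
        prompt = template.format(previous=previous)
        previous = llm(prompt)
        outputs.append(previous)
    return outputs


seen_prompts = []


def fake_llm(prompt: str) -> str:
    """Stand-in model that records prompts and returns a numbered reply."""
    seen_prompts.append(prompt)
    return 'reply {}'.format(len(seen_prompts))


templates = [
    'Identify top 3 target audiences for an eco-friendly product.',
    'Based on the identified target audience, suggest effective marketing channels.\n'
    'Here are the target audience:\n{previous}',
    'Develop a messaging strategy that emphasizes sustainability for this audience.\n'
    'Here are the target audience and channel:\n{previous}',
]
outputs = run_chain(fake_llm, templates)
print(outputs)
```

Templates are processed with `str.format`, so any literal braces in a template would need to be doubled; braces in the model output are safe because the output is substituted as a value.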

One common **use case** is writing a long document (e.g., a proposal or a marketing plan). Let's first look at the result of a single, traditional prompt.

In [15]:
user_prompt = '''
Write a 300-words marketing plan for an eco-friendly product.
'''
resp = generate_text(model_id, user_prompt,)
print(resp['text'])

Marketing Plan: EcoCycle - The Revolutionary Compostable Bag

Executive Summary:
EcoCycle is a new, eco-friendly product that offers a sustainable alternative to traditional plastic bags. Our compostable bags are made from plant-based materials and are designed to reduce waste and minimize environmental impact. Our marketing plan aims to raise awareness about the importance of sustainable living and position EcoCycle as the go-to solution for environmentally conscious consumers.

Target Market:

* Demographics: Health-conscious individuals, families with young children, and environmentally aware consumers
* Psychographics: People who prioritize sustainability, reduce waste, and support eco-friendly products
* Online presence: Active on social media platforms, frequent online shoppers, and subscribers to eco-friendly newsletters

Marketing Objectives:

* Increase brand awareness by 20% within the first 6 months
* Drive sales by 30% within the first year
* Position EcoCycle as a leader i

Now, let's implement it with prompt chaining.

In [16]:
step1_prompt = 'Identify top 3 target audiences for an eco-friendly product.'
resp = generate_text(model_id, step1_prompt,)
output1_resp = resp['text']
print(output1_resp)

Based on market trends and consumer behavior, here are the top 3 target audiences for an eco-friendly product:

1. Millennials (born between 1981 and 1996):
	* This age group is highly concerned about the environment and is more likely to prioritize sustainability in their purchasing decisions.
	* Millennials are also tech-savvy and active on social media, making them a prime target for eco-friendly brands that can leverage digital marketing and influencer partnerships.
	* Key characteristics: values-driven, socially conscious, and willing to pay a premium for sustainable products.
2. Gen Z (born between 1997 and 2012):
	* Gen Z is even more environmentally conscious than millennials, with 75% of them saying they're willing to make lifestyle changes to reduce their carbon footprint.
	* This age group is also highly influenced by social media and online reviews, making them a prime target for eco-friendly brands that can showcase their products' eco-credentials and customer testimonials

In [17]:
step2_prompt = '''
Based on the identified target audience, suggest effective marketing channels.
Here are the target audience: 
{target_audience}
'''.format(target_audience=output1_resp)
resp = generate_text(model_id, step2_prompt,)
output2_resp = resp['text']
print(output2_resp)

Based on the identified target audiences, here are some effective marketing channels to consider:

**Millennials (born between 1981 and 1996)**

1. Social Media: Leverage Instagram, Facebook, and Twitter to share engaging content, influencer partnerships, and user-generated content that showcases the eco-friendly product's benefits and sustainability features.
2. Influencer Marketing: Partner with eco-conscious influencers and bloggers who have a strong following among millennials. They can share their personal experiences with the product and promote it to their audience.
3. Email Marketing: Build an email list and send targeted campaigns to millennials who have shown interest in eco-friendly products or have purchased from the brand before.
4. Content Marketing: Create blog posts, videos, and podcasts that discuss sustainability, environmental issues, and the benefits of eco-friendly products. Share this content on the brand's website and social media channels.
5. Online Advertising:

In [18]:
step3_prompt = '''
Develop a messaging strategy that emphasizes sustainability for this audience.
Here are the target audience and channel:
{step2_resp}
'''.format(step2_resp=output2_resp)
resp = generate_text(model_id, step3_prompt,)
output3_resp = resp['text']
print(output3_resp)

Based on the target audiences and channels, here's a messaging strategy that emphasizes sustainability for each audience:

**Millennials:**

* Messaging theme: "Join the movement towards a more sustainable future"
* Key messages:
	+ Our eco-friendly products are designed to reduce waste and minimize environmental impact
	+ By choosing our products, you're supporting a brand that cares about the planet
	+ Share your own sustainability journey and inspire others to do the same
* Visuals: Use images and videos that showcase the product's eco-credentials, such as recyclable packaging, biodegradable materials, and reduced carbon footprint.
* Influencer partnerships: Partner with eco-conscious influencers who have a strong following among millennials and can share their personal experiences with the product.
* Content marketing: Create blog posts, videos, and podcasts that discuss sustainability, environmental issues, and the benefits of eco-friendly products.

**Gen Z:**

* Messaging theme:

## Bonus: Tree-of-Thoughts (ToT)
---
Tree-of-Thoughts (ToT) prompting is a more recent approach in prompt engineering designed to enhance the problem-solving capabilities of large language models. This technique explores multiple reasoning paths and evaluates them against each other, which can improve the models' ability to tackle complex tasks that require strategic planning and evaluation.

Let's consider the example of birthday party planning.

In [19]:
tot_prompt = '''
1. Plan a surprise birthday party for a friend.
Choice 1: A small gathering at a local park.
Choice 2: A fancy dinner at a restaurant.
Choice 3: A themed party at home.
Choice 4: A weekend getaway.
2. For each choice, consider:
- Guest list
- Decorations
- Activities
3. Weigh the pros and cons of each idea before finalizing the plan.

Based on the pros and cons, finalize one plan in your <response> tag, 
put the rest of your comparison and the reason why you chose the plan in your <thought_process> tag.
'''

resp = generate_text(model_id, tot_prompt)
print(resp['text'])

<response> Choice 3: A themed party at home </response>

<thought_process>

Let's weigh the pros and cons of each option:

**Choice 1: A small gathering at a local park**

Pros: Easy to organize, casual atmosphere, and a change of scenery.
Cons: Weather dependent, limited space, and might not be as memorable.

**Choice 2: A fancy dinner at a restaurant**

Pros: Upscale atmosphere, no cleanup required, and a unique experience.
Cons: Expensive, limited guest list, and might feel impersonal.

**Choice 4: A weekend getaway**

Pros: A unique and exciting experience, chance to relax and unwind, and create lifelong memories.
Cons: Expensive, requires a lot of planning, and might be overwhelming.

After considering the pros and cons, I think Choice 3: A themed party at home is the best option. Here's why:

* Guest list: We can invite a smaller group of close friends and family, making it more intimate and personalized.
* Decorations: We can create a unique and themed atmosphere with balloons, 
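
Because the prompt asks for `<response>` and `<thought_process>` tags, the final plan can be separated from the reasoning with a small parser. The sketch below is illustrative; `extract_tag` is a hypothetical helper, and it tolerates a missing closing tag since a response can be cut off by the `max_tokens` limit.

```python
import re
from typing import Optional


def extract_tag(text: str, tag: str) -> Optional[str]:
    """Return the content of <tag>...</tag>, or up to the end if unclosed."""
    pattern = r'<{0}>(.*?)(?:</{0}>|\Z)'.format(re.escape(tag))
    match = re.search(pattern, text, flags=re.DOTALL)
    return match.group(1).strip() if match else None


sample = (
    '<response> Choice 3: A themed party at home </response>\n\n'
    '<thought_process>\n\n'
    "Let's weigh the pros and cons of each option:"
)
print(extract_tag(sample, 'response'))
print(extract_tag(sample, 'thought_process'))
```

In the notebook, the same helper could be applied to `resp['text']` from the cell above.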

## In summary
--- 
Prompt engineering is an important field for optimizing how you apply, develop, and understand large language models. At its core, it is about designing prompts and interactions that extend what large language models can do toward your objectives, address weaknesses such as hallucination, and give insight into how the models function.

In a production environment, you should experiment with multiple prompting techniques, and you may need to combine several of them to achieve the objectives of your use case.