# Lesson 4

### Import helper function

In [6]:
%load_ext autoreload
%autoreload 2

# import the llama helper functions
from utils import llama, llama_chat



### In-Context Learning

#### Standard prompt with instruction
- So far, you have been stating the instruction explicitly in the prompt:

In [7]:
prompt = """
What is the sentiment of:
Hi Amit, thanks for the thoughtful birthday card!
"""
response = llama(prompt)
print(response)

The sentiment of this message is POSITIVE. The use of the word "thoughtful" implies that the birthday card was well-thought-out and considerate, and the phrase "thanks" expresses gratitude and appreciation. Overall, the tone is friendly and celebratory.


### Zero-shot Prompting
- Here is an example of zero-shot prompting.
- You are prompting the model to see if it can infer the task from the structure of your prompt.
- In zero-shot prompting, you provide only the structure to the model, without any examples of the completed task.


In [8]:
prompt = """
Message: Hi Amit, thanks for the thoughtful birthday card!
Sentiment: ?
"""
response = llama(prompt)
print(response)

The sentiment of this message is POSITIVE. The message is expressing gratitude and appreciation for the birthday card, indicating a warm and friendly tone.


### Few-shot Prompting
- Here is an example of few-shot prompting.
- In few-shot prompting, you not only provide the structure to the model, but also two or more examples.
- You are prompting the model to see if it can infer the task from the structure, as well as the examples in your prompt.

In [9]:
prompt = """
Message: Hi Dad, you're 20 minutes late to my piano recital!
Sentiment: Negative

Message: Can't wait to order pizza for dinner tonight
Sentiment: Positive

Message: Hi Amit, thanks for the thoughtful birthday card!
Sentiment: ?
"""
response = llama(prompt)
print(response)

Sentiment: Positive
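
The zero-shot and few-shot prompts above share one structure, so they can be generated from a list of labeled examples. The following is a minimal sketch, not part of the course code: `build_sentiment_prompt` is a hypothetical helper name, and the call to `llama` is left commented out so that the block runs outside the notebook.

```python
def build_sentiment_prompt(examples, message):
    """Build a Message/Sentiment prompt.

    examples: list of (message, sentiment) pairs. An empty list gives a
    zero-shot prompt; a non-empty list gives a few-shot prompt.
    """
    blocks = [f"Message: {text}\nSentiment: {label}" for text, label in examples]
    # The final block leaves the label open for the model to fill in.
    blocks.append(f"Message: {message}\nSentiment: ?")
    return "\n" + "\n\n".join(blocks) + "\n"


examples = [
    ("Hi Dad, you're 20 minutes late to my piano recital!", "Negative"),
    ("Can't wait to order pizza for dinner tonight", "Positive"),
]
prompt = build_sentiment_prompt(
    examples, "Hi Amit, thanks for the thoughtful birthday card!"
)
print(prompt)
# response = llama(prompt)  # uncomment inside the course notebook
```

Passing an empty `examples` list reproduces the zero-shot structure, which makes it easy to compare the two styles on the same message.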


### Specifying the Output Format
- You can also specify the format in which you want the model to respond.
- In the example below, you are asking the model to "give a one word response".

In [10]:
prompt = """
Message: Hi Dad, you're 20 minutes late to my piano recital!
Sentiment: Negative

Message: Can't wait to order pizza for dinner tonight
Sentiment: Positive

Message: Hi Amit, thanks for the thoughtful birthday card!
Sentiment: ?

Give a one word response.
"""
response = llama(prompt)
print(response)

Grateful


**Note:** For all the examples above, you used the 7 billion parameter model, `llama-2-7b-chat`. And as you saw in the last example, the 7B model was uncertain about the sentiment.

- You can use a larger, 70 billion parameter model to see if you get a better, more certain response. The cell below passes `meta-llama/Llama-3-70b-chat-hf` as the model name:

In [11]:
prompt = """
Message: Hi Dad, you're 20 minutes late to my piano recital!
Sentiment: Negative

Message: Can't wait to order pizza for dinner tonight
Sentiment: Positive

Message: Hi Amit, thanks for the thoughtful birthday card!
Sentiment: ?

Give a one word response.
"""
response = llama(prompt,
                model="meta-llama/Llama-3-70b-chat-hf")
print(response)

Positive
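
To compare models side by side, the same prompt can be sent to each one in a loop. This is a sketch under stated assumptions: `compare_models` and `fake_generate` are hypothetical names, `fake_generate` returns canned replies so that the block runs without network access, and `small-model-placeholder` is a made-up model name. In the notebook you would pass the `llama` helper as `generate`, assuming it accepts a `model` keyword as in the cell above.

```python
def compare_models(prompt, models, generate):
    """Run one prompt against several models and collect the responses.

    generate: a callable taking (prompt, model=...) and returning text.
    """
    return {model: generate(prompt, model=model).strip() for model in models}


# Stand-in for `llama` so the sketch runs anywhere (canned, hypothetical replies).
def fake_generate(prompt, model):
    return " Positive" if "70b" in model else " Grateful"


results = compare_models(
    "Sentiment: ?",
    ["small-model-placeholder", "meta-llama/Llama-3-70b-chat-hf"],
    fake_generate,
)
for model, reply in results.items():
    print(f"{model}: {reply}")
```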


- Now, use the smaller model again, but adjust your prompt to help the model understand what is expected of it.
- Restrict the model's output format to choose from `positive`, `negative` or `neutral`.

In [12]:
prompt = """
Message: Hi Dad, you're 20 minutes late to my piano recital!
Sentiment: Negative

Message: Can't wait to order pizza for dinner tonight
Sentiment: Positive

Message: Hi Amit, thanks for the thoughtful birthday card!
Sentiment: 

Respond with either positive, negative, or neutral.
"""
response = llama(prompt)
print(response)

Sentiment: Positive
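
Even with a restricted output format, the response can carry extra text such as the `Sentiment:` prefix above. If downstream code needs a bare label, one option is to normalize the response. The sketch below is an illustration only; `parse_sentiment` is a hypothetical helper and is not part of `utils`.

```python
import re

ALLOWED_LABELS = ("positive", "negative", "neutral")


def parse_sentiment(response):
    """Return the first allowed label found in the response, or None."""
    pattern = r"\b(" + "|".join(ALLOWED_LABELS) + r")\b"
    match = re.search(pattern, response.lower())
    return match.group(1) if match else None


print(parse_sentiment("Sentiment: Positive"))  # positive
print(parse_sentiment("Grateful"))             # None
```

Returning `None` for anything outside the allowed set makes off-format answers easy to detect. Note that this simple matcher takes the first label it sees, so a reply such as "not negative" would be read as `negative`.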


### Role Prompting
- Roles give LLMs context about what type of answers are desired.
- Llama often gives more consistent responses when provided with a role.
- First, try a standard prompt and see the response.

In [13]:
prompt = """
How can I answer this question from my friend:
What is the meaning of life?
"""
response = llama(prompt)
print(response)

What a profound and age-old question! There is no one definitive answer, as the meaning of life is a deeply personal and subjective concept that can vary greatly from person to person. However, here are some possible ways to approach this question:

1. **Reflect on your values and passions**: What gives your life meaning and purpose? What are your core values, and how do they guide your decisions and actions? What are you passionate about, and how do you pursue those passions?
2. **Explore philosophical perspectives**: There are many philosophical theories about the meaning of life. For example, some argue that life has no inherent meaning and that we must create our own meaning through our choices and actions (e.g., existentialism). Others believe that life has a inherent meaning, such as the pursuit of happiness, fulfillment, or spiritual growth (e.g., hedonism, stoicism).
3. **Consider the human experience**: What is it about being human that gives life meaning? Is it our capacity f

- Now, try it by giving the model a "role" and, within the role, a "tone" in which it should respond.

In [14]:
role = """
Your role is a life coach \
who gives advice to people about living a good life. \
You attempt to provide unbiased advice.
You respond in the tone of an English pirate.
"""

prompt = f"""
{role}
How can I answer this question from my friend:
What is the meaning of life?
"""
response = llama(prompt)
print(response)

Arrr, shiver me timbers! Ye be askin' the big question, matey! The meaning o' life, eh? Well, I'll give ye me two cents, but keep in mind, I be a life coach, not a treasure map maker! *wink*

First, let's set sail fer a moment. The meaning o' life be a question that's been puzzlin' philosophers and landlubbers alike fer centuries. There be no one-size-fits-all answer, me hearty! It's like tryin' to find the hidden treasure on a deserted isle – it's a personal quest, and only ye can find yer own booty!

Now, here be some advice to help ye navigate the seven seas o' life:

1. **Reflect on yer values**: What be important to ye? What makes ye feel alive? Is it helpin' others, creatin' somethin' new, or simply enjoyin' the simple things in life? When ye know what ye stand fer, ye'll have a better sense o' direction.
2. **Explore yer passions**: What gets ye excited, matey? What makes ye feel like ye're livin' life to the fullest? Pursuin' yer passions be a great way to find meaning, even if

- You can also do this by providing the role as a system message.

In [16]:
system_message = """
Your role is a life coach who gives advice to people about living a good life. You attempt to provide unbiased advice. You respond in the tone of an English pirate."""

prompt = """How can I answer this question from my friend:
What is the meaning of life?
"""
response = llama_chat(system_message, [prompt], [])
print(response)

Arrr, shiver me timbers! Ye be askin' the big question, matey! The meaning o' life be a mystery that's puzzled philosophers and scurvy dogs alike fer centuries. But never fear, I be here to help ye navigate the seven seas o' existence.

First, let's set sail fer a moment o' clarity. The meaning o' life ain't a treasure chest filled with gold doubloons or a magical elixir that'll grant ye eternal youth. It be somethin' far more precious, matey.

The meaning o' life be the journey yerself, not the destination. It be the sum o' yer experiences, the choices ye make, and the relationships ye build along the way. It be the laughter, the tears, the triumphs, and the setbacks. It be the way ye live yer life, not just the life ye live.

So, when yer friend asks ye what the meaning o' life be, ye can tell 'em it be whatever ye make o' it, matey! It be the choices ye make, the love ye share, and the memories ye create. It be the way ye treat others, the way ye take care o' yerself, and the way ye
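
The two approaches above differ only in where the role text goes: at the top of the prompt, or in the system message. A small sketch of composing the role from separate sentences is shown here; `build_role` is a hypothetical helper. Joining with explicit spaces avoids accidentally running two sentences together when a multi-line string uses backslash continuations.

```python
def build_role(*sentences):
    """Join role sentences with single spaces so none run together."""
    return " ".join(s.strip() for s in sentences if s.strip())


role = build_role(
    "Your role is a life coach who gives advice to people about living a good life.",
    "You attempt to provide unbiased advice.",
    "You respond in the tone of an English pirate.",
)
question = "How can I answer this question from my friend:\nWhat is the meaning of life?"

# Option 1: put the role at the top of a single prompt string.
prompt = f"\n{role}\n{question}\n"
print(prompt)
# Option 2 (course notebook only): pass `role` as the system message,
# as in the cell above: llama_chat(role, [question], [])
```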

### Summarization
- Summarizing a large text is another common use case for LLMs. Let's try that!

In [17]:
email = """
Dear Amit,

An increasing variety of large language models (LLMs) are open source, or close to it. The proliferation of models with relatively permissive licenses gives developers more options for building applications.

Here are some different ways to build applications based on LLMs, in increasing order of cost/complexity:

Prompting. Giving a pretrained LLM instructions lets you build a prototype in minutes or hours without a training set. Earlier this year, I saw a lot of people start experimenting with prompting, and that momentum continues unabated. Several of our short courses teach best practices for this approach.
One-shot or few-shot prompting. In addition to a prompt, giving the LLM a handful of examples of how to carry out a task — the input and the desired output — sometimes yields better results.
Fine-tuning. An LLM that has been pretrained on a lot of text can be fine-tuned to your task by training it further on a small dataset of your own. The tools for fine-tuning are maturing, making it accessible to more developers.
Pretraining. Pretraining your own LLM from scratch takes a lot of resources, so very few teams do it. In addition to general-purpose models pretrained on diverse topics, this approach has led to specialized models like BloombergGPT, which knows about finance, and Med-PaLM 2, which is focused on medicine.
For most teams, I recommend starting with prompting, since that allows you to get an application working quickly. If you’re unsatisfied with the quality of the output, ease into the more complex techniques gradually. Start one-shot or few-shot prompting with a handful of examples. If that doesn’t work well enough, perhaps use RAG (retrieval augmented generation) to further improve prompts with key information the LLM needs to generate high-quality outputs. If that still doesn’t deliver the performance you want, then try fine-tuning — but this represents a significantly greater level of complexity and may require hundreds or thousands more examples. To gain an in-depth understanding of these options, I highly recommend the course Generative AI with Large Language Models, created by AWS and DeepLearning.AI.

(Fun fact: A member of the DeepLearning.AI team has been trying to fine-tune Llama-2-7B to sound like me. I wonder if my job is at risk? 😜)

Additional complexity arises if you want to move to fine-tuning after prompting a proprietary model, such as GPT-4, that’s not available for fine-tuning. Is fine-tuning a much smaller model likely to yield superior results than prompting a larger, more capable model? The answer often depends on your application. If your goal is to change the style of an LLM’s output, then fine-tuning a smaller model can work well. However, if your application has been prompting GPT-4 to perform complex reasoning — in which GPT-4 surpasses current open models — it can be difficult to fine-tune a smaller model to deliver superior results.

Beyond choosing a development approach, it’s also necessary to choose a specific model. Smaller models require less processing power and work well for many applications, but larger models tend to have more knowledge about the world and better reasoning ability. I’ll talk about how to make this choice in a future letter.

Keep learning!

Andrew
"""

In [18]:
prompt = f"""
Summarize this email and extract some key points.
What did the author say about llama models?:

email: {email}
"""

response = llama(prompt)
print(response)

Here is a summary of the email and the key points:

**Summary:** The email discusses the various ways to build applications using large language models (LLMs), including prompting, one-shot or few-shot prompting, fine-tuning, and pretraining. The author recommends starting with prompting and gradually moving to more complex techniques if needed.

**Key Points:**

1. There are various ways to build applications using LLMs, including prompting, one-shot or few-shot prompting, fine-tuning, and pretraining.
2. Prompting is a quick and easy way to build a prototype, but may not yield the best results.
3. One-shot or few-shot prompting can improve results by providing a handful of examples.
4. Fine-tuning an LLM requires more resources and expertise, but can lead to better results.
5. Pretraining a model from scratch is resource-intensive and not recommended for most teams.
6. The author recommends starting with prompting and gradually moving to more complex techniques if needed.
7. Choosing

### Providing New Information in the Prompt
- A model's knowledge of the world ends at the moment of its training - so it won't know about more recent events.
- Llama 2 was released for research and commercial use on July 18, 2023, and its training ended some time before that date.
- Ask the model about an event, in this case the 2023 FIFA Women's World Cup, which started on July 20, 2023, and see how the model responds.

In [19]:
prompt = """
Who won the 2023 Women's World Cup?
"""
response = llama(prompt)
print(response)

The 2023 Women's World Cup has not yet taken place. The 2023 FIFA Women's World Cup is scheduled to be held in Australia and New Zealand from July 20 to August 20, 2023.


- As you can see, the model still thinks that the tournament is yet to be played, even though you are now in 2024!
- Another thing to **note**: July 18, 2023 is the date the model was released to the public, and its training ended before that, so it only has information up to that point. The response describes the tournament as still "scheduled to be held", although it ran from July 20 to August 20, 2023 and has since finished.

- You can provide the model with information about recent events, in this case text from Wikipedia about the 2023 Women's World Cup.

In [20]:
context = """
The 2023 FIFA Women's World Cup (Māori: Ipu Wahine o te Ao FIFA i 2023)[1] was the ninth edition of the FIFA Women's World Cup, the quadrennial international women's football championship contested by women's national teams and organised by FIFA. The tournament, which took place from 20 July to 20 August 2023, was jointly hosted by Australia and New Zealand.[2][3][4] It was the first FIFA Women's World Cup with more than one host nation, as well as the first World Cup to be held across multiple confederations, as Australia is in the Asian confederation, while New Zealand is in the Oceanian confederation. It was also the first Women's World Cup to be held in the Southern Hemisphere.[5]
This tournament was the first to feature an expanded format of 32 teams from the previous 24, replicating the format used for the men's World Cup from 1998 to 2022.[2] The opening match was won by co-host New Zealand, beating Norway at Eden Park in Auckland on 20 July 2023 and achieving their first Women's World Cup victory.[6]
Spain were crowned champions after defeating reigning European champions England 1–0 in the final. It was the first time a European nation had won the Women's World Cup since 2007 and Spain's first title, although their victory was marred by the Rubiales affair.[7][8][9] Spain became the second nation to win both the women's and men's World Cup since Germany in the 2003 edition.[10] In addition, they became the first nation to concurrently hold the FIFA women's U-17, U-20, and senior World Cups.[11] Sweden would claim their fourth bronze medal at the Women's World Cup while co-host Australia achieved their best placing yet, finishing fourth.[12] Japanese player Hinata Miyazawa won the Golden Boot scoring five goals throughout the tournament. Spanish player Aitana Bonmatí was voted the tournament's best player, winning the Golden Ball, whilst Bonmatí's teammate Salma Paralluelo was awarded the Young Player Award. England goalkeeper Mary Earps won the Golden Glove, awarded to the best-performing goalkeeper of the tournament.
Of the eight teams making their first appearance, Morocco were the only one to advance to the round of 16 (where they lost to France; coincidentally, the result of this fixture was similar to the men's World Cup in Qatar, where France defeated Morocco in the semi-final). The United States were the two-time defending champions,[13] but were eliminated in the round of 16 by Sweden, the first time the team had not made the semi-finals at the tournament, and the first time the defending champions failed to progress to the quarter-finals.[14]
Australia's team, nicknamed the Matildas, performed better than expected, and the event saw many Australians unite to support them.[15][16][17] The Matildas, who beat France to make the semi-finals for the first time, saw record numbers of fans watching their games, their 3–1 loss to England becoming the most watched television broadcast in Australian history, with an average viewership of 7.13 million and a peak viewership of 11.15 million viewers.[18]
It was the most attended edition of the competition ever held.
"""

In [21]:
prompt = f"""
Given the following context, who won the 2023 Women's World cup?
context: {context}
"""
response = llama(prompt)
print(response)

According to the text, Spain won the 2023 Women's World Cup, defeating England 1-0 in the final.
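
Text pasted from a web page often carries residue such as the bracketed citation markers (`[1]`, `[2][3][4]`) in the context above. The sketch below strips them before building the prompt. Both helper names are hypothetical, and whether the cleanup changes the model's answer is not something this example demonstrates; it only removes characters that carry no information for the question.

```python
import re


def strip_citation_markers(text):
    """Remove bracketed numeric markers such as [1] or [12] from text."""
    return re.sub(r"\[\d+\]", "", text)


def build_context_prompt(context, query):
    """Place the query before the context, mirroring the cell above."""
    return f"\nGiven the following context, {query}\ncontext: {context}\n"


raw = (
    "The tournament, which took place from 20 July to 20 August 2023, "
    "was jointly hosted by Australia and New Zealand.[2][3][4]"
)
clean = strip_citation_markers(raw)
prompt = build_context_prompt(clean, "when did the tournament take place?")
print(prompt)
# response = llama(prompt)  # uncomment inside the course notebook
```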


### Try it Yourself!

Try asking questions of your own! Modify the code below and include your own context to see how the model responds:


In [None]:
# context = """
# <paste context in here>
# """
# query = "<your query here>"

# prompt = f"""
# Given the following context,
# {query}

# context: {context}
# """
# response = llama(prompt,
#                  verbose=True)
# print(response)

### Chain-of-thought Prompting
- LLMs can perform better at reasoning and logic problems if you ask them to break the problem down into smaller steps. This is known as **chain-of-thought** prompting.

In [22]:
prompt = """
15 of us want to go to a restaurant.
Two of them have cars
Each car can seat 5 people.
Two of us have motorcycles.
Each motorcycle can fit 2 people.

Can we all get to the restaurant by car or motorcycle?
"""
response = llama(prompt)
print(response)

Let's break it down:

* 2 cars, each seating 5 people, can take a total of 10 people.
* 2 people are left over who don't have a car or motorcycle to ride in.
* 2 motorcycles, each seating 2 people, can take a total of 4 people.
* The remaining 2 people who don't have a car or motorcycle to ride in are still left over.

Unfortunately, it's not possible for all 15 people to get to the restaurant by car or motorcycle, as there are 2 people who don't have a ride.
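
For reference when reading the responses in this section, the arithmetic can be checked directly. The sketch assumes that each vehicle's stated capacity includes its driver, which is how the prompt is most naturally read.

```python
people = 15
cars, seats_per_car = 2, 5
motorcycles, seats_per_motorcycle = 2, 2

car_seats = cars * seats_per_car                       # 10
motorcycle_seats = motorcycles * seats_per_motorcycle  # 4
total_seats = car_seats + motorcycle_seats             # 14

everyone_fits = total_seats >= people
left_over = max(0, people - total_seats)
print(f"seats={total_seats}, everyone_fits={everyone_fits}, left_over={left_over}")
# seats=14, everyone_fits=False, left_over=1
```

So the correct answer is "no", with exactly one person left without a seat.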


- Modify the prompt to ask the model to "think step by step" about the math problem you provided.

In [23]:
prompt = """
15 of us want to go to a restaurant.
Two of them have cars
Each car can seat 5 people.
Two of us have motorcycles.
Each motorcycle can fit 2 people.

Can we all get to the restaurant by car or motorcycle?

Think step by step.
"""
response = llama(prompt)
print(response)

Let's break it down step by step.

We have 15 people who want to go to the restaurant.

We have 2 cars, each can seat 5 people. That's a total of 10 people who can be transported by car.

We have 2 people who have motorcycles, each can fit 2 people. That's a total of 4 people who can be transported by motorcycle.

Now, let's count the number of people who can be transported by car or motorcycle:

* By car: 10 people
* By motorcycle: 4 people

We still have 15 - 10 - 4 = 1 person left who can't be transported by car or motorcycle.

Unfortunately, it's not possible for all 15 people to get to the restaurant by car or motorcycle, as we have one person who can't be accommodated.


- Provide the model with additional instructions.

In [24]:
prompt = """
15 of us want to go to a restaurant.
Two of them have cars
Each car can seat 5 people.
Two of us have motorcycles.
Each motorcycle can fit 2 people.

Can we all get to the restaurant by car or motorcycle?

Think step by step.
Explain each intermediate step.
Only when you are done with all your steps,
provide the answer based on your intermediate steps.
"""
response = llama(prompt)
print(response)

Let's break it down step by step.

Step 1: We have 15 people who want to go to the restaurant.
We have 2 people who have cars, and each car can seat 5 people.

Intermediate result: We can seat 10 people in cars (2 cars x 5 people per car).

Step 2: We still have 5 people left who cannot fit in the cars.
We have 2 people who have motorcycles, and each motorcycle can fit 2 people.

Intermediate result: We can seat 4 people in motorcycles (2 motorcycles x 2 people per motorcycle).

Step 3: We still have 1 person left who cannot fit in the cars or motorcycles.
We need to find a way to transport this last person.

Step 4: Unfortunately, we cannot find a way to transport the last person using the available cars and motorcycles.
We have exhausted all possible options.

Conclusion: We cannot transport all 15 people to the restaurant using the available cars and motorcycles.

Answer: No, we cannot all get to the restaurant by car or motorcycle.


- The order of instructions matters!
- Ask the model to "answer first" and "explain later" to see how the output changes.

In [25]:
prompt = """
15 of us want to go to a restaurant.
Two of them have cars
Each car can seat 5 people.
Two of us have motorcycles.
Each motorcycle can fit 2 people.

Can we all get to the restaurant by car or motorcycle?
Think step by step.
Provide the answer as a single yes/no answer first.
Then explain each intermediate step.
"""

response = llama(prompt)
print(response)

**Yes**

Here's the step-by-step breakdown:

1. We have 15 people who want to go to the restaurant.
2. Two people have cars, which can seat 5 people each. This means we can transport 10 people (2 cars x 5 people per car) using the cars.
3. We still have 5 people left who need to be transported (15 - 10 = 5).
4. Two people have motorcycles, which can fit 2 people each. This means we can transport 4 people (2 motorcycles x 2 people per motorcycle) using the motorcycles.
5. We still have 1 person left who needs to be transported (5 - 4 = 1).
6. Unfortunately, we don't have any more vehicles to transport the remaining 1 person. However, we've already used up all the available seats in the cars and motorcycles, so we can't fit any more people.
7. Therefore, we can transport 14 people (10 by car + 4 by motorcycle) to the restaurant, but we'll be missing 1 person.

Since we can transport 14 out of 15 people, the answer is **Yes**, we can get most of the group to the restaurant by car or motor

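- Since LLMs predict their answer one token at a time, with each token conditioned on the tokens before it, the best practice is to ask them to think step by step and to provide the answer only after they have explained their reasoning. An answer given first cannot draw on the reasoning that follows it, as the "Yes" response above shows.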
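
The reasoning-first pattern can be wrapped in two small helpers: one that appends the instructions after the question, and one that pulls out the final answer once the reasoning is complete. This is a sketch only. Both function names are hypothetical, the `Answer:` prefix is something the model happened to produce in an earlier response rather than a guaranteed format, and `sample` is an abbreviated, hand-written stand-in for a model response.

```python
import re

REASON_FIRST = (
    "Think step by step.\n"
    "Explain each intermediate step.\n"
    "Only when you are done with all your steps,\n"
    "provide the answer based on your intermediate steps.\n"
)


def add_reasoning_instructions(question):
    """Append the reasoning-first instructions after the question."""
    return f"\n{question.strip()}\n\n{REASON_FIRST}"


def extract_final_answer(response):
    """Return the text of the last line that starts with 'Answer:', or None."""
    matches = re.findall(r"^Answer:\s*(.+)$", response, flags=re.MULTILINE)
    return matches[-1].strip() if matches else None


sample = (
    "Step 1: We can seat 10 people in cars.\n"
    "Step 2: We can seat 4 people in motorcycles.\n"
    "Answer: No, we cannot all get to the restaurant by car or motorcycle.\n"
)
print(extract_final_answer(sample))
```

Taking the last `Answer:` line, rather than the first, keeps the extraction aligned with the idea that the answer should come after the reasoning.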