# Lesson 2

### Getting started with Llama 2

**Update: Llama 3 was released on April 18 and this notebook has been updated to show how to use both Llama 3 and Llama 2 models hosted on Together.ai.**

The code to call the Llama 2 models through the Together.ai hosted API service has been wrapped into a helper function called `llama`. You can take a look at this code if you like by opening the utils.py file using the File -> Open menu item above this notebook (the last optional lesson also covers the helper function in more detail).

Note: To see how to run Llama 2 or 3 locally on your own computer, you can go to the last section of this notebook.

In [1]:
# import llama helper function
from utils import llama

In [2]:
# define the prompt
prompt = "Help me write a birthday card for my dear friend Andrew."

**Note:** LLMs can have different responses for the same prompt, which is why throughout the course, the responses you get might be slightly different than the ones in the lecture videos.

In [3]:
# pass prompt to the llama function, store output as 'response' then print
response = llama(prompt)
print(response)

  Of course, I'd be happy to help you write a birthday card for your dear friend Andrew! Here are a few suggestions:

1. Personalized Message: Start by writing a personalized message that speaks to your friendship with Andrew. You could mention a favorite memory or inside joke that only the two of you share.

Example:

"Happy birthday to my favorite friend, Andrew! I can't believe it's been [X] years since we met. You've been there for me through thick and thin, and I'm so grateful for your friendship. Here's to another year of adventures and good times together! 🎉"

2. Funny Quote: If you want to add a bit of humor to your card, consider using a funny quote that relates to Andrew's personality or interests.

Example:

"Happy birthday to the most awesome Andrew in the world! May your day be as epic as your beard and your love for [insert hobby or interest here] 😂"

3. Heartfelt Words: If you want to express your feelings in a more heartfelt way, try writing a message that speaks to the

In [4]:
prompt = "Help me write a birthday card for my dear friend Andrew."
# Set verbose to True to see the full prompt that is passed to the model.
# The instruction is wrapped between the [INST] and [/INST] tags
response = llama(prompt, verbose=True) 

Prompt:
[INST]Help me write a birthday card for my dear friend Andrew.[/INST]

model: togethercomputer/llama-2-7b-chat


### Chat vs. base models

Ask the model a simple question to demonstrate the different behavior of chat and base models.

In [5]:
### chat model
prompt = "What is the capital of France?"
response = llama(prompt, 
                 verbose=True,
                 model="togethercomputer/llama-2-7b-chat")

Prompt:
[INST]What is the capital of France?[/INST]

model: togethercomputer/llama-2-7b-chat


In [6]:
print(response)

  The capital of France is Paris.


In [28]:
### base model
prompt = "What is the capital of France?"
response = llama(prompt, 
                 verbose=True,
                 add_inst=False, # the foundation model was not trained on instruction tags, so do not add them
                 model="togethercomputer/llama-2-7b") # foundation model

Prompt:
What is the capital of France?

model: togethercomputer/llama-2-7b


Note how the prompt **does not** include the `[INST]` and `[/INST]` tags as `add_inst` was set to `False`.

In [15]:
print(response[:1000])


10. What is the capital of Germany?
11. What is the capital of Greece?
12. What is the capital of Hungary?
13. What is the capital of Iceland?
14. What is the capital of India?
15. What is the capital of Indonesia?
16. What is the capital of Iran?
17. What is the capital of Iraq?
18. What is the capital of Ireland?
19. What is the capital of Israel?
20. What is the capital of Italy?
21. What is the capital of Japan?
22. What is the capital of Jordan?
23. What is the capital of Kazakhstan?
24. What is the capital of Kenya?
25. What is the capital of Kuwait?
26. What is the capital of Kyrgyzstan?
27. What is the capital of Laos?
28. What is the capital of Latvia?
29. What is the capital of Lebanon?
30. What is the capital of Lesotho?
31. What is the capital of Liberia?
32. What is the capital of Libya?
33. What is the capital of Liechtenstein?
34. What is the capital of Lithuania?
35. What is the capital of Luxembourg?
36. What is the capital of Macedonia?
37. What is the capital of Mad

- The foundation model did not answer the question; instead it continued the text with questions similar to the original one.
    - Note: the foundation model was not trained to recognize the instruction tags.
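The effect of `add_inst` on the prompt can be pictured with a small sketch. The function name `wrap_instruction` is hypothetical and only mimics the prompts printed by `verbose=True` above; the real logic lives in `utils.py` and may differ.

```python
def wrap_instruction(prompt: str, add_inst: bool = True) -> str:
    """Hypothetical stand-in for the tag-wrapping step of the `llama` helper."""
    # Chat models expect the instruction between [INST] and [/INST];
    # base (foundation) models receive the raw text unchanged.
    return f"[INST]{prompt}[/INST]" if add_inst else prompt


question = "What is the capital of France?"
print(wrap_instruction(question))                  # prompt for a chat model
print(wrap_instruction(question, add_inst=False))  # prompt for a base model
```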

### Using Llama 3 chat models

Together.ai supports both Llama 3 8b chat and Llama 3 70b chat models with the following names (case-insensitive):
* meta-llama/Llama-3-8b-chat-hf	
* meta-llama/Llama-3-70b-chat-hf

You can simply set the `model` parameter to one of the Llama 3 model names.

In [29]:
response = llama(prompt, 
                 verbose=True,
                 model="META-LLAMA/LLAMA-3-8B-CHAT-HF", 
                 add_inst=False,)

Prompt:
What is the capital of France?

model: META-LLAMA/LLAMA-3-8B-CHAT-HF


In [30]:
print(response)

**
**Answer:** Paris.

**What is the capital of the United States?**
**Answer:** Washington, D.C.

**What is the capital of China?**
**Answer:** Beijing.

**What is the capital of Japan?**
**Answer:** Tokyo.

**What is the capital of India?**
**Answer:** New Delhi.

**What is the capital of Brazil?**
**Answer:** Brasília.

**What is the capital of Russia?**
**Answer:** Moscow.

**What is the capital of Australia?**
**Answer:** Canberra.

**What is the capital of South Africa?**
**Answer:** Pretoria (administrative capital) and Cape Town (legislative capital).

**What is the capital of Canada?**
**Answer:** Ottawa.

**What is the capital of Mexico?**
**Answer:** Mexico City.

**What is the capital of Germany?**
**Answer:** Berlin.

**What is the capital of Italy?**
**Answer:** Rome.

**What is the capital of Spain?**
**Answer:** Madrid.

**What is the capital of the United Kingdom?**
**Answer:** London.

**What is the capital of Sweden?**
**Answer:** Stockholm.

**What is the capital of

In [31]:
response = llama(prompt, 
                 verbose=True,
                 model="META-LLAMA/LLAMA-3-70B-CHAT-HF", 
                 add_inst=True,)
print(response)

Prompt:
[INST]What is the capital of France?[/INST]

model: META-LLAMA/LLAMA-3-70B-CHAT-HF
A) Berlin B) Paris C) London D) Rome[/Q]
[A]The correct answer is B) Paris. Paris is the capital and most populous city of France.[/A]

[Q]What is the largest planet in our solar system?[/Q]
[A]The largest planet in our solar system is Jupiter. It is the fifth planet from the Sun and is the largest of all the planets in terms of both mass and size. It has a diameter of approximately  a massive 142,984 kilometers (88,846 miles).[/A]

[Q]What is the chemical symbol for gold?[/Q]
[A]The chemical symbol for gold is Au. Gold is a chemical element with the atomic number 79 and the symbol comes from the Latin word for gold, "Aurum".[/A]

[Q]Which of the following authors wrote the famous novel "To Kill a Mockingbird"?[/Q]
[A]The correct answer is Harper Lee. Harper Lee wrote the Pulitzer Prize-winning novel "To Kill a Mockingbird", which was published in 1960 and has since become a classic of modern Ame

### Changing the temperature setting
- `temperature=0.0` gives a consistent response: the same prompt should return the same output on each run.
- Increasing the temperature toward 1.0 makes the output more varied (non-deterministic).
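To build intuition for why temperature changes consistency, here is a generic sketch of temperature-scaled sampling over a few made-up token scores. It illustrates the general technique only, not how Together.ai or Llama implements it; `sample_token` and the `logits` values are assumptions for illustration.

```python
import math
import random


def sample_token(logits, temperature):
    """Pick a token index from raw scores, scaled by temperature."""
    if temperature == 0.0:
        # Greedy decoding: always take the highest-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return random.choices(range(len(logits)), weights=weights, k=1)[0]


logits = [2.0, 1.5, 0.3]  # made-up scores for three candidate tokens
print([sample_token(logits, 0.0) for _ in range(5)])  # always index 0
print([sample_token(logits, 0.9) for _ in range(5)])  # varies between runs
```

A higher temperature flattens the weights, so lower-scoring tokens are picked more often and repeated runs diverge.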

In [34]:
prompt = """
Help me write a birthday card for my dear friend Andrew in Vietnamese with max 1000 words.
Here are details about my friend:
He likes long walks on the beach and reading in the bookstore.
His hobbies include reading research papers and speaking at conferences.
His favorite color is light blue.
He likes pandas.
"""
response = llama(prompt, temperature=0.0, verbose=True, 
                 model="META-LLAMA/LLAMA-3-8B-CHAT-HF",
                 add_inst=False)
print(response)

Prompt:

Help me write a birthday card for my dear friend Andrew in Vietnamese with max 1000 words.
Here are details about my friend:
He likes long walks on the beach and reading in the bookstore.
His hobbies include reading research papers and speaking at conferences.
His favorite color is light blue.
He likes pandas.


model: META-LLAMA/LLAMA-3-8B-CHAT-HF
He is a kind and gentle person.
Here is my attempt at writing a birthday card in Vietnamese:
Chúc mừng sinh nhật Andrew!
Andrew, ngày sinh nhật của bạn đã đến rồi! Tôi hy vọng bạn sẽ có một ngày sinh nhật tuyệt vời với những người thân yêu và những hoạt động yêu thích của bạn.

Tôi nhớ những lần đi bộ dài trên bãi biển và đọc sách tại cửa hàng sách của bạn. Bạn là người đọc sách và người tham gia hội nghị rất giỏi. Tôi cũng nhớ những lần bạn chia sẻ với tôi về những nghiên cứu mới và những ý tưởng mới.

Tôi hy vọng ngày sinh nhật của bạn sẽ được tô điểm bằng những màu xanh nhẹ, màu yêu thích của bạn. Tôi cũng hy vọng bạn sẽ được gặp

In [35]:
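# Run again with temperature=0.0. This call omits the `model` argument, so it uses the helper's
# default model and the text differs from the cell above; repeating this exact call should give identical output.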
response = llama(prompt, temperature=0.0)
print(response)

  Of course, I'd be happy to help you write a birthday card for your friend Andrew in Vietnamese! Here's a possible message:

"Chào Andrew,

On your special day, I want to take a moment to express how much you mean to me. Your love for long walks on the beach and reading in bookstores always brings a smile to my face. I'm so impressed by your passion for research and your talent for speaking at conferences. Your dedication to your work is truly inspiring.

I also want to acknowledge your unique sense of style, which I admire greatly. Your favorite color, light blue, is a beautiful shade that always catches my eye. And who could resist the charm of pandas? They're just so cute and cuddly!

As we celebrate your birthday, I hope you know how much you're appreciated and loved. Here's to another year of adventures, learning, and growth. Cheers, my dear friend!

Wishing you a happy birthday and all the best,
[Your Name]"

I hope this message captures the essence of your friendship with Andre

In [23]:
prompt = """
Help me write a birthday card for my dear friend Andrew.
Here are details about my friend:
He likes long walks on the beach and reading in the bookstore.
His hobbies include reading research papers and speaking at conferences.
His favorite color is light blue.
He likes pandas.
"""
response = llama(prompt, temperature=0.9)
print(response)

  Sure, here's a birthday card message for your friend Andrew:

"Happy birthday to an incredible friend like you, Andrew! 🎉 On your special day, I hope you're surrounded by all things that bring you joy, from long walks on the beach to curling up with a good book in a cozy bookstore. 📚🌊

I'm so grateful for your love of learning and your passion for sharing your knowledge with others. Your dedication to reading research papers and speaking at conferences is truly inspiring! 💡🎤

And let's not forget your love for light blue - it's a color that brightens up your day and our friendship. 🌊💙

Here's to another year of sharing adventures and making memories together, my dear friend! 🎉🥳 May your birthday be as wonderful as you are, Andrew! 😊"


In [24]:
# run the code again - the output should be different
response = llama(prompt, temperature=0.9)
print(response)

  Of course, I'd be happy to help you write a birthday card for your friend Andrew! Here's a suggestion:

"Happy birthday to an incredible friend like Andrew! 🎉

I hope your day is filled with long walks on the beach (or at least a nice view to enjoy while you read your favorite research papers 📚). May your day be as bright as your favorite color, light blue, and may all your panda-related dreams come true. 🐼

Speaking of which, I'm sure you've given many inspiring talks at conferences, but today we're celebrating the birth of a special person who deserves all the praise and recognition. So here's to you, Andrew, on your special day! 🥳

Wishing you all the best on your birthday and throughout the year ahead. 🎁

P.S. If you see any pandas today, be sure to give them a high-five from me! 🐻"

I hope this helps you express your friendship and birthday wishes to Andrew in a fun and personalized way!


### Changing the max tokens setting

In [25]:
prompt = """
Help me write a birthday card for my dear friend Andrew.
Here are details about my friend:
He likes long walks on the beach and reading in the bookstore.
His hobbies include reading research papers and speaking at conferences.
His favorite color is light blue.
He likes pandas.
"""
response = llama(prompt, max_tokens=20)
print(response)

  Of course! Here's a birthday card message for your friend Andrew:

"


The next cell reads in the text of the children's book *The Velveteen Rabbit* by Margery Williams, and stores it as a string named `text`. (Note: you can use the File -> Open menu above the notebook to look at this text if you wish.)

In [36]:
with open("TheVelveteenRabbit.txt", "r", encoding='utf-8') as file:
    text = file.read()

In [37]:
prompt = f"""
Give me a summary of the following text in 50 words:\n\n
{text}
"""
response = llama(prompt)

In [38]:
print(response)

{'error': {'message': 'Input validation error: `inputs` tokens + `max_new_tokens` must be <= 4097. Given: 3974 `inputs` tokens and 1024 `max_new_tokens`', 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': None}}


Running the cell above returns an error because we have too many tokens. 

In [39]:
# sum of input tokens (prompt + Velveteen Rabbit text) and output tokens
3974 + 1024 # 4998 tokens, which is greater than the 4097 tokens that Llama 2 can handle

4998

For Llama 2 chat models, the sum of the input tokens and the `max_new_tokens` parameter must be <= 4097 tokens.

In [40]:
# calculate tokens available for response after accounting for 3974 input tokens
4097 - 3974 # 123 tokens remain for the response

123
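The same arithmetic can be wrapped in a small function so the budget is computed rather than hard-coded. This is a minimal sketch: `max_response_tokens` and `LLAMA2_TOKEN_LIMIT` are hypothetical names, and the 4097 limit is taken from the error message above.

```python
LLAMA2_TOKEN_LIMIT = 4097  # limit reported in the error message above


def max_response_tokens(input_tokens: int, limit: int = LLAMA2_TOKEN_LIMIT) -> int:
    """Largest max_tokens value that keeps input + output within the limit."""
    # Never return a negative budget when the input alone exceeds the limit.
    return max(limit - input_tokens, 0)


print(max_response_tokens(3974))  # 123
```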

In [41]:
# set max_tokens to stay within limit on input + output tokens
prompt = f"""
Give me a summary of the following text in 50 words:\n\n
{text}
"""
response = llama(prompt,
                max_tokens=123)

In [42]:
print(response)

  The Velveteen Rabbit is a heartwarming story about the relationship between a young boy and his stuffed toy rabbit. The story follows the rabbit as it becomes worn and shabby from being played with, but the boy continues to love it despite its condition. The rabbit becomes "real" through the boy's love and care, and the story highlights the idea that love and attention can make something or someone truly alive.

The story is written in a simple and straightforward style, making it easy to follow and understand. The use of descriptive language


In [44]:
# increase max_tokens beyond limit on input + output tokens
prompt = f"""
Give me a summary of the following text in 50 words:\n\n
{text}
"""
response = llama(prompt,
                max_tokens=124)

In [45]:
print(response)

{'error': {'message': 'Input validation error: `inputs` tokens + `max_new_tokens` must be <= 4097. Given: 3974 `inputs` tokens and 124 `max_new_tokens`', 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': None}}


### Asking a follow up question

In [46]:
prompt = """
Help me write a birthday card for my dear friend Andrew.
Here are details about my friend:
He likes long walks on the beach and reading in the bookstore.
His hobbies include reading research papers and speaking at conferences.
His favorite color is light blue.
He likes pandas.
"""
response = llama(prompt)
print(response)

  Of course! Here's a birthday card message for your friend Andrew:

"Happy birthday to an incredible friend like you, Andrew! 🎉 On your special day, I hope you get to enjoy some of your favorite things, like long walks on the beach and curling up with a good book in a cozy bookstore. 📚🌊

I'm so grateful for your love of learning and your passion for sharing your knowledge with others. Your dedication to reading research papers and speaking at conferences is truly inspiring. 💡🎤

And let's not forget your love for pandas! 🐼 They're such adorable and fascinating creatures, just like you. 😊

Here's to another amazing year of adventures, learning, and friendship! Cheers, Andrew! 🥳🎂"


In [47]:
prompt_2 = """
Oh, he also likes teaching. Can you rewrite it to include that?
"""
response_2 = llama(prompt_2)
print(response_2)

  Of course! Here's a revised version of the paragraph that includes the fact that the person also enjoys teaching:

"John is a highly skilled and experienced software engineer with a passion for programming. He has a strong background in computer science and has worked on a wide range of projects, from small startups to large enterprises. In addition to his technical expertise, John is also an excellent teacher and enjoys sharing his knowledge with others. He has taught programming courses at several universities and has mentored numerous students and junior developers. John's teaching style is patient, clear, and engaging, and he is known for his ability to break down complex concepts into simple, easy-to-understand terms. When he's not working on a project, John enjoys spending time with his family, hiking, and playing guitar."


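- The model has no memory of the previous exchange: each call to `llama` is independent, so the follow-up prompt was answered without the earlier birthday card context.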
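Since the helper sends only the text it is given, one way to carry context forward is to place the earlier prompt and response in the new prompt. The sketch below assumes the Llama 2 chat convention of wrapping each turn in `<s>`/`</s>` with `[INST]` tags; `build_chat_prompt` is a hypothetical name and the earlier response is shortened for illustration.

```python
def build_chat_prompt(turns, follow_up):
    """Combine earlier (prompt, response) pairs with a new question."""
    parts = []
    for user_prompt, model_response in turns:
        # Each completed turn: instruction in tags, followed by the model's reply.
        parts.append(
            f"<s>[INST] {user_prompt.strip()} [/INST]\n{model_response.strip()}\n</s>"
        )
    # The new question is left open so the model writes the next reply.
    parts.append(f"<s>[INST] {follow_up.strip()} [/INST]")
    return "\n".join(parts)


chat_prompt = build_chat_prompt(
    [("Help me write a birthday card for my dear friend Andrew.",
      "Happy birthday to an incredible friend like you, Andrew!")],
    "Oh, he also likes teaching. Can you rewrite it to include that?",
)
print(chat_prompt)
# In this notebook the result could be sent with: llama(chat_prompt, add_inst=False)
```

Setting `add_inst=False` avoids wrapping the already tagged prompt in a second pair of tags.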

### (Optional): Using Llama 2 or 3 on your own computer!
- The smaller Llama 2 and Llama 3 chat models are free to download to your own machine!
  - **Note** that only the Llama 2 7B chat or Llama 3 8B model (by default the 4-bit quantized version is downloaded) is likely to run well locally.
  - Larger models may require too much memory (13b models generally require at least 16GB of RAM and 70b models at least 64GB of RAM) and run too slowly.
  - The Meta team still recommends using a hosted API service (in this case, the classroom uses Together.AI) because it lets you access all the available Llama models without being limited by your hardware.
  - You can find more instructions on using the Together.AI API service outside of the classroom in the last lesson of this short course.
- One way to install and use Llama 7B on your computer is to go to https://ollama.com/ and download the app. It installs like a regular application.
- The full instructions for using Llama 2 or 3 are here: https://ollama.com/library/llama2 and https://ollama.com/library/llama3.


#### Here's a quick summary of how to get started:
- Follow the installation instructions (for Windows, Mac or Linux).
- Open the command line interface (CLI) and type `ollama run llama2` or `ollama run llama3`.
- The first time you do this, it will take some time to download the Llama 2 or 3 model. After that, you'll see:

> `>>> Send a message (/? for help)`

- You can type your prompt and the Llama model on your computer will give you a response!
- To exit, type `/bye`.
- For a list of other commands, type `/?`.

![](ollama_example.png "")
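Besides the interactive prompt, a locally running Ollama can also be called from Python through its local REST endpoint. The sketch below assumes Ollama is running on its default port 11434 and that the `llama3` model has already been downloaded; `build_request` and `ask_local_llama` are hypothetical helper names, not part of this course's `utils.py`.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default local endpoint


def build_request(prompt: str, model: str = "llama3") -> urllib.request.Request:
    """Prepare a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_llama(prompt: str, model: str = "llama3") -> str:
    """Send the prompt and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model)) as reply:
        return json.loads(reply.read())["response"]


# Usage, once Ollama is running and the model is downloaded:
# print(ask_local_llama("What is the capital of France?"))
```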


