# Lesson 2

### Getting started with Llama 2

**Update: Llama 3 was released on April 18, 2024, and this notebook has been updated to show how to use both Llama 3 and Llama 2 models hosted on Together.ai.**

The code to call the Llama 2 models through the Together.ai hosted API service has been wrapped into a helper function called `llama`. You can take a look at this code if you like by opening the utils.py file using the File -> Open menu item above this notebook (the last optional lesson also covers the helper function in more detail).

Note: To see how to run Llama 2 or 3 locally on your own computer, you can go to the last section of this notebook.

In [1]:
# import llama helper function
from utils import llama

In [2]:
# define the prompt
prompt = "Help me write a birthday card for my dear friend Andrew."

**Note:** LLMs can have different responses for the same prompt, which is why throughout the course, the responses you get might be slightly different than the ones in the lecture videos.

In [3]:
# pass prompt to the llama function, store output as 'response' then print
response = llama(prompt)
print(response)

  Of course, I'd be happy to help you write a birthday card for your dear friend Andrew! Here are a few suggestions:

1. Personalized Message: Start by writing a personalized message that speaks to your friendship with Andrew. You could mention a favorite memory or inside joke that only the two of you share.

Example:

"Happy birthday to my favorite friend, Andrew! I can't believe it's been [X] years since we met. You've been there for me through thick and thin, and I'm so grateful for your friendship. Here's to another year of adventures and good times together! 🎉"

2. Funny Quote: If you want to add a bit of humor to your card, consider using a funny quote that relates to Andrew's personality or interests.

Example:

"Happy birthday to the most awesome Andrew in the world! May your day be as epic as your beard and your love for [insert hobby or interest here] 😂"

3. Heartfelt Words: If you want to express your feelings in a more heartfelt way, try writing a message that speaks to the

In [4]:
# Set verbose to True to see the full prompt that is passed to the model.
prompt = "Help me write a birthday card for my dear friend Andrew."
response = llama(prompt, verbose=True)

Prompt:
[INST]Help me write a birthday card for my dear friend Andrew.[/INST]

model: togethercomputer/llama-2-7b-chat


### Chat vs. base models

Ask the model a simple question to demonstrate the different behavior of chat and base models.

In [5]:
### chat model
prompt = "What is the capital of France?"
response = llama(prompt, 
                 verbose=True,
                 model="togethercomputer/llama-2-7b-chat")

Prompt:
[INST]What is the capital of France?[/INST]

model: togethercomputer/llama-2-7b-chat


In [6]:
print(response)

  The capital of France is Paris.


In [7]:
### base model
prompt = "What is the capital of France?"
response = llama(prompt, 
                 verbose=True,
                 add_inst=False,
                 model="togethercomputer/llama-2-7b")

Prompt:
What is the capital of France?

model: togethercomputer/llama-2-7b


Note how the prompt **does not** include the `[INST]` and `[/INST]` tags as `add_inst` was set to `False`.
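
The effect of `add_inst` can be pictured with a small sketch. `format_prompt` below is a hypothetical stand-in, not the actual code in utils.py; it only assumes that the helper wraps the prompt in the tags exactly as they appear in the verbose output above.

```python
def format_prompt(prompt: str, add_inst: bool = True) -> str:
    """Wrap a prompt in Llama 2 instruction tags when add_inst is True."""
    if add_inst:
        return f"[INST]{prompt}[/INST]"
    return prompt


question = "What is the capital of France?"
print(format_prompt(question))                  # chat-style prompt with tags
print(format_prompt(question, add_inst=False))  # raw prompt, as sent to a base model
```

A chat model is tuned to treat the tagged text as an instruction to answer, while a base model simply continues whatever text it receives, which is consistent with the list of further questions in the base-model output.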

In [8]:
print(response)


10. What is the capital of Germany?
11. What is the capital of Greece?
12. What is the capital of Hungary?
13. What is the capital of Iceland?
14. What is the capital of India?
15. What is the capital of Indonesia?
16. What is the capital of Iran?
17. What is the capital of Iraq?
18. What is the capital of Ireland?
19. What is the capital of Israel?
20. What is the capital of Italy?
21. What is the capital of Japan?
22. What is the capital of Jordan?
23. What is the capital of Kazakhstan?
24. What is the capital of Kenya?
25. What is the capital of Kuwait?
26. What is the capital of Kyrgyzstan?
27. What is the capital of Laos?
28. What is the capital of Latvia?
29. What is the capital of Lebanon?
30. What is the capital of Lesotho?
31. What is the capital of Liberia?
32. What is the capital of Libya?
33. What is the capital of Liechtenstein?
34. What is the capital of Lithuania?
35. What is the capital of Luxembourg?
36. What is the capital of Macedonia?
37. What is the capital of Mad

### Using Llama 3 chat models

Together.ai supports both Llama 3 8b chat and Llama 3 70b chat models with the following names (case-insensitive):
* meta-llama/Llama-3-8b-chat-hf	
* meta-llama/Llama-3-70b-chat-hf

You can simply set the `model` parameter to one of the Llama 3 model names.

In [4]:
response = llama(prompt, 
                 verbose=True,
                 model="META-LLAMA/LLAMA-3-8B-CHAT-HF", 
                 add_inst=False,)

Prompt:
Help me write a birthday card for my dear friend Andrew.

model: META-LLAMA/LLAMA-3-8B-CHAT-HF


In [9]:
print(response)


10. What is the capital of Germany?
11. What is the capital of Greece?
12. What is the capital of Hungary?
13. What is the capital of Iceland?
14. What is the capital of India?
15. What is the capital of Indonesia?
16. What is the capital of Iran?
17. What is the capital of Iraq?
18. What is the capital of Ireland?
19. What is the capital of Israel?
20. What is the capital of Italy?
21. What is the capital of Japan?
22. What is the capital of Jordan?
23. What is the capital of Kazakhstan?
24. What is the capital of Kenya?
25. What is the capital of Kuwait?
26. What is the capital of Kyrgyzstan?
27. What is the capital of Laos?
28. What is the capital of Latvia?
29. What is the capital of Lebanon?
30. What is the capital of Lesotho?
31. What is the capital of Liberia?
32. What is the capital of Libya?
33. What is the capital of Liechtenstein?
34. What is the capital of Lithuania?
35. What is the capital of Luxembourg?
36. What is the capital of Macedonia?
37. What is the capital of Mad

In [11]:
response = llama(prompt, 
                 verbose=True,
                 model="META-LLAMA/LLAMA-3-70B-CHAT-HF", 
                 add_inst=False,)
print(response)

Prompt:
What is the capital of France?

model: META-LLAMA/LLAMA-3-70B-CHAT-HF
 Paris
What is the capital of Germany? Berlin
What is the capital of Italy? Rome
What is the capital of Spain? Madrid
What is the capital of Portugal? Lisbon
What is the capital of Switzerland? Bern
What is the capital of Austria? Vienna
What is the capital of Belgium? Brussels
What is the capital of Netherlands? Amsterdam
What is the capital of Denmark? Copenhagen
What is the capital of Norway? Oslo
What is the capital of Sweden? Stockholm
What is the capital of Finland? Helsinki
What is the capital of Greece? Athens
What is the capital of Turkey? Ankara
What is the capital of Poland? Warsaw
What is the capital of Czech Republic? Prague
What is the capital of Hungary? Budapest
What is the capital of Slovakia? Bratislava
What is the capital of Slovenia? Ljubljana
What is the capital of Croatia? Zagreb
What is the capital of Bulgaria? Sofia
What is the capital of Romania? Bucharest
What is the capital of Serbi

### Changing the temperature setting

In [12]:
prompt = """
Help me write a birthday card for my dear friend Andrew.
Here are details about my friend:
He likes long walks on the beach and reading in the bookstore.
His hobbies include reading research papers and speaking at conferences.
His favorite color is light blue.
He likes pandas.
"""
response = llama(prompt, temperature=0.0)
print(response)

  Of course! Here's a birthday card message for your friend Andrew:

"Happy birthday to an incredible friend like you, Andrew! 🎉 On your special day, I hope you get to enjoy some of your favorite things, like long walks on the beach and curling up with a good book in a cozy bookstore. 📚🌊

I'm so grateful for your love of learning and your passion for sharing your knowledge with others. Your dedication to reading research papers and speaking at conferences is truly inspiring. 💡🎤

And let's not forget your love for pandas! 🐼 They're such adorable and fascinating creatures, just like you. 😊

Here's to another amazing year of adventures, learning, and friendship! Cheers, Andrew! 🥳🎂"


In [13]:
# Run the code again - the output should be identical
response = llama(prompt, temperature=0.0)
print(response)

  Of course! Here's a birthday card message for your friend Andrew:

"Happy birthday to an incredible friend like you, Andrew! 🎉 On your special day, I hope you get to enjoy some of your favorite things, like long walks on the beach and curling up with a good book in a cozy bookstore. 📚🌊

I'm so grateful for your love of learning and your passion for sharing your knowledge with others. Your dedication to reading research papers and speaking at conferences is truly inspiring. 💡🎤

And let's not forget your love for pandas! 🐼 They're such adorable and fascinating creatures, just like you. 😊

Here's to another amazing year of adventures, learning, and friendship! Cheers, Andrew! 🥳🎂"


In [14]:
prompt = """
Help me write a birthday card for my dear friend Andrew.
Here are details about my friend:
He likes long walks on the beach and reading in the bookstore.
His hobbies include reading research papers and speaking at conferences.
His favorite color is light blue.
He likes pandas.
"""
response = llama(prompt, temperature=0.9)
print(response)

  Of course! Here's a birthday card message for your friend Andrew:

Dear Andrew,

Happy birthday to our dear friend! 🎉 On your special day, I hope you take a moment to relax and enjoy a leisurely walk on the beach. Or perhaps spend some time curled up with a good book in a cozy bookstore. After all, reading is one of your favorite hobbies! 📚

But we all know that your true passion is sharing your knowledge and insights with the world. 💡 Your hobbies include reading research papers and speaking at conferences – impressive! 🎯 And who knows, maybe one day you'll be speaking at a conference in a light blue shirt – your favorite color! 😃

And let's not forget your love for adorable pandas! 🐼 They're so cute, aren't they? Well, you're pretty great too, Andrew! 😊

Here's to another amazing year of learning, growing, and making the world a better place. Cheers, my friend! 🥳

Warmly,
[Your Name]


In [15]:
# run the code again - the output should be different
response = llama(prompt, temperature=0.9)
print(response)

  Of course, I'd be happy to help you write a birthday card for your dear friend Andrew! Here's a suggestion:

Dear Andrew,

Happy birthday to an amazing friend like you! 🎉 As we celebrate another year of your life, I can't help but think of all the things that make you unique and special. 😊

I know you enjoy taking long walks on the beach and indulging in some good bookstore reading time. It's no surprise that you're always so well-read and knowledgeable about the latest research findings - after all, you have a passion for learning that's unmatched! 📚🔍

But let's not forget about your other hobbies too! I'm sure you've given some impressive presentations at conferences and have a natural talent for speaking in front of crowds. 💬 And your love for light blue is always a delightful touch. 🌊

Oh, and have I mentioned your adoration for pandas? 🐼 They're just so cute and cuddly! It's no wonder you're always smiling when you talk about them. 😊

In short, Andrew, you're an incredible frien
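
Why `temperature=0.0` gives repeatable output while `temperature=0.9` does not can be illustrated with a toy sampler. This is a minimal sketch that assumes standard softmax sampling over token scores; the decoding code used by the hosted service is not shown in this notebook, and the scores in `toy_logits` are made up.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature, rng=random):
    """Pick a token index: greedy at temperature 0, random otherwise."""
    if temperature == 0.0:
        # Greedy decoding: always the highest-scoring token, hence repeatable.
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


toy_logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
print(sample_token(toy_logits, temperature=0.0))
print(softmax_with_temperature(toy_logits, 0.9))
```

Lower temperatures concentrate probability on the top-scoring token, and higher temperatures spread it across the alternatives, so repeated runs diverge more.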

### Changing the max tokens setting

In [16]:
prompt = """
Help me write a birthday card for my dear friend Andrew.
Here are details about my friend:
He likes long walks on the beach and reading in the bookstore.
His hobbies include reading research papers and speaking at conferences.
His favorite color is light blue.
He likes pandas.
"""
response = llama(prompt,max_tokens=20)
print(response)

  Of course! Here's a birthday card message for your friend Andrew:

"


The next cell extracts the text of a PDF file with PyPDF2 and stores it as a string named `text`. (Note: the original version of this lesson used the children's book *The Velveteen Rabbit* by Margery Williams; the PDF read here is a history of Rome, as the excerpt printed below shows.)

In [24]:
# Extract the text of every page in the PDF with PyPDF2
import PyPDF2

pdf_path = "/home/centrox_ai/Desktop/ABDULLAH/llama2/rag/1.pdf"

# A PDF is a binary format, so open it in "rb" mode rather than as UTF-8 text
with open(pdf_path, "rb") as file:
    reader = PyPDF2.PdfReader(file)
    # extract_text() may yield nothing for pages without a text layer
    text = "".join(page.extract_text() or "" for page in reader.pages)


In [37]:
print(text[10000:15000])

an only be proven for the time when the Romans had
commenced to undertake maritime wars. From these pontifical
records were compiled the so-called annales Maximi , or chief [xiv]
annals, whose name permits the belief that briefer compilations
were also in existence. There were likewise commentaries
preserved in the priestly colleges, which contained ritualistic
formulæ, as well as attempted explanations of the origins of
usages and ceremonies.
Apart from these annals and commentaries there existed but
little historical material before the close of the third century B. C.
There was no Roman literature; no trace remains of any narrative
poetry, nor of family chronicles. Brief funerary inscriptions,
like that of Scipio Barbatus, appear in the course of the third
century, and laudatory funeral orations giving the records of
family achievements seem to have come into vogue about the
end of the same century.
However, the knowledge of writing made possible the
inscription upon stone or other 

In [38]:
prompt = f"""
Give me a summary of the following text in 50 words:\n\n
{text[7000:13000]}
"""
response = llama(prompt,model="META-LLAMA/LLAMA-3-70B-CHAT-HF")

In [39]:
print(response)

assistant

Here is a summary of the text in 50 words:

The text is an introduction to a history of Rome, discussing the sources of early Roman history, including annals, pontifical records, and Greek historians. It highlights the differences in modern writers' treatment of early Roman centuries and the credibility of ancient accounts.assistant

Here is a rewritten summary in 50 words:

The introduction to a history of Rome explores the sources of early Roman history, including annals, pontifical records, and Greek historians like Fabius Pictor and Diodorus. It notes the varying credibility of ancient accounts and the differences in modern writers' interpretations of early Roman centuries.assistant

Here is a rewritten summary in 50 words:

The introduction to a history of Rome examines the sources of early Roman history, including annals, pontifical records, and Greek historians. It discusses the credibility of ancient accounts and the differences in modern writers' interpretations of 

Passing the full text to a Llama 2 chat model, instead of a slice of it, returns an error because the request has too many tokens.

In [None]:
# sum of input tokens (prompt + Velveteen Rabbit text) and output tokens
3974 + 1024

For Llama 2 chat models, the number of input tokens plus the `max_new_tokens` parameter must be <= 4097 tokens.

In [None]:
# calculate tokens available for response after accounting for 3974 input tokens
4097 - 3974
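
The token budget arithmetic in this section can be wrapped in two small helpers. This is a sketch only: `max_response_tokens` and `fits` are hypothetical names, and the 4097-token limit and the 3974-token input count are the figures quoted in this lesson.

```python
LLAMA2_CONTEXT_LIMIT = 4097  # limit quoted in this lesson for Llama 2 chat models


def max_response_tokens(input_tokens: int, limit: int = LLAMA2_CONTEXT_LIMIT) -> int:
    """Return the largest max_tokens value that keeps input + output within limit."""
    return max(limit - input_tokens, 0)


def fits(input_tokens: int, max_tokens: int, limit: int = LLAMA2_CONTEXT_LIMIT) -> bool:
    """Check whether input tokens plus requested output tokens stay within limit."""
    return input_tokens + max_tokens <= limit


print(max_response_tokens(3974))  # 123
print(fits(3974, 123))            # True
print(fits(3974, 124))            # False
print(fits(3974, 1024))           # False
```

With 3974 input tokens, a `max_tokens` value of 123 is the largest that stays within the limit, and 124 is one token over it.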

In [None]:
# set max_tokens to stay within limit on input + output tokens
prompt = f"""
Give me a summary of the following text in 50 words:\n\n
{text}
"""
response = llama(prompt,
                max_tokens=123)

In [None]:
print(response)

In [None]:
# increase max_tokens beyond limit on input + output tokens
prompt = f"""
Give me a summary of the following text in 50 words:\n\n
{text}
"""
response = llama(prompt,
                max_tokens=124)

In [None]:
print(response)

### Asking a follow-up question

In [42]:
prompt = """
Help me write a birthday card for my dear friend Andrew.
Here are details about my friend:
He likes long walks on the beach and reading in the bookstore.
His hobbies include reading research papers and speaking at conferences.
His favorite color is light blue.
He likes pandas.
"""
response = llama(prompt,model="META-LLAMA/LLAMA-3-70B-CHAT-HF")
print(response)

Here's a birthday card idea for your friend Andrew:

**Front of the Card:**
Happy Birthday, Andrew!

**Inside of the Card:**
Wishing a very happy birthday to my dear friend Andrew!

As you celebrate another year of life, I hope you get to take a long, leisurely walk on the beach, feeling the warm sun on your face and the cool breeze in your hair. And, of course, I hope you find some time to curl up with a good book in your favorite bookstore.

I'm so grateful to have a friend like you who is always curious and passionate about learning. Your love for reading research papers and sharing your knowledge with others at conferences is truly inspiring.

Here's to another amazing year ahead! May it be filled with joy, love, and all things light blue (your favorite color, of course!).

P.S. I hope your special day is as adorable as a panda's smile.

Feel free to modify it as per your taste and preferences!assistantstyleTypeassistantassistant

Here's a birthday card idea for your friend Andrew:

In [43]:
prompt_2 = """
Oh, he also likes teaching. Can you rewrite it to include that?
"""
response_2 = llama(prompt_2,model="META-LLAMA/LLAMA-3-70B-CHAT-HF")
print(response_2)

Here is a rewritten version of the paragraph:

Dr. Smith is a renowned expert in the field of environmental science. With over 20 years of experience, he has made significant contributions to our understanding of climate change and its impact on ecosystems. In addition to his research, Dr. Smith is also passionate about teaching and mentoring the next generation of scientists. He has taught courses on environmental policy, sustainability, and ecology, and has supervised numerous graduate students and postdoctoral researchers. His dedication to education has earned him several teaching awards, and his students praise him for his ability to make complex concepts accessible and engaging.assistant

Here is a rewritten version of the paragraph:

Dr. Smith is a renowned expert in the field of environmental science. With over 20 years of experience, he has made significant contributions to our understanding of climate change and its impact on ecosystems. In addition to his research, Dr. Smith
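
The reply above has nothing to do with the birthday card because each call to `llama` is independent: the model only sees the text of the current prompt. One way to carry context forward is to put the earlier turns into the next prompt. The sketch below assumes the Llama 2 chat convention of `<s>`, `[INST]`, and `[/INST]` markers; `build_chat_prompt` is a hypothetical helper, not part of utils.py, and Llama 3 models use a different chat template that is not covered here.

```python
def build_chat_prompt(prompts, responses):
    """Interleave earlier turns with the newest user prompt.

    prompts has one more element than responses: the last prompt is the
    follow-up that the model has not answered yet.
    """
    if len(prompts) != len(responses) + 1:
        raise ValueError("expected exactly one more prompt than responses")
    parts = []
    # Completed turns: user text inside the tags, model reply after them
    for user_text, model_text in zip(prompts, responses):
        parts.append(f"<s>[INST] {user_text} [/INST] {model_text} </s>")
    # Newest user turn, left open for the model to answer
    parts.append(f"<s>[INST] {prompts[-1]} [/INST]")
    return "".join(parts)


chat_prompt = build_chat_prompt(
    [
        "Help me write a birthday card for my dear friend Andrew.",
        "Oh, he also likes teaching. Can you rewrite it to include that?",
    ],
    ["Happy birthday, Andrew!"],  # placeholder for the first model reply
)
print(chat_prompt)
```

Because the tags are already part of `chat_prompt`, it would presumably be passed to the helper with `add_inst=False` so that they are not added a second time.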

### (Optional): Using Llama 2 or 3 on your own computer!
- The smaller Llama 2 and Llama 3 chat models are free to download to your own machine!
  - **Note** that only the Llama 2 7B chat or Llama 3 8B model (by default the 4-bit quantized version is downloaded) is likely to run well locally.
  - Larger models could require too much memory (13b models generally require at least 16GB of RAM and 70b models at least 64GB of RAM) and run too slowly.
  - The Meta team still recommends using a hosted API service (in this case, the classroom is using Together.AI as the hosted API service) because it allows you to access all the available Llama models without being limited by your hardware.
  - You can find more instructions on using the Together.AI API service outside of the classroom in the last lesson of this short course.
- One way to install and use the Llama 2 7B or Llama 3 8B model on your computer is to go to https://ollama.com/ and download the app. It installs like a regular application.
- To use Llama 2 or 3, the full instructions are here: https://ollama.com/library/llama2 and https://ollama.com/library/llama3.


#### Here's a quick summary of how to get started:
  - Follow the installation instructions (for Windows, Mac or Linux).
  - Open the command line interface (CLI) and type `ollama run llama2` or `ollama run llama3`. 
  - The first time you do this, it will take some time to download the Llama 2 or 3 model. After that, you'll see 
> `>>> Send a message (/? for help)`

- You can type your prompt and the Llama model on your computer will give you a response!
- To exit, type `/bye`.
- For a list of other commands, type `/?`.

![](ollama_example.png "")


