In [1]:
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, GenerationConfig, pipeline

import os
import torch

In [2]:
# This recipe needs a Hugging Face token to download the Llama models from the Hub.
# Create a Hugging Face account and generate a token at https://huggingface.co/settings/tokens.
# Then visit the model card at https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct
# and accept the terms of use. Save the token in an environment variable named
# HUGGINGFACE_TOKEN; the snippet below reads it from there.
from huggingface_hub import login
hf_token = os.environ.get('HUGGINGFACE_TOKEN')
login(token=hf_token)
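
If HUGGINGFACE_TOKEN is unset, os.environ.get returns None and login(token=None) falls back to an interactive prompt instead of failing. A minimal sketch of a guard, assuming you would rather stop early with a clear message; read_hf_token is a hypothetical helper, not part of huggingface_hub:

```python
import os


def read_hf_token(env_var: str = "HUGGINGFACE_TOKEN") -> str:
    """Return the token stored in env_var, or raise if it is missing or blank."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set {env_var} to a Hugging Face access token before running this recipe."
        )
    return token
```

With this helper in place, the login call would become login(token=read_hf_token()).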



In [3]:
model_name = "meta-llama/Meta-Llama-3.1-8B-Instruct"
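quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store the weights in 4-bit precision
    bnb_4bit_compute_dtype=torch.bfloat16,  # run the matrix multiplications in bfloat16
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
    bnb_4bit_quant_type="nf4",              # use the 4-bit NormalFloat data type
)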
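
To see why 4-bit loading matters here, it helps to estimate the weight storage. The sketch below is back-of-the-envelope arithmetic only: it assumes a round 8e9 parameters and ignores quantization constants, any layers left unquantized, activations, and the KV cache, so real memory use will be higher. The helper name weight_memory_gb is illustrative.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NUM_PARAMS = 8e9  # assumption: round figure for an "8B" model

bf16_gb = weight_memory_gb(NUM_PARAMS, 16)  # 16 bits per weight
nf4_gb = weight_memory_gb(NUM_PARAMS, 4)    # 4 bits per weight

print(f"bfloat16 weights: ~{bf16_gb:.1f} GB")
print(f"4-bit weights:    ~{nf4_gb:.1f} GB")
```

Under these assumptions the 4-bit weights take roughly a quarter of the space of the bfloat16 weights.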

In [4]:
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    quantization_config=quantization_config,
)
tokenizer = AutoTokenizer.from_pretrained(model_name)


In [5]:
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=256,
    pad_token_id=tokenizer.eos_token_id,
    eos_token_id=model.config.eos_token_id,
    num_beams=4,
    early_stopping=True,
    repetition_penalty=1.4,
)
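
The repetition_penalty=1.4 setting discourages the model from repeating tokens it has already produced. The sketch below illustrates the usual rule in plain Python: scores of previously seen tokens are divided by the penalty when positive and multiplied by it when negative. It is an illustration on a toy list of scores, not the library's tensor-based code; apply_repetition_penalty and the example values are made up for this purpose.

```python
def apply_repetition_penalty(logits, seen_token_ids, penalty):
    """Return a copy of logits with already-generated tokens made less likely."""
    adjusted = list(logits)
    for token_id in set(seen_token_ids):
        score = adjusted[token_id]
        # With penalty > 1, dividing a positive score or multiplying a
        # negative one both push the score down.
        adjusted[token_id] = score / penalty if score > 0 else score * penalty
    return adjusted


logits = [2.8, -1.4, 0.7]  # toy scores for a three-token vocabulary
penalised = apply_repetition_penalty(logits, seen_token_ids=[0, 1], penalty=1.4)
print(penalised)
```

Tokens 0 and 1 end up with lower scores, while token 2, which has not been generated yet, keeps its score.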

In [6]:
prompt = [
    {"role": "user", "content": "What is your favourite country?"},
    {"role": "assistant", "content": "Well, I am quite fascinated with Peru."},
    {"role": "user", "content": "What can you tell me about Peru?"}
]
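
The prompt is a list of messages whose roles alternate between user and assistant and whose last turn comes from the user. A minimal sketch of a sanity check for that shape, assuming conversations without a system message as in this recipe; check_chat is a hypothetical helper and the rule it enforces is this recipe's convention, not a library requirement:

```python
def check_chat(messages):
    """Raise ValueError unless roles alternate user/assistant and end on a user turn."""
    if not messages:
        raise ValueError("empty conversation")
    for i, message in enumerate(messages):
        expected = "user" if i % 2 == 0 else "assistant"
        if message.get("role") != expected:
            raise ValueError(
                f"message {i}: expected role {expected!r}, got {message.get('role')!r}"
            )
        if not isinstance(message.get("content"), str):
            raise ValueError(f"message {i}: content must be a string")
    if messages[-1]["role"] != "user":
        raise ValueError("the last message must come from the user")


example = [
    {"role": "user", "content": "What is your favourite country?"},
    {"role": "assistant", "content": "Well, I am quite fascinated with Peru."},
    {"role": "user", "content": "What can you tell me about Peru?"},
]
check_chat(example)  # passes silently
```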

In [7]:
outputs = pipe(
    prompt,
    max_new_tokens=256,
)
print(outputs[0]["generated_text"][-1]['content'])

Peru! A country with a rich history, diverse culture, and breathtaking landscapes. Here are some interesting facts about Peru:

1. **Location**: Peru is located in western South America, bordering the Pacific Ocean to the west, Ecuador and Colombia to the north, Brazil and Bolivia to the east, and Chile to the south.
2. **History**: Peru has a long and complex history, with various civilizations rising and falling over the centuries. The Inca Empire, which flourished from the 13th to the 16th century, is one of the most famous and influential empires in Peruvian history.
3. **Machu Picchu**: One of the Seven Wonders of the World, Machu Picchu is an Inca citadel located on a mountain ridge above the Urubamba Valley. It's a must-visit destination for any traveler to Peru.
4. **Food**: Peruvian cuisine is a fusion of indigenous, Spanish, African, and Asian influences. Popular dishes include ceviche (raw fish marinated in citrus juices), lomo saltado (stir-fried beef), and ají de gallina (
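
When the pipeline receives a list of chat messages, outputs[0]["generated_text"] holds the whole conversation with the new assistant turn appended, which is why the code indexes [-1]['content']. The reply above also stops mid-sentence, probably because generation reached the max_new_tokens=256 limit. A small sketch using a hand-written stand-in for the pipeline output; the mock data and the helper name last_reply are illustrative:

```python
def last_reply(outputs):
    """Return the text of the final message in the first generated conversation."""
    conversation = outputs[0]["generated_text"]
    final = conversation[-1]
    if final["role"] != "assistant":
        raise ValueError("no assistant reply was appended")
    return final["content"]


# Stand-in for what pipe(prompt) returns for a chat-style prompt.
mock_outputs = [
    {
        "generated_text": [
            {"role": "user", "content": "What can you tell me about Peru?"},
            {"role": "assistant", "content": "Peru! A country with a rich history."},
        ]
    }
]
print(last_reply(mock_outputs))
```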

In [8]:
prompt = [
    {"role": "user", "content": "Mary is twice as old as Sarah presently. Sarah is 6 years old."},
    {"role": "assistant", "content": "Well, what can I help you with?"},
    {"role": "user", "content": "Can you tell me in a step by step way on how old Mary will be after 5 years?"}
]

In [9]:
outputs = pipe(
    prompt,
    max_new_tokens=256,
)
print(outputs[0]["generated_text"][-1]['content'])

Here's the step-by-step solution:

**Step 1: Determine Sarah's current age**
Sarah is 6 years old.

**Step 2: Determine Mary's current age**
Since Mary is twice as old as Sarah, and Sarah is 6 years old, we can multiply Sarah's age by 2 to find Mary's age:
Mary's age = 2 x Sarah's age
Mary's age = 2 x 6
Mary's age = 12 years old

**Step 3: Calculate Mary's age after 5 years**
To find out how old Mary will be after 5 years, we add 5 to her current age:
Mary's age after 5 years = Mary's current age + 5
Mary's age after 5 years = 12 + 5
Mary's age after 5 years = 17 years old

Therefore, Mary will be 17 years old after 5 years.
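
The model's arithmetic can be checked directly. A short sketch that mirrors the three steps in the answer; the variable names are illustrative:

```python
sarah_now = 6                   # Step 1: Sarah's current age
mary_now = 2 * sarah_now        # Step 2: Mary is twice as old as Sarah
mary_in_5_years = mary_now + 5  # Step 3: add five years

print(mary_in_5_years)  # 17
```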
