## Preliminaries

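Store your Hugging Face access token in the `HF_TOKEN` environment variable. On Windows, run the following command in Command Prompt, then open a new Command Prompt window, because `setx` only affects sessions started afterwards: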

```
setx HF_TOKEN "your_token_here"
```

On macOS, run the following command in Terminal. The variable lasts only for that shell session, so start the notebook from the same terminal:

```
export HF_TOKEN="your_token_here"
```

In [1]:
import os
token = os.getenv("HF_TOKEN")
token[:3]+'...'

'hf_...'
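
If `HF_TOKEN` is not set, `os.getenv` returns `None` and the slice above raises a `TypeError`. A minimal sketch of a guard follows; `masked_token` is a hypothetical helper name introduced here for illustration, not part of any library.

```python
import os

def masked_token(name="HF_TOKEN"):
    """Return a masked preview of the token stored in environment variable `name`.

    Hypothetical helper: raises a clear error instead of failing on None.
    """
    token = os.getenv(name)
    if not token:
        raise RuntimeError(f"{name} is not set; set it and restart the notebook kernel.")
    # Show only the prefix so the secret never appears in the notebook output.
    return token[:3] + "..."
```

Calling `masked_token()` in place of the slice gives the same preview when the variable is set and an explicit message when it is not.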

## Load Data

In [2]:
from src.util.json_io import load_jsonlines

train_data = load_jsonlines('data/gsm8k/train.jsonl')
test_data = load_jsonlines('data/gsm8k/test.jsonl')
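
`load_jsonlines` comes from the project's own `src/util/json_io.py`, which is not shown here. As a rough sketch of what such a helper might do, assuming each non-empty line of the file holds one JSON object (the name `load_jsonlines_sketch` is hypothetical):

```python
import json

def load_jsonlines_sketch(path):
    """Read a JSON Lines file: one JSON object per non-empty line.

    Assumption: this mirrors the project's load_jsonlines, whose source is not shown.
    """
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records
```

Each returned record is a plain `dict`, which matches how the later cells index entries with `qa['question']`.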

## Load Model

In [3]:
import transformers
import torch

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    token=token,  # Hugging Face access token for the gated Llama 3 weights
    device_map="auto"
)



Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


In [4]:
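question = "123+456="
context = pipeline(
    question,
    # max_length counts tokens (prompt included), while len(question) counts
    # characters, so this is only a rough cap on the continuation length.
    max_length=len(question)+2,
    truncation=True,
    # Reuse the EOS token as the pad token during generation.
    pad_token_id=pipeline.tokenizer.eos_token_id
)[0]['generated_text']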

print(context)

123+456=479` which is `true


In [5]:
import random
for i in range(10):
    num1 = random.randint(100, 999)
    num2 = random.randint(100, 999)
    question = f"{num1} + {num2} ="
    context = pipeline(
        question, 
        max_length=len(question)+2,
        truncation=True, 
        pad_token_id=pipeline.tokenizer.eos_token_id
    )[0]['generated_text']

    print(context)

199 + 741 = 842.
Answer: The answer is
692 + 187 = 879.
Final Answer: The final
654 + 684 = 1348
Final Answer: The
511 + 815 = 1526
Answer: B)
282 + 463 = 745
745 + 345 =
697 + 822 = 1521.
Final Answer: The
173 + 317 = 380, so 380 is the
993 + 421 = 1424. 1424 -


You seem to be using the pipelines sequentially on GPU. In order to maximize efficiency please use a dataset


788 + 139 = 1,027
Answer: 
786 + 759 = 1445.
Final Answer: The
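
The completions above can be checked mechanically rather than by eye. The sketch below parses the first `a + b = c` pattern in a generated string and compares `c` with the true sum; `check_addition` is a hypothetical helper, and the assumption that the answer is the first number after `=` (possibly with thousands separators) is taken from the outputs shown.

```python
import re

def check_addition(text):
    """Return True if the first 'a + b = c' in `text` states the correct sum."""
    match = re.search(r"(\d+)\s*\+\s*(\d+)\s*=\s*(\d[\d,]*)", text)
    if match is None:
        return False
    a, b = int(match.group(1)), int(match.group(2))
    # Drop thousands separators such as the comma in "1,027".
    stated = int(match.group(3).replace(",", ""))
    return a + b == stated

# Three of the completions printed above.
samples = [
    "692 + 187 = 879.\nFinal Answer: The final",
    "199 + 741 = 842.\nAnswer: The answer is",
    "788 + 139 = 1,027\nAnswer: ",
]
for s in samples:
    print(check_addition(s), s.splitlines()[0])
```

Averaging the boolean over all generated strings gives an accuracy figure for the addition probe.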


In [6]:
from src.util.gsm8k_io import nshot_prompt

prompt = nshot_prompt(train_data, 8)  # 8-shot prompt
question = "Janet\u2019s ducks lay 16 eggs per day. She eats three for breakfast every morning and bakes muffins for her friends every day with four. She sells the remainder at the farmers' market daily for $2 per fresh duck egg. How much in dollars does she make every day at the farmers' market?"

question = prompt+question
print(question)

Question: Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?
Answer: Natalia sold 48/2 = <<48/2=24>>24 clips in May.
Natalia sold 48+24 = <<48+24=72>>72 clips altogether in April and May.
#### 72

Question: Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?
Answer: Weng earns 12/60 = $<<12/60=0.2>>0.2 per minute.
Working 50 minutes, she earned 0.2 x 50 = $<<0.2*50=10>>10.
#### 10

Question: Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?
Answer: In the beginning, Betty has only 100 / 2 = $<<100/2=50>>50.
Betty's grandparents gave her 15 * 2 = $<<15*2=30>>30.
This means, Betty needs 100 - 50 - 30 - 15 = $<<100-50
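
`nshot_prompt` is defined in the project's `src/util/gsm8k_io.py`, which is not shown. The sketch below only reproduces the layout printed above; it assumes each training record has `question` and `answer` fields and that the first `n` records are used, and `nshot_prompt_sketch` is a hypothetical name.

```python
def format_example(example):
    """Render one solved example in the 'Question: ... / Answer: ...' layout."""
    return f"Question: {example['question']}\nAnswer: {example['answer']}\n\n"

def nshot_prompt_sketch(examples, n):
    """Concatenate the first n solved examples into a few-shot prompt.

    Assumption: the real nshot_prompt may select or format examples differently.
    """
    return "".join(format_example(ex) for ex in examples[:n])

# A record shaped like the first example printed above.
demo = [{
    "question": "Natalia sold clips to 48 of her friends in April, and then she "
                "sold half as many clips in May. How many clips did Natalia sell "
                "altogether in April and May?",
    "answer": "Natalia sold 48/2 = <<48/2=24>>24 clips in May.\n"
              "Natalia sold 48+24 = <<48+24=72>>72 clips altogether in April and May.\n"
              "#### 72",
}]
print(nshot_prompt_sketch(demo, 1))
```

The blank line after each example is what lets a later cell treat `"\n\n"` as the end of a generated answer.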

In [7]:
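context = pipeline(
    question,
    # max_length counts tokens (prompt included), not characters, so request
    # the ten extra tokens explicitly with max_new_tokens.
    max_new_tokens=10,
    truncation=True,
    pad_token_id=pipeline.tokenizer.eos_token_id
)[0]['generated_text']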

print(context)

Question: Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?
Answer: Natalia sold 48/2 = <<48/2=24>>24 clips in May.
Natalia sold 48+24 = <<48+24=72>>72 clips altogether in April and May.
#### 72

Question: Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?
Answer: Weng earns 12/60 = $<<12/60=0.2>>0.2 per minute.
Working 50 minutes, she earned 0.2 x 50 = $<<0.2*50=10>>10.
#### 10

Question: Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?
Answer: In the beginning, Betty has only 100 / 2 = $<<100/2=50>>50.
Betty's grandparents gave her 15 * 2 = $<<15*2=30>>30.
This means, Betty needs 100 - 50 - 30 - 15 = $<<100-50

In [8]:
def generate_text(question, max_len=200):
    # max_length counts tokens (prompt included), not characters, so request
    # up to max_len new tokens explicitly and generate them in a single call.
    completion = pipeline(
        question,
        max_new_tokens=max_len,
        return_full_text=False,  # return only the continuation
        pad_token_id=pipeline.tokenizer.eos_token_id
    )[0]['generated_text']

    # Stop at the first blank line, which ends an answer in the few-shot format.
    answer, sep, _ = completion.partition("\n\n")
    return question + answer + sep

response = generate_text(question)

print(response)

In [None]:
for qa in test_data[:10]:
    question = qa['question']
    context = generate_text(prompt+question)
    print(context)
    print()
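
To score the generated solutions, the final number can be pulled from the `#### ` line that closes each answer in the few-shot format. This is a minimal sketch: `extract_final_answer` and `is_correct` are hypothetical helpers, and comparing against a reference string assumes the test records carry an answer in the same `#### <number>` format, which the cells above do not show.

```python
import re

def extract_final_answer(text):
    """Return the number after the last '####' marker as a string, or None."""
    matches = re.findall(r"####\s*(-?[\d,]*\.?\d+)", text)
    if not matches:
        return None
    # The generated text starts with the few-shot prompt, so the last marker
    # belongs to the newly generated answer.
    return matches[-1].replace(",", "")

def is_correct(generated, reference):
    """Compare the final answers of a generated solution and a reference solution."""
    pred = extract_final_answer(generated)
    gold = extract_final_answer(reference)
    return pred is not None and gold is not None and float(pred) == float(gold)

reference = "Natalia sold 48+24 = <<48+24=72>>72 clips altogether.\n#### 72"
print(extract_final_answer(reference))
```

Applying `is_correct` to each generated solution and its reference, then averaging, would give an accuracy over the evaluated questions.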