## Importing all the required libraries

In [1]:
import os

# Set the cache location before importing transformers, which reads this variable at import time.
os.environ['TRANSFORMERS_CACHE'] = '/scratch/workspace/wenlongzhao_umass_edu-analyze/transformers_cache'

import torch
from datasets import load_dataset, DatasetDict
from transformers import pipeline, AutoModelForCausalLM, AutoTokenizer, AutoConfig



## Downloading the model and caching it in the TRANSFORMERS_CACHE directory set above

In [26]:
# The Hugging Face access token is read from the environment (the Llama models are gated).
hf_token = os.getenv("hf_token")
model_name = "meta-llama/Llama-3.2-1B-Instruct"
cache_dir = '/scratch/workspace/wenlongzhao_umass_edu-analyze/transformers_cache'

config = AutoConfig.from_pretrained(model_name, token=hf_token)
tokenizer = AutoTokenizer.from_pretrained(model_name, token=hf_token, config=config, cache_dir=cache_dir)
model = AutoModelForCausalLM.from_pretrained(model_name, token=hf_token, config=config, cache_dir=cache_dir)

# Reuse the EOS token as the padding token.
tokenizer.pad_token_id = tokenizer.eos_token_id

#### Downloading the GSM8K dataset from HF and splitting its train split into train, feedback, and val. The new splits are saved together with the original test split in a new DatasetDict

In [3]:
ds = load_dataset("openai/gsm8k", "main")
# Hold out 20% of the original train split for validation.
train_val_split = ds['train'].train_test_split(test_size=0.2)
# Of the remaining 80%, 85% becomes 'feedback' and 15% stays as 'train'.
train_feedback_split = train_val_split['train'].train_test_split(test_size=0.85)
final_dataset = DatasetDict({
    'train': train_feedback_split['train'],
    'feedback': train_feedback_split['test'],  # Rename the 'test' side of the second split to 'feedback'
    'val': train_val_split['test'],  # Rename the 'test' side of the first split to 'val'
    'test': ds['test']  # Keep the original test set
})

In [4]:
final_dataset

DatasetDict({
    train: Dataset({
        features: ['question', 'answer'],
        num_rows: 896
    })
    feedback: Dataset({
        features: ['question', 'answer'],
        num_rows: 5082
    })
    val: Dataset({
        features: ['question', 'answer'],
        num_rows: 1495
    })
    test: Dataset({
        features: ['question', 'answer'],
        num_rows: 1319
    })
})
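
The row counts above follow from the two `test_size` fractions. A minimal sketch of the arithmetic, assuming the test side of each split is rounded up and the remainder goes to the train side (an assumption about the rounding rule that is consistent with the counts shown; `split_sizes` is a hypothetical helper, not part of `datasets`):

```python
import math

def split_sizes(n_rows: int, test_size: float) -> tuple[int, int]:
    """Return (n_train, n_test), assuming the test side is rounded up."""
    n_test = math.ceil(test_size * n_rows)
    return n_rows - n_test, n_test

# Rows of the original GSM8K train split, recovered from the three derived splits above.
n_original_train = 896 + 5082 + 1495

# First split: 20% of the original train split becomes 'val'.
n_rest, n_val = split_sizes(n_original_train, 0.2)
# Second split: 85% of the remainder becomes 'feedback', the rest stays 'train'.
n_train, n_feedback = split_sizes(n_rest, 0.85)

print(n_train, n_feedback, n_val)
```

Because neither `train_test_split` call passes a `seed`, the sizes are reproducible across runs but the rows that land in each split are not.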

## Saving the final dataset split into our datasets directory

In [5]:
final_dataset.save_to_disk("../datasets/gsm8k/")

#### Sample generation code

In [29]:
model.eval()

# Input prompt
input_text = final_dataset["train"]["question"][0]

# Tokenize input
inputs = tokenizer(input_text, return_tensors="pt")

# Generate output (max_length counts the prompt tokens plus the newly generated tokens)
with torch.no_grad():
    output = model.generate(**inputs, max_length=500, num_return_sequences=1, pad_token_id=tokenizer.pad_token_id)

# Decode the output
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)

Sue works in a factory and every 30 minutes, a machine she oversees produces 30 cans of soda. How many cans of soda can one machine produce in 8 hours? First, we need to convert 8 hours to minutes, then multiply the number of minutes in 8 hours by 30, and finally divide by 30 to get the number of cans of soda the machine can produce in 8 hours.
## Step 1: Convert 8 hours to minutes
There are 60 minutes in an hour, so 8 hours is equal to 8 * 60 = 480 minutes.

## Step 2: Multiply the number of minutes in 8 hours by 30
Since the machine produces 30 cans of soda every 30 minutes, we multiply 480 minutes by 30 cans.

## Step 3: Calculate the total number of cans
480 * 30 = 14400 cans.

## Step 4: Divide the total number of cans by 30
To find out how many cans the machine can produce in 8 hours, we divide 14400 by 30.

## Step 5: Calculate the result
14400 / 30 = 480.

The final answer is: $\boxed{480}$


In [30]:
final_dataset["train"]["answer"][0]

'Since there are 2 sets of 30 minutes in an hour, then in 8 hours there are 8 x 2 = <<8*2=16>>16 sets of 30 minutes.\nHence, a machine that Sue oversees can produce 30 cans x 16 = <<30*16=480>>480 cans of soda in 8 hours.\n#### 480'
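
The generation and the reference answer above can be compared automatically. The sketch below assumes the reference ends with a `#### <answer>` line, as in the sample, and that the model wraps its final answer in `\boxed{...}`, which it did here but is not guaranteed in general. Both helper functions are hypothetical and not part of any library.

```python
import re
from typing import Optional

def extract_reference_answer(answer: str) -> Optional[str]:
    """Return the text after the '####' marker in a GSM8K reference answer."""
    match = re.search(r"####\s*(.+?)\s*$", answer)
    return match.group(1).replace(",", "") if match else None

def extract_boxed_answer(generation: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in a model generation."""
    matches = re.findall(r"\\boxed\{([^{}]*)\}", generation)
    return matches[-1].strip().replace(",", "") if matches else None

# Shortened copies of the reference answer and the generation shown above.
reference = ("Hence, a machine that Sue oversees can produce 30 cans x 16 = "
             "<<30*16=480>>480 cans of soda in 8 hours.\n#### 480")
generation = "14400 / 30 = 480.\n\nThe final answer is: $\\boxed{480}$"

is_correct = extract_reference_answer(reference) == extract_boxed_answer(generation)
print(is_correct)
```

Comparing the extracted strings is a deliberately simple choice. A generation without a `\boxed{...}` answer yields `None` and is counted as incorrect.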

#### Loading the GSM-Hard dataset from HF

In [2]:
from datasets import load_dataset

ds = load_dataset("reasoning-machines/gsm-hard")

In [3]:
ds

DatasetDict({
    train: Dataset({
        features: ['input', 'code', 'target'],
        num_rows: 1319
    })
})