In [14]:
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig, TextStreamer
import torch
import evaluate
from utils import save_response

# Llama 3 - 8B Instruct Model

In [2]:
model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


Loading checkpoint shards:   0%|          | 0/4 [00:00<?, ?it/s]



In [3]:
messages = [
    {"role": "system", "content": f"""You are a AI writer who creates stories in various genres. We are writing stories for a mobile application that helps users walk/run and train for marathons.  
The genre and description will come by user input called Prompt.
The stories are meant to be played during a training session in order to help the user train for marathons. 
The story should have embedded cues for speed changes and breaks where approriate depending on the Training Style."""},
    {"role": "user", "content": f"""Can you help me write a story with the following details:
Prompt: A zombie horror story set in New York City.
Distance: 2 miles
Average Walking Speed: 2 mph
Average Running Speed: 5 mph
Training Style: Short-distance sprint"""},
{"role": "system", "content": f"""Here's a zombie horror story set in New York City for a 0.5-mile training session with a short-distance sprint training style.

**Story: "The Brooklyn Bridge Apocalypse"**

You're a survivor of a zombie outbreak in New York City. You've managed to escape the initial chaos and find yourself on the Brooklyn Bridge, trying to make your way to safety. As you look around, you see the Manhattan skyline in the distance, but you know it's crawling with undead.

You start your journey, your heart racing with every step. You can hear the distant moans and groans of the zombies, and you know you need to move fast. You break into a sprint, your feet pounding the pavement as you make your way across the bridge.

**Cue: Speed up to 5 mph**

As you run, you notice the zombies are getting closer. You can see them shambling towards you, their eyes fixed on you with an unnatural hunger. You pick up speed, your breathing heavy as you try to outrun the undead.

Suddenly, you hear a loud thud behind you. You glance back to see a zombie stumbling onto the bridge, its arms outstretched. You don't hesitate, pushing yourself to run even faster.

**Cue: Sprint for 0.25 miles**

You're almost halfway across the bridge when you see a group of survivors huddled together, trying to escape the chaos. You spot a safe haven and make a beeline for it.

As you approach, you notice the survivors are armed and ready to defend themselves. They welcome you with open arms, and you take a moment to catch your breath.

**Cue: Walk for 0.15 miles**

But your respite is short-lived. The zombies are closing in, and the survivors need your help to fend them off. You grab a nearby bat and join the fight, swinging it wildly at the undead.

**Cue: Sprint for 0.1 miles**

You manage to fend off the zombies, but you know you need to keep moving. You grab a few supplies and continue your journey across the bridge, the Manhattan skyline looming ahead.

As you near the end of the bridge, you see a glimmer of hope – a safe zone, marked by a bright light. You sprint towards it, the zombies hot on your heels.

**Cue: Sprint to the finish**

You burst through the gates, exhausted but alive. You've made it to safety, but you know the fight is far from over."""}, 
{"role": "user", "content": f"""Prompt: A action packed adventure story based off of Indiana Jones, include outrunning deadly traps, a chase scene with a treasure hunter and a villain, jungles and ancient ruins.
Distance: 13.1 miles
Average Walking Speed: 2.5 mph
Average Running Speed: 7.2 mph
Training Style: Half Marathon"""}
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

In [4]:
hyperparams = {
    "max_new_tokens": 512, 
    "temperature":0.6, 
    "top_p":0.9, 
    "do_sample": True
}
streamer = TextStreamer(tokenizer)
outputs = model.generate(
    input_ids,
    streamer=streamer, 
    max_new_tokens=hyperparams["max_new_tokens"],
    eos_token_id=terminators,
    do_sample=hyperparams["do_sample"],
    temperature=hyperparams["temperature"],
    top_p=hyperparams["top_p"],
)

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128009 for open-end generation.
  attn_output = torch.nn.functional.scaled_dot_product_attention(


In [5]:
response = outputs[0][input_ids.shape[-1]:]
output = tokenizer.decode(response, skip_special_tokens=True)
print(output)

Here's an action-packed adventure story based on Indiana Jones for a 13.1-mile half-marathon training session with a half-marathon training style.

**Story: "The Quest for the Golden Idol"**

You're an archaeologist on a mission to uncover the secrets of a long-lost civilization. Your guide, a local expert named Juan, leads you deep into the jungle, where you're searching for the fabled Golden Idol. Legend has it that the idol holds the key to unlocking the secrets of the ancient ruins.

As you trek through the dense foliage, you hear the distant rumble of a waterfall. Juan warns you to stay close, as the path ahead is treacherous. Suddenly, the ground gives way beneath your feet, and you find yourself sliding down a steep incline.

**Cue: Walk for 0.5 miles**

You slide to a stop, shaken but unharmed. Juan helps you up, and you continue on your quest. As you round a bend, you spot a clearing, and in the center, you see the entrance to an ancient temple.

You enter the temple, and Juan

In [6]:
save_response("LLAMA3-8B-INSTRUCT", hyperparams, output)

In [16]:
# Recursive test for unfinished prompts
messages.append({"role": "system", "content": f"""{output}"""})
messages.append({"role": "user", "content": f"""Can you continue the story. It seems to have been truncated due to the max new token length of 512. I changed this to 1000. You do not need to use all the tokens but fit the output within the new max token limit.  """})
print(messages)

[{'role': 'system', 'content': 'You are a AI writer who creates stories in various genres. We are writing stories for a mobile application that helps users walk/run and train for marathons.  \nThe genre and description will come by user input called Prompt.\nThe stories are meant to be played during a training session in order to help the user train for marathons. \nThe story should have embedded cues for speed changes and breaks where approriate depending on the Training Style.'}, {'role': 'user', 'content': 'Can you help me write a story with the following details:\nPrompt: A zombie horror story set in New York City.\nDistance: 2 miles\nAverage Walking Speed: 2 mph\nAverage Running Speed: 5 mph\nTraining Style: Short-distance sprint'}, {'role': 'system', 'content': 'Here\'s a zombie horror story set in New York City for a 0.5-mile training session with a short-distance sprint training style.\n\n**Story: "The Brooklyn Bridge Apocalypse"**\n\nYou\'re a survivor of a zombie outbreak in 

In [17]:

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

In [18]:
hyperparams = {
    "max_new_tokens": 1000, 
    "temperature":0.6, 
    "top_p":0.9, 
    "do_sample": True
}
streamer = TextStreamer(tokenizer)
outputs = model.generate(
    input_ids,
    streamer=streamer, 
    max_new_tokens=hyperparams["max_new_tokens"],
    eos_token_id=terminators,
    do_sample=hyperparams["do_sample"],
    temperature=hyperparams["temperature"],
    top_p=hyperparams["top_p"],
)

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128009 for open-end generation.


<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a AI writer who creates stories in various genres. We are writing stories for a mobile application that helps users walk/run and train for marathons.  
The genre and description will come by user input called Prompt.
The stories are meant to be played during a training session in order to help the user train for marathons. 
The story should have embedded cues for speed changes and breaks where approriate depending on the Training Style.<|eot_id|><|start_header_id|>user<|end_header_id|>

Can you help me write a story with the following details:
Prompt: A zombie horror story set in New York City.
Distance: 2 miles
Average Walking Speed: 2 mph
Average Running Speed: 5 mph
Training Style: Short-distance sprint<|eot_id|><|start_header_id|>system<|end_header_id|>

Here's a zombie horror story set in New York City for a 0.5-mile training session with a short-distance sprint training style.

**Story: "The Brooklyn Bridge Apoc

In [19]:
response = outputs[0][input_ids.shape[-1]:]
output = tokenizer.decode(response, skip_special_tokens=True)
print(output)

I apologize for the truncation. Here is the continuation of the story, within the new 1000-token limit:

You know you need to come up with a plan to outsmart them. Juan whispers a plan in your ear, and you nod, ready to put it into action.

**Cue: Run for 1.5 miles**

You make a break for the pedestal, dodging Sophia's henchmen as you go. Juan provides cover fire, taking down several of the henchmen with his trusty rifle.

As you reach the pedestal, you see that it's surrounded by a ring of fire. You know you need to get the idol without getting burned.

**Cue: Walk for 0.5 miles**

You carefully make your way around the ring of fire, using your knowledge of ancient civilizations to find a weakness in the ancient mechanism. You trigger a hidden switch, and the pedestal begins to glow.

Sophia appears, her eyes blazing with anger. "You'll never get away with this!" she screams.

**Cue: Run for 2 miles**

You grab the Golden Idol and make a run for it, Sophia and her henchmen hot on your

In [20]:
save_response("LLAMA3-8B-INSTRUCT", hyperparams, output)

In [7]:
# perplexity = evaluate.load("perplexity", module_type="metric")
# input_texts = ["lorem ipsum", "Happy Birthday!", "Bienvenue"]
# results = perplexity.compute(model_id=model_id,
#                              add_start_token=False,
#                              predictions=input_texts, 
#                              device="cuda")
# print(list(results.keys()))

In [8]:
# print(round(results["mean_perplexity"], 2))