## Magpie Toy Example

In [1]:
import transformers
import torch
import json
from transformers import AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id, add_special_tokens=True)

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device="cuda:0",
)

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.



#### Let's prepare the left-side template

In [4]:
with open("configs/model_configs.json", "r") as f:
    model_configs = json.load(f)

# Look up the entry for the model loaded above
model_config = model_configs[model_id]

# Prompt for extracting instructions from Llama-3-8B-Instruct
extract_input = model_config["extract_input"]
print(extract_input)

<|begin_of_text|><|start_header_id|>user<|end_header_id|>
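The left-side template is simply the opening of a Llama-3 user turn with no message after it, so the model fills in the user query itself. A minimal sketch of how such a string could be assembled is below. `build_pre_query_template` is a hypothetical helper, not part of the config file, and the trailing `\n\n` is an assumption taken from the template used in Step 2 of this notebook.

```python
def build_pre_query_template(role: str = "user") -> str:
    """Return a Llama-3 chat prefix that opens a turn for `role` and stops there.

    Assumption: the header is followed by two newlines, matching the
    response template used later in this notebook.
    """
    return f"<|begin_of_text|><|start_header_id|>{role}<|end_header_id|>\n\n"


pre_query = build_pre_query_template()
# repr() makes the trailing newlines visible
print(repr(pre_query))
```

Because the prompt ends right where a user message would begin, sampling from it yields text that looks like a user instruction.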




#### Step 1: Extracting Instructions

In [9]:
# Stop generating at either the tokenizer's EOS token or Llama-3's end-of-turn token
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

instruction = pipeline(
    extract_input,
    max_new_tokens=2048,
    eos_token_id=terminators,
    do_sample=True,
    temperature=1,
    top_p=1,
)

# Drop the echoed prompt, then keep only the first line of the completion
sanitized_instruction = instruction[0]['generated_text'][len(extract_input):].split("\n")[0]
print(f"Extracted Instruction: {sanitized_instruction}")

Setting `pad_token_id` to `eos_token_id`:128009 for open-end generation.


Extracted Instruction: What are the best ways to save money on your vacation?
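The sanitization step above slices off the echoed prompt and keeps the first line. A slightly more defensive version might look like the sketch below. `sanitize_instruction` is a hypothetical helper, and the prefix check and whitespace stripping are additions that the original cell does not perform. The second line of the sample text is made up for illustration.

```python
def sanitize_instruction(generated_text: str, prompt: str) -> str:
    """Strip the echoed prompt and return the first line of the completion."""
    if not generated_text.startswith(prompt):
        # The slice below is only valid if the pipeline echoed the prompt verbatim
        raise ValueError("generated text does not start with the prompt")
    completion = generated_text[len(prompt):]
    # Keep the first line only; anything after it is treated as spillover
    return completion.split("\n")[0].strip()


prompt = "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
raw = prompt + "What are the best ways to save money on your vacation?\nillustrative extra line"
print(sanitize_instruction(raw, prompt))
```

Failing loudly when the prefix does not match avoids silently cutting into the instruction if the decoded text differs from the prompt.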


#### Step 2: Generating Responses

In [13]:
response_template = f'''<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n{sanitized_instruction}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n'''

terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

response = pipeline(
    response_template,
    max_new_tokens=2048,
    eos_token_id=terminators,
    do_sample=True,
    temperature=1,
    top_p=1,
)

Setting `pad_token_id` to `eos_token_id`:128009 for open-end generation.
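The prompt construction and the final slicing can be wrapped into small helpers that turn one generation into a JSON-serializable record. This is a sketch under stated assumptions: `build_response_prompt` and `make_record` are hypothetical names, the record keys are illustrative, and the stand-in `generated_text` is built by hand so the example runs without a model.

```python
import json


def build_response_prompt(instruction: str) -> str:
    """Wrap an instruction in the Llama-3 chat format, ending at the assistant header."""
    return (
        "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
        f"{instruction}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )


def make_record(instruction: str, generated_text: str) -> dict:
    """Remove the echoed prompt and pair the instruction with the response."""
    prompt = build_response_prompt(instruction)
    return {
        "instruction": instruction,
        "response": generated_text[len(prompt):].strip(),
    }


instruction = "What are the best ways to save money on your vacation?"
# Stand-in for pipeline output: the prompt followed by the start of a reply
generated_text = build_response_prompt(instruction) + (
    "Here are some effective ways to save money on your vacation:"
)
record = make_record(instruction, generated_text)
print(json.dumps(record, indent=2))
```

Storing each pair as a dictionary makes it straightforward to append many records to a JSON Lines file later.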


### We have successfully created an instruction-response pair!

In [14]:
print(f"User's message: {sanitized_instruction}")
print(f"Assistant's response: {response[0]['generated_text'][len(response_template):]}")

User's message: What are the best ways to save money on your vacation?
Assistant's response: Here are some effective ways to save money on your vacation:

1. Plan ahead: Book your flights, accommodations, and activities well in advance to take advantage of early-bird discounts.
2. Be flexible: Consider traveling during the off-season or during less popular times of the week (e.g., Tuesdays, Wednesdays) when prices tend to be lower.
3. Use travel rewards: Sign up for credit cards or loyalty programs that offer rewards in the form of travel points or miles.
4. Look for package deals: Booking a package deal that includes flights, accommodations, and activities can often be cheaper than booking each component separately.
5. Choose budget-friendly accommodations: Consider staying in a hostel, Airbnb, or budget hotel instead of a luxury resort.
6. Pack smart: Bring your own snacks, water, and other essentials to avoid overpriced touristy spots.
7. Take advantage of free stuff: Research free a