In [1]:
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline, BitsAndBytesConfig

2025-06-19 12:42:25.388685: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.


# Load local model, tokenizer, and build generator

In [2]:
model_path = "/media/anna/Samsung_T5/Llama_models/Llama-3.2-1B-Instruct"
bnb_config = BitsAndBytesConfig(load_in_8bit=True) #quantize to speed up for demo

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, quantization_config=bnb_config, device_map="auto")
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


# Format prompts

In [11]:
def build_prompt(system_msg: str, user_msg: str) -> str:
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n"
        f"{system_msg}\n"
        "<|start_header_id|>user<|end_header_id|>\n"
        f"{user_msg}\n"
        "<|start_header_id|>assistant<|end_header_id|>\n"
    )

system_prompt_1 = "You are a helpful assistant that is teaching a 5th grade class."
system_prompt_2 = "You are a helpful assistant that is teaching a university level class."
user_input = "Why is the sky blue?"

full_prompt = build_prompt(system_prompt, user_input)

# Compare outputs for different prompts

In [12]:
response = generator(full_prompt_1, max_new_tokens=256, do_sample=True, temperature=0.7)
print(response[0]["generated_text"])


Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


<|begin_of_text|><|start_header_id|>system<|end_header_id|>
You are a helpful assistant that is teaching a university level class.
<|start_header_id|>user<|end_header_id|>
Why is the sky blue?
<|start_header_id|>assistant<|end_header_id|>
The color of the sky is a fundamental aspect of our visual experience and understanding of the world around us, and it has been extensively studied and debated by scientists, philosophers, and artists. 

The sky appears blue to us because of the way that light behaves when it passes from the Earth to atmosphere. The sky is filled with a layer of gases, mainly nitrogen and oxygen, and carbon dioxide that scatter the light in all directions, and this is known as Rayleigh scattering. 

Here is a simple explanation of what is happening:

When sunlight enters the atmosphere, it travels a distance of approximately 1.5 kilometers, then it encounters the atmosphere and it gets scattered. Due to the small size of the molecules of the gas, it is scattered in al

In [None]:
response = generator(full_prompt_2, max_new_tokens=256, do_sample=True, temperature=0.7)
print(response[0]["generated_text"])

# More complex prompt

In [21]:
system_prompt = '''You are a helpful assistant that is teaching a university level class. 
Please format your output as a json object with title and response fields'''
user_input = "Why is the sky blue?"

full_prompt = build_prompt(system_prompt, user_input)

In [22]:
# Run generation
response = generator(full_prompt, max_new_tokens=256, do_sample=True, temperature=0.7)
print(response[0]["generated_text"])

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


<|begin_of_text|><|start_header_id|>system<|end_header_id|>
You are a helpful assistant that is teaching a university level class. 
Please format your output as a json object with title and response fields
<|start_header_id|>user<|end_header_id|>
Why is the sky blue?
<|start_header_id|>assistant<|end_header_id|>
{
  "Title": "Why the Sky is Blue",
  "Description": "The sky appears blue due to a phenomenon known as Rayleigh scattering, which causes shorter wavelengths of blue light to scatter more than other colors, resulting in the perception of blue sky color.",
 "Explanation": " Rayleigh scattering occurs when light interacts with the small molecules of gases in atmosphere, particles in the sky. The shorter blue light is scattered by more than other colors, producing blue color.",
 "Example": "The blue color of the sky is not always the same throughout the day, as it changes with atmospheric conditions such as pollution and dust particles in the atmosphere, which scatter light in dif

In [27]:
system_prompt = '''You are a helpful assistant that is teaching a university level class. 
Please format your output as a json object with fields for title, response, and relevant metadata like a timestamp and keywords.'''
user_input = "Why is the sky blue?"

full_prompt = build_prompt(system_prompt, user_input)

In [28]:
# Run generation
response = generator(full_prompt, max_new_tokens=256, do_sample=True, temperature=0.7)
print(response[0]["generated_text"])

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


<|begin_of_text|><|start_header_id|>system<|end_header_id|>
You are a helpful assistant that is teaching a university level class. 
Please format your output as a json object with fields for title, response, and relevant metadata like a timestamp and keywords.
<|start_header_id|>user<|end_header_id|>
Why is the sky blue?
<|start_header_id|>assistant<|end_header_id|>
**Response:**
{
  "title": "Why is the sky blue?",
  "date": "2021-01-01-01 00 00:00:00:00",
  "keywords": ["blue sky, color, atmospheric science, physics"],
  "metadata": {
    "author": "John Doe",
    "description": "A brief explanation of why the sky appears blue to us",
    "tags": ["optics, light scattering, visibility"]
}
