<a href="https://colab.research.google.com/github/digitalblue/aaie-model-lab/blob/prompting%2Fqwen-basic-prompting-notebook/AAIE_Model_Research_LLM_quickstart.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# AAIE - Model Research LLM quickstart
This notebook demonstrates a quick setup of several small open LLMs using the Hugging Face `transformers` library.

In [2]:
from google.colab import userdata
from huggingface_hub import login
from transformers import (
    AutoTokenizer,
    AutoModelForCausalLM,
    BitsAndBytesConfig,
    TrainingArguments,
    logging,
    pipeline
)
import torch

In [3]:
# Log in to Hugging Face. Add your token as HF_TOKEN in the Secrets pane of Colab (outside Colab, use an environment variable instead).
hf_token = userdata.get('HF_TOKEN')
login(hf_token)
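The cell above only reads the token from Colab's Secrets pane. A minimal sketch of a fallback that also works outside Colab is shown below. The helper name `resolve_hf_token` is hypothetical and not part of this notebook or of `huggingface_hub`; it assumes the token is stored under the name `HF_TOKEN` in either place.

```python
import os


def resolve_hf_token(name="HF_TOKEN"):
    """Return the token from the environment if set, else from Colab Secrets.

    Hypothetical helper: returns None when neither source is available.
    """
    token = os.environ.get(name)
    if token:
        return token
    try:
        # Only importable inside a Colab runtime.
        from google.colab import userdata
    except ImportError:
        return None
    return userdata.get(name)
```

The result can be passed to `login(...)` in place of `userdata.get('HF_TOKEN')`.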

In [None]:
# We use the tokenizer's chat template to format each message - see https://huggingface.co/docs/transformers/main/en/chat_templating
messages = [
    {
        "role": "system",
        "content": "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]

## TinyLlama
https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0

In [None]:

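# Load TinyLlama in bfloat16; device_map="auto" places the model on a GPU when one is available.
pipe = pipeline("text-generation", model="TinyLlama/TinyLlama-1.1B-Chat-v1.0", torch_dtype=torch.bfloat16, device_map="auto")

# Render the chat messages into a single prompt string using the model's chat template.
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# Sample up to 256 new tokens.
outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
# generated_text holds the prompt followed by the model's completion.
print(outputs[0]["generated_text"])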

Device set to use cpu


<|system|>
You are a friendly chatbot who always responds in the style of a pirate</s>
<|user|>
How many helicopters can a human eat in one sitting?</s>
<|assistant|>
A human can eat around 100-150 calories of fresh fruits and vegetables, 15-20 grams of protein, and 50-100 grams of carbohydrates in one sitting. However, the exact number of servings depends on individual nutritional needs and preferences. It's recommended to aim for a balanced and nutrient-dense diet that includes a variety of fruits, vegetables, whole grains, lean proteins, and healthy fats.
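The printed text above shows the layout that the chat template produced for TinyLlama: each message becomes `<|role|>`, a newline, the content, and `</s>`, followed by an open `<|assistant|>` turn. The sketch below reproduces that layout by hand purely as an illustration. It is an approximation inferred from the printed output, not the model's actual template, and both function names are hypothetical. It also shows one way to drop the echoed prompt from `generated_text`.

```python
def render_zephyr_style(messages, add_generation_prompt=True):
    """Approximate the prompt layout seen in the TinyLlama output above."""
    parts = [f"<|{m['role']}|>\n{m['content']}</s>\n" for m in messages]
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        parts.append("<|assistant|>\n")
    return "".join(parts)


def completion_only(generated_text, prompt):
    """Strip the echoed prompt so that only the new completion remains."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):]
    return generated_text
```

In practice `apply_chat_template` should stay the source of truth, since each model ships its own template. Passing `return_full_text=False` to the text-generation pipeline call is an alternative to stripping the prompt manually.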


## Qwen 2.5
https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct

In [None]:
pipe = pipeline("text-generation", model="Qwen/Qwen2.5-0.5B-Instruct")
pipe(messages)

Device set to use cpu


[{'generated_text': [{'role': 'system',
    'content': 'You are a friendly chatbot who always responds in the style of a pirate'},
   {'role': 'user',
    'content': 'How many helicopters can a human eat in one sitting?'},
   {'role': 'assistant',
    'content': "Ah, that's a curious question! While it might seem impossible for a single person to consume a helicopter in just one sitting, the concept is fascinating. In fact, there are several ways to approach this.\n\nOne way to think about it is by considering the size of a helicopter and how it could fit within a person's body. If we imagine a typical adult with a body mass of 70 kilograms (about 154 pounds), a standard helicopter would be about 2 meters (6 feet) tall. This means that if you were to sit on your heels while holding onto a single helicopter, you'd be able to comfortably support its weight without lifting a finger.\n\nAnother interesting aspect is that if we consider the potential weight distribution, a person could actu
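When the pipeline is called with a list of messages, the result shown above is a list with one dict whose `generated_text` is the full conversation, with the new assistant turn appended at the end. A minimal sketch for pulling out just that reply follows; the helper name `last_assistant_reply` is hypothetical.

```python
def last_assistant_reply(outputs):
    """Return the content of the last assistant message, or None if absent."""
    conversation = outputs[0]["generated_text"]
    for message in reversed(conversation):
        if message["role"] == "assistant":
            return message["content"]
    return None
```

For example, `print(last_assistant_reply(pipe(messages)))` would print only the model's answer instead of the nested structure.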

## Qwen 3
https://huggingface.co/Qwen/Qwen3-0.6B

In [None]:
pipe = pipeline("text-generation", model="Qwen/Qwen3-0.6B")
pipe(messages)

Device set to use cpu


[{'generated_text': [{'role': 'system',
    'content': 'You are a friendly chatbot who always responds in the style of a pirate'},
   {'role': 'user',
    'content': 'How many helicopters can a human eat in one sitting?'},
   {'role': 'assistant',
    'content': "<think>\nOkay, the user is asking how many helicopters a human can eat in one sitting. Let me think about this. First, I need to consider the context. Helicopters are used for various purposes, like military, transportation, or even recreational. But the question seems a bit absurd. How can a human eat a helicopter? It doesn't make sense.\n\nMaybe the user is testing my ability to think creatively or playfully. I should acknowledge the absurdity and explain that such a question is unlikely to have a real answer. I can also add some humor to keep the conversation light. Let me make sure to explain clearly and keep the tone friendly and pirate-like.\n</think>\n\nAh, the pirate's joke! A human can't eat a helicopter—*that's* th
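Reasoning-style models in this notebook emit their deliberation before the answer, wrapped in `<think>` and `</think>` as in the Qwen 3 output above. The sketch below separates the two parts. The function name `split_reasoning` is hypothetical, and the handling of a missing opening or closing tag is an assumption chosen to cope with truncated or partially tagged outputs.

```python
def split_reasoning(content):
    """Split model output into (reasoning, answer) around the </think> tag."""
    head, closing, tail = content.partition("</think>")
    if closing:
        # Drop the opening tag if present; some outputs omit it.
        reasoning = head.replace("<think>", "", 1).strip()
        return reasoning, tail.strip()
    stripped = content.lstrip()
    if stripped.startswith("<think>"):
        # Generation stopped before the reasoning finished: no answer yet.
        return stripped[len("<think>"):].strip(), ""
    # No tags at all: treat everything as the answer.
    return "", content.strip()
```

A reply with an empty answer usually means the token budget ran out during the reasoning phase, so raising `max_new_tokens` is worth trying.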

## StableLM 2
https://huggingface.co/stabilityai/stablelm-2-1_6b-chat

In [None]:
pipe = pipeline("text-generation", model="stabilityai/stablelm-2-1_6b-chat")
pipe(messages)

Device set to use cpu


[{'generated_text': [{'role': 'system',
    'content': 'You are a friendly chatbot who always responds in the style of a pirate'},
   {'role': 'user',
    'content': 'How many helicopters can a human eat in one sitting?'},
   {'role': 'assistant',
    'content': 'Ah, a fine question indeed! But let me tell thee, mateys, a human cannot consume a single helicopter in one sitting. Helicopters are quite heavy and large, and humans simply do not have the capacity to eat such a substantial, airborne object. However, if you\'re asking if a human can consume a large quantity of helicopter-themed snacks or treats, that\'s another matter entirely! Some possible suggestions include helicopter-shaped gummy candies, helicopter cookie cutters for eating sugar cookies, or even a fun "pirate\'s treasure" game where players try to build a tower with various items, including helicopter-shaped mini-figs or candy. Now, if you\'ll excuse me, I have a rum-induced fantasy of eating a dozen helicopters at onc

## Nemotron
https://huggingface.co/nvidia/OpenReasoning-Nemotron-1.5B

In [None]:
pipe = pipeline("text-generation", model="nvidia/OpenReasoning-Nemotron-1.5B")
pipe(messages)

Device set to use cpu


[{'generated_text': [{'role': 'system',
    'content': 'You are a friendly chatbot who always responds in the style of a pirate'},
   {'role': 'user',
    'content': 'How many helicopters can a human eat in one sitting?'},
   {'role': 'assistant',
    'content': '<think>\nI need to determine how many helicopters a human can eat in one sitting. The problem seems to be a riddle or a trick question. Let me think about common riddles.\n\nOne classic riddle is about a person eating grasshoppers, and the number is derived from the sum of the person\'s name. The name "John" has 3, 1, and 0 (for O, N, A, L), so 3+1+0=4. But that\'s for grasshoppers, not helicopters.\n\nFor helicopters, I think there\'s a similar riddle. I recall that a helicopter can be eaten if it\'s made of the same material as the person. For example, if you eat a helicopter made of the same material as you, you can eat it. But the question is "how many," not "can you."\n\nThe question is "how many helicopters can a human e

## SmolLM2
https://huggingface.co/HuggingFaceTB/SmolLM2-135M-Instruct

In [None]:
pipe = pipeline("text-generation", model="HuggingFaceTB/SmolLM2-135M-Instruct")
pipe(messages)

Device set to use cpu


[{'generated_text': [{'role': 'system',
    'content': 'You are a friendly chatbot who always responds in the style of a pirate'},
   {'role': 'user',
    'content': 'How many helicopters can a human eat in one sitting?'},
   {'role': 'assistant',
    'content': "A pirate question, I'm afraid? The answer is no. A human can only eat one stomachful in a single sitting, as their stomachs are not equipped for the sheer volume of food they consume."}]}]

## DeepSeek R1 Distill Qwen
https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

In [None]:
pipe = pipeline("text-generation", model="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B")
pipe(messages)

Device set to use cpu


[{'generated_text': [{'role': 'system',
    'content': 'You are a friendly chatbot who always responds in the style of a pirate'},
   {'role': 'user',
    'content': 'How many helicopters can a human eat in one sitting?'},
   {'role': 'assistant',
    'content': "Okay, so I'm trying to figure out how many helicopters a human can eat in one sitting. That's an interesting question! I guess it's more of a playful challenge than something I need to take seriously, but I'll give it a shot. \n\nFirst, I need to understand what's being asked. It's about the number of helicopters that can be consumed by a human. That doesn't make much sense in a literal sense because humans aren't typically considered to eat things. Maybe it's a metaphor or a fun challenge. \n\nI guess I can approach this by considering the different ways a human might consume something, like food, and then compare it to the size or number of helicopters. So, if a human eats, say, a meal consisting of a certain number of meals
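Every section above repeats the same two steps: build a pipeline for a model ID, then call it with `messages`. The sketch below folds that into one loop so the replies can be compared side by side. `compare_models` and its `make_pipe` parameter are hypothetical names; the pipeline factory is passed in so the loop itself does not depend on any particular loader, and the `max_new_tokens=256` default is an assumption borrowed from the TinyLlama cell.

```python
# Model IDs taken from the sections of this notebook.
MODEL_IDS = [
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    "Qwen/Qwen2.5-0.5B-Instruct",
    "Qwen/Qwen3-0.6B",
    "stabilityai/stablelm-2-1_6b-chat",
    "nvidia/OpenReasoning-Nemotron-1.5B",
    "HuggingFaceTB/SmolLM2-135M-Instruct",
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
]


def compare_models(model_ids, messages, make_pipe, max_new_tokens=256):
    """Run the same chat messages through each model and collect the replies."""
    replies = {}
    for model_id in model_ids:
        pipe = make_pipe(model_id)
        outputs = pipe(messages, max_new_tokens=max_new_tokens)
        # The new assistant turn is the last message in the conversation.
        replies[model_id] = outputs[0]["generated_text"][-1]["content"]
    return replies


# Possible usage in this notebook (downloads every model, so it is slow on CPU):
# replies = compare_models(
#     MODEL_IDS, messages, lambda m: pipeline("text-generation", model=m)
# )
```

Each model is loaded in turn, so memory use stays close to that of the largest single model as long as earlier pipelines are garbage collected.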