<a href="https://colab.research.google.com/github/vilasha/ollama-sandbox/blob/master/src/tools/Tokenizers_and_templates.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Tokenizers and message templates: how they differ from model to model

Runs on a CPU runtime; no GPU is required.

First, some setup: log in to Hugging Face and load one tokenizer per model.

In [1]:
from google.colab import userdata
from huggingface_hub import login
from transformers import AutoTokenizer

# Log in to Hugging Face

hf_token = userdata.get('HF_TOKEN')
if hf_token and hf_token.startswith("hf_"):
    print("HF key looks good so far")
else:
    print("HF key is not set - please click the key in the left sidebar")
login(hf_token, add_to_git_credential=True)

LLAMA = "meta-llama/Llama-3.2-1B-Instruct"
PHI4 = "microsoft/Phi-4-mini-instruct"
DEEPSEEK = "deepseek-ai/DeepSeek-V3.1"
QWEN_CODER = "Qwen/Qwen2.5-Coder-7B-Instruct"

llama_tokenizer = AutoTokenizer.from_pretrained(LLAMA)
phi4_tokenizer = AutoTokenizer.from_pretrained(PHI4)
deep_seek_tokenizer = AutoTokenizer.from_pretrained(DEEPSEEK)
qwen_coder_tokenizer = AutoTokenizer.from_pretrained(QWEN_CODER)

HF key looks good so far


How tokenization differs between models

In [2]:
text = "I always tell new hires \"Don't think of me as your boss. Think of me as your friend who can fire you.\""

print("Llama-3.2:")
tokens = llama_tokenizer.encode(text)
print(tokens)
# batch_decode treats each id as its own sequence, so this prints one string per token
print(llama_tokenizer.batch_decode(tokens))

print("\nPhi-4:")
tokens = phi4_tokenizer.encode(text)
print(tokens)
print(phi4_tokenizer.batch_decode(tokens))

print("\nDeepSeek:")
tokens = deep_seek_tokenizer.encode(text)
print(tokens)
print(deep_seek_tokenizer.batch_decode(tokens))

print("\nQwen2.5-Coder:")
tokens = qwen_coder_tokenizer.encode(text)
print(tokens)
print(qwen_coder_tokenizer.batch_decode(tokens))

Llama-3.2:
[128000, 40, 2744, 3371, 502, 73041, 330, 8161, 956, 1781, 315, 757, 439, 701, 13697, 13, 21834, 315, 757, 439, 701, 4333, 889, 649, 4027, 499, 1210]
['<|begin_of_text|>', 'I', ' always', ' tell', ' new', ' hires', ' "', 'Don', "'t", ' think', ' of', ' me', ' as', ' your', ' boss', '.', ' Think', ' of', ' me', ' as', ' your', ' friend', ' who', ' can', ' fire', ' you', '."']

Phi-4:
[40, 3324, 5485, 620, 115789, 392, 31559, 2411, 328, 668, 472, 634, 23377, 13, 24672, 328, 668, 472, 634, 5168, 1218, 665, 6452, 481, 3692]
['I', ' always', ' tell', ' new', ' hires', ' "', "Don't", ' think', ' of', ' me', ' as', ' your', ' boss', '.', ' Think', ' of', ' me', ' as', ' your', ' friend', ' who', ' can', ' fire', ' you', '."']

DeepSeek:
[0, 43, 3165, 4575, 1017, 106117, 582, 13222, 1664, 2118, 294, 678, 412, 782, 23699, 16, 22326, 294, 678, 412, 782, 6117, 995, 588, 5902, 440, 2148]
['<｜begin▁of▁sentence｜>', 'I', ' always', ' tell', ' new', ' hires', ' "', 'Don', "'t", ' think', ' 
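
To make the difference concrete, the decoded pieces printed above for Llama-3.2 and Phi-4 can be compared in plain Python. This is a minimal sketch that only reuses the lists from that output (DeepSeek is left out because its decoded output is cut off). The names `llama_pieces`, `phi4_pieces`, `only_llama` and `only_phi4` are made up for illustration.

```python
# Decoded pieces copied from the printed output above.
llama_pieces = ['<|begin_of_text|>', 'I', ' always', ' tell', ' new', ' hires', ' "',
                'Don', "'t", ' think', ' of', ' me', ' as', ' your', ' boss', '.',
                ' Think', ' of', ' me', ' as', ' your', ' friend', ' who', ' can',
                ' fire', ' you', '."']
phi4_pieces = ['I', ' always', ' tell', ' new', ' hires', ' "', "Don't", ' think',
               ' of', ' me', ' as', ' your', ' boss', '.', ' Think', ' of', ' me',
               ' as', ' your', ' friend', ' who', ' can', ' fire', ' you', '."']

# Pieces that appear in one tokenization but not in the other.
only_llama = [p for p in llama_pieces if p not in phi4_pieces]
only_phi4 = [p for p in phi4_pieces if p not in llama_pieces]

print("Llama-3.2 tokens:", len(llama_pieces))
print("Phi-4 tokens:", len(phi4_pieces))
print("Only in Llama-3.2:", only_llama)
print("Only in Phi-4:", only_phi4)

# Joining the Phi-4 pieces gives back the original sentence.
print("".join(phi4_pieces))
```

For this sentence, Llama-3.2 prepends a `<|begin_of_text|>` token and splits `Don't` into two pieces, which accounts for its 27 tokens against 25 for Phi-4.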

How message templates differ between models

In [3]:
messages = [
    {"role": "system", "content": "We are telling jokes here"},
    {"role": "user", "content": "Knock knock"},
    {"role": "assistant", "content": "Who’s there?"},
    {"role": "user", "content": "Nobel"},
    {"role": "assistant", "content": "Nobel who?"},
    {"role": "user", "content": "No bell, that’s why I knocked!"}
]

print("Llama-3.2:")
prompt = llama_tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)

print("\nPhi-4:")
prompt = phi4_tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)

print("\nDeepSeek:")
prompt = deep_seek_tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)

print("\nQwen2.5-Coder:")
prompt = qwen_coder_tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)

Llama-3.2:
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Cutting Knowledge Date: December 2023
Today Date: 01 Dec 2025

We are telling jokes here<|eot_id|><|start_header_id|>user<|end_header_id|>

Knock knock<|eot_id|><|start_header_id|>assistant<|end_header_id|>

Who’s there?<|eot_id|><|start_header_id|>user<|end_header_id|>

Nobel<|eot_id|><|start_header_id|>assistant<|end_header_id|>

Nobel who?<|eot_id|><|start_header_id|>user<|end_header_id|>

No bell, that’s why I knocked!<|eot_id|><|start_header_id|>assistant<|end_header_id|>



Phi-4:
<|system|>We are telling jokes here<|end|><|user|>Knock knock<|end|><|assistant|>Who’s there?<|end|><|user|>Nobel<|end|><|assistant|>Nobel who?<|end|><|user|>No bell, that’s why I knocked!<|end|><|assistant|>

DeepSeek:
<｜begin▁of▁sentence｜>We are telling jokes here<｜User｜>Knock knock<｜Assistant｜></think>Who’s there?<｜end▁of▁sentence｜><｜User｜>Nobel<｜Assistant｜></think>Nobel who?<｜end▁of▁sentence｜><｜User｜>No bell, that’s why I knocke
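
The Phi-4 prompt above is simple enough to imitate by hand, which shows what `apply_chat_template` does conceptually: it wraps each message in model-specific marker tokens and joins the results. The sketch below is not the model's actual template (the real one is a Jinja template shipped with the tokenizer). `render_phi4_style` is a hypothetical helper, written only to reproduce the Phi-4 output printed above for this conversation.

```python
messages = [
    {"role": "system", "content": "We are telling jokes here"},
    {"role": "user", "content": "Knock knock"},
    {"role": "assistant", "content": "Who’s there?"},
    {"role": "user", "content": "Nobel"},
    {"role": "assistant", "content": "Nobel who?"},
    {"role": "user", "content": "No bell, that’s why I knocked!"},
]


def render_phi4_style(messages, add_generation_prompt=True):
    """Hand-written imitation of the Phi-4 prompt format shown above."""
    # Each message becomes <|role|>content<|end|>, with no separators in between.
    prompt = "".join(f"<|{m['role']}|>{m['content']}<|end|>" for m in messages)
    if add_generation_prompt:
        # The trailing marker tells the model that the assistant speaks next.
        prompt += "<|assistant|>"
    # Assumption: with add_generation_prompt=False this sketch appends nothing;
    # the real template may behave differently in that case.
    return prompt


prompt = render_phi4_style(messages)
print(prompt)
```

The other models follow the same idea with different markers, for example `<|start_header_id|>` and `<|eot_id|>` for Llama-3.2, which is why a prompt formatted for one model should not be reused for another.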