# Exploring Chat Templates with SmolLM2

This notebook demonstrates how to use chat templates with the `SmolLM2` model. Chat templates help structure interactions between users and AI models, ensuring consistent and contextually appropriate responses.

In [9]:
!pip install transformers datasets trl huggingface_hub

Collecting fsspec<=2024.9.0,>=2023.1.0 (from fsspec[http]<=2024.9.0,>=2023.1.0->datasets)
  Using cached fsspec-2024.9.0-py3-none-any.whl.metadata (11 kB)
Using cached fsspec-2024.9.0-py3-none-any.whl (179 kB)
Installing collected packages: fsspec
  Attempting uninstall: fsspec
    Found existing installation: fsspec 2024.10.0
    Uninstalling fsspec-2024.10.0:
      Successfully uninstalled fsspec-2024.10.0
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
gcsfs 2024.12.0 requires fsspec==2024.12.0, but you have fsspec 2024.9.0 which is incompatible.
Successfully installed fsspec-2024.9.0


In [8]:
! python --version

Python 3.10.12


In [13]:
# Install the requirements in Google Colab
# !pip install transformers datasets trl huggingface_hub

# Authenticate to Hugging Face
from huggingface_hub import login

login()
print("Successfully logged in")

# For convenience, you can store your Hub token in the HF_TOKEN environment variable

Successfully logged in


In [11]:
# Import necessary libraries
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import setup_chat_format
import torch

## SmolLM2 Chat Template

Let's explore how to use a chat template with the `SmolLM2` model. We'll define a simple conversation and apply the chat template.

In [16]:
# Dynamically set the device
device = (
    "cuda"
    if torch.cuda.is_available()
    else "mps" if torch.backends.mps.is_available() else "cpu"
)

model_name = "HuggingFaceTB/SmolLM2-135M"
model = AutoModelForCausalLM.from_pretrained(
    pretrained_model_name_or_path=model_name
).to(device)
tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path=model_name)
model, tokenizer = setup_chat_format(model=model, tokenizer=tokenizer)

In [31]:
# Define messages for SmolLM2
messages = [
    {"role": "user", "content": "Hello, how are you?"},
    {
        "role": "assistant",
        "content": "I'm doing well, thank you! How can I assist you today?",
    },
    {"role": "user", "content": "Can you tell me a joke?"},
    {
        "role": "assistant",
        "content": "Sure, here's one: Why don't scientists trust atoms? Because they make up everything!",
    },

]

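### Apply the chat template without tokenization

With `tokenize=False`, the tokenizer renders the conversation as a single string. The special tokens `<|im_start|>` and `<|im_end|>` delimit each turn, and the role name (`user` or `assistant`) follows the opening token.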


In [32]:
input_text = tokenizer.apply_chat_template(messages, tokenize=False)

print("Conversation with template:", input_text)

Conversation with template: <|im_start|>user
Hello, how are you?<|im_end|>
<|im_start|>assistant
I'm doing well, thank you! How can I assist you today?<|im_end|>
<|im_start|>user
Can you tell me a joke?<|im_end|>
<|im_start|>assistant
Sure, here's one: Why don't scientists trust atoms? Because they make up everything!<|im_end|>
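
To make the format concrete, here is a minimal hand-written sketch that builds the same kind of string in plain Python. The function name `render_chatml` is hypothetical and not part of `transformers`. The real template is a Jinja string stored on the tokenizer and may handle cases that this sketch ignores.

```python
def render_chatml(messages):
    """Render role/content messages as a ChatML-style string (illustrative only)."""
    parts = []
    for message in messages:
        # Each turn is: start token, role, newline, content, end token, newline
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    return "".join(parts)


messages = [
    {"role": "user", "content": "Hello, how are you?"},
    {
        "role": "assistant",
        "content": "I'm doing well, thank you! How can I assist you today?",
    },
]

rendered = render_chatml(messages)
print(rendered)
```

Comparing this string with the output of `tokenizer.apply_chat_template(messages, tokenize=False)` is a quick way to check your understanding of the template.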



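### Decode the conversation

With `add_generation_prompt=True`, the template appends the opening of a new assistant turn (`<|im_start|>assistant`), which prompts the model to generate the next reply. Tokenizing and then decoding the conversation shows the same string as above, with that extra assistant header at the end.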


In [None]:
input_text = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True
)

print("Conversation decoded:", tokenizer.decode(token_ids=input_text))

Conversation decoded: <|im_start|>user
Hello, how are you?<|im_end|>
<|im_start|>assistant
I'm doing well, thank you! How can I assist you today?<|im_end|>
<|im_start|>assistant
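
As a rough illustration of what the generation prompt adds, the sketch below appends the assistant header to an already rendered conversation. The helper name `add_generation_prompt` and the constant are hypothetical. In practice the tokenizer's template does this for you when you pass `add_generation_prompt=True`.

```python
# Assumption: the generation prompt is the opening of an assistant turn,
# as seen at the end of the decoded output above.
GENERATION_PROMPT = "<|im_start|>assistant\n"


def add_generation_prompt(rendered_conversation):
    """Append an open assistant turn so the model continues as the assistant."""
    return rendered_conversation + GENERATION_PROMPT


conversation = "<|im_start|>user\nCan you tell me a joke?<|im_end|>\n"
prompt = add_generation_prompt(conversation)
print(prompt)
```

The assistant turn is left open on purpose: there is no closing `<|im_end|>`, because the model is expected to write the content and the closing token itself.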



### Tokenize the conversation

By default, `apply_chat_template` also tokenizes the conversation, converting the text and the special tokens into IDs from the model's vocabulary.



In [19]:
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

print("Conversation tokenized:", input_text)

Conversation tokenized: [1, 4093, 198, 19556, 28, 638, 359, 346, 47, 2, 198, 1, 520, 9531, 198, 57, 5248, 2567, 876, 28, 9984, 346, 17, 1073, 416, 339, 4237, 346, 1834, 47, 2, 198, 1, 520, 9531, 198]
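
The ID list is easier to read once it is split back into turns. The sketch below does this in plain Python, assuming that ID `1` is `<|im_start|>` and ID `2` is `<|im_end|>`. That mapping is inferred from where the IDs fall in the output above, not read from the vocabulary, and `split_turns` is a hypothetical helper.

```python
# Token IDs copied from the printed output above.
token_ids = [
    1, 4093, 198, 19556, 28, 638, 359, 346, 47, 2, 198,
    1, 520, 9531, 198, 57, 5248, 2567, 876, 28, 9984, 346, 17,
    1073, 416, 339, 4237, 346, 1834, 47, 2, 198,
    1, 520, 9531, 198,
]

# Assumption: these IDs correspond to <|im_start|> and <|im_end|>.
IM_START_ID = 1
IM_END_ID = 2


def split_turns(ids, start_id=IM_START_ID, end_id=IM_END_ID):
    """Group token IDs into turns; tokens outside a turn are dropped."""
    turns = []
    current = None
    for token_id in ids:
        if token_id == start_id:
            # A start token opens a new turn
            current = []
            turns.append(current)
        elif token_id == end_id:
            # An end token closes the current turn
            current = None
        elif current is not None:
            current.append(token_id)
    return turns


turns = split_turns(token_ids)
for index, turn in enumerate(turns):
    print(index, len(turn), turn)
```

The last turn contains only the role header, which is the generation prompt. To confirm the assumed mapping, you could pass the IDs to `tokenizer.convert_ids_to_tokens`.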


<div style='background-color: lightblue; padding: 10px; border-radius: 5px; margin-bottom: 20px; color:black'>
    <h2 style='margin: 0;color:blue'>Exercise: Process a dataset for SFT</h2>
    <p>Take a dataset from the Hugging Face hub and process it for SFT. </p>
    <p><b>Difficulty Levels</b></p>
    <p>🐢 Convert the <code>HuggingFaceTB/smoltalk</code> dataset into ChatML format.</p>
    <p>🐕 Convert the <code>openai/gsm8k</code> dataset into ChatML format.</p>
</div>

In [48]:
from IPython.display import HTML, display

display(
    HTML(
        """<iframe
  src="https://huggingface.co/datasets/HuggingFaceTB/smoltalk/embed/viewer/all/train?row=0"
  frameborder="0"
  width="100%"
  height="360px"
></iframe>
"""
    )
)

In [50]:
from datasets import load_dataset

ds = load_dataset("HuggingFaceTB/smoltalk", "everyday-conversations")
ds = ds['train']

def process_dataset(sample):
    # 🐢 Render the sample's messages as a ChatML string
    # using the tokenizer's chat template
    sample["chatml"] = tokenizer.apply_chat_template(
        sample["messages"], tokenize=False
    )
    return sample

ds = ds.map(process_dataset)



In [51]:
import pandas as pd
df = pd.DataFrame(ds)
display(df.head())

| | full_topic | messages | chatml |
|---|---|---|---|
| 0 | Travel/Vacation destinations/Beach resorts | `[{'content': 'Hi there', 'role': 'user'}, {'co...` | `<\|im_start\|>user\nHi there<\|im_end\|>\n<\|im_sta...` |
| 1 | Work/Career development/Mentorship | `[{'content': 'Hi', 'role': 'user'}, {'content'...` | `<\|im_start\|>user\nHi<\|im_end\|>\n<\|im_start\|>as...` |
| 2 | Shopping/Window shopping/Window shopping etiqu... | `[{'content': 'Hi', 'role': 'user'}, {'content'...` | `<\|im_start\|>user\nHi<\|im_end\|>\n<\|im_start\|>as...` |
| 3 | Cooking/Cooking for others/Food gifting | `[{'content': 'Hi there', 'role': 'user'}, {'co...` | `<\|im_start\|>user\nHi there<\|im_end\|>\n<\|im_sta...` |
| 4 | Weather/Climate change/Climate change impacts | `[{'content': 'Hi', 'role': 'user'}, {'content'...` | `<\|im_start\|>user\nHi<\|im_end\|>\n<\|im_start\|>as...` |
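
One way to sanity-check the `chatml` column is to parse a rendered string back into messages and compare the result with the original `messages` column. The sketch below does this with a regular expression. `parse_chatml` is a hypothetical helper, the assistant reply in the example is made up for illustration, and the pattern assumes that message content never contains the delimiter tokens themselves.

```python
import re

# Matches one turn: start token, role, newline, content, end token
TURN_PATTERN = re.compile(r"<\|im_start\|>(\w+)\n(.*?)<\|im_end\|>", re.DOTALL)


def parse_chatml(text):
    """Parse a ChatML-style string back into a list of role/content dicts."""
    return [
        {"role": role, "content": content}
        for role, content in TURN_PATTERN.findall(text)
    ]


# Hypothetical rendered sample in the same shape as the chatml column
chatml = (
    "<|im_start|>user\nHi there<|im_end|>\n"
    "<|im_start|>assistant\nHello! How can I help?<|im_end|>\n"
)

parsed = parse_chatml(chatml)
print(parsed)
```

If the parsed list matches the sample's original messages, the conversion preserved every turn.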


In [49]:
display(
    HTML(
        """<iframe
  src="https://huggingface.co/datasets/openai/gsm8k/embed/viewer/main/train"
  frameborder="0"
  width="100%"
  height="360px"
></iframe>
"""
    )
)

In [52]:
ds = load_dataset("openai/gsm8k", "main")


def process_dataset(sample):
    # 🐕 Convert the sample into a chat format

    # 1. Build the messages: the question is the user turn,
    #    the answer is the assistant turn
    chat = [
        {"role": "user", "content": sample["question"]},
        {"role": "assistant", "content": sample["answer"]},
    ]

    # 2. Apply the chat template to the messages using the tokenizer's method
    sample["chatml"] = tokenizer.apply_chat_template(chat, tokenize=False)

    return sample

ds = ds['train']
ds = ds.map(process_dataset)


In [53]:
df = pd.DataFrame(ds)
display(df.head())

| | question | answer | chatml |
|---|---|---|---|
| 0 | `Natalia sold clips to 48 of her friends in Apr...` | `Natalia sold 48/2 = <<48/2=24>>24 clips in May...` | `<\|im_start\|>user\nNatalia sold clips to 48 of ...` |
| 1 | `Weng earns $12 an hour for babysitting. Yester...` | `Weng earns 12/60 = $<<12/60=0.2>>0.2 per minut...` | `<\|im_start\|>user\nWeng earns $12 an hour for b...` |
| 2 | `Betty is saving money for a new wallet which c...` | `In the beginning, Betty has only 100 / 2 = $<<...` | `<\|im_start\|>user\nBetty is saving money for a ...` |
| 3 | `Julie is reading a 120-page book. Yesterday, s...` | `Maila read 12 x 2 = <<12*2=24>>24 pages today....` | `<\|im_start\|>user\nJulie is reading a 120-page ...` |
| 4 | `James writes a 3-page letter to 2 different fr...` | `He writes each friend 3*2=<<3*2=6>>6 pages a w...` | `<\|im_start\|>user\nJames writes a 3-page letter...` |


## Conclusion

This notebook demonstrated how to apply a chat template to the `SmolLM2` model. By structuring interactions with chat templates, we give the model every conversation in the same format, which helps it produce consistent and contextually relevant responses.

In the exercise, you converted two datasets into ChatML format. Luckily, TRL will do this for you, but it's useful to understand what's going on under the hood.