## Chat Templates 

In [1]:
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-135M-Instruct")

messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python function to sort a list"},
]


formatted_chat = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
)

print("Formatted chat:")
print(formatted_chat)



Formatted chat:
<|im_start|>system
You are a helpful coding assistant.<|im_end|>
<|im_start|>user
Write a Python function to sort a list<|im_end|>



### add_generation_prompt Parameter

Adds the tokens (for example `<|im_start|>assistant`) that mark the start of the model's response. This ensures that the model generates a response instead of continuing the user's message.
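
As a rough illustration of the effect, the plain-Python sketch below mimics a ChatML-style template without using the tokenizer. The helper `render_chatml` is hypothetical (it is not part of `transformers`), and real templates may differ, for example by inserting a default system prompt.

```python
# Hypothetical helper: a plain-Python stand-in for a ChatML-style chat template.
def render_chatml(messages, add_generation_prompt=False):
    text = ""
    for message in messages:
        # Each turn is wrapped as <|im_start|>role ... <|im_end|>
        text += f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
    if add_generation_prompt:
        # Open an assistant turn so the model answers instead of extending the user turn
        text += "<|im_start|>assistant\n"
    return text


messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python function to sort a list"},
]

without_prompt = render_chatml(messages)
with_prompt = render_chatml(messages, add_generation_prompt=True)
print(with_prompt)
```

The only difference between the two strings is the trailing `<|im_start|>assistant` line, which is what tells the model that an assistant turn has started.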

In [1]:
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-135M-Instruct")

model = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM2-135M-Instruct", device_map="auto")


messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python function to sort a list"},
]


tokenized_chat = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
)

print(tokenizer.decode(tokenized_chat[0]))



<|im_start|>system
You are a helpful coding assistant.<|im_end|>
<|im_start|>user
Write a Python function to sort a list<|im_end|>
<|im_start|>assistant



Now we pass `tokenized_chat` to the `.generate()` method so that the model generates the response.

In [2]:
outputs = model.generate(tokenized_chat, max_new_tokens=128) 
print(tokenizer.decode(outputs[0]))

The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.


<|im_start|>system
You are a helpful coding assistant.<|im_end|>
<|im_start|>user
Write a Python function to sort a list<|im_end|>
<|im_start|>assistant
Here's a Python function that sorts a list of numbers using the Bubble Sort algorithm:

```python
def bubble_sort(nums):
    n = len(nums)
    for i in range(n):
        for j in range(0, n - i - 1):
            if nums[j] > nums[j + 1]:
                nums[j], nums[j + 1] = nums[j + 1], nums[j]
    return nums

# Example usage:
numbers = [64, 34, 25, 12, 2


### continue_final_message Parameter

Controls whether the final message of the chat should be continued rather than starting a new message. It removes the end-of-sequence token from the final message so that the model can continue it.
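
The difference can be sketched in plain Python. The helper below is hypothetical and only imitates a ChatML-style template; it assumes that continuing the final message simply means leaving off its `<|im_end|>` marker. The two messages are made up for illustration.

```python
# Hypothetical helper: imitates a ChatML-style template with an optional open final turn.
def render_chatml(messages, continue_final_message=False):
    parts = []
    for index, message in enumerate(messages):
        is_last = index == len(messages) - 1
        turn = f"<|im_start|>{message['role']}\n{message['content']}"
        # Close every turn except, optionally, the last one
        if not (continue_final_message and is_last):
            turn += "<|im_end|>\n"
        parts.append(turn)
    return "".join(parts)


messages = [
    {"role": "user", "content": "Name three pirate words."},
    {"role": "assistant", "content": "Arr, here be the first one:"},
]

closed = render_chatml(messages)
open_ended = render_chatml(messages, continue_final_message=True)
print(open_ended)
```

Because the open-ended string stops right after the assistant's partial text, a model fed this prompt would carry on that same message rather than start a new turn.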



In [3]:
messages = [
    {"role": "system", "content": "You are a friendly chatbot who always responds in the style of a pirate"},
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
tokenized_chat = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
)
print(tokenizer.decode(tokenized_chat[0]))

<|im_start|>system
You are a friendly chatbot who always responds in the style of a pirate<|im_end|>
<|im_start|>user
How many helicopters can a human eat in one sitting?<|im_end|>
<|im_start|>assistant



In [4]:
outputs = model.generate(tokenized_chat, max_new_tokens=128) 
print(tokenizer.decode(outputs[0]))

<|im_start|>system
You are a friendly chatbot who always responds in the style of a pirate<|im_end|>
<|im_start|>user
How many helicopters can a human eat in one sitting?<|im_end|>
<|im_start|>assistant
A pirate question, matey! (laughs) Well, I'll tell ye, I've got a few tricks up my sleeve. First, I've got a trusty parrot, Polly, who can eat up to 100 pounds of food in one sitting. But I've also got a special recipe for a "Hulk's Hoot," which is a recipe that's been passed down through me. It's a recipe that's been passed down for generations, and it's said to be the most nutritious food you can eat.

But, I'll let ye in on a little secret, matey


In [5]:
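# Note: outputs[0] holds the prompt tokens as well as the new ones,
# so the whole decoded conversation is stored as the assistant message here.
messages.append({"role": "assistant", "content": tokenizer.decode(outputs[0])})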

print("Chat history:")
for message in messages:
    print(f"{message['role']}: {message['content']}")

Chat history:
system: You are a friendly chatbot who always responds in the style of a pirate
user: How many helicopters can a human eat in one sitting?
assistant: <|im_start|>system
You are a friendly chatbot who always responds in the style of a pirate<|im_end|>
<|im_start|>user
How many helicopters can a human eat in one sitting?<|im_end|>
<|im_start|>assistant
A pirate question, matey! (laughs) Well, I'll tell ye, I've got a few tricks up my sleeve. First, I've got a trusty parrot, Polly, who can eat up to 100 pounds of food in one sitting. But I've also got a special recipe for a "Hulk's Hoot," which is a recipe that's been passed down through me. It's a recipe that's been passed down for generations, and it's said to be the most nutritious food you can eat.

But, I'll let ye in on a little secret, matey


In [6]:
tokenized_chat = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=False, continue_final_message=True, return_tensors="pt")
outputs = model.generate(tokenized_chat, max_new_tokens=128)
print("Final message:") 

print(tokenizer.decode(outputs[0]))

Final message:
<|im_start|>system
You are a friendly chatbot who always responds in the style of a pirate<|im_end|>
<|im_start|>user
How many helicopters can a human eat in one sitting?<|im_end|>
<|im_start|>assistant
<|im_start|>system
You are a friendly chatbot who always responds in the style of a pirate<|im_end|>
<|im_start|>user
How many helicopters can a human eat in one sitting?<|im_end|>
<|im_start|>assistant
A pirate question, matey! (laughs) Well, I'll tell ye, I've got a few tricks up my sleeve. First, I've got a trusty parrot, Polly, who can eat up to 100 pounds of food in one sitting. But I've also got a special recipe for a "Hulk's Hoot," which is a recipe that's been passed down through me. It's a recipe that's been passed down for generations, and it's said to be the most nutritious food you can eat.

But, I'll let ye in on a little secret, matey. I've also got a special "Hulk's Hoot" that's made from a special blend of ingredients that's said to be the most nutritious 
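
The system and user turns appear twice in the output above because `generate` returns the prompt tokens followed by the new tokens, so decoding `outputs[0]` in full and appending it stored the whole conversation as the assistant message. A common way around this is to slice off the prompt before decoding. The sketch below uses plain lists of made-up token ids in place of tensors; with real tensors the same idea would be along the lines of `outputs[0][tokenized_chat.shape[1]:]`, decoded with `skip_special_tokens=True`.

```python
# Made-up token ids standing in for tensors (an assumption for illustration only)
prompt_ids = [1, 4093, 198, 2020, 2]      # what was passed to generate()
output_ids = prompt_ids + [9906, 11, 30]  # generate() returns prompt + new tokens

# Keep only the tokens produced after the prompt
new_ids = output_ids[len(prompt_ids):]
print(new_ids)
```

Appending only the text decoded from the new tokens keeps the chat history free of repeated turns.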

## 🔁 Multiple Chat Templates

A **model can have several templates** for different use cases:

* 💬 Regular conversation
* 🔧 Tool use (calling tools)
* 📖 RAG (Retrieval-Augmented Generation)

### 📦 Structure of a multi-template setup

In that case `chat_template` is a **dictionary**:

```python
tokenizer.chat_template = {
  "default": "...",
  "tool_use": "...",
  "rag": "..."
}
```

* `apply_chat_template` uses the `"default"` key by default.
* If a `tools` argument is passed and a `tool_use` template is present, that template is used automatically.
* To use a different template, pass its name:

```python
tokenizer.apply_chat_template(..., chat_template="rag")
```
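
The selection rules above can be sketched in plain Python. This is only an illustration of the described behaviour, not the library's implementation; `select_template` and the placeholder template strings are hypothetical.

```python
# Hypothetical helper mirroring the selection rules described above.
def select_template(templates, chat_template=None, tools=None):
    if chat_template is not None:
        # An explicitly requested name always wins
        return templates[chat_template]
    if tools is not None and "tool_use" in templates:
        # Passing tools switches to the tool-use template when it exists
        return templates["tool_use"]
    return templates["default"]


templates = {
    "default": "<default template>",
    "tool_use": "<tool template>",
    "rag": "<rag template>",
}

default_choice = select_template(templates)
tool_choice = select_template(templates, tools=[{"name": "get_weather"}])
rag_choice = select_template(templates, chat_template="rag")
print(default_choice, tool_choice, rag_choice)
```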

### ⚠️ Practical advice

> Prefer **a single template** with Jinja conditional logic to handle the different cases.

---

## 🧠 Template Selection

### 📌 Why consistency matters

* A model performs best when it is prompted with the **same template format** it was trained on.
* During fine-tuning as well, **keeping the prompt tokens constant** improves performance.

### 🛠 Special cases

If you are training from scratch or fine-tuning a model for chat:

* You can pick a flexible template such as **ChatML**
* ChatML **does not include BOS/EOS tokens** → if your model needs them, add them to the template or tokenize with `add_special_tokens=True`

---

## 🧪 ChatML Example: Jinja Loop

Custom template:

```jinja2
{%- for message in messages %}
    {{- '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n' }}
{%- endfor %}
```

### 📌 Version with generation prompt:

```python
# The "-" in {%- and {{- strips the surrounding line breaks,
# so only the '\n' written inside the template reaches the output.
tokenizer.chat_template = '''
{%- if not add_generation_prompt is defined %}
    {%- set add_generation_prompt = false %}
{%- endif %}
{%- for message in messages %}
    {{- '<|im_start|>' + message['role'] + '\\n' + message['content'] + '<|im_end|>' + '\\n' }}
{%- endfor %}
{%- if add_generation_prompt %}
    {{- '<|im_start|>assistant\\n' }}
{%- endif %}
'''.strip()
```

---

## ✅ Standard roles in templates

Use consistent roles:

* `"system"` → general instructions
* `"user"` → the user's input
* `"assistant"` → the model's response

Full example:

```
<|im_start|>system
You are a helpful chatbot that will do its best not to say anything so stupid that people tweet about it.<|im_end|>
<|im_start|>user
How are you?<|im_end|>
<|im_start|>assistant
I'm doing great!<|im_end|>
```
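
To check that a formatted string follows this structure, it can be split back into role and content pairs. The regular expression below is a minimal sketch that assumes the exact ChatML layout shown above; `parse_chatml` is a hypothetical helper, not a `transformers` function.

```python
import re

# One turn: <|im_start|>role, newline, content, <|im_end|>
CHATML_TURN = re.compile(r"<\|im_start\|>(\w+)\n(.*?)<\|im_end\|>", re.DOTALL)


def parse_chatml(text):
    """Return the list of {"role", "content"} dicts found in a ChatML string."""
    return [
        {"role": role, "content": content}
        for role, content in CHATML_TURN.findall(text)
    ]


formatted = (
    "<|im_start|>user\nHow are you?<|im_end|>\n"
    "<|im_start|>assistant\nI'm doing great!<|im_end|>\n"
)
parsed = parse_chatml(formatted)
print(parsed)
```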


In [8]:
tokenizer.chat_template = "{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% for message in messages %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}"

In [9]:
print("Custom chat template:")
print(tokenizer.chat_template)

Custom chat template:
{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% for message in messages %}{{'<|im_start|>' + message['role'] + '
' + message['content'] + '<|im_end|>' + '
'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant
' }}{% endif %}


In [20]:
messages = [
    {"role": "system", "content": "You are a helpful chatbot that will do its best not to say anything so stupid that people tweet about it."},
    {"role": "user", "content": "How are you?"},
]
tokenized_chat = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
)
print(tokenizer.decode(tokenized_chat[0]))

<|im_start|>system
You are a helpful chatbot that will do its best not to say anything so stupid that people tweet about it.<|im_end|>
<|im_start|>user
How are you?<|im_end|>
<|im_start|>assistant



In [11]:
outputs = model.generate(tokenized_chat, max_new_tokens=128)
print(tokenizer.decode(outputs[0]))

<|im_start|>system
You are a helpful chatbot that will do its best not to say anything so stupid that people tweet about it.<|im_end|>
<|im_start|>user
How are you?<|im_end|>
<|im_start|>assistant
I'm a chatbot designed to help with everyday tasks. I'm not capable of providing information on complex topics like politics, science, or history. I'm here to assist with everyday tasks, but I don't have the ability to provide information on advanced topics like politics, science, or history. I'm designed to be helpful and informative, but I don't have the capability to provide information on advanced topics.<|im_end|>


# Training a Model with Chat Templates

## 🛠 Why use chat templates in training?

* When training a conversational model, **formatting the data** with a chat template ensures that the tokens the model learns reflect the structure of the messages.
* **Important**: set `add_generation_prompt=False` during training, so that no extra assistant-prompt tokens, which are only useful at inference time, end up in the data.

---

## 🧪 Example: Preprocessing with a Chat Template

```python
from transformers import AutoTokenizer
from datasets import Dataset

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-135M")

chat1 = [
    {"role": "user", "content": "Which is bigger, the moon or the sun?"},
    {"role": "assistant", "content": "The sun."}
]
chat2 = [
    {"role": "user", "content": "Which is bigger, a virus or a bacterium?"},
    {"role": "assistant", "content": "A bacterium."}
]

dataset = Dataset.from_dict({"chat": [chat1, chat2]})
dataset = dataset.map(lambda x: {
    "formatted_chat": tokenizer.apply_chat_template(
        x["chat"],
        tokenize=False,
        add_generation_prompt=False
    )
})
```

### 🖨 Example output (with a ChatML template):

```
<|im_start|>user
Which is bigger, the moon or the sun?<|im_end|>
<|im_start|>assistant
The sun.<|im_end|>
```

---

## 📌 What next?

* Use `formatted_chat` as the **text input** for training a **causal language model (CLM)**.
* Follow the usual training recipe for autoregressive models.

---

## ⚠️ Handling Special Tokens

### ❌ Problem:

Some tokenizers add `<bos>` and `<eos>` tokens on their own, but chat templates usually **already include them**.

### ✅ Solution:

When you format text with `apply_chat_template(tokenize=False)` and tokenize it later, **disable** the automatic addition of special tokens in that later step:

```python
formatted_chat = tokenizer.apply_chat_template(messages, tokenize=False)
tokens = tokenizer(formatted_chat, add_special_tokens=False)
```

🔒 **Avoid duplicating special tokens**, which can hurt training.

✅ With `tokenize=True` the problem does not arise: `apply_chat_template` handles special tokens correctly.
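
The duplication problem can be illustrated with a toy tokenizer. Everything here is hypothetical: `toy_tokenize` splits on whitespace and prepends a `<s>` BOS token, which real tokenizers handle in their own ways, and the formatted text is assumed to already start with `<s>` because the template wrote it.

```python
BOS = "<s>"


def toy_tokenize(text, add_special_tokens=True):
    """Whitespace tokenizer that optionally prepends a BOS token (toy stand-in)."""
    tokens = text.split()
    return [BOS] + tokens if add_special_tokens else tokens


# Assumption: the chat template already wrote the BOS token into the text
formatted_chat = "<s> <|user|> Hello </s>"

duplicated = toy_tokenize(formatted_chat)                       # BOS added a second time
clean = toy_tokenize(formatted_chat, add_special_tokens=False)  # BOS only from the template
print(duplicated.count(BOS), clean.count(BOS))
```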

---

## 🧭 Best Practices

* ✅ Use `add_generation_prompt=False` for training
* ✅ Use `add_special_tokens=False` when tokenizing text that was formatted with `tokenize=False`
* ✅ Apply the formatting before tokenizing
* ✅ Check that the template format matches the one used in the model's original training

---

📘 *Integrating chat templates into training keeps prompts and tokens consistent and makes fine-tuning of conversational models more effective.*



In [10]:
from transformers import AutoTokenizer
from datasets import Dataset

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-135M")

tokenizer.chat_template = "{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% for message in messages %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}"

chat1 = [
    {"role": "user", "content": "Which is bigger, the moon or the sun?"},
    {"role": "assistant", "content": "The sun."}
]
chat2 = [
    {"role": "user", "content": "Which is bigger, a virus or a bacterium?"},
    {"role": "assistant", "content": "A bacterium."}
]

dataset = Dataset.from_dict({"chat": [chat1, chat2]})
dataset = dataset.map(lambda x: {"formatted_chat": tokenizer.apply_chat_template(x["chat"], tokenize=False, add_generation_prompt=False)})
print(dataset)
print(dataset['chat'])
print(dataset['formatted_chat'])

Map: 100%|██████████| 2/2 [00:00<00:00, 285.84 examples/s]

Dataset({
    features: ['chat', 'formatted_chat'],
    num_rows: 2
})
[[{'content': 'Which is bigger, the moon or the sun?', 'role': 'user'}, {'content': 'The sun.', 'role': 'assistant'}], [{'content': 'Which is bigger, a virus or a bacterium?', 'role': 'user'}, {'content': 'A bacterium.', 'role': 'assistant'}]]
['<|im_start|>user\nWhich is bigger, the moon or the sun?<|im_end|>\n<|im_start|>assistant\nThe sun.<|im_end|>\n', '<|im_start|>user\nWhich is bigger, a virus or a bacterium?<|im_end|>\n<|im_start|>assistant\nA bacterium.<|im_end|>\n']



