### **Summary: Supervised Fine-Tuning (SFT)**  

**Supervised Fine-Tuning (SFT)** adapts a pre-trained language model to follow instructions and perform specific tasks by training it on labeled examples, typically prompt-response pairs. Most chat-oriented LLMs, including ChatGPT, undergo SFT to become more usable and better at handling conversational context.

#### **Key Sections:**  
- **Chat Templates** → Structure interactions between users and AI models, ensuring coherence and role management.  
- **Supervised Fine-Tuning** → Trains models using labeled datasets to improve task-specific performance.  
- **Low-Rank Adaptation (LoRA)** → Efficient fine-tuning using low-rank matrices, reducing memory consumption.  
- **Evaluation** → Measures model performance on task-specific datasets.  
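
To make the memory claim for LoRA concrete, here is a minimal sketch that compares trainable parameter counts. It assumes the standard LoRA setup, in which the original weight matrix is frozen and only two low-rank factors are trained. The function name and the layer dimensions are illustrative, not taken from any specific model.

```python
def lora_param_counts(d_out, d_in, rank):
    """Compare trainable parameters: full weight matrix vs. LoRA factors."""
    full = d_out * d_in            # full fine-tuning updates every entry of W
    lora = rank * (d_out + d_in)   # B is d_out x rank, A is rank x d_in
    return full, lora


# Illustrative dimensions only (a square 4096 x 4096 projection, rank 8).
full, lora = lora_param_counts(d_out=4096, d_in=4096, rank=8)
print(f"full fine-tuning: {full:,} trainable parameters")
print(f"LoRA (rank 8):    {lora:,} trainable parameters")
print(f"ratio:            {lora / full:.4%}")
```

Because optimizer state and gradients are only kept for trainable parameters, a smaller trainable set is where most of the memory saving comes from.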

### **Chat Templates and Formatting**  
Chat templates define how a conversation is serialized into the text a model sees. They mark each role (`system`, `user`, `assistant`) and carry context across turns. They provide:  
- Consistent conversation structure  
- Clear role identification  
- Multi-turn context management  
- Advanced features like **tool use** and **multimodal inputs**  

**Base Models vs. Instruction Models**  
- **Base Models** → Predict next tokens from raw text  
- **Instruction-Tuned Models** → Trained to follow structured prompts and execute complex interactions  

One common format is **ChatML**, which wraps each turn in `<|im_start|>` and `<|im_end|>` tokens:  

    <|im_start|>system
    You are a helpful assistant.<|im_end|>
    <|im_start|>user
    Hello!<|im_end|>
    <|im_start|>assistant
    Hi! How can I help you today?<|im_end|>
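
The template above can be produced from a list of role/content messages. The following is a minimal sketch, not the implementation used by any particular tokenizer: `render_chatml` is a hypothetical helper, and restricting roles to `system`, `user`, and `assistant` is an assumption made for this example.

```python
def render_chatml(messages, add_generation_prompt=False):
    """Render a list of {"role", "content"} dicts as a ChatML string."""
    allowed_roles = {"system", "user", "assistant"}  # assumption for this sketch
    parts = []
    for message in messages:
        role = message["role"]
        if role not in allowed_roles:
            raise ValueError(f"Unexpected role: {role!r}")
        # Each turn is wrapped in <|im_start|>{role} ... <|im_end|>.
        parts.append(f"<|im_start|>{role}\n{message['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open an assistant turn so the model continues from this point.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi! How can I help you today?"},
]
print(render_chatml(messages))
```

At inference time, the conversation usually ends with a user turn, and `add_generation_prompt=True` appends the opening of an assistant turn for the model to complete.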


### **Hands-on Exercise: Converting a Dataset to ChatML**
The example below loads a dataset and maps each record to the `messages` structure that chat templates expect.

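    from datasets import load_dataset

    # This dataset requires an explicit config name; 'all' is one of the available configs.
    dataset = load_dataset('HuggingFaceTB/smoltalk', 'all')

    def convert_to_chatml(example):
        # Assumes each record has "input" and "output" columns;
        # adjust the keys to match the dataset's actual schema.
        return {
            "messages": [
                {"role": "user", "content": example["input"]},
                {"role": "assistant", "content": example["output"]},
            ]
        }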
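
As a quick check that needs no download, the conversion can be tried on a few in-memory rows. This is a sketch under stated assumptions: the `rows` list is made-up sample data, and the `input_key` and `output_key` parameters and the missing-column check are additions for illustration, not part of the original function.

```python
def convert_to_chatml(example, input_key="input", output_key="output"):
    """Map one prompt/response row to a two-turn "messages" list."""
    missing = [key for key in (input_key, output_key) if key not in example]
    if missing:
        # Fail early with a clear message if the schema differs from the assumption.
        raise KeyError(f"Row is missing expected columns: {missing}")
    return {
        "messages": [
            {"role": "user", "content": example[input_key]},
            {"role": "assistant", "content": example[output_key]},
        ]
    }


# Hypothetical rows standing in for a real dataset split.
rows = [
    {"input": "What is 2 + 2?", "output": "4"},
    {"input": "Name a primary color.", "output": "Red"},
]
converted = [convert_to_chatml(row) for row in rows]
print(converted[0]["messages"])
```

With a Hugging Face `datasets` object, the same function can be passed to `dataset.map(convert_to_chatml)`, provided the column names match.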


**Note:** `HuggingFaceTB/smoltalk` requires a config name. Calling `load_dataset('HuggingFaceTB/smoltalk')` without one raises `ValueError: Config name is missing.` The available configs are `all`, `smol-magpie-ultra`, `smol-constraints`, `smol-rewrite`, `smol-summarize`, `apigen-80k`, `everyday-conversations`, `explore-instruct-rewriting`, `longalign`, `metamathqa-50k`, `numina-cot-100k`, `openhermes-100k`, `self-oss-instruct`, and `systemchats-30k`. For example: `load_dataset('HuggingFaceTB/smoltalk', 'all')`.