# 🧪 Level 16: RAFT - Fine-Tuning for RAG
### Training the Model to be a RAG Specialist

In this final notebook, we explore **Retrieval-Augmented Fine-Tuning (RAFT)**. We will learn how to prepare a dataset that teaches a model to ignore "distractor" documents and rely on the correct context.

---

## 1. Defining the RAFT Data Format

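A RAFT training sample needs a question, one or more oracle documents (the ones that contain the answer), distractor documents (irrelevant noise), and a chain-of-thought answer grounded in the oracle.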

In [1]:
import json

training_sample = {
    "instruction": "Based on the provided documents, answer the question.",
    "question": "What is the main advantage of the RAFT architecture?",
    "context": [
        "[Oracle] RAFT teaches the model to ignore irrelevant distractor documents while focusing on gold information.",
        "[Distractor] The sky is blue and cats like to sleep on roofs.",
        "[Distractor] Python 3.12 was released recently with performance improvements."
    ],
    "cot_answer": "Evidence from the first document shows that RAFT's advantage is teaching models to filter noise. The other documents about weather and Python versions are irrelevant. Thus, the answer is noise-refinement."
}

print(json.dumps(training_sample, indent=2))

{
  "instruction": "Based on the provided documents, answer the question.",
  "question": "What is the main advantage of the RAFT architecture?",
  "context": [
    "[Oracle] RAFT teaches the model to ignore irrelevant distractor documents while focusing on gold information.",
    "[Distractor] The sky is blue and cats like to sleep on roofs.",
    "[Distractor] Python 3.12 was released recently with performance improvements."
  ],
  "cot_answer": "Evidence from the first document shows that RAFT's advantage is teaching models to filter noise. The other documents about weather and Python versions are irrelevant. Thus, the answer is noise-refinement."
}
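
A single hand-written sample shows the format, but in practice samples are usually assembled from a corpus. The sketch below is one possible way to do that, not this notebook's own pipeline: `build_raft_sample`, `num_distractors`, `oracle_prob`, and the `has_oracle` field are hypothetical names chosen for illustration. It assumes, following the RAFT recipe as it is commonly described, that a fraction of samples deliberately omit the oracle document so the model cannot count on the answer always being in the context.

```python
import random

def build_raft_sample(question, oracle_doc, distractor_pool, cot_answer,
                      num_distractors=2, oracle_prob=0.8, rng=None):
    """Assemble one RAFT-style sample (hypothetical helper, illustrative only)."""
    if rng is None:
        rng = random.Random()
    # Distractors must differ from the oracle document.
    candidates = [doc for doc in distractor_pool if doc != oracle_doc]
    # With probability oracle_prob the oracle is kept; otherwise the context is noise only.
    include_oracle = rng.random() < oracle_prob
    # Draw one extra distractor when the oracle is left out, so the context size stays the same.
    k = num_distractors if include_oracle else num_distractors + 1
    context = rng.sample(candidates, min(k, len(candidates)))
    if include_oracle:
        context.append(oracle_doc)
    # Shuffle so the oracle's position carries no signal.
    rng.shuffle(context)
    return {
        "instruction": "Based on the provided documents, answer the question.",
        "question": question,
        "context": context,
        "cot_answer": cot_answer,
        "has_oracle": include_oracle,  # bookkeeping only, not meant for the prompt
    }
```

Passing a seeded `random.Random` as `rng` makes the dataset reproducible. The value `0.8` for `oracle_prob` is a placeholder, not a recommendation from this notebook.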


## 2. Synthesizing the Training Prompt

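During fine-tuning, every prompt places the oracle document alongside the distractors, so the model has to learn to pick out the relevant passage rather than trust whatever context it is given.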

In [2]:
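def format_raft_prompt(sample):
    """Render a RAFT sample as the prompt text the model sees during fine-tuning."""
    # Oracle and distractor documents go into one block, one per line.
    context_block = "\n".join(sample['context'])
    prompt = f"### Instruction:\n{sample['instruction']}\n\n"
    prompt += f"### Documents:\n{context_block}\n\n"
    prompt += f"### Question:\n{sample['question']}\n\n"
    # The prompt stops here; the model is trained to continue with its reasoning and answer.
    prompt += "### Thought & Answer:"
    return prompt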

print(format_raft_prompt(training_sample))

### Instruction:
Based on the provided documents, answer the question.

### Documents:
[Oracle] RAFT teaches the model to ignore irrelevant distractor documents while focusing on gold information.
[Distractor] The sky is blue and cats like to sleep on roofs.
[Distractor] Python 3.12 was released recently with performance improvements.

### Question:
What is the main advantage of the RAFT architecture?

### Thought & Answer:
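
The prompt above stops at `### Thought & Answer:`, so a supervised fine-tuning example also needs the target text, which here is `cot_answer`. The printed prompt also still carries the `[Oracle]` and `[Distractor]` tags. If those reach the model it can read the answer off the label instead of learning to filter. The following is a minimal sketch of one way to handle both points. `strip_labels` and `to_training_pair` are hypothetical helpers, and the `prompt` / `completion` field names are an assumption about what a downstream trainer expects, not something this notebook specifies.

```python
import re

# Matches a leading "[Oracle] " or "[Distractor] " tag, as used in this notebook's sample.
LABEL_PATTERN = re.compile(r"^\[(?:Oracle|Distractor)\]\s*")

def strip_labels(docs):
    """Remove the bookkeeping tags so the model cannot read relevance off the label."""
    return [LABEL_PATTERN.sub("", doc) for doc in docs]

def to_training_pair(sample):
    """Return a prompt/completion pair for supervised fine-tuning (hypothetical helper)."""
    docs = strip_labels(sample["context"])
    # Number the documents neutrally instead of tagging them.
    context_block = "\n".join(f"[Doc {i}] {doc}" for i, doc in enumerate(docs, start=1))
    prompt = (
        f"### Instruction:\n{sample['instruction']}\n\n"
        f"### Documents:\n{context_block}\n\n"
        f"### Question:\n{sample['question']}\n\n"
        "### Thought & Answer:"
    )
    # The leading space separates the completion from the trailing colon.
    return {"prompt": prompt, "completion": " " + sample["cot_answer"]}

example = {
    "instruction": "Based on the provided documents, answer the question.",
    "question": "What is the main advantage of the RAFT architecture?",
    "context": [
        "[Oracle] RAFT teaches the model to ignore irrelevant distractor documents while focusing on gold information.",
        "[Distractor] The sky is blue and cats like to sleep on roofs.",
    ],
    "cot_answer": "Evidence from the first document shows that RAFT's advantage is teaching models to filter noise.",
}

pair = to_training_pair(example)
print(pair["prompt"])
print(pair["completion"])
```

Document order is preserved here on purpose: the sample's `cot_answer` refers to "the first document", so shuffling the context would require rewriting that reference as well.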


## 3. The Path to Expert Status

By mastering RAFT, you are no longer just an "app builder." You are a **Model Optimizer**. 

### **You have completed all 16 levels of the AI Mastery Journey.**

#### **Your Skills Pyramid:**
1.  **Foundations**: Tensors, Calculus, Optimization.
2.  **Architectures**: Transformers, CNNs, RNNs.
3.  **Modern RAG**: Vector Search, Multi-modal, Hybrid Retrieval.
4.  **Agentic AI**: Reasoning loops, Tools, Swarms.
5.  **Engineering**: Docker, DevOps, CI/CD, Production Hardening.
6.  **Trust**: Adversarial AI, Security, Guardrails.
7.  **Efficiency**: SLMs, Quantization, Local Inference.
8.  **Training**: RAFT & Fine-Tuning.

### **CONGRATULATIONS.**
**End of File.**