# 🧮 LMFast: Math Reasoning Specialty

**Train an SLM to be a math genius!**

## What You'll Learn
- Fine-tune on math datasets (GSM8K, MathInstruct)
- Format data with Chain-of-Thought (CoT)
- Use `ReasoningConfig` for specialized training
- Evaluate using pass@1 metrics

## Why Math?
Math reasoning is widely used as a proxy for general reasoning ability. Small models that excel at math often reason better in other domains too.

**Time to complete:** ~20 minutes

## 1️⃣ Setup

In [None]:
!pip install -q "lmfast[all]" datasets

import lmfast
lmfast.setup_colab_env()

## 2️⃣ Data Preparation (CoT)

We need data that shows the *steps*, not just the answer.

In [None]:
from datasets import load_dataset

# Load a small subset of GSM8K (Grade School Math) to keep the demo fast.
# A real training run would use the full train split.
dataset = load_dataset("openai/gsm8k", "main", split="train[:100]")

print("Example:")
print(f"Q: {dataset[0]['question']}")
print(f"A: {dataset[0]['answer']}")
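
Each GSM8K `answer` holds the worked steps, with inline calculator annotations such as `<<48/2=24>>`, and ends with a `#### <number>` line carrying the final result. The sketch below shows one way to pull these pieces apart. The helper names `extract_final_answer` and `strip_calculator_annotations` are hypothetical and not part of LMFast, and `sample` is a hand-written string in the GSM8K answer style.

```python
import re
from typing import Optional


def extract_final_answer(answer: str) -> Optional[str]:
    """Return the number after the '####' marker, or None if it is missing."""
    match = re.search(r"####\s*(-?[\d,]*\.?\d+)", answer)
    if match is None:
        return None
    # Drop thousands separators so "1,200" and "1200" compare equal.
    return match.group(1).replace(",", "")


def strip_calculator_annotations(answer: str) -> str:
    """Remove the inline <<expr=result>> calculator annotations."""
    return re.sub(r"<<[^>]*>>", "", answer)


# Hand-written sample in the GSM8K answer style (illustrative only).
sample = (
    "Natalia sold 48/2 = <<48/2=24>>24 clips in May.\n"
    "Natalia sold 48+24 = <<48+24=72>>72 clips altogether in April and May.\n"
    "#### 72"
)

print(extract_final_answer(sample))
print(strip_calculator_annotations(sample))
```

Keeping the final answer separate from the reasoning text is useful later, when model outputs need to be compared against a reference.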

## 3️⃣ Formatting for Training

We format each example as a question followed by a solution section, so the model learns to write out the chain of thought before the final answer.

In [None]:
def format_math(example):
    # Alpaca-style template: "### Question" header, then "### Solution" with a CoT cue
    return {
        "text": f"### Question:\n{example['question']}\n\n### Solution:\nLet's think step by step.\n{example['answer']}"
    }

train_ds = dataset.map(format_math)
print(train_ds[0]['text'])
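
At inference time it usually helps to prompt the model with the same template it saw during training, minus the answer. The sketch below splits the template above into a prompt half and a full training text. `build_prompt` and `build_training_text` are hypothetical helpers introduced for illustration, not LMFast functions, and the example question is made up.

```python
QUESTION_HEADER = "### Question:\n"
SOLUTION_HEADER = "\n\n### Solution:\nLet's think step by step.\n"


def build_prompt(question: str) -> str:
    """Prompt-only half of the training template (no answer appended)."""
    return f"{QUESTION_HEADER}{question.strip()}{SOLUTION_HEADER}"


def build_training_text(question: str, answer: str) -> str:
    """Full training example: the prompt followed by the worked solution."""
    return build_prompt(question) + answer


# Made-up example in the same question/answer shape as GSM8K.
example = {"question": "What is 2 + 3?", "answer": "2 + 3 = 5\n#### 5"}
text = build_training_text(example["question"], example["answer"])
print(text)
```

Sharing one template between training and inference avoids a mismatch where the fine-tuned model is asked questions in a format it never saw.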

## 4️⃣ Training with Math Optimization

Math training often benefits from:
- A slow learning-rate decay
- Packing multiple short examples into one sequence
- NEFTune (optional): noise added to the embeddings during training, which can improve generalization
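
To make the packing idea concrete, here is a minimal concatenate-and-chunk sketch: examples are joined with an end-of-sequence token and cut into fixed-size blocks, so short problems do not waste space on padding. This illustrates the general technique only. How `packing=True` is implemented inside LMFast is not shown here, and `pack_sequences`, the token IDs, and `eos_id=0` are assumptions made for the example.

```python
from typing import Iterable, List


def pack_sequences(
    tokenized: Iterable[List[int]], block_size: int, eos_id: int
) -> List[List[int]]:
    """Concatenate examples (each followed by EOS) and cut into fixed-size blocks."""
    stream: List[int] = []
    for ids in tokenized:
        stream.extend(ids)
        stream.append(eos_id)  # mark the boundary between examples
    # Drop the trailing remainder that does not fill a whole block.
    n_blocks = len(stream) // block_size
    return [stream[i * block_size:(i + 1) * block_size] for i in range(n_blocks)]


# Made-up token IDs standing in for four short tokenized math problems.
examples = [[1, 2, 3], [4, 5], [6, 7, 8, 9], [10]]
blocks = pack_sequences(examples, block_size=4, eos_id=0)
print(blocks)
```

Note that a block can start in the middle of one example and end in the middle of another. That trade-off is typical of this simple form of packing.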

In [None]:
from lmfast import train

print("🚀 Starting Math Fine-Tuning...")

trainer = train(
    model="HuggingFaceTB/SmolLM-135M",
    dataset=train_ds,
    output_dir="./math_solver",
    max_steps=50,
    learning_rate=2e-4,
    packing=True,  # Pack shorter math problems for efficiency
    neftune_noise_alpha=5,  # NEFTune embedding noise; can improve generalization
)

print("✅ Training initiated...")

## 5️⃣ Test with ThinkingAgent

Use the `reason` helper from `lmfast.reasoning`, the module that also provides the new `ThinkingAgent`, to check performance.

In [None]:
from lmfast.reasoning import ThinkingAgent, reason
from lmfast.inference import SLMServer

# Load our trained model
model = SLMServer("./math_solver")

problem = "Janet has 5 apples. She gives 2 to Tom and buys 3 more. How many apples does she have now?"

# Use Best-of-N to boost accuracy further
answer = reason(
    model_fn=lambda p: model.generate(p, max_new_tokens=100), 
    problem=problem,
    method="best_of_n", 
    n=3
)

print(f"Question: {problem}")
print(f"Model Answer: {answer}")
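
A single hand-picked problem says little about accuracy. One way to get a pass@1 number is to sample one answer per problem and report the fraction that match the reference. The sketch below assumes the model's final number is the last number in its output. `last_number`, `pass_at_1`, and `stub_solver` are hypothetical names, and the stub returns canned strings in place of a real model call.

```python
import re
from typing import Callable, Iterable, Optional, Tuple


def last_number(text: str) -> Optional[str]:
    """Return the last number in the text, with thousands separators removed."""
    numbers = re.findall(r"-?\d[\d,]*(?:\.\d+)?", text)
    return numbers[-1].replace(",", "") if numbers else None


def pass_at_1(
    solve_fn: Callable[[str], str], problems: Iterable[Tuple[str, str]]
) -> float:
    """Fraction of problems whose single sampled answer matches the reference."""
    correct = 0
    total = 0
    for question, reference in problems:
        predicted = last_number(solve_fn(question))
        correct += int(predicted is not None and float(predicted) == float(reference))
        total += 1
    return correct / total if total else 0.0


def stub_solver(question: str) -> str:
    """Stand-in for a model call; returns canned outputs (one deliberately wrong)."""
    canned = {
        "What is 2 + 3?": "2 + 3 = 5\n#### 5",
        "What is 10 - 4?": "10 - 4 = 7\n#### 7",
    }
    return canned[question]


problems = [("What is 2 + 3?", "5"), ("What is 10 - 4?", "6")]
score = pass_at_1(stub_solver, problems)
print(f"pass@1: {score:.2f}")
```

In the notebook, `solve_fn` could wrap the `model.generate` call used above, and the references could come from the `####` line of held-out GSM8K answers. Both are left as assumptions here.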

## 🎉 Summary

You've learned how to:
- ✅ Prepare math datasets with CoT
- ✅ Fine-tune for reasoning
- ✅ Combine fine-tuning with test-time compute

### Next Steps
- `10_reasoning_agents.ipynb`: Dive deeper into inference strategies.