# DeepFabric Training Metrics Demo

This notebook demonstrates automatic training metrics logging with DeepFabric.

## Setup

First, start a simple HTTP server to capture the metrics payloads. In a terminal, run one of the following:

```bash
# Option 1: Python (built-in; serves GET/HEAD only, so POSTs are logged but answered with 501 and bodies are not shown)
python -m http.server 8888

# Option 2: npx http-echo-server (shows request bodies)
npx http-echo-server 8888

# Option 3: Python with request body logging (recommended)
python examples/mock_metrics_server.py
```
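
The contents of `examples/mock_metrics_server.py` are not shown in this notebook. If that script is unavailable, a body-logging server can be sketched with the standard library alone. The version below is a minimal stand-in, not the project's script: the names `RECEIVED`, `MetricsHandler`, and `make_server` are hypothetical, and it accepts a POST on any path because the endpoint paths DeepFabric uses are not specified here.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory record of every POST received, for later inspection.
RECEIVED = []


class MetricsHandler(BaseHTTPRequestHandler):
    """Accepts POSTs on any path, prints the body, and replies 200."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        raw = self.rfile.read(length)
        try:
            payload = json.loads(raw or b"{}")
        except ValueError:
            # Not valid JSON: keep the raw text so nothing is lost.
            payload = {"raw": raw.decode("utf-8", errors="replace")}

        RECEIVED.append({"path": self.path, "payload": payload})
        print(f"POST {self.path}")
        print(json.dumps(payload, indent=2))

        body = b'{"status": "ok"}'
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Suppress the default one-line access log; bodies are printed above.
        pass


def make_server(port=8888):
    """Build (but do not start) a server bound to the loopback interface."""
    return HTTPServer(("127.0.0.1", port), MetricsHandler)
```

To use it, save the block to a file, add a final line `make_server(8888).serve_forever()`, and run the file in a terminal; stop it with Ctrl+C.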

In [None]:
# Set environment variables BEFORE importing deepfabric
import os

# Point to our local test server (or your zrok URL for Colab testing)
os.environ["DEEPFABRIC_API_URL"] = "http://localhost:8888"

# Optional: Set API key via env var (or be prompted interactively)
# os.environ["DEEPFABRIC_API_KEY"] = "your-api-key"

In [None]:
# Check the training status
!deepfabric training status

In [None]:
import torch
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

from deepfabric.training import DeepFabricCallback

print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")

In [None]:
# Use a tiny model for fast testing
# sshleifer/tiny-gpt2 is only ~500KB!
MODEL_NAME = "sshleifer/tiny-gpt2"

print(f"Loading model: {MODEL_NAME}")
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

# Set pad token
tokenizer.pad_token = tokenizer.eos_token
model.config.pad_token_id = tokenizer.pad_token_id

print(f"Model parameters: {model.num_parameters():,}")

In [None]:
# Create a tiny synthetic dataset
texts = [
    "The quick brown fox jumps over the lazy dog.",
    "Machine learning is a subset of artificial intelligence.",
    "Python is a popular programming language.",
    "Deep learning models require large amounts of data.",
    "Natural language processing enables computers to understand text.",
    "Neural networks are inspired by the human brain.",
    "Training large models requires significant compute resources.",
    "Fine-tuning adapts pre-trained models to specific tasks.",
] * 10  # Repeat 10x: 8 sentences -> 80 samples

# Tokenize
def tokenize_function(examples):
    return tokenizer(
        examples["text"],
        truncation=True,
        max_length=64,
        padding="max_length",
    )

dataset = Dataset.from_dict({"text": texts})
tokenized_dataset = dataset.map(tokenize_function, batched=True, remove_columns=["text"])

print(f"Dataset size: {len(tokenized_dataset)} samples")

In [None]:
# Training arguments - very short training for demo
training_args = TrainingArguments(
    output_dir="./demo_output",
    num_train_epochs=2,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    warmup_steps=5,
    learning_rate=5e-4,
    logging_steps=5,  # Log every 5 steps
    save_strategy="no",  # Don't save checkpoints for demo
    report_to=[],  # Disable default reporters (wandb, etc.)
    # Force CPU for simplicity (use_cpu is a boolean; set it to False to let
    # Trainer use CUDA or MPS when available)
    use_cpu=True,
)

print("Training arguments configured")

In [None]:
# Data collator for language modeling
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm=False,  # Causal LM, not masked LM
)

# Create Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset,
    data_collator=data_collator,
)

# Add DeepFabric callback for metrics logging
# Prompts for an API key if DEEPFABRIC_API_KEY is not set in the environment
trainer.add_callback(DeepFabricCallback(trainer))

# Check what callbacks are registered
print("Registered callbacks:")
for cb in trainer.callback_handler.callbacks:
    print(f"  - {type(cb).__name__}")

In [None]:
# Train!
# Watch your HTTP server terminal - you should see metrics being sent
print("Starting training...")
print("Check your HTTP server for incoming metrics!\n")

trainer.train()

print("\nTraining complete!")

In [None]:
# Clean up
import shutil

shutil.rmtree("./demo_output", ignore_errors=True)
print("Cleaned up demo output directory")

## Expected Metrics

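You should see POST requests to your server with payloads like the following. Values such as `run_id`, `loss`, and `grad_norm` are placeholders and will differ in your run: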

### Run Start Event
```json
{
  "event_type": "run_start",
  "run_id": "uuid-here",
  "model_name": "sshleifer/tiny-gpt2",
  "training_config": {
    "num_train_epochs": 2,
    "per_device_train_batch_size": 4,
    "learning_rate": 0.0005
  }
}
```

### Metrics Batch
```json
{
  "metrics": [
    {
      "run_id": "uuid-here",
      "global_step": 5,
      "epoch": 0.25,
      "metrics": {
        "loss": 4.123,
        "learning_rate": 0.0005,
        "grad_norm": 1.234
      }
    }
  ]
}
```

### Run End Event
```json
{
  "event_type": "run_end",
  "run_id": "uuid-here",
  "final_step": 40,
  "final_epoch": 2.0
}
```
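
The step and epoch values in these payloads follow from the dataset size and training arguments used above. The arithmetic can be checked with a short sketch, assuming a single device, no gradient accumulation, and the default of keeping the last batch; the variable names are illustrative only.

```python
import math

# Values taken from the cells above.
num_samples = 8 * 10               # 8 sentences repeated 10 times
per_device_train_batch_size = 4
num_train_epochs = 2
logging_steps = 5

# Assumes one device and gradient_accumulation_steps=1.
steps_per_epoch = math.ceil(num_samples / per_device_train_batch_size)
total_steps = steps_per_epoch * num_train_epochs

# Steps at which a log entry is expected (every `logging_steps` steps).
logged_steps = list(range(logging_steps, total_steps + 1, logging_steps))

print(steps_per_epoch)                    # 20
print(total_steps)                        # 40, matches "final_step"
print(logged_steps[0] / steps_per_epoch)  # 0.25, matches "epoch" at step 5
print(len(logged_steps))                  # 8 logged steps over the run
```

If your server shows a different `final_step`, check whether the batch size, the number of devices, or gradient accumulation differs from these assumptions.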