# Model Diff Tutorial

This notebook provides an interactive tutorial for the Model Diff library.

## Overview

Model Diff is a library for comparing model checkpoints to understand:
- What changed during training
- Which layers updated the most
- Training convergence patterns
- Model architecture differences

## Installation

```bash
pip install model-diff
```

In [None]:
# Import the library
from model_diff import (
    ModelDiff,
    DiffConfig,
    CheckpointLoader,
)

print("Model Diff loaded successfully!")

## Creating Sample Checkpoints

Let's create two sample checkpoints to demonstrate the comparison functionality. Each one is a plain dictionary mapping tensor names to nested lists of floats.

In [None]:
# Create sample checkpoints
checkpoint_epoch_0 = {
    "encoder.layer1.weight": [[0.1, 0.2], [0.3, 0.4]],
    "encoder.layer1.bias": [0.0, 0.0],
    "encoder.layer2.weight": [[0.5, 0.5], [0.5, 0.5]],
    "decoder.layer1.weight": [[0.2, 0.2], [0.2, 0.2]],
}

checkpoint_epoch_10 = {
    "encoder.layer1.weight": [[0.15, 0.25], [0.35, 0.45]],
    "encoder.layer1.bias": [0.01, -0.01],
    "encoder.layer2.weight": [[0.55, 0.48], [0.52, 0.51]],
    "decoder.layer1.weight": [[0.3, 0.25], [0.22, 0.28]],
}

print("Checkpoints created:")
print(f"  Epoch 0: {len(checkpoint_epoch_0)} tensors")
print(f"  Epoch 10: {len(checkpoint_epoch_10)} tensors")

## Basic Comparison

Use `ModelDiff` to compare two checkpoints and see what changed.

In [None]:
# Create a differ
differ = ModelDiff()

# Compare checkpoints
diff = differ.compare(checkpoint_epoch_0, checkpoint_epoch_10)

print("Comparison Results:")
print(f"  Total layers compared: {diff.total_layers}")
print(f"  Modified layers: {len(diff.modified_layers)}")
print(f"  Unchanged layers: {len(diff.unchanged_layers)}")
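
The tutorial does not show how a layer is classified as modified or unchanged. As a rough sketch only, assume a layer is modified when any element differs by more than a small absolute tolerance. `flatten` and `is_modified` are hypothetical helpers written for this illustration, not part of Model Diff.

```python
import math

def flatten(values):
    """Recursively flatten nested lists of numbers into a flat list of floats."""
    if isinstance(values, (int, float)):
        return [float(values)]
    flat = []
    for item in values:
        flat.extend(flatten(item))
    return flat

def is_modified(old, new, tol=1e-9):
    """Assumed criterion: any element differs by more than tol."""
    old_flat, new_flat = flatten(old), flatten(new)
    if len(old_flat) != len(new_flat):
        return True  # a size change is treated as a modification
    return any(
        not math.isclose(a, b, rel_tol=0.0, abs_tol=tol)
        for a, b in zip(old_flat, new_flat)
    )

# Toy checkpoints: the weight changes, the bias does not.
before = {"w": [[0.1, 0.2], [0.3, 0.4]], "b": [0.0, 0.0]}
after = {"w": [[0.15, 0.25], [0.35, 0.45]], "b": [0.0, 0.0]}

modified = [k for k in before if is_modified(before[k], after[k])]
unchanged = [k for k in before if not is_modified(before[k], after[k])]
print(modified, unchanged)
```

A tolerance-based comparison avoids flagging layers whose values differ only by floating-point noise.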

## Analyzing Layer Changes

Let's dive deeper into which layers changed the most.

In [None]:
# Analyze each layer
print("Layer-by-Layer Analysis:")
print("-" * 50)

for layer_name, layer_diff in diff.layer_diffs.items():
    print(f"\n{layer_name}:")
    print(f"  Change magnitude: {layer_diff.magnitude:.6f}")
    print(f"  Relative change: {layer_diff.relative_change:.2%}")
    print(f"  Max element change: {layer_diff.max_change:.6f}")
    print(f"  Mean element change: {layer_diff.mean_change:.6f}")
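
The exact definitions behind `magnitude`, `relative_change`, `max_change`, and `mean_change` are not given in this notebook. The sketch below uses common choices as an assumption: the L2 norm of the update, that norm divided by the L2 norm of the old weights, and the maximum and mean absolute element change. `layer_metrics` is a hypothetical helper, and the library may compute these differently.

```python
import math

def layer_metrics(old, new):
    """Compute change metrics for two non-empty, equal-length flat lists."""
    deltas = [b - a for a, b in zip(old, new)]
    abs_deltas = [abs(d) for d in deltas]
    magnitude = math.sqrt(sum(d * d for d in deltas))  # L2 norm of the update
    old_norm = math.sqrt(sum(a * a for a in old))      # L2 norm of old weights
    if old_norm > 0:
        relative = magnitude / old_norm
    else:
        # Old weights are all zero: relative change is undefined unless nothing moved.
        relative = 0.0 if magnitude == 0 else float("inf")
    return {
        "magnitude": magnitude,
        "relative_change": relative,
        "max_change": max(abs_deltas),
        "mean_change": sum(abs_deltas) / len(abs_deltas),
    }

# encoder.layer1.weight from the sample checkpoints, flattened row by row
old = [0.1, 0.2, 0.3, 0.4]
new = [0.15, 0.25, 0.35, 0.45]
metrics = layer_metrics(old, new)
for name, value in metrics.items():
    print(f"{name}: {value:.6f}")
```

Under these assumed definitions, every element moves by 0.05, so the magnitude is 0.1 and the relative change is about 18%.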

## Configuration Options

Customize the comparison with `DiffConfig`.

In [None]:
# Custom configuration
config = DiffConfig(
    ignore_layers=["*bias*"],  # Ignore bias terms
    threshold=0.01,  # Only report changes > 1%
    compute_statistics=True,
    normalize=True,  # Normalize by layer size
)

differ = ModelDiff(config)
diff = differ.compare(checkpoint_epoch_0, checkpoint_epoch_10)

print(f"Config: threshold={config.threshold}, ignore bias=True")
print(f"Significant changes found: {len(diff.significant_changes)}")
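
To illustrate what a pattern such as `"*bias*"` and a threshold could do, here is a minimal sketch that assumes shell-style glob matching and a simple greater-than cutoff. Whether `DiffConfig` matches patterns this way is an assumption. `filter_layers` is a hypothetical helper, and the relative-change numbers are made up for illustration.

```python
from fnmatch import fnmatchcase

def filter_layers(names, ignore_patterns):
    """Drop layer names that match any shell-style pattern (case-sensitive)."""
    return [
        name for name in names
        if not any(fnmatchcase(name, pattern) for pattern in ignore_patterns)
    ]

names = [
    "encoder.layer1.weight",
    "encoder.layer1.bias",
    "encoder.layer2.weight",
    "decoder.layer1.weight",
]
kept = filter_layers(names, ["*bias*"])
print(kept)

# Made-up relative changes; keep only those strictly above the threshold.
relative_changes = {"layer_a": 0.18, "layer_b": 0.005}
threshold = 0.01
significant = {k: v for k, v in relative_changes.items() if v > threshold}
print(significant)
```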

## Generating Reports

Summarize a diff as a report with parameter counts, change magnitudes, and the most and least changed layers.

In [None]:
# Generate a detailed report
report = differ.generate_report(diff)

print("=" * 50)
print("DIFF REPORT")
print("=" * 50)
print(f"\nTotal Parameters: {report.total_parameters:,}")
print(f"Changed Parameters: {report.changed_parameters:,}")
print(f"Change Ratio: {report.change_ratio:.2%}")
print(f"\nAverage Change Magnitude: {report.avg_magnitude:.6f}")
print(f"Max Change Magnitude: {report.max_magnitude:.6f}")
print(f"\nMost Changed Layer: {report.most_changed_layer}")
print(f"Least Changed Layer: {report.least_changed_layer}")
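
The report fields above suggest element-level counting. As a sketch under that assumption, the following counts elements that moved by more than a tolerance and ranks layers by the L2 norm of their update. `summarize` and the toy data are hypothetical, and the real report may be computed differently.

```python
import math

def summarize(before, after, tol=1e-9):
    """Summarize element-level changes for non-empty dicts of flat lists."""
    total = 0
    changed = 0
    magnitudes = {}
    for name, old in before.items():
        deltas = [b - a for a, b in zip(old, after[name])]
        total += len(old)
        changed += sum(1 for d in deltas if abs(d) > tol)
        magnitudes[name] = math.sqrt(sum(d * d for d in deltas))
    return {
        "total_parameters": total,
        "changed_parameters": changed,
        "change_ratio": changed / total if total else 0.0,
        "most_changed_layer": max(magnitudes, key=magnitudes.get),
        "least_changed_layer": min(magnitudes, key=magnitudes.get),
    }

# Toy data: one element changes in each layer.
before = {"a": [1.0, 2.0, 3.0, 4.0], "b": [0.5, 0.5]}
after = {"a": [1.0, 2.5, 3.0, 4.0], "b": [0.5, 0.75]}
summary = summarize(before, after)
print(summary)
```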

## Tracking Training Progress

Compare consecutive checkpoints to see whether updates shrink as training converges.

In [None]:
# Simulate training checkpoints
import random

random.seed(0)  # Fixed seed so the printed changes are reproducible

def generate_checkpoint(epoch, base_weights):
    """Simulate weight updates during training (assumes 2-D weight lists)."""
    # The update bound shrinks with epoch, so later updates tend to be smaller
    update_scale = 0.1 / (1 + epoch * 0.1)
    return {
        k: [[v + random.uniform(-update_scale, update_scale)
             for v in row] for row in vals]
        for k, vals in base_weights.items()
    }

# Generate checkpoints
base = {"layer1": [[0.5, 0.5], [0.5, 0.5]]}
checkpoints = [base]
for epoch in range(1, 11):
    checkpoints.append(generate_checkpoint(epoch, checkpoints[-1]))

# Track changes over training
print("Training Progress:")
print("-" * 40)
differ = ModelDiff()

for i in range(1, len(checkpoints)):
    diff = differ.compare(checkpoints[i-1], checkpoints[i])
    print(f"Epoch {i-1} -> {i}: change = {diff.total_change:.6f}")
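
Because each update is random, the printed change need not decrease at every step; only its upper bound does. The sketch below reproduces the idea without the library, assuming the total change is the L2 norm of the weight difference (an assumption about `total_change`). `update_scale`, `step`, and `l2_change` are hypothetical helpers.

```python
import math
import random

random.seed(0)  # fixed seed so the run is repeatable

def update_scale(epoch):
    """Same decay schedule as generate_checkpoint above."""
    return 0.1 / (1 + epoch * 0.1)

def step(weights, epoch):
    """Perturb a flat weight list by uniform noise within the epoch's scale."""
    scale = update_scale(epoch)
    return [w + random.uniform(-scale, scale) for w in weights]

def l2_change(old, new):
    """L2 norm of the element-wise difference."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(old, new)))

weights = [0.5, 0.5, 0.5, 0.5]
changes = []
for epoch in range(1, 11):
    updated = step(weights, epoch)
    changes.append(l2_change(weights, updated))
    weights = updated

# With 4 elements each bounded by the scale, the L2 change is at most scale * sqrt(4).
bounds = [update_scale(epoch) * math.sqrt(4) for epoch in range(1, 11)]
for epoch, (change, bound) in enumerate(zip(changes, bounds), start=1):
    print(f"Epoch {epoch - 1} -> {epoch}: change = {change:.6f} (bound {bound:.6f})")
```

The bound falls from about 0.18 at epoch 1 to 0.10 at epoch 10, which is the convergence trend the simulation is designed to show.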

## Detecting Architecture Changes

Identify when layers are added or removed between checkpoints.

In [None]:
# Checkpoints with different architectures
checkpoint_v1 = {
    "layer1.weight": [[1.0, 2.0]],
    "layer2.weight": [[3.0, 4.0]],
}

checkpoint_v2 = {
    "layer1.weight": [[1.0, 2.0]],
    # layer2 removed
    "layer3.weight": [[5.0, 6.0]],  # new layer
    "layer4.weight": [[7.0, 8.0]],  # new layer
}

diff = differ.compare(checkpoint_v1, checkpoint_v2)

print("Architecture Changes:")
print(f"  Added layers: {diff.added_layers}")
print(f"  Removed layers: {diff.removed_layers}")
print(f"  Common layers: {diff.common_layers}")
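
Conceptually, added, removed, and common layers follow from set operations on the checkpoint keys. This is a minimal sketch of that idea, not the library's implementation; `compare_keys` is a hypothetical helper, and results are sorted only to make the output deterministic.

```python
def compare_keys(old, new):
    """Classify layer names by presence in the old and new checkpoints."""
    old_keys, new_keys = set(old), set(new)
    return {
        "added": sorted(new_keys - old_keys),
        "removed": sorted(old_keys - new_keys),
        "common": sorted(old_keys & new_keys),
    }

v1 = {
    "layer1.weight": [[1.0, 2.0]],
    "layer2.weight": [[3.0, 4.0]],
}
v2 = {
    "layer1.weight": [[1.0, 2.0]],
    "layer3.weight": [[5.0, 6.0]],
    "layer4.weight": [[7.0, 8.0]],
}

changes = compare_keys(v1, v2)
print(changes)
```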

## Exporting Results

Export a diff as JSON or CSV for further analysis.

In [None]:
# Export to JSON
json_output = differ.export_diff(diff, format="json")
print("JSON Export (preview):")
print(json_output[:200] + "...")

# Export to CSV
csv_output = differ.export_diff(diff, format="csv")
print("\nCSV Export (preview):")
print(csv_output[:200] + "...")
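
The output schema of `export_diff` is not shown here. For readers who want to post-process results themselves, the following sketch serializes a hand-built list of rows with the standard `json` and `csv` modules. The `layer` and `status` columns are an assumed schema based on the architecture example above, not the library's format.

```python
import csv
import io
import json

# Assumed row schema: one record per layer with its status.
rows = [
    {"layer": "layer1.weight", "status": "common"},
    {"layer": "layer2.weight", "status": "removed"},
    {"layer": "layer3.weight", "status": "added"},
    {"layer": "layer4.weight", "status": "added"},
]

# JSON: a single string, easy to reload with json.loads.
json_output = json.dumps(rows, indent=2)

# CSV: write to an in-memory buffer, then read the text back.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["layer", "status"])
writer.writeheader()
writer.writerows(rows)
csv_output = buffer.getvalue()

print(json_output[:80] + "...")
print(csv_output)
```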

## Conclusion

This tutorial covered using Model Diff for:
- Comparing model checkpoints
- Analyzing per-layer changes
- Tracking training progress
- Detecting architecture changes
- Generating reports and exporting results

For more examples, see the `examples/` directory in the repository.