# 08 - Model Evaluation

Evaluate trained checkpoints on the test set, compare several checkpoints, and inspect the resulting metrics.

## Learning Objectives
- Evaluate checkpoints on test set
- Compare multiple models
- Analyze failure cases
- Generate evaluation reports

In [None]:
# Setup
import sys
from pathlib import Path

# Assumes the notebook is run from a direct subdirectory of the project root
project_root = Path.cwd().parent
sys.path.insert(0, str(project_root / 'src'))

print(f'Project root: {project_root}')

## 1. Single Checkpoint Evaluation

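The cell below prints the command that evaluates one checkpoint on the test set. Run the command from the project root in a terminal: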

In [None]:
# Command-line evaluation
print('To evaluate a checkpoint:')
print()
print('.venv/bin/python scripts/evaluate_checkpoint.py \\')
print('    --checkpoint outputs/best_from_sweep/checkpoint_best.pt \\')
print('    --test-data outputs/processed_current/test_sequences.npz \\')
print('    --vocab-path data/vocabulary.json \\')
print('    --output outputs/evaluation_results')
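
The same command can be assembled in Python, which makes it easy to check that the input files exist before starting a long evaluation. This is a minimal sketch, not part of the project's scripts: `build_eval_command` and `missing_inputs` are hypothetical helpers, and the only flags used are the ones shown in the printed command.

```python
from pathlib import Path


def build_eval_command(checkpoint, test_data, vocab_path, output,
                       python='.venv/bin/python'):
    """Return the evaluation command as an argument list (hypothetical helper)."""
    return [
        python, 'scripts/evaluate_checkpoint.py',
        '--checkpoint', str(checkpoint),
        '--test-data', str(test_data),
        '--vocab-path', str(vocab_path),
        '--output', str(output),
    ]


def missing_inputs(*paths):
    """Return the given paths that do not exist on disk."""
    return [str(p) for p in paths if not Path(p).exists()]


cmd = build_eval_command(
    'outputs/best_from_sweep/checkpoint_best.pt',
    'outputs/processed_current/test_sequences.npz',
    'data/vocabulary.json',
    'outputs/evaluation_results',
)

# cmd[3], cmd[5] and cmd[7] are the checkpoint, test data and vocabulary paths
missing = missing_inputs(cmd[3], cmd[5], cmd[7])
if missing:
    print('Missing inputs:', ', '.join(missing))
else:
    print('Ready to run:', ' '.join(cmd))
```

If nothing is missing, the list can be passed to `subprocess.run(cmd, cwd=project_root, check=True)` to launch the evaluation from the notebook.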

## 2. Compare Multiple Checkpoints

In [None]:
# Command-line comparison
print('To compare checkpoints:')
print()
print('.venv/bin/python scripts/compare_checkpoints.py \\')
print('    --checkpoints CP1.pt CP2.pt CP3.pt \\')
print('    --test-data outputs/processed_current/test_sequences.npz \\')
print('    --vocab-path data/vocabulary.json \\')
print('    --output outputs/checkpoint_comparison')
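
`CP1.pt CP2.pt CP3.pt` are placeholders for real checkpoint paths. The sketch below shows one way to fill them in, assuming (hypothetically) that each run directory under `outputs/` contains a file named `checkpoint_best.pt`. `find_checkpoints` and `format_compare_command` are illustrative helpers, not project code.

```python
from pathlib import Path


def find_checkpoints(outputs_dir, name='checkpoint_best.pt'):
    """Return sorted paths of files called `name` one level below outputs_dir."""
    return sorted(Path(outputs_dir).glob(f'*/{name}'))


def format_compare_command(checkpoints, test_data, vocab_path, output):
    """Format the comparison command as a multi-line shell string."""
    lines = [
        '.venv/bin/python scripts/compare_checkpoints.py',
        '    --checkpoints ' + ' '.join(str(c) for c in checkpoints),
        f'    --test-data {test_data}',
        f'    --vocab-path {vocab_path}',
        f'    --output {output}',
    ]
    # Join with a trailing backslash so the shell treats it as one command
    return ' \\\n'.join(lines)


checkpoints = find_checkpoints('outputs')
if checkpoints:
    print(format_compare_command(
        checkpoints,
        'outputs/processed_current/test_sequences.npz',
        'data/vocabulary.json',
        'outputs/checkpoint_comparison',
    ))
else:
    print('No checkpoints found under outputs/')
```

Adjust the directory and file name to match wherever your training runs actually save their checkpoints.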

## 3. View Results

In [None]:
# Load evaluation results
import json

results_path = project_root / 'outputs' / 'evaluation_results' / 'evaluation_results.json'

if results_path.exists():
    with open(results_path, 'r') as f:
        results = json.load(f)
    
    metrics = results['metrics']
    print('Evaluation Metrics:')
    print(f'  Command Acc: {metrics["command_acc"]:.4f}')
    print(f'  Param Type Acc: {metrics["param_type_acc"]:.4f}')
    print(f'  Param Value Acc: {metrics["param_value_acc"]:.4f}')
    print(f'  Overall Acc: {metrics["overall_acc"]:.4f}')
else:
    print(f'No evaluation results found at {results_path}')
    print('Run the evaluation command from Section 1 first.')
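
To view several evaluation runs side by side, the same four metrics can be collected into a small ranked table. This is a minimal sketch that assumes every results file uses the `metrics` layout read above. `rank_runs` and `format_table` are hypothetical helpers, and the numbers in `example_runs` are made-up placeholders, not real results.

```python
METRIC_KEYS = ['command_acc', 'param_type_acc', 'param_value_acc', 'overall_acc']


def rank_runs(runs, sort_by='overall_acc'):
    """Sort (name, metrics) pairs by one metric, best first."""
    return sorted(runs.items(), key=lambda item: item[1][sort_by], reverse=True)


def format_table(ranked):
    """Render ranked runs as a fixed-width text table."""
    header = f"{'run':<12}" + ''.join(f'{k:>18}' for k in METRIC_KEYS)
    rows = [
        f'{name:<12}' + ''.join(f'{m[k]:>18.4f}' for k in METRIC_KEYS)
        for name, m in ranked
    ]
    return '\n'.join([header] + rows)


# Placeholder values for illustration only, not measured results
example_runs = {
    'run_a': {'command_acc': 0.5, 'param_type_acc': 0.5,
              'param_value_acc': 0.5, 'overall_acc': 0.5},
    'run_b': {'command_acc': 0.75, 'param_type_acc': 0.75,
              'param_value_acc': 0.75, 'overall_acc': 0.75},
}

print(format_table(rank_runs(example_runs)))
```

In practice, each inner dictionary would come from `json.load(f)['metrics']` for one results file, keyed by a name of your choice.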

## Summary

You learned how to:
- Evaluate a single checkpoint on the test set
- Compare multiple checkpoints
- Load and read the evaluation metrics
- Generate evaluation reports

## Congratulations!

You've completed all the tutorial notebooks and have now worked through the full G-code fingerprinting project.