# Quick Start Guide - Experimental Template Matching

This notebook provides a quick introduction to the experimental template matching system.
You'll learn how to:

1. Load the configuration and train a small demo model
2. Make predictions on new images
3. Visualize landmarks and confidence scores
4. Compare with ground truth
5. Analyze prediction errors

**Expected runtime: ~2 minutes**

## Prerequisites

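- For this demo, nothing beyond the project checkout: the notebook generates synthetic images and trains its own model
- For real data, a trained model (use `train_experimental.py` to create one), test images, and ground truth landmarks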
- Basic familiarity with numpy and matplotlib

In [None]:
# Import required libraries
import sys
import numpy as np
import matplotlib.pyplot as plt
import cv2
from pathlib import Path
import yaml

# Add project paths
PROJECT_ROOT = Path.cwd().parent
sys.path.insert(0, str(PROJECT_ROOT))

from core.experimental_predictor import ExperimentalLandmarkPredictor
from tests.fixtures import create_synthetic_images, create_synthetic_landmarks, get_quick_test_data

print("Libraries imported successfully!")
print(f"Project root: {PROJECT_ROOT}")

## 1. Load Configuration and Create Test Data

First, let's load the default configuration and create some synthetic test data for demonstration.

In [None]:
# Load default configuration
config_path = PROJECT_ROOT / "configs" / "default_config.yaml"
with open(config_path, 'r') as f:
    config = yaml.safe_load(f)

print("Configuration loaded:")
print(f"  Patch size: {config['eigenpatches']['patch_size']}")
print(f"  PCA components: {config['eigenpatches']['n_components']}")
print(f"  Pyramid levels: {config['eigenpatches']['pyramid_levels']}")
print(f"  Lambda shape: {config['landmark_predictor']['lambda_shape']}")
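
The cell above indexes the configuration by nested keys, so a missing key only surfaces as a `KeyError` partway through the printout. A minimal sketch of an upfront check, assuming the config is a plain nested dict as returned by `yaml.safe_load`. The helper name `find_missing_keys` and the sample values are hypothetical; the key names are the ones this notebook reads.

```python
# Hypothetical helper: report which nested keys a config dict lacks.
REQUIRED_KEYS = {
    "eigenpatches": ["patch_size", "n_components", "pyramid_levels"],
    "landmark_predictor": ["lambda_shape", "max_iterations"],
}


def find_missing_keys(config, required=REQUIRED_KEYS):
    """Return dotted names of required keys absent from a nested config dict."""
    missing = []
    for section, keys in required.items():
        section_values = config.get(section)
        if not isinstance(section_values, dict):
            # The whole section is absent (or is not a mapping).
            missing.append(section)
            continue
        missing.extend(f"{section}.{key}" for key in keys if key not in section_values)
    return missing


# Placeholder values for illustration only, not the project's defaults.
example_config = {
    "eigenpatches": {"patch_size": 21, "n_components": 10},
    "landmark_predictor": {"lambda_shape": 0.1, "max_iterations": 50},
}
print(find_missing_keys(example_config))  # ['eigenpatches.pyramid_levels']
```

Calling `find_missing_keys(config)` right after loading the YAML file would give one readable list of problems before training starts.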

In [None]:
# Create synthetic test data for demonstration
print("Creating synthetic test data...")

# Create training data
train_images = create_synthetic_images(n_images=10, image_size=(128, 128), seed=42)
train_landmarks = create_synthetic_landmarks(n_images=10, image_size=(128, 128), seed=42)

# Create test data
test_images = create_synthetic_images(n_images=3, image_size=(128, 128), seed=100)
test_landmarks = create_synthetic_landmarks(n_images=3, image_size=(128, 128), seed=100)

print(f"Created {len(train_images)} training images and {len(test_images)} test images")
print(f"Image size: {train_images[0].shape}")
print(f"Landmarks per image: {len(train_landmarks[0])}")

## 2. Train a Model (Quick Demo)

For this demo, we'll quickly train a model on synthetic data. In practice, you would use `train_experimental.py` with real data.

In [None]:
# Initialize experimental predictor
print("Initializing experimental landmark predictor...")
predictor = ExperimentalLandmarkPredictor(config=config)

# Train the model (this will take ~30-60 seconds)
print("Training model on synthetic data...")
predictor.train(train_images, train_landmarks)

print("Training completed!")
print(f"Model statistics: {predictor.get_prediction_statistics()}")

## 3. Make Predictions

Now let's use the trained model to predict landmarks on test images.

In [None]:
# Make predictions on test images
print("Making predictions on test images...")

results = []
for i, test_image in enumerate(test_images):
    print(f"  Processing image {i+1}/{len(test_images)}...")
    
    # Predict landmarks with detailed results
    result = predictor.predict_landmarks(test_image, return_detailed=True)
    results.append(result)
    
    print(f"    Processing time: {result.processing_time:.3f} seconds")
    print(f"    Iterations: {result.iterations}")
    print(f"    Convergence error: {result.convergence_error:.3f}")

print("Predictions completed!")

## 4. Visualize Results

Let's visualize the predictions compared to ground truth landmarks.

In [None]:
# Visualize predictions vs ground truth
fig, axes = plt.subplots(1, len(test_images), figsize=(15, 5))
if len(test_images) == 1:
    axes = [axes]

for i, (image, result, gt_landmarks) in enumerate(zip(test_images, results, test_landmarks)):
    ax = axes[i]
    
    # Display image
    ax.imshow(image, cmap='gray')
    
    # Plot ground truth landmarks (green)
    gt_x, gt_y = gt_landmarks[:, 0], gt_landmarks[:, 1]
    ax.scatter(gt_x, gt_y, c='green', s=30, alpha=0.8, label='Ground Truth', marker='o')
    
    # Plot predicted landmarks (red)
    pred_x, pred_y = result.landmarks[:, 0], result.landmarks[:, 1]
    ax.scatter(pred_x, pred_y, c='red', s=30, alpha=0.8, label='Predicted', marker='x')
    
    # Draw lines connecting corresponding points
    for j in range(len(gt_landmarks)):
        ax.plot([gt_x[j], pred_x[j]], [gt_y[j], pred_y[j]], 'b-', alpha=0.3, linewidth=1)
    
    # Compute and display error
    errors = np.linalg.norm(result.landmarks - gt_landmarks, axis=1)
    mean_error = np.mean(errors)
    
    ax.set_title(f'Test Image {i+1}\nMean Error: {mean_error:.2f} pixels')
    ax.set_xlabel('X coordinate')
    ax.set_ylabel('Y coordinate')
    ax.legend()
    ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# Print summary statistics
all_errors = []
for result, gt_landmarks in zip(results, test_landmarks):
    errors = np.linalg.norm(result.landmarks - gt_landmarks, axis=1)
    all_errors.extend(errors)

all_errors = np.array(all_errors)
print("\nPrediction Summary:")
print(f"  Mean error: {np.mean(all_errors):.3f} ± {np.std(all_errors):.3f} pixels")
print(f"  Median error: {np.median(all_errors):.3f} pixels")
print(f"  Min error: {np.min(all_errors):.3f} pixels")
print(f"  Max error: {np.max(all_errors):.3f} pixels")
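
The error metric used throughout is the Euclidean distance between each predicted landmark and its ground truth counterpart, computed row-wise with `np.linalg.norm(..., axis=1)`. A small worked example, assuming landmarks are stored as an `(n_landmarks, 2)` array of `(x, y)` pixel coordinates, as the indexing in the cell above implies. The coordinates are made up.

```python
import numpy as np

# Three landmarks as (x, y) pixel coordinates; values are illustrative only.
gt = np.array([[10.0, 20.0], [30.0, 40.0], [50.0, 60.0]])
pred = np.array([[13.0, 24.0], [30.0, 40.0], [50.0, 54.0]])

# One distance per landmark: sqrt(dx**2 + dy**2) along each row.
errors = np.linalg.norm(pred - gt, axis=1)
print(errors)         # [5. 0. 6.]
print(errors.mean())  # mean error over the three landmarks
```

The first landmark is off by (3, 4), giving a 5 pixel error; the second matches exactly; the third is off by 6 pixels in y only.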

## 5. Analyze Per-Landmark Performance

Let's examine how well each individual landmark is predicted.

In [None]:
# Analyze per-landmark errors
n_landmarks = len(test_landmarks[0])
per_landmark_errors = [[] for _ in range(n_landmarks)]

for result, gt_landmarks in zip(results, test_landmarks):
    errors = np.linalg.norm(result.landmarks - gt_landmarks, axis=1)
    for j, error in enumerate(errors):
        per_landmark_errors[j].append(error)

# Convert to numpy arrays and compute statistics
landmark_stats = []
for j in range(n_landmarks):
    errors = np.array(per_landmark_errors[j])
    stats = {
        'landmark': j,
        'mean_error': np.mean(errors),
        'std_error': np.std(errors),
        'min_error': np.min(errors),
        'max_error': np.max(errors)
    }
    landmark_stats.append(stats)

# Plot per-landmark performance
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 5))

# Bar plot of mean errors
landmark_indices = range(n_landmarks)
mean_errors = [stats['mean_error'] for stats in landmark_stats]
std_errors = [stats['std_error'] for stats in landmark_stats]

bars = ax1.bar(landmark_indices, mean_errors, yerr=std_errors, capsize=5, alpha=0.7)
ax1.set_xlabel('Landmark Index')
ax1.set_ylabel('Mean Error (pixels)')
ax1.set_title('Per-Landmark Mean Error')
ax1.grid(True, alpha=0.3)

# Highlight best and worst landmarks
best_landmark = np.argmin(mean_errors)
worst_landmark = np.argmax(mean_errors)
bars[best_landmark].set_color('green')
bars[worst_landmark].set_color('red')

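# Box plot of error distributions (boxes sit at x positions 1..n)
ax2.boxplot(per_landmark_errors)
# Set tick labels separately: boxplot's `labels=` keyword is deprecated as of Matplotlib 3.9
ax2.set_xticklabels(list(landmark_indices))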
ax2.set_xlabel('Landmark Index')
ax2.set_ylabel('Error (pixels)')
ax2.set_title('Per-Landmark Error Distribution')
ax2.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# Print landmark analysis
print("Per-Landmark Analysis:")
print(f"  Best landmark: #{best_landmark} (mean error: {mean_errors[best_landmark]:.3f} pixels)")
print(f"  Worst landmark: #{worst_landmark} (mean error: {mean_errors[worst_landmark]:.3f} pixels)")
print(f"  Error range: {np.max(mean_errors) - np.min(mean_errors):.3f} pixels")
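
The per-landmark loop can also be written as array operations. A sketch under the assumption that every image has the same number of landmarks, so predictions and ground truth stack into `(n_images, n_landmarks, 2)` arrays. The function name `per_landmark_stats` and the sample data are illustrative.

```python
import numpy as np


def per_landmark_stats(pred, gt):
    """Per-landmark error statistics; pred and gt have shape (n_images, n_landmarks, 2)."""
    # Distance per landmark per image -> shape (n_images, n_landmarks).
    errors = np.linalg.norm(np.asarray(pred) - np.asarray(gt), axis=2)
    # Reduce over the image axis to get one value per landmark.
    return {
        "mean_error": errors.mean(axis=0),
        "std_error": errors.std(axis=0),
        "min_error": errors.min(axis=0),
        "max_error": errors.max(axis=0),
    }


# Two images, three landmarks; only landmark 1 is mispredicted.
gt = np.zeros((2, 3, 2))
pred = np.zeros((2, 3, 2))
pred[0, 1] = [3.0, 4.0]  # image 0, landmark 1: 5 px error
pred[1, 1] = [0.0, 1.0]  # image 1, landmark 1: 1 px error

stats = per_landmark_stats(pred, gt)
print(stats["mean_error"])                  # [0. 3. 0.]
print(int(np.argmax(stats["mean_error"])))  # 1
```

With the notebook's variables, the inputs could be built with `np.stack([r.landmarks for r in results])` and `np.stack(test_landmarks)`, provided all arrays share one shape.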

## 6. Confidence Analysis

If the model provides confidence scores, we can check how well they track prediction accuracy. Useful scores correlate negatively with error: higher confidence should mean lower error.

In [None]:
# Analyze confidence scores (if available)
try:
    confidence_results = []
    for test_image in test_images:
        landmarks, confidence = predictor.predict_with_confidence(test_image)
        confidence_results.append(confidence)
    
    # Plot confidence vs error relationship
    all_confidences = []
    all_errors = []
    
    for i, (result, gt_landmarks, confidence) in enumerate(zip(results, test_landmarks, confidence_results)):
        errors = np.linalg.norm(result.landmarks - gt_landmarks, axis=1)
        all_errors.extend(errors)
        all_confidences.extend(confidence)
    
    plt.figure(figsize=(10, 6))
    plt.scatter(all_confidences, all_errors, alpha=0.6)
    plt.xlabel('Confidence Score')
    plt.ylabel('Prediction Error (pixels)')
    plt.title('Confidence vs Prediction Error')
    plt.grid(True, alpha=0.3)
    
    # Add trend line
    z = np.polyfit(all_confidences, all_errors, 1)
    p = np.poly1d(z)
    plt.plot(all_confidences, p(all_confidences), "r--", alpha=0.8)
    
    # Compute correlation
    correlation = np.corrcoef(all_confidences, all_errors)[0, 1]
    plt.text(0.05, 0.95, f'Correlation: {correlation:.3f}', transform=plt.gca().transAxes, 
             bbox=dict(boxstyle='round', facecolor='white', alpha=0.8))
    
    plt.show()
    
    print(f"Confidence Analysis:")
    print(f"  Mean confidence: {np.mean(all_confidences):.3f}")
    print(f"  Confidence range: {np.min(all_confidences):.3f} - {np.max(all_confidences):.3f}")
    print(f"  Confidence-Error correlation: {correlation:.3f}")
    
    if correlation < -0.3:
        print("  ✓ Good: Higher confidence correlates with lower error")
    elif correlation > 0.3:
        print("  ⚠ Warning: Higher confidence correlates with higher error")
    else:
        print("  → Neutral: Weak correlation between confidence and error")
        
except Exception as e:
    print(f"Confidence analysis not available: {e}")
    print("This is normal for models without confidence prediction capability.")
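
The interpretation step above rests on the sign and size of the Pearson correlation between confidence and error. A minimal sketch with made-up numbers, using the same ±0.3 thresholds as the cell above. The function name `interpret_correlation` is hypothetical, and the `nan` branch covers the case where all confidences (or all errors) are identical, for which `np.corrcoef` returns `nan`.

```python
import numpy as np


def interpret_correlation(r, threshold=0.3):
    """Map a confidence/error correlation to a coarse label."""
    if np.isnan(r):
        return "undefined"  # zero variance in one of the inputs
    if r < -threshold:
        return "good"       # higher confidence goes with lower error
    if r > threshold:
        return "warning"    # higher confidence goes with higher error
    return "neutral"


# Made-up values: confident predictions have small errors.
confidences = np.array([0.9, 0.8, 0.6, 0.3, 0.2])
errors = np.array([1.0, 1.5, 3.0, 6.0, 7.5])

r = np.corrcoef(confidences, errors)[0, 1]
print(f"r = {r:.3f} -> {interpret_correlation(r)}")
```

With only a few test images the estimate is noisy, so a label from a run this small is a hint rather than a verdict.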

## 7. Model Performance Summary

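Let's summarize the model's performance and compare it with the baseline error from the original implementation. Because this demo trains and tests on only a handful of synthetic images, treat the comparison as illustrative.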

In [None]:
# Compute overall performance metrics
all_errors = []
processing_times = []

for result, gt_landmarks in zip(results, test_landmarks):
    errors = np.linalg.norm(result.landmarks - gt_landmarks, axis=1)
    mean_error = np.mean(errors)
    all_errors.append(mean_error)
    processing_times.append(result.processing_time)

all_errors = np.array(all_errors)
processing_times = np.array(processing_times)

# Baseline comparison
baseline_error = 5.63  # pixels (from original implementation)
method_error = np.mean(all_errors)
improvement = (baseline_error - method_error) / baseline_error * 100

print("="*60)
print("EXPERIMENTAL TEMPLATE MATCHING - PERFORMANCE SUMMARY")
print("="*60)

print(f"\nDataset Information:")
print(f"  Training images: {len(train_images)}")
print(f"  Test images: {len(test_images)}")
print(f"  Image size: {test_images[0].shape}")
print(f"  Landmarks per image: {len(test_landmarks[0])}")

print(f"\nConfiguration:")
print(f"  Patch size: {config['eigenpatches']['patch_size']}")
print(f"  PCA components: {config['eigenpatches']['n_components']}")
print(f"  Pyramid levels: {config['eigenpatches']['pyramid_levels']}")
print(f"  Lambda shape: {config['landmark_predictor']['lambda_shape']}")
print(f"  Max iterations: {config['landmark_predictor']['max_iterations']}")

print(f"\nPerformance Results:")
print(f"  Mean error: {method_error:.3f} ± {np.std(all_errors):.3f} pixels")
print(f"  Median error: {np.median(all_errors):.3f} pixels")
print(f"  Error range: {np.min(all_errors):.3f} - {np.max(all_errors):.3f} pixels")
print(f"  Mean processing time: {np.mean(processing_times):.3f} ± {np.std(processing_times):.3f} seconds")

print(f"\nBaseline Comparison:")
print(f"  Baseline error: {baseline_error:.2f} pixels")
print(f"  Method error: {method_error:.3f} pixels")
print(f"  Improvement: {improvement:+.1f}%")

if improvement > 10:
    performance_category = "🟢 Significantly Better"
elif improvement > 5:
    performance_category = "🟢 Better"
elif improvement > -5:
    performance_category = "🟡 Equivalent"
elif improvement > -10:
    performance_category = "🟠 Worse"
else:
    performance_category = "🔴 Significantly Worse"

print(f"  Category: {performance_category}")

print("\nNext Steps:")
print("  📊 Use evaluate_experimental.py for comprehensive analysis")
print("  🔬 Explore 01_Mathematical_Analysis.ipynb for deeper insights")
print("  ⚡ Try different configurations in experimental_configs.yaml")
print("  🧪 Run parameter sensitivity analysis with 04_Parameter_Sensitivity.ipynb")

print("\n" + "="*60)
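
The baseline comparison in the summary cell reduces to one formula and a ladder of thresholds. The sketch below isolates both so they can be checked by hand. The function names and the method errors are hypothetical; the baseline value and the thresholds are the ones used in this notebook.

```python
def relative_improvement(baseline_error, method_error):
    """Percentage reduction in error relative to the baseline (positive = better)."""
    return (baseline_error - method_error) / baseline_error * 100


def categorize(improvement):
    """Same thresholds as the summary cell: 5% and 10% in either direction."""
    if improvement > 10:
        return "Significantly Better"
    if improvement > 5:
        return "Better"
    if improvement > -5:
        return "Equivalent"
    if improvement > -10:
        return "Worse"
    return "Significantly Worse"


baseline = 5.63  # pixels, the baseline value quoted in this notebook
for method_error in (5.00, 5.63, 6.50):  # hypothetical method errors
    improvement = relative_improvement(baseline, method_error)
    print(f"{method_error:.2f} px -> {improvement:+.1f}% ({categorize(improvement)})")
# 5.00 px -> +11.2% (Significantly Better)
# 5.63 px -> +0.0% (Equivalent)
# 6.50 px -> -15.5% (Significantly Worse)
```

Because the comparisons are strict, an improvement of exactly 5% lands in "Equivalent" and exactly 10% in "Better".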

## 8. Save Results (Optional)

Optionally save the model and results for further analysis.

In [None]:
# Save model and results (optional)
save_results = input("Save model and results? (y/n): ").lower().startswith('y')

if save_results:
    import pickle
    from datetime import datetime
    
    # Create output directory
    output_dir = PROJECT_ROOT / "results" / "quickstart"
    output_dir.mkdir(parents=True, exist_ok=True)
    
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    
    # Save model
    model_path = output_dir / f"quickstart_model_{timestamp}.pkl"
    predictor.save(str(model_path))
    print(f"Model saved to: {model_path}")
    
    # Save results
    results_data = {
        'config': config,
        'test_results': results,
        'test_landmarks': test_landmarks,
        'performance_summary': {
            'mean_error': float(method_error),
            'std_error': float(np.std(all_errors)),
            'baseline_comparison': float(improvement),
            'performance_category': performance_category
        }
    }
    
    results_path = output_dir / f"quickstart_results_{timestamp}.pkl"
    with open(results_path, 'wb') as f:
        pickle.dump(results_data, f)
    print(f"Results saved to: {results_path}")
    
else:
    print("Results not saved.")
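
A saved results file can be read back later with `pickle.load`. The round trip below is a self-contained sketch that writes a small made-up summary to a temporary directory instead of `results/quickstart`. The file name and values are placeholders.

```python
import pickle
import tempfile
from pathlib import Path

# Made-up summary values standing in for the notebook's performance_summary.
summary = {"mean_error": 4.2, "std_error": 0.7, "performance_category": "Better"}

with tempfile.TemporaryDirectory() as tmp:
    results_path = Path(tmp) / "quickstart_results_example.pkl"

    with open(results_path, "wb") as f:
        pickle.dump({"performance_summary": summary}, f)

    # Only unpickle files you created or trust: loading can execute arbitrary code.
    with open(results_path, "rb") as f:
        loaded = pickle.load(f)

print(loaded["performance_summary"]["mean_error"])  # 4.2
```

The real results file also stores the prediction result objects. If those are instances of project classes, the project must be importable (for example via the same `sys.path` setup as the first cell) when the file is loaded.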

## Conclusion

🎉 **Congratulations!** You've successfully:

✅ Loaded and configured the experimental template matching system  
✅ Trained a model on synthetic data  
✅ Made predictions and visualized results  
✅ Analyzed per-landmark performance  
✅ Compared with baseline performance  

### What's Next?

- **📘 Mathematical Analysis**: Explore `01_Mathematical_Analysis.ipynb` for deep mathematical insights
- **📊 Results Analysis**: Use `02_Results_Analysis.ipynb` to analyze real experimental results
- **🧪 Experimentation**: Try `03_Prompt_Driven_Experiments.ipynb` for AI-assisted research
- **⚙️ Parameter Tuning**: Use `04_Parameter_Sensitivity.ipynb` to optimize performance

### Production Usage

For production use with real data:

1. **Training**: Use `train_experimental.py` with your coordinate files
2. **Processing**: Use `process_experimental.py` to process test images
3. **Evaluation**: Use `evaluate_experimental.py` for comprehensive analysis
4. **Pipeline**: Use `run_full_pipeline.py` for automated workflows

**Happy experimenting! 🚀**