# Lab 4.3.7: Reproducibility Audit

**Module:** 4.3 - MLOps & Experiment Tracking  
**Time:** 2 hours  
**Difficulty:** ⭐⭐⭐

---

## 🎯 Learning Objectives

By the end of this notebook, you will:
- [ ] Understand why reproducibility matters in ML
- [ ] Implement proper random seed management
- [ ] Capture and recreate training environments
- [ ] Verify training reproducibility systematically
- [ ] Create reproducibility audit reports

---

## 📚 Prerequisites

- Completed: Lab 4.3.6 (Model Registry)
- Knowledge of: Python, PyTorch, environment management
- Hardware: DGX Spark (128GB unified memory)

---

## 🌍 Real-World Context

**"I can't reproduce last week's results!"**

This nightmare scenario happens more often than you'd think:

| Issue | Consequence |
|-------|-------------|
| Different random seeds | Results vary by 2-5% |
| Library version mismatch | Model doesn't load |
| Missing preprocessing steps | Wrong predictions |
| GPU non-determinism | Slight metric differences |
| Data leakage in splits | Inflated test scores |

**Reproducibility Crisis in ML:**
- NeurIPS: about half of accepted papers shared code in 2018; the 2019 reproducibility program raised that to roughly 75%
- Pharmaceutical AI: FDA requires reproducible models
- Self-driving cars: Regulatory audits need exact reproduction

---

## 🧒 ELI5: What is Reproducibility?

> **Imagine you're a scientist making a volcano for the science fair.**
>
> **Not reproducible:**
> - "I added some baking soda and... stuff"
> - "It worked yesterday, I swear!"
> - "Maybe try more vinegar?"
>
> **Reproducible:**
> - "Add exactly 2 tablespoons of baking soda"
> - "Pour 50ml of white vinegar"
> - "Wait 3 seconds"
> - "BOOM! Works every time!"
>
> **In ML, reproducibility means:**
> - Same code + same data + same settings = same results
> - Every time
> - On any machine set up with the same environment

---

## Part 1: The Reproducibility Checklist

### What Affects Reproducibility?

| Factor | Example | Impact |
|--------|---------|--------|
| **Random seeds** | numpy, torch, python | High - different initialization |
| **Library versions** | torch 2.0 vs 2.1 | Medium - API changes |
| **Hardware** | GPU model, CUDA version | Low-Medium - floating point |
| **Data ordering** | Shuffle state | High - different batches |
| **Environment** | Python version, OS | Low - usually compatible |

In [None]:
import torch
import torch.nn as nn
import numpy as np
import random
import os
import json
import hashlib
import subprocess
import sys
from pathlib import Path
from datetime import datetime
from dataclasses import dataclass, asdict
from typing import Dict, Any, List, Optional, Tuple
import platform

print(f"Python: {sys.version.split()[0]}")
print(f"PyTorch: {torch.__version__}")
print(f"NumPy: {np.__version__}")
print(f"CUDA: {torch.version.cuda}")

In [None]:
# Setup directories
NOTEBOOK_DIR = Path.cwd()
MODULE_DIR = (NOTEBOOK_DIR / "..").resolve()
AUDIT_DIR = MODULE_DIR / "evaluation" / "reproducibility"
AUDIT_DIR.mkdir(parents=True, exist_ok=True)

print(f"📁 Audit reports will be saved to: {AUDIT_DIR}")

---

## Part 2: Random Seed Management

The foundation of reproducibility is proper random seed management.
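
One optional refinement, sketched here as an assumption rather than something this lab requires: derive a separate seed for each component from a single master seed, so that data shuffling and augmentation, for example, do not draw from the same stream. The helper `derive_seed` and the component names `"shuffle"` and `"augment"` are hypothetical.

```python
import hashlib
import random


def derive_seed(master_seed: int, component: str) -> int:
    """Derive a stable 32-bit seed for one named component (hypothetical helper)."""
    digest = hashlib.sha256(f"{master_seed}:{component}".encode("utf-8")).digest()
    # Keep 4 bytes so the value fits the 32-bit range NumPy's legacy seeding accepts.
    return int.from_bytes(digest[:4], "big")


# Independent generators: drawing from one does not advance the other.
shuffle_rng = random.Random(derive_seed(42, "shuffle"))
augment_rng = random.Random(derive_seed(42, "augment"))

order = list(range(10))
shuffle_rng.shuffle(order)
noise = augment_rng.random()

print(order, noise)
```

Because each generator is a local object, adding an extra random draw to the augmentation code would not change the shuffle order, which is harder to guarantee when everything shares the global RNG.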

In [None]:
def set_seed(seed: int = 42, deterministic: bool = True):
    """
    Set random seeds for all libraries to ensure reproducibility.
    
    Args:
        seed: The random seed to use
        deterministic: If True, use deterministic algorithms (may be slower)
    
    Returns:
        dict: The seed state for verification
    """
    # Python random
    random.seed(seed)
    
    # NumPy
    np.random.seed(seed)
    
    # PyTorch
    torch.manual_seed(seed)
    
    # PyTorch CUDA
    if torch.cuda.is_available():
        torch.cuda.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # For multi-GPU
    
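    # Deterministic algorithms
    if deterministic:
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False

        # cuBLAS needs this setting for deterministic matmuls on CUDA 10.2+.
        # Set it before any CUDA work so cuBLAS picks it up.
        os.environ.setdefault('CUBLAS_WORKSPACE_CONFIG', ':4096:8')

        # Ops without a deterministic implementation fail when they run, not
        # at this call, so a try/except here would catch nothing.
        # warn_only=True (PyTorch 1.11+) downgrades those errors to warnings.
        torch.use_deterministic_algorithms(True, warn_only=True)

    # PYTHONHASHSEED is read at interpreter startup: setting it here affects
    # only child processes, not str/bytes hashing in the current process.
    os.environ['PYTHONHASHSEED'] = str(seed)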
    
    return {
        'seed': seed,
        'deterministic': deterministic,
        'python_hash_seed': os.environ.get('PYTHONHASHSEED')
    }


def get_seed_state() -> Dict[str, Any]:
    """
    Capture the current random state of all generators.
    
    Returns:
        dict: States that can be used to restore randomness
    """
    state = {
        'python_state': random.getstate(),
        'numpy_state': np.random.get_state(),
        'torch_state': torch.get_rng_state(),
    }
    
    if torch.cuda.is_available():
        state['cuda_state'] = torch.cuda.get_rng_state_all()
    
    return state


def set_seed_state(state: Dict[str, Any]):
    """Restore random state from a captured state dict."""
    random.setstate(state['python_state'])
    np.random.set_state(state['numpy_state'])
    torch.set_rng_state(state['torch_state'])
    
    if 'cuda_state' in state and torch.cuda.is_available():
        torch.cuda.set_rng_state_all(state['cuda_state'])


print("✅ Seed management functions defined")
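
Capturing and restoring RNG state is mainly useful when resuming from a checkpoint. The following is a minimal, self-contained sketch of that round trip for the CPU generators only; `snapshot_rng`, `restore_rng`, and `draw` are illustrative names, not part of the lab code.

```python
import random

import numpy as np
import torch


def snapshot_rng() -> dict:
    """Capture the CPU-side RNG state of Python, NumPy, and PyTorch."""
    return {
        "python": random.getstate(),
        "numpy": np.random.get_state(),
        "torch": torch.get_rng_state(),
    }


def restore_rng(state: dict) -> None:
    """Rewind all three generators to a previously captured state."""
    random.setstate(state["python"])
    np.random.set_state(state["numpy"])
    torch.set_rng_state(state["torch"])


def draw() -> tuple:
    """Take one sample from each generator."""
    return (random.random(), float(np.random.rand()), torch.rand(1).item())


random.seed(0)
np.random.seed(0)
torch.manual_seed(0)

saved = snapshot_rng()   # e.g. taken when writing a checkpoint
first = draw()
restore_rng(saved)       # e.g. done when resuming from that checkpoint
second = draw()

print(first == second)
```

Restoring the saved state replays the same draws, so a resumed run continues with the random numbers the uninterrupted run would have seen.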

In [None]:
# Demo: Verify seed setting works
print("🔬 Testing Seed Reproducibility")
print("=" * 50)

# Test 1: Same seed = same results
set_seed(42)
result1 = {
    'python': random.random(),
    'numpy': np.random.rand(),
    'torch': torch.rand(1).item()
}

set_seed(42)  # Reset with same seed
result2 = {
    'python': random.random(),
    'numpy': np.random.rand(),
    'torch': torch.rand(1).item()
}

print("\nWith same seed (42):")
for key in result1:
    match = "✅" if result1[key] == result2[key] else "❌"
    print(f"   {key}: {result1[key]:.6f} vs {result2[key]:.6f} {match}")

# Test 2: Different seed = different results
set_seed(123)
result3 = {
    'python': random.random(),
    'numpy': np.random.rand(),
    'torch': torch.rand(1).item()
}

print("\nWith different seed (123):")
for key in result1:
    different = "✅ (different)" if result1[key] != result3[key] else "❌ (same!)"
    print(f"   {key}: {result1[key]:.6f} vs {result3[key]:.6f} {different}")

---

## Part 3: Environment Capture

Capture everything about the training environment.

In [None]:
@dataclass
class EnvironmentSnapshot:
    """Complete snapshot of the training environment."""
    
    # System info
    python_version: str
    os_name: str
    os_version: str
    platform: str
    
    # Hardware
    cpu_count: int
    gpu_available: bool
    gpu_name: str
    gpu_count: int
    cuda_version: str
    
    # Library versions
    torch_version: str
    numpy_version: str
    python_packages: Dict[str, str]
    
    # Timestamp
    captured_at: str
    
    def to_dict(self) -> Dict[str, Any]:
        return asdict(self)
    
    def to_json(self) -> str:
        return json.dumps(self.to_dict(), indent=2)


def capture_environment() -> EnvironmentSnapshot:
    """
    Capture complete environment snapshot.
    
    Returns:
        EnvironmentSnapshot with all environment details
    """
    # Get installed packages
    try:
        result = subprocess.run(
            [sys.executable, '-m', 'pip', 'freeze'],
            capture_output=True, text=True, timeout=30
        )
        packages = {}
        for line in result.stdout.strip().split('\n'):
            if '==' in line:
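                # maxsplit=1 keeps a line with a second '==' from raising,
                # which would otherwise discard the whole package list
                name, version = line.split('==', 1)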
                packages[name] = version
    except Exception:
        packages = {}
    
    # GPU info
    gpu_available = torch.cuda.is_available()
    gpu_name = torch.cuda.get_device_name(0) if gpu_available else "N/A"
    gpu_count = torch.cuda.device_count() if gpu_available else 0
    cuda_version = torch.version.cuda if gpu_available else "N/A"
    
    return EnvironmentSnapshot(
        python_version=sys.version.split()[0],
        os_name=platform.system(),
        os_version=platform.release(),
        platform=platform.platform(),
        cpu_count=os.cpu_count() or 0,
        gpu_available=gpu_available,
        gpu_name=gpu_name,
        gpu_count=gpu_count,
        cuda_version=cuda_version,
        torch_version=torch.__version__,
        numpy_version=np.__version__,
        python_packages=packages,
        captured_at=datetime.now().isoformat()
    )


print("✅ Environment capture functions defined")
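
As an alternative to shelling out to `pip freeze`, the standard library can list installed distributions directly. This is a sketch under the assumption that name and version are all you need; `installed_packages` is an illustrative helper, and unlike `pip freeze` it does not report editable or URL-based installs in requirement syntax.

```python
from importlib import metadata
from typing import Dict


def installed_packages() -> Dict[str, str]:
    """Map distribution name -> version for the running interpreter."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata lacks a name
            packages[name] = str(dist.version)
    # Sort case-insensitively so snapshots are stable and easy to compare.
    return dict(sorted(packages.items(), key=lambda item: item[0].lower()))


packages = installed_packages()
print(f"{len(packages)} distributions found")
```

This avoids the subprocess and its timeout, at the cost of losing the exact requirement lines that `pip install -r` can consume.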

In [None]:
# Capture current environment
env_snapshot = capture_environment()

print("📸 ENVIRONMENT SNAPSHOT")
print("=" * 60)
print(f"\n🖥️ System:")
print(f"   Python: {env_snapshot.python_version}")
print(f"   OS: {env_snapshot.os_name} {env_snapshot.os_version}")
print(f"   Platform: {env_snapshot.platform}")
print(f"   CPU cores: {env_snapshot.cpu_count}")

print(f"\n🎮 GPU:")
print(f"   Available: {env_snapshot.gpu_available}")
print(f"   Name: {env_snapshot.gpu_name}")
print(f"   Count: {env_snapshot.gpu_count}")
print(f"   CUDA: {env_snapshot.cuda_version}")

print(f"\n📦 Key Libraries:")
print(f"   PyTorch: {env_snapshot.torch_version}")
print(f"   NumPy: {env_snapshot.numpy_version}")
print(f"   Total packages: {len(env_snapshot.python_packages)}")

In [None]:
# Save environment snapshot
env_file = AUDIT_DIR / "environment_snapshot.json"
with open(env_file, 'w') as f:
    f.write(env_snapshot.to_json())

print(f"💾 Environment saved to: {env_file}")
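
A saved snapshot becomes most useful when you compare it with a later one. The sketch below diffs two package dictionaries of the kind stored in `python_packages`; `diff_packages` is a hypothetical helper and the version numbers are illustrative only.

```python
from typing import Dict


def diff_packages(old: Dict[str, str], new: Dict[str, str]) -> Dict[str, dict]:
    """Report packages added, removed, or changed between two snapshots."""
    return {
        "added": {n: new[n] for n in new.keys() - old.keys()},
        "removed": {n: old[n] for n in old.keys() - new.keys()},
        "changed": {
            n: (old[n], new[n])
            for n in old.keys() & new.keys()
            if old[n] != new[n]
        },
    }


# Illustrative version numbers only, not taken from a real environment.
last_week = {"torch": "2.0.1", "numpy": "1.24.4", "scipy": "1.10.1"}
today = {"torch": "2.1.0", "numpy": "1.24.4", "pandas": "2.0.3"}

drift = diff_packages(last_week, today)
print(drift)
```

An empty result in all three categories is a quick signal that a "can't reproduce last week's results" problem is probably not caused by package drift.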

---

## Part 4: Reproducibility Verification

Actually verify that training is reproducible!

In [None]:
# Define a simple model for testing
class SimpleModel(nn.Module):
    def __init__(self, input_dim: int = 10, hidden_dim: int = 32, output_dim: int = 2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(hidden_dim, output_dim)
        )
    
    def forward(self, x):
        return self.net(x)


def train_epoch(model, optimizer, data, targets, criterion):
    """Train for one epoch and return loss."""
    model.train()
    optimizer.zero_grad()
    output = model(data)
    loss = criterion(output, targets)
    loss.backward()
    optimizer.step()
    return loss.item()


def get_model_hash(model: nn.Module) -> str:
    """Compute hash of model weights for comparison."""
    hasher = hashlib.sha256()
    for param in model.parameters():
        hasher.update(param.data.cpu().numpy().tobytes())
    return hasher.hexdigest()[:16]


print("✅ Training functions defined")
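
Hashing `model.parameters()` skips buffers such as BatchNorm running statistics. A stricter variant, sketched here as one possible approach, hashes every `state_dict` entry together with its name, dtype, and shape. `state_dict_hash` is an illustrative name, and the sketch assumes all tensors use dtypes NumPy can represent.

```python
import hashlib

import torch
import torch.nn as nn


def state_dict_hash(model: nn.Module) -> str:
    """SHA-256 over every state_dict entry: name, dtype, shape, and raw bytes."""
    hasher = hashlib.sha256()
    for name, tensor in sorted(model.state_dict().items(), key=lambda kv: kv[0]):
        t = tensor.detach().cpu().contiguous()
        hasher.update(name.encode("utf-8"))
        hasher.update(str(t.dtype).encode("utf-8"))
        hasher.update(str(tuple(t.shape)).encode("utf-8"))
        hasher.update(t.numpy().tobytes())
    return hasher.hexdigest()


torch.manual_seed(0)
net_a = nn.Sequential(nn.Linear(4, 8), nn.BatchNorm1d(8))
torch.manual_seed(0)
net_b = nn.Sequential(nn.Linear(4, 8), nn.BatchNorm1d(8))

same_before = state_dict_hash(net_a) == state_dict_hash(net_b)

# A forward pass in train mode updates BatchNorm running statistics (buffers)
# without touching any parameter.
net_b(torch.randn(16, 4))
same_after = state_dict_hash(net_a) == state_dict_hash(net_b)

print(same_before, same_after)
```

The two networks have identical parameters throughout, so a parameters-only hash would still match after the forward pass; the `state_dict` hash catches the changed buffers.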

In [None]:
@dataclass
class ReproducibilityResult:
    """Result of a reproducibility test."""
    is_reproducible: bool
    seed: int
    run1_losses: List[float]
    run2_losses: List[float]
    run1_model_hash: str
    run2_model_hash: str
    max_loss_difference: float
    weights_match: bool
    timestamp: str


def verify_reproducibility(
    seed: int = 42,
    epochs: int = 5,
    tolerance: float = 1e-6
) -> ReproducibilityResult:
    """
    Verify that training is reproducible with the same seed.
    
    Args:
        seed: Random seed to use
        epochs: Number of training epochs
        tolerance: Maximum allowed difference in losses
    
    Returns:
        ReproducibilityResult with detailed comparison
    """
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    
    # First run
    set_seed(seed)
    model1 = SimpleModel().to(device)
    optimizer1 = torch.optim.Adam(model1.parameters(), lr=0.01)
    criterion = nn.CrossEntropyLoss()
    
    # Generate data with seed
    data = torch.randn(100, 10).to(device)
    targets = torch.randint(0, 2, (100,)).to(device)
    
    losses1 = []
    for _ in range(epochs):
        loss = train_epoch(model1, optimizer1, data, targets, criterion)
        losses1.append(loss)
    
    hash1 = get_model_hash(model1)
    
    # Second run (reset everything)
    set_seed(seed)
    model2 = SimpleModel().to(device)
    optimizer2 = torch.optim.Adam(model2.parameters(), lr=0.01)
    
    # Regenerate data with same seed
    data = torch.randn(100, 10).to(device)
    targets = torch.randint(0, 2, (100,)).to(device)
    
    losses2 = []
    for _ in range(epochs):
        loss = train_epoch(model2, optimizer2, data, targets, criterion)
        losses2.append(loss)
    
    hash2 = get_model_hash(model2)
    
    # Compare
    # default=0.0 keeps max() from raising on an empty sequence when epochs=0
    max_diff = max((abs(l1 - l2) for l1, l2 in zip(losses1, losses2)), default=0.0)
    weights_match = hash1 == hash2
    is_reproducible = max_diff < tolerance and weights_match
    
    return ReproducibilityResult(
        is_reproducible=is_reproducible,
        seed=seed,
        run1_losses=losses1,
        run2_losses=losses2,
        run1_model_hash=hash1,
        run2_model_hash=hash2,
        max_loss_difference=max_diff,
        weights_match=weights_match,
        timestamp=datetime.now().isoformat()
    )


print("✅ Reproducibility verification function defined")

In [None]:
# Run reproducibility test
print("🔬 REPRODUCIBILITY VERIFICATION")
print("=" * 60)

result = verify_reproducibility(seed=42, epochs=5)

status = "✅ REPRODUCIBLE" if result.is_reproducible else "❌ NOT REPRODUCIBLE"
print(f"\nResult: {status}")
print(f"\nDetails:")
print(f"   Seed: {result.seed}")
print(f"   Weights match: {result.weights_match}")
print(f"   Max loss difference: {result.max_loss_difference:.2e}")

print(f"\n📊 Loss Comparison:")
print(f"   {'Epoch':<8} {'Run 1':<12} {'Run 2':<12} {'Diff':<12}")
print(f"   {'-'*44}")
for i, (l1, l2) in enumerate(zip(result.run1_losses, result.run2_losses)):
    diff = abs(l1 - l2)
    match = "✓" if diff < 1e-6 else "✗"
    print(f"   {i+1:<8} {l1:<12.6f} {l2:<12.6f} {diff:<12.2e} {match}")

print(f"\n🔑 Model Hashes:")
print(f"   Run 1: {result.run1_model_hash}")
print(f"   Run 2: {result.run2_model_hash}")
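
When a run is not reproducible, the first epoch at which the loss curves separate is usually the best place to start debugging. The sketch below finds it; `first_divergence` is a hypothetical helper and the loss values are made up for illustration.

```python
import math
from typing import Optional, Sequence


def first_divergence(
    run_a: Sequence[float],
    run_b: Sequence[float],
    abs_tol: float = 1e-6,
) -> Optional[int]:
    """Return the 1-based epoch where two loss curves first differ, else None."""
    if len(run_a) != len(run_b):
        raise ValueError("runs must cover the same number of epochs")
    for epoch, (a, b) in enumerate(zip(run_a, run_b), start=1):
        if not math.isclose(a, b, rel_tol=0.0, abs_tol=abs_tol):
            return epoch
    return None


# Made-up loss values for illustration.
run_1 = [0.7012, 0.6543, 0.6101, 0.5788]
run_2 = [0.7012, 0.6543, 0.6105, 0.5650]

print(first_divergence(run_1, run_1))
print(first_divergence(run_1, run_2))
```

For the pair above the function returns `None` for identical curves and `3` for the diverging ones. A divergence at epoch 1 tends to point at initialization or data, while a later one suggests a non-deterministic operation or an unseeded source of randomness inside the loop.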

---

## Part 5: Comprehensive Reproducibility Audit

Create a complete audit report for compliance and documentation.

In [None]:
@dataclass
class AuditResult:
    """Complete reproducibility audit result."""
    
    # Summary
    passed: bool
    audit_id: str
    timestamp: str
    
    # Components
    environment: EnvironmentSnapshot
    reproducibility: ReproducibilityResult
    seed_config: Dict[str, Any]
    
    # Data verification
    data_hash: str
    
    # Checks passed
    checks: Dict[str, bool]
    
    def to_report(self) -> str:
        """Generate a human-readable audit report."""
        status = "PASSED" if self.passed else "FAILED"
        
        report = f"""
{'='*70}
                    REPRODUCIBILITY AUDIT REPORT
{'='*70}

Audit ID: {self.audit_id}
Date: {self.timestamp}
Status: {status}

{'-'*70}
ENVIRONMENT
{'-'*70}
Python Version: {self.environment.python_version}
PyTorch Version: {self.environment.torch_version}
CUDA Version: {self.environment.cuda_version}
GPU: {self.environment.gpu_name}
Platform: {self.environment.platform}

{'-'*70}
REPRODUCIBILITY VERIFICATION
{'-'*70}
Seed Used: {self.reproducibility.seed}
Weights Match: {self.reproducibility.weights_match}
Max Loss Difference: {self.reproducibility.max_loss_difference:.2e}
Training Reproducible: {self.reproducibility.is_reproducible}

{'-'*70}
CHECKS
{'-'*70}
"""
        for check, passed in self.checks.items():
            status = "✅ PASS" if passed else "❌ FAIL"
            report += f"{check}: {status}\n"
        
        report += f"""
{'-'*70}
DATA
{'-'*70}
Data Hash: {self.data_hash}

{'='*70}
                           END OF REPORT
{'='*70}
"""
        return report


def run_full_audit(seed: int = 42) -> AuditResult:
    """
    Run a complete reproducibility audit.
    
    Args:
        seed: Random seed to use for testing
    
    Returns:
        AuditResult with complete audit information
    """
    audit_id = f"AUDIT-{datetime.now().strftime('%Y%m%d-%H%M%S')}"
    
    # Capture environment
    env = capture_environment()
    
    # Set seed and get config
    seed_config = set_seed(seed)
    
    # Run reproducibility verification
    repro_result = verify_reproducibility(seed=seed)
    
    # Generate sample data hash
    set_seed(seed)
    sample_data = torch.randn(100, 10)
    data_hash = hashlib.sha256(sample_data.numpy().tobytes()).hexdigest()[:16]
    
    # Run all checks
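    checks = {
        # Python and NumPy offer no direct way to read the seed back, so these
        # two entries record that set_seed() ran rather than verify it.
        "Python seed set": True,
        "NumPy seed set": True,
        "PyTorch seed set": torch.initial_seed() == seed,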
        "CUDA deterministic": torch.backends.cudnn.deterministic if torch.cuda.is_available() else True,
        "Training reproducible": repro_result.is_reproducible,
        "Weights match": repro_result.weights_match,
        "Environment captured": env is not None,
        "Data hash computed": len(data_hash) == 16
    }
    
    all_passed = all(checks.values())
    
    return AuditResult(
        passed=all_passed,
        audit_id=audit_id,
        timestamp=datetime.now().isoformat(),
        environment=env,
        reproducibility=repro_result,
        seed_config=seed_config,
        data_hash=data_hash,
        checks=checks
    )


print("✅ Full audit function defined")

In [None]:
# Run the full audit
print("🔍 Running Full Reproducibility Audit...")
print()

audit = run_full_audit(seed=42)

# Print the report
print(audit.to_report())

In [None]:
# Save audit report
report_file = AUDIT_DIR / f"{audit.audit_id}.txt"
with open(report_file, 'w') as f:
    f.write(audit.to_report())

# Save JSON for programmatic access
json_file = AUDIT_DIR / f"{audit.audit_id}.json"
audit_dict = {
    "passed": audit.passed,
    "audit_id": audit.audit_id,
    "timestamp": audit.timestamp,
    "seed_config": audit.seed_config,
    "data_hash": audit.data_hash,
    "checks": audit.checks,
    "environment": audit.environment.to_dict(),
    "reproducibility": {
        "is_reproducible": audit.reproducibility.is_reproducible,
        "seed": audit.reproducibility.seed,
        "weights_match": audit.reproducibility.weights_match,
        "max_loss_difference": audit.reproducibility.max_loss_difference
    }
}

with open(json_file, 'w') as f:
    json.dump(audit_dict, f, indent=2)

print(f"\n💾 Audit saved to:")
print(f"   Report: {report_file}")
print(f"   JSON: {json_file}")
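
The audit above hashes a synthetic tensor. For a real dataset stored on disk, a comparable check could hash the file itself. This is a minimal sketch assuming the data lives in a single file; `file_sha256` and the `train.csv` name are illustrative.

```python
import hashlib
import tempfile
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large datasets need not fit in memory."""
    hasher = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


# Demo on a throwaway file; in practice `path` would point at the dataset.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "train.csv"
    sample.write_bytes(b"feature,label\n0.1,0\n0.2,1\n")
    digest_before = file_sha256(sample)

    sample.write_bytes(b"feature,label\n0.1,0\n0.2,0\n")  # one label flipped
    digest_after = file_sha256(sample)

print(digest_before != digest_after)
```

Storing the full digest next to the audit ID lets a later audit confirm that the training data has not changed, even by a single byte.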

---

## Part 6: DataLoader Reproducibility

DataLoader shuffling needs special handling for reproducibility.

In [None]:
from torch.utils.data import DataLoader, TensorDataset

def worker_init_fn(worker_id: int):
    """
    Initialize each DataLoader worker with a unique seed.
    
    This ensures reproducibility across workers.
    """
    worker_seed = torch.initial_seed() % 2**32
    np.random.seed(worker_seed)
    random.seed(worker_seed)


def create_reproducible_dataloader(
    data: torch.Tensor,
    targets: torch.Tensor,
    batch_size: int = 32,
    shuffle: bool = True,
    seed: int = 42
) -> DataLoader:
    """
    Create a DataLoader with reproducible shuffling.
    
    Args:
        data: Input data tensor
        targets: Target tensor
        batch_size: Batch size
        shuffle: Whether to shuffle
        seed: Random seed for shuffling
    
    Returns:
        Reproducible DataLoader
    """
    dataset = TensorDataset(data, targets)
    
    # Create a generator with fixed seed for shuffling
    generator = torch.Generator()
    generator.manual_seed(seed)
    
    return DataLoader(
        dataset,
        batch_size=batch_size,
        shuffle=shuffle,
        generator=generator,
        worker_init_fn=worker_init_fn,
        num_workers=0  # Single-process loading; worker_init_fn only runs when num_workers > 0
    )


# Demo: Verify DataLoader reproducibility
print("🔬 Testing DataLoader Reproducibility")
print("=" * 50)

# Create sample data
data = torch.arange(100).float().unsqueeze(1)
targets = torch.arange(100)

# First loader
loader1 = create_reproducible_dataloader(data, targets, batch_size=10, seed=42)
batches1 = [batch[0][:3, 0].tolist() for batch in loader1]

# Second loader (same seed)
loader2 = create_reproducible_dataloader(data, targets, batch_size=10, seed=42)
batches2 = [batch[0][:3, 0].tolist() for batch in loader2]

print("\nFirst 3 elements of each batch:")
print(f"{'Batch':<8} {'Loader 1':<20} {'Loader 2':<20} {'Match':<10}")
print("-" * 60)

for i, (b1, b2) in enumerate(zip(batches1, batches2)):
    match = "✅" if b1 == b2 else "❌"
    print(f"{i+1:<8} {str(b1):<20} {str(b2):<20} {match}")
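
The demo above compares a single pass over the data. Across several epochs the loader's generator keeps advancing, so each epoch gets a different order while the whole sequence of orders stays reproducible. The following self-contained sketch illustrates this; `epoch_orders` is a hypothetical helper.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def epoch_orders(seed: int, epochs: int = 2) -> list:
    """Return the sample order seen in each epoch for a freshly seeded loader."""
    dataset = TensorDataset(torch.arange(20))
    generator = torch.Generator()
    generator.manual_seed(seed)
    loader = DataLoader(dataset, batch_size=5, shuffle=True, generator=generator)
    orders = []
    for _ in range(epochs):
        # Each batch is a one-element list because the dataset wraps one tensor.
        order = [int(i) for (batch,) in loader for i in batch]
        orders.append(order)
    return orders


run_a = epoch_orders(seed=42)
run_b = epoch_orders(seed=42)

print("runs identical:", run_a == run_b)
print("epoch 1 == epoch 2:", run_a[0] == run_a[1])
```

This is why resuming training mid-run needs the generator state as well as the seed: re-seeding at epoch 5 would replay the epoch 1 order instead of continuing the sequence.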

---

## ✋ Try It Yourself: Exercise

**Task:** Create your own reproducibility audit.

1. Define a model architecture
2. Train it twice with the same seed
3. Verify the losses match exactly
4. Generate an audit report
5. Test what happens with a different seed

<details>
<summary>💡 Hint</summary>

```python
# Define your model
class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        # Your architecture

# Run audit
def my_reproducibility_test(seed):
    set_seed(seed)
    model = MyModel()
    # Train and capture losses
    return losses, model_hash

# Compare runs
losses1, hash1 = my_reproducibility_test(42)
losses2, hash2 = my_reproducibility_test(42)

print(f"Match: {hash1 == hash2}")
```
</details>

In [None]:
# YOUR CODE HERE

# Step 1: Define model


# Step 2: Train twice with same seed


# Step 3: Verify losses match


# Step 4: Generate report


# Step 5: Test with different seed


---

## ⚠️ Common Mistakes

### Mistake 1: Not Resetting the Seed Between Runs

In [None]:
# ❌ WRONG: Seed once, then compare two runs in the same process
# set_seed(42)
# train_first_model()
# train_second_model()  # Starts from wherever the first run left the RNG

# ✅ RIGHT: Reset seed when exact reproduction is needed
# set_seed(42)
# train_first_model()
#
# set_seed(42)  # Reset before second run
# train_second_model()  # Now reproducible!

print("Reset seeds before each run you want to reproduce!")

### Mistake 2: Using Non-Deterministic Operations

In [None]:
# ❌ WRONG: Some ops are non-deterministic by default
# output = torch.nn.functional.interpolate(x, scale_factor=2, mode='bilinear')
# The CUDA backward pass for this mode may vary between runs!

# ✅ RIGHT: Use deterministic mode
# torch.use_deterministic_algorithms(True)
# Or use deterministic implementations

print("Enable torch.use_deterministic_algorithms(True) for strict reproducibility.")
print("Note: Some operations don't have deterministic implementations.")
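
If only part of a pipeline needs strict determinism, the global switch can be scoped to that part. The sketch below wraps it in a context manager that restores the previous setting afterwards; `deterministic_mode` is a hypothetical helper, and the `warn_only` argument assumes PyTorch 1.11 or newer.

```python
from contextlib import contextmanager

import torch


@contextmanager
def deterministic_mode(warn_only: bool = False):
    """Enable deterministic algorithms inside the block, then restore the old setting."""
    was_enabled = torch.are_deterministic_algorithms_enabled()
    was_warn_only = torch.is_deterministic_algorithms_warn_only_enabled()
    torch.use_deterministic_algorithms(True, warn_only=warn_only)
    try:
        yield
    finally:
        torch.use_deterministic_algorithms(was_enabled, warn_only=was_warn_only)


before = torch.are_deterministic_algorithms_enabled()
with deterministic_mode(warn_only=True):
    inside = torch.are_deterministic_algorithms_enabled()
    # Ops lacking a deterministic implementation would warn here
    # (or raise RuntimeError with warn_only=False).
    y = torch.randn(4, 4) @ torch.randn(4, 4)
after = torch.are_deterministic_algorithms_enabled()

print(before, inside, after)
```

Running with `warn_only=True` first is a low-risk way to find out which operations in a model would fail under strict mode.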

---

## 🎉 Checkpoint

You've learned:
- ✅ Why reproducibility matters in ML
- ✅ Proper random seed management
- ✅ Environment capture and recreation
- ✅ Reproducibility verification
- ✅ Creating comprehensive audit reports

---

## 📖 Further Reading

- [PyTorch Reproducibility Guide](https://pytorch.org/docs/stable/notes/randomness.html)
- [Reproducibility in ML (Papers With Code)](https://paperswithcode.com/sota)
- [ML Reproducibility Checklist](https://www.cs.mcgill.ca/~jpineau/ReproducibilityChecklist.pdf)
- [Docker for ML Reproducibility](https://docs.docker.com/)

---

## 🧹 Cleanup

In [None]:
import gc

gc.collect()

if torch.cuda.is_available():
    torch.cuda.empty_cache()

print(f"📁 Audit reports saved to: {AUDIT_DIR}")
print("✅ Resources cleaned up")

---

## 📝 Module Summary

Congratulations on completing Module 4.3: MLOps & Experiment Tracking!

### What You've Learned:

| Lab | Topic | Key Skills |
|-----|-------|------------|
| 4.3.1 | MLflow Setup | Experiment tracking, logging, UI |
| 4.3.2 | W&B Integration | Dashboards, sweeps, team collaboration |
| 4.3.3 | Benchmark Suite | lm-eval, model comparison, metrics |
| 4.3.4 | Custom Evaluation | LLM-as-judge, pairwise comparison |
| 4.3.5 | Drift Detection | Evidently AI, monitoring, alerts |
| 4.3.6 | Model Registry | Versioning, lifecycle, promotion |
| 4.3.7 | Reproducibility | Seeds, environments, audits |

### Next Steps:
- Document your experiment tracking setup
- Keep your benchmark results for the capstone
- Proceed to Module 4.4: Containerization & Deployment

**Well done! You're now equipped with industry-standard MLOps practices!**