# Module 02: Model Versioning and Registry

**Difficulty**: ⭐⭐ Intermediate  
**Estimated Time**: 50 minutes  
**Prerequisites**: 
- Module 01: Experiment Tracking with MLflow
- Understanding of ML model lifecycle

## Learning Objectives

By the end of this notebook, you will be able to:
1. Set up and use MLflow Model Registry for model versioning
2. Register models and manage multiple versions
3. Transition models between lifecycle stages (Staging, Production, Archived)
4. Track model lineage and metadata
5. Implement model governance and approval workflows

## 1. Why Model Versioning and Registry Matter

Imagine you have multiple models in production.

**Without a Model Registry, these questions are hard to answer:**
- ❌ "Which model version is in production?"
- ❌ "Who approved this model for deployment?"
- ❌ "When was the last model update?"
- ❌ "Can we rollback to the previous version?"
- ❌ "What data was this model trained on?"

**With a Model Registry:**
- ✅ Centralized model storage
- ✅ Version control for models
- ✅ Stage-based model lifecycle (Staging → Production)
- ✅ Model lineage tracking
- ✅ Collaborative model management
- ✅ Easy rollback and A/B testing

In [None]:
# Setup: Import required libraries
import mlflow
import mlflow.sklearn
from mlflow.tracking import MlflowClient
from mlflow.models.signature import infer_signature
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
import warnings
from datetime import datetime

warnings.filterwarnings('ignore')
sns.set_style('whitegrid')
%matplotlib inline

# Set random seed for reproducibility
np.random.seed(42)

print("✓ Libraries imported successfully")
print(f"✓ MLflow version: {mlflow.__version__}")

## 2. Setting Up MLflow Model Registry

The Model Registry is a centralized model store that provides:
- **Model Versioning**: Automatic version increments
- **Stage Transitions**: None → Staging → Production → Archived
- **Annotations**: Descriptions, tags, and metadata
- **Model Lineage**: Links to training runs and datasets

In [None]:
# Set up MLflow tracking and registry
mlflow.set_tracking_uri("file:./mlruns")

# Create experiment for model registry demo
experiment_name = "model_registry_demo"
mlflow.set_experiment(experiment_name)

# Initialize MLflow client for registry operations
client = MlflowClient()

print(f"✓ MLflow tracking URI: {mlflow.get_tracking_uri()}")
print(f"✓ Active experiment: {experiment_name}")
print("\nTo view the Model Registry:")
print("  1. Run: mlflow ui  (from the directory that contains ./mlruns)")
print("  2. Navigate to http://localhost:5000")
print("  3. Click on 'Models' tab")

## 3. Preparing Sample Data and Training Models

Let's create a dataset and train multiple model versions to demonstrate the registry.

In [None]:
# Generate synthetic dataset for fraud detection
X, y = make_classification(
    n_samples=3000,
    n_features=25,
    n_informative=20,
    n_redundant=5,
    n_classes=2,
    weights=[0.95, 0.05],  # Highly imbalanced (5% fraud)
    random_state=42
)

# Create feature names
feature_names = [f'feature_{i}' for i in range(25)]

# Split data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(f"Training set: {X_train.shape[0]} samples")
print(f"Test set: {X_test.shape[0]} samples")
print(f"\nClass distribution:")
print(f"  Legitimate: {(y == 0).sum()} ({(y == 0).mean()*100:.1f}%)")
print(f"  Fraud: {(y == 1).sum()} ({(y == 1).mean()*100:.1f}%)")
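
With only about 5% fraud, accuracy alone can look strong for a model that never flags fraud, which is why the later cells also log precision, recall, and F1. The sketch below works through that arithmetic in plain Python. The confusion-matrix counts (570 legitimate and 30 fraudulent test samples, and the two example models) are assumed round numbers for illustration, not the output of the cell above, and `binary_metrics` is a hypothetical helper.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a model predicts no positives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1_score": f1}


# Hypothetical baseline that labels every transaction as legitimate
always_legit = binary_metrics(tp=0, fp=0, fn=30, tn=570)

# Hypothetical model that catches 24 of 30 frauds at the cost of 36 false alarms
fraud_aware = binary_metrics(tp=24, fp=36, fn=6, tn=534)

for name, m in [("always_legit", always_legit), ("fraud_aware", fraud_aware)]:
    print(name, {k: round(v, 3) for k, v in m.items()})
```

The baseline scores higher on accuracy (0.95 versus 0.93) while catching no fraud at all, so recall and F1 are the more informative metrics to compare across registered versions.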

## 4. Registering Your First Model

To register a model:
1. Train and log the model using MLflow
2. Register it with a unique name
3. MLflow automatically creates version 1

In [None]:
# Define registered model name
# This is the name that will appear in the Model Registry
model_name = "fraud_detection_model"

# Train and register the first model version
with mlflow.start_run(run_name="initial_logistic_regression") as run:
    
    # Model parameters
    params = {
        'C': 1.0,
        'max_iter': 200,
        'solver': 'lbfgs',
        'class_weight': 'balanced'  # Handle imbalanced data
    }
    
    # Log parameters
    mlflow.log_params(params)
    
    # Train model
    model = LogisticRegression(**params, random_state=42)
    model.fit(X_train, y_train)
    
    # Make predictions
    y_pred = model.predict(X_test)
    
    # Calculate and log metrics
    metrics = {
        'accuracy': accuracy_score(y_test, y_pred),
        'precision': precision_score(y_test, y_pred),
        'recall': recall_score(y_test, y_pred),
        'f1_score': f1_score(y_test, y_pred)
    }
    mlflow.log_metrics(metrics)
    
    # Create model signature (input/output schema)
    # This helps with model validation and documentation
    signature = infer_signature(X_train, model.predict(X_train))
    
    # Log and register the model
    # registered_model_name triggers registration
    mlflow.sklearn.log_model(
        sk_model=model,
        artifact_path="model",
        signature=signature,
        registered_model_name=model_name
    )
    
    run_id = run.info.run_id
    
    print("✓ Model trained and registered!")
    print(f"✓ Run ID: {run_id}")
    print(f"✓ Registered as: {model_name}")
    print(f"\nMetrics:")
    for metric, value in metrics.items():
        print(f"  {metric}: {value:.4f}")

## 5. Creating Multiple Model Versions

As you improve your model, you'll create new versions. Each time you log a model with the same `registered_model_name`, the registry assigns the next version number automatically.

In [None]:
# Train and register version 2: Random Forest
with mlflow.start_run(run_name="v2_random_forest") as run:
    
    params = {
        'n_estimators': 100,
        'max_depth': 10,
        'class_weight': 'balanced',
        'random_state': 42
    }
    
    mlflow.log_params(params)
    
    model = RandomForestClassifier(**params)
    model.fit(X_train, y_train)
    
    y_pred = model.predict(X_test)
    
    metrics = {
        'accuracy': accuracy_score(y_test, y_pred),
        'precision': precision_score(y_test, y_pred),
        'recall': recall_score(y_test, y_pred),
        'f1_score': f1_score(y_test, y_pred)
    }
    mlflow.log_metrics(metrics)
    
    signature = infer_signature(X_train, model.predict(X_train))
    
    # Register as the same model name - creates version 2
    mlflow.sklearn.log_model(
        sk_model=model,
        artifact_path="model",
        signature=signature,
        registered_model_name=model_name
    )
    
    print("✓ Version 2 (Random Forest) registered!")
    print(f"\nMetrics:")
    for metric, value in metrics.items():
        print(f"  {metric}: {value:.4f}")

In [None]:
# Train and register version 3: Gradient Boosting
with mlflow.start_run(run_name="v3_gradient_boosting") as run:
    
    params = {
        'n_estimators': 100,
        'learning_rate': 0.1,
        'max_depth': 5,
        'random_state': 42
    }
    
    mlflow.log_params(params)
    
    model = GradientBoostingClassifier(**params)
    model.fit(X_train, y_train)
    
    y_pred = model.predict(X_test)
    
    metrics = {
        'accuracy': accuracy_score(y_test, y_pred),
        'precision': precision_score(y_test, y_pred),
        'recall': recall_score(y_test, y_pred),
        'f1_score': f1_score(y_test, y_pred)
    }
    mlflow.log_metrics(metrics)
    
    signature = infer_signature(X_train, model.predict(X_train))
    
    # Register as version 3
    mlflow.sklearn.log_model(
        sk_model=model,
        artifact_path="model",
        signature=signature,
        registered_model_name=model_name
    )
    
    print("✓ Version 3 (Gradient Boosting) registered!")
    print(f"\nMetrics:")
    for metric, value in metrics.items():
        print(f"  {metric}: {value:.4f}")

## 6. Viewing and Managing Model Versions

Use the MLflow Client to query and manage registered models.

In [None]:
# List all registered models
registered_models = client.search_registered_models()

print("Registered Models:")
print("="*80)
for rm in registered_models:
    print(f"\nModel: {rm.name}")
    print(f"Description: {rm.description if rm.description else 'No description'}")
    print(f"Latest versions: {len(rm.latest_versions)}")

In [None]:
# Get all versions of our fraud detection model
versions = client.search_model_versions(f"name='{model_name}'")

print(f"All versions of '{model_name}':")
print("="*80)

# Create a summary DataFrame
version_data = []
for version in versions:
    # Get metrics from the run
    run = client.get_run(version.run_id)
    
    version_data.append({
        'Version': int(version.version),  # cast so the sort below is numeric
        'Stage': version.current_stage,
        'Run ID': version.run_id[:8] + '...',
        'Accuracy': run.data.metrics.get('accuracy', 0),
        'F1 Score': run.data.metrics.get('f1_score', 0),
        'Recall': run.data.metrics.get('recall', 0)
    })

versions_df = pd.DataFrame(version_data)
versions_df = versions_df.sort_values('Version', ascending=False)
print(versions_df.to_string(index=False))
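
Version numbers should be compared as integers. Some backends return `version.version` as a string, and sorted as strings, version 10 lands between 1 and 2. A small illustration with made-up version strings (not taken from the registry above):

```python
# Hypothetical version identifiers as a registry might return them
raw_versions = ["1", "2", "10", "3"]

# Plain string sort compares character by character
lexicographic = sorted(raw_versions, reverse=True)

# Casting to int restores the intended newest-first order
numeric = sorted(raw_versions, key=int, reverse=True)

print("String sort: ", lexicographic)
print("Numeric sort:", numeric)
```

Casting with `int(...)` is harmless when the backend already returns integers, so it is a safe habit either way.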

## 7. Model Lifecycle Stages

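MLflow provides four lifecycle stages (deprecated since MLflow 2.9 in favor of model version aliases and tags; the stage APIs used in this notebook emit a deprecation warning, which is hidden here by `warnings.filterwarnings('ignore')`):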
1. **None**: Default stage after registration
2. **Staging**: Model is being tested (pre-production)
3. **Production**: Model is live and serving predictions
4. **Archived**: Model is retired but kept for reference

### Typical Workflow:
```
Register → None → Staging → Production → Archived
```

In [None]:
# Transition version 2 (Random Forest) to Staging
client.transition_model_version_stage(
    name=model_name,
    version=2,
    stage="Staging",
    archive_existing_versions=False  # Keep other versions in their current stages
)

print("✓ Version 2 transitioned to Staging")

# Transition version 3 (Gradient Boosting) to Production
# Treated here as our best performing model; confirm against the metrics table above
client.transition_model_version_stage(
    name=model_name,
    version=3,
    stage="Production",
    archive_existing_versions=False
)

print("✓ Version 3 transitioned to Production")

# Archive version 1 (Logistic Regression)
# It's no longer needed but we keep it for historical reference
client.transition_model_version_stage(
    name=model_name,
    version=1,
    stage="Archived",
    archive_existing_versions=False
)

print("✓ Version 1 archived")
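
Because stage transitions are deprecated in recent MLflow releases, it may help to see the alias-based equivalent. The sketch below assumes `client` is an `MlflowClient` from a version that supports aliases; `set_alias_and_confirm` is a hypothetical helper, and the alias name `champion` is an arbitrary choice.

```python
def set_alias_and_confirm(client, model_name, alias, version):
    """Point `alias` at `version` and return the version the registry reports.

    `client` is assumed to be an mlflow.tracking.MlflowClient, or any object
    exposing the same two methods.
    """
    # Re-pointing an existing alias moves it; a model version can hold several aliases
    client.set_registered_model_alias(model_name, alias, str(version))

    # Read the alias back to confirm which version it now resolves to
    resolved = client.get_model_version_by_alias(model_name, alias)
    return int(resolved.version)
```

A call such as `set_alias_and_confirm(client, model_name, "champion", 3)` would play the role of the Production transition above, and the model could then be loaded with the URI `models:/fraud_detection_model@champion`.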

In [None]:
# View updated stages
versions = client.search_model_versions(f"name='{model_name}'")

print(f"\nUpdated lifecycle stages for '{model_name}':")
print("="*80)

version_data = []
for version in versions:
    run = client.get_run(version.run_id)
    version_data.append({
        'Version': int(version.version),  # cast so the sort below is numeric
        'Stage': version.current_stage,
        'F1 Score': f"{run.data.metrics.get('f1_score', 0):.4f}",
        'Description': version.description if version.description else 'No description'
    })

versions_df = pd.DataFrame(version_data)
versions_df = versions_df.sort_values('Version', ascending=False)
print(versions_df.to_string(index=False))

## 8. Adding Model Metadata and Annotations

Document your models with descriptions, tags, and annotations for better governance.

In [None]:
# Update model description
client.update_registered_model(
    name=model_name,
    description="Fraud detection model for transaction monitoring. "
                "Uses ensemble methods to identify fraudulent transactions. "
                "Optimized for high recall to minimize false negatives."
)

print(f"✓ Updated model description for '{model_name}'")

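# Add version-specific descriptions
# Read the logged F1 score from the training run instead of hardcoding it
v3_run = client.get_run(client.get_model_version(model_name, "3").run_id)
v3_f1 = v3_run.data.metrics.get('f1_score')
v3_f1_text = f"{v3_f1:.2f}" if v3_f1 is not None else "n/a"
client.update_model_version(
    name=model_name,
    version=3,
    description="Production model (v3). Gradient Boosting Classifier. "
                f"F1={v3_f1_text} on the held-out test set. "
                f"Deployed: {datetime.now().strftime('%Y-%m-%d')}"
)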

client.update_model_version(
    name=model_name,
    version=2,
    description="Staging model (v2). Random Forest Classifier. "
                "Currently being A/B tested against production."
)

print("✓ Updated version-specific descriptions")

In [None]:
# Add tags for better organization and searchability
client.set_registered_model_tag(
    name=model_name,
    key="task",
    value="binary_classification"
)

client.set_registered_model_tag(
    name=model_name,
    key="domain",
    value="fraud_detection"
)

client.set_registered_model_tag(
    name=model_name,
    key="team",
    value="risk_analytics"
)

# Add version-specific tags
client.set_model_version_tag(
    name=model_name,
    version=3,
    key="validation_status",
    value="approved"
)

client.set_model_version_tag(
    name=model_name,
    version=3,
    key="approved_by",
    value="data_science_lead"
)

print("✓ Added tags for model governance")
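
Tags become useful for governance once something checks them. As a minimal sketch, the hypothetical helper below reports which required tags are missing from a model version. It assumes `client` behaves like an `MlflowClient`, and the list of required keys is an assumed policy that mirrors the tags set above.

```python
# Assumed governance policy: tag keys every production candidate must carry
REQUIRED_VERSION_TAGS = ("validation_status", "approved_by")


def missing_governance_tags(client, model_name, version,
                            required=REQUIRED_VERSION_TAGS):
    """Return the required tag keys that are absent or empty on a model version."""
    model_version = client.get_model_version(model_name, str(version))
    tags = model_version.tags or {}
    return [key for key in required if not tags.get(key)]
```

After the cell above has run, `missing_governance_tags(client, model_name, 3)` should return an empty list, while version 2 would be reported as missing both keys.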

## 9. Loading Models from Registry

In production, load models by registered name and stage rather than by run ID, so that promoting a new version changes what is served without a code change.

In [None]:
# Load the production model
# This URI resolves to the latest version currently in the Production stage
production_model_uri = f"models:/{model_name}/Production"
production_model = mlflow.sklearn.load_model(production_model_uri)

print(f"✓ Loaded production model from registry")
print(f"  Model type: {type(production_model).__name__}")

# Make predictions with production model
sample_predictions = production_model.predict(X_test[:5])
print(f"\nSample predictions: {sample_predictions}")
print(f"Actual values: {y_test[:5]}")

In [None]:
# Load staging model for comparison
staging_model_uri = f"models:/{model_name}/Staging"
staging_model = mlflow.sklearn.load_model(staging_model_uri)

print(f"✓ Loaded staging model from registry")
print(f"  Model type: {type(staging_model).__name__}")

# Compare predictions
production_preds = production_model.predict(X_test)
staging_preds = staging_model.predict(X_test)

prod_f1 = f1_score(y_test, production_preds)
staging_f1 = f1_score(y_test, staging_preds)

print(f"\nProduction model F1: {prod_f1:.4f}")
print(f"Staging model F1: {staging_f1:.4f}")
print(f"\nDifference: {abs(prod_f1 - staging_f1):.4f}")

In [None]:
# Load specific version (useful for debugging or rollback)
specific_version_uri = f"models:/{model_name}/1"
v1_model = mlflow.sklearn.load_model(specific_version_uri)

print(f"✓ Loaded version 1 (archived) from registry")
print(f"  Model type: {type(v1_model).__name__}")

# This is useful for:
# - Debugging production issues
# - Comparing old vs new models
# - Rolling back to a previous version
v1_preds = v1_model.predict(X_test)
v1_f1 = f1_score(y_test, v1_preds)
print(f"  F1 Score: {v1_f1:.4f}")
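
The loading cells above differ only in the `models:/` URI passed to `load_model`. The sketch below collects the URI forms in one place. `registry_uri` is a hypothetical helper, and the `@alias` form applies only to MLflow versions that support model version aliases.

```python
def registry_uri(model_name, *, version=None, stage=None, alias=None):
    """Build a models:/ URI from exactly one of version, stage, or alias."""
    selectors = [s for s in (version, stage, alias) if s is not None]
    if len(selectors) != 1:
        raise ValueError("Specify exactly one of version, stage, or alias")
    if alias is not None:
        # Aliases use '@' instead of a path segment
        return f"models:/{model_name}@{alias}"
    return f"models:/{model_name}/{version if version is not None else stage}"


print(registry_uri("fraud_detection_model", stage="Production"))
print(registry_uri("fraud_detection_model", version=1))
print(registry_uri("fraud_detection_model", alias="champion"))
```

Pinning a version number suits debugging and rollback, while a stage or alias keeps serving code unchanged when a new version is promoted.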

## 10. Model Lineage and Traceability

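Trace each registered version back to the run that produced it, including its parameters and metrics. Dataset lineage is available only if you logged it during training.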

In [None]:
# Get detailed information about production model version
prod_versions = client.get_latest_versions(model_name, stages=["Production"])
if not prod_versions:
    raise RuntimeError(f"No Production version found for '{model_name}'")
prod_version = prod_versions[0]

print(f"Production Model Lineage:")
print("="*80)
print(f"Model Name: {prod_version.name}")
print(f"Version: {prod_version.version}")
print(f"Stage: {prod_version.current_stage}")
print(f"Run ID: {prod_version.run_id}")
print(f"Source: {prod_version.source}")
print(f"Created: {datetime.fromtimestamp(prod_version.creation_timestamp/1000)}")
print(f"Last Updated: {datetime.fromtimestamp(prod_version.last_updated_timestamp/1000)}")

# Get the training run details
run = client.get_run(prod_version.run_id)
print(f"\nTraining Run Details:")
print(f"  Parameters:")
for param, value in run.data.params.items():
    print(f"    {param}: {value}")
    
print(f"\n  Metrics:")
for metric, value in run.data.metrics.items():
    print(f"    {metric}: {value:.4f}")
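
For audits it can be convenient to capture the same lineage as structured data rather than printed text. A minimal sketch, assuming `client` behaves like an `MlflowClient`; `lineage_record` is a hypothetical helper, and the choice of fields is illustrative.

```python
from datetime import datetime, timezone


def lineage_record(client, model_name, version):
    """Collect registry and run metadata for one model version into a dict."""
    mv = client.get_model_version(model_name, str(version))
    run = client.get_run(mv.run_id)
    return {
        "model_name": mv.name,
        "version": int(mv.version),
        "run_id": mv.run_id,
        "source": mv.source,
        # MLflow stores timestamps as milliseconds since the Unix epoch
        "created_utc": datetime.fromtimestamp(
            mv.creation_timestamp / 1000, tz=timezone.utc
        ).isoformat(),
        "params": dict(run.data.params),
        "metrics": dict(run.data.metrics),
    }
```

The result is JSON-serializable, so something like `json.dumps(lineage_record(client, model_name, 3), indent=2)` could be written to a file or attached to a deployment ticket.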

## 11. Implementing Model Promotion Workflow

Create a function to automate model promotion based on performance criteria.

In [None]:
def promote_model_if_better(model_name, staging_version, production_version=None, 
                            metric='f1_score', threshold=0.02):
    """
    Promote staging model to production if it performs better.
    
    Args:
        model_name: Name of the registered model
        staging_version: Version number in staging
        production_version: Current production version (if exists)
        metric: Metric to compare (default: f1_score)
        threshold: Minimum improvement required (default: 0.02)
    
    Returns:
        bool: True if promoted, False otherwise
    """
    
    # Get staging model metrics
    staging_model_version = client.get_model_version(model_name, staging_version)
    staging_run = client.get_run(staging_model_version.run_id)
    staging_metric_value = staging_run.data.metrics.get(metric, 0)
    
    print(f"Staging model (v{staging_version}) {metric}: {staging_metric_value:.4f}")
    
    # Check if production model exists
    if production_version is None:
        print("No production model exists. Promoting staging to production.")
        should_promote = True
    else:
        # Get production model metrics
        prod_model_version = client.get_model_version(model_name, production_version)
        prod_run = client.get_run(prod_model_version.run_id)
        prod_metric_value = prod_run.data.metrics.get(metric, 0)
        
        print(f"Production model (v{production_version}) {metric}: {prod_metric_value:.4f}")
        
        # Check if improvement exceeds threshold
        improvement = staging_metric_value - prod_metric_value
        print(f"Improvement: {improvement:.4f} (threshold: {threshold})")
        
        should_promote = improvement >= threshold
    
    if should_promote:
        # Archive current production model
        if production_version is not None:
            client.transition_model_version_stage(
                name=model_name,
                version=production_version,
                stage="Archived"
            )
            print(f"✓ Archived production v{production_version}")
        
        # Promote staging to production
        client.transition_model_version_stage(
            name=model_name,
            version=staging_version,
            stage="Production"
        )
        print(f"✓ Promoted staging v{staging_version} to Production")
        
        # Add promotion metadata
        client.set_model_version_tag(
            name=model_name,
            version=staging_version,
            key="promoted_date",
            value=datetime.now().strftime('%Y-%m-%d %H:%M:%S')
        )
        
        return True
    else:
        print("✗ Staging model does not meet promotion criteria")
        return False

# Example: Try to promote version 2 to production
print("Attempting to promote model...")
print("="*80)
promoted = promote_model_if_better(
    model_name=model_name,
    staging_version=2,
    production_version=3,
    metric='f1_score',
    threshold=0.02
)
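
The comparison at the heart of a promotion workflow can be isolated into a pure function, which makes it testable without a registry. The sketch below is one way to do that. `should_promote` is a hypothetical helper, and treating a missing metric as `None` rather than a numeric default is a design assumption of this sketch.

```python
def should_promote(candidate_metric, production_metric, threshold=0.02):
    """Decide whether a candidate beats production by at least `threshold`.

    None means the metric was not logged: a candidate without the metric is
    never promoted, and a missing production metric means there is no baseline.
    """
    if candidate_metric is None:
        return False
    if production_metric is None:
        return True
    return (candidate_metric - production_metric) >= threshold


# Illustrative values, not results from the runs above
print(should_promote(0.80, 0.75))   # clear improvement
print(should_promote(0.76, 0.75))   # improvement below the threshold
print(should_promote(None, 0.75))   # candidate metric missing
```

Keeping missing values explicit avoids promoting or rejecting a model because of a metric that was simply never logged.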

## 12. Exercises

### Exercise 1: Register and Version a New Model

Create a new registered model for a different use case.

**Requirements:**
1. Train at least 3 different model types (your choice)
2. Register them all under the same model name "customer_churn_predictor"
3. Add descriptions and tags to each version
4. Create a comparison table showing all versions and their metrics

**Hint**: Use the same dataset or create a new one.

In [None]:
# Your solution here

# TODO: Implement your solution
# 1. Train multiple models
# 2. Register them with the same name
# 3. Add metadata
# 4. Create comparison table

### Exercise 2: Implement Stage Transition Workflow

Create a complete workflow for moving models through lifecycle stages.

**Requirements:**
1. Identify the best model from Exercise 1 based on a metric of your choice
2. Transition it through: None → Staging → Production
3. Add appropriate tags at each stage (e.g., "tested_by", "approved_by", "deployment_date")
4. Create a function that checks model readiness before transitioning stages

**Bonus**: Add validation checks (e.g., minimum accuracy threshold, required metadata)

In [None]:
# Your solution here

def check_model_readiness(model_name, version, target_stage):
    """
    Check if a model is ready to transition to the target stage.
    Add validation logic here.
    """
    # TODO: Implement validation logic
    pass

# TODO: Implement the workflow

### Exercise 3: Model Rollback Scenario

Simulate a production issue and implement a rollback strategy.

**Scenario**: Your production model (v3) is causing issues. You need to rollback to v2.

**Requirements:**
1. Transition current production model to Archived
2. Promote the previous version back to Production
3. Add tags documenting the rollback (reason, timestamp, who initiated)
4. Create a rollback report showing:
   - Which version was rolled back
   - Which version is now in production
   - Performance comparison between the two
   - Rollback timestamp and reason

In [None]:
# Your solution here

def rollback_model(model_name, rollback_to_version, reason):
    """
    Rollback production model to a previous version.
    """
    # TODO: Implement rollback logic
    pass

# TODO: Execute rollback and create report

## 13. Summary

### Key Concepts Covered

1. **Model Registry Setup**: Configured MLflow Model Registry for centralized model management
2. **Model Registration**: Registered models and created multiple versions automatically
3. **Lifecycle Stages**: Transitioned models through None → Staging → Production → Archived
4. **Metadata Management**: Added descriptions, tags, and annotations for governance
5. **Model Loading**: Loaded models by stage for production deployment
6. **Lineage Tracking**: Traced model origins, parameters, and training runs
7. **Promotion Workflow**: Automated model promotion based on performance criteria

### Best Practices

- ✅ **Use descriptive model names**: Choose names that reflect the business use case
- ✅ **Document everything**: Add descriptions to models and versions
- ✅ **Tag appropriately**: Use tags for team, approval status, deployment date
- ✅ **Load by stage, not version**: Use "Production" stage in code, not version numbers
- ✅ **Automate promotions**: Create functions to handle stage transitions
- ✅ **Track lineage**: Always link models back to training runs and data
- ✅ **Implement gates**: Add validation before promoting to production

### Common Pitfalls to Avoid

- ❌ Hardcoding version numbers in production code
- ❌ Skipping the staging stage (always test before production)
- ❌ Not documenting why models were promoted or rolled back
- ❌ Deleting old model versions (archive instead)
- ❌ Promoting models without proper validation

### What's Next

In **Module 03: Model Serialization**, we'll learn:
- Different serialization formats (pickle, joblib, ONNX)
- When to use each format
- Cross-platform and cross-language model deployment
- Model size optimization

### Additional Resources

- **MLflow Model Registry**: https://mlflow.org/docs/latest/model-registry.html
- **Model Versioning Best Practices**: https://neptune.ai/blog/version-control-for-ml-models
- **MLOps Model Governance**: https://ml-ops.org/content/model-governance

---

## Next Steps

Proceed to **Module 03: Model Serialization** to learn about different ways to save and load models for deployment.

**Before moving on, ensure you can:**
- ✅ Register models in MLflow Model Registry
- ✅ Create and manage multiple model versions
- ✅ Transition models between lifecycle stages
- ✅ Add descriptions, tags, and metadata
- ✅ Load models by stage name
- ✅ Implement automated promotion workflows