# MLflow Autologging Demo with Scikit-Learn

**Issue #219: Autologging demo (sklearn) + template notebook**

This notebook demonstrates MLflow's autologging capabilities with scikit-learn models. MLflow autologging automatically captures:

- **Parameters**: Model hyperparameters, preprocessing settings
- **Metrics**: Training metrics computed on the data passed to `fit()`
- **Artifacts**: Trained models and, for classifiers, training diagnostic plots such as the confusion matrix
- **Model Signatures**: Input/output schema for model serving

## Learning Objectives

By the end of this notebook, you will:
1. Understand MLflow autologging setup and configuration
2. Train multiple sklearn models with automatic experiment tracking
3. Compare model performance using the MLflow UI
4. Load and use logged models for prediction
5. Understand best practices for ML experiment tracking

## 1. Setup and Dependencies

First, let's import the required libraries and set up our environment.

In [None]:
# Core libraries
import os
import warnings
from datetime import datetime
from typing import Tuple, Dict, Any

# Data science libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# MLflow for experiment tracking
import mlflow
import mlflow.sklearn

# Scikit-learn for machine learning
from sklearn.datasets import make_classification, make_regression
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.linear_model import LogisticRegression, LinearRegression
from sklearn.svm import SVC
from sklearn.metrics import (
    accuracy_score, classification_report, confusion_matrix,
    mean_squared_error, r2_score, mean_absolute_error
)
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Suppress warnings for cleaner output
warnings.filterwarnings('ignore')

# Set plotting style
plt.style.use('seaborn-v0_8')
sns.set_palette("husl")

print("✅ All libraries imported successfully!")
print(f"MLflow version: {mlflow.__version__}")
print(f"Current working directory: {os.getcwd()}")

## 2. MLflow Configuration

Let's configure MLflow tracking and enable autologging for scikit-learn models.

In [None]:
# MLflow configuration
TRACKING_URI = os.getenv("MLFLOW_TRACKING_URI", "http://localhost:5001")
EXPERIMENT_NAME = "sklearn-autolog-demo-notebook"

# Set up MLflow tracking
mlflow.set_tracking_uri(TRACKING_URI)

try:
    # Create or set experiment
    experiment = mlflow.set_experiment(EXPERIMENT_NAME)
    print(f"✅ MLflow experiment set: {EXPERIMENT_NAME}")
    print(f"🔗 Tracking URI: {TRACKING_URI}")
    print(f"📊 Experiment ID: {experiment.experiment_id}")
except Exception as e:
    print(f"❌ MLflow setup failed: {e}")
    print("💡 Make sure MLflow server is running: mlflow server --host 0.0.0.0 --port 5001")

# Enable sklearn autologging with comprehensive settings
mlflow.sklearn.autolog(
    log_input_examples=True,    # Log sample input data for model serving
    log_model_signatures=True,  # Log input/output schema
    log_models=True,           # Log trained models as artifacts
    log_datasets=True,         # Log dataset information
    disable=False,             # Enable autologging
    exclusive=False,           # Allow manual logging too
    disable_for_unsupported_versions=False,
    silent=False               # Show autologging messages
)

print("\n🚀 MLflow sklearn autologging enabled!")
print("📝 The following will be automatically logged:")
print("   - Model parameters (hyperparameters)")
print("   - Training metrics")
print("   - Model artifacts")
print("   - Model signatures")
print("   - Input examples")

## 3. Dataset Creation

Let's create synthetic datasets for both classification and regression tasks.

In [None]:
def create_classification_dataset(n_samples=1000, n_features=10, random_state=42):
    """
    Create a synthetic classification dataset.
    """
    X, y = make_classification(
        n_samples=n_samples,
        n_features=n_features,
        n_informative=int(n_features * 0.7),
        n_redundant=int(n_features * 0.2),
        n_clusters_per_class=1,
        class_sep=1.0,
        random_state=random_state
    )
    
    # Create feature names
    feature_names = [f"feature_{i+1}" for i in range(n_features)]
    
    # Convert to DataFrame
    X_df = pd.DataFrame(X, columns=feature_names)
    y_series = pd.Series(y, name="target")
    
    return X_df, y_series

def create_regression_dataset(n_samples=1000, n_features=8, noise=0.1, random_state=42):
    """
    Create a synthetic regression dataset.
    """
    X, y = make_regression(
        n_samples=n_samples,
        n_features=n_features,
        n_informative=int(n_features * 0.8),
        noise=noise,
        random_state=random_state
    )
    
    # Create feature names
    feature_names = [f"feature_{i+1}" for i in range(n_features)]
    
    # Convert to DataFrame
    X_df = pd.DataFrame(X, columns=feature_names)
    y_series = pd.Series(y, name="target")
    
    return X_df, y_series

# Create datasets
print("📊 Creating synthetic datasets...")

# Classification dataset
X_clf, y_clf = create_classification_dataset(n_samples=1500, n_features=12, random_state=42)
X_train_clf, X_test_clf, y_train_clf, y_test_clf = train_test_split(
    X_clf, y_clf, test_size=0.2, random_state=42, stratify=y_clf
)

# Regression dataset
X_reg, y_reg = create_regression_dataset(n_samples=1200, n_features=10, noise=0.15, random_state=42)
X_train_reg, X_test_reg, y_train_reg, y_test_reg = train_test_split(
    X_reg, y_reg, test_size=0.2, random_state=42
)

print(f"✅ Classification dataset: {X_clf.shape[0]} samples, {X_clf.shape[1]} features")
print(f"   - Train: {len(X_train_clf)} samples")
print(f"   - Test: {len(X_test_clf)} samples")
print(f"   - Classes: {sorted(y_clf.unique())}")

print(f"\n✅ Regression dataset: {X_reg.shape[0]} samples, {X_reg.shape[1]} features")
print(f"   - Train: {len(X_train_reg)} samples")
print(f"   - Test: {len(X_test_reg)} samples")
print(f"   - Target range: [{y_reg.min():.2f}, {y_reg.max():.2f}]")
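
Two details of the dataset cell are easy to miss: the `int(...)` truncation decides how many features carry signal, and `stratify=y_clf` keeps the class balance the same in both splits. The sketch below repeats the classification settings (12 features) and checks both; the variable names are chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

n_features = 12
n_informative = int(n_features * 0.7)  # int(8.4) -> 8
n_redundant = int(n_features * 0.2)    # int(2.4) -> 2
n_noise = n_features - n_informative - n_redundant  # the rest carry no signal

X, y = make_classification(
    n_samples=1500, n_features=n_features,
    n_informative=n_informative, n_redundant=n_redundant,
    n_clusters_per_class=1, class_sep=1.0, random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# With stratification, the share of class 1 is nearly identical in both splits
train_share = y_train.mean()
test_share = y_test.mean()
print(n_informative, n_redundant, n_noise)
print(f"class-1 share: train={train_share:.3f}, test={test_share:.3f}")
```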

## 4. Exploratory Data Analysis

Let's visualize our datasets to understand their characteristics.

In [None]:
# Create visualizations
fig, axes = plt.subplots(2, 2, figsize=(15, 10))

# Classification target distribution
y_clf.value_counts().plot(kind='bar', ax=axes[0, 0])
axes[0, 0].set_title('Classification Target Distribution')
axes[0, 0].set_xlabel('Class')
axes[0, 0].set_ylabel('Count')

# Classification feature correlation heatmap (first 8 features)
corr_clf = X_clf.iloc[:, :8].corr()
sns.heatmap(corr_clf, annot=True, cmap='coolwarm', center=0, ax=axes[0, 1])
axes[0, 1].set_title('Classification Features Correlation')

# Regression target distribution
y_reg.hist(bins=30, ax=axes[1, 0])
axes[1, 0].set_title('Regression Target Distribution')
axes[1, 0].set_xlabel('Target Value')
axes[1, 0].set_ylabel('Frequency')

# Regression feature correlation heatmap (first 8 features)
corr_reg = X_reg.iloc[:, :8].corr()
sns.heatmap(corr_reg, annot=True, cmap='coolwarm', center=0, ax=axes[1, 1])
axes[1, 1].set_title('Regression Features Correlation')

plt.tight_layout()
plt.show()

print("📈 Dataset visualization complete!")

## 5. Classification Models with MLflow Autologging

Now let's train several classification models and see what MLflow logs automatically alongside the metrics and artifacts we add by hand.

### 5.1 Logistic Regression with Pipeline

In [None]:
# Logistic Regression with preprocessing pipeline
print("🚀 Training Logistic Regression with Pipeline...")

with mlflow.start_run(run_name=f"logistic_regression_{datetime.now().strftime('%Y%m%d_%H%M%S')}") as run:
    
    # Add custom tags
    mlflow.set_tag("model_type", "logistic_regression")
    mlflow.set_tag("task_type", "classification")
    mlflow.set_tag("notebook_section", "5.1")
    mlflow.set_tag("dataset_type", "synthetic")
    
    # Create pipeline (autologging will capture all parameters)
    lr_pipeline = Pipeline([
        ('scaler', StandardScaler()),
        ('classifier', LogisticRegression(
            random_state=42,
            max_iter=1000,
            C=1.0,
            solver='liblinear'
        ))
    ])
    
    # Train model (autologging captures training metrics)
    lr_pipeline.fit(X_train_clf, y_train_clf)
    
    # Make predictions
    y_pred_lr = lr_pipeline.predict(X_test_clf)
    y_pred_proba_lr = lr_pipeline.predict_proba(X_test_clf)
    
    # Calculate additional metrics
    accuracy_lr = accuracy_score(y_test_clf, y_pred_lr)
    
    # Log custom metrics
    mlflow.log_metric("test_accuracy", accuracy_lr)
    mlflow.log_metric("n_features", X_train_clf.shape[1])
    mlflow.log_metric("train_samples", len(X_train_clf))
    mlflow.log_metric("test_samples", len(X_test_clf))
    
    # Log classification report as artifact
    report_lr = classification_report(y_test_clf, y_pred_lr, output_dict=True)
    mlflow.log_dict(report_lr, "classification_report.json")
    
    # Create and log confusion matrix plot
    plt.figure(figsize=(8, 6))
    cm = confusion_matrix(y_test_clf, y_pred_lr)
    sns.heatmap(cm, annot=True, fmt='d', cmap='Blues')
    plt.title('Logistic Regression - Confusion Matrix')
    plt.ylabel('True Label')
    plt.xlabel('Predicted Label')
    plt.tight_layout()
    mlflow.log_figure(plt.gcf(), "confusion_matrix.png")
    plt.show()
    
    print(f"✅ Logistic Regression completed!")
    print(f"   - Test Accuracy: {accuracy_lr:.4f}")
    print(f"   - MLflow Run ID: {run.info.run_id}")
    
    lr_run_id = run.info.run_id
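
For a `Pipeline`, the parameter names that show up in the MLflow UI follow scikit-learn's `step__param` convention, since sklearn autologging records the output of `get_params(deep=True)`. The sketch below prints those names for the same pipeline without needing a tracking server.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("classifier", LogisticRegression(
        random_state=42, max_iter=1000, C=1.0, solver="liblinear"
    )),
])

# deep=True flattens nested estimators into "<step>__<param>" keys
params = pipeline.get_params(deep=True)
nested = {name: value for name, value in params.items() if "__" in name}

for name in sorted(nested):
    print(f"{name} = {nested[name]}")
```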

### 5.2 Random Forest Classifier

In [None]:
# Random Forest Classifier
print("🌲 Training Random Forest Classifier...")

with mlflow.start_run(run_name=f"random_forest_{datetime.now().strftime('%Y%m%d_%H%M%S')}") as run:
    
    # Add custom tags
    mlflow.set_tag("model_type", "random_forest")
    mlflow.set_tag("task_type", "classification")
    mlflow.set_tag("notebook_section", "5.2")
    mlflow.set_tag("dataset_type", "synthetic")
    
    # Create model (autologging will capture all hyperparameters)
    rf_model = RandomForestClassifier(
        n_estimators=100,
        random_state=42,
        max_depth=10,
        min_samples_split=5,
        min_samples_leaf=2,
        bootstrap=True
    )
    
    # Train model
    rf_model.fit(X_train_clf, y_train_clf)
    
    # Make predictions
    y_pred_rf = rf_model.predict(X_test_clf)
    
    # Calculate metrics
    accuracy_rf = accuracy_score(y_test_clf, y_pred_rf)
    
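    # Log additional metrics
    # Note: impurity-based importances normally sum to 1, so their mean is
    # just 1 / n_features; the standard deviation is the more informative value.
    mlflow.log_metric("test_accuracy", accuracy_rf)
    mlflow.log_metric("feature_importances_mean", np.mean(rf_model.feature_importances_))
    mlflow.log_metric("feature_importances_std", np.std(rf_model.feature_importances_))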
    
    # Log feature importances
    feature_importance_df = pd.DataFrame({
        'feature': X_train_clf.columns,
        'importance': rf_model.feature_importances_
    }).sort_values('importance', ascending=False)
    
    mlflow.log_dict(feature_importance_df.to_dict('records'), "feature_importances.json")
    
    # Create and log feature importance plot
    plt.figure(figsize=(10, 6))
    sns.barplot(data=feature_importance_df.head(10), x='importance', y='feature')
    plt.title('Random Forest - Top 10 Feature Importances')
    plt.xlabel('Importance')
    plt.tight_layout()
    mlflow.log_figure(plt.gcf(), "feature_importances.png")
    plt.show()
    
    print(f"✅ Random Forest completed!")
    print(f"   - Test Accuracy: {accuracy_rf:.4f}")
    print(f"   - Top feature: {feature_importance_df.iloc[0]['feature']} ({feature_importance_df.iloc[0]['importance']:.4f})")
    print(f"   - MLflow Run ID: {run.info.run_id}")
    
    rf_run_id = run.info.run_id
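
One caveat about the `feature_importances_mean` metric logged above: scikit-learn normalizes impurity-based importances so they sum to 1, which makes their mean equal to `1 / n_features` whatever the model learned. The sketch below checks this on a small synthetic dataset (sizes chosen for illustration); the spread or the maximum is usually the more useful value to track.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

n_features = 12
X, y = make_classification(
    n_samples=300, n_features=n_features,
    n_informative=8, n_redundant=2, random_state=42,
)
model = RandomForestClassifier(n_estimators=20, random_state=42).fit(X, y)

importances = model.feature_importances_
print(f"sum  = {importances.sum():.6f}")   # normalized to 1
print(f"mean = {importances.mean():.6f}")  # therefore 1 / n_features
print(f"std  = {importances.std():.6f}")   # varies with the fitted model
```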

### 5.3 Support Vector Machine

In [None]:
# Support Vector Machine with different parameters
print("⚡ Training Support Vector Machine...")

with mlflow.start_run(run_name=f"svm_{datetime.now().strftime('%Y%m%d_%H%M%S')}") as run:
    
    # Add custom tags
    mlflow.set_tag("model_type", "svm")
    mlflow.set_tag("task_type", "classification")
    mlflow.set_tag("notebook_section", "5.3")
    mlflow.set_tag("dataset_type", "synthetic")
    
    # Create pipeline with SVM
    svm_pipeline = Pipeline([
        ('scaler', StandardScaler()),
        ('classifier', SVC(
            random_state=42,
            C=1.0,
            kernel='rbf',
            gamma='scale',
            probability=True  # Enable probability predictions
        ))
    ])
    
    # Train model
    svm_pipeline.fit(X_train_clf, y_train_clf)
    
    # Make predictions
    y_pred_svm = svm_pipeline.predict(X_test_clf)
    y_pred_proba_svm = svm_pipeline.predict_proba(X_test_clf)
    
    # Calculate metrics
    accuracy_svm = accuracy_score(y_test_clf, y_pred_svm)
    
    # Log custom metrics
    mlflow.log_metric("test_accuracy", accuracy_svm)
    mlflow.log_metric("n_support_vectors", sum(svm_pipeline.named_steps['classifier'].n_support_))
    
    # Cross-validation score
    cv_scores = cross_val_score(svm_pipeline, X_train_clf, y_train_clf, cv=5)
    mlflow.log_metric("cv_score_mean", cv_scores.mean())
    mlflow.log_metric("cv_score_std", cv_scores.std())
    
    print(f"✅ SVM completed!")
    print(f"   - Test Accuracy: {accuracy_svm:.4f}")
    print(f"   - CV Score: {cv_scores.mean():.4f} ± {cv_scores.std():.4f}")
    print(f"   - Support Vectors: {sum(svm_pipeline.named_steps['classifier'].n_support_)}")
    print(f"   - MLflow Run ID: {run.info.run_id}")
    
    svm_run_id = run.info.run_id

## 6. Regression Models with MLflow Autologging

Now let's demonstrate autologging with regression models.

### 6.1 Linear Regression

In [None]:
# Linear Regression with Pipeline
print("📈 Training Linear Regression...")

with mlflow.start_run(run_name=f"linear_regression_{datetime.now().strftime('%Y%m%d_%H%M%S')}") as run:
    
    # Add custom tags
    mlflow.set_tag("model_type", "linear_regression")
    mlflow.set_tag("task_type", "regression")
    mlflow.set_tag("notebook_section", "6.1")
    mlflow.set_tag("dataset_type", "synthetic")
    
    # Create pipeline
    linear_pipeline = Pipeline([
        ('scaler', StandardScaler()),
        ('regressor', LinearRegression())
    ])
    
    # Train model
    linear_pipeline.fit(X_train_reg, y_train_reg)
    
    # Make predictions
    y_pred_linear = linear_pipeline.predict(X_test_reg)
    
    # Calculate metrics
    mse_linear = mean_squared_error(y_test_reg, y_pred_linear)
    r2_linear = r2_score(y_test_reg, y_pred_linear)
    mae_linear = mean_absolute_error(y_test_reg, y_pred_linear)
    
    # Log additional metrics
    mlflow.log_metric("test_mse", mse_linear)
    mlflow.log_metric("test_r2", r2_linear)
    mlflow.log_metric("test_mae", mae_linear)
    mlflow.log_metric("test_rmse", np.sqrt(mse_linear))
    
    # Create and log prediction vs actual plot
    plt.figure(figsize=(10, 6))
    plt.scatter(y_test_reg, y_pred_linear, alpha=0.6)
    plt.plot([y_test_reg.min(), y_test_reg.max()], [y_test_reg.min(), y_test_reg.max()], 'r--', lw=2)
    plt.xlabel('Actual Values')
    plt.ylabel('Predicted Values')
    plt.title(f'Linear Regression - Predicted vs Actual (R² = {r2_linear:.4f})')
    plt.tight_layout()
    mlflow.log_figure(plt.gcf(), "predictions_vs_actual.png")
    plt.show()
    
    print(f"✅ Linear Regression completed!")
    print(f"   - Test R²: {r2_linear:.4f}")
    print(f"   - Test RMSE: {np.sqrt(mse_linear):.4f}")
    print(f"   - Test MAE: {mae_linear:.4f}")
    print(f"   - MLflow Run ID: {run.info.run_id}")
    
    linear_run_id = run.info.run_id
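
The four regression metrics logged above are closely related. As a worked example on four hand-picked values (illustrative only, not taken from this notebook's data), the sketch below computes each metric from its definition and compares it with scikit-learn.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

errors = y_true - y_pred
mse = np.mean(errors ** 2)       # mean squared error
rmse = np.sqrt(mse)              # back in the units of the target
mae = np.mean(np.abs(errors))    # mean absolute error
ss_res = np.sum(errors ** 2)                      # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)    # total sum of squares
r2 = 1.0 - ss_res / ss_tot       # share of variance explained

# The hand-computed values agree with scikit-learn's implementations
matches = (
    np.isclose(mse, mean_squared_error(y_true, y_pred))
    and np.isclose(mae, mean_absolute_error(y_true, y_pred))
    and np.isclose(r2, r2_score(y_true, y_pred))
)
print(f"MSE={mse:.3f} RMSE={rmse:.3f} MAE={mae:.3f} R2={r2:.3f}")
# prints: MSE=0.375 RMSE=0.612 MAE=0.500 R2=0.949
```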

### 6.2 Random Forest Regressor

In [None]:
# Random Forest Regressor
print("🌲 Training Random Forest Regressor...")

with mlflow.start_run(run_name=f"rf_regressor_{datetime.now().strftime('%Y%m%d_%H%M%S')}") as run:
    
    # Add custom tags
    mlflow.set_tag("model_type", "random_forest_regressor")
    mlflow.set_tag("task_type", "regression")
    mlflow.set_tag("notebook_section", "6.2")
    mlflow.set_tag("dataset_type", "synthetic")
    
    # Create model
    rf_regressor = RandomForestRegressor(
        n_estimators=100,
        random_state=42,
        max_depth=15,
        min_samples_split=5,
        min_samples_leaf=2,
        bootstrap=True
    )
    
    # Train model
    rf_regressor.fit(X_train_reg, y_train_reg)
    
    # Make predictions
    y_pred_rf_reg = rf_regressor.predict(X_test_reg)
    
    # Calculate metrics
    mse_rf_reg = mean_squared_error(y_test_reg, y_pred_rf_reg)
    r2_rf_reg = r2_score(y_test_reg, y_pred_rf_reg)
    mae_rf_reg = mean_absolute_error(y_test_reg, y_pred_rf_reg)
    
    # Log additional metrics
    mlflow.log_metric("test_mse", mse_rf_reg)
    mlflow.log_metric("test_r2", r2_rf_reg)
    mlflow.log_metric("test_mae", mae_rf_reg)
    mlflow.log_metric("test_rmse", np.sqrt(mse_rf_reg))
    
    # Log feature importances
    feature_importance_reg_df = pd.DataFrame({
        'feature': X_train_reg.columns,
        'importance': rf_regressor.feature_importances_
    }).sort_values('importance', ascending=False)
    
    mlflow.log_dict(feature_importance_reg_df.to_dict('records'), "feature_importances.json")
    
    # Create residuals plot
    residuals = y_test_reg - y_pred_rf_reg
    
    fig, axes = plt.subplots(1, 2, figsize=(15, 6))
    
    # Residuals vs predicted
    axes[0].scatter(y_pred_rf_reg, residuals, alpha=0.6)
    axes[0].axhline(y=0, color='r', linestyle='--')
    axes[0].set_xlabel('Predicted Values')
    axes[0].set_ylabel('Residuals')
    axes[0].set_title('Residuals vs Predicted')
    
    # Feature importances
    sns.barplot(data=feature_importance_reg_df.head(8), x='importance', y='feature', ax=axes[1])
    axes[1].set_title('Top 8 Feature Importances')
    
    plt.tight_layout()
    mlflow.log_figure(plt.gcf(), "regression_analysis.png")
    plt.show()
    
    print(f"✅ Random Forest Regressor completed!")
    print(f"   - Test R²: {r2_rf_reg:.4f}")
    print(f"   - Test RMSE: {np.sqrt(mse_rf_reg):.4f}")
    print(f"   - Top feature: {feature_importance_reg_df.iloc[0]['feature']} ({feature_importance_reg_df.iloc[0]['importance']:.4f})")
    print(f"   - MLflow Run ID: {run.info.run_id}")
    
    rf_reg_run_id = run.info.run_id

## 7. Model Comparison and Analysis

Let's compare all our trained models using MLflow data.

In [None]:
# Get experiment and runs
experiment = mlflow.get_experiment_by_name(EXPERIMENT_NAME)
runs = mlflow.search_runs(experiment_ids=[experiment.experiment_id])

print(f"📊 Experiment Summary: {EXPERIMENT_NAME}")
print(f"Total runs: {len(runs)}")
print("\n" + "="*80)

# Filter and display classification models
classification_runs = runs[runs['tags.task_type'] == 'classification'].copy()
if not classification_runs.empty:
    print("🎯 CLASSIFICATION MODELS")
    print("-" * 40)
    
    for _, row in classification_runs.iterrows():
        model_type = row.get('tags.model_type', 'Unknown')
        # Default to NaN so the float format spec below never receives a string
        accuracy = row.get('metrics.test_accuracy', float('nan'))
        run_id = row['run_id'][:8]
        
        print(f"  {model_type:20} | Accuracy: {accuracy:6.4f} | Run: {run_id}")

# Filter and display regression models
regression_runs = runs[runs['tags.task_type'] == 'regression'].copy()
if not regression_runs.empty:
    print("\n📈 REGRESSION MODELS")
    print("-" * 40)
    
    for _, row in regression_runs.iterrows():
        model_type = row.get('tags.model_type', 'Unknown')
        # Default to NaN so the float format specs below never receive a string
        r2 = row.get('metrics.test_r2', float('nan'))
        rmse = row.get('metrics.test_rmse', float('nan'))
        run_id = row['run_id'][:8]
        
        print(f"  {model_type:20} | R²: {r2:6.4f} | RMSE: {rmse:7.4f} | Run: {run_id}")

print("\n" + "="*80)

# Create comparison visualization
if not classification_runs.empty or not regression_runs.empty:
    plt.figure(figsize=(12, 5))
    
    # Classification accuracy comparison
    plt.subplot(1, 2, 1)
    if 'metrics.test_accuracy' in classification_runs.columns:
        model_names = classification_runs['tags.model_type'].values
        accuracies = classification_runs['metrics.test_accuracy'].values
        
        bars = plt.bar(model_names, accuracies)
        plt.title('Classification Model Accuracy Comparison')
        plt.ylabel('Test Accuracy')
        plt.xticks(rotation=45)
        
        # Add value labels on bars
        for bar, acc in zip(bars, accuracies):
            plt.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.01, 
                    f'{acc:.3f}', ha='center', va='bottom')
    
    # Regression R² comparison
    plt.subplot(1, 2, 2)
    if not regression_runs.empty and 'metrics.test_r2' in regression_runs.columns:
        reg_model_names = regression_runs['tags.model_type'].values
        r2_scores = regression_runs['metrics.test_r2'].values
        
        bars = plt.bar(reg_model_names, r2_scores)
        plt.title('Regression Model R² Comparison')
        plt.ylabel('Test R²')
        plt.xticks(rotation=45)
        
        # Add value labels on bars
        for bar, r2 in zip(bars, r2_scores):
            plt.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.01, 
                    f'{r2:.3f}', ha='center', va='bottom')
    
    plt.tight_layout()
    plt.show()

print(f"\n🔗 View detailed results in MLflow UI: {TRACKING_URI}")
print(f"📁 Experiment: {EXPERIMENT_NAME}")

## 8. Loading and Using Logged Models

One of the key benefits of MLflow is the ability to easily load and use previously trained models.

In [None]:
# Load the best classification model (highest accuracy)
if not classification_runs.empty and 'metrics.test_accuracy' in classification_runs.columns:
    best_clf_run = classification_runs.loc[classification_runs['metrics.test_accuracy'].idxmax()]
    best_clf_run_id = best_clf_run['run_id']
    best_clf_model_type = best_clf_run['tags.model_type']
    best_clf_accuracy = best_clf_run['metrics.test_accuracy']
    
    print(f"🏆 Loading best classification model: {best_clf_model_type}")
    print(f"   Run ID: {best_clf_run_id}")
    print(f"   Accuracy: {best_clf_accuracy:.4f}")
    
    # Load model using MLflow
    model_uri = f"runs:/{best_clf_run_id}/model"
    loaded_clf_model = mlflow.sklearn.load_model(model_uri)
    
    # Test the loaded model
    test_predictions = loaded_clf_model.predict(X_test_clf.head(5))
    test_probabilities = loaded_clf_model.predict_proba(X_test_clf.head(5))
    
    print("\n🧪 Testing loaded model on first 5 test samples:")
    for i in range(5):
        actual = y_test_clf.iloc[i]
        predicted = test_predictions[i]
        prob = test_probabilities[i].max()
        print(f"   Sample {i+1}: Actual={actual}, Predicted={predicted}, Confidence={prob:.3f}")
    
    print("\n✅ Model loaded and tested successfully!")

# Load the best regression model (highest R²)
if not regression_runs.empty and 'metrics.test_r2' in regression_runs.columns:
    best_reg_run = regression_runs.loc[regression_runs['metrics.test_r2'].idxmax()]
    best_reg_run_id = best_reg_run['run_id']
    best_reg_model_type = best_reg_run['tags.model_type']
    best_reg_r2 = best_reg_run['metrics.test_r2']
    
    print(f"\n🏆 Loading best regression model: {best_reg_model_type}")
    print(f"   Run ID: {best_reg_run_id}")
    print(f"   R²: {best_reg_r2:.4f}")
    
    # Load model
    model_uri = f"runs:/{best_reg_run_id}/model"
    loaded_reg_model = mlflow.sklearn.load_model(model_uri)
    
    # Test the loaded model
    test_reg_predictions = loaded_reg_model.predict(X_test_reg.head(5))
    
    print("\n🧪 Testing loaded regression model on first 5 test samples:")
    for i in range(5):
        actual = y_test_reg.iloc[i]
        predicted = test_reg_predictions[i]
        error = abs(actual - predicted)
        print(f"   Sample {i+1}: Actual={actual:8.3f}, Predicted={predicted:8.3f}, Error={error:6.3f}")
    
    print("\n✅ Regression model loaded and tested successfully!")

## 9. Best Practices and Key Takeaways

Let's summarize what we've learned about MLflow autologging with scikit-learn.

### 9.1 What MLflow Autologging Captured

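Across the runs in this notebook, MLflow recorded the following. Items marked *(manual)* came from our explicit `mlflow.log_*` and `mlflow.set_tag` calls rather than from autologging.

**Parameters:**
- All model hyperparameters (C, max_iter, n_estimators, etc.)
- Pipeline step parameters
- Preprocessing parameters

**Metrics:**
- Training score and other training-set metrics
- Cross-validation scores *(manual; autologging records these only for search estimators such as `GridSearchCV`)*
- Test-set metrics *(manual)*

**Artifacts:**
- Trained model (serialized for reuse)
- Model signature (input/output schema)
- Input examples (for model serving)
- Plots and reports *(manual)*

**Metadata:**
- Estimator name and class, plus the scikit-learn version stored with the model
- Run start and end times
- Custom tags *(manual)*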

### 9.2 Best Practices Demonstrated

1. **Consistent Tagging**: Use tags to categorize and filter runs
2. **Custom Metrics**: Log additional metrics beyond autologged ones
3. **Visualization**: Save plots as artifacts for later analysis
4. **Model Comparison**: Use MLflow data for systematic comparison
5. **Model Reuse**: Load and test saved models easily
6. **Documentation**: Use meaningful run names and descriptions
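
Practices 1 and 6 are easier to keep consistent when the run name and tags come from one place. The sketch below is a hypothetical helper (`build_run_metadata` is not part of MLflow) that mirrors the tag keys and timestamped run names used in this notebook.

```python
from datetime import datetime
from typing import Dict, Optional, Tuple


def build_run_metadata(
    model_type: str,
    task_type: str,
    section: str,
    now: Optional[datetime] = None,
) -> Tuple[str, Dict[str, str]]:
    """Return a timestamped run name and the standard tag set for one run."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    tags = {
        "model_type": model_type,
        "task_type": task_type,
        "notebook_section": section,
        "dataset_type": "synthetic",
    }
    return f"{model_type}_{stamp}", tags


# A fixed timestamp keeps the example output reproducible
run_name, tags = build_run_metadata(
    "random_forest", "classification", "5.2", now=datetime(2024, 1, 2, 3, 4, 5)
)
print(run_name)  # random_forest_20240102_030405
print(tags)
```

Both values can be passed straight to `mlflow.start_run(run_name=run_name, tags=tags)`.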

In [None]:
# Final summary
print("🎉 MLflow Autologging Demo Complete!")
print("=" * 50)

total_runs = len(runs) if 'runs' in locals() else 0
print(f"📊 Total ML experiments tracked: {total_runs}")
print(f"🔗 MLflow UI: {TRACKING_URI}")
print(f"📁 Experiment: {EXPERIMENT_NAME}")

print("\n✨ What you've learned:")
print("   ✅ MLflow autologging setup and configuration")
print("   ✅ Automatic parameter and metric tracking")
print("   ✅ Model artifact management")
print("   ✅ Custom logging and visualization")
print("   ✅ Model loading and reuse")
print("   ✅ Experiment comparison and analysis")

print("\n🚀 Next steps:")
print("   • Explore the MLflow UI to see all logged information")
print("   • Try modifying hyperparameters and compare results")
print("   • Register your best models for production use")
print("   • Set up model serving with MLflow Models")

print("\n💡 Remember: MLflow autologging works with minimal code changes")
print("   Just call mlflow.sklearn.autolog() and start training!")