# MLflow Experiment Tracking - Heart Disease Prediction
## MLOps Assignment - Task 3

**Objective:** Use MLflow to track ML experiments: log parameters, metrics, and artifacts for each run, then compare the resulting models.

**Key Features:**
- Automated experiment tracking
- Parameter logging for reproducibility
- Metric tracking across multiple runs
- Artifact logging (models, plots, data)
- Experiment comparison and visualization

**Models Tracked:**
1. Logistic Regression
2. Random Forest
3. XGBoost

## 1. Setup and Imports

In [None]:
import sys
sys.path.append('../src')

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
import mlflow
import mlflow.sklearn
import mlflow.xgboost
from pathlib import Path

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from xgboost import XGBClassifier
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score,
    roc_auc_score, confusion_matrix, roc_curve
)

from preprocessing import load_data, create_preprocessing_pipeline
from train_mlflow import MLflowModelTrainer

warnings.filterwarnings('ignore')
plt.style.use('seaborn-v0_8-darkgrid')
%matplotlib inline

pd.set_option('display.max_columns', None)
pd.set_option('display.precision', 4)

# Make sure the output directory used by plt.savefig() below exists
Path('../screenshots').mkdir(parents=True, exist_ok=True)

print("Libraries imported successfully!")
print(f"MLflow version: {mlflow.__version__}")

## 2. MLflow Setup

In [None]:
# Set experiment name
experiment_name = "heart-disease-prediction"
mlflow.set_experiment(experiment_name)

# Get experiment info
experiment = mlflow.get_experiment_by_name(experiment_name)

print(f"Experiment Name: {experiment_name}")
print(f"Experiment ID: {experiment.experiment_id}")
print(f"Artifact Location: {experiment.artifact_location}")
print(f"\nMLflow Tracking URI: {mlflow.get_tracking_uri()}")

## 3. Load and Prepare Data

In [None]:
# Load data
print("Loading data...")
X, y = load_data('../data/heart_disease_clean.csv')

print(f"\nDataset Shape: {X.shape}")
print(f"Target Distribution:\n{y.value_counts()}")

# Create preprocessing pipeline
print("\nCreating preprocessing pipeline...")
preprocessing_pipeline = create_preprocessing_pipeline(
    handle_outliers=True,
    feature_engineering=True
)

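# Train-test split (done before fitting the pipeline so that test rows
# do not influence the fitted preprocessing statistics)
X_train_raw, X_test_raw, y_train, y_test = train_test_split(
    X, y,
    test_size=0.2,
    random_state=42,
    stratify=y
)

# Fit the pipeline on the training split only, then apply it to the test split
X_train = preprocessing_pipeline.fit_transform(X_train_raw)
X_test = preprocessing_pipeline.transform(X_test_raw)
print(f"Transformed Shape (train): {X_train.shape}")
print(f"Features created: {X_train.shape[1] - X.shape[1]}")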

print(f"\nTrain set: {X_train.shape[0]} samples")
print(f"Test set: {X_test.shape[0]} samples")
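
Preprocessing statistics are usually learned from the training split only and then reused on the test split, so that held-out rows do not influence the transformation. The sketch below shows that order with a hand-written standardiser. The `fit_scaler` and `apply_scaler` helpers and the toy values are illustrative assumptions; they are not part of `preprocessing.py`.

```python
from statistics import mean, pstdev


def fit_scaler(values):
    """Learn the mean and population std from training values only."""
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # fall back to 1.0 for constant columns
    return mu, sigma


def apply_scaler(values, mu, sigma):
    """Standardise values using statistics that were learned elsewhere."""
    return [(v - mu) / sigma for v in values]


# Toy data (hypothetical): one numeric column split into train and test
train_col = [40.0, 50.0, 60.0]
test_col = [70.0]

# Fit on the training split, then apply the same statistics to both splits
mu, sigma = fit_scaler(train_col)
train_scaled = apply_scaler(train_col, mu, sigma)
test_scaled = apply_scaler(test_col, mu, sigma)
```

The test value is scaled with the training mean and standard deviation, which is the same contract as calling `fit_transform` on the training data and `transform` on the test data.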

## 4. Experiment 1 - Logistic Regression with MLflow

In [None]:
print("="*70)
print("EXPERIMENT 1: LOGISTIC REGRESSION")
print("="*70)

with mlflow.start_run(run_name="Logistic_Regression_Baseline") as run:
    
    # Set tags
    mlflow.set_tag("model_type", "Logistic Regression")
    mlflow.set_tag("framework", "sklearn")
    mlflow.set_tag("purpose", "baseline_model")
    
    # Define hyperparameters
    params = {
        'max_iter': 1000,
        'solver': 'liblinear',
        'C': 1.0,
        'random_state': 42
    }
    
    # Log parameters
    for param, value in params.items():
        mlflow.log_param(param, value)
    
    mlflow.log_param("train_samples", X_train.shape[0])
    mlflow.log_param("test_samples", X_test.shape[0])
    mlflow.log_param("n_features", X_train.shape[1])
    
    # Train model
    model = LogisticRegression(**params)
    model.fit(X_train, y_train)
    
    # Predictions
    y_pred = model.predict(X_test)
    y_proba = model.predict_proba(X_test)[:, 1]
    
    # Calculate metrics
    metrics = {
        'accuracy': accuracy_score(y_test, y_pred),
        'precision': precision_score(y_test, y_pred),
        'recall': recall_score(y_test, y_pred),
        'f1_score': f1_score(y_test, y_pred),
        'roc_auc': roc_auc_score(y_test, y_proba)
    }
    
    # Log metrics
    for metric, value in metrics.items():
        mlflow.log_metric(metric, value)
    
    # Log model
    mlflow.sklearn.log_model(model, "model")
    
    # Create and log confusion matrix
    cm = confusion_matrix(y_test, y_pred)
    fig, ax = plt.subplots(figsize=(8, 6))
    sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', ax=ax,
               xticklabels=['No Disease', 'Disease'],
               yticklabels=['No Disease', 'Disease'])
    ax.set_title('Logistic Regression - Confusion Matrix', fontweight='bold')
    ax.set_ylabel('True Label')
    ax.set_xlabel('Predicted Label')
    
    cm_path = '../screenshots/lr_confusion_matrix_mlflow.png'
    plt.savefig(cm_path, dpi=300, bbox_inches='tight')
    mlflow.log_artifact(cm_path)
    plt.show()
    
    # Print results
    print(f"\nRun ID: {run.info.run_id}")
    print(f"\nMetrics:")
    for metric, value in metrics.items():
        print(f"  {metric}: {value:.4f}")
    
    lr_run_id = run.info.run_id
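
The scalar metrics logged above can all be derived from the four cells of the binary confusion matrix, which helps when reading the heatmap next to the logged values. A minimal sketch, assuming the scikit-learn layout `[[tn, fp], [fn, tp]]`; the helper name `metrics_from_confusion` and the example counts are made up for illustration.

```python
def metrics_from_confusion(tn, fp, fn, tp):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    Ratios with a zero denominator are returned as 0.0.
    """
    total = tn + fp + fn + tp
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1_score": f1,
    }


# Hypothetical counts, not results from this notebook
example = metrics_from_confusion(tn=25, fp=5, fn=3, tp=27)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```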

## 5. Experiment 2 - Random Forest with MLflow

In [None]:
print("="*70)
print("EXPERIMENT 2: RANDOM FOREST")
print("="*70)

with mlflow.start_run(run_name="Random_Forest_Ensemble") as run:
    
    # Set tags
    mlflow.set_tag("model_type", "Random Forest")
    mlflow.set_tag("framework", "sklearn")
    mlflow.set_tag("purpose", "ensemble_model")
    
    # Define hyperparameters
    params = {
        'n_estimators': 100,
        'max_depth': 10,
        'min_samples_split': 5,
        'min_samples_leaf': 2,
        'random_state': 42,
        'n_jobs': -1
    }
    
    # Log parameters
    for param, value in params.items():
        mlflow.log_param(param, value)
    
    mlflow.log_param("train_samples", X_train.shape[0])
    mlflow.log_param("test_samples", X_test.shape[0])
    mlflow.log_param("n_features", X_train.shape[1])
    
    # Train model
    model = RandomForestClassifier(**params)
    model.fit(X_train, y_train)
    
    # Predictions
    y_pred = model.predict(X_test)
    y_proba = model.predict_proba(X_test)[:, 1]
    
    # Calculate metrics
    metrics = {
        'accuracy': accuracy_score(y_test, y_pred),
        'precision': precision_score(y_test, y_pred),
        'recall': recall_score(y_test, y_pred),
        'f1_score': f1_score(y_test, y_pred),
        'roc_auc': roc_auc_score(y_test, y_proba)
    }
    
    # Log metrics
    for metric, value in metrics.items():
        mlflow.log_metric(metric, value)
    
    # Log model
    mlflow.sklearn.log_model(model, "model")
    
    # Create and log feature importance
    importances = model.feature_importances_
    fig, ax = plt.subplots(figsize=(10, 8))
    sorted_idx = np.argsort(importances)[-15:]
    ax.barh(range(len(sorted_idx)), importances[sorted_idx],
           color='coral', alpha=0.8, edgecolor='black')
    ax.set_yticks(range(len(sorted_idx)))
    ax.set_yticklabels([f'Feature {i}' for i in sorted_idx])
    ax.set_xlabel('Importance')
    ax.set_title('Random Forest - Feature Importance (Top 15)', fontweight='bold')
    ax.grid(axis='x', alpha=0.3)
    
    fi_path = '../screenshots/rf_feature_importance_mlflow.png'
    plt.savefig(fi_path, dpi=300, bbox_inches='tight')
    mlflow.log_artifact(fi_path)
    plt.show()
    
    # Print results
    print(f"\nRun ID: {run.info.run_id}")
    print(f"\nMetrics:")
    for metric, value in metrics.items():
        print(f"  {metric}: {value:.4f}")
    
    rf_run_id = run.info.run_id
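
The plot above labels bars as `Feature {i}` because only column indices are available at that point. If the transformed column names are known, they can be paired with the importances before ranking. A small sketch of that pairing; the `top_k_features` helper, the feature names, and the importance values are hypothetical.

```python
def top_k_features(names, importances, k=15):
    """Return the k (name, importance) pairs with the highest importance."""
    if len(names) != len(importances):
        raise ValueError("names and importances must have the same length")
    ranked = sorted(zip(names, importances), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical feature names and importances, for illustration only
names = ["age", "chol", "thalach", "oldpeak"]
importances = [0.10, 0.05, 0.50, 0.35]

top = top_k_features(names, importances, k=2)
for name, score in top:
    print(f"{name}: {score:.2f}")
```

The resulting names could then be passed to `ax.set_yticklabels` in place of the index-based labels.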

## 6. Experiment 3 - XGBoost with MLflow

In [None]:
print("="*70)
print("EXPERIMENT 3: XGBOOST")
print("="*70)

with mlflow.start_run(run_name="XGBoost_Production") as run:
    
    # Set tags
    mlflow.set_tag("model_type", "XGBoost")
    mlflow.set_tag("framework", "xgboost")
    mlflow.set_tag("purpose", "production_candidate")
    
    # Define hyperparameters
    params = {
        'n_estimators': 100,
        'max_depth': 5,
        'learning_rate': 0.1,
        'subsample': 0.8,
        'colsample_bytree': 0.8,
        'random_state': 42,
        'eval_metric': 'logloss'
    }
    
    # Log parameters
    for param, value in params.items():
        mlflow.log_param(param, value)
    
    mlflow.log_param("train_samples", X_train.shape[0])
    mlflow.log_param("test_samples", X_test.shape[0])
    mlflow.log_param("n_features", X_train.shape[1])
    
    # Train model
    model = XGBClassifier(**params)
    model.fit(X_train, y_train)
    
    # Predictions
    y_pred = model.predict(X_test)
    y_proba = model.predict_proba(X_test)[:, 1]
    
    # Calculate metrics
    metrics = {
        'accuracy': accuracy_score(y_test, y_pred),
        'precision': precision_score(y_test, y_pred),
        'recall': recall_score(y_test, y_pred),
        'f1_score': f1_score(y_test, y_pred),
        'roc_auc': roc_auc_score(y_test, y_proba)
    }
    
    # Log metrics
    for metric, value in metrics.items():
        mlflow.log_metric(metric, value)
    
    # Log model
    mlflow.xgboost.log_model(model, "model")
    
    # Create and log ROC curve
    fpr, tpr, _ = roc_curve(y_test, y_proba)
    fig, ax = plt.subplots(figsize=(8, 6))
    ax.plot(fpr, tpr, color='green', linewidth=2,
           label=f'XGBoost (AUC = {metrics["roc_auc"]:.3f})')
    ax.plot([0, 1], [0, 1], 'k--', linewidth=1, label='Random Classifier')
    ax.set_xlabel('False Positive Rate')
    ax.set_ylabel('True Positive Rate')
    ax.set_title('XGBoost - ROC Curve', fontweight='bold')
    ax.legend()
    ax.grid(alpha=0.3)
    
    roc_path = '../screenshots/xgb_roc_curve_mlflow.png'
    plt.savefig(roc_path, dpi=300, bbox_inches='tight')
    mlflow.log_artifact(roc_path)
    plt.show()
    
    # Print results
    print(f"\nRun ID: {run.info.run_id}")
    print(f"\nMetrics:")
    for metric, value in metrics.items():
        print(f"  {metric}: {value:.4f}")
    
    xgb_run_id = run.info.run_id
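
The `roc_auc` value logged for each run has a direct interpretation: it is the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes it by comparing every positive/negative pair. This is an illustrative O(n²) version, not how `roc_auc_score` is implemented, and the function name `roc_auc_pairwise` is an assumption.

```python
def roc_auc_pairwise(y_true, y_score):
    """ROC-AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy labels and scores, for illustration only
y_true_toy = [0, 0, 1, 1]
y_score_toy = [0.1, 0.4, 0.35, 0.8]
auc_toy = roc_auc_pairwise(y_true_toy, y_score_toy)
print(f"AUC: {auc_toy:.2f}")
```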

## 7. Compare All Experiments

In [None]:
# Retrieve all runs from the experiment
from mlflow.tracking import MlflowClient

client = MlflowClient()
experiment = mlflow.get_experiment_by_name("heart-disease-prediction")
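# search_runs expects a list of experiment IDs; it returns every run in the
# experiment, including runs from earlier executions of this notebook
runs = client.search_runs([experiment.experiment_id])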

# Create comparison dataframe
comparison_data = []

for run in runs:
    comparison_data.append({
        'Run Name': run.data.tags.get('mlflow.runName', 'Unknown'),
        'Model Type': run.data.tags.get('model_type', 'Unknown'),
        'Accuracy': run.data.metrics.get('accuracy', 0),
        'Precision': run.data.metrics.get('precision', 0),
        'Recall': run.data.metrics.get('recall', 0),
        'F1 Score': run.data.metrics.get('f1_score', 0),
        'ROC-AUC': run.data.metrics.get('roc_auc', 0),
        'Run ID': run.info.run_id
    })

comparison_df = pd.DataFrame(comparison_data)
comparison_df = comparison_df.sort_values('ROC-AUC', ascending=False)

print("\n" + "="*70)
print("EXPERIMENT COMPARISON")
print("="*70)
print(comparison_df.to_string(index=False))
print("="*70)
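
Because the search returns every run stored in the experiment, re-running the notebook can produce several rows per model type. One possible way to keep the comparison to one row per model is to retain only the most recent run for each type. The sketch below works on plain dictionaries. The `latest_run_per_model` helper, the field names, and the values are assumptions for illustration; with MLflow the timestamp would typically come from `run.info.start_time`.

```python
def latest_run_per_model(records):
    """Keep the most recent record per model type, sorted by ROC-AUC (descending)."""
    latest = {}
    for rec in records:
        key = rec["model_type"]
        if key not in latest or rec["start_time"] > latest[key]["start_time"]:
            latest[key] = rec
    return sorted(latest.values(), key=lambda r: r["roc_auc"], reverse=True)


# Made-up run records, not results from this notebook
records = [
    {"model_type": "Random Forest", "start_time": 100, "roc_auc": 0.90},
    {"model_type": "Random Forest", "start_time": 200, "roc_auc": 0.88},
    {"model_type": "XGBoost", "start_time": 150, "roc_auc": 0.91},
]

deduped = latest_run_per_model(records)
for rec in deduped:
    print(rec["model_type"], rec["start_time"], rec["roc_auc"])
```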

## 8. Visualize Experiment Comparison

In [None]:
# Create comprehensive comparison plot
fig, axes = plt.subplots(2, 3, figsize=(18, 10))
axes = axes.ravel()

metrics_to_plot = ['Accuracy', 'Precision', 'Recall', 'F1 Score', 'ROC-AUC']
colors = ['steelblue', 'coral', 'lightgreen']

for idx, metric in enumerate(metrics_to_plot):
    data = comparison_df.sort_values('Model Type')
    axes[idx].bar(range(len(data)), data[metric], 
                  color=colors[:len(data)], alpha=0.8, edgecolor='black')
    axes[idx].set_xticks(range(len(data)))
    axes[idx].set_xticklabels(data['Model Type'], rotation=45, ha='right')
    axes[idx].set_ylabel('Score')
    axes[idx].set_title(f'{metric} Comparison', fontweight='bold')
    axes[idx].set_ylim([0, 1.1])
    axes[idx].grid(axis='y', alpha=0.3)
    
    for i, v in enumerate(data[metric]):
        axes[idx].text(i, v + 0.02, f'{v:.3f}', ha='center', fontweight='bold')

# Summary table in last subplot
axes[5].axis('off')
table_data = comparison_df[['Model Type', 'ROC-AUC', 'Accuracy', 'F1 Score']].round(3).values
table = axes[5].table(cellText=table_data,
                      colLabels=['Model', 'ROC-AUC', 'Accuracy', 'F1'],
                      cellLoc='center',
                      loc='center',
                      bbox=[0, 0, 1, 1])
table.auto_set_font_size(False)
table.set_fontsize(10)
table.scale(1, 2)
axes[5].set_title('Summary Table', fontweight='bold', pad=20)

plt.tight_layout()
plt.savefig('../screenshots/mlflow_all_experiments_comparison.png', dpi=300, bbox_inches='tight')
plt.show()

print("Comparison plot saved!")

## 9. Load Best Model from MLflow

In [None]:
# Get best run
best_run = comparison_df.iloc[0]
best_run_id = best_run['Run ID']
best_model_type = best_run['Model Type']

print(f"Best Model: {best_model_type}")
print(f"Best ROC-AUC: {best_run['ROC-AUC']:.4f}")
print(f"Run ID: {best_run_id}")

# Load model from MLflow
model_uri = f"runs:/{best_run_id}/model"

if best_model_type == "XGBoost":
    loaded_model = mlflow.xgboost.load_model(model_uri)
else:
    loaded_model = mlflow.sklearn.load_model(model_uri)

print(f"\nModel loaded successfully from MLflow!")
print(f"Model type: {type(loaded_model).__name__}")

# Test prediction
test_prediction = loaded_model.predict(X_test[:5])
print(f"\nSample predictions: {test_prediction}")
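
The loading step relies on two conventions: the `runs:/<run_id>/<artifact_path>` URI scheme, and choosing a loader that matches the flavor the model was logged with. A minimal sketch of both, keyed on the `framework` tag set in the experiments above. The helper names and the string-based mapping are hypothetical; the strings name MLflow modules that provide a `load_model` function, with `mlflow.pyfunc` used here as a generic fallback.

```python
def build_model_uri(run_id, artifact_path="model"):
    """Build a runs:/ URI for a model logged under the given artifact path."""
    if not run_id:
        raise ValueError("run_id must be a non-empty string")
    return f"runs:/{run_id}/{artifact_path}"


# Maps the "framework" tag value to the MLflow module that would load the model
FLAVOR_BY_FRAMEWORK = {
    "sklearn": "mlflow.sklearn",
    "xgboost": "mlflow.xgboost",
}


def flavor_for(framework_tag):
    """Pick the loader module name, falling back to the generic pyfunc flavor."""
    return FLAVOR_BY_FRAMEWORK.get(framework_tag, "mlflow.pyfunc")


# "abc123" is a placeholder run ID, not a real run
uri = build_model_uri("abc123")
print(uri, flavor_for("xgboost"), flavor_for("unknown"))
```

Keying on the `framework` tag rather than the display name in `model_type` avoids string comparisons such as `best_model_type == "XGBoost"` breaking if a run is relabelled.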

## 10. MLflow UI Access Instructions

In [None]:
print("="*70)
print("MLFLOW UI ACCESS INSTRUCTIONS")
print("="*70)
print("\n1. Open a terminal in the directory that contains the 'mlruns' folder")
print("   (with the default local tracking URI, the directory this notebook runs in)")
print("\n2. Start MLflow UI:")
print("   mlflow ui")
print("\n3. Open your browser and go to:")
print("   http://127.0.0.1:5000")
print("\n4. In the MLflow UI, you can:")
print("   - View all experiment runs")
print("   - Compare models side-by-side")
print("   - View logged parameters and metrics")
print("   - Download artifacts (models, plots)")
print("   - Filter and sort runs")
print("   - Create charts and visualizations")
print("\n5. Alternative: Use custom port")
print("   mlflow ui --port 8080")
print("   Then visit: http://127.0.0.1:8080")
print("\n" + "="*70)

# Show experiment location
experiment = mlflow.get_experiment_by_name("heart-disease-prediction")
print(f"\nExperiment artifacts stored at:")
print(f"{experiment.artifact_location}")
print("="*70)

## 11. Summary and Key Insights

In [None]:
print("="*70)
print("MLFLOW EXPERIMENT TRACKING - SUMMARY")
print("="*70)

print("\n1. EXPERIMENTS TRACKED:")
for idx, row in comparison_df.iterrows():
    print(f"   - {row['Model Type']}: ROC-AUC = {row['ROC-AUC']:.4f}")

print("\n2. BEST MODEL:")
print(f"   Model: {best_run['Model Type']}")
print(f"   ROC-AUC: {best_run['ROC-AUC']:.4f}")
print(f"   Accuracy: {best_run['Accuracy']:.4f}")
print(f"   F1 Score: {best_run['F1 Score']:.4f}")

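print("\n3. LOGGED ARTIFACTS AND DATA:")
print("   - Trained models (3)")
print("   - Confusion matrix (Logistic Regression)")
print("   - Feature importance plot (Random Forest)")
print("   - ROC curve (XGBoost)")
print("   - Hyperparameters and dataset sizes (logged as params)")
print("   - Accuracy, precision, recall, F1, ROC-AUC (logged as metrics)")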

print("\n4. KEY CAPABILITIES DEMONSTRATED:")
print("   ✓ Automated experiment tracking")
print("   ✓ Parameter logging for reproducibility")
print("   ✓ Metric tracking across runs")
print("   ✓ Artifact management (models, plots)")
print("   ✓ Run tagging for organization")
print("   ✓ Experiment comparison")
print("   ✓ Model loading from runs")

print("\n5. BENEFITS FOR PRODUCTION:")
print("   - Reproducible experiments")
print("   - Easy model comparison")
print("   - Centralized model storage")
print("   - Experiment lineage tracking")
print("   - Collaboration enabled")

print("\n6. NEXT STEPS:")
print("   - Register best model in MLflow Model Registry")
print("   - Promote the registered model to production (stage or alias, depending on MLflow version)")
print("   - Set up model serving with MLflow")
print("   - Implement A/B testing framework")

print("\n" + "="*70)
print("TASK 3 COMPLETE - MLFLOW EXPERIMENT TRACKING")
print("="*70)

## Conclusion

### What We Accomplished

1. **MLflow Integration**: Successfully integrated MLflow tracking into the training pipeline
2. **Experiment Tracking**: Tracked 3 models with all parameters, metrics, and artifacts
3. **Reproducibility**: Logged hyperparameters, random seeds, and split sizes so each run can be repeated
4. **Comparison**: Created comprehensive comparisons of all experiments
5. **Artifact Management**: Logged models, plots, and results for each run
6. **Model Loading**: Demonstrated loading models from MLflow runs

### MLflow Features Used

- **Experiments**: Organized runs under a named experiment
- **Runs**: Each model trained in a separate run
- **Parameters**: All hyperparameters logged
- **Metrics**: Performance metrics tracked
- **Artifacts**: Models and visualizations saved
- **Tags**: Metadata for organization
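- **Model Logging**: Each model logged with its flavor (`mlflow.sklearn`, `mlflow.xgboost`); registration in the Model Registry is left as a next step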

### Production-Ready Features

- Centralized experiment tracking
- Run-level model lineage (each logged model is tied to its run ID)
- Easy model comparison and selection
- Reproducible training runs
- Artifact storage and retrieval
- Web UI for visualization

**Status**: Task 3 Complete ✅