# MLflow Fundamentals: Hands-On Workshop Lab

Welcome to the hands-on MLflow fundamentals workshop lab! This notebook will guide you through all the core MLflow concepts step by step, allowing you to practice and verify each construct on a real MLflow tracking server.

## Prerequisites

**Important:** You need the SageMaker Managed MLflow tracking server ARN from the workshop prerequisites section. Replace the placeholder below with your actual tracking server ARN.

## Learning Objectives

By the end of this notebook, you will:
- Understand and practice with MLflow Experiments, Runs, Parameters, Metrics, and Artifacts
- Learn essential MLflow Python SDK usage with practical examples
- Apply best practices for effective ML experiment tracking
- Experience automatic logging capabilities
- Verify all concepts on a live MLflow tracking server

## Star the GitHub repository
This notebook is sourced from the public GitHub repository `https://github.com/aws-samples/sample-aiops-on-amazon-sagemakerai#`

In [None]:
%%html

<a class="github-button" href="https://github.com/aws-samples/sample-aiops-on-amazon-sagemakerai#" data-color-scheme="no-preference: light; light: light; dark: dark;" data-icon="octicon-star" data-size="large" data-show-count="true" aria-label="Star Amazon SageMaker AI AIOps on GitHub">Star</a>
<script async defer src="https://buttons.github.io/buttons.js"></script>

### Click this button ^^^ above ^^^

## Setup and Configuration

Let's start by setting up our environment and connecting to the MLflow tracking server.

In [None]:
# Install (or upgrade) the required packages
!pip install --upgrade mlflow boto3 scikit-learn matplotlib seaborn pandas numpy -q

Note: you may need to restart the kernel to use updated packages.

In [None]:
# Import necessary libraries
import mlflow
import mlflow.sklearn
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from mlflow.tracking import MlflowClient
import warnings
warnings.filterwarnings('ignore')

print("✅ All packages imported successfully!")
print(f"MLflow version: {mlflow.__version__}")

In [None]:
# Configure MLflow tracking server
# Replace with your SageMaker Managed MLflow tracking server ARN from prerequisites
# Expected format: "arn:aws:sagemaker:<REGION>:<ACCOUNT-ID>:mlflow-tracking-server/<NAME>"
TRACKING_SERVER_ARN = "arn:aws:sagemaker:us-east-2:198346569064:mlflow-tracking-server/mlflow-labs"

# Set the tracking URI
mlflow.set_tracking_uri(TRACKING_SERVER_ARN)

print(f"✅ MLflow tracking server configured: {mlflow.get_tracking_uri()}")
print("\n📝 Note: Make sure to replace the TRACKING_SERVER_ARN with your actual ARN from the prerequisites section")

In [None]:
# Persist the ARN so later notebooks can restore it with `%store -r TRACKING_SERVER_ARN`
%store TRACKING_SERVER_ARN
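
A forgotten placeholder is a common reason for a failed connection, so it can help to check the ARN's shape before passing it to `mlflow.set_tracking_uri`. The following is a minimal sketch, not part of the workshop code: `parse_tracking_server_arn` is a hypothetical helper, the account ID is a dummy value, and the characters allowed in the server name are an assumption.

```python
import re

# Assumed shape: arn:<partition>:sagemaker:<region>:<12-digit account>:mlflow-tracking-server/<name>
# The character set allowed for <name> is an assumption made for this sketch.
_ARN_PATTERN = re.compile(
    r"^arn:(?P<partition>aws[a-z-]*):sagemaker:(?P<region>[a-z0-9-]+):"
    r"(?P<account>\d{12}):mlflow-tracking-server/(?P<name>[A-Za-z0-9-]+)$"
)


def parse_tracking_server_arn(arn: str) -> dict:
    """Split a tracking server ARN into its parts, or raise ValueError."""
    match = _ARN_PATTERN.match(arn.strip())
    if match is None:
        raise ValueError(f"Not a valid MLflow tracking server ARN: {arn!r}")
    return match.groupdict()


# Dummy account ID, used only to exercise the helper.
parts = parse_tracking_server_arn(
    "arn:aws:sagemaker:us-east-2:123456789012:mlflow-tracking-server/mlflow-labs"
)
print(parts["region"], parts["name"])
```

An unreplaced placeholder such as `<REGION>` fails the pattern and raises `ValueError`, which surfaces the mistake before any call reaches the server.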

## Sample Dataset Preparation

Let's create a sample dataset that we'll use throughout this workshop to demonstrate MLflow constructs.

In [None]:
# Create a sample classification dataset
X, y = make_classification(
    n_samples=1000,
    n_features=10,
    n_informative=5,
    n_redundant=2,
    n_classes=2,
    random_state=42
)

# Split the data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print(f"✅ Dataset created:")
print(f"   Training samples: {X_train.shape[0]}")
print(f"   Test samples: {X_test.shape[0]}")
print(f"   Features: {X_train.shape[1]}")
print(f"   Classes: {len(np.unique(y))}")

Note: The IAM role used with SageMaker managed MLflow needs SageMaker MLflow IAM permissions.
These permissions are added in the prerequisites section.

---

# 1. MLflow Experiments

**What is an Experiment?**
An experiment is like a project workspace that organizes all your ML runs for a specific problem or goal. Think of it as a smart folder that stores your model training attempts and helps you compare them.

Let's create our first experiment!

In [None]:
# Create and set an experiment
experiment_name = "mlflow-fundamentals-workshop"
mlflow.set_experiment(experiment_name)

# Get experiment details
experiment = mlflow.get_experiment_by_name(experiment_name)
print(f"✅ Experiment created successfully!")
print(f"   Name: {experiment.name}")
print(f"   ID: {experiment.experiment_id}")
print(f"   Artifact Location: {experiment.artifact_location}")

print("\n🎯 Best Practice: Use descriptive experiment names that reflect your project goals")
print("   Good: 'customer-churn-v2', 'fraud-detection-baseline'")
print("   Avoid: 'experiment1', 'test', 'my_exp'")

**🔍 Verification Step:**
1. Go to your SageMaker managed MLflow tracking server UI (open the MLflow app section in SageMaker AI Studio and click `Open MLflow`)
2. You should see the new experiment "mlflow-fundamentals-workshop" listed
3. Click on it to explore (it will be empty for now)
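
The naming advice printed by the cell above can be turned into a quick check. This sketch encodes one possible convention (lowercase words joined by hyphens, at least two words, not made up only of generic words). The rule itself and the `is_descriptive` helper are assumptions for illustration, not an MLflow requirement.

```python
import re

# Hypothetical convention: lowercase words joined by hyphens, at least two words.
_NAME_PATTERN = re.compile(r"^[a-z][a-z0-9]*(-[a-z0-9]+)+$")
_TOO_GENERIC = {"test", "experiment", "exp", "tmp", "demo"}


def is_descriptive(name: str) -> bool:
    """Return True when the name follows the hypothetical convention above."""
    if not _NAME_PATTERN.match(name):
        return False
    # Ignore purely numeric parts, then require at least one non-generic word.
    words = [word for word in name.split("-") if not word.isdigit()]
    return any(word not in _TOO_GENERIC for word in words)


for candidate in ["customer-churn-v2", "fraud-detection-baseline",
                  "experiment1", "test", "my_exp"]:
    print(f"{candidate!r}: {is_descriptive(candidate)}")
```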

---

# 2. MLflow Runs

**What is a Run?**
A run is like a detailed lab notebook entry for a single model training session. It captures everything that happened during that attempt - settings, results, files, and metadata.

Let's create our first run!

In [None]:
# Basic run example
with mlflow.start_run(run_name="baseline-model") as run:
    print(f"✅ Run started successfully!")
    print(f"   Run ID: {run.info.run_id}")
    print(f"   Run Name: baseline-model")
    print(f"   Status: {run.info.status}")
    
    # Simple logging for demonstration
    mlflow.log_param("algorithm", "RandomForest")
    mlflow.log_metric("accuracy", 0.85)
    
    print("   ✅ Basic parameters and metrics logged")

print("\n🎯 Best Practice: Always use 'with mlflow.start_run():' to ensure runs are properly closed")

**🔍 Verification Step:**
1. Go back to your SageMaker managed MLflow app and refresh the MLflow UI
2. You should see one run named "baseline-model" in your experiment
3. Click on the run to see the logged parameter and metric
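
To see why the `with` form is recommended, it helps to spell out roughly what it does: the run is ended even if training raises, and in that case it is marked as failed. The sketch below imitates this with explicit calls. `run_tracked` and `fake_tracker` are illustrative names; the fake object stands in for the `mlflow` module, whose `start_run()` and `end_run(status=...)` calls it mirrors, so the example runs without a tracking server.

```python
from types import SimpleNamespace


def run_tracked(train_fn, tracker):
    """Roughly what `with tracker.start_run():` does around train_fn."""
    tracker.start_run()
    try:
        result = train_fn()
    except BaseException:
        # The run is closed even when training fails, and is marked FAILED.
        tracker.end_run(status="FAILED")
        raise
    tracker.end_run(status="FINISHED")
    return result


# Stand-in for the mlflow module that records the calls it receives.
events = []
fake_tracker = SimpleNamespace(
    start_run=lambda: events.append("start"),
    end_run=lambda status: events.append(status),
)


def failing_training():
    raise RuntimeError("training crashed")


try:
    run_tracked(failing_training, fake_tracker)
except RuntimeError:
    pass

print(events)  # ['start', 'FAILED']
```

Without this guarantee, a crashed cell could leave a run open, and later logging calls would attach to the wrong run.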

---

# 3. Parameters and Metrics

**The Input/Output Story:**
- **Parameters**: Your "recipe ingredients" - settings you choose before training
- **Metrics**: Your "cooking results" - measurements of how well your model performed

Let's practice logging parameters and metrics properly!

In [None]:
# Train a Random Forest model with proper parameter and metric logging
with mlflow.start_run(run_name="random-forest-detailed") as run:
    # Model hyperparameters (Parameters - your choices)
    n_estimators = 100
    max_depth = 10
    random_state = 42
    
    # Log individual parameters
    mlflow.log_param("algorithm", "RandomForest")
    mlflow.log_param("n_estimators", n_estimators)
    mlflow.log_param("max_depth", max_depth)
    mlflow.log_param("random_state", random_state)
    
    # Log multiple parameters at once
    mlflow.log_params({
        "train_size": len(X_train),
        "test_size": len(X_test),
        "n_features": X_train.shape[1]
    })
    
    # Train the model
    model = RandomForestClassifier(
        n_estimators=n_estimators,
        max_depth=max_depth,
        random_state=random_state
    )
    model.fit(X_train, y_train)
    
    # Make predictions
    y_pred = model.predict(X_test)
    
    # Calculate metrics (Metrics - your results)
    accuracy = accuracy_score(y_test, y_pred)
    precision = precision_score(y_test, y_pred)
    recall = recall_score(y_test, y_pred)
    f1 = f1_score(y_test, y_pred)
    
    # Log individual metrics
    mlflow.log_metric("accuracy", accuracy)
    mlflow.log_metric("precision", precision)
    mlflow.log_metric("recall", recall)
    
    # Log multiple metrics at once
    mlflow.log_metrics({
        "f1_score": f1,
        "train_samples": len(X_train),
        "test_samples": len(X_test)
    })
    
    print(f"✅ Random Forest model trained and logged!")
    print(f"   Accuracy: {accuracy:.4f}")
    print(f"   Precision: {precision:.4f}")
    print(f"   Recall: {recall:.4f}")
    print(f"   F1-Score: {f1:.4f}")

print("\n🎯 Best Practice: Log parameters before training and metrics after evaluation")
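
For reference, the four scores logged in the cell above reduce to simple ratios of confusion-matrix counts. The sketch below computes them by hand for binary 0/1 labels on a small made-up sample; `classification_metrics` is an illustrative helper, not a replacement for the scikit-learn functions used in the workshop.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1_score": f1,
    }


# Made-up labels: 3 true positives, 1 false positive, 1 false negative, 3 true negatives.
toy_true = [1, 1, 1, 0, 0, 0, 1, 0]
toy_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(toy_true, toy_pred)
print(metrics)
```

A dictionary of this shape can be passed directly to `mlflow.log_metrics` inside an active run.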

Let's also demonstrate logging training curves with steps:

In [None]:
# Simulate training with steps (like epochs)
with mlflow.start_run(run_name="training-curves-demo") as run:
    mlflow.log_param("simulation", "training_curves")
    
    # Simulate training progress over epochs
    for epoch in range(10):
        # Simulate decreasing loss and increasing accuracy
        train_loss = 1.0 - (epoch * 0.08) + np.random.normal(0, 0.02)
        val_accuracy = 0.5 + (epoch * 0.04) + np.random.normal(0, 0.01)
        
        # Log metrics with step parameter for time series
        mlflow.log_metric("train_loss", train_loss, step=epoch)
        mlflow.log_metric("val_accuracy", val_accuracy, step=epoch)
    
    print(f"✅ Training curves logged for 10 epochs")
    print("   Check the MLflow UI to see the training curves!")

**🔍 Verification Step:**
1. Go to your MLflow UI and check the latest runs
2. Select the latest experiment run and open the Model metrics tab to verify the metrics
3. Look at the training curves in the "training-curves-demo" run
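
Step-wise metrics can also be read back in code, for example to find the epoch with the best validation accuracy. The sketch below assumes the history is a list of objects with `step` and `value` attributes, which is what `MlflowClient.get_metric_history` returns. `fetch_history` and `best_point` are illustrative helpers, and the toy values are made up so the example runs without a server.

```python
from collections import namedtuple


def fetch_history(run_id, key):
    """Fetch every logged point of one metric (needs a reachable tracking server)."""
    # Imported lazily so best_point can be tried without MLflow installed.
    from mlflow.tracking import MlflowClient

    return MlflowClient().get_metric_history(run_id, key)


def best_point(history, higher_is_better=True):
    """Return (step, value) of the best entry in a metric history."""
    pick = max if higher_is_better else min
    best = pick(history, key=lambda point: point.value)
    return best.step, best.value


# Made-up stand-in for a metric history.
Point = namedtuple("Point", ["step", "value"])
toy_history = [Point(0, 0.50), Point(1, 0.55), Point(2, 0.61), Point(3, 0.58)]
print(best_point(toy_history))  # (2, 0.61)
```

With a live server, `best_point(fetch_history(run_id, "val_accuracy"))` would apply the same logic to the "training-curves-demo" run, where `run_id` is that run's ID.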

---

# 4. Artifacts and Models

**What are Artifacts?**
Artifacts are the "evidence files" from your ML experiments - everything beyond numbers that tells your model's story.

Let's create and log various types of artifacts!
(Note: You can ignore any warnings printed by the next cell.)

In [None]:
with mlflow.start_run(run_name="artifacts-and-models-demo") as run:
    # Train a model
    model = RandomForestClassifier(n_estimators=50, random_state=42)
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    
    # Log model parameters
    mlflow.log_params({
        "algorithm": "RandomForest",
        "n_estimators": 50,
        "random_state": 42
    })
    
    # Log metrics
    accuracy = accuracy_score(y_test, y_pred)
    mlflow.log_metric("accuracy", accuracy)
    
    # 1. Log the trained model
    mlflow.sklearn.log_model(model, "random_forest_model")
    print("✅ Model logged as artifact")
    
    # 2. Create and log a visualization plot
    plt.figure(figsize=(10, 6))
    
    # Feature importance plot
    feature_names = [f'feature_{i}' for i in range(X_train.shape[1])]
    importance = model.feature_importances_
    
    plt.subplot(1, 2, 1)
    plt.bar(range(len(importance)), importance)
    plt.title('Feature Importance')
    plt.xlabel('Features')
    plt.ylabel('Importance')
    
    # Training progress simulation
    plt.subplot(1, 2, 2)
    epochs = range(1, 11)
    accuracy_progress = [0.6 + i*0.03 + np.random.normal(0, 0.01) for i in epochs]
    plt.plot(epochs, accuracy_progress, 'b-', marker='o')
    plt.title('Training Progress')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy')
    
    plt.tight_layout()
    plt.savefig("model_analysis.png", dpi=150, bbox_inches='tight')
    plt.show()
    
    # Log the plot as artifact
    mlflow.log_artifact("model_analysis.png")
    print("✅ Visualization plot logged as artifact")
    
    # 3. Log configuration as JSON
    config = {
        "model_type": "RandomForest",
        "version": "1.0",
        "training_date": "2024-01-15",
        "data_preprocessing": {
            "scaling": "none",
            "feature_selection": "none"
        },
        "performance": {
            "accuracy": float(accuracy),
            "model_size_mb": 0.5
        }
    }
    mlflow.log_dict(config, "model_config.json")
    print("✅ Configuration logged as JSON artifact")
    
    # 4. Log a text report
    report = f"""
Model Training Report
===================
Algorithm: Random Forest
Training Samples: {len(X_train)}
Test Samples: {len(X_test)}
Features: {X_train.shape[1]}

Hyperparameters:
- n_estimators: 50
- random_state: 42

Results:
- Accuracy: {accuracy:.4f}
- Top 3 Important Features: {', '.join([f'feature_{i}' for i in np.argsort(importance)[::-1][:3]])}

Notes:
- Model shows good performance on test set
- Feature importance analysis completed
- Ready for further evaluation
"""
    
    with open("training_report.txt", "w") as f:
        f.write(report)
    
    mlflow.log_artifact("training_report.txt")
    print("✅ Training report logged as text artifact")
    
    print(f"\n📊 Summary of logged artifacts:")
    print(f"   - Trained model (random_forest_model/)")
    print(f"   - Visualization plot (model_analysis.png)")
    print(f"   - Configuration file (model_config.json)")
    print(f"   - Training report (training_report.txt)")

**🔍 Verification Step:**
1. Go to your MLflow UI and open the "artifacts-and-models-demo" run
2. Check the "Artifacts" section - you should see all logged files
3. Note how the model is stored with its metadata and dependencies. The Model Directory `random_forest_model/` acts as the self-contained package for your trained machine learning model.

| Filename | Purpose in Model Deployment |
|----------|-----------------------------|
| MLmodel | The Model Contract (YAML): This is the single most important metadata file. It defines the model's "flavor" (e.g., python_function, sklearn, pytorch), its signature (required inputs/expected outputs), and crucially, points to all dependent files (like model.pkl). SageMaker uses this file to understand how to load and serve the model without requiring custom code on the inference side. |
| model.pkl | The Serialized Model: This file contains the trained machine learning object, usually serialized using Python's pickle or cloudpickle. This is the binary file representing the learned weights, coefficients, or structure of your Random Forest model. |
| conda.yaml | Conda Environment Definition: This YAML file specifies the exact Python and library dependencies, including the specific versions required for the model to run correctly. SageMaker uses this file to construct the runtime environment (the Docker container) for the inference endpoint, guaranteeing environment parity with training. |
| requirements.txt | Pip Dependencies: A simplified file listing the Python package dependencies (like scikit-learn==1.3.2). This is often used alongside conda.yaml to ensure all required libraries are installed during deployment. |
| python_env.yaml | Python Environment Details: Contains the full specification needed to restore the Python environment via virtualenv and pip, complementing the conda.yaml for environment reproducibility. |
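
Because the model directory is self-contained, a logged model can be loaded back from its run with a `runs:/<run_id>/<artifact_path>` URI. The sketch below is a minimal illustration of that pattern, assuming the model was logged under `random_forest_model` as in the cell above. `run_model_uri` and `load_logged_model` are hypothetical helper names.

```python
def run_model_uri(run_id: str, artifact_path: str) -> str:
    """Build the URI of a model stored among a run's artifacts."""
    return f"runs:/{run_id}/{artifact_path.strip('/')}"


def load_logged_model(run_id: str, artifact_path: str = "random_forest_model"):
    """Load the scikit-learn model back (needs a reachable tracking server)."""
    # Imported lazily so run_model_uri can be tried without MLflow installed.
    import mlflow.sklearn

    return mlflow.sklearn.load_model(run_model_uri(run_id, artifact_path))


# <RUN_ID> is a placeholder for the ID printed when the run was created.
print(run_model_uri("<RUN_ID>", "random_forest_model/"))
```

In the notebook, `load_logged_model(run.info.run_id).predict(X_test[:5])` would then return predictions from the restored estimator.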

---

# 5. Model Registry

**What is the Model Registry?**
The Model Registry is like a sophisticated library system for your trained models. It provides version control and manages the journey from development to production.

Let's register and manage models!

In [None]:
# Train and register a model
with mlflow.start_run(run_name="model-for-registry") as run:
    # Train a model
    model = RandomForestClassifier(n_estimators=100, max_depth=8, random_state=42)
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    
    # Log parameters and metrics
    mlflow.log_params({
        "algorithm": "RandomForest",
        "n_estimators": 100,
        "max_depth": 8
    })
    
    accuracy = accuracy_score(y_test, y_pred)
    mlflow.log_metric("accuracy", accuracy)
    
    # Log and register the model in one step
    model_name = "workshop-classifier"
    mlflow.sklearn.log_model(
        model, 
        "model",
        registered_model_name=model_name
    )
    
    print(f"✅ Model trained and registered!")
    print(f"   Model Name: {model_name}")
    print(f"   Accuracy: {accuracy:.4f}")
    print(f"   Run ID: {run.info.run_id}")

In [None]:
# Model lifecycle management
client = MlflowClient()
model_name = "workshop-classifier"

try:
    # Get the latest version
    latest_versions = client.get_latest_versions(model_name)
    if latest_versions:
        latest_version = latest_versions[0]
        print(f"✅ Found registered model:")
        print(f"   Name: {latest_version.name}")
        print(f"   Version: {latest_version.version}")
        
        # Add description
        client.update_model_version(
            name=model_name,
            version=latest_version.version,
            description="Random Forest classifier trained in MLflow workshop. Shows good performance on test data."
        )
        print(f"✅ Model description added!")
        
    else:
        print("❌ No model versions found. Make sure the previous cell ran successfully.")
        
except Exception as e:
    print(f"❌ Error managing model: {e}")
    print("This might happen if the model registry is not fully set up yet.")

**🔍 Verification Step:**
1. Go to your MLflow UI and click on "Models" in the navigation
2. You should see "workshop-classifier" in the model registry
3. Click on it to see the version history, then click a model version to see its details
4. Verify that the model version shows the description you added
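
Registered versions can be addressed with a `models:/<name>/<version>` URI. Recent MLflow releases mark `get_latest_versions` as deprecated, so the sketch below looks up the newest version with `search_model_versions` instead. `newest_version`, `registry_model_uri`, and `find_newest` are illustrative helpers, and the toy objects only mimic the `version` attribute of real model versions.

```python
from types import SimpleNamespace


def newest_version(versions):
    """Pick the entry with the highest version number."""
    # The version may arrive as a string, so compare numerically ("10" > "9").
    return max(versions, key=lambda v: int(v.version))


def registry_model_uri(name: str, version) -> str:
    """Build the registry URI for one model version."""
    return f"models:/{name}/{version}"


def find_newest(name: str):
    """Query the registry for the newest version (needs a reachable server)."""
    # Imported lazily so the helpers above can be tried without MLflow installed.
    from mlflow.tracking import MlflowClient

    return newest_version(MlflowClient().search_model_versions(f"name='{name}'"))


# Made-up stand-ins for model version objects.
toy_versions = [SimpleNamespace(version="2"), SimpleNamespace(version="10"),
                SimpleNamespace(version="9")]
latest = newest_version(toy_versions)
print(registry_model_uri("workshop-classifier", latest.version))
# models:/workshop-classifier/10
```

The resulting URI can be passed to `mlflow.sklearn.load_model` or `mlflow.pyfunc.load_model`. Note that `newest_version` raises `ValueError` on an empty list, so a model with no versions needs separate handling.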

---

# 6. Organization with Tags

**What are Tags?**
Tags are like sticky notes for your ML runs - simple labels that help you organize and find experiments later.

Let's practice using tags effectively!

In [None]:
# Create multiple runs with different tags for organization
algorithms = ["RandomForest", "LogisticRegression"]
data_versions = ["v1.0", "v1.1"]

for algorithm in algorithms:
    for data_version in data_versions:
        with mlflow.start_run(run_name=f"{algorithm.lower()}-{data_version}") as run:
            # Basic tags
            mlflow.set_tag("team", "data-science")
            mlflow.set_tag("environment", "development")
            mlflow.set_tag("algorithm", algorithm)
            mlflow.set_tag("data_version", data_version)
            
            # Multiple tags at once
            mlflow.set_tags({
                "experiment_type": "baseline_comparison",
                "priority": "high" if algorithm == "RandomForest" else "medium",
                "reviewer": "workshop-participant",
                "status": "completed"
            })
            
            # Train appropriate model
            if algorithm == "RandomForest":
                model = RandomForestClassifier(n_estimators=50, random_state=42)
            else:
                model = LogisticRegression(random_state=42, max_iter=1000)
            
            model.fit(X_train, y_train)
            y_pred = model.predict(X_test)
            
            # Log basic info
            mlflow.log_param("algorithm", algorithm)
            mlflow.log_param("data_version", data_version)
            mlflow.log_metric("accuracy", accuracy_score(y_test, y_pred))
            
            print(f"✅ {algorithm} with {data_version} - Tagged and logged")

print("\n🏷️ Tags added for easy filtering:")
print("   - team: data-science")
print("   - environment: development")
print("   - algorithm: RandomForest/LogisticRegression")
print("   - data_version: v1.0/v1.1")
print("   - experiment_type: baseline_comparison")
print("   - priority: high/medium")
print("   - reviewer: workshop-participant")
print("   - status: completed")

**🔍 Verification Step:**
1. Go to your MLflow UI and view all runs in your experiment
2. Use the filter functionality to filter by tags. For example, enter `tags.algorithm = "RandomForest"` in the search bar
3. Try different tag combinations to see how they help organize experiments
4. Notice how tags make it easy to find specific types of runs
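
The same filter syntax works from code through `mlflow.search_runs`, which returns the matching runs as a pandas DataFrame. The sketch below builds a filter string from a dictionary of tags and joins the conditions with `and`. `tag_filter` and `find_runs` are illustrative helpers, and the sketch assumes tag values contain no double quotes.

```python
def tag_filter(tags: dict) -> str:
    """Turn {"algorithm": "RandomForest"} into 'tags.algorithm = "RandomForest"'."""
    # Assumes values contain no double quotes; no escaping is attempted.
    clauses = [f'tags.{key} = "{value}"' for key, value in tags.items()]
    return " and ".join(clauses)


def find_runs(experiment_name: str, tags: dict):
    """Search one experiment for runs carrying all given tags (needs a server)."""
    # Imported lazily so tag_filter can be tried without MLflow installed.
    import mlflow

    return mlflow.search_runs(
        experiment_names=[experiment_name],
        filter_string=tag_filter(tags),
    )


print(tag_filter({"algorithm": "RandomForest", "data_version": "v1.1"}))
# tags.algorithm = "RandomForest" and tags.data_version = "v1.1"
```

For the runs created above, `find_runs("mlflow-fundamentals-workshop", {"algorithm": "RandomForest", "data_version": "v1.1"})` should return the single run that carries both tags.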

---

# 7. Automatic Logging

**What is Automatic Logging?**
Automatic logging is MLflow's smart assistant that automatically captures parameters, metrics, models, and artifacts when you use popular ML libraries.

Let's see the magic of autolog!

In [None]:
# Enable automatic logging for scikit-learn
mlflow.sklearn.autolog()

print("✅ Automatic logging enabled for scikit-learn!")
print("   MLflow will now automatically log:")
print("   - All hyperparameters")
print("   - Training metrics")
print("   - Model artifacts")
print("   - Training evaluation plots for classifiers (e.g., confusion matrix)")
print("   - And much more!")

In [None]:
# Train models with automatic logging - no manual logging needed!
with mlflow.start_run(run_name="autolog-random-forest") as run:
    # Just your regular ML code - MLflow automatically logs everything!
    model = RandomForestClassifier(
        n_estimators=100, 
        max_depth=10, 
        min_samples_split=5,
        random_state=42
    )
    
    # This single line triggers automatic logging of everything!
    model.fit(X_train, y_train)
    
    # Make predictions (autolog does not store the predictions themselves)
    y_pred = model.predict(X_test)
    
    print(f"✅ Random Forest trained with automatic logging!")
    print(f"   Check the MLflow UI to see all the automatically logged information!")
    print(f"   Run ID: {run.info.run_id}")

In [None]:
# Compare with Logistic Regression using autolog
with mlflow.start_run(run_name="autolog-logistic-regression") as run:
    # Different algorithm, same automatic logging magic!
    model = LogisticRegression(
        C=1.0,
        solver='liblinear',
        random_state=42,
        max_iter=1000
    )
    
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    
    print(f"✅ Logistic Regression trained with automatic logging!")
    print(f"   Run ID: {run.info.run_id}")

In [None]:
# You can still add manual logging on top of automatic logging
with mlflow.start_run(run_name="autolog-plus-manual") as run:
    # Automatic logging handles the basics
    model = RandomForestClassifier(n_estimators=150, random_state=42)
    model.fit(X_train, y_train)
    
    # Add your custom tags and metrics
    mlflow.set_tags({
        "custom_experiment": "autolog_demo",
        "notes": "Combining autolog with manual logging"
    })
    
    # Add custom metrics
    y_pred = model.predict(X_test)
    custom_score = accuracy_score(y_test, y_pred) * 100  # Percentage
    mlflow.log_metric("accuracy_percentage", custom_score)
    
    print(f"✅ Combined automatic + manual logging completed!")
    print(f"   Automatic: All sklearn parameters, metrics, and model")
    print(f"   Manual: Custom tags and percentage accuracy")

**🔍 Verification Step:**
1. Go to your MLflow UI and compare the autolog runs with manual logging runs
2. Notice how autolog captured many more parameters automatically
3. Check the "Artifacts" section for automatically generated plots (for example, the training confusion matrix)
4. Compare the amount of information captured with minimal code

---

# 🎯 Workshop Summary and Best Practices

Congratulations! You've successfully practiced all core MLflow constructs.

## 🔗 Useful Resources

- **MLflow Documentation**: https://mlflow.org/docs/latest/
- **MLflow Tracking**: https://mlflow.org/docs/latest/tracking.html
- **MLflow Models**: https://mlflow.org/docs/latest/models.html
- **MLflow Model Registry**: https://mlflow.org/docs/latest/model-registry.html
- **Automatic Logging**: https://mlflow.org/docs/latest/tracking.html#automatic-logging

## 🎓 Workshop Complete!

You have successfully completed the MLflow fundamentals hands-on workshop! You now have practical experience with all core MLflow concepts and are ready to apply them to your machine learning projects.

**Remember**: The key to mastering MLflow is consistent practice. Start incorporating these constructs into your daily ML workflow, and you'll soon see the benefits of organized, reproducible, and collaborative machine learning experiments.