# MLflow for Production AI and ML

## The Scenario

🛩️ **Leadership just gave you the order:** Your team has IoT sensor data streaming in from aircraft engines across 5 factories. By the end of the week, you need to deploy a predictive model to identify potential defects before they cause failures. This notebook gets you from raw data to a production-ready model in 30 minutes.

## What You'll Learn

✅ **Experiment** - Track model training with MLflow autologging  
✅ **Register** - Version control models in Unity Catalog  
✅ **Predict** - Load and use models for batch inference  

**Key Concepts:**
- **MLflow Tracking**: Automatically log parameters, metrics, and models
- **Unity Catalog Model Registry**: Enterprise-grade model versioning and governance
- **Model Aliases**: Tag models as "Champion" or "Challenger" for deployment

---

**References:**
- [MLflow Tracking](https://docs.databricks.com/aws/en/mlflow/tracking)
- [Databricks Autologging](https://docs.databricks.com/aws/en/mlflow/databricks-autologging)
- [Unity Catalog Model Registry](https://docs.databricks.com/aws/en/machine-learning/manage-model-lifecycle/index.html)

## Why MLflow?

Without MLflow, data scientists face challenges like:
- **Lost experiments** - "Which hyperparameters gave us that 95% accuracy?"
- **Model chaos** - "Where's the model we deployed last week?"
- **No reproducibility** - "I can't recreate these results"

MLflow solves this by providing:
- **Experiment Tracking**: Automatic logging of parameters, metrics, and artifacts
- **Model Registry**: Centralized model versioning with Unity Catalog
- **Deployment**: Seamless path from experiment to production

**The MLOps Workflow:**
```
1. EXPERIMENT → Train models, MLflow tracks everything
2. REGISTER   → Save best model to Unity Catalog
3. PREDICT    → Use model for batch or real-time inference
```

## Setup: Connect to IoT Data

We'll use the sensor and inspection data from our aircraft engine monitoring system.

```python
# Configuration - update with your catalog/schema
catalog = "josh_melton"  # Update to your catalog
schema = "default"       # Update to your schema

# Display available tables
print("Available IoT tables:")
tables = spark.sql(f"SHOW TABLES IN {catalog}.{schema}").filter("tableName LIKE '%sensor%' OR tableName LIKE '%inspection%'")
display(tables)
```

## Load and Prepare Training Data

We'll join sensor readings with inspection results to create a labeled dataset for defect prediction.

```python
import pyspark.sql.functions as F
from pyspark.sql.window import Window

# Load sensor and inspection data
sensor_df = spark.table(f"{catalog}.{schema}.sensor_bronze")
inspection_df = spark.table(f"{catalog}.{schema}.inspection_bronze")

# Join sensor data with inspection labels.
# For each inspection of a device, keep the most recent sensor reading taken
# at or before that inspection. Partitioning by device_id alone would keep
# only a single row per device, so the inspection timestamp is part of the key.
window_spec = (
    Window.partitionBy("device_id", "inspection_timestamp")
    .orderBy(F.col("sensor_timestamp").desc())
)

training_data = (
    sensor_df
    .withColumnRenamed("timestamp", "sensor_timestamp")
    .join(
        inspection_df.withColumnRenamed("timestamp", "inspection_timestamp"),
        ["device_id"]
    )
    .filter(F.col("sensor_timestamp") <= F.col("inspection_timestamp"))
    .withColumn("row_num", F.row_number().over(window_spec))
    .filter(F.col("row_num") == 1)
    .select(
        "device_id",
        "factory_id",
        "model_id",
        "airflow_rate",
        "rotation_speed",
        "air_pressure",
        "temperature",
        "delay",
        "density",
        F.col("defect").cast("int").alias("defect")
    )
)

# Count once and reuse the result instead of re-running the join for each print
total_count = training_data.count()
defect_count = training_data.filter("defect = 1").count()
print(f"Training dataset size: {total_count:,} records")
print(f"Defect rate: {defect_count / total_count * 100:.2f}%")

display(training_data.limit(10))
```
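
The join's stated intent (for each inspection, keep the latest sensor reading taken at or before it) can be easier to check on a handful of rows. The following is a minimal plain-Python sketch of that rule; the toy rows, the tuple layout, and the helper name `latest_reading_before` are illustrative assumptions, not part of the notebook's tables.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical toy rows: (device_id, timestamp, temperature)
sensor_rows = [
    ("d1", 10, 70.0),
    ("d1", 20, 75.0),
    ("d1", 30, 90.0),
    ("d2", 15, 60.0),
]
# Hypothetical toy rows: (device_id, timestamp, defect)
inspection_rows = [
    ("d1", 25, 0),
    ("d1", 35, 1),
    ("d2", 12, 0),  # inspected before any reading exists, so it is dropped
]

def latest_reading_before(sensor_rows, inspection_rows):
    """Pair each inspection with the latest reading at or before it."""
    by_device = defaultdict(list)
    for device_id, ts, value in sorted(sensor_rows, key=lambda r: (r[0], r[1])):
        by_device[device_id].append((ts, value))

    labeled = []
    for device_id, inspected_at, defect in inspection_rows:
        readings = by_device.get(device_id, [])
        timestamps = [ts for ts, _ in readings]
        # readings[:idx] are all at or before the inspection time
        idx = bisect_right(timestamps, inspected_at)
        if idx == 0:
            continue  # mirrors the inner join plus filter: no eligible reading
        ts, value = readings[idx - 1]
        labeled.append((device_id, inspected_at, ts, value, defect))
    return labeled

labeled_rows = latest_reading_before(sensor_rows, inspection_rows)
for row in labeled_rows:
    print(row)
```

Each inspection yields at most one labeled row, so a device inspected twice contributes two training examples.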

## Convert to Pandas for Sklearn

For this quick example, we'll use scikit-learn. For larger datasets, consider using Spark MLlib or distributed training.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Convert to Pandas
pdf = training_data.toPandas()

# Prepare features and target
feature_cols = ["airflow_rate", "rotation_speed", "air_pressure", "temperature", "delay", "density"]
X = pdf[feature_cols]
y = pdf["defect"]

# Split data, stratifying so both splits keep the same defect rate
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)

print(f"Training set: {len(X_train):,} samples")
print(f"Test set: {len(X_test):,} samples")
```

## 1️⃣ EXPERIMENT: Train Model with MLflow Autologging

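**Key Point:** Call `mlflow.autolog()` once and MLflow records parameters, training metrics, and the fitted model for supported libraries, so most manual logging calls become unnecessary.

**What typically gets auto-logged for scikit-learn:**
- Estimator parameters (from `get_params()`)
- Training metrics (accuracy, precision, recall, etc.)
- The fitted model artifact
- The model signature inferred from the training data
- Diagnostic plots for classifiers, such as the training confusion matrix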

```python
import mlflow
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, roc_auc_score

# Enable autologging - this is the magic! ✨
mlflow.autolog()

# Train model - MLflow tracks parameters, training metrics, and the model
with mlflow.start_run(run_name="IoT Defect Prediction - RF") as run:
    # Train Random Forest
    rf_model = RandomForestClassifier(
        n_estimators=100,
        max_depth=10,
        random_state=42
    )
    rf_model.fit(X_train, y_train)

    # Make predictions on the held-out test set
    y_pred = rf_model.predict(X_test)
    y_pred_proba = rf_model.predict_proba(X_test)[:, 1]

    # Calculate held-out test metrics
    accuracy = accuracy_score(y_test, y_pred)
    precision = precision_score(y_test, y_pred)
    recall = recall_score(y_test, y_pred)
    f1 = f1_score(y_test, y_pred)
    auc = roc_auc_score(y_test, y_pred_proba)

    # Log them explicitly under stable names so runs can be compared in the UI
    mlflow.log_metrics({
        "test_accuracy": accuracy,
        "test_precision": precision,
        "test_recall": recall,
        "test_f1": f1,
        "test_auc": auc,
    })

    # Keep the Random Forest run ID for model registration later
    run_id = run.info.run_id
```
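
Because defects are usually rare, accuracy alone can look healthy while the model misses every defect. The sketch below computes the same four metrics by hand from confusion-matrix counts; the toy labels and the helper name `classification_metrics` are illustrative assumptions, and returning 0.0 on a zero denominator is a convention chosen for this example.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels (1 = defect)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical labels: 2 defective engines out of 10
y_true_toy = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

# A model that always predicts "no defect"
always_negative = classification_metrics(y_true_toy, [0] * 10)

# A model that flags two engines and gets one of them right
one_hit = classification_metrics(y_true_toy, [0, 0, 0, 0, 0, 0, 0, 1, 1, 0])

print(always_negative)
print(one_hit)
```

Both toy models score the same accuracy, but only the second has non-zero recall, which is why the notebook tracks precision, recall, F1, and AUC alongside accuracy.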

### 🔍 Explore the Databricks MLflow UI

**Click the "Experiment" button at the top right of this notebook** to open the MLflow UI. You'll see:

1. **Runs table** - All your experiments in one place
2. **Parameters** - Hyperparameters used (n_estimators, max_depth, etc.)
3. **Metrics** - Model performance (accuracy, precision, recall, etc.)
4. **Artifacts** - Saved model files, diagnostic plots, and more
5. **Charts** - Visualize metric comparisons across runs

Try clicking on your run to see all the details that were automatically logged!

## Train Another Model to Compare

Let's train a Gradient Boosting model to compare performance.

```python
from sklearn.ensemble import GradientBoostingClassifier

# Autologging is still enabled from earlier
with mlflow.start_run(run_name="IoT Defect Prediction - GBM") as run:
    # Train Gradient Boosting
    gbm_model = GradientBoostingClassifier(
        n_estimators=100,
        learning_rate=0.1,
        max_depth=5,
        random_state=42
    )
    gbm_model.fit(X_train, y_train)

    # Make predictions on the held-out test set
    y_pred = gbm_model.predict(X_test)
    y_pred_proba = gbm_model.predict_proba(X_test)[:, 1]

    # Calculate held-out test metrics
    accuracy = accuracy_score(y_test, y_pred)
    precision = precision_score(y_test, y_pred)
    recall = recall_score(y_test, y_pred)
    f1 = f1_score(y_test, y_pred)
    auc = roc_auc_score(y_test, y_pred_proba)

    # Log them explicitly under stable names so runs can be compared in the UI
    mlflow.log_metrics({
        "test_accuracy": accuracy,
        "test_precision": precision,
        "test_recall": recall,
        "test_f1": f1,
        "test_auc": auc,
    })

    print(f"✅ Run ID: {run.info.run_id}")
    print(f"📊 Accuracy: {accuracy:.4f}")
    print(f"🎯 Precision: {precision:.4f}")
    print(f"🔍 Recall: {recall:.4f}")
    print(f"📈 F1 Score: {f1:.4f}")
    print(f"📉 AUC: {auc:.4f}")
```

💡 **Pro Tip:** Go back to the MLflow UI and compare the two runs side-by-side. Which model performs better?

## 2️⃣ REGISTER: Save Model to Unity Catalog

The **Unity Catalog Model Registry** is your enterprise model store. It provides:
- **Versioning**: Every model update creates a new version
- **Lineage**: Track which data and code produced each model
- **Governance**: Control who can access and deploy models
- **Aliases**: Tag models as "Champion", "Challenger", "Staging", etc.

```python
# Point the model registry at Unity Catalog.
# This may already be the default in your workspace; setting it again is harmless.
mlflow.set_registry_uri("databricks-uc")

# Register the best model (using the Random Forest run_id from earlier)
model_name = f"{catalog}.{schema}.iot_defect_predictor"
model_uri = f"runs:/{run_id}/model"

print(f"📦 Registering model: {model_name}")
model_details = mlflow.register_model(model_uri=model_uri, name=model_name)

print(f"✅ Registered model version: {model_details.version}")
```

### Set Model Alias to "Champion"

Model aliases let you tag specific versions for deployment (e.g., "Champion" for production, "Challenger" for testing).

```python
from mlflow.tracking import MlflowClient

client = MlflowClient()

# Add model description
client.update_registered_model(
    name=model_name,
    description="Random Forest model to predict defects in aircraft engine IoT sensors. Trained on sensor readings (airflow, rotation speed, temperature, pressure) and inspection results."
)

# Set the "Champion" alias to this version
client.set_registered_model_alias(
    name=model_name,
    alias="Champion",
    version=model_details.version
)

print(f"✅ Model version {model_details.version} tagged as 'Champion'")
```
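
An alias behaves like a movable pointer: consumers load "whatever Champion is", and promoting a new version only means repointing the alias. The toy in-memory registry below sketches that idea; `ToyRegistry` and its methods are hypothetical names for illustration and are not how MLflow or Unity Catalog implement aliases.

```python
class ToyRegistry:
    """Illustrative stand-in for a model registry with versions and aliases."""

    def __init__(self):
        self.versions = {}  # version number -> model object
        self.aliases = {}   # alias name -> version number

    def register(self, model):
        # Every registration creates the next version number
        version = len(self.versions) + 1
        self.versions[version] = model
        return version

    def set_alias(self, alias, version):
        if version not in self.versions:
            raise KeyError(f"Unknown version: {version}")
        # Reassigning an alias moves it; the old version is left untouched
        self.aliases[alias] = version

    def load(self, alias):
        return self.versions[self.aliases[alias]]


registry = ToyRegistry()
v1 = registry.register("random-forest")
registry.set_alias("Champion", v1)

v2 = registry.register("gradient-boosting")
champion_before = registry.load("Champion")  # still the first version

registry.set_alias("Champion", v2)           # promote the new version
champion_after = registry.load("Champion")

print(champion_before, "->", champion_after)
```

Code that loads by alias never needs to change when a new version is promoted, which is the reason the prediction step below refers to `@Champion` rather than a version number.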

🎯 **View your model in Unity Catalog:**
1. Click "Catalog" in the left sidebar
2. Navigate to your catalog → schema → "iot_defect_predictor"
3. See model versions, lineage, and metadata

## 3️⃣ PREDICT: Load and Use the Model

Load the "Champion" model and use it for predictions. This is how you'd use the model in production.

```python
import mlflow.pyfunc

# Load the Champion model by alias
champion_model_uri = f"models:/{model_name}@Champion"
print(f"📥 Loading model from: {champion_model_uri}")

champion_model = mlflow.pyfunc.load_model(champion_model_uri)

print("✅ Model loaded successfully!")
```

### Make Batch Predictions

Use the loaded model to predict defects. Here the held-out test set stands in for new sensor data.

```python
# Make predictions on test set
predictions = champion_model.predict(X_test)

# Create results DataFrame
results_df = pd.DataFrame({
    "actual_defect": y_test.values,
    "predicted_defect": predictions,
    "airflow_rate": X_test["airflow_rate"].values,
    "rotation_speed": X_test["rotation_speed"].values,
    "temperature": X_test["temperature"].values
})

print("🔮 Predictions:")
display(results_df.head(20))

# Calculate accuracy
accuracy = (results_df["actual_defect"] == results_df["predicted_defect"]).mean()
print(f"\n✅ Prediction Accuracy: {accuracy:.2%}")
```

## ✅ Mission Accomplished!

**What you just did:**
1. ✅ **EXPERIMENT** - Trained models with automatic MLflow tracking
2. ✅ **REGISTER** - Saved the best model to Unity Catalog
3. ✅ **PREDICT** - Loaded and used the model for inference

**You're now ready to:**
- Show leadership you have a working predictive model ✨
- Deploy this model to production (see "Try This Out" below)
- Track model performance over time
- Iterate and improve with new versions

## 🚀 Try This Out: Next Steps

Now that you have the MLOps basics down, here are ways to level up:

### 1. Real-Time Model Serving
Deploy your model as a REST API endpoint:
```python
# Enable Model Serving (UI: Machine Learning → Serving)
# Your model will be available at an API endpoint for real-time predictions
# Example: https://<workspace>.cloud.databricks.com/serving-endpoints/iot-defect-predictor/invocations
```

**Use cases:**
- Real-time defect detection as sensor data streams in
- Embed predictions in dashboards or operational tools
- Low-latency (<100ms) predictions

**Learn more:** [Model Serving Documentation](https://docs.databricks.com/aws/en/machine-learning/model-serving/index.html)
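
To call such an endpoint, the request body has to carry the feature columns in a format the serving layer accepts. The sketch below builds a `dataframe_split` style JSON payload using only the standard library. The helper name `build_serving_request` and the sample reading are illustrative assumptions; check the Model Serving documentation for the input formats your endpoint supports.

```python
import json

# Same feature order the model was trained with
feature_names = ["airflow_rate", "rotation_speed", "air_pressure", "temperature", "delay", "density"]

def build_serving_request(rows, columns):
    """Serialize a list of feature dicts into a dataframe_split JSON body."""
    payload = {
        "dataframe_split": {
            "columns": columns,
            # One inner list per row, values ordered like `columns`
            "data": [[row[name] for name in columns] for row in rows],
        }
    }
    return json.dumps(payload)

# Hypothetical sensor reading, values invented for illustration
sample_reading = {
    "airflow_rate": 8.2,
    "rotation_speed": 4.5,
    "air_pressure": 101.3,
    "temperature": 215.0,
    "delay": 1.2,
    "density": 0.9,
}

request_body = build_serving_request([sample_reading], feature_names)
print(request_body)
```

The resulting string would be sent as the body of an authenticated POST request to the endpoint's `/invocations` URL.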

---

### 2. Streaming Predictions with Structured Streaming
Apply your model to streaming sensor data:
```python
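# Load the model as a Spark UDF
predict_udf = mlflow.pyfunc.spark_udf(spark, model_uri=champion_model_uri)

# Apply to streaming data
stream_df = spark.readStream.table(f"{catalog}.{schema}.sensor_bronze")
predictions = stream_df.withColumn("predicted_defect", predict_udf(*feature_cols))

# Write to an output table. Streaming writes need a checkpoint location;
# "<checkpoint-path>" is a placeholder to replace with your own path.
(
    predictions.writeStream
    .option("checkpointLocation", "<checkpoint-path>")
    .toTable(f"{catalog}.{schema}.sensor_predictions")
)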
```

**Use cases:**
- Continuous monitoring of all devices
- Automated alerting when defects are predicted
- Real-time dashboards with predictions

**Learn more:** [Structured Streaming + MLflow](https://docs.databricks.com/aws/en/structured-streaming/apply-ml-models.html)

---

### 3. Model Monitoring and Drift Detection
Track model performance over time:
```python
# Log inference data (results_df is the pandas DataFrame built in the batch prediction step)
mlflow.log_table(data=results_df, artifact_file="predictions.json")

# Monitor for:
# - Data drift (are input features changing?)
# - Concept drift (is the defect pattern changing?)
# - Performance drift (is accuracy decreasing?)
```

**Learn more:** [Lakehouse Monitoring](https://docs.databricks.com/aws/en/lakehouse-monitoring/index.html)
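
As a concrete picture of what a data drift check can look like, the sketch below computes a Population Stability Index (PSI) between a baseline feature sample and a newer one. PSI is a generic technique swapped in for illustration, not a description of how Lakehouse Monitoring works; the function name, bin count, and toy values are assumptions.

```python
import math

def population_stability_index(expected, actual, bins=10):
    """Compare two samples of one feature; larger values mean a bigger shift."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp values outside the baseline range
            counts[idx] += 1
        # A small floor avoids log(0) for empty bins
        return [max(c / len(values), 1e-6) for c in counts]

    e = proportions(expected)
    a = proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

# Hypothetical temperature readings: a baseline and a warmer, shifted sample
baseline = [float(x) for x in range(100)]
shifted = [x + 30.0 for x in baseline]

psi_same = population_stability_index(baseline, baseline)
psi_shifted = population_stability_index(baseline, shifted)
print(f"PSI (same data): {psi_same:.3f}")
print(f"PSI (shifted data): {psi_shifted:.3f}")
```

A common rule of thumb treats a PSI above roughly 0.25 as a shift worth investigating, though the right threshold depends on the feature and the use case.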

---

### 4. A/B Testing with Multiple Models
Compare "Champion" vs "Challenger" models in production:
```python
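# Tag a newer model version as Challenger.
# new_version is a placeholder for the version returned by a later mlflow.register_model call.
client.set_registered_model_alias(name=model_name, alias="Challenger", version=new_version)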

# Route 90% traffic to Champion, 10% to Challenger
# Measure which performs better in production
```
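
One way to picture the 90/10 split is deterministic, hash-based routing, so a given device always reaches the same model and its results stay comparable over time. This is a minimal sketch of that routing idea only; the function name `choose_alias` and the device IDs are hypothetical, and a serving platform may offer its own traffic-splitting configuration that would replace this logic.

```python
import hashlib

def choose_alias(device_id, challenger_share=0.10):
    """Map a device ID to 'Champion' or 'Challenger' in a stable way."""
    digest = hashlib.sha256(device_id.encode("utf-8")).digest()
    # Turn the first 8 bytes into a number in [0, 1)
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "Challenger" if bucket < challenger_share else "Champion"

# Hypothetical device IDs
assignments = {f"device-{i}": choose_alias(f"device-{i}") for i in range(1000)}
challenger_fraction = sum(a == "Challenger" for a in assignments.values()) / len(assignments)

print(f"Share of devices routed to Challenger: {challenger_fraction:.1%}")
```

The returned alias can be plugged into a `models:/<name>@<alias>` URI, so routing and model promotion stay independent of each other.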

---

### 5. Hyperparameter Tuning with Hyperopt
Automatically find the best parameters:
```python
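from hyperopt import fmin, tpe, hp, space_eval
import mlflow
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

mlflow.autolog()  # enable once, outside the objective

def objective(params):
    # Each trial is logged as a nested child run
    with mlflow.start_run(nested=True):
        model = RandomForestClassifier(**params, random_state=42)
        model.fit(X_train, y_train)
        # Hyperopt minimizes, so return negative accuracy.
        # In practice, score on a separate validation split rather than the test set.
        return -accuracy_score(y_test, model.predict(X_test))

search_space = {
    'n_estimators': hp.choice('n_estimators', [50, 100, 200]),
    'max_depth': hp.choice('max_depth', [5, 10, 15, 20])
}

# A parent run groups the trials together in the MLflow UI
with mlflow.start_run(run_name="IoT Defect Prediction - Hyperopt"):
    best_indices = fmin(fn=objective, space=search_space, algo=tpe.suggest, max_evals=10)

# hp.choice reports list indices; space_eval maps them back to the actual values
best_params = space_eval(search_space, best_indices)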
```

**Learn more:** [Hyperparameter Tuning](https://docs.databricks.com/aws/en/machine-learning/automl-hyperparam-tuning/index.html)

---

### 6. Feature Store for Reusable Features
Create a centralized feature repository:
```python
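# For Unity Catalog feature tables, the feature engineering client is used here
from databricks.feature_engineering import FeatureEngineeringClient

fe = FeatureEngineeringClient()

# Create feature table.
# feature_df is a placeholder for a Spark DataFrame with one row per device_id.
fe.create_table(
    name=f"{catalog}.{schema}.sensor_features",
    primary_keys=["device_id"],
    df=feature_df
)

# Models logged through the feature client record their feature dependencies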
```

**Learn more:** [Feature Store](https://docs.databricks.com/aws/en/machine-learning/feature-store/index.html)

## 📚 Additional Resources

- [MLflow Quickstart](https://docs.databricks.com/aws/en/mlflow/quick-start.html)
- [MLflow 3 Migration Guide](https://docs.databricks.com/aws/en/mlflow/mlflow-3-install.html)
- [Unity Catalog Model Registry](https://docs.databricks.com/aws/en/machine-learning/manage-model-lifecycle/index.html)
- [Databricks Autologging](https://docs.databricks.com/aws/en/mlflow/databricks-autologging.html)
- [Model Deployment Guide](https://docs.databricks.com/aws/en/machine-learning/model-serving/index.html)

&copy; 2024 Databricks, Inc. All rights reserved.<br/>
Apache, Apache Spark, Spark and the Spark logo are trademarks of the <a href="https://www.apache.org/">Apache Software Foundation</a>.<br/>