# Phase 4.1: Preparing a Model for Serving

This notebook walks through four steps:
1. **Training a Production-Ready Model** - Build a model ready for deployment
2. **Proper Signature** - Ensure input/output schemas are defined
3. **Input Examples** - Include sample inputs with the model
4. **Registration & Promotion** - Register the model and promote it to the Production stage

## What is Model Serving?

**Model Serving** means making your trained model available as an API endpoint that can receive requests and return predictions.

```
Client Request          Model Server          Response
    │                       │                    │
    │  {"features": [...]}  │                    │
    │ ─────────────────────>│                    │
    │                       │  Load Model        │
    │                       │  Make Prediction   │
    │                       │<───────────────────│
    │   {"prediction": 0}   │                    │
    │<──────────────────────│                    │
```
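
The request/response cycle in the diagram can be sketched in a few lines of plain Python. This is a toy stand-in, not MLflow's scoring server: `ThresholdModel` and `handle_request` are hypothetical names, and the `{"features": ...}` and `{"prediction": ...}` shapes simply mirror the diagram.

```python
import json


class ThresholdModel:
    """Hypothetical stand-in for a trained model (not an MLflow object)."""

    def predict(self, rows):
        # Label a row 0 when its third feature is small, otherwise 1.
        return [0 if row[2] < 2.5 else 1 for row in rows]


def handle_request(body: str, model) -> str:
    """Parse a JSON request, run the model, and return a JSON response."""
    payload = json.loads(body)
    predictions = model.predict([payload["features"]])
    return json.dumps({"prediction": predictions[0]})


response = handle_request('{"features": [5.1, 3.5, 1.4, 0.2]}', ThresholdModel())
print(response)
```

A real model server adds the parts this sketch leaves out: loading the model once at startup, validating inputs against the signature, and listening on an HTTP port.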

## Prerequisites for Serving

MLflow can serve any logged model by URI, but this notebook treats a model as ready for serving only once it has:
1. ✅ **Signature** - Defines expected input/output format
2. ✅ **Input Example** - Sample data for testing
3. ✅ **Registered** - In the Model Registry
4. ✅ **Stage** - Promoted to Production (or Staging)

## Learning Goals
- Prepare a model for serving
- Include proper signatures and examples
- Register and promote to Production
- Understand the serving command

## Step 1: Import Libraries

In [None]:
# MLflow imports
import mlflow
import mlflow.sklearn
from mlflow.tracking import MlflowClient

# sklearn imports
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report

# Data handling
import pandas as pd
import time
import os

# Suppress warnings
import warnings
warnings.filterwarnings('ignore')

print("All libraries imported successfully!")
print("Ready to prepare a model for serving!")

## Step 2: Connect to MLflow

In [None]:
# Get MLflow tracking server URL
TRACKING_URI = os.getenv("MLFLOW_TRACKING_URI", "http://localhost:5000")

# Connect to MLflow
mlflow.set_tracking_uri(TRACKING_URI)
mlflow.set_experiment("phase4-serving")

# Model name for serving
MODEL_NAME = "iris-serving-model"

# Create client
client = MlflowClient()

print(f"Connected to MLflow at: {TRACKING_URI}")
print(f"Experiment: phase4-serving")
print(f"Model name: {MODEL_NAME}")

## Step 3: Clean Up Existing Model (For Demo)

In [None]:
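# Clean up existing model for fresh demo
try:
    client.delete_registered_model(MODEL_NAME)
    print(f"Cleaned up existing model: {MODEL_NAME}")
except mlflow.exceptions.MlflowException:
    # Catch only MLflow errors (e.g. the model does not exist yet);
    # a bare `except:` would also swallow KeyboardInterrupt and typos.
    print("No existing model to clean up.")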

## Step 4: Load and Prepare Data

In [None]:
# Load Iris dataset
iris = load_iris()

# Create DataFrame with feature names
# Using a DataFrame is important for proper signatures!
X = pd.DataFrame(iris.data, columns=iris.feature_names)

# Create target as a Series with name
y = pd.Series(iris.target, name="species")

# Split data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print("="*60)
print("Preparing Model for Serving")
print("="*60)

print(f"\nDataset: Iris")
print(f"Features: {list(X.columns)}")
print(f"Classes: {list(iris.target_names)}")
print(f"Train size: {len(X_train)}, Test size: {len(X_test)}")

## Step 5: Train Model with Full Logging

We'll train a model and log everything needed for serving.

In [None]:
print("\n[1] Training Model")
print("-" * 40)

with mlflow.start_run(run_name="serving-model") as run:
    # Train a production-quality model
    model = RandomForestClassifier(
        n_estimators=100,
        max_depth=10,
        random_state=42
    )
    model.fit(X_train, y_train)
    
    # Evaluate
    y_pred = model.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    
    print(f"  Accuracy: {accuracy:.4f}")
    
    # Show classification report
    print(f"\n  Classification Report:")
    report = classification_report(y_test, y_pred, target_names=iris.target_names)
    for line in report.split('\n'):
        print(f"    {line}")
    
    # Log parameters
    mlflow.log_param("n_estimators", 100)
    mlflow.log_param("max_depth", 10)
    
    # Log metrics
    mlflow.log_metric("accuracy", accuracy)
    
    # ========================================
    # CRITICAL FOR SERVING: Create Signature
    # ========================================
    print("\n[2] Creating Model Signature")
    print("-" * 40)
    
    # The signature defines what input the model expects
    # and what output it produces
    signature = mlflow.models.infer_signature(
        X_train,                    # Sample input
        model.predict(X_train)      # Sample output
    )
    
    print(f"  Input schema:")
    print(f"    {signature.inputs}")
    print(f"  Output schema:")
    print(f"    {signature.outputs}")
    
    # ========================================
    # CRITICAL FOR SERVING: Log with Metadata
    # ========================================
    print("\n[3] Logging Model with Serving Metadata")
    print("-" * 40)
    
    # Log the model with:
    # 1. signature - Input/output schema
    # 2. input_example - Sample data for testing
    # 3. registered_model_name - Auto-register
    mlflow.sklearn.log_model(
        model,
        "model",
        signature=signature,
        input_example=X_train.iloc[:2],  # First 2 rows as example
        registered_model_name=MODEL_NAME  # Auto-register!
    )
    
    print(f"  Model logged and registered!")
    print(f"  Run ID: {run.info.run_id}")
    
    saved_run_id = run.info.run_id
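
To make concrete what a column-based signature provides at request time, the sketch below mimics the idea of schema enforcement in plain Python. It is a simplified illustration, not MLflow's implementation: `IRIS_SCHEMA` and `check_row` are hypothetical names, and the `double` labels are an assumption about how the float64 Iris columns appear in the inferred signature.

```python
# Hypothetical, simplified view of a column-based input schema:
# each entry is (column name, type label).
IRIS_SCHEMA = [
    ("sepal length (cm)", "double"),
    ("sepal width (cm)", "double"),
    ("petal length (cm)", "double"),
    ("petal width (cm)", "double"),
]


def check_row(row: dict, schema=IRIS_SCHEMA) -> list:
    """Return a list of problems found in one input row (empty if valid)."""
    problems = []
    for name, dtype in schema:
        if name not in row:
            problems.append(f"missing column: {name}")
        elif dtype == "double" and (
            isinstance(row[name], bool) or not isinstance(row[name], (int, float))
        ):
            problems.append(f"column {name!r} is not numeric")
    return problems


good = {
    "sepal length (cm)": 5.1,
    "sepal width (cm)": 3.5,
    "petal length (cm)": 1.4,
    "petal width (cm)": 0.2,
}
bad = {"sepal length (cm)": "five", "sepal width (cm)": 3.5}

print(check_row(good))
print(check_row(bad))
```

The point of the sketch is that named columns let a server reject a malformed request before the model ever sees it, which is why the data was kept in a DataFrame rather than a bare array.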

## Step 6: Promote to Production

In [None]:
print("\n[4] Promoting to Production")
print("-" * 40)

# Wait for registration to complete
time.sleep(2)

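# Get the latest version that has no stage yet (new versions start in "None").
# Note: registry stages are deprecated as of MLflow 2.9 in favor of aliases;
# on newer releases these calls emit deprecation warnings.
latest_versions = client.get_latest_versions(MODEL_NAME, stages=["None"])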

if latest_versions:
    latest = latest_versions[0]
    
    # Promote to Production
    client.transition_model_version_stage(
        name=MODEL_NAME,
        version=latest.version,
        stage="Production"
    )
    
    print(f"  Version {latest.version} -> Production")
else:
    print("  No version found to promote")
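
`get_latest_versions` returns the most recent version for each requested stage. That selection can be pictured with the sketch below, which works on plain dictionaries rather than MLflow objects. `latest_in_stage` is a hypothetical helper and the sample records are made up for illustration. Versions are converted with `int()` on the assumption that they may arrive as strings.

```python
def latest_in_stage(versions, stage):
    """Return the highest-numbered version record in `stage`, or None."""
    candidates = [v for v in versions if v["current_stage"] == stage]
    if not candidates:
        return None
    # Compare numerically: as strings, "10" would sort before "9".
    return max(candidates, key=lambda v: int(v["version"]))


# Made-up registry contents for illustration only.
versions = [
    {"version": "9", "current_stage": "None"},
    {"version": "10", "current_stage": "None"},
    {"version": "3", "current_stage": "Production"},
]

print(latest_in_stage(versions, "None")["version"])
print(latest_in_stage(versions, "Staging"))
```

Returning `None` for an empty stage mirrors the `if latest_versions:` guard in the cell above: promotion is skipped when there is nothing to promote.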

## Step 7: Verify Model is Ready

In [None]:
print("\n[5] Verifying Model is Ready for Serving")
print("-" * 40)

# Load the production model
prod_model = mlflow.pyfunc.load_model(f"models:/{MODEL_NAME}/Production")

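# Test predictions on the first three held-out rows
# (.iloc makes the positional slice explicit for both DataFrame and Series)
test_predictions = prod_model.predict(X_test.iloc[:3])

print(f"  Test predictions: {list(test_predictions)}")
print(f"  Expected:         {list(y_test.iloc[:3])}")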
print(f"\n  Model is ready for serving!")

## Summary: Model Ready for Serving

Here's what we've done to prepare the model:

In [None]:
print("\n" + "="*60)
print("MODEL READY FOR SERVING")
print("="*60)

# Get version info
latest = client.get_latest_versions(MODEL_NAME, stages=["Production"])[0]

print(f"\n  Model name: {MODEL_NAME}")
print(f"  Version: {latest.version}")
print(f"  Stage: Production")
print(f"  Accuracy: {accuracy:.4f}")

print("\n" + "-"*60)
print("HOW TO SERVE THIS MODEL")
print("-"*60)

print(f"""
Option 1: MLflow CLI
  mlflow models serve -m 'models:/{MODEL_NAME}/Production' -p 5001 --env-manager local

Option 2: Docker
  mlflow models build-docker -m 'models:/{MODEL_NAME}/Production' -n my-model
  docker run -p 5001:8080 my-model

Option 3: Custom script
  Use ./scripts/serve-model.sh {MODEL_NAME} Production
""")

print("-"*60)
print("MAKING PREDICTIONS (after starting server)")
print("-"*60)

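import json

# Name the columns so the request matches the model signature
# (dataframe_split is the MLflow 2.x request format)
payload = json.dumps(
    {"dataframe_split": {"columns": list(X.columns), "data": [[5.1, 3.5, 1.4, 0.2]]}}
)

# Doubled backslashes so the shell line continuations survive printing
print(f"""
curl -X POST http://localhost:5001/invocations \\
  -H "Content-Type: application/json" \\
  -d '{payload}'
""")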

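The same request can be sent from Python. The sketch below uses only the standard library and assumes a server started with one of the options above is listening on `http://localhost:5001`, and that it speaks the MLflow 2.x JSON protocol (a `dataframe_split` request body and a `{"predictions": [...]}` response). `build_request` and `predict` are hypothetical helper names.

```python
import json
import urllib.request

# Iris feature names, matching the DataFrame columns used for training.
FEATURES = [
    "sepal length (cm)",
    "sepal width (cm)",
    "petal length (cm)",
    "petal width (cm)",
]


def build_request(rows, url="http://localhost:5001/invocations"):
    """Build (but do not send) a POST request for the scoring endpoint."""
    body = json.dumps({"dataframe_split": {"columns": FEATURES, "data": rows}})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(rows, url="http://localhost:5001/invocations", timeout=10):
    """Send the request and return the list under the "predictions" key."""
    with urllib.request.urlopen(build_request(rows, url), timeout=timeout) as resp:
        return json.loads(resp.read())["predictions"]


request = build_request([[5.1, 3.5, 1.4, 0.2]])
print(request.get_method(), request.full_url)
# With a running server: predict([[5.1, 3.5, 1.4, 0.2]])
```

Building the request separately from sending it keeps the payload easy to inspect, so it can be checked before any server is running.
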
## Key Points for Model Serving

### Checklist Before Serving

| Requirement | Why It's Needed |
|-------------|----------------|
| **Signature** | Validates input format at runtime |
| **Input Example** | Enables testing and documentation |
| **Registered** | Enables versioning and stage management |
| **Production Stage** | Indicates model is ready for live traffic |

### Model Logging for Serving

```python
# Include ALL metadata when logging
mlflow.sklearn.log_model(
    model,
    "model",
    signature=signature,           # Required for validation
    input_example=sample_data,     # For testing
    registered_model_name="name"   # Auto-register
)
```
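
Once a model is logged this way, the checklist above can be expressed as a small function. The sketch below is duck-typed so it runs without a tracking server: `readiness_report` is a hypothetical helper that expects an object exposing `signature` and `saved_input_example_info` attributes (assumed here to match what `mlflow.models.get_model_info` returns in your MLflow version) plus the registry stage as a string.

```python
from types import SimpleNamespace


def readiness_report(model_info, stage):
    """Map each checklist item to True/False for one model version."""
    return {
        "signature": model_info.signature is not None,
        "input_example": model_info.saved_input_example_info is not None,
        "registered": stage is not None,
        "serving_stage": stage in ("Staging", "Production"),
    }


# Stand-in objects for illustration; real values would come from MLflow.
complete = SimpleNamespace(
    signature="<schema>", saved_input_example_info={"type": "dataframe"}
)
bare = SimpleNamespace(signature=None, saved_input_example_info=None)

print(readiness_report(complete, "Production"))
print(readiness_report(bare, "None"))
```

Here `stage=None` stands for a model that was never registered, while the string `"None"` is the registry's label for a registered version that has not been promoted.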

In [None]:
print("="*60)
print("Preparing Model for Serving - Complete!")
print("="*60)
print(f"\nView at: {TRACKING_URI}")
print("\nWhat you learned:")
print("  1. What model serving is and why it matters")
print("  2. How to create model signatures")
print("  3. How to include input examples")
print("  4. How to register and promote models")
print("  5. Commands to serve the model")
print("\nNext: Run 02_test_serving.ipynb to test the serving endpoint!")