# 🚀 MLOps & Production - Exercises

Practice deployment, monitoring, and production ML.

## Level 1: Fill-in-the-Blank (⭐)

In [None]:
# Exercise 1.1: Create FastAPI endpoint
from ________ import FastAPI
from pydantic import BaseModel

app = ________()

class PredictRequest(BaseModel):
    features: list

@app.________(________)  # POST /predict
def predict(request: PredictRequest):
    return {"prediction": [0, 1]}

In [None]:
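# Exercise 1.2: Log a metric and a model with MLflow
# Assumes `model` is an already-fitted scikit-learn estimator.
import ________

with mlflow.________():
    mlflow.log_______('accuracy', 0.95)
    # The estimator comes first, then the artifact path
    mlflow.sklearn.log_______(model, 'model')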

## Level 2: Code Completion (⭐⭐)

In [None]:
# Exercise 2.1: Implement model serving API
# Complete the prediction endpoint with error handling

from fastapi import FastAPI, HTTPException
import joblib
import numpy as np

app = FastAPI()
model = joblib.load('model.joblib')

@app.post("/predict")
def predict(data: dict):
    try:
        # 1. Extract features from data
        features = # YOUR CODE
        
        # 2. Validate input shape
        # YOUR CODE
        
        # 3. Make prediction
        prediction = # YOUR CODE
        
        # 4. Return response
        return # YOUR CODE
        
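    except HTTPException:
        # Let deliberate client errors (e.g. a 400 from validation) pass through
        raise
    except Exception as e:
        # Anything unexpected becomes a 500
        raise HTTPException(status_code=500, detail=str(e))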
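
Steps 1 and 2 can be prototyped without the web framework. The sketch below is one possible approach, not the reference solution: `extract_features` and `N_FEATURES` are hypothetical names, and the feature count of 4 is an assumption for illustration. It raises `ValueError` on bad input, which the endpoint could translate into an HTTP 400 response.

```python
from numbers import Real

N_FEATURES = 4  # assumed feature count; use whatever the model was trained on


def extract_features(data: dict, n_features: int = N_FEATURES) -> list:
    """Pull a flat list of numeric features out of a request payload."""
    if "features" not in data:
        raise ValueError("payload must contain a 'features' key")
    features = data["features"]
    if not isinstance(features, (list, tuple)):
        raise ValueError("'features' must be a list")
    if len(features) != n_features:
        raise ValueError(f"expected {n_features} features, got {len(features)}")
    for value in features:
        # bool is a subclass of int, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, Real):
            raise ValueError("all features must be numeric")
    return [float(v) for v in features]
```

Calling `extract_features({"features": [5.1, 3.5, 1.4, 0.2]})` returns the values as floats, while a payload with a missing key, the wrong length, or non-numeric entries raises `ValueError`.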

## Level 3: Debug (⭐⭐⭐)

In [None]:
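# Exercise 3.1: Fix the Dockerfile (4 bugs)
# Dockerfile comments must start on their own line, so each hint sits above the line it describes.

dockerfile = '''
# Bug 1: Image is too large; use a slim variant
FROM python:latest

WORKDIR /app

# Bug 2: Copy requirements.txt first so the dependency layer is cached
COPY . .
RUN pip install -r requirements.txt

# Bug 3: Missing EXPOSE for the service port
# Bug 4: Use a production server (e.g. gunicorn, or uvicorn for FastAPI) instead of running the script directly
CMD ["python", "app.py"]
'''

# Write the corrected Dockerfile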

## Level 4: Business Case (⭐⭐⭐⭐)

In [None]:
# CHALLENGE: Production ML System Design
#
# Scenario: Design a recommendation system for a streaming platform
#
# Requirements:
# 1. Serve 10M users, 100K requests/second
# 2. Latency < 50ms p99
# 3. Update recommendations daily
# 4. A/B testing for new models
# 5. Monitor for drift and performance
#
# Design:
# - Architecture diagram
# - Technology choices (with justification)
# - Cost estimate (monthly)
# - Monitoring strategy

# YOUR DESIGN DOCUMENT
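
A quick back-of-envelope calculation can help size the serving tier before choosing technologies. The sketch below uses only the stated requirements (100K requests/second, 50 ms p99) plus Little's law; treating the p99 latency as the time each request stays in flight gives a conservative upper bound on concurrency. The per-replica throughput of 2,000 requests/second is a made-up placeholder, not a benchmark.

```python
import math

PEAK_RPS = 100_000        # from the requirements
P99_LATENCY_S = 0.050     # 50 ms, from the requirements
RPS_PER_REPLICA = 2_000   # hypothetical; measure this with a load test

# Little's law: in-flight requests = arrival rate x time in system.
# Using the p99 latency as the time in system gives an upper bound.
max_in_flight = PEAK_RPS * P99_LATENCY_S

# Replicas needed to absorb peak traffic, before adding any headroom.
replicas = math.ceil(PEAK_RPS / RPS_PER_REPLICA)

# Total requests per day if the peak rate were sustained (worst case).
requests_per_day = PEAK_RPS * 86_400

print(f"in-flight requests (upper bound): {max_in_flight:.0f}")
print(f"replicas at peak: {replicas}")
print(f"requests per day at sustained peak: {requests_per_day:,}")
```

Numbers like these feed directly into the cost estimate: replica count times instance price gives a first monthly figure, which can then be refined with real load-test data.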

## 📊 Quiz

**Q1**: What's the difference between batch and real-time inference?

**Q2**: Why use Docker for ML deployment?

**Q3**: What is model drift and how do you detect it?

---
## 🔑 Solutions
<details>
<summary>Click to reveal</summary>

**1.1**: fastapi, FastAPI, post, "/predict"  
**1.2**: mlflow, start_run, metric, model  
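**Q1**: Batch inference scores many records at once on a schedule (offline) and stores the results; real-time inference scores each request as it arrives (online), under a latency budget  
**Q2**: Reproducibility (same dependencies everywhere), portability across machines and clouds, and isolation from other workloads  
**Q3**: Model drift is the degradation of model performance over time as the input data (data drift) or the input-to-target relationship (concept drift) changes; detect it by monitoring accuracy on labeled samples and by comparing production feature and prediction distributions against the training data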
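
Accuracy-based monitoring needs labels, which often arrive late. A common complement is to compare the distribution of an input feature in production against the training data. The sketch below computes the Population Stability Index (PSI) in plain Python; the function name `psi`, the choice of 10 equal-width bins, and the smoothing constant `eps` are assumptions for illustration. Values above about 0.25 are often read as a significant shift, but thresholds should be tuned per feature.

```python
import math


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between a reference and a live sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant data

    def proportions(values):
        counts = [0] * bins
        for v in values:
            # Clamp out-of-range live values into the first or last bin.
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # eps keeps empty bins from producing log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


reference = [i / 100 for i in range(100)]      # spread evenly over [0, 1)
shifted = [0.5 + i / 200 for i in range(100)]  # concentrated in [0.5, 1)

print(psi(reference, reference))  # identical samples give 0.0
print(psi(reference, shifted))    # a clearly positive value
```

In a monitoring job, `reference` would be a sample of the training data and `shifted` a recent window of production inputs, with one PSI value tracked per feature.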

</details>