## Phase 7: Model Deployment

### What is Deployment?

Deployment means taking a trained model and putting it into production so it can make real predictions on new data.

### Deployment Architectures

**1. Batch Prediction (Offline)**

Used when: predictions are needed on a schedule (daily or weekly) rather than in real time


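```python
import joblib
import pandas as pd

# Save the trained model and the fitted preprocessor,
# so production applies exactly the same transformations as training
joblib.dump(best_model, 'models/churn_model.joblib')
joblib.dump(preprocessor, 'models/preprocessor.joblib')

# In production, load both artifacts and make predictions on new data
def batch_predict(new_data_path):
    """
    Load new data, score it, and save the churn probabilities
    """
    model = joblib.load('models/churn_model.joblib')
    preprocessor = joblib.load('models/preprocessor.joblib')
    new_data = pd.read_csv(new_data_path)

    # Preprocess with the same fitted pipeline used in training
    new_data_processed = preprocessor.transform(new_data)

    # Predict: column 1 holds the probability of the positive (churn) class
    predictions = model.predict_proba(new_data_processed)[:, 1]

    # Save results
    results = pd.DataFrame({
        'customer_id': new_data['customer_id'],
        'churn_probability': predictions,
        'prediction_date': pd.Timestamp.now()
    })

    results.to_csv('outputs/churn_predictions.csv', index=False)
    return results

# Schedule with a workflow tool such as Airflow, e.g. daily at 2 AM for all customers.
# This example writes a CSV; a production job would typically write to a database.
```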
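The batch job above writes a CSV, while the scheduling note mentions a database as the destination. As a minimal sketch of that last step, the snippet below stores predictions with Python's built-in `sqlite3` module. SQLite is only a stand-in for whatever database is used in production, and the `save_predictions` helper and the `churn_predictions` table name are hypothetical.

```python
import sqlite3
from datetime import datetime, timezone


def save_predictions(conn, rows):
    """Store (customer_id, churn_probability) pairs with one shared run timestamp.

    Hypothetical helper: SQLite stands in for the production database.
    """
    rows = list(rows)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS churn_predictions ("
        "customer_id INTEGER, churn_probability REAL, prediction_date TEXT)"
    )
    run_ts = datetime.now(timezone.utc).isoformat()
    conn.executemany(
        "INSERT INTO churn_predictions VALUES (?, ?, ?)",
        [(int(cid), float(prob), run_ts) for cid, prob in rows],
    )
    conn.commit()
    return len(rows)  # number of rows written


# Usage with an in-memory database and made-up example values
demo_conn = sqlite3.connect(":memory:")
written = save_predictions(demo_conn, [(101, 0.82), (102, 0.13)])
```

Stamping every row with the same run timestamp makes it easy to select or roll back one batch run as a unit.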


**2. Real-Time API (Online)**

Used when: predictions are needed immediately, per request (fraud detection, recommendations)


```python
# Using FastAPI (modern Python web framework)
import joblib
import pandas as pd
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Load the model and fitted preprocessor once at startup, not on every request
model = joblib.load('models/churn_model.joblib')
preprocessor = joblib.load('models/preprocessor.joblib')

class PredictionInput(BaseModel):
    customer_id: int
    age: int
    total_spent: float
    transaction_count: int
    account_age_days: int
    account_type: str
    country: str

# Plain `def` rather than `async def`: FastAPI runs it in a worker thread,
# so the blocking model call does not stall the event loop
@app.post("/predict-churn")
def predict_churn(input_data: PredictionInput):
    """
    Endpoint to predict churn probability
    """
    # Prepare data (on Pydantic v2, prefer input_data.model_dump())
    X = pd.DataFrame([input_data.dict()])

    # Preprocess
    X_processed = preprocessor.transform(X)

    # Predict
    churn_prob = float(model.predict_proba(X_processed)[0, 1])
    prediction = int(model.predict(X_processed)[0])

    return {
        'customer_id': input_data.customer_id,
        'churn_probability': churn_prob,
        'prediction': 'Will Churn' if prediction == 1 else 'Will Not Churn',
        'recommendation': 'Send retention offer' if churn_prob > 0.7 else 'No action needed'
    }

# Example request once the API is running (for example, in Docker):
# curl -X POST "http://localhost:8000/predict-churn" \
#      -H "Content-Type: application/json" \
#      -d '{"customer_id": 123, "age": 45, "total_spent": 5000.0, ...}'
```
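The endpoint above mixes model inference with a business rule (send a retention offer when the churn probability exceeds 0.7). One possible refinement, sketched here, is to pull that rule into a pure function so it can be unit-tested without starting the server or loading the model. The `build_response` name and the `label_threshold` parameter are assumptions for illustration; the 0.5 default is assumed to approximate what `model.predict` does for a typical binary classifier.

```python
def build_response(customer_id, churn_prob, label_threshold=0.5, offer_threshold=0.7):
    """Turn a churn probability into the API response payload.

    Hypothetical helper: thresholds are illustrative defaults, not tuned values.
    """
    if not 0.0 <= churn_prob <= 1.0:
        raise ValueError("churn_prob must be between 0 and 1")
    will_churn = churn_prob >= label_threshold
    return {
        "customer_id": customer_id,
        "churn_probability": churn_prob,
        "prediction": "Will Churn" if will_churn else "Will Not Churn",
        "recommendation": (
            "Send retention offer" if churn_prob > offer_threshold else "No action needed"
        ),
    }


# Usage with a made-up probability
example = build_response(customer_id=123, churn_prob=0.81)
```

Keeping the thresholds as parameters also makes it straightforward to adjust the retention-offer cutoff without retraining or redeploying the model.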


**3. Containerization with Docker**

Packages the model, code, and dependencies into one image, so the service runs the same way everywhere (laptop, server, cloud)

```dockerfile
# Dockerfile
FROM python:3.9-slim

WORKDIR /app

# Install dependencies first so this layer is cached between code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy model artifacts into models/, the path the API loads them from
COPY models/churn_model.joblib models/preprocessor.joblib models/

# Copy API code
COPY api.py .

# Expose port
EXPOSE 8000

# Run API
CMD ["uvicorn", "api:app", "--host", "0.0.0.0", "--port", "8000"]
```

```bash
# Build and run
docker build -t churn-model .
docker run -p 8000:8000 churn-model
```

**4. Model Serving Platforms**

For production-scale serving:

```
Option 1: TensorFlow Serving
- Built for serving TensorFlow models
- High throughput, low latency
- Developed and used at Google

Option 2: Seldon Core
- Kubernetes-native model serving
- Supports multiple ML frameworks
- Includes A/B testing, canary deployments

Option 3: KServe
- Built on Kubernetes
- Multi-framework support
- Automatic scaling

Option 4: Managed Cloud Platforms
- AWS SageMaker
- Google Vertex AI
- Azure Machine Learning
- Managed endpoints with autoscaling, less infrastructure to operate yourself
```
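Canary deployments, mentioned under Seldon above, send a small share of traffic to a new model version before a full rollout. Serving platforms handle this at the infrastructure level; the sketch below only illustrates the underlying idea and is not how any particular platform implements it. The `route_to_canary` function is hypothetical. It hashes the customer ID so that the same customer is always routed to the same model version.

```python
import hashlib


def route_to_canary(customer_id, canary_percent):
    """Return True if this customer should be served by the canary model.

    Hypothetical helper: hashing gives a stable bucket in [0, 100) per customer.
    """
    digest = hashlib.sha256(str(customer_id).encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < canary_percent


# Usage: route roughly 10% of customers to the new model version
canary_customers = [cid for cid in range(1000) if route_to_canary(cid, 10)]
```

Deterministic routing keeps each customer's experience consistent during the rollout and makes it possible to compare outcomes between the two groups afterwards.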

### Deployment Checklist

```
Pre-Deployment:

[✓] Model trained and validated
[✓] Preprocessing pipeline saved with model
[✓] Performance metrics documented
[✓] API endpoints defined
[✓] Input validation implemented
[✓] Error handling added
[✓] Logging configured
[✓] Security reviewed (authentication, rate limiting)

Post-Deployment:

[✓] Monitor prediction latency
[✓] Monitor error rates
[✓] Track model performance (prediction accuracy, once actual outcomes are known)
[✓] Monitor data drift (input distribution changes)
[✓] Set up alerting for failures
[✓] Document API usage
[✓] Plan for model updates
```
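For the data drift item in the checklist, one commonly used measure is the Population Stability Index (PSI), which compares the distribution of a feature at training time with its distribution in production. The sketch below is a minimal pure-Python version for a single numeric feature. The function name, the choice of 10 equal-width bins, and the small floor for empty bins are all assumptions; it also assumes both input lists are non-empty.

```python
import math


def population_stability_index(expected, actual, bins=10):
    """PSI between a reference sample (expected) and a new sample (actual).

    Hypothetical helper: uses equal-width bins spanning the reference sample.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant features

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clip out-of-range values
        # Floor at a small value so empty bins do not cause log(0)
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Usage with made-up values: production ages shifted 50 units upward
training_ages = list(range(100))
production_ages = [age + 50 for age in training_ages]
psi = population_stability_index(training_ages, production_ages)
```

A frequently quoted rule of thumb treats a PSI below 0.1 as stable and above 0.25 as a significant shift, but these cutoffs are conventions rather than guarantees and are worth calibrating per feature.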

### Tools Used in Deployment

| Tool | Purpose |
|------|---------|
| Docker | Containerization |
| FastAPI | REST API framework |
| Flask | Lightweight API framework |
| Airflow | Scheduling batch prediction jobs |
| TensorFlow Serving | Serving TensorFlow models |
| Seldon | Model serving on Kubernetes |
| KServe | Model serving on Kubernetes |
| AWS SageMaker | Cloud model deployment |
| Kubernetes | Container orchestration |

---
