# Week 16 — Deployment & Capstone

This notebook walks through packaging a trained model, exposing a minimal inference API, containerizing it, and adding basic health and metrics endpoints. The goal is a small reproducible demo for your capstone.

## Setup

We use `FastAPI` + `uvicorn` for a minimal ASGI service. Model loading uses PyTorch, and containerization uses Docker. Install `fastapi`, `uvicorn`, and `torch` locally before testing.

In [None]:
# Imports for a minimal API example
import os
import pickle
import torch
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class InputPayload(BaseModel):
    data: list

# Cache utilities
CACHE_DIR = "cache_week16"
os.makedirs(CACHE_DIR, exist_ok=True)

def save_result(key, obj):
    with open(os.path.join(CACHE_DIR, f"{key}.pkl"), "wb") as f:
        pickle.dump(obj, f)

def load_result(key):
    path = os.path.join(CACHE_DIR, f"{key}.pkl")
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    return None

def cached(key, compute_fn):
    result = load_result(key)
    if result is not None:
        print(f"[cache] loaded '{key}'")
        return result
    print(f"[cache] computing '{key}'...")
    result = compute_fn()
    save_result(key, result)
    return result

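# Replace with the path to your saved model.
MODEL_PATH = 'model.pt'
# Uncomment once the file exists. If it holds a state_dict rather than a full
# model object, construct the model first and call model.load_state_dict(...).
# model = torch.load(MODEL_PATH)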

@app.get('/health')
def health():
    return {'status': 'ok'}

@app.post('/predict')
def predict(payload: InputPayload):
    # Convert payload.data -> tensor, run model, return result
    return {'pred': None, 'note': 'Fill in conversion and model inference'}
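
The placeholder in `predict` can be filled in along the following lines. This is a minimal sketch, assuming the model accepts a float tensor of shape `(batch, features)` and returns a tensor. The helper name `run_inference` is hypothetical and not part of the starter code.

```python
import torch


def run_inference(model, data):
    """Convert a (possibly nested) list of numbers to a tensor and run one forward pass.

    Assumes `model` is a torch.nn.Module that takes a float32 tensor of shape
    (batch, features). Adjust dtype and shape handling for your own model.
    """
    x = torch.tensor(data, dtype=torch.float32)
    if x.dim() == 1:
        x = x.unsqueeze(0)  # treat a flat list as a batch of one sample
    model.eval()  # disable dropout and use running batch-norm statistics
    with torch.no_grad():  # no gradients are needed at inference time
        output = model(x)
    return output.tolist()  # plain nested lists are JSON-serializable
```

Inside `predict`, the return statement could then become `return {'pred': run_inference(model, payload.data)}`, assuming `model` was loaded once at module import rather than on every request.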

## Exercise 1 — Minimal API

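Save the code above as a module (for example `starter.py`) and run it locally with `uvicorn starter:app --reload`, replacing `starter` with your module name. By default uvicorn serves on `http://127.0.0.1:8000`. Test `/predict` with sample JSON payloads such as `{"data": [1.0, 2.0, 3.0]}` and measure latency for simple inputs.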
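
One way to measure latency from the client side is sketched below using only the standard library. The helper names `post_json` and `measure_latency`, the URL, and the sample payload are assumptions for illustration. Adapt them to your own service.

```python
import json
import statistics
import time
import urllib.request


def post_json(url, payload, timeout=5.0):
    """POST a JSON payload and return the decoded JSON response."""
    body = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def measure_latency(call, n=20):
    """Invoke `call()` n times and return latency statistics in milliseconds."""
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        call()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "n": n,
        "mean_ms": statistics.mean(samples),
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
    }


# Example usage (assumes the server is already running on the default port):
# stats = measure_latency(
#     lambda: post_json("http://127.0.0.1:8000/predict", {"data": [1.0, 2.0, 3.0]})
# )
# print(stats)
```

The first request is often slower than the rest because of connection setup and lazy initialization, so it can be worth reporting the median alongside the mean.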

## Exercise 2 — Containerize

Create a `Dockerfile` that installs requirements, copies the model and app, and exposes the port the server listens on (uvicorn defaults to 8000). A slim Python base image is recommended for small demos.
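
A possible starting point is sketched below. It assumes that `requirements.txt`, the app module `starter.py`, and the saved model `model.pt` all sit next to the `Dockerfile`. The file names and the Python version tag are illustrative choices, not requirements of the exercise.

```dockerfile
# Assumed layout: requirements.txt, starter.py, and model.pt next to this file.
FROM python:3.11-slim

WORKDIR /app

# Install dependencies first so this layer is reused when only the code changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code and the saved model.
COPY starter.py model.pt ./

EXPOSE 8000

# Bind to 0.0.0.0 so the server is reachable from outside the container.
CMD ["uvicorn", "starter:app", "--host", "0.0.0.0", "--port", "8000"]
```

With this layout, `docker build -t capstone-api .` followed by `docker run -p 8000:8000 capstone-api` should make the API reachable at `http://127.0.0.1:8000` (the image name `capstone-api` is a placeholder).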

## Exercise 3 — Health & Metrics

Add a `/health` endpoint (the starter code already includes a basic one) and simple request latency logging. Optionally integrate a Prometheus client to expose metrics at `/metrics`. Keep monitoring lightweight.
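
Request latency logging can be sketched as a small ASGI middleware. The version below is written against the plain ASGI interface rather than any FastAPI-specific helper, so it has no third-party imports. The class name `LatencyLoggingMiddleware`, the logger name, and the `LATENCIES_MS` buffer are hypothetical names introduced for this example.

```python
import logging
import time
from collections import deque

# Without a configured handler, INFO records from custom loggers are dropped.
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("capstone.latency")

# Bounded in-memory buffer of recent latencies (demo only, not persistent).
LATENCIES_MS = deque(maxlen=1000)


class LatencyLoggingMiddleware:
    """Pure ASGI middleware that logs wall-clock latency for each HTTP request."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        # Pass through non-HTTP traffic (lifespan, websocket) untouched.
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return
        start = time.perf_counter()
        try:
            await self.app(scope, receive, send)
        finally:
            # Record the latency even if the wrapped app raised an exception.
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            LATENCIES_MS.append(elapsed_ms)
            logger.info(
                "%s %s took %.2f ms",
                scope.get("method"),
                scope.get("path"),
                elapsed_ms,
            )
```

It can be attached to the starter app with `app.add_middleware(LatencyLoggingMiddleware)`. If you opt into Prometheus, the `prometheus_client` package provides `make_asgi_app()`, which can be mounted with `app.mount("/metrics", make_asgi_app())`.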

## Deliverables Checklist

- [ ] Minimal FastAPI app or script in `capstone/src/`
- [ ] `Dockerfile` to build and run the API locally
- [ ] `deploy.md` with build/run instructions and sample requests
- [ ] Basic `/health` endpoint and simple latency metric logging