# Homework Starter — Stage 14: Deployment & Monitoring

Use this template to draft your reflection and (optionally) sketch a dashboard.

## 1) Reflection (200–300 words)

### What went well
- **Clear API surface**: Endpoints were kept minimal (`/health`, `/predict`, `/plot`), which made testing and handoff straightforward.  
- **Deterministic feature set**: Centralizing `FEATURE_COLUMNS` in `src/features.py` reduced train/inference drift.  
- **Artifact loading contract**: Standardized `load_artifacts()` to return `(MODEL, IMPUTER, SCALER, META)`, simplifying app startup checks.  

### What was hard / surprising
- **Import path issues**: Running the app from a subfolder broke `import src`. This was fixed with a **sys.path bootstrap** that searches parent directories for `src/__init__.py`.  
- **Port conflict on macOS**: Port `5000` was already taken by the macOS **AirPlay Receiver** (its responses identify themselves as AirTunes), so requests came back `403` without ever reaching Flask. Switching Flask to port **5050** resolved the issue.  
- **Empty plots**: The “blank image” was really a JSON error response that had been saved with a `.png` extension. Adding robust error handling, logging, and forcing `matplotlib` to use the `Agg` backend fixed this.  
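
The sys.path bootstrap mentioned above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `find_repo_root` and `bootstrap_sys_path` are hypothetical, and the only thing taken from the notes is the idea of searching parent directories for `src/__init__.py`.

```python
import sys
from pathlib import Path
from typing import Optional


def find_repo_root(start: Path, marker: str = "src/__init__.py") -> Optional[Path]:
    """Walk upward from `start`; return the first directory that contains `marker`."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).is_file():
            return candidate
    return None


def bootstrap_sys_path(start: Path) -> Optional[Path]:
    """Put the repo root at the front of sys.path so that `import src` resolves."""
    root = find_repo_root(start)
    if root is not None and str(root) not in sys.path:
        sys.path.insert(0, str(root))
    return root
```

In an app module this would typically be called once, before any `import src`, for example with `bootstrap_sys_path(Path(__file__).parent)`.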

### Key decisions & rationale
- **Pipeline consistency**: Adopted a consistent sequence of transforms (imputer → optional scaler → model). SMOTE is applied only during training, not inference.  
- **Top-level exports**: Re-exported `download_data`, `build_features`, and `FEATURE_COLUMNS` in `src/__init__.py` for cleaner imports in notebooks and the app.  
- **Minimal external state**: Endpoint logic avoids hidden globals beyond loaded artifacts; thresholds are kept in `META` for clarity and reproducibility.  
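
To make the transform order concrete, here is a small sketch of the inference path (imputer, then the optional scaler, then the model, with the decision threshold read from `META`). It assumes the loaded objects expose scikit-learn-style `transform` and `predict_proba` methods; the function name `predict_label` and the `"threshold"` key are illustrative assumptions, not names confirmed by the project.

```python
from typing import Any, Mapping, Optional, Sequence


def predict_label(
    row: Sequence[float],
    model: Any,
    imputer: Any,
    scaler: Optional[Any],
    meta: Mapping[str, Any],
) -> dict:
    """Apply imputer -> optional scaler -> model, then threshold the probability."""
    X = [list(row)]                 # a single sample, shaped as a 2-D batch
    X = imputer.transform(X)
    if scaler is not None:          # the scaler step is optional
        X = scaler.transform(X)
    proba = float(model.predict_proba(X)[0][1])     # probability of the positive class
    threshold = float(meta.get("threshold", 0.5))   # "threshold" is an assumed key name
    return {"proba": proba, "label": int(proba >= threshold)}
```

SMOTE does not appear here on purpose: resampling only changes the training set, so the inference path never needs it.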

### Bugs found & fixes
- **`ModuleNotFoundError: src`** → Fixed with the sys.path bootstrap and by running the app from the repo root.  
- **`ImportError: save_artifacts`** → Cleaned up imports so each name has a single import path: either import it directly (`from src.io import ...`) or re-export it in `__init__.py`, not both.  
- **Blank `/plot` images** → Ensured the route always returns valid PNG bytes and added a fallback “mock series” if yfinance returns empty.  
- **403 errors on `/plot`** → Diagnosed via headers; resolved by moving server to port `5050`.  
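
A client-side guard against the “JSON saved as `.png`” failure can be as small as checking the PNG file signature before writing anything to disk. The sketch below is one possible approach; `is_png` and `save_plot_response` are hypothetical helper names, while the 8-byte signature itself comes from the PNG specification.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the first 8 bytes of every PNG file


def is_png(data: bytes) -> bool:
    """Return True if `data` starts with the PNG file signature."""
    return data.startswith(PNG_SIGNATURE)


def save_plot_response(data: bytes, path: str) -> None:
    """Write `data` to `path` only if it is a PNG; otherwise raise with a body preview."""
    if not is_png(data):
        preview = data[:200].decode("utf-8", errors="replace")
        raise ValueError(f"response is not a PNG, refusing to save; body starts with: {preview!r}")
    with open(path, "wb") as fh:
        fh.write(data)
```

With this in place, an error payload such as `{"error": "no data"}` surfaces as an exception instead of a file that image viewers render as blank.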

### Risks & mitigations
- **Feature order mismatch** → Store and enforce `META["feature_order"]`; validate incoming features length in `/predict`.  
- **Silent data gaps** from yfinance → Added logging, HTTP 404 error handling, and mock fallback for plot previews.  
- **Class imbalance issues with SMOTE** → Added guardrails for single-class splits; advise using larger date ranges or disabling SMOTE for debugging.  
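
The feature-order guard could look something like the sketch below. It assumes `META["feature_order"]` is a list of column names; the helper name `order_features` is hypothetical, and accepting a name-keyed mapping in addition to a positional list is an illustrative extension beyond the length check described above.

```python
from typing import List, Mapping, Sequence, Union


def order_features(
    features: Union[Mapping[str, float], Sequence[float]],
    feature_order: Sequence[str],
) -> List[float]:
    """Return feature values in training order, or raise ValueError on a mismatch."""
    if isinstance(features, Mapping):
        # Named features: reorder by the stored training order.
        missing = [name for name in feature_order if name not in features]
        if missing:
            raise ValueError(f"missing features: {missing}")
        return [float(features[name]) for name in feature_order]
    # Positional features: only the length can be checked.
    if len(features) != len(feature_order):
        raise ValueError(f"expected {len(feature_order)} features, got {len(features)}")
    return [float(v) for v in features]
```

In a `/predict` handler, a `ValueError` raised here would normally be turned into a 4xx response rather than being passed to the model.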

### Observability & monitoring
- **Liveness**: `/health` indicates `ok` or `degraded` based on artifact load.  
- **Logging**: Clear log messages for artifact load, data download, and prediction errors.  
- **Future metrics**: Add request counts, latency, error rates, and model drift monitoring.  
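
As an illustration of the `ok` / `degraded` logic, the body of a `/health` response might be built along these lines. The function name `health_payload` and the `missing` field are assumptions made for this sketch; the scaler is left out of the check because it is optional in the pipeline.

```python
from typing import Any, Optional


def health_payload(model: Optional[Any], imputer: Optional[Any]) -> dict:
    """Report "ok" when the required artifacts are loaded, otherwise "degraded"."""
    required = (("model", model), ("imputer", imputer))
    missing = [name for name, obj in required if obj is None]
    return {"status": "degraded" if missing else "ok", "missing": missing}
```

A Flask view could return this dict as JSON; listing what is missing makes a degraded state easier to debug from a single curl call.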

### Reproducibility & handoff
- **`requirements.txt`** included for reproducibility; pin versions before handoff if strict consistency is required.  
- **README.md** documents setup, run commands, curl examples, and common pitfalls (e.g., macOS port conflicts).  
- **Artifacts contract** documented; if missing, `/predict` clearly returns HTTP 503.  

### Ethical & practical considerations
- **Financial predictions**: Clearly state that outputs are experimental and **not financial advice**. Provide probability outputs to avoid overconfidence in raw predictions.  
- **Data source reliability**: yfinance can lag or change schema; ensure graceful degradation and user messaging.  

### If we had one more week
- Add `/plot/features` to overlay **MA5/MA10/MA20** on top of closing price.  
- Wrap preprocessing and model into a single serialized pipeline to simplify artifact management.  
- Add **input validation** with Pydantic and **OpenAPI docs** (potentially moving to FastAPI).  
- Implement basic monitoring (Prometheus/StatsD) with a small dashboard.  
- Add `/train` endpoint (protected) to support retraining and artifact export end-to-end.  

### Final checklist
- [x] README updated with setup, ports, and curl examples  
- [x] `requirements.txt` in place and installs cleanly  
- [x] `src/` exports a stable API with unambiguous imports  
- [x] Artifacts load path standardized (`artifacts/`)  
- [x] `/health`, `/predict`, `/plot` tested locally on port `5050`  

## 2) Optional: Dashboard Sketch
Describe panels and key charts. You can also attach an image file in your repo (png/pdf).

```python
# Optional helper: simple structure to list metrics
monitoring = {
    'data': ['freshness_minutes', 'null_rate', 'schema_hash'],
    'model': ['rolling_mae_or_auc', 'calibration_error'],
    'system': ['p95_latency_ms', 'error_rate'],
    'business': ['approval_rate', 'bad_rate']
}
monitoring
```
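
For the two `system` metrics, a rough sketch of how the values could be computed from a window of request logs is shown below. The helper names are hypothetical, the percentile uses the nearest-rank method, and counting only 5xx responses as errors is an assumption that a real dashboard may define differently.

```python
from typing import Iterable, Sequence


def percentile_nearest_rank(values: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest value with at least pct% of the data at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(values)
    rank = (pct * len(ordered) + 99) // 100  # ceil(pct / 100 * n) in integer arithmetic
    return float(ordered[rank - 1])


def error_rate(status_codes: Iterable[int]) -> float:
    """Fraction of responses with a 5xx status (assumed definition of an error)."""
    codes = list(status_codes)
    if not codes:
        return 0.0
    return sum(1 for code in codes if code >= 500) / len(codes)
```

For example, `percentile_nearest_rank(latencies_ms, 95)` would feed the `p95_latency_ms` panel and `error_rate(status_codes)` the `error_rate` panel.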