# 📊 Week 14: Monitoring & Chaos Testing

**Learning Objectives:**
1. Set up Prometheus/Grafana monitoring
2. Implement stress testing
3. Build chaos engineering practices
4. Create alerting systems

---

# Section 1: Prometheus Metrics

In [None]:
prometheus_code = '''
# FastAPI with Prometheus
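import time

from fastapi import FastAPI, Request, Response
from prometheus_client import CONTENT_TYPE_LATEST, Counter, Histogram, generate_latest

app = FastAPI()

REQUEST_COUNT = Counter("requests_total", "Total requests", ["method", "endpoint"])
REQUEST_LATENCY = Histogram("request_latency_seconds", "Request latency")

@app.middleware("http")
async def metrics_middleware(request: Request, call_next):
    # Time the full request, including downstream handlers
    start = time.perf_counter()
    response = await call_next(request)

    # Caution: raw URL paths as label values can create many time series
    # when paths contain IDs
    REQUEST_COUNT.labels(request.method, request.url.path).inc()
    REQUEST_LATENCY.observe(time.perf_counter() - start)

    return response

@app.get("/metrics")
def metrics():
    # Serve all registered metrics in the Prometheus text exposition format
    return Response(generate_latest(), media_type=CONTENT_TYPE_LATEST)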
'''
print(prometheus_code)
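
To make the latency histogram more concrete, the following is a minimal pure-Python sketch of cumulative buckets, which is roughly the idea behind what `Histogram.observe` records. The `MiniHistogram` class and its bucket bounds are hypothetical names chosen for illustration; this is not the `prometheus_client` implementation.

```python
import bisect
import math


class MiniHistogram:
    """Toy cumulative-bucket histogram (illustration only)."""

    def __init__(self, bounds=(0.1, 0.5, 1.0)):
        # Upper bounds in seconds; +Inf catches every remaining observation.
        self.bounds = sorted(bounds) + [math.inf]
        self.counts = [0] * len(self.bounds)  # per-bucket, not yet cumulative
        self.total = 0.0  # sum of all observed values
        self.n = 0  # number of observations

    def observe(self, value):
        # Pick the first bucket whose upper bound is >= value ("le" semantics).
        idx = bisect.bisect_left(self.bounds, value)
        self.counts[idx] += 1
        self.total += value
        self.n += 1

    def cumulative(self):
        # Each bucket reports all observations <= its bound, so counts accumulate.
        running = 0
        out = {}
        for bound, count in zip(self.bounds, self.counts):
            running += count
            out[bound] = running
        return out


hist = MiniHistogram()
for latency in (0.05, 0.3, 0.3, 2.0):
    hist.observe(latency)
buckets = hist.cumulative()
```

Because the buckets are cumulative, the `+Inf` bucket always equals the total observation count, and quantiles can only be estimated to the resolution of the chosen bounds.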

# Section 2: Load Testing

In [None]:
locust_file = '''
# locustfile.py
from locust import HttpUser, task, between

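class ChatUser(HttpUser):
    # Each simulated user pauses 1 to 3 seconds between tasks
    wait_time = between(1, 3)

    # Weight 3: picked about three times as often as chat
    @task(3)
    def search(self):
        self.client.post("/search", json={"query": "test"})

    # Weight 1
    @task(1)
    def chat(self):
        self.client.post("/chat", json={"message": "hello"})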
'''
print(locust_file)
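
As a rough sanity check before running the test, the task weights and wait time above can be turned into an expected request rate. The sketch below is back-of-the-envelope arithmetic, not part of Locust; `estimate_rates` and the figure of 100 users are assumptions made for illustration.

```python
def estimate_rates(users, weights, wait_min, wait_max, avg_response_s=0.0):
    """Rough steady-state request rate per task, in requests per second."""
    # between(a, b) draws a uniform wait, so the mean wait is the midpoint.
    mean_wait = (wait_min + wait_max) / 2
    # Each user loops: run one task (taking avg_response_s), then wait.
    per_user_rate = 1.0 / (mean_wait + avg_response_s)
    total_rate = users * per_user_rate
    # Tasks are picked in proportion to their weights.
    total_weight = sum(weights.values())
    return {name: total_rate * w / total_weight for name, w in weights.items()}


# Assumed scenario: 100 users, instant responses (an upper bound on the rate).
rates = estimate_rates(
    users=100, weights={"search": 3, "chat": 1}, wait_min=1, wait_max=3
)
```

With these assumptions the estimate comes to about 37.5 requests per second on `/search` and 12.5 on `/chat`. Real throughput will be lower once response time is non-zero, which is why `avg_response_s` is a parameter.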

# Section 3: Chaos Engineering

In [None]:
chaos_tests = '''
Chaos Engineering Experiments:

1. Network latency injection
   - Add 500ms delay to DB calls
   - Verify timeouts work

2. Kill service instance
   - Stop one container
   - Verify load balancer routes around

3. Memory pressure
   - Limit container memory
   - Verify graceful degradation

4. Dependency failure
   - Block LLM API
   - Verify fallback response
'''
print(chaos_tests)
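
Experiments 1 and 4 can be rehearsed in-process before touching real infrastructure. The sketch below assumes stand-in functions (`fake_db_query`, `fake_llm_call`) and a 200 ms timeout, all hypothetical; a real experiment would inject the faults at the network or container level instead.

```python
import asyncio


async def fake_db_query(injected_delay_s=0.0):
    # Hypothetical stand-in for a DB call; the sleep is the injected latency.
    await asyncio.sleep(injected_delay_s)
    return {"rows": 1}


async def query_with_timeout(injected_delay_s, timeout_s):
    # Experiment 1: the caller should give up once the timeout is exceeded.
    try:
        return await asyncio.wait_for(
            fake_db_query(injected_delay_s), timeout=timeout_s
        )
    except asyncio.TimeoutError:
        return {"error": "db_timeout"}


async def fake_llm_call(blocked):
    # Hypothetical stand-in for the LLM API; "blocked" simulates the outage.
    if blocked:
        raise ConnectionError("LLM API unreachable")
    return "model answer"


async def chat_with_fallback(blocked):
    # Experiment 4: a dependency failure should yield a fallback response.
    try:
        return await fake_llm_call(blocked)
    except ConnectionError:
        return "Sorry, the assistant is temporarily unavailable."


async def run_experiments():
    return {
        "healthy_db": await query_with_timeout(0.0, timeout_s=0.2),
        "slow_db": await query_with_timeout(0.5, timeout_s=0.2),
        "healthy_llm": await chat_with_fallback(blocked=False),
        "blocked_llm": await chat_with_fallback(blocked=True),
    }


results = asyncio.run(run_experiments())
```

Each experiment pairs a fault with an explicit expectation, so a pass or fail is unambiguous. Inside a notebook that already has a running event loop, `asyncio.run` raises an error; use `results = await run_experiments()` there instead.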

# Section 4: Interview Prep

### Q1: What is observability?
**Answer:** The ability to understand a system's internal behavior from its external outputs, typically logs, metrics, and traces.

### Q2: Why chaos engineering?
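**Answer:** To find weaknesses through controlled failure experiments before they cause production outages, and to build confidence that the system degrades and recovers as designed.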

---
# Section 5: Deliverable

**Created:** Monitoring dashboards + stress tests

**Next Week:** Feedback Loop