# Lesson 3: Prometheus & Grafana Basics

**Module 6: Monitoring & CI/CD**  
**Estimated Time**: 60 mins  
**Difficulty**: Advanced

---

## 🎯 Learning Objectives

By the end of this lesson, you will:

✅ Understand **System Monitoring** (Latency, Throughput, Errors).  
✅ Use the `prometheus_client` library to expose metrics from Python.  
✅ Learn the architecture of **Prometheus** (Scraping) and **Grafana** (Visualizing).  
✅ Simulate a metrics endpoint.

---

## 📚 Table of Contents

1. [Functional vs Operational Monitoring](#1-monitoring-types)
2. [Prometheus Architecture](#2-prometheus)
3. [Hands-On: Exposing Metrics from Python](#3-hands-on)
4. [Interview Preparation](#4-interview)

---

### 🛠️ Setup
We need `prometheus_client`.

In [None]:
pip install prometheus_client -q

## 1. Functional vs Operational Monitoring

| Monitoring Type | Questions Answered | Tools |
|---|---|---|
| **Operational** | Is the server up? Is it slow? Is it running out of RAM? | Prometheus, Grafana, Datadog |
| **Functional** | Is the model predicting garbage? Is the data drifting? | Evidently AI, Arize, Fiddler |

## 2. Prometheus Architecture

Prometheus is a monitoring system built around a **Time-Series Database (TSDB)**: it stores each metric as a stream of timestamped values.

### How it works (Pull Model):
1. **You** expose a `/metrics` endpoint from your app (for example, a FastAPI service).
2. **Prometheus Server** scrapes `/metrics` at a configured interval (every 15 seconds in this example) and stores the samples.
3. **Grafana** queries Prometheus to draw charts.
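
The pull model above can be sketched with only the standard library. This is a minimal illustration, not how `prometheus_client` or Prometheus itself is implemented: the `STATE` dictionary, `MetricsHandler`, and `scrape` are hypothetical names, and the "scraper" is a single HTTP GET standing in for one Prometheus scrape.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process state that the app updates as it serves traffic.
STATE = {"requests": 0}


class MetricsHandler(BaseHTTPRequestHandler):
    """Serves the current counter value in the Prometheus text format."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = (
            "# HELP app_requests_total Total number of requests\n"
            "# TYPE app_requests_total counter\n"
            f"app_requests_total {STATE['requests']}\n"
        ).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the notebook output quiet


def scrape(url):
    """Pull the metrics text once, as a scraper would on every interval."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


# Port 0 asks the OS for any free port, so this does not clash with port 8000.
server = HTTPServer(("127.0.0.1", 0), MetricsHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/metrics"

STATE["requests"] = 7  # the app does some work between scrapes
scraped = scrape(url)
print(scraped)

server.shutdown()
server.server_close()
```

The app never contacts the monitoring side; it only answers when asked. That is the property the pull model relies on.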

## 3. Hands-On: Exposing Metrics from Python

We will simulate a web server and expose:
1. **Counter**: Total Requests (Only goes up).
2. **Gauge**: Current Memory Usage (Goes up and down).
3. **Histogram**: Latency (Buckets).
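
To make "Buckets" concrete, here is a rough sketch of how a Prometheus-style histogram counts observations. Buckets are cumulative: each one counts every observation less than or equal to its upper bound (`le`), and a final `+Inf` bucket counts everything. The bounds and the `bucketize` helper below are assumptions for illustration; the real library uses its own default bounds.

```python
import bisect

# Hypothetical bucket upper bounds, in seconds.
BUCKET_BOUNDS = [0.1, 0.25, 0.5, 1.0]


def bucketize(observations, bounds=BUCKET_BOUNDS):
    """Return (cumulative counts per `le` bound, sum, count)."""
    per_bucket = [0] * (len(bounds) + 1)  # the last slot is the +Inf bucket
    for value in observations:
        # bisect_left returns the index of the first bound >= value,
        # i.e. the smallest bucket whose `le` still includes the value.
        per_bucket[bisect.bisect_left(bounds, value)] += 1

    cumulative = []
    running = 0
    for bucket_count in per_bucket:
        running += bucket_count
        cumulative.append(running)

    labels = [str(b) for b in bounds] + ["+Inf"]
    return dict(zip(labels, cumulative)), sum(observations), len(observations)


buckets, total, count = bucketize([0.05, 0.2, 0.2, 0.4, 2.0])
print(buckets)  # {'0.1': 1, '0.25': 3, '0.5': 4, '1.0': 4, '+Inf': 5}
print(count)    # 5
```

Because the buckets are cumulative, the `+Inf` bucket always equals the total observation count.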

In [None]:
from prometheus_client import start_http_server, Counter, Gauge, Histogram
import time
import random

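# 1. Define Metrics
# Note: re-running this cell in the same kernel raises
# "ValueError: Duplicated timeseries in CollectorRegistry"; restart the kernel first.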
REQUEST_COUNT = Counter('app_requests_total', 'Total number of requests')
MEMORY_USAGE = Gauge('app_memory_usage_bytes', 'Current memory usage')
LATENCY = Histogram('app_request_latency_seconds', 'Request latency')

# 2. Start Metrics Server
# In a real app, this runs alongside FastAPI. Uncomment the next line to
# serve the metrics on port 8000.
# start_http_server(8000)

# 3. Simulate Traffic
print("Simulating traffic: 5 requests, 0.5 s apart...")

for i in range(5):
    # Increment Counter
    REQUEST_COUNT.inc()
    
    # Set Gauge
    mem = random.randint(200, 500) * 1024 * 1024 # 200-500 MB
    MEMORY_USAGE.set(mem)
    
    # Observe Histogram
    latency = random.random() * 0.5 # 0.0 - 0.5 seconds
    LATENCY.observe(latency)
    
    print(f"Request {i+1}: Latency={latency:.2f}s, Mem={mem/1024/1024:.0f}MB")
    time.sleep(0.5)

print("Done. If you uncommented start_http_server(8000), the metrics are at http://localhost:8000/metrics")
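
If you would rather not start a server, one way to see what a scrape would return is to render the metrics as text with `generate_latest`. The sketch below uses a separate `CollectorRegistry` and `demo_`-prefixed metric names (both chosen here for illustration) so it does not collide with the metrics defined above and can be re-run safely.

```python
from prometheus_client import CollectorRegistry, Counter, Gauge, generate_latest

# A private registry: re-running this cell creates a fresh one each time,
# which avoids the "Duplicated timeseries" error of the default registry.
registry = CollectorRegistry()

# Illustrative metric names, not the ones used in the cell above.
demo_requests = Counter(
    "demo_requests_total", "Total number of requests", registry=registry
)
demo_memory = Gauge(
    "demo_memory_usage_bytes", "Current memory usage", registry=registry
)

for _ in range(3):
    demo_requests.inc()
demo_memory.set(256 * 1024 * 1024)  # 256 MB

# generate_latest returns bytes in the Prometheus text exposition format.
exposition = generate_latest(registry).decode("utf-8")
print(exposition)
```

The printed text is the same kind of payload that `/metrics` serves, with a `# HELP` and `# TYPE` line ahead of each metric's samples.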

## 4. Interview Preparation

**Q1: Why use Prometheus (Pull) vs pushing metrics?**  
*A1: With pull, Prometheus controls the scrape rate, so a flood of apps pushing data cannot overwhelm the server. It also makes it easier to tell if an app is DOWN: a failed scrape is itself a signal, which Prometheus records as `up` = 0 for that target.*

**Q2: What is the difference between a Counter and a Gauge?**  
*A2: A Counter only increases while the process runs and resets to zero on restart (e.g., Total Errors). A Gauge can go up and down (e.g., CPU Temperature, Memory Usage).*
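
A follow-up worth having ready: because a Counter never decreases on its own, a drop between two scrapes can be read as a process restart. The sketch below shows that idea in simplified form. The `counter_increase` helper is hypothetical, and Prometheus's own `rate()` and `increase()` functions also extrapolate over the time window, which this omits.

```python
def counter_increase(samples):
    """Total increase across scraped counter values, tolerating resets."""
    increase = 0.0
    for prev, curr in zip(samples, samples[1:]):
        if curr >= prev:
            increase += curr - prev
        else:
            # The value dropped: assume the process restarted and the
            # counter climbed from 0 to `curr` since then.
            increase += curr
    return increase


scraped_values = [10, 14, 20, 3, 9]  # the drop from 20 to 3 is a restart
total_increase = counter_increase(scraped_values)
print(total_increase)  # 19.0
```

The same reasoning does not work for a Gauge, where a drop is a normal reading rather than a reset.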