Monitoring FastAPI

James Brucker edited this page Nov 27, 2025 · 3 revisions

This page describes how we expose metrics from the FastAPI web service code in the Prometheus format. This is used for monitoring the application performance and state using Prometheus and Grafana.

Each method below adds a /metrics endpoint that returns the current value of every metric defined in the code.
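To make the output format concrete, here is a minimal sketch that uses prometheus-client (the library behind Methods 2 and 3) to render one counter in the Prometheus text format. The metric name `demo_requests` and the private registry are assumptions made for illustration; they are not part of the application.

```python
from prometheus_client import CollectorRegistry, Counter, generate_latest

# A private registry keeps this demo separate from any real application metrics.
registry = CollectorRegistry()

# Hypothetical counter used only for illustration.
demo_requests = Counter(
    "demo_requests",
    "Requests seen by the demo",
    ["method"],
    registry=registry,
)

# Record two GET requests.
demo_requests.labels(method="GET").inc()
demo_requests.labels(method="GET").inc()

# generate_latest() returns bytes in the Prometheus text exposition format,
# which is what a /metrics endpoint sends back to the scraper.
text = generate_latest(registry).decode("utf-8")
print(text)
```

The printed text contains `# HELP` and `# TYPE` comment lines followed by one line per sample, such as `demo_requests_total{method="GET"} 2.0`. The client library appends the `_total` suffix to counter names.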

Instrument FastAPI Back-end

Method 1: Use prometheus-fastapi-instrumentator

This method is recommended by Deepseek :-) as it provides "comprehensive metrics out of the box with minimal configuration".

  1. Use pip to install the prometheus-fastapi-instrumentator package.

  2. In main.py (where the FastAPI object is created), add:

    from fastapi import FastAPI
    from prometheus_fastapi_instrumentator import Instrumentator
    
    app = FastAPI()
    
    # Instrument the application (pass options to Instrumentator() as needed; see step 3)
    instrumentator = Instrumentator()
    instrumentator.instrument(app)
    instrumentator.expose(app)

  3. (Optional) Advanced configuration.

    from fastapi import FastAPI, Request
    from prometheus_fastapi_instrumentator import Instrumentator, metrics
    
    app = FastAPI()
    
    # Custom instrumentation with more metrics
    instrumentator = Instrumentator(
        should_group_status_codes=True,
        should_ignore_untemplated=True,
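        # With should_respect_env_var=True, metrics are collected only when
        # the ENABLE_METRICS environment variable is set to "true".
        should_respect_env_var=True,
        should_instrument_requests_inprogress=True,
        excluded_handlers=["/metrics", "/health"],
        env_var_name="ENABLE_METRICS",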
        inprogress_name="inprogress",
        inprogress_labels=True,
    )
    
    # Add default metrics
    instrumentator.add(
        metrics.request_size(
            should_include_handler=True,
            should_include_method=True,
            should_include_status=True,
        )
    ).add(
        metrics.response_size(
            should_include_handler=True,
            should_include_method=True,
            should_include_status=True,
        )
    ).add(
        metrics.latency(
            # Histogram bucket boundaries in seconds (assumed values; tune for your service)
            buckets=(0.01, 0.05, 0.1, 0.5, 1.0, 5.0),
            should_include_handler=True,
            should_include_method=True,
            should_include_status=True,
        )
    ).add(
        metrics.requests(
            should_include_handler=True,
            should_include_method=True,
            should_include_status=True,
        )
    )
    
    # Instrument the app
    instrumentator.instrument(app)
    instrumentator.expose(app)

Method 2: Use prometheus-client Directly

This method defines the metrics directly with prometheus-client, which makes it possible to add metrics beyond the standard ones collected by prometheus-fastapi-instrumentator.

This requires the prometheus-client package (which prometheus-fastapi-instrumentator also depends on).

Create the custom metrics in main.py:

from fastapi import FastAPI, Request, Response
from prometheus_client import generate_latest, CONTENT_TYPE_LATEST, Counter, Histogram, Gauge
import time

app = FastAPI()

# Define metrics
REQUEST_COUNT = Counter(
    'http_requests_total',
    'Total HTTP Requests',
    ['method', 'endpoint', 'status_code']
)

REQUEST_LATENCY = Histogram(
    'http_request_duration_seconds',
    'HTTP Request Latency',
    ['method', 'endpoint']
)

IN_PROGRESS = Gauge(
    'http_requests_in_progress',
    'HTTP Requests in Progress',
    ['method', 'endpoint']
)

@app.middleware("http")
async def monitor_requests(request: Request, call_next):
    method = request.method
    endpoint = request.url.path
    
    # Skip metrics endpoint
    if endpoint == "/metrics":
        return await call_next(request)
    
    IN_PROGRESS.labels(method=method, endpoint=endpoint).inc()
    start_time = time.time()
    
    try:
        response = await call_next(request)
        status_code = response.status_code
    except Exception:
        # Count unhandled errors as HTTP 500, then re-raise with the original traceback
        status_code = 500
        raise
    finally:
        latency = time.time() - start_time
        REQUEST_LATENCY.labels(method=method, endpoint=endpoint).observe(latency)
        REQUEST_COUNT.labels(method=method, endpoint=endpoint, status_code=status_code).inc()
        IN_PROGRESS.labels(method=method, endpoint=endpoint).dec()
    
    return response

@app.get("/metrics")
async def metrics():
    return Response(
        content=generate_latest(),
        media_type=CONTENT_TYPE_LATEST
    )
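One caveat about the middleware above: it uses `request.url.path` as the `endpoint` label, so every concrete URL (for example /readings/1/ and /readings/2/) becomes its own time series. A possible mitigation, sketched here under the assumption that variable path segments are purely numeric IDs, is to normalize the path before using it as a label. The helper name `normalize_path` and the `{id}` placeholder are hypothetical.

```python
import re

# Matches a path segment made only of digits, e.g. the "1" in /readings/1/
_NUMERIC_SEGMENT = re.compile(r"(?<=/)\d+(?=/|$)")


def normalize_path(path: str) -> str:
    """Replace numeric path segments with a placeholder (hypothetical helper)."""
    return _NUMERIC_SEGMENT.sub("{id}", path)


print(normalize_path("/readings/1/"))   # /readings/{id}/
print(normalize_path("/readings/42"))   # /readings/{id}
print(normalize_path("/sources/"))      # /sources/
```

In the middleware, this would mean writing `endpoint = normalize_path(request.url.path)`. Paths with non-numeric variables (slugs, UUIDs) would need a different rule.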

Method 3: Complete Custom Implementation

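This method also requires the prometheus-client package. It registers every metric in a dedicated CollectorRegistry, so /metrics returns only the metrics defined here and not the client library's default process and platform collectors.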

from fastapi import FastAPI, Request, Response
from prometheus_client import (
    generate_latest, CONTENT_TYPE_LATEST, Counter, Histogram,
    Gauge, CollectorRegistry
)
import time

app = FastAPI(title="My FastAPI App")

# Custom metrics registry
registry = CollectorRegistry()

# Application metrics
REQUEST_COUNT = Counter(
    'fastapi_requests_total',
    'Total count of HTTP requests',
    ['method', 'endpoint', 'status_code'],
    registry=registry
)

REQUEST_DURATION = Histogram(
    'fastapi_request_duration_seconds',
    'HTTP request duration in seconds',
    ['method', 'endpoint'],
    buckets=[0.01, 0.05, 0.1, 0.5, 1.0, 5.0],
    registry=registry
)

REQUESTS_IN_PROGRESS = Gauge(
    'fastapi_requests_in_progress',
    'Number of HTTP requests in progress',
    ['method', 'endpoint'],
    registry=registry
)

ACTIVE_USERS = Gauge(
    'fastapi_active_users',
    'Number of active users',
    registry=registry
)

# Business logic metrics
ITEMS_CREATED = Counter(
    'fastapi_items_created_total',
    'Total number of items created',
    registry=registry
)

ERROR_COUNT = Counter(
    'fastapi_errors_total',
    'Total number of errors',
    ['error_type'],
    registry=registry
)

@app.middleware("http")
async def metrics_middleware(request: Request, call_next):
    method = request.method
    endpoint = request.url.path
    
    if endpoint == "/metrics":
        return await call_next(request)
    
    REQUESTS_IN_PROGRESS.labels(method=method, endpoint=endpoint).inc()
    start_time = time.time()
    
    try:
        response = await call_next(request)
        status_code = response.status_code
    except Exception as e:
        # Count unhandled errors as HTTP 500 and record the exception type
        status_code = 500
        ERROR_COUNT.labels(error_type=type(e).__name__).inc()
        raise
    finally:
        duration = time.time() - start_time
        REQUEST_DURATION.labels(method=method, endpoint=endpoint).observe(duration)
        REQUEST_COUNT.labels(method=method, endpoint=endpoint, status_code=status_code).inc()
        REQUESTS_IN_PROGRESS.labels(method=method, endpoint=endpoint).dec()
    
    return response

@app.get("/metrics")
async def metrics():
    return Response(
        content=generate_latest(registry),
        media_type=CONTENT_TYPE_LATEST
    )
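The listing above defines ITEMS_CREATED and ACTIVE_USERS but never updates them; business metrics only change when application code calls them. The sketch below shows one way that could look. The functions `create_item`, `user_logged_in`, and `user_logged_out` are hypothetical stand-ins for real route handlers, and the metrics are redefined here only so that the example runs on its own.

```python
from prometheus_client import CollectorRegistry, Counter, Gauge

registry = CollectorRegistry()

ITEMS_CREATED = Counter(
    "fastapi_items_created_total",
    "Total number of items created",
    registry=registry,
)
ACTIVE_USERS = Gauge(
    "fastapi_active_users",
    "Number of active users",
    registry=registry,
)


def create_item(name: str) -> dict:
    """Hypothetical handler body: do the work, then count it."""
    item = {"name": name}
    ITEMS_CREATED.inc()
    return item


def user_logged_in() -> None:
    """Hypothetical login hook: one more active user."""
    ACTIVE_USERS.inc()


def user_logged_out() -> None:
    """Hypothetical logout hook: one fewer active user."""
    ACTIVE_USERS.dec()


create_item("sensor-a")
create_item("sensor-b")
user_logged_in()
user_logged_in()
user_logged_out()

# Read the current values back from the registry.
items_created = registry.get_sample_value("fastapi_items_created_total")
active_users = registry.get_sample_value("fastapi_active_users")
print(items_created, active_users)  # 2.0 1.0
```

A counter only ever goes up, so it suits events such as "item created". A gauge can go up and down, so it suits current state such as the number of active users.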

Docker Configuration for Backend

Dockerfile

FROM python:3.9-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

EXPOSE 8000
# Also expose 8001 if the metrics endpoint is served on a different port
EXPOSE 8001

# Remove "--reload" for production use; uvicorn ignores "--workers" while "--reload" is set
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "4", "--reload"]

Docker Compose Configuration File

version: '3.8'
services:
  fastapi-app:
    build: .
    # We use nginx as proxy server for backend, so
    # in production do NOT expose port 8000.
    # This is for dev and testing only.
    ports:
      - "8000:8000"
    environment:
      - PROMETHEUS_MULTIPROC_DIR=/tmp
    volumes:
      - /tmp:/tmp

  prometheus:
    image: prom/prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./monitor/prometheus.yml:/etc/prometheus/prometheus.yml
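The compose file sets PROMETHEUS_MULTIPROC_DIR, and the Dockerfile starts several uvicorn workers. In prometheus-client's multiprocess mode, each worker process writes its samples to files in that directory, and the /metrics handler is expected to aggregate them through a MultiProcessCollector; otherwise a scrape may reflect only the worker that happened to answer it. The following is a minimal sketch of that aggregation, not the project's actual handler. The metric name `demo_worker_requests_total` and the function `render_all_workers` are hypothetical, and the temporary-directory fallback exists only so that the sketch runs outside the container.

```python
import os
import tempfile

# The variable has to be set before prometheus_client is imported.
# Assumption: in the container it is set by docker compose; the fallback
# below only makes this sketch runnable on its own.
os.environ.setdefault("PROMETHEUS_MULTIPROC_DIR", tempfile.mkdtemp())

from prometheus_client import (
    CollectorRegistry,
    Counter,
    generate_latest,
    multiprocess,
)

# Hypothetical metric; in multiprocess mode its value is stored in a
# per-process file inside PROMETHEUS_MULTIPROC_DIR.
REQUESTS = Counter("demo_worker_requests_total", "Requests handled by any worker")


def render_all_workers() -> bytes:
    """Aggregate the samples written by every worker process."""
    registry = CollectorRegistry()
    multiprocess.MultiProcessCollector(registry)
    return generate_latest(registry)


REQUESTS.inc()
payload = render_all_workers().decode("utf-8")
print(payload)
```

In a FastAPI app, the /metrics handler would return `render_all_workers()` as the response content. Gauges additionally accept a `multiprocess_mode` argument that controls how the per-worker values are combined.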

Prometheus Configuration to Gather Back-end Metrics

# prometheus.yml
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'fastapi-app'
    static_configs:
      - targets: ['fastapi-app:8000']
    metrics_path: '/metrics'
    scrape_interval: 10s

Running and Testing

Start the application

  1. Run app natively

    uvicorn main:app --host 0.0.0.0 --port 8000

    or...

  2. Run app and Prometheus in Docker.

    docker compose up -d fastapi-app prometheus

Test the metrics endpoint

Use a web browser or curl:

  1. FastAPI monitoring endpoint: http://localhost:8000/metrics

  2. Prometheus:
    • Prometheus's own metrics: http://localhost:9090/metrics
    • Scrape targets: http://localhost:9090/targets
    • Run a query in the browser: http://localhost:9090/query
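The same check can be scripted. The sketch below uses only the Python standard library to download the metrics text and list the distinct sample names it contains. The function names `fetch_metrics` and `metric_names` are hypothetical, and the parsing is deliberately simple: it assumes the sample name ends at the first `{` or space.

```python
import urllib.request


def fetch_metrics(url: str = "http://localhost:8000/metrics") -> str:
    """Download the raw metrics text (requires the app to be running)."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


def metric_names(text: str) -> set[str]:
    """Return the distinct sample names found in Prometheus text output."""
    names = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The sample name ends at the first "{" (labels) or space (value).
        names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return names


# Offline demonstration with a small hand-written sample.
sample = """\
# HELP http_requests_total Total HTTP Requests
# TYPE http_requests_total counter
http_requests_total{method="GET",endpoint="/",status_code="200"} 3.0
http_requests_in_progress{method="GET",endpoint="/"} 0.0
"""
print(sorted(metric_names(sample)))
# ['http_requests_in_progress', 'http_requests_total']
```

With the application running, `metric_names(fetch_metrics())` lists what the live endpoint exposes.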

Generate some traffic

Use the interactive OpenAPI (Swagger UI) page at http://localhost:8000/docs/

or

# Make some requests to generate metrics
curl http://localhost:8000/
# Get data sources
curl http://localhost:8000/sources/
# Get readings for data source #1
curl http://localhost:8000/readings/1/

Key Metrics

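  • http_requests_total - Request counts by method, endpoint, and status code
  • http_request_duration_seconds - Request latency distribution (histogram)
  • http_requests_in_progress - Number of requests currently in progress

These are the names defined in Method 2. Method 3 uses the prefix fastapi_ instead of http_ (for example, fastapi_requests_total).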
