Zero-glue observability for FastAPI.
fastapi-observer gives you structured JSON logs, request correlation, Prometheus metrics, OpenTelemetry tracing, security redaction presets, and runtime controls in one install step and one function call.
Supported Python versions: 3.10 to 3.14
| Component | Supported / Tested |
|---|---|
| Python | 3.10 to 3.14 (CI matrix) |
| FastAPI | >=0.129.0 |
| Starlette | >=0.52.1 |
| pydantic-settings | >=2.10.1 |
| Prometheus backend | prometheus-client>=0.24.1 (optional extra) |
| OpenTelemetry | opentelemetry-api/sdk/exporter>=1.39.1 (optional extra) |
| Loguru bridge | loguru>=0.7.2 (optional extra) |
Most FastAPI services eventually need the same observability plumbing:
- Structured JSON logging
- Request and trace correlation
- Metrics for dashboards and alerts
- OpenTelemetry setup
- Redaction/sanitization for sensitive data
- Runtime controls for incident response
Teams usually implement this as custom glue code in every service. That costs engineering time and creates drift between services.
fastapi-observer replaces this repeated wiring with a consistent, secure-by-default setup.
If this library saves you engineering time, you can support maintenance here:
After one call to install_observability():
| Capability | Included | Default |
|---|---|---|
| Structured JSON logs | Yes | Enabled |
| Request ID correlation | Yes | Enabled |
| Trace/span IDs in logs | Yes (with OTel) | Off until OTel enabled |
| Prometheus /metrics | Yes | Off until metrics_enabled=True |
| Sensitive-data redaction | Yes | Enabled |
| Security presets (strict, pci, gdpr) | Yes | Available |
| Runtime control endpoint | Yes | Off until enabled |
| Plugin hooks (enrichers, filters, metric hooks) | Yes | Available |
# Core (logging + metrics + security)
pip install fastapi-observer
# Prometheus metrics support
pip install "fastapi-observer[prometheus]"
# Loguru coexistence bridge support
pip install "fastapi-observer[loguru]"
# OpenTelemetry tracing/logs support
pip install "fastapi-observer[otel]"
# Everything
pip install "fastapi-observer[all]"Import path:
import fastapiobserverfrom fastapi import FastAPI
from fastapiobserver import ObservabilitySettings, install_observability
app = FastAPI()
settings = ObservabilitySettings(
app_name="orders-api",
service="orders",
environment="production",
version="0.1.0",
metrics_enabled=True,
)
install_observability(app, settings)
@app.get("/orders/{order_id}")
def get_order(order_id: int) -> dict[str, int]:
return {"order_id": order_id}Run:
uvicorn main:app --reloadNow you have:
- Structured request logs on every request
- Request ID propagation
- Sanitized event payloads
- Prometheus metrics at /metrics
| Protection | Default | Why |
|---|---|---|
| Body logging | OFF | Avoid leaking request/response secrets |
| Sensitive key masking | ON | Protect fields like password, token, secret |
| Sensitive header masking | ON | Protect authorization, cookie, x-api-key |
| Query string in logged path | Excluded | Prevent accidental token leakage |
| Request ID trust boundary | Trusted CIDRs only | Prevent spoofed correlation IDs |
from fastapiobserver import SecurityPolicy
# Strictest option: drop sensitive values and keep minimal safe headers
strict_policy = SecurityPolicy.from_preset("strict")
# PCI-focused redaction fields
pci_policy = SecurityPolicy.from_preset("pci")
# GDPR-focused hashed PII fields
gdpr_policy = SecurityPolicy.from_preset("gdpr")

Use a preset in installation:

install_observability(app, settings, security_policy=SecurityPolicy.from_preset("pci"))

If your compliance model is "log only approved fields", use allowlists:
from fastapiobserver import SecurityPolicy
policy = SecurityPolicy(
header_allowlist=("x-request-id", "content-type", "user-agent"),
event_key_allowlist=("method", "path", "status_code"),
)

To capture request bodies explicitly (off by default), opt in and restrict media types:

policy = SecurityPolicy(
log_request_body=True,
body_capture_media_types=("application/json",),
)

Use runtime controls when you need higher log verbosity or different trace sampling during an incident.
export OBSERVABILITY_CONTROL_TOKEN="replace-me"

from fastapiobserver import RuntimeControlSettings, install_observability
runtime_control = RuntimeControlSettings(enabled=True)
install_observability(app, settings, runtime_control_settings=runtime_control)

Inspect current runtime values:
curl -X GET http://localhost:8000/_observability/control \
-H "Authorization: Bearer replace-me"Update runtime values:
curl -X POST http://localhost:8000/_observability/control \
-H "Authorization: Bearer replace-me" \
-H "Content-Type: application/json" \
-d '{"log_level":"DEBUG","trace_sampling_ratio":0.25}'What changes immediately:
- Root logger level (and uvicorn loggers)
- Dynamic OTel trace sampling ratio
from fastapiobserver import (
OTelLogsSettings,
OTelMetricsSettings,
OTelSettings,
install_observability,
)
otel_settings = OTelSettings(
enabled=True,
service_name="orders-api",
service_version="2.0.0",
environment="production",
otlp_endpoint="http://localhost:4317",
protocol="grpc", # or "http/protobuf"
trace_sampling_ratio=1.0,
extra_resource_attributes={
"k8s.namespace": "prod",
"team": "backend",
},
)
otel_logs_settings = OTelLogsSettings(
enabled=True,
logs_mode="both", # "local_json", "otlp", or "both"
otlp_endpoint="http://localhost:4317",
protocol="grpc",
)
otel_metrics_settings = OTelMetricsSettings(
enabled=True,
otlp_endpoint="http://localhost:4317",
protocol="grpc", # or "http/protobuf"
export_interval_millis=60000,
)
install_observability(
app,
settings,
otel_settings=otel_settings,
otel_logs_settings=otel_logs_settings,
otel_metrics_settings=otel_metrics_settings,
)

Design details:
- Reuses an externally configured tracer provider if one already exists.
- Injects trace IDs into application logs for log-trace correlation.
- Supports runtime sampling updates through the control plane.
- Sends OTel logs in OTLP mode with the same sanitization policy.
- Supports optional OTLP metrics export for unified OTel backends.
- Registers graceful shutdown hooks to flush provider buffers on app exit.
inject_trace_headers() uses OpenTelemetry propagation, so it forwards
traceparent, tracestate, and baggage when baggage is present in the active context.
from opentelemetry import baggage
from opentelemetry.context import attach, detach
from fastapiobserver import inject_trace_headers
token = attach(baggage.set_baggage("tenant_id", "acme"))
try:
headers = inject_trace_headers({})
# headers["baggage"] == "tenant_id=acme"
finally:
    detach(token)

A single install_observability() call sets up:

- Structured logging pipeline (JSON formatter + bounded async queue handler).
- Metrics backend and /metrics endpoint when metrics are enabled.
- OTel tracing setup when OTel is enabled.
- Optional OTel logs/metrics setup when OTLP settings are enabled.
- Request logging middleware with sanitization and context cleanup.
- Runtime control endpoint when runtime control is enabled.
Request path lifecycle (high-level):
Request arrives
-> request ID / trace context resolved
-> app handler executes
-> response classified (ok/client_error/server_error/exception)
-> payload sanitized by policy
-> log emitted + metrics recorded
-> context cleared
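The flow above follows the shape of an ASGI middleware. Below is an illustrative sketch only, not the library's actual middleware (which adds sanitization policy, trusted-proxy handling, and metric hooks); all names in it are hypothetical.

# Illustrative lifecycle sketch, not fastapi-observer internals.
import logging
import time
import uuid

from starlette.types import ASGIApp, Message, Receive, Scope, Send

logger = logging.getLogger("lifecycle_sketch")


class LifecycleSketchMiddleware:
    def __init__(self, app: ASGIApp) -> None:
        self.app = app

    async def __call__(self, scope: Scope, receive: Receive, send: Send) -> None:
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        request_id = str(uuid.uuid4())       # request ID / trace context resolved
        started = time.perf_counter()
        status = {"code": 500}

        async def send_wrapper(message: Message) -> None:
            if message["type"] == "http.response.start":
                status["code"] = message["status"]
            await send(message)

        try:
            await self.app(scope, receive, send_wrapper)   # app handler executes
            outcome = "server_error" if status["code"] >= 500 else (
                "client_error" if status["code"] >= 400 else "ok")
        except Exception:
            outcome = "exception"
            raise
        finally:
            duration_ms = round((time.perf_counter() - started) * 1000, 3)
            # payload would be sanitized by policy here, then log emitted + metrics recorded
            logger.info("request.completed", extra={
                "request_id": request_id,
                "path": scope.get("path"),
                "status_code": status["code"],
                "duration_ms": duration_ms,
                "error_type": outcome,
            })
            # context cleared (contextvars reset in the real middleware)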
The project is now organized as focused subpackages instead of large monolithic modules:
- fastapiobserver/logging/: formatter, queueing, filters, setup lifecycle, sink circuit-breakers.
- fastapiobserver/middleware/: request logging orchestration, context, IP resolution, headers, body capture, metrics hooks.
- fastapiobserver/sinks/: sink protocol, registry/discovery, built-ins, factory wiring, Logtail + DLQ implementation.
- fastapiobserver/metrics/: backend contracts/registry/builder/endpoint, Prometheus integration subpackage.
- fastapiobserver/security/: policy/settings models, normalization helpers, redaction engine, trusted-proxy utilities.
- fastapiobserver/otel/: OTel settings/resource/tracing/logs/metrics/lifecycle helpers.
Public imports remain backward-compatible via package facades (__init__.py re-exports).
{
"timestamp": "2026-02-18T10:30:00.000000+00:00",
"level": "INFO",
"logger": "fastapiobserver.middleware",
"message": "request.completed",
"app_name": "orders-api",
"service": "orders",
"environment": "production",
"version": "0.1.0",
"log_schema_version": "1.0.0",
"library": "fastapiobserver",
"request_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"trace_id": "0af7651916cd43dd8448eb211c80319c",
"span_id": "b7ad6b7169203331",
"event": {
"method": "GET",
"path": "/orders/42",
"status_code": 200,
"http.request.method": "GET",
"url.path": "/orders/42",
"http.response.status_code": 200,
"duration_ms": 3.456,
"client_ip": "10.0.0.1",
"error_type": "ok"
}
}

On exception logs, a structured error object is included for indexed queries. It carries a stable AST-based fingerprint hash that ignores transient memory addresses and exact line numbers, so you can alert on recurring errors directly in your search backend without extra dependencies.
{
"error": {
"type": "RuntimeError",
"message": "boom",
"stacktrace": "Traceback (most recent call last): ...",
"fingerprint": "a1b2c3d4e5f67890abcd12345678bbcc"
}
}

This section is deployment-first. A new engineer should be able to ship this stack without reading the source code.
flowchart LR
A["FastAPI services (fastapi-observer)"] --> C["OTel Collector"]
C --> D["Tempo (traces)"]
C --> E["Loki (logs)"]
A --> F["Prometheus (/metrics scrape)"]
F --> G["Grafana"]
D --> G
E --> G
receivers:
otlp:
protocols:
grpc:
endpoint: 0.0.0.0:4317
http:
endpoint: 0.0.0.0:4318
processors:
memory_limiter:
limit_mib: 512
spike_limit_mib: 128
check_interval: 5s
batch:
send_batch_size: 512
timeout: 5s
exporters:
otlphttp/tempo:
endpoint: http://tempo:4318
otlphttp/loki:
endpoint: http://loki:3100/otlp
service:
pipelines:
traces:
receivers: [otlp]
processors: [memory_limiter, batch]
exporters: [otlphttp/tempo]
logs:
receivers: [otlp]
processors: [memory_limiter, batch]
      exporters: [otlphttp/loki]

Suggested migration plan:

- Baseline current service SLOs before migration (latency, error rate, availability).
- Enable fastapi-observer in one service with conservative settings (no body capture); see the settings sketch after this list.
- Run a canary rollout (5-10% of traffic) and compare p95 latency, 5xx rate, and log/trace pipeline health.
- Expand rollout to all replicas/services after 24-48h stable canary.
- Enable advanced controls in phases: security presets, allowlists, runtime control plane, OTLP logs mode.
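A minimal sketch of such conservative canary settings, using only options shown elsewhere in this README (body capture stays off by default, metrics on, tracing sampled lightly); the endpoint and exact values are illustrative starting points, not recommendations:

from fastapi import FastAPI
from fastapiobserver import ObservabilitySettings, OTelSettings, install_observability

app = FastAPI()

settings = ObservabilitySettings(
    app_name="orders-api",
    service="orders",
    environment="production",
    version="1.0.0",
    metrics_enabled=True,        # dashboards and canary comparison from day one
)

# Keep tracing cheap during the canary; raise later via the runtime control plane.
otel_settings = OTelSettings(
    enabled=True,
    service_name="orders-api",
    otlp_endpoint="http://otel-collector:4317",  # assumed collector address
    trace_sampling_ratio=0.05,
)

# The default SecurityPolicy already keeps body logging off and masking on.
install_observability(app, settings, otel_settings=otel_settings)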
| Failure mode | Expected behavior | Immediate action |
|---|---|---|
| OTel Collector down | App still serves traffic; local logs still available if OTEL_LOGS_MODE=both | Fail over the Collector or temporarily switch to local_json mode |
| Tempo down | Traces unavailable; logs/metrics continue | Restore Tempo, keep incident correlation via logs |
| Loki down | Logs unavailable in Grafana; metrics/traces continue | Restore Loki, use app stdout logs temporarily |
| Prometheus down | No metrics/alerts; app traffic unaffected | Restore Prometheus and alertmanager path |
| High cardinality on paths | Prometheus pressure increases | Use route templates and exclude noisy paths |
| Spoofed forwarded headers | Incorrect client IP/request ID trust | Tighten OBS_TRUSTED_CIDRS and proxy chain config |
Recommended SLOs:
- Availability: >= 99.9% over 30 days
- p95 latency: < 500 ms for core APIs
- 5xx rate: < 1% per service
- Error-budget burn alerting: fast burn (1h), slow burn (6h) (see the burn-rate sketch after the starter queries below)
Starter alert queries:
# 5xx rate per service (5 minutes)
sum(rate(http_requests_total{status_code=~"5.."}[5m])) by (service)
# p95 latency per service
histogram_quantile(
0.95,
sum(rate(http_request_duration_seconds_bucket[5m])) by (le, service)
)
# Traffic drop detection
sum(rate(http_requests_total[5m])) by (service)
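For the error-budget burn alerts listed above, a common multiwindow burn-rate sketch, assuming a 99.9% availability SLO and the same http_requests_total metric used in the starter queries (thresholds follow the standard SRE multiwindow pattern and should be tuned to your SLO):

# Fast burn: >14.4x budget burn over 1h (page)
(
  sum(rate(http_requests_total{status_code=~"5.."}[1h])) by (service)
  /
  sum(rate(http_requests_total[1h])) by (service)
) > (14.4 * 0.001)

# Slow burn: >6x budget burn over 6h (ticket)
(
  sum(rate(http_requests_total{status_code=~"5.."}[6h])) by (service)
  /
  sum(rate(http_requests_total[6h])) by (service)
) > (6 * 0.001)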
- Confirm blast radius in Grafana: affected services, status codes, latency shifts, deployment changes.
- Increase signal quality without restart: use the runtime control plane to raise the log level and trace sampling ratio (see the curl example after this list).
- Identify dependency failures: check Collector, Loki, Tempo, Prometheus health and ingestion queues.
- Mitigate: roll back latest app change, scale affected service, or disable expensive capture options.
- Verify recovery: p95 + 5xx return to baseline, trace volume normalized, alert clears.
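For the second step, the same control endpoint shown earlier works during an incident; host, token, and values below are illustrative:

curl -X POST http://localhost:8000/_observability/control \
  -H "Authorization: Bearer $OBSERVABILITY_CONTROL_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"log_level":"DEBUG","trace_sampling_ratio":1.0}'

Remember to lower both values again once the incident is resolved.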
Use the bundled manifests:
kubectl kustomize --load-restrictor=LoadRestrictionsNone examples/k8s | kubectl apply -f -
kubectl -n observability rollout status deployment/app-a
kubectl -n observability rollout status deployment/app-b
kubectl -n observability rollout status deployment/app-c
kubectl -n observability rollout status deployment/otel-collector
kubectl -n observability rollout status deployment/prometheus
kubectl -n observability rollout status deployment/loki
kubectl -n observability rollout status deployment/tempo
kubectl -n observability rollout status deployment/grafana
kubectl -n observability rollout status deployment/traffic-generator
kubectl -n observability port-forward svc/grafana 3000:3000

Open http://localhost:3000.
Full guide: kubernetes.md
fastapi-observer integrates natively with the core OpenTelemetry Python SDK, meaning you can aggressively tune its resource usage purely via standard environment variables without altering your application code.
For high-throughput services (e.g. 10k+ RPS), apply these exact variables to minimize the observer footprint:
Tracing 100% of requests is too expensive at scale. You should configure fastapi-observer to respect upstream trace flags, while only sampling a fraction of net-new requests:
# Keep the parent's sample decision if it exists, otherwise sample 5%
export OTEL_TRACES_SAMPLER="parentbased_traceidratio"
export OTEL_TRACES_SAMPLER_ARG="0.05"

Do not waste cycles generating spans for health checks or static assets. fastapi-observer auto-derives metrics exclusions, but you can explicitly drop these paths from tracing at the instrumentation level:
export OTEL_PYTHON_FASTAPI_EXCLUDED_URLS="healthz,metrics,favicon.ico"

Prevent large, unmanageable spans from consuming excessive memory in the BatchSpanProcessor:
export OTEL_SPAN_ATTRIBUTE_COUNT_LIMIT="128"
export OTEL_SPAN_EVENT_COUNT_LIMIT="128"
export OTEL_SPAN_LINK_COUNT_LIMIT="128"

The default OpenTelemetry batch limits are conservative for high-throughput ASGI microservices. Increase the queue and batch sizes so spikes are not dropped, and decrease the schedule delay so spans are flushed from process memory faster:
export OTEL_BSP_MAX_QUEUE_SIZE="10000"
export OTEL_BSP_MAX_EXPORT_BATCH_SIZE="5000"
export OTEL_BSP_SCHEDULE_DELAY="1000"

The examples/ directory contains runnable demos:
| Example | What it shows |
|---|---|
| basic_app.py | Minimal setup and request logging |
| security_presets_app.py | Preset-based security policy |
| allowlist_app.py | Allowlist-only sanitization |
| otel_app.py | OTel tracing and resource attributes |
| audit_app.py | Tamper-evident cryptographic log signatures |
| db_tracing_app.py | SQLCommenter and database trace injection |
| graphql_app.py | Native Strawberry GraphQL observability |
| benchmarks/ | Baseline vs observer benchmark harness |
| k8s/ | Kubernetes-native stack with Prometheus + Loki + Tempo + Grafana |
| full_stack/ | Docker Compose stack: 3 FastAPI services + Grafana + Prometheus + Loki + Tempo |
Run an example:
uvicorn examples.basic_app:app --reload

From examples/full_stack, these are real Grafana views generated by fastapi-observer telemetry:
Overview panels (latency heatmap, route throughput, errors, CPU/memory):
Percentiles, request rate, and structured JSON logs in Loki:
The library supports configuration from code and env vars. Below are the most relevant env vars by area.
| Variable | Default | Description |
|---|---|---|
| APP_NAME | app | Namespace for app-level identity |
| SERVICE_NAME | api | Service label for logs/metrics |
| ENVIRONMENT | development | Environment label |
| APP_VERSION | 0.0.0 | Service version |
| LOG_LEVEL | INFO | Root log level |
| LOG_DIR | - | Optional file log directory |
| LOG_QUEUE_MAX_SIZE | 10000 | Max in-memory records in core log queue |
| LOG_QUEUE_OVERFLOW_POLICY | drop_oldest | Queue overflow behavior: drop_oldest, drop_newest, block |
| LOG_QUEUE_BLOCK_TIMEOUT_SECONDS | 1.0 | Timeout used by block policy before dropping newest |
| LOG_SINK_CIRCUIT_BREAKER_ENABLED | true | Enable sink circuit-breaker protection |
| LOG_SINK_CIRCUIT_BREAKER_FAILURE_THRESHOLD | 5 | Consecutive sink failures before opening circuit |
| LOG_SINK_CIRCUIT_BREAKER_RECOVERY_TIMEOUT_SECONDS | 30.0 | Open-state cooldown before half-open probe |
| REQUEST_ID_HEADER | x-request-id | Incoming request ID header |
| RESPONSE_REQUEST_ID_HEADER | x-request-id | Response request ID header |
| Variable | Default | Description |
|---|---|---|
| METRICS_ENABLED | false | Enable metrics backend |
| METRICS_BACKEND | prometheus | Registered backend name used by install_observability() |
| METRICS_PATH | /metrics | Metrics endpoint path |
| METRICS_EXCLUDE_PATHS | /metrics,/health,/healthz,/docs,/openapi.json | Skip metrics for noisy endpoints |
| METRICS_EXEMPLARS_ENABLED | false | Enable exemplars where supported |
| METRICS_FORMAT | negotiate | prometheus, openmetrics, or negotiate |
Caution
The /metrics endpoint is unauthenticated by default. In production it should be restricted to internal networks (e.g. behind a Kubernetes NetworkPolicy, VPC security group, or ingress rule that only allows your Prometheus scraper). Exposing it publicly leaks service topology, error rates, and request patterns.
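One way to enforce this on Kubernetes is a NetworkPolicy sketch like the one below. It assumes app pods labeled app=orders-api, Prometheus running in a monitoring namespace, and the app serving /metrics on port 8000; all of these are assumptions you must adapt. Note that once a pod is selected by an ingress NetworkPolicy, all other ingress to it is denied, so you must also add rules allowing your normal traffic sources (e.g. the ingress controller).

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-prometheus-scrape   # illustrative name
spec:
  podSelector:
    matchLabels:
      app: orders-api             # hypothetical app label
  policyTypes: ["Ingress"]
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: monitoring
      ports:
        - protocol: TCP
          port: 8000              # app port that exposes /metrics
    # add further rules here for legitimate application traffic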
| Variable | Default | Description |
|---|---|---|
| OBS_REDACTION_PRESET | - | strict, pci, gdpr |
| OBS_REDACTED_FIELDS | built-in list | CSV keys to redact |
| OBS_REDACTED_HEADERS | built-in list | CSV headers to redact |
| OBS_REDACTION_MODE | mask | mask, hash, drop |
| OBS_MASK_TEXT | *** | Mask replacement text |
| OBS_LOG_REQUEST_BODY | false | Enable request body logging |
| OBS_LOG_RESPONSE_BODY | false | Enable response body logging |
| OBS_MAX_BODY_LENGTH | 256 | Max captured body bytes |
| OBS_HEADER_ALLOWLIST | - | CSV headers allowed in logs |
| OBS_EVENT_KEY_ALLOWLIST | - | CSV event keys allowed in logs |
| OBS_BODY_CAPTURE_MEDIA_TYPES | - | CSV allowed media types for body capture |
| OBS_TRUSTED_PROXY_ENABLED | true | Enable trusted-proxy policy |
| OBS_TRUSTED_CIDRS | RFC1918 + loopback | CSV trusted CIDRs |
| OBS_HONOR_FORWARDED_HEADERS | false | Trust forwarded headers |
Notes:
- OBS_HEADER_ALLOWLIST, OBS_EVENT_KEY_ALLOWLIST, and OBS_BODY_CAPTURE_MEDIA_TYPES accept none, null, or unset to clear values.
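For example, an environment-only configuration that applies the PCI preset and tightens the trust boundary might look like this (values are illustrative; variable names come from the table above):

export OBS_REDACTION_PRESET="pci"
export OBS_REDACTION_MODE="hash"
export OBS_LOG_REQUEST_BODY="false"
export OBS_TRUSTED_PROXY_ENABLED="true"
export OBS_TRUSTED_CIDRS="10.0.0.0/8,127.0.0.1/32"
export OBS_HONOR_FORWARDED_HEADERS="false"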
| Variable | Default | Description |
|---|---|---|
| OTEL_ENABLED | false | Enable tracing instrumentation |
| OTEL_SERVICE_NAME | SERVICE_NAME | OTel service name override |
| OTEL_SERVICE_VERSION | APP_VERSION | OTel service version override |
| OTEL_ENVIRONMENT | ENVIRONMENT | OTel environment override |
| OTEL_EXPORTER_OTLP_ENDPOINT | - | OTLP endpoint |
| OTEL_EXPORTER_OTLP_PROTOCOL | grpc | grpc or http/protobuf |
| OTEL_TRACE_SAMPLING_RATIO | 1.0 | Initial trace sampling ratio |
| OTEL_EXTRA_RESOURCE_ATTRIBUTES | - | CSV key=value pairs |
| OTEL_EXCLUDED_URLS | auto-derived | CSV excluded paths for tracing |
| OTEL_LOGS_ENABLED | false | Enable OTLP log export |
| OTEL_LOGS_MODE | local_json | local_json, otlp, both |
| OTEL_LOGS_ENDPOINT | - | OTLP logs endpoint |
| OTEL_LOGS_PROTOCOL | grpc | grpc or http/protobuf |
| OTEL_METRICS_ENABLED | false | Enable OTLP metrics export |
| OTEL_METRICS_ENDPOINT | - | OTLP metrics endpoint |
| OTEL_METRICS_PROTOCOL | grpc | grpc or http/protobuf |
| OTEL_METRICS_EXPORT_INTERVAL_MILLIS | 60000 | OTLP metrics export interval in milliseconds |
| Variable | Default | Description |
|---|---|---|
| OBS_RUNTIME_CONTROL_ENABLED | false | Enable runtime control endpoint |
| OBS_RUNTIME_CONTROL_PATH | /_observability/control | Control endpoint path |
| OBS_RUNTIME_CONTROL_TOKEN_ENV_VAR | OBSERVABILITY_CONTROL_TOKEN | Name of env var containing bearer token |
| OBSERVABILITY_CONTROL_TOKEN | - | Bearer token value used for auth |
| Variable | Default | Description |
|---|---|---|
| LOGTAIL_ENABLED | false | Enable Better Stack Logtail sink |
| LOGTAIL_SOURCE_TOKEN | - | Logtail source token |
| LOGTAIL_BATCH_SIZE | 50 | Batch size for shipping |
| LOGTAIL_FLUSH_INTERVAL | 2.0 | Flush interval (seconds) |
| LOGTAIL_DLQ_ENABLED | false | Enable resilient local disk fallback for dropped logs |
| LOGTAIL_DLQ_DIR | .dlq/logtail | Directory to archive dropped NDJSON messages |
| LOGTAIL_DLQ_MAX_BYTES | 52428800 | Max bytes per DLQ file before rotation (50 MB) |
| LOGTAIL_DLQ_COMPRESS | true | GZIP-compress rotated DLQ files |
Tip
The Logtail Dead Letter Queue (DLQ) provides best-effort local durability. If the internal memory queue overflows under sustained load (queue.Full), or a network outage exhausts the outbound HTTP retry backoff, the dropped log payloads are written to local NDJSON envelopes. You can replay these files to Better Stack later with the provided scripts/replay_dlq.py utility.
For regulated industries (Fintech, Healthcare, SOC 2) where you must prove logs were not altered, deleted, or reordered:
pip install "fastapi-observer[audit]"
export OBS_AUDIT_SECRET_KEY="your-signing-secret"
export OBS_AUDIT_LOGGING_ENABLED="true"

install_observability(app, settings)
# Every JSON log now contains _audit_seq and _audit_sig fields

Each log record is chained via HMAC-SHA256: record B's signature includes the signature of record A. Breaking any link (tamper, delete, reorder) invalidates the chain.
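Conceptually, the chain works like the sketch below. It is illustrative only: the field names _audit_seq and _audit_sig match the output above, but the library's exact record canonicalization is not shown here.

import hashlib
import hmac
import json

def sign_chain(records: list[dict], key: bytes) -> list[dict]:
    """Illustrative HMAC-SHA256 hash chain: each signature covers the record
    plus the previous signature, so any tamper, delete, or reorder breaks
    every later link."""
    prev_sig = ""
    signed = []
    for seq, record in enumerate(records):
        body = dict(record, _audit_seq=seq)
        message = prev_sig + json.dumps(body, sort_keys=True)
        sig = hmac.new(key, message.encode(), hashlib.sha256).hexdigest()
        signed.append(dict(body, _audit_sig=sig))
        prev_sig = sig
    return signed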
| Variable | Default | Description |
|---|---|---|
| OBS_AUDIT_LOGGING_ENABLED | false | Enable HMAC-SHA256 hash chain |
| OBS_AUDIT_KEY_ENV_VAR | OBS_AUDIT_SECRET_KEY | Name of env var containing the signing key |
| OBS_AUDIT_SECRET_KEY | - | The HMAC signing key (read by LocalHMACProvider) |
Custom key provider (e.g. KMS / Vault):
from fastapiobserver import AuditKeyProvider
class VaultKeyProvider:
def get_key(self) -> bytes:
return vault_client.get_secret("audit-signing-key").encode()
install_observability(app, settings, audit_key_provider=VaultKeyProvider())

Verify logs with the CLI:
export OBS_AUDIT_SECRET_KEY="your-signing-secret"
python scripts/verify_audit_chain.py exported_logs.ndjson
# PASS — 1042 records verified, chain intact.

SQLCommenter integration bridges the gap between FastAPI application traces and database performance monitoring. When enabled, the trace context is automatically injected into raw SQL queries as a comment:
SELECT * FROM users /*traceparent='00-abc123def456-01',route='/api/users'*/

This lets DBAs correlate slow Postgres/MySQL queries directly back to the originating HTTP request.
pip install "fastapi-observer[otel-sqlalchemy]"Automatic (via install_observability):
from sqlalchemy import create_engine
engine = create_engine("postgresql://...")
install_observability(app, settings, otel_settings=otel_settings, db_engine=engine)

Multiple engines (e.g. read/write replicas):
write_engine = create_engine("postgresql://primary/...")
read_engine = create_engine("postgresql://replica/...")
install_observability(
app, settings,
otel_settings=otel_settings,
db_engine=[write_engine, read_engine],
db_commenter_options={"opentelemetry_values": True, "route": False},
)

Manual (standalone):
from fastapiobserver import instrument_sqlalchemy, instrument_sqlalchemy_async
# Sync engine
instrument_sqlalchemy(engine)
# Async engine (extracts .sync_engine automatically)
instrument_sqlalchemy_async(async_engine)

Custom commenter options:
instrument_sqlalchemy(engine, commenter_options={
"opentelemetry_values": True, # traceparent (default: True)
"db_driver": True, # e.g. psycopg2 (default: True)
"route": True, # e.g. /api/users (default: True)
"db_framework": True, # e.g. sqlalchemy (default: False)
})

Caution
Database statement PII risk: The OpenTelemetry SQLAlchemy instrumentor captures raw SQL as the db.statement span attribute and exports it directly to your tracing backend (Jaeger, Tempo, Datadog). These span attributes are not scrubbed by fastapi-observer's SecurityPolicy. If your application has poorly parameterized queries (e.g. WHERE email = 'user@example.com'), PII will leak into your trace storage. Always use parameterized queries and review your db.statement exports in production.
If body capture is enabled, install observability before other middleware:
from fastapi.middleware.cors import CORSMiddleware
from fastapiobserver import SecurityPolicy, install_observability
install_observability(app, settings, security_policy=SecurityPolicy(log_request_body=True))
app.add_middleware(CORSMiddleware, allow_origins=["*"])

Warning
Gunicorn's --preload flag (the preload_app setting) is dangerous if observability is initialized at the module level.
When the app is preloaded, it is imported into the master process before the worker processes are forked. fastapi-observer uses a background QueueListener thread for non-blocking logging, and OpenTelemetry similarly spawns background export threads.
Threads do not survive a process fork. If you call install_observability() at the module level (e.g., right under app = FastAPI()), the background threads are created in the master process, and all workers silently drop logs because their logging threads never started.
How to fix: either remove --preload, or initialize observability inside the FastAPI lifespan context manager so it starts safely after the fork inside each worker:
from contextlib import asynccontextmanager
from fastapi import FastAPI
from fastapiobserver import ObservabilitySettings, install_observability
settings = ObservabilitySettings(service="api")
@asynccontextmanager
async def lifespan(app: FastAPI):
install_observability(app, settings) # Safely initializes per-worker
yield
# fastapiobserver automatically registers shutdown hooks so no teardown needed here
app = FastAPI(lifespan=lifespan)If you are using Prometheus with multiple Gunicorn workers, you must configure a shared metrics directory:
export PROMETHEUS_MULTIPROC_DIR=/tmp/prometheus-metrics
rm -rf "$PROMETHEUS_MULTIPROC_DIR"
mkdir -p "$PROMETHEUS_MULTIPROC_DIR"gunicorn.conf.py:
from fastapiobserver import mark_prometheus_process_dead
def child_exit(server, worker):
    mark_prometheus_process_dead(worker.pid)

Use queue controls to define behavior under sustained log pressure:
settings = ObservabilitySettings(
app_name="orders-api",
service="orders",
environment="production",
log_queue_max_size=20000,
log_queue_overflow_policy="drop_oldest", # or "drop_newest" / "block"
log_queue_block_timeout_seconds=0.5,
)

Queue pressure metrics exposed on /metrics (Prometheus mode):
- fastapiobserver_log_queue_size
- fastapiobserver_log_queue_capacity
- fastapiobserver_log_queue_enqueued_total
- fastapiobserver_log_queue_dropped_total{reason="drop_oldest|drop_newest"}
- fastapiobserver_log_queue_blocked_total
- fastapiobserver_log_queue_block_timeouts_total
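A starter alert on sustained log drops, using the counters above (the threshold and window are illustrative):

# Any dropped queued log records over the last 5 minutes
sum(rate(fastapiobserver_log_queue_dropped_total[5m])) by (reason) > 0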
Every output sink is wrapped with a circuit breaker so a failing sink does not
degrade request-path logging. This includes custom sinks registered via the
LogSink protocol.
The core package stays intentionally lean; provider-specific sinks can be added
as optional packages without changing install_observability().
settings = ObservabilitySettings(
app_name="orders-api",
service="orders",
environment="production",
sink_circuit_breaker_enabled=True,
sink_circuit_breaker_failure_threshold=5,
sink_circuit_breaker_recovery_timeout_seconds=30.0,
)

Breaker metrics exposed on /metrics:
- fastapiobserver_sink_circuit_breaker_state_info{sink,state}
- fastapiobserver_sink_circuit_breaker_failures_total{sink}
- fastapiobserver_sink_circuit_breaker_skipped_total{sink}
- fastapiobserver_sink_circuit_breaker_opens_total{sink}
- fastapiobserver_sink_circuit_breaker_half_open_total{sink}
- fastapiobserver_sink_circuit_breaker_closes_total{sink}
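A breaker that keeps opening usually indicates a failing sink; a starter query using the counters above (window is illustrative):

# Sink circuit breaker opened within the last 15 minutes
sum(increase(fastapiobserver_sink_circuit_breaker_opens_total[15m])) by (sink) > 0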
install_observability() now registers graceful logging teardown on FastAPI
shutdown and also uses an atexit fallback. This reduces lost log records
during process termination.
If you embed logging setup outside FastAPI lifecycle management, you can stop the queue pipeline explicitly:
from fastapiobserver import shutdown_logging
shutdown_logging()

If your service already uses loguru, forward those logs into
fastapi-observer instead of maintaining two independent pipelines.
from fastapiobserver import install_loguru_bridge
# loguru -> stdlib -> fastapi-observer queue/sinks
bridge_id = install_loguru_bridge()

Detailed migration/coexistence guide: loguru.md
If you use strawberry-graphql, routing all traffic through POST /graphql makes logs and traces nearly useless: every request looks identical.
fastapi-observer ships a native Strawberry extension (implemented via duck typing, so no extra pip dependency is pulled in) that automatically extracts GraphQL operation metadata.
import strawberry
from fastapiobserver.integrations.strawberry import StrawberryObservabilityExtension
@strawberry.type
class Query:
@strawberry.field
def hello(self) -> str:
return "world"
schema = strawberry.Schema(
query=Query,
extensions=[StrawberryObservabilityExtension], # Inject this!
)

With this extension, your logs automatically get a graphql context key containing the extracted operation_name:
{
"event": {
"method": "POST",
"path": "/graphql"
},
"user_context": {
"graphql": {
"operation_name": "GetUsersQuery"
}
}
}

If OpenTelemetry is enabled, your traces are dynamically renamed from POST /graphql to graphql.operation.GetUsersQuery.
Extend behavior without editing package internals:
from fastapiobserver import (
register_log_enricher,
register_log_filter,
register_metric_hook,
)
def add_git_sha(payload: dict) -> dict:
payload["git_sha"] = "abc123"
return payload
def drop_health_probe(record) -> bool:
return "health" not in record.getMessage().lower()
def track_slow_requests(request, response, duration):
if duration > 1.0:
print(f"slow request: {request.url.path} {duration:.2f}s")
register_log_enricher("git_sha", add_git_sha)
register_log_filter("drop_health_probe", drop_health_probe)
register_metric_hook("slow_requests", track_slow_requests)

Plugin failures are isolated and do not crash request handling.
Use register_metrics_backend() to plug in non-Prometheus backends without
modifying core code:
from fastapiobserver import register_metrics_backend
class MyBackend:
def observe(self, method, path, status_code, duration_seconds):
...
def mount_endpoint(self, app, *, path="/metrics", metrics_format="negotiate"):
# Optional: mount a backend-specific endpoint
...
def build_my_backend(*, service: str, environment: str, exemplars_enabled: bool):
return MyBackend()
register_metrics_backend("my_backend", build_my_backend)

StructuredJsonFormatter accepts injectable callables for enrichment and
sanitization, keeping defaults unchanged while improving testability:
formatter = StructuredJsonFormatter(
settings,
enrich_event=my_enricher,
sanitize_payload=my_sanitizer,
)

Repository integration tests include:

- tests/test_otel_log_correlation.py: verifies trace/span IDs in logs map to real spans.
- tests/test_otlp_export_integration.py: validates OTLP HTTP export with local collector fixtures.
Reproducible benchmark harness and methodology:
- Guide: benchmarks.md
- Apps: examples/benchmarks/app.py
- Runner: examples/benchmarks/harness.py
- 0.1.x: secure-by-default core
- 0.2.x: OTel interoperability, security presets, allowlists
- 0.3.x: GraphQL observability, error fingerprinting, and Logtail DLQ durability
- 0.4.x: package modularization, sink/registry hardening, and runtime control token rotation
- 1.0.x: first stable release contract for production deployments
- 1.2.0: tamper-evident audit logging and SQLAlchemy trace/commenter integration
Current release version: 1.2.0
Breaking changes must be listed under a Breaking Changes section in CHANGELOG.md.
Recommended release command (uses .env with PYPI_TOKEN):
scripts/deploy_pypi.sh --tag v1.2.0 --push-tag

Manual release steps:

python -m pip install --upgrade pip build
python -m build

python -m pip install --upgrade twine
python -m twine upload --repository testpypi dist/*

python -m pip install \
  --extra-index-url https://test.pypi.org/simple/ \
  fastapi-observer

python -m twine upload dist/*

Enable the repository git hooks:

git config core.hooksPath .githooks

The pre-push hook runs:

- uv run ruff check
- uv run mypy src
- uv run pytest -q
See NEXT_STEPS.md for the active roadmap and release checklist.

