Proposal
Agent-SRE is an AI-native Site Reliability Engineering framework with SLI/SLO tracking, chaos testing, and canary deployments for AI agent systems (1,071+ tests).
Integration Opportunity
Logfire is the natural observability backend for agent-sre's telemetry. The integration would:
- Export SLI Measurements as Logfire Spans: Each SLI measurement (latency, error rate, throughput) becomes a Logfire span with structured attributes, enabling Logfire's timeline view for SLO debugging.
- Error Budget Burns as Logfire Alerts: When agent-sre detects an error budget burn rate above its threshold, it emits Logfire events that integrate with Logfire's alerting.
- Chaos Test Traces: Each chaos experiment (fault injection, latency injection, resource exhaustion) creates a parent span in Logfire, with child spans for each affected component.
- Canary Deployment Visualization: Canary and baseline metrics are exported as separate Logfire traces so they can be compared side by side.
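To make the error budget item concrete, here is a minimal sketch of the burn-rate check that would decide when to emit an event. It uses the standard definition (observed error rate divided by the error budget, where the budget is 1 minus the SLO target). The names `burn_rate`, `check_burn`, and `BurnRateResult` are hypothetical and not part of agent-sre's current API, and the default threshold of 14.4 is only a commonly used fast-burn value, not something agent-sre prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BurnRateResult:
    """Hypothetical result type: the computed burn rate and whether it tripped."""
    burn_rate: float
    exceeded: bool


def burn_rate(error_rate: float, slo_target: float) -> float:
    """Return how fast the error budget is being consumed.

    A value of 1.0 means the budget is spent exactly over the SLO window;
    higher values mean it runs out proportionally sooner.
    """
    budget = 1.0 - slo_target  # e.g. a 99.9% target leaves a 0.1% budget
    if budget <= 0:
        raise ValueError("slo_target must be below 1.0")
    return error_rate / budget


def check_burn(error_rate: float, slo_target: float,
               threshold: float = 14.4) -> BurnRateResult:
    """Compare the burn rate against a threshold (assumed default: 14.4)."""
    rate = burn_rate(error_rate, slo_target)
    return BurnRateResult(burn_rate=rate, exceeded=rate > threshold)
```

In the integration, a result with `exceeded=True` would be the point at which an event is emitted to Logfire.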
Proposed API
from agent_sre.exporters.logfire import LogfireExporter
import logfire

logfire.configure()
exporter = LogfireExporter()

# SLI measurements automatically appear in Logfire
# (slo_engine is assumed to be an existing agent-sre SLO engine instance)
slo_engine.add_exporter(exporter)

# Chaos tests create structured spans
# (chaos_runner and experiment are assumed to be defined elsewhere)
with logfire.span('chaos-experiment', experiment_type='latency-injection'):
    chaos_runner.execute(experiment)

Context
We already have a related integration with PydanticAI (see pydantic-ai#4335). An agent-sre + Logfire integration completes the observability story: PydanticAI agents monitored by agent-sre, visualized in Logfire.
Our Agent-Hypervisor (v2.0) also has a structured event bus with 40+ event types that could export to Logfire.
Happy to submit a PR with an agent-sre-logfire exporter package.
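For discussion, here is a minimal sketch of how such an exporter could map an SLI measurement onto a span. Everything in it is an assumption rather than existing agent-sre code: the `SLIMeasurement` shape, the `export` method, and the attribute names are hypothetical. Unlike the no-argument constructor in the proposed API, this version takes the span factory as a parameter so it can run without Logfire installed; in real use you would pass `logfire.span` after calling `logfire.configure()`.

```python
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Any, Callable, ContextManager


@dataclass(frozen=True)
class SLIMeasurement:
    """Hypothetical shape of one SLI data point."""
    name: str
    value: float
    unit: str
    slo_target: float


# Any callable with the shape span(msg_template, **attributes) -> context manager.
SpanFactory = Callable[..., ContextManager[Any]]


class LogfireExporter:
    """Sketch of an exporter that turns each measurement into one span."""

    def __init__(self, span_factory: SpanFactory) -> None:
        self._span = span_factory

    def export(self, m: SLIMeasurement) -> None:
        # The measurement's fields become structured span attributes.
        with self._span(
            "sli {sli_name}",
            sli_name=m.name,
            sli_value=m.value,
            sli_unit=m.unit,
            slo_target=m.slo_target,
        ):
            pass  # the span only carries attributes; no work happens inside


# Stand-in for logfire.span that records calls instead of sending them.
recorded: list[tuple[str, dict[str, Any]]] = []


@contextmanager
def recording_span(msg_template: str, **attributes: Any):
    recorded.append((msg_template, attributes))
    yield


exporter = LogfireExporter(recording_span)
exporter.export(SLIMeasurement("latency_p95", 412.0, "ms", 500.0))
```

Injecting the span factory is only a choice made for this sketch to keep it testable; the actual package could equally import `logfire` directly, as in the proposed API above.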