Add metrics endpoint and monitoring for event processing #94

@robertocarlous

Summary

Provide observability: expose Prometheus-style metrics for event processing, agent loop health, processing latency, error counts, and DLQ size.

What to do

  • Add a GET /metrics endpoint (for example, by integrating prom-client) that exposes counters, histograms, and gauges:
    • events_processed_total
    • events_failed_total
    • event_processing_duration_seconds
    • dlq_size
    • event_cursor_lag
    • agent_loop_heartbeat_timestamp
  • Instrument startEventListener(), the event processing pipeline, DLQ retries, and the agent loop with metrics.
  • Add documentation on key metrics and recommended alerting thresholds.
  • Optionally add sample Grafana dashboard panels in docs.
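
The instrumentation described above could look roughly like the sketch below. It is a dependency-free stand-in for what a library such as prom-client would normally provide, not the project's implementation. The names `processWithMetrics`, `recordHeartbeat`, `renderMetrics`, and `state`, the histogram bucket bounds, and the choice to count only successful events in events_processed_total are all assumptions made for illustration.

```typescript
// Dependency-free stand-in for a metrics library such as prom-client.
// Bucket bounds (seconds) are an assumption, not taken from the issue.
const DURATION_BUCKETS = [0.05, 0.25, 1, 5];

const state = {
  eventsProcessed: 0, // events_processed_total (counter)
  eventsFailed: 0, // events_failed_total (counter)
  dlqSize: 0, // dlq_size (gauge)
  cursorLag: 0, // event_cursor_lag (gauge)
  heartbeat: 0, // agent_loop_heartbeat_timestamp (gauge, Unix seconds)
  bucketCounts: DURATION_BUCKETS.map(() => 0), // cumulative, one per bound
  durationSum: 0,
  durationCount: 0,
};

// Records one observation in the event_processing_duration_seconds histogram.
function observeDuration(seconds: number): void {
  DURATION_BUCKETS.forEach((upperBound, i) => {
    if (seconds <= upperBound) state.bucketCounts[i] += 1;
  });
  state.durationSum += seconds;
  state.durationCount += 1;
}

// Wraps a hypothetical per-event handler: counts success or failure and
// always records how long the call took (millisecond resolution).
async function processWithMetrics<T>(handler: () => Promise<T>): Promise<T> {
  const startedAt = Date.now();
  try {
    const result = await handler();
    state.eventsProcessed += 1;
    return result;
  } catch (err) {
    state.eventsFailed += 1;
    throw err;
  } finally {
    observeDuration((Date.now() - startedAt) / 1000);
  }
}

// Called once per agent loop iteration so a stale value signals a stuck loop.
function recordHeartbeat(): void {
  state.heartbeat = Math.floor(Date.now() / 1000);
}

// Renders the current state in the Prometheus text exposition format.
function renderMetrics(): string {
  const hist = "event_processing_duration_seconds";
  const lines = [
    "# TYPE events_processed_total counter",
    `events_processed_total ${state.eventsProcessed}`,
    "# TYPE events_failed_total counter",
    `events_failed_total ${state.eventsFailed}`,
    "# TYPE dlq_size gauge",
    `dlq_size ${state.dlqSize}`,
    "# TYPE event_cursor_lag gauge",
    `event_cursor_lag ${state.cursorLag}`,
    "# TYPE agent_loop_heartbeat_timestamp gauge",
    `agent_loop_heartbeat_timestamp ${state.heartbeat}`,
    `# TYPE ${hist} histogram`,
  ];
  DURATION_BUCKETS.forEach((upperBound, i) => {
    lines.push(`${hist}_bucket{le="${upperBound}"} ${state.bucketCounts[i]}`);
  });
  lines.push(
    `${hist}_bucket{le="+Inf"} ${state.durationCount}`,
    `${hist}_sum ${state.durationSum}`,
    `${hist}_count ${state.durationCount}`,
  );
  return lines.join("\n") + "\n";
}
```

A GET /metrics handler would return `renderMetrics()` with the content type `text/plain; version=0.0.4; charset=utf-8`. The DLQ and cursor code paths would set `state.dlqSize` and `state.cursorLag` wherever those values change. With prom-client, the same shape maps onto its Counter, Gauge, and Histogram classes instead of hand-written state.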

Acceptance criteria

  • /metrics endpoint returns Prometheus-format metrics.
  • Key metrics are collected and documented.
  • Example alerts documented (e.g., dlq_size > 50, event_cursor_lag > X).
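
One way to check these criteria is a small smoke test that parses the /metrics response body, confirms the key metric names are present, and evaluates the example thresholds. The sketch below is illustrative only: `parseSamples`, `missingMetrics`, and `firingAlerts` are hypothetical helpers, and the event_cursor_lag threshold is a caller-supplied parameter because the issue leaves that value open. In a real deployment, alerts like these would more likely be written as Prometheus alerting rules.

```typescript
// Metric names from the issue. For the histogram, the _count series is
// used as evidence that it is exported.
const REQUIRED_METRICS = [
  "events_processed_total",
  "events_failed_total",
  "event_processing_duration_seconds_count",
  "dlq_size",
  "event_cursor_lag",
  "agent_loop_heartbeat_timestamp",
];

// Parses samples from Prometheus text exposition output. Labelled series
// are keyed as name{labels}; comment lines (# HELP, # TYPE) never match.
function parseSamples(body: string): Record<string, number> {
  const samples: Record<string, number> = {};
  for (const line of body.split("\n")) {
    const m = /^([a-zA-Z_:][a-zA-Z0-9_:]*(?:\{[^}]*\})?)\s+(\S+)/.exec(line);
    if (m) samples[m[1]] = Number(m[2]);
  }
  return samples;
}

// Returns the required metric names that are absent from the parsed output.
function missingMetrics(samples: Record<string, number>): string[] {
  return REQUIRED_METRICS.filter((name) => samples[name] === undefined);
}

// dlq_size > 50 is the example from the issue; maxCursorLag is supplied by
// the caller because the issue does not fix a value for it.
function firingAlerts(samples: Record<string, number>, maxCursorLag: number): string[] {
  const alerts: string[] = [];
  if ((samples["dlq_size"] ?? 0) > 50) alerts.push("dlq_size > 50");
  if ((samples["event_cursor_lag"] ?? 0) > maxCursorLag) {
    alerts.push(`event_cursor_lag > ${maxCursorLag}`);
  }
  return alerts;
}
```

Running `missingMetrics(parseSamples(body))` against a live response gives an empty array once every key metric is exported, which makes it usable as a CI check for the first two criteria.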

Labels

Stellar Wave: Issues in the Stellar Wave program