Add service observability and rollout metrics #355

kacy merged 8 commits into main from phase7-service-observability on Mar 29, 2026

kacy commented on Mar 29, 2026

Summary

  • Add service observability Prometheus metrics for reconcile activity, BPF sync failures, health latency, endpoint flaps, and L7 proxy state.
  • Extend rollout observability via /v1/status?mode=service_rollout and /v1/metrics?format=prometheus.
  • Update the user guide to document the rollout observability surfaces and the new per-service metrics.
  • Stabilize proxy, steering, listener, registry, and status-metrics tests so the observability branch validates cleanly in this environment.
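
For downstream consumers of the Prometheus surface, a minimal sketch of reading and parsing the text exposition output might look like the following. The base URL and every `example_*` metric name and label are hypothetical placeholders; the metric names this PR actually emits are not listed in this description. Only the `/v1/metrics?format=prometheus` path comes from the summary above.

```python
import re
import urllib.request

# One sample line: metric name, optional {labels}, then the value.
_SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# One label pair inside the braces: name="value" (escapes are kept as-is).
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_prometheus_text(text):
    """Return (name, labels, value) tuples from Prometheus exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE.match(line)
        if match is None:
            continue
        name, raw_labels, raw_value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(raw_value)))
    return samples


def fetch_metrics(base_url):
    """Fetch the metrics text; base_url is an assumption, not from the PR."""
    url = base_url.rstrip("/") + "/v1/metrics?format=prometheus"
    with urllib.request.urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8")


# Hypothetical payload, used here instead of a live server.
SAMPLE = """\
# HELP example_service_reconcile_total Hypothetical reconcile counter.
# TYPE example_service_reconcile_total counter
example_service_reconcile_total{service="web"} 12
example_service_endpoint_flaps_total{service="web",endpoint="10.0.0.5:8080"} 3
example_up 1
"""

samples = parse_prometheus_text(SAMPLE)
for name, labels, value in samples:
    print(name, labels, value)
```

Against a running instance, `parse_prometheus_text(fetch_metrics("http://127.0.0.1:8080"))` would be the equivalent call, where the address is again only an assumption.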

Validation

  • env YOQ_SKIP_SLOW_TESTS=1 ZIG_GLOBAL_CACHE_DIR=.zig-global-cache ZIG_LOCAL_CACHE_DIR=.zig-local-cache zig build test
  • Result: 1686 passed; 10 skipped; 0 failed

Review Notes

  • Please verify that the Prometheus metric names and labels are acceptable for downstream consumers.
  • Please verify that the health-gated endpoint behavior reflected in the status-metrics test is the intended runtime behavior.
  • Listener and steering tests were made sandbox-aware so constrained environments skip or behave deterministically instead of failing spuriously.
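
As one way to approach the first review note, metric and label names can be checked mechanically against the character rules in the Prometheus data model. This is a sketch only: `check_metric` is a hypothetical helper, the `example_*` names are placeholders rather than the names added in this PR, and passing the check says nothing about whether a name suits a particular dashboard or alert.

```python
import re

# Character rules from the Prometheus data model.
METRIC_NAME = re.compile(r'[a-zA-Z_:][a-zA-Z0-9_:]*')
LABEL_NAME = re.compile(r'[a-zA-Z_][a-zA-Z0-9_]*')


def check_metric(name, label_names):
    """Return a list of human-readable problems; empty means the names pass."""
    problems = []
    if METRIC_NAME.fullmatch(name) is None:
        problems.append(f"invalid metric name: {name!r}")
    for label in label_names:
        if LABEL_NAME.fullmatch(label) is None:
            problems.append(f"invalid label name: {label!r}")
        elif label.startswith("__"):
            # Labels beginning with two underscores are reserved by Prometheus.
            problems.append(f"reserved label name: {label!r}")
    return problems


# Hypothetical names, for illustration only.
print(check_metric("example_service_health_latency_seconds", ["service"]))
print(check_metric("bad-name", ["service"]))
print(check_metric("example_total", ["__internal", "9lives"]))
```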

kacy merged commit 30d0c3b into main on Mar 29, 2026
6 of 7 checks passed
kacy deleted the phase7-service-observability branch on March 29, 2026 at 14:47