Add chain-entry counter and dispatcher-level metrics #159

Merged
pavanputhra merged 1 commit into main from pavankumar/con-522-bds-monitoring-alerts
May 11, 2026

@pavanputhra
Contributor

Summary

Three OTel metric additions in conserver/main.py to give a complete view of chain processing independent of per-backend instrumentation coverage.

  • conserver.main_loop.count_vcons_received{ingress_list} after r.blpop — pairs with the existing count_vcons_processed to compute a true completion ratio, and to detect a stall (received > 0 AND processed == 0).
  • conserver.storage.count{backend, outcome} and conserver.storage.duration_ms{backend} in _process_storage — uniform metrics across all storage backends, regardless of whether the backend's own code emits.
  • conserver.link.count{link_name, outcome} on both success and error paths of _process_link — same shape, dispatcher-level.
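
The completion ratio and stall condition described in the first bullet can be sketched as a small function over the two counter values for a given window. This is a minimal illustration, not code from the PR: `chain_health` and its return keys are hypothetical names, and in practice the check would more likely live in an alert rule on the metrics backend.

```python
from typing import Optional, TypedDict


class ChainHealth(TypedDict):
    completion_ratio: Optional[float]
    stalled: bool


def chain_health(received: int, processed: int) -> ChainHealth:
    """Derive health signals from count_vcons_received / count_vcons_processed.

    `received` and `processed` are the counter increases over the same
    time window (hypothetical inputs; how they are queried is not shown).
    """
    # No vCons arrived in the window: a ratio is undefined, not zero.
    ratio = processed / received if received > 0 else None
    # Stall condition from the PR summary: received > 0 AND processed == 0.
    stalled = received > 0 and processed == 0
    return {"completion_ratio": ratio, "stalled": stalled}
```

Returning `None` for an empty window keeps "nothing arrived" distinct from "everything failed", which a ratio of 0 would otherwise conflate.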

Existing per-backend metrics (milvus, supabase_direct, etc.) keep firing unchanged; the dispatcher metrics are additive.

Why

Today only ~3 of 16 storage backends and ~11 of 23 link backends emit their own error counters. Wrapping at the dispatcher gives a single source of truth for "did this backend fail" without backfilling instrumentation across the ~25 modules that lack it.
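
The dispatcher-level wrapping pattern looks roughly like the sketch below. It is not the code in conserver/main.py: a plain `Counter` and list stand in for the OTel counter and histogram so the example runs without dependencies, and `process_storage`, the `save` callable, the `"success"`/`"error"` outcome values, and the choice to swallow the exception are all assumptions made for illustration.

```python
import time
from collections import Counter
from typing import Callable, List, Tuple

# Stand-ins for the OTel instruments (assumption: the real code uses
# an OTel counter and histogram with these attribute names).
storage_count: Counter = Counter()                 # conserver.storage.count{backend, outcome}
storage_duration_ms: List[Tuple[str, float]] = []  # conserver.storage.duration_ms{backend}


def process_storage(backend: str, save: Callable[[str], None], vcon_id: str) -> bool:
    """Call one backend's save function and record outcome and duration.

    The backend needs no instrumentation of its own; everything is
    recorded here, at the dispatcher.
    """
    start = time.perf_counter()
    outcome = "success"
    try:
        save(vcon_id)
        return True
    except Exception:
        # Assumption: the failure is recorded and reported via the return value.
        outcome = "error"
        return False
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        storage_count[(backend, outcome)] += 1
        storage_duration_ms.append((backend, elapsed_ms))
```

Because the counting happens in `finally`, every call produces exactly one count and one duration sample regardless of which path the backend takes.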

Test plan

  • Local unit tests pass
  • Verify new metric names appear in any OTLP-receiving collector after a real chain run
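
For the second check, a small helper can compare the metric names seen at the collector against the four names this PR adds. This is only a sketch: `missing_metrics` is a hypothetical helper, and how the observed names are pulled from the collector is left out because it depends on the deployment.

```python
from typing import Iterable, List

# Metric names introduced by this PR, as listed in the summary.
EXPECTED_METRICS = {
    "conserver.main_loop.count_vcons_received",
    "conserver.storage.count",
    "conserver.storage.duration_ms",
    "conserver.link.count",
}


def missing_metrics(observed_names: Iterable[str]) -> List[str]:
    """Return the expected metric names that were not observed, sorted."""
    return sorted(EXPECTED_METRICS - set(observed_names))
```

An empty result after a real chain run means all four new metrics reached the collector.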

Three additions in conserver/main.py to give us a complete view of the chain
without depending on every backend remembering to self-instrument:

- count_vcons_received{ingress_list} after r.blpop, paired with the
  existing count_vcons_processed for a true completion ratio
- conserver.storage.count{backend, outcome} + duration_ms histogram in
  _process_storage, covering all backends uniformly
- conserver.link.count{link_name, outcome} on both success and error
  paths in _process_link

Existing per-backend metrics (milvus, supabase_direct, etc.) keep firing
unchanged - the dispatcher metrics are additive.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@pavanputhra pavanputhra merged commit 4288999 into main May 11, 2026
1 check passed