Skip to content

feat(profiling): implement Phase 2 deep instrumentation (#2851-#2857)#2869

Merged
bug-ops merged 3 commits intomainfrom
profiling-deep-instrumentation
Apr 10, 2026
Merged

feat(profiling): implement Phase 2 deep instrumentation (#2851-#2857)#2869
bug-ops merged 3 commits intomainfrom
profiling-deep-instrumentation

Conversation

@bug-ops
Copy link
Copy Markdown
Owner

@bug-ops bug-ops commented Apr 10, 2026

Summary

Phase 2 of the profiling and tracing system (epic #2846). Adds spans to all 5 subsystem crates, introduces InstrumentedChannel<T> wrappers for channel-level metrics, and implements MetricsBridge — a custom tracing layer that derives TurnTimings from span durations.

  • 45+ new span sites across zeph-memory, zeph-tools, zeph-skills, zeph-mcp, zeph-channels
  • InstrumentedSender<T> / InstrumentedReceiver<T> / InstrumentedUnboundedSender<T> (zero-cost when profiling disabled)
  • MetricsBridge layer with on_enter/on_exit accumulation for correct async timing
  • All instrumentation feature-gated: zero binary overhead without --features profiling
  • Security audit: 0 findings — all 45+ spans use skip_all, no PII in span fields

Closes

Closes #2851, #2852, #2853, #2854, #2855, #2856, #2857
Part of #2846

Test plan

  • cargo +nightly fmt --check — PASS
  • cargo clippy --workspace --features profiling -- -D warnings — PASS
  • cargo build --workspace (no profiling) — PASS
  • cargo build --workspace --features profiling — PASS
  • cargo nextest run --workspace --features profiling --lib --bins — 7811 passed, 17 skipped
  • Live session: start agent with [telemetry] enabled = true, backend = "local", verify spans visible in Perfetto UI (see .local/testing/playbooks/profiling.md)

@github-actions github-actions Bot added documentation Improvements or additions to documentation skills zeph-skills crate memory zeph-memory crate (SQLite) channels zeph-channels crate (Telegram) rust Rust code changes core zeph-core crate dependencies Dependency updates enhancement New feature or request size/XL Extra large PR (500+ lines) labels Apr 10, 2026
Add profiling spans to all 5 subsystem crates and new core infrastructure
for channel metrics and span-derived timings.

Subsystem spans (all feature-gated, zero overhead when profiling disabled):
- zeph-memory: memory.remember, memory.recall, memory.summarize,
  memory.graph_extract, memory.persona_extract, memory.cross_session,
  memory.vector_store (BoxFuture via .instrument()), memory.admission,
  memory.compaction_probe, memory.forgetting, memory.consolidation
- zeph-tools: tool.execute_call, tool.shell, tool.file, tool.web_scrape,
  tool.search_code, tool.diagnostics; send_chunk excluded (hot path)
- zeph-skills: skill.registry_load, skill.match, skill.matcher_sync,
  skill.qdrant_match, skill.generate, skill.hot_reload, skill.bm25_search
- zeph-mcp: mcp.connect, mcp.connect_url, mcp.list_tools, mcp.call_tool,
  mcp.shutdown, mcp.tool_refresh, mcp.connect_all, mcp.shutdown_all
- zeph-channels: channel.cli.recv/send, channel.telegram.recv/send

New InstrumentedChannel infrastructure (zeph-core):
- InstrumentedSender<T> / InstrumentedReceiver<T> (bounded mpsc)
- InstrumentedUnboundedSender<T> (unbounded, sent count only)
- Applied to skill_reload_rx and config_reload_rx in bootstrap
- AtomicU64 counters, queue depth sampled every 16th send

New MetricsBridge tracing layer (zeph-core):
- Custom Layer using on_enter/on_exit accumulation for correct async timing
- Watches agent.prepare_context, llm.chat, agent.tool_loop, agent.persist_message
- Derives TurnTimings from span durations, replaces manual Instant::now()
  bookkeeping when profiling is active
- on_new_span uses attrs.metadata().name() (zero-cost) before registry lock

Closes #2851, #2852, #2853, #2854, #2855, #2856, #2857
@bug-ops bug-ops force-pushed the profiling-deep-instrumentation branch from fa0f127 to c9d0544 Compare April 10, 2026 21:49
@bug-ops bug-ops enabled auto-merge (squash) April 10, 2026 21:49
@bug-ops bug-ops merged commit b2e7df0 into main Apr 10, 2026
30 checks passed
@bug-ops bug-ops deleted the profiling-deep-instrumentation branch April 10, 2026 22:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

channels zeph-channels crate (Telegram) core zeph-core crate dependencies Dependency updates documentation Improvements or additions to documentation enhancement New feature or request memory zeph-memory crate (SQLite) rust Rust code changes size/XL Extra large PR (500+ lines) skills zeph-skills crate

Projects

None yet

1 participant