test: improve tracing span attribution and add span regression tests #264

Merged
l50 merged 10 commits into main from feat/telemetry-tracing-test-regressions
May 8, 2026

l50 commented May 7, 2026

Key Changes:

  • Enhanced tracing span attribution with proper op.id and task.id separation
  • Instrumented automation and dispatcher task submission with correlated spans
  • Added comprehensive regression tests to verify span emission and attributes
  • Introduced tracing-test and custom span capture utilities for testing

Added:

  • Regression tests for span emission and correctness in ares-llm/tests/span_regressions.rs
  • Common test helpers and a custom SpanCapture tracing subscriber in ares-llm/tests/common/
  • tracing-test and tracing-subscriber dependencies for enhanced test coverage and span capture

Changed:

  • Instrumented all automation dispatcher task builder methods in ares-cli with #[instrument] to emit detailed spans for each automation action
  • Updated throttled_submit and submission flow in ares-cli to record decision and task IDs in spans, ensuring correlation across async boundaries
  • Modified spawn_automation_tasks to wrap spawned automation tasks in their own root spans with automation.kind for better trace correlation in Tempo
  • Enhanced telemetry span builders in ares-core to support both op.id and task.id attributes and propagate them throughout the operation and agent task lifecycle
  • Refactored agent loop in ares-llm to emit a single parent span per agent task, with all child spans inheriting correct operation and task context
  • Updated all internal tracing helpers in ares-core to accept and propagate both operation_id and task_id
  • Bumped actions/upload-artifact and actions/setup-go versions in GitHub Actions workflows for security and compatibility
  • Updated several dependencies in Cargo.lock, including windows-sys, and added tracing-related crates for test instrumentation

Removed:

  • Orphaned parent context for spawned automation tasks—now every automation task and dispatcher submission is properly correlated in tracing backends

l50 and others added 5 commits May 7, 2026 11:51
The runner emitted no parent span for an agent loop run, so tool spans
landed as orphan siblings in Tempo: a single failing task could not be
recovered as one trace, only as a fan of unrelated spans. Worse, four
call sites passed `task_id` into the helpers' `operation_id` parameter
(runner.rs:311 et al.), so the `attack_operation_id` field on tool
spans actually held a task ID — silently breaking any dashboard that
filtered by operation.

- Open `agent.loop` once at the top of run_agent_loop, carrying op.id,
  task.id, agent.role, agent.model. Move the existing body into
  run_agent_loop_inner and instrument the future with that span; every
  child span now inherits the parent.
- Add `task_id` to AgentSpanBuilder + the four trace_* helpers and
  emit `op.id` / `task.id` alongside the existing
  `attack_operation_id` field for back-compat with current dashboards.
- Fix the four runner.rs sites that were passing task_id where
  operation_id was expected, and pass real op.id and task.id to the
  worker discovery span and the orchestrator domain-admin span.
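
The mix-up fixed above (a task ID passed where an operation ID was expected) presumably compiled because both IDs share one type. The following is a minimal sketch, not the PR's code: it assumes hypothetical `OperationId` and `TaskId` newtypes and models span fields as plain name/value pairs using only the standard library rather than `tracing`. It shows the three fields the commit describes (`op.id`, `task.id`, and the back-compat `attack_operation_id`) and how distinct types would make swapping the two arguments a compile error.

```rust
/// Hypothetical newtype for an operation ID (not part of the PR).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct OperationId(pub String);

/// Hypothetical newtype for a task ID (not part of the PR).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TaskId(pub String);

/// Builds the ID fields for a span as (name, value) pairs.
/// `attack_operation_id` mirrors `op.id` so existing dashboards keep working.
/// Swapping the two arguments does not type-check.
pub fn span_id_fields(op: &OperationId, task: &TaskId) -> Vec<(&'static str, String)> {
    vec![
        ("op.id", op.0.clone()),
        ("task.id", task.0.clone()),
        ("attack_operation_id", op.0.clone()),
    ]
}

/// Looks up a field value by name, returning `None` when it is absent.
pub fn field<'a>(fields: &'a [(&'static str, String)], name: &str) -> Option<&'a str> {
    fields
        .iter()
        .find(|(k, _)| *k == name)
        .map(|(_, v)| v.as_str())
}
```

A regression check in this model asserts that `attack_operation_id` equals the operation ID and never the task ID, which is the property the broken call sites violated.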

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…tokens

Without per-call attribution there was no way to tell whether a slow
agent loop was burning time inside provider.chat (network/LLM) or
between calls (tool dispatch, queue waits). Token spend was visible
only on the session log, not in Tempo traces.

Open one `llm.call` span per retry attempt around `provider.chat`.
After the call returns, record duration_ms plus the four token-usage
counters and stop_reason as span attributes; on error, record the
formatted error. Each retry gets its own span so a 429 backoff does
not inflate the duration attributed to the eventual successful call.
The span also carries `task.id`, `llm.model`, `llm.attempt`, and the
request shape (tool/message counts) so Tempo queries can isolate
slow calls without joining other spans.
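
The per-attempt timing described above can be modelled without any tracing dependency. This is a hedged sketch under stated assumptions: `AttemptRecord` and `call_with_retries` are hypothetical names standing in for the `llm.call` span and the retry loop, and the real code records these values as span attributes instead of returning them. The point it illustrates is that each attempt gets its own timer, so time spent in backoff is not attributed to any attempt.

```rust
use std::time::{Duration, Instant};

/// One record per call attempt, standing in for one `llm.call` span.
#[derive(Debug)]
pub struct AttemptRecord {
    pub attempt: u32,
    pub duration: Duration,
    pub error: Option<String>,
}

/// Calls `call` up to `max_attempts` times and times each attempt separately.
/// `backoff` runs between attempts, outside every attempt's timer.
pub fn call_with_retries<T, E, F, B>(
    max_attempts: u32,
    mut call: F,
    mut backoff: B,
) -> (Option<T>, Vec<AttemptRecord>)
where
    E: std::fmt::Display,
    F: FnMut(u32) -> Result<T, E>,
    B: FnMut(u32),
{
    let mut records = Vec::new();
    for attempt in 1..=max_attempts {
        // The timer covers only the call itself.
        let started = Instant::now();
        let outcome = call(attempt);
        let duration = started.elapsed();
        match outcome {
            Ok(value) => {
                records.push(AttemptRecord { attempt, duration, error: None });
                return (Some(value), records);
            }
            Err(e) => {
                // Record the formatted error on the failed attempt.
                records.push(AttemptRecord { attempt, duration, error: Some(e.to_string()) });
                if attempt < max_attempts {
                    backoff(attempt);
                }
            }
        }
    }
    (None, records)
}
```

With a call that fails twice and then succeeds, this yields three records: two carrying the error text and one with `error: None`, each with its own duration.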

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… and task builders

**Added:**

- Added `tracing::info_span` and `.instrument()` to automation task spawning for correlating spans by `automation.kind` in `automation_spawner.rs`
- Added `tracing::info_span` to `Dispatcher::throttled_submit` and propagated span context to inner calls for dispatch, throttle, and defer decisions in `submission.rs`
- Added detailed span field recording for throttle decisions and task IDs in `submission.rs`
- Added `#[instrument]` attribute with relevant fields to all public task request methods in `task_builders.rs` to improve observability of automation requests

**Changed:**

- Refactored task submission logic in `Dispatcher` to utilize structured spans, enabling improved tracing and debugging of automation task lifecycles
- Updated imports to include necessary `tracing` macros and types in affected modules

**Added:**

- Added `tracing-test` and `tracing-subscriber` as dev dependencies in `ares-llm/Cargo.toml` and updated `Cargo.lock`
- Introduced `ares-llm/tests/common/mod.rs` and `span_capture.rs` providing a test-only tracing layer and helpers to capture and inspect spans for integration testing
- Added `ares-llm/tests/span_regressions.rs` with regression tests to verify the presence and correctness of tracing spans and their attributes in the agent loop, covering op/task ID separation, token usage, error reporting, and retry attempt span emission
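
The capture-and-inspect pattern behind these tests can be sketched independently of `tracing-subscriber`. The code below is an illustration only, not the PR's `SpanCapture`: `CapturedSpan` and `CaptureStore` are hypothetical names, and spans are recorded by an explicit call, not by a subscriber layer. It shows the shape such a helper typically takes: a shared buffer that the code under test writes to and that the test reads back by span name.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, Mutex};

/// A recorded span: its name plus string-valued fields (hypothetical type).
#[derive(Debug, Clone)]
pub struct CapturedSpan {
    pub name: String,
    pub fields: BTreeMap<String, String>,
}

/// Thread-safe store of captured spans; clones share the same buffer.
#[derive(Debug, Clone, Default)]
pub struct CaptureStore {
    spans: Arc<Mutex<Vec<CapturedSpan>>>,
}

impl CaptureStore {
    /// Appends one span with the given fields.
    pub fn record(&self, name: &str, fields: &[(&str, &str)]) {
        let fields = fields
            .iter()
            .map(|(k, v)| (k.to_string(), v.to_string()))
            .collect();
        self.spans
            .lock()
            .unwrap()
            .push(CapturedSpan { name: name.to_string(), fields });
    }

    /// Returns copies of every captured span with the given name.
    pub fn named(&self, name: &str) -> Vec<CapturedSpan> {
        self.spans
            .lock()
            .unwrap()
            .iter()
            .filter(|s| s.name == name)
            .cloned()
            .collect()
    }
}
```

A test keeps one handle, passes a clone to the code being exercised, and then asserts on `named("llm.call")` or similar to check both that the span was emitted and that its attributes hold the expected values.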

**Changed:**

- Updated `windows-sys` dependency from 0.48.0 to 0.61.2 in `Cargo.lock`

codecov Bot commented May 7, 2026

Codecov Report

❌ Patch coverage is 61.53846% with 50 lines in your changes missing coverage. Please review.
✅ Project coverage is 75.29%. Comparing base (205ae6f) to head (763eb08).
⚠️ Report is 5 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ares-cli/src/orchestrator/dispatcher/submission.rs | 0.00% | 41 Missing ⚠️ |
| ares-cli/src/orchestrator/automation_spawner.rs | 0.00% | 3 Missing ⚠️ |
| ares-llm/src/agent_loop/runner.rs | 94.00% | 3 Missing ⚠️ |
| ares-core/src/telemetry/spans/helpers.rs | 84.61% | 2 Missing ⚠️ |
| ares-cli/src/worker/tool_executor.rs | 0.00% | 1 Missing ⚠️ |

@@            Coverage Diff             @@
##             main     #264      +/-   ##
==========================================
+ Coverage   75.11%   75.29%   +0.17%     
==========================================
  Files         383      384       +1     
  Lines       81492    81639     +147     
==========================================
+ Hits        61215    61470     +255     
+ Misses      20277    20169     -108     
| Files with missing lines | Coverage | Patch | Δ |
|---|---|---|---|
| ...s-cli/src/orchestrator/dispatcher/task_builders.rs | 0.00% | ø | ø |
| ...li/src/orchestrator/state/publishing/milestones.rs | 94.89% | 100.00% | +0.07% ⬆️ |
| ares-core/src/telemetry/spans/builder.rs | 86.76% | 100.00% | +0.71% ⬆️ |
| ares-core/src/telemetry/spans/mod.rs | 95.83% | 100.00% | +0.43% ⬆️ |
| ares-cli/src/worker/tool_executor.rs | 53.02% | 0.00% | -0.13% ⬇️ |
| ares-core/src/telemetry/spans/helpers.rs | 75.86% | 84.61% | +0.86% ⬆️ |
| ares-cli/src/orchestrator/automation_spawner.rs | 0.00% | 0.00% | ø |
| ares-llm/src/agent_loop/runner.rs | 62.92% | 94.00% | +2.88% ⬆️ |
| ares-cli/src/orchestrator/dispatcher/submission.rs | 0.00% | 0.00% | ø |

... and 3 files with indirect coverage changes


l50 added 5 commits May 7, 2026 20:52
**Changed:**

- Reformatted the `info_span!` call in `spawn_automation_tasks` so that its parameters sit on a single line, for consistency and readability
…and agent loop

**Changed:**

- Simplified the `info_span!` invocation in `automation_spawner.rs` by removing unnecessary line breaks
- Updated several calls in `runner.rs` to pass `op_id` directly instead of by reference, removing needless borrows
- Rewrapped lines in `span_capture.rs` for readability, with no change to logic

**Added:**

- Introduced `codecov.yml` to configure coverage thresholds, status checks, and comment behavior for automated code coverage reporting in CI
…mentation' into feat/telemetry-tracing-test-regressions
l50 merged commit 8f68fa1 into main May 8, 2026
11 checks passed
l50 deleted the feat/telemetry-tracing-test-regressions branch May 8, 2026 03:41
l50 added a commit that referenced this pull request May 9, 2026
…264)
l50 added a commit that referenced this pull request May 9, 2026
…264)