
test: add tests for token usage extraction in blue and red worker agents #152

Merged
l50 merged 5 commits into main from chore/cleanup-and-fixes
Mar 17, 2026

l50 commented Mar 17, 2026

Key Changes:

  • Added comprehensive tests for token usage extraction in blue and red worker agents
  • Verified correct inclusion of usage metrics in result payloads for various scenarios
  • Ensured graceful handling of edge cases such as missing or malformed usage data
  • Improved test coverage for new metrics/tracing features

Added:

  • BlueRedisWorkerAgent usage extraction tests:
    • Added TestBlueRedisWorkerAgentUsageExtraction test class to
      tests/core/blue_worker/test_redis_worker.py with tests for:
      • Including usage metrics in normal and partial results
      • Handling of missing or large usage values
      • Correct extraction and result payload formatting
  • RedisWorkerAgent usage extraction tests:
    • Added TestRedisWorkerAgentUsageExtraction test class to
      tests/core/worker/test_worker.py covering:
      • Extraction of usage metrics from agent results with various token values
      • Handling of absent or exception-throwing usage attributes
      • Verification of default and edge cases for token counting
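
The edge cases listed above (normal values, absent usage, a usage attribute that raises, very large counts) can be sketched as a small standalone test case. This is a minimal illustration and not the code in this PR: `usage_or_none`, `RaisingResult`, and `UsageExtractionSketch` are hypothetical names, and the sketch assumes the agent result exposes a `usage` attribute carrying `input_tokens`, `output_tokens`, and `total_tokens`.

```python
import unittest
from types import SimpleNamespace


def usage_or_none(agent_result):
    """Hypothetical stand-in for the workers' extraction step."""
    try:
        usage = agent_result.usage
    except Exception:
        # Covers both a missing attribute and a property that raises.
        return None
    if usage is None:
        return None
    input_tokens = getattr(usage, "input_tokens", 0) or 0
    output_tokens = getattr(usage, "output_tokens", 0) or 0
    # Assumption: derive the total when the result does not report one.
    total_tokens = getattr(usage, "total_tokens", 0) or (input_tokens + output_tokens)
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_tokens": total_tokens,
    }


class RaisingResult:
    """Agent result whose usage attribute raises when accessed."""

    @property
    def usage(self):
        raise RuntimeError("usage unavailable")


class UsageExtractionSketch(unittest.TestCase):
    def test_normal_usage(self):
        usage = SimpleNamespace(input_tokens=120, output_tokens=30, total_tokens=150)
        self.assertEqual(
            usage_or_none(SimpleNamespace(usage=usage)),
            {"input_tokens": 120, "output_tokens": 30, "total_tokens": 150},
        )

    def test_missing_usage(self):
        self.assertIsNone(usage_or_none(SimpleNamespace()))
        self.assertIsNone(usage_or_none(SimpleNamespace(usage=None)))

    def test_raising_usage(self):
        self.assertIsNone(usage_or_none(RaisingResult()))

    def test_large_values(self):
        big = 10**12
        usage = SimpleNamespace(input_tokens=big, output_tokens=big, total_tokens=2 * big)
        self.assertEqual(usage_or_none(SimpleNamespace(usage=usage))["total_tokens"], 2 * big)
```

Saved as a test module, the sketch runs under `python -m unittest` or pytest. The real tests in this PR exercise the worker agents through mocks and do not call a helper like this one directly.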

Changed:

  • Expanded test files to cover new code paths introduced by metrics/tracing changes
  • Improved mocking and assertions to validate presence, absence, and correctness of
    usage fields in agent results
  • Enhanced docstrings and comments in new tests for clarity and maintainability

…er agents

**Added:**

- Added extraction of token usage metrics (input_tokens, output_tokens, total_tokens)
  from agent results in BlueRedisWorkerAgent, BlueWorkerAgent, RedisWorkerAgent,
  and WorkerAgent. These metrics are included in the result payload for
  downstream aggregation and tracing.
- Instrumented OpenTelemetry/Tempo spans with GenAI semantic conventions for
  token usage in worker agents.
- Introduced unit tests in both `tests/core/blue_worker/test_redis_worker.py`
  and `tests/core/worker/test_worker.py` to verify token usage extraction and
  result payload inclusion for both blue and red worker agents.
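
As a rough illustration of the extraction and payload steps described above, the sketch below reads the three counters from an agent result, attaches them to a result payload only when they are present, and maps them to span attributes. Everything here is an assumption made for illustration: the names `TokenUsage`, `extract_usage`, `build_result_payload`, and `genai_span_attributes`, the payload keys `task_id`, `output`, and `usage`, and the shape of the agent result. The attribute keys `gen_ai.usage.input_tokens` and `gen_ai.usage.output_tokens` come from the OpenTelemetry GenAI semantic conventions; which attributes the workers actually set is not shown in this PR text.

```python
from dataclasses import asdict, dataclass
from types import SimpleNamespace
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class TokenUsage:
    input_tokens: int = 0
    output_tokens: int = 0
    total_tokens: int = 0


def _non_negative_int(obj: Any, name: str) -> int:
    """Return obj.<name> if it is a non-negative int, otherwise 0."""
    value = getattr(obj, name, 0)
    if isinstance(value, int) and not isinstance(value, bool) and value >= 0:
        return value
    return 0


def extract_usage(agent_result: Any) -> Optional[TokenUsage]:
    """Hypothetical helper: read usage from a result, or return None."""
    try:
        usage = agent_result.usage
    except Exception:
        # Missing attribute or a property that raises: treat as "no usage".
        return None
    if usage is None:
        return None
    input_tokens = _non_negative_int(usage, "input_tokens")
    output_tokens = _non_negative_int(usage, "output_tokens")
    # Assumption: fall back to the sum when no total is reported.
    total_tokens = _non_negative_int(usage, "total_tokens") or (input_tokens + output_tokens)
    return TokenUsage(input_tokens, output_tokens, total_tokens)


def build_result_payload(task_id: str, output: str, agent_result: Any) -> Dict[str, Any]:
    """Include a 'usage' entry only when usage could be extracted."""
    payload: Dict[str, Any] = {"task_id": task_id, "output": output}
    usage = extract_usage(agent_result)
    if usage is not None:
        payload["usage"] = asdict(usage)
    return payload


def genai_span_attributes(usage: TokenUsage) -> Dict[str, int]:
    """Map usage to OpenTelemetry GenAI semantic-convention attribute keys."""
    return {
        "gen_ai.usage.input_tokens": usage.input_tokens,
        "gen_ai.usage.output_tokens": usage.output_tokens,
    }


# Example with a stand-in agent result.
example_result = SimpleNamespace(
    usage=SimpleNamespace(input_tokens=10, output_tokens=4, total_tokens=14)
)
example_payload = build_result_payload("task-1", "done", example_result)
```

With a real tracer, each entry of the returned dict could be passed to the span's `set_attribute` call. Returning a plain dict keeps the sketch free of an OpenTelemetry dependency.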

**Changed:**

- Updated blue and red worker agent task processing logic to check for agent
  result usage metrics and include them in the result payload, whether the task
  completes via callback or partial completion.
- Improved docstrings for helper and internal methods throughout the codebase to
  clarify argument and return types, increasing maintainability and clarity.
- Refactored the evaluation workflow to aggregate token usage statistics from
  completed task results when available, falling back to estimation only if no
  usage data is present.
- Enhanced logging and metrics reporting for token usage in orchestrators and
  workers.
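
The aggregate-then-fall-back behaviour described for the evaluation workflow might look roughly like the sketch below. It is not the workflow's actual code: `aggregate_usage`, `estimate_tokens`, the `estimated` flag, the `usage` and `output` payload keys, and the four-characters-per-token heuristic are all assumptions chosen for illustration.

```python
from typing import Any, Dict, Iterable, Mapping

USAGE_KEYS = ("input_tokens", "output_tokens", "total_tokens")


def estimate_tokens(text: str) -> int:
    """Crude estimate (assumption): about four characters per token."""
    return len(text) // 4


def aggregate_usage(task_results: Iterable[Mapping[str, Any]]) -> Dict[str, Any]:
    """Sum reported usage; estimate only if no result reports any."""
    results = list(task_results)  # allow a second pass for the fallback
    totals = {key: 0 for key in USAGE_KEYS}
    found_usage = False

    for result in results:
        usage = result.get("usage")
        if not isinstance(usage, Mapping):
            continue  # this task reported no usage data
        found_usage = True
        for key in USAGE_KEYS:
            totals[key] += int(usage.get(key, 0) or 0)

    if found_usage:
        return {**totals, "estimated": False}

    # Fallback: no task carried usage data, so estimate from output text.
    estimate = sum(estimate_tokens(str(r.get("output", ""))) for r in results)
    return {
        "input_tokens": 0,
        "output_tokens": estimate,
        "total_tokens": estimate,
        "estimated": True,
    }
```

Marking the result with an `estimated` flag is one way to let downstream reporting tell measured totals from guessed ones.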

**Removed:**

- Removed redundant or outdated comments where code and docstrings are now
  self-explanatory.
- Eliminated placeholder or non-functional code for attack chain reconstruction
  and other "future work" comments, replacing them with concise markers for
  unimplemented features.
l50 added 4 commits March 17, 2026 12:24
**Changed:**

- Removed explanatory comments that repeated obvious code logic in several modules,
  including agent initialization, prompt building, and data extraction routines
- Improved code readability and maintainability by reducing noise and focusing
  on non-obvious operations
**Changed:**

- Removed explanatory and redundant inline comments that simply restate code logic
  across multiple modules, including blue agent tools, investigation orchestration,
  CLI commands, core models, recovery routines, task queue, tracing, workflows,
  evaluation logic, and blue team tools
- Improved code readability and reduced noise by eliminating comments that do not
  provide additional context or rationale beyond what the code already expresses
**Changed:**

- Removed obsolete commented-out code across multiple modules, along with
  comments that merely restate the code that follows or describe obvious
  steps, to reduce visual clutter and improve readability and
  maintainability.
l50 merged commit 9e98a77 into main Mar 17, 2026
7 checks passed
l50 deleted the chore/cleanup-and-fixes branch March 17, 2026 19:02