feat: add agent decision reasoning extraction and tracing for tool selection #153
Merged
**Added:**
- Decision reasoning extraction helpers (`_extract_reasoning_text`, `_estimate_confidence`) for capturing agent LLM reasoning behind tool selection
- Hook to log and persist agent tool selection decisions with reasoning, confidence, and OTel span correlation for traceability
- TimelineEvent `extra_data_json` field to store structured metadata (e.g., decision details) and `extra_data` property for deserialization
- `trace_decision` function to record agent decision spans with tool, reasoning, confidence, and MITRE/attack category enrichment for Tempo/OTel
- Multi-forest helpers to track DA status per trusted domain/forest (`all_forests_dominated`, `get_undominated_forests`) for red team ops
- Tests for decision reasoning helpers and decision tracing

**Changed:**
- Role-specific agent instruction loading now injects multi-forest context (`multi_forest_mode`, `undominated_forests`) for template rendering
- Timeline event serialization now consistently uses `to_dict()` and ensures ISO timestamp formatting for backend storage
- DA hash acquisition now updates per-domain DA tracking for multi-forest ops and only sets the global DA flag on first acquisition
- TimelineEvent `to_dict()` now outputs `extra_data` as a dict for storage if present

**Removed:**
- Legacy manual TimelineEvent dict construction in publishing, replaced with `to_dict()` for consistency
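The `extra_data_json` / `extra_data` / `to_dict()` behaviour described above can be sketched as follows. This is a minimal illustration, not the project's model: the `TimelineEvent` class here is a hypothetical stand-in, and the `description` and `timestamp` fields are assumptions made only so the example runs.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Optional


@dataclass
class TimelineEvent:
    """Hypothetical stand-in for the real TimelineEvent (fields are assumed)."""

    description: str
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    # Structured metadata (e.g. decision details) stored as a JSON string.
    extra_data_json: Optional[str] = None

    @property
    def extra_data(self) -> dict[str, Any]:
        """Deserialize extra_data_json; return {} when unset or not a JSON object."""
        if not self.extra_data_json:
            return {}
        try:
            loaded = json.loads(self.extra_data_json)
        except json.JSONDecodeError:
            return {}
        return loaded if isinstance(loaded, dict) else {}

    def to_dict(self) -> dict[str, Any]:
        """Serialize for storage: ISO timestamp, extra_data as a dict if present."""
        out: dict[str, Any] = {
            "description": self.description,
            "timestamp": self.timestamp.isoformat(),
        }
        if self.extra_data:
            out["extra_data"] = self.extra_data
        return out
```

Keeping the stored field as a JSON string while exposing a dict through a property lets every publisher call `to_dict()` instead of building the dict by hand, which is the consistency goal stated in the Removed entry.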
**Added:**
- Introduced `multi_forest_mode` option to operation configuration to allow operations to continue until domain admin is achieved on all trusted forests
- Added `get_multi_forest_mode()` function to access the new config option

**Changed:**
- Updated internal config builder to support the new `multi_forest_mode` field

**Removed:**
- None
…est environments

**Changed:**
- Updated SharedRedTeamState logic to ensure domain admin achievement is only logged, weaknesses added, and OpenTelemetry traces emitted once per domain, supporting multi-forest scenarios and preventing duplicate signals for the same domain (`src/ares/core/models.py`)
- Improved unit test to verify that traces for domain admin achievement are emitted once for each unique domain, and not duplicated for repeated hashes from the same domain (`tests/core/dispatcher/test_dispatcher.py`)
- Fixed test monkeypatch for loading agent instructions to accept generic kwargs, preventing argument errors in agent factory tests (`tests/core/factories/test_red_agents.py`)
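The once-per-domain rule and the multi-forest helpers can be sketched roughly as below. `ForestState`, `record_da_hash`, `emitted`, and `should_continue` are hypothetical names for illustration; only `all_forests_dominated`, `get_undominated_forests`, and `multi_forest_mode` come from the PR text, and their real signatures may differ.

```python
from dataclasses import dataclass, field


@dataclass
class ForestState:
    """Hypothetical, simplified stand-in for SharedRedTeamState."""

    trusted_domains: set[str] = field(default_factory=set)
    dominated_domains: set[str] = field(default_factory=set)
    has_domain_admin: bool = False  # global flag, set on first acquisition only
    emitted: list[str] = field(default_factory=list)  # stands in for logs/traces

    def record_da_hash(self, domain: str) -> bool:
        """Record a DA hash; return True only the first time a domain is seen."""
        key = domain.lower()
        if key in self.dominated_domains:
            return False  # repeated hash from the same domain: no new signal
        self.dominated_domains.add(key)
        if not self.has_domain_admin:
            self.has_domain_admin = True
        self.emitted.append(key)  # one log/weakness/trace per unique domain
        return True

    def get_undominated_forests(self) -> list[str]:
        """Trusted domains without DA yet, sorted for stable output."""
        return sorted(
            d.lower()
            for d in self.trusted_domains
            if d.lower() not in self.dominated_domains
        )

    def all_forests_dominated(self) -> bool:
        # Note: trivially True when no trusted domains are registered.
        return not self.get_undominated_forests()


def should_continue(state: ForestState, multi_forest_mode: bool) -> bool:
    """Assumed stop rule: keep going until DA, or until all forests in multi-forest mode."""
    if not state.has_domain_admin:
        return True
    return multi_forest_mode and not state.all_forests_dominated()
```

Comparing domains case-insensitively is a choice made for this sketch; the PR does not say how the real code normalizes domain names.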
Key Changes:

Added:
- `_extract_reasoning_text` and `_estimate_confidence` helpers to parse LLM responses and estimate confidence heuristically
- Hook in `create_role_hooks` to capture agent reasoning and tool selection, storing results in both the operation timeline and Redis for crash recovery
- `trace_decision`, recording tool selection decisions and reasoning for post-hoc analysis
- TimelineEvent `extra_data_json` field for structured, JSON-encoded decision metadata
- Helpers on `SharedRedTeamState` to identify and track undominated forests for multi-domain operations
- Tests in `test_red_agents.py` and `test_tracing.py`

Changed:
- Timeline event serialization now goes through `to_dict()`, with consistent timestamp formatting and `extra_data` handling

Removed:
- Legacy manual TimelineEvent dict construction, replaced with a `to_dict()` call for consistency
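The summary does not show how the two helpers work, so the following is only one way such parsing and heuristic scoring could look. The function names drop the leading underscore to mark them as hypothetical, and the response shape (a string, or a list of blocks with `type` and `text` keys), the keyword lists, and the 0.1 step size are all assumptions.

```python
import re

# Assumed keyword lists; the PR does not say which signals it uses.
_HEDGES = ("might", "maybe", "possibly", "perhaps", "not sure", "unclear")
_ASSERTIVE = ("definitely", "clearly", "confirmed", "certain")


def extract_reasoning_text(content) -> str:
    """Collect free-text reasoning from a response (str or list of blocks)."""
    if isinstance(content, str):
        return content.strip()
    parts = []
    for block in content or []:
        if isinstance(block, str):
            parts.append(block)
        elif isinstance(block, dict) and block.get("type") == "text":
            parts.append(str(block.get("text", "")))
        # Other block types (e.g. tool calls) carry no reasoning text.
    return "\n".join(p.strip() for p in parts if p.strip())


def _count_hits(text: str, phrases) -> int:
    """Count phrases present as whole words, so 'uncertain' != 'certain'."""
    return sum(
        1 for p in phrases if re.search(r"\b" + re.escape(p) + r"\b", text)
    )


def estimate_confidence(reasoning: str) -> float:
    """Heuristic score in [0, 1]: start at 0.5, shift by wording cues."""
    if not reasoning:
        return 0.0
    text = reasoning.lower()
    score = 0.5
    score += 0.1 * _count_hits(text, _ASSERTIVE)
    score -= 0.1 * _count_hits(text, _HEDGES)
    return round(max(0.0, min(1.0, score)), 2)
```

A keyword heuristic like this is cheap and deterministic, but it measures wording rather than correctness, so the resulting value is better read as a tracing annotation than as a calibrated probability.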