feat: add red-blue correlation engine and learning system for investigation coverage#26

Merged
l50 merged 6 commits into main from
jayson/cap-822-add-aws-ssm-remote-execution-enhance-bluered-tooling
Jan 11, 2026

l50 commented Jan 11, 2026

Key Changes:

  • Introduced a red-blue correlation engine to assess detection coverage and gaps
  • Added investigation result persistence and learning system with SQLite backend
  • Implemented learning tools for querying past investigations and query effectiveness
  • Enhanced blue agent workflow with query rate limiting, improved evidence handling, and robust timeouts

Added:

  • Red-Blue Correlation Engine: New src/ares/core/correlation.py parses red and blue reports, correlates activities to detections, generates gap/coverage reports, and outputs markdown
  • Investigation Persistence: src/ares/core/persistence.py provides SQLite-backed storage for investigation results, query effectiveness stats, and similarity lookup
  • Learning Tools: src/ares/tools/blue/learning.py exposes tools for querying historical investigations, effective queries, and false positive patterns for agent learning
  • Query Resilience: src/ares/core/query_resilience.py adds automatic retry, time range reduction, and chunking for large log queries
  • Remote Command Execution: src/ares/core/remote.py enables AWS SSM-based remote execution for red team tools, with robust SSO credential validation
  • Comprehensive red-blue correlation and learning tests: tests/test_correlation.py, tests/test_persistence.py, tests/test_learning.py, tests/test_query_resilience.py
  • Query template tools: src/ares/tools/blue/query_templates.py provides pre-built LogQL queries mapped to MITRE techniques
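
As a rough illustration of the correlation idea (not the code in `src/ares/core/correlation.py`), the sketch below matches red team activities to blue team detections by MITRE technique ID and reports coverage and gaps. The `RedActivity`, `BlueDetection`, `correlate`, and `to_markdown` names, and the match-by-technique rule itself, are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedActivity:
    """One red team action (hypothetical shape)."""
    technique_id: str
    description: str


@dataclass(frozen=True)
class BlueDetection:
    """One blue team finding (hypothetical shape)."""
    technique_id: str
    alert_name: str


def correlate(activities, detections):
    """Split activities into detected and missed, keyed on technique ID."""
    detected_ids = {d.technique_id for d in detections}
    detected = [a for a in activities if a.technique_id in detected_ids]
    gaps = [a for a in activities if a.technique_id not in detected_ids]
    coverage = len(detected) / len(activities) if activities else 0.0
    return {"detected": detected, "gaps": gaps, "coverage": coverage}


def to_markdown(result):
    """Render the correlation result as a small markdown report."""
    lines = [f"Coverage: {result['coverage']:.0%}", "", "Gaps:"]
    lines += [f"- {a.technique_id}: {a.description}" for a in result["gaps"]]
    return "\n".join(lines)
```

A real engine would probably also weigh timestamps and hosts when matching; technique ID alone keeps the sketch short.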

Changed:

  • Blue agent now enforces strict query rate limiting (default 5 per investigation), duplicate query detection, and improved evidence extraction
  • Investigation orchestrator adds watchdog thread for hard timeouts and generates partial reports on timeout
  • Taskfile and documentation updated for new log/coverage workflow, reduced default max steps, and log management commands
  • Agent system instructions and investigation prompt templates improved for IOC extraction and anti-loop guidance
  • Red team agent tools now execute on remote Kali via SSM, with robust error handling and output parsing
  • Blue agent now records executed queries and integrates query effectiveness into persistence/learning
  • Added boto3 as a required dependency

Removed:

  • Legacy local subprocess usage for red team tools; all execution now via remote SSM executor
  • Unused aiobotocore/aioitertools dependencies from lockfile to resolve S3 compatibility

l50 added 5 commits January 9, 2026 19:19
…n query templates

**Added:**

- Introduced `src/ares/core/remote.py` for remote command execution on the Kali attack box via AWS SSM, including SSO credential validation, error handling, and a `run_remote` convenience function
- Added `QueryTemplateTools` to `src/ares/tools/blue/query_templates.py`, providing MITRE-mapped LogQL query templates for detecting red team attack patterns and AD attacks
- Registered `QueryTemplateTools` in blue team toolset and included in agent factory for investigation agent
- Added `boto3>=1.42.25` as a dependency for AWS API integration
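
For readers unfamiliar with SSM, here is a minimal sketch of what a `run_remote`-style helper could look like. It is not the code in `src/ares/core/remote.py`: the signature, the polling loop, and the return shape are assumptions. Only the boto3 calls `send_command` and `get_command_invocation` and the `AWS-RunShellScript` document are standard SSM pieces. SSO credential validation is left out.

```python
import time

# Statuses that mean the command has not finished yet.
PENDING_STATUSES = ("Pending", "InProgress", "Delayed", "Cancelling")


def run_remote(instance_id, command, client=None, timeout=60.0, poll_interval=1.0):
    """Run a shell command via SSM and return (status, stdout, stderr)."""
    if client is None:
        import boto3  # imported lazily so a stub client can be used in tests
        client = boto3.client("ssm")

    sent = client.send_command(
        InstanceIds=[instance_id],
        DocumentName="AWS-RunShellScript",
        Parameters={"commands": [command]},
    )
    command_id = sent["Command"]["CommandId"]

    # Poll until the invocation leaves the pending states or the deadline passes.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = client.get_command_invocation(
            CommandId=command_id, InstanceId=instance_id
        )
        if result["Status"] not in PENDING_STATUSES:
            stdout = result.get("StandardOutputContent", "")
            stderr = result.get("StandardErrorContent", "")
            return result["Status"], stdout, stderr
        time.sleep(poll_interval)
    raise TimeoutError(f"SSM command {command_id} did not finish within {timeout}s")
```

Against the real service, `get_command_invocation` can fail with `InvocationDoesNotExist` for a moment right after `send_command`, so a production helper would likely need to tolerate that on the first polls.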

**Changed:**

- Updated all red team network toolsets in `src/ares/tools/red/network.py` to execute commands remotely via SSM instead of subprocess, centralizing command execution and error handling
- Refactored Taskfile and documentation defaults: lowered polling mode steps to 50 and once mode steps to 15 for agent timeouts; clarified timeout behaviors in `README.md` and `docs/taskfile_usage.md`
- Updated AWS region defaults in `Taskfile.yaml` from `us-west-2` to `us-west-1`
- In red team orchestrator, added fail-fast SSO credential validation before starting operations
- Improved admin access finding validation in red team reporting to reject error-containing results and require success indicators
- Improved blue agent orchestrator with a hard signal-based timeout and robust MCP connection handling
- Registered new blue team tools and query templates in import/export lists
- Updated dependency and lock files (`pyproject.toml`, `uv.lock`) to add and pin `boto3` and compatible AWS packages, and remove unused aiobotocore/aioitertools
- Cleaned up subprocess error handling in red team tools, removing timeouts and local file usage in favor of remote SSM execution

**Removed:**

- Eliminated all local subprocess execution for red team operations in favor of SSM-based remote execution
- Removed unused and incompatible `aiobotocore` and `aioitertools` packages from lock file
…igation agent

**Added:**

- Introduced `WatchdogTimer` class for enforcing hard investigation timeout using
  a background thread, enabling forced exit and partial report generation even if
  the event loop is blocked
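
The watchdog pattern can be sketched with `threading.Timer`. This is an assumption about the general shape, not the actual `WatchdogTimer` class: here `on_timeout` is a caller-supplied callback, whereas the commit describes forced exit and partial report generation.

```python
import threading


class WatchdogTimer:
    """Calls on_timeout from a background thread unless cancelled in time."""

    def __init__(self, timeout_seconds, on_timeout):
        self._timer = threading.Timer(timeout_seconds, on_timeout)
        self._timer.daemon = True  # never keeps the process alive on its own

    def start(self):
        self._timer.start()

    def cancel(self):
        self._timer.cancel()

    def __enter__(self):
        self.start()
        return self

    def __exit__(self, *exc_info):
        # Leaving the block in time disarms the watchdog.
        self.cancel()
        return False
```

Because the timer runs on its own thread, it fires even when the main thread or an event loop is stuck in a blocking call, which is the property a signal-free timeout relies on.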

**Changed:**

- Replaced Unix-only signal-based hard timeout with cross-platform watchdog thread
  in `InvestigationOrchestrator`
- Updated timeout handling logic to use the new watchdog and improved partial
  report generation upon timeout
- Cleaned up code by removing signal handler setup and exception raising for
  timeout, delegating forced exit to the watchdog
- Adjusted logging to reflect new watchdog mechanism and clarify timeout events

**Removed:**

- Removed dependency on `signal` module and associated signal handler logic for
  timeouts
- Eliminated `InvestigationTimeoutError` usage and related exception handling
  from the orchestration flow
- Removed code for restoring old signal handlers and alarm cleanup, as they're
  no longer needed
…vestigation flow

**Added:**

- Introduced /logs/ directory for agent log files and updated .gitignore to exclude it
- Added log directory configuration and automatic log file creation for blue and red team
  tasks in Taskfile.yaml
- Implemented Taskfile log management tasks: list, tail (latest/all/blue/red), and clean
- Added log management usage docs to `docs/taskfile_usage.md`
- Created timeline event from alert at investigation start for improved reporting
- Added `reset_query_tracking()` and query counting utilities to blue_factory to enforce
  query and tool call limits per investigation
- Wrapped Grafana MCP query tools with rate limiting and duplicate query detection
- Added max queries/tool calls stop conditions to investigation agent
- Blue `record_evidence()` tool now resolves and caches MITRE technique names/tactics
- Red agent event logging now debounces rapid/duplicate events for cleaner logs
- Red team `secretsdump` tool now includes SMB connectivity check, dc_ip param, and
  connection timeouts
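
The rate limiting and duplicate detection wrapper could look roughly like the sketch below. `QueryTracker`, its `wrap` and `reset` methods, and the error-dict return values are hypothetical; the PR only names `reset_query_tracking()` and states a default of 5 queries per investigation.

```python
import functools


class QueryTracker:
    """Counts queries for one investigation and rejects repeats."""

    def __init__(self, max_queries=5):
        self.max_queries = max_queries
        self.count = 0
        self.seen = set()

    def reset(self):
        """Clear state at the start of a new investigation."""
        self.count = 0
        self.seen.clear()

    def wrap(self, query_fn):
        """Return query_fn guarded by duplicate and limit checks."""
        @functools.wraps(query_fn)
        def limited(query, *args, **kwargs):
            normalized = " ".join(query.split())  # ignore whitespace differences
            if normalized in self.seen:
                return {"error": "duplicate query", "query": normalized}
            if self.count >= self.max_queries:
                return {"error": "query limit reached", "limit": self.max_queries}
            self.seen.add(normalized)
            self.count += 1
            return query_fn(query, *args, **kwargs)
        return limited
```

Returning an error payload instead of raising is a design choice in this sketch: it lets the agent read the refusal as a tool result and move on to completing the investigation.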

**Changed:**

- Default max_steps for blue investigation agent lowered from 150 to 30 for tighter
  control
- Updated all relevant blue and red team tasks to log to per-run logfiles in /logs/
- Blue team investigation flow now enforces strict query and tool call limits; agent is
  forced to complete if limits are hit
- Blue `complete_investigation()` tool now auto-extracts recommendations from alert
  annotations if none provided, generates fallback synopsis from evidence, and logs more
  completion details
- Enhanced evidence recording: technique metadata resolved and timeline event auto-added
  from alert
- Initial alert prompt and system instructions templates now emphasize query limits,
  correct IOC extraction, and completion criteria; anti-patterns highlighted
- Investigation docs and usage updated to clarify new stop conditions, log management,
  and completion requirements
- Improved blue investigation docs and templates to stress the importance of IOC
  extraction, evidence recording, and attack synopsis requirements

**Removed:**

- Removed unused/obsolete warnings and manual validations from blue completion tool
- Legacy query loop detection logic replaced by new global query/tool call limiters
…and query resilience

**Added:**

- Introduced a Red-Blue Correlation Engine for mapping red team activities to blue
  team detections, generating coverage metrics and detailed markdown reports
  (`src/ares/core/correlation.py`)
- Implemented a persistence layer for storing investigation results, tracking
  query effectiveness, and similarity-based lookup for new alerts
  (`src/ares/core/persistence.py`)
- Added query resilience module to provide automatic retry, time range reduction,
  and chunking for large queries to Loki/Prometheus backends
  (`src/ares/core/query_resilience.py`)
- Added `LearningTools` agent toolset to expose past investigation data,
  effective queries, false positive patterns, and statistics to the agent
  (`src/ares/tools/blue/learning.py`)
- Introduced workflow for generating and updating coverage badge in CI
  (`.github/workflows/coverage-badge.yaml`)
- Added static badge for code coverage to repo (`.github/badges/coverage.svg`)
- Added comprehensive test suites for correlation, learning, persistence, and
  query resilience modules (`tests/test_correlation.py`, `tests/test_learning.py`,
  `tests/test_persistence.py`, `tests/test_query_resilience.py`)
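
To make the retry, time range reduction, and chunking ideas concrete, here is a small sketch. The function names, the halving factor, and the choice to keep the most recent part of the window are assumptions, not the behavior of `src/ares/core/query_resilience.py`.

```python
from datetime import datetime, timedelta


def query_with_reduction(run_query, start: datetime, end: datetime,
                         max_attempts: int = 3, shrink: float = 0.5):
    """Retry run_query(start, end), shrinking the window after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for _ in range(max_attempts):
        try:
            return run_query(start, end)
        except Exception as exc:  # a real module would catch backend-specific errors
            last_error = exc
            # Keep the most recent part of the window for the next attempt.
            start = end - (end - start) * shrink
    raise last_error


def chunk_range(start: datetime, end: datetime,
                chunk: timedelta) -> list[tuple[datetime, datetime]]:
    """Split [start, end) into consecutive windows of at most `chunk` length."""
    windows = []
    cursor = start
    while cursor < end:
        upper = min(cursor + chunk, end)
        windows.append((cursor, upper))
        cursor = upper
    return windows
```

Reduction trades completeness for a faster answer, while chunking keeps the full range at the cost of more requests; which one fits depends on whether older log lines matter for the alert.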

**Changed:**

- Extended `InvestigationOrchestrator` to persist all completed, escalated,
  timed out, and failed investigations for later learning and analysis
- Updated query tool wrapping in `blue_factory.py` to integrate rate limiting,
  duplicate detection, and resilient execution via the new resilience module
- Added `LearningTools` to agent toolset for blue investigations
- Updated `.pre-commit-config.yaml` to exclude `tests/` from mypy type checks
- Modified test workflow to output coverage as XML and upload coverage artifact
  for badge generation (`.github/workflows/tests.yaml`)
- Updated `src/ares/tools/blue/__init__.py` to export new learning tools
- Various code comments and docstrings cleaned up for clarity and conciseness
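
A minimal sketch of SQLite-backed persistence with a similarity lookup follows. The `InvestigationStore` class, its schema, and the scoring rule (shared techniques plus a bonus for the same alert name) are assumptions for illustration and may differ from `src/ares/core/persistence.py`.

```python
import json
import sqlite3


class InvestigationStore:
    """Stores investigation outcomes and finds similar past ones."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS investigations ("
            "id INTEGER PRIMARY KEY, alert_name TEXT, status TEXT, techniques TEXT)"
        )

    def save(self, alert_name, status, techniques):
        """Persist one result; techniques are stored as a JSON list."""
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT INTO investigations (alert_name, status, techniques) "
                "VALUES (?, ?, ?)",
                (alert_name, status, json.dumps(sorted(techniques))),
            )

    def find_similar(self, alert_name, techniques, limit=5):
        """Return (score, alert_name, status) tuples, best match first."""
        wanted = set(techniques)
        rows = self.conn.execute(
            "SELECT alert_name, status, techniques FROM investigations"
        ).fetchall()
        scored = []
        for name, status, raw in rows:
            overlap = len(wanted & set(json.loads(raw)))
            score = overlap + (1 if name == alert_name else 0)
            if score:
                scored.append((score, name, status))
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:limit]
```

Scoring in Python keeps the schema trivial; with many stored investigations, a normalized techniques table and a SQL join would scale better.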

**Removed:**

- None
**Changed:**

- Refactored LearningTools to use a public `store` attribute instead of a private
  `_store` with property logic, simplifying initialization and access
- Replaced all direct store accesses with a `get_store()` method to ensure
  store is initialized when needed
- Updated tests to use the public `store` attribute and `get_store()` method,
  reflecting the new initialization and access pattern
- Improved class and attribute documentation for clarity
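
The access pattern described above, a public `store` attribute plus a `get_store()` method that initializes it on demand, might look like this. The `store_factory` parameter is a hypothetical stand-in for however the real class builds its store.

```python
class LearningTools:
    """Holds an optional store and creates it on first use."""

    def __init__(self, store=None, store_factory=dict):
        self.store = store  # public attribute, may be None until first use
        self._store_factory = store_factory  # hypothetical: builds the store lazily

    def get_store(self):
        """Return the store, creating it once if it was not supplied."""
        if self.store is None:
            self.store = self._store_factory()
        return self.store
```

Tests can then inject a fake by passing `store=...` directly, without patching a private attribute or property.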

linear Bot commented Jan 11, 2026

CAP-822 Add AWS SSM Remote Execution & Enhance Blue/Red Tooling

Description:
Implement AWS SSM-based remote command execution, expand blue team Grafana query templates, and improve both blue and red team tools. This upgrade aims to streamline SOC investigations, strengthen network scanning for red teams, and clarify system configuration defaults.


Objective:

Enable remote command execution on EC2 instances via AWS SSM, enhance blue team investigation capabilities with new query templates and SOC tools, improve red team network scanning, and update configuration defaults and documentation for better usability.


Scope of Work:

  • Implement remote.py module to support AWS SSM command execution on remote EC2 (Kali attack box)
  • Create and integrate Grafana query templates in query_templates.py for blue team use
  • Enhance soc_investigator.py to improve SOC investigation workflows
  • Update actions.py with new or improved blue team investigation actions
  • Improve red team network scanning capabilities in network.py
  • Update default configuration (max-steps) and document timeout behavior in README, Taskfile.yaml, and docs

Dependencies:

  • AWS IAM permissions for SSM access
  • EC2 instances registered with AWS Systems Manager
  • Grafana environment for query template integration
  • No further dependencies identified

Acceptance Criteria:

  1. remote.py enables successful execution of remote shell commands on registered EC2 instances via AWS SSM.
  2. Blue team Grafana query templates are present, well-documented, and usable within investigations.
  3. SOC investigator tool improvements are functional and demonstrably enhance investigation workflows.
  4. Red team network scanning tool updates are implemented and tested.
  5. Configuration defaults (max-steps) and timeout behaviors are clearly updated in documentation.
  6. All changes are reflected in the relevant source files and documentation.

Additional Notes:


dreadnode-renovate-bot Bot added the area/docs (Changes made to project documentation), area/python, and area/pre-commit (Changes made to pre-commit hooks) labels Jan 11, 2026
dreadnode-renovate-bot Bot added the area/templates (Changes made to warpgate template configurations), area/github (Changes made to GitHub Actions workflows), and type/core labels Jan 11, 2026
l50 merged commit 5454ff0 into main Jan 11, 2026
8 checks passed
l50 deleted the jayson/cap-822-add-aws-ssm-remote-execution-enhance-bluered-tooling branch January 11, 2026 01:14