refactor: centralize detection config and lateral movement logic in ares-core #214

Merged: l50 merged 11 commits into main from fix/blue-detection-gaps on Apr 17, 2026

l50 (Contributor) commented Apr 16, 2026

Key Changes:

  • Moved YAML-driven detection configuration and logic from ares-tools to ares-core
  • Converted all lateral movement mappings, patterns, and MITRE technique lookups to be YAML-driven
  • Refactored query building and pattern detection to use shared config and types
  • Improved host/computer label handling and made LogQL selectors more accurate

Added:

  • Shared detection config module in ares-core/src/detection/mod.rs for loading and querying detections.yaml
  • Centralized ares-core/src/detection/detections.yaml with lateral movement patterns, template connection types, and MITRE mappings
  • New tests for lateral pattern loading, brute force query logic, and MSSQL template resolution in ares-tools

Changed:

  • Lateral movement analyzer and patterns now load connection types and regexes from YAML config, eliminating hardcoded match arms
  • MITRE technique mapping for connection types now derives from template YAML, with hardcoded values as fallback only
  • Refactored detection config usage in ares-tools to re-export types from ares-core
  • LogQL query builder and detection template lookup in ares-tools now use shared types and support negative/exclusion patterns
  • Changed LogQL label selection to use the computer label instead of hostname for greater accuracy and coverage
  • Refined suggested queries in investigation tools and reports to use correct job labels and computer names
  • Added more accurate event ID, pattern, and connection type metadata to detection templates
  • Improved report generator to aggregate MITRE techniques from both state and evidence items

Removed:

  • All hardcoded detection patterns, connection type match arms, and technique ID maps in ares-core lateral movement modules
  • Redundant detection config types and logic from ares-tools in favor of using the shared ares-core module
  • Unnecessary use of Box::leak for static string lifetimes in lateral pattern code
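
The template-driven lookup described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `DetectionTemplate` struct, its field names, and the fallback table are simplified assumptions, and the technique IDs are only examples.

```rust
/// Hypothetical, simplified stand-in for one entry in `detections.yaml`.
#[derive(Debug, Clone, PartialEq)]
pub struct DetectionTemplate {
    pub name: String,
    pub mitre_techniques: Vec<String>,
    pub connection_types: Vec<String>,
}

/// Return every template that lists `conn_type` in its `connection_types`.
pub fn templates_for_connection_type<'a>(
    templates: &'a [DetectionTemplate],
    conn_type: &str,
) -> Vec<&'a DetectionTemplate> {
    templates
        .iter()
        .filter(|t| t.connection_types.iter().any(|c| c == conn_type))
        .collect()
}

/// Hardcoded fallback, consulted only when no template covers the type.
/// The mapping here is illustrative, not the table from the PR.
fn fallback_mitre(conn_type: &str) -> Option<&'static str> {
    match conn_type {
        "smb" => Some("T1021.002"),
        "winrm" => Some("T1021.006"),
        _ => None,
    }
}

/// Collect technique IDs from matching templates first; use the
/// hardcoded fallback only if the templates yield nothing.
pub fn mitre_for_connection_type(
    templates: &[DetectionTemplate],
    conn_type: &str,
) -> Vec<String> {
    let mut ids: Vec<String> = templates_for_connection_type(templates, conn_type)
        .into_iter()
        .flat_map(|t| t.mitre_techniques.iter().cloned())
        .collect();
    ids.sort();
    ids.dedup();
    if ids.is_empty() {
        ids.extend(fallback_mitre(conn_type).map(String::from));
    }
    ids
}
```

Keeping the fallback as a separate function makes it easy to see which connection types still lack template coverage.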

l50 added 8 commits April 16, 2026 16:14
…tern support

**Added:**

- Moved and unified detection configuration (types, YAML, MITRE mapping, and
  template helpers) into a new `ares-core/src/detection/` module, exposing
  shared logic and config for both ares-core and ares-tools
- Added `lateral_patterns` to `detections.yaml` for regex-driven connection
  type classification, including new types: mssql, constrained_delegation,
  ntlm_relay
- Implemented dynamic mapping of connection types to MITRE technique IDs and
  relevant detection templates in ares-core
- Added new detection templates for SMB signing disabled, mass share
  enumeration, MSSQL linked server, MSSQL xp_cmdshell, MSSQL impersonation,
  and improved S4U constrained delegation and RBCD detection
- Included negative pattern support (`exclude_patterns`) in detection config
  and LogQL builder to reduce noise and improve accuracy
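
To illustrate how negative patterns might appear in a generated query, here is a small sketch of a LogQL builder. It assumes a hypothetical `QuerySpec` input type and is not the builder from ares-tools; it only relies on LogQL's standard `|~` (regex match) and `!~` (regex exclude) line filters, and on the `computer` label named in the PR description.

```rust
/// Hypothetical inputs for one query; field names are illustrative.
pub struct QuerySpec<'a> {
    pub job: &'a str,
    pub host: Option<&'a str>,
    pub pattern: &'a str,
    pub exclude_patterns: &'a [&'a str],
}

/// Escape backslashes and double quotes for a LogQL double-quoted string.
fn escape(s: &str) -> String {
    s.replace('\\', "\\\\").replace('"', "\\\"")
}

/// Build `{job="...", computer=~"..."} |~ "pattern" !~ "exclude" ...`.
pub fn build_logql(spec: &QuerySpec<'_>) -> String {
    let mut selector = format!("job=\"{}\"", escape(spec.job));
    if let Some(host) = spec.host {
        // Host filtering uses the `computer` label rather than `hostname`.
        selector.push_str(&format!(", computer=~\"{}\"", escape(host)));
    }
    let mut query = format!("{{{}}} |~ \"{}\"", selector, escape(spec.pattern));
    for exclude in spec.exclude_patterns {
        // Each exclusion becomes a chained negative regex line filter.
        query.push_str(&format!(" !~ \"{}\"", escape(exclude)));
    }
    query
}
```

Each exclusion is appended as its own `!~` stage, so a template can list several noise patterns without changing the positive match.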

**Changed:**

- Lateral movement analyzer and graph modules now use dynamic MITRE mapping
  and pattern logic from detection config, removing hardcoded technique maps
- Lateral pattern regexes are loaded from YAML at runtime, not hardcoded in
  Rust, enabling easier extension and updates without code changes
- LogQL query builder now uses `computer=~"<host>"` instead of `hostname=~`
  for host filtering, matching label conventions in Loki
- All detection template lookups and config access in ares-tools now use the
  shared core logic, reducing duplication and ensuring consistent behavior
- Blue team reporting merges MITRE techniques from both state and evidence
  items to avoid undercounting and improve reporting accuracy
- Updated tests and detection template assertions to cover new templates,
  lateral patterns, and negative pattern handling
- Loki query logic now retries on transient failures (timeouts, 429, 502,
  503, 504) with exponential backoff for better reliability
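
The retry behavior in the last item could look something like the sketch below. The `QueryError` type, the `with_retries` helper, and the one-second base delay are assumptions made for illustration; the sleep function is injected so the schedule can be exercised without actually waiting.

```rust
use std::time::Duration;

/// Hypothetical error type: a timeout or an HTTP status code.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum QueryError {
    Timeout,
    Status(u16),
}

/// Transient failures are timeouts and 429/502/503/504 responses.
pub fn is_transient(err: QueryError) -> bool {
    matches!(err, QueryError::Timeout | QueryError::Status(429 | 502 | 503 | 504))
}

/// Exponential backoff: 1s, 2s, 4s, ... (base delay is an assumption).
pub fn backoff_delay(attempt: u32) -> Duration {
    Duration::from_secs(1u64 << attempt.min(16))
}

/// Run `request`, retrying transient failures up to `max_retries` times.
pub fn with_retries<T>(
    max_retries: u32,
    mut request: impl FnMut() -> Result<T, QueryError>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, QueryError> {
    let mut attempt = 0;
    loop {
        match request() {
            Ok(value) => return Ok(value),
            Err(err) if is_transient(err) && attempt < max_retries => {
                sleep(backoff_delay(attempt));
                attempt += 1;
            }
            // Non-transient errors, or retries exhausted.
            Err(err) => return Err(err),
        }
    }
}
```

In production the `sleep` argument would typically be `std::thread::sleep` or an async equivalent.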

**Removed:**

- Removed all hardcoded lateral movement regex patterns and technique ID
  mappings from Rust source; these are now sourced from `detections.yaml`
- Deleted the old `ares-tools/src/blue/detection/detections.yaml` in favor of
  the unified and extended `ares-core/src/detection/detections.yaml`
```
…ogic bugs

**Added:**

- Introduced `connection_types` field to detection templates for YAML-driven
  lateral movement logic, reducing code duplication and hardcoding
- Added `lateral_patterns_load_from_yaml` and `brute_force_no_host_line_filter`
  tests to improve test coverage of YAML loading and brute force detection

**Changed:**

- Refactored ares-tools to enable the `blue` feature only when specified,
  preventing feature leakage to dependents via Cargo feature unification
- Updated ares-core build dependencies to use workspace versions of serde and
  serde_yaml for consistency and maintainability
- Eliminated unnecessary `Box::leak` and `leak_str` usage, since detection
  config is already `'static`
- Rewrote `templates_for_connection_type` to filter templates by the new
  `connection_types` field, replacing a 43-line match with a concise filter
- Refactored `mitre_for_connection_type` to use YAML as the authoritative
  source for MITRE technique IDs, falling back to hardcoded values only when
  not defined in YAML
- Corrected MITRE technique IDs for several detection templates to match their
  actual behaviors
- Narrowed exclude pattern for `detect_s4u_delegation` to avoid false negatives
  with hyphenated SPNs
- Set `host_as_filter: false` for `detect_brute_force` to prevent false
  negatives on Windows events lacking computer names in log bodies
- Changed `detect_mass_share_enumeration` to set `auto_pivot: false`, reducing
  noisy pivot triggers from single events
- Updated all investigation query suggestions to use correct job labels
  (`windows-security` or `windows-system`) and the appropriate LogQL selectors
- Improved retry logic in Loki queries to respect the `Retry-After` header on
  429 responses, and treat body read failures as retryable instead of fatal
- Fixed doc comment for Loki retry logic to match actual behavior (2 retries,
  delays of 1s and 2s)

**Removed:**

- Removed the unconditional `blue` feature from the ares-tools dependency on
  ares-core, allowing feature flags to be controlled per crate
- Deleted obsolete `leak_str()` function and all manual string leaking related
  to detection template lifetime management
- Removed `detect_mass_share_enumeration` from the auto_pivot template test
  list, since it no longer auto-triggers pivots
…atting

**Added:**

- Logic to truncate long hash values for display in reports, preserving the full
  value for data but showing only a preview (first 32 and last 16 characters)
- `details_list` field in vulnerability context for rendering individual detail
  items as bullet points in reports
- Tests for hash truncation logic and vulnerability details list generation

**Changed:**

- Blue team and red team report directories restructured for consistency; all
  blue operation and investigation reports are now saved under
  `<output_dir>/blue/` and `<output_dir>/blue/investigations/` respectively,
  and all red team reports under `<output_dir>/red/` (across CLI, orchestrator,
  and Taskfiles)
- Blue team and red team report file naming standardized to use
  `<op_id>.md` or `<inv_id>.md` instead of previous patterns
- Taskfiles updated to reflect new directory structure and file naming for
  listing, showing, and cleaning reports for both blue and red teams
- Correlator logic enhanced to recognize current, intermediate, and legacy
  report layouts for backward compatibility during report loading
- Red team comprehensive report template updated to display all hashes in a
  table format, using a truncated hash preview for readability, and to show
  vulnerability details as bullet points instead of a single string

**Removed:**

- Legacy and intermediate report directory/file patterns from current report
  generation (retained only for backward compatibility in loading)
fix: increase blue tool and loki query timeouts for improved reliability
**Changed:**

- Increased the default blue tool execution timeout from 120 to 300 seconds to
  allow longer-running queries in orchestrator sub-agent logic
- Increased the default HTTP client timeout for Loki queries from 30 to 90
  seconds to reduce failures on slow responses
```
…uery guidance

**Added:**

- Insert explicit CRITICAL guidance on Loki query batching and filtering in blue team
  agent documentation, emphasizing use of `execute_parallel_queries`, event ID
  regex, short time windows, and result limits

**Changed:**

- Clamp `hours_back` to a maximum of 2 in blue detection tools to prevent timeouts
- Lower default and maximum query limits (50 per query, max 100 in direct queries)
- Set default time window to 15 minutes for `query_logs_around_timestamp`
- Reduce allowed queries (5 max, 2 concurrent) and retries (2) in parallel Loki queries
- Reject Loki queries without line filters (|=, |~) to prevent expensive timeouts
- Update tool definitions and inline docstrings to reflect new usage, timeouts,
  and query performance recommendations
- Extend blue tool and investigation timeouts to accommodate longer query times
  (agent: 30–35min, individual tool: 10min)
- Revise all blue team agent documentation (triage, threat hunter, lateral analyst)
  to give stricter, explicit instructions on query batching, label usage,
  deployment labels, and time window management

**Removed:**

- Eliminated allowance for large `hours_back` windows and bare selector queries
  in detection and Loki logic to prevent timeouts through the Grafana proxy
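
The guards described in this commit can be sketched as a few small helpers. The constants mirror the numbers quoted above (2 hours, default limit 50, maximum 100); the function names, the integer type for `hours_back`, and the deliberately naive line-filter check are assumptions.

```rust
pub const MAX_HOURS_BACK: u32 = 2;
pub const DEFAULT_LIMIT: u32 = 50;
pub const MAX_LIMIT: u32 = 100;

/// Clamp the lookback window so queries stay cheap.
pub fn clamp_hours_back(hours: u32) -> u32 {
    hours.min(MAX_HOURS_BACK)
}

/// Apply the default limit when none is given, then cap it.
pub fn effective_limit(requested: Option<u32>) -> u32 {
    requested.unwrap_or(DEFAULT_LIMIT).min(MAX_LIMIT)
}

/// Reject bare selectors: require a `|=` or `|~` line filter after the
/// stream selector. This check is naive and assumes no `}` appears
/// inside a label value.
pub fn validate_query(query: &str) -> Result<(), String> {
    let after_selector = match query.find('}') {
        Some(idx) => &query[idx + 1..],
        None => return Err("query has no stream selector".to_string()),
    };
    if after_selector.contains("|=") || after_selector.contains("|~") {
        Ok(())
    } else {
        Err("query must include a line filter (|= or |~)".to_string())
    }
}
```

Rejecting the query up front gives the caller a clear error instead of a slow timeout through the proxy.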
**Added:**

- Added support for exporting OTEL traces via HTTP/protobuf by checking
  `OTEL_EXPORTER_OTLP_PROTOCOL` environment variable in telemetry init
- Integrated Dreadnode Platform by exporting OTEL traces and setting related
  environment variables when `DREADNODE_API_KEY` is present in EC2 Taskfile

**Changed:**

- Updated telemetry initialization to select between HTTP/protobuf and gRPC
  exporters based on the protocol env var; added error handling for HTTP export
- Ensured resource attributes in telemetry cannot override `service.name` or
  `service.namespace` from environment variables
- Updated EC2 launch and setup scripts to use new OTEL HTTP/protobuf variables,
  headers, and endpoints, removing legacy gRPC endpoint variables
- Switched OpenTelemetry dependency to enable both `grpc-tonic` and
  `http-proto` features in Cargo.toml

**Removed:**

- Removed `OTEL_EXPORTER_OTLP_ENDPOINT` and related gRPC-specific env vars from
  orchestrator and worker setup scripts to prevent conflicts with HTTP export
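
Protocol selection from the environment might be structured along these lines. The enum and function names are hypothetical, and falling back to gRPC when the variable is unset or unrecognized is an assumption; `http/protobuf` and `grpc` are the standard values for `OTEL_EXPORTER_OTLP_PROTOCOL`.

```rust
/// Hypothetical exporter choice; the real code would configure an OTLP
/// exporter for the selected transport.
#[derive(Debug, PartialEq)]
pub enum OtlpProtocol {
    Grpc,
    HttpProtobuf,
}

/// Map the raw variable value to a protocol. Splitting this from the
/// environment lookup keeps the decision easy to unit test.
pub fn protocol_from_value(value: Option<&str>) -> OtlpProtocol {
    match value.map(str::trim) {
        Some("http/protobuf") => OtlpProtocol::HttpProtobuf,
        // Assumption: anything else, including unset, means gRPC.
        _ => OtlpProtocol::Grpc,
    }
}

/// Read `OTEL_EXPORTER_OTLP_PROTOCOL` and pick the exporter protocol.
pub fn protocol_from_env() -> OtlpProtocol {
    protocol_from_value(std::env::var("OTEL_EXPORTER_OTLP_PROTOCOL").ok().as_deref())
}
```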
…yout

**Added:**

- Documented the `ares-golden-image` directory as an all-in-one red team EC2 AMI
- Added section on building the golden image AMI, including usage, requirements,
  and output details for red team attack box deployments
- Explained necessity of `GITHUB_TOKEN` for cloning private repos during AMI
  build

**Changed:**

- Updated the infrastructure layout to include `ares-golden-image` among
  ares-* directories
- Expanded container image build matrix to show `ares-golden-image` built on
  Kali Linux as an EC2 AMI
- Added `goad_attack_box.yml` and `ares-golden-image` to the Ansible playbook
  and role mapping table, clarifying tool coverage for the golden image
dreadnode-renovate-bot (bot) added the area/docs (Changes made to project documentation) label on Apr 17, 2026
l50 added 3 commits April 16, 2026 23:09
**Added:**

- Detailed documentation for full EC2 operation lifecycle, including setup, deployment,
  monitoring, and teardown steps
- Section on convenience wrappers for red team operations on EC2, describing
  `red:ec2:multi` usage and options
- Instructions for arbitrary command execution and build tool selection on EC2
- Description of EC2 environment variables, deployment labels, and worker services
- Expanded blue team operations documentation for both K8s and EC2, including
  Taskfile wrappers and Redis port-forwarding procedures
- Summary table comparing red/blue coordination between K8s and EC2 environments
- Additional EC2-specific health check and debugging procedures for stuck operations

**Changed:**

- Clarified task usage for EC2 deployments by specifying required environment variables
- Updated verification instructions to distinguish between K8s and EC2 (`task remote:check`
  vs. `task ec2:status`)
- Improved section headings to clearly separate K8s and EC2 operational instructions
- Enhanced blue team workflow explanation, distinguishing between K8s and EC2 setups
- Expanded debugging steps to provide parallel guidance for both K8s and EC2 scenarios

**Removed:**

- Deprecated EC2 launch example from the red team operations section to avoid redundancy
**Changed:**

- Updated the EC2 build task to extract the ares source tarball without using
  the `--strip-components=1` flag, ensuring the full directory structure is
  preserved during extraction in `.taskfiles/ec2/Taskfile.yaml`
feat: add deployment label to suggested queries and extend blue investigation timeouts
**Added:**

- Include an optional deployment label in suggested Loki queries for host and
  user investigations, using the ARES_DEPLOYMENT environment variable if set

**Changed:**

- Increased investigation run timeout from 30 to 45 minutes and stale threshold
  from 35 to 50 minutes to accommodate longer-running Grafana proxy queries
- Extended blue team wait deadline from 20 to 45 minutes in completion logic to
  prevent premature shutdown during lengthy investigations
- Increased maximum retry attempts for Loki queries from 2 to 3, allowing more
  resilience to transient proxy failures
- Set an explicit 10-second connect timeout on the Loki HTTP client to avoid
  indefinite hangs during connection attempts
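
A suggested-query selector with the optional deployment label could be assembled as in this sketch. The label name `deployment`, the function names, and treating an empty `ARES_DEPLOYMENT` as unset are assumptions; label values are assumed to need no escaping here.

```rust
/// Build a stream selector, adding the deployment label only when one
/// is provided and non-empty.
pub fn selector_with_deployment(
    job: &str,
    computer: &str,
    deployment: Option<&str>,
) -> String {
    let mut labels = vec![
        format!("job=\"{}\"", job),
        format!("computer=~\"{}\"", computer),
    ];
    if let Some(dep) = deployment.filter(|d| !d.is_empty()) {
        labels.push(format!("deployment=\"{}\"", dep));
    }
    format!("{{{}}}", labels.join(", "))
}

/// Convenience wrapper that reads `ARES_DEPLOYMENT` from the environment.
pub fn selector_from_env(job: &str, computer: &str) -> String {
    let deployment = std::env::var("ARES_DEPLOYMENT").ok();
    selector_with_deployment(job, computer, deployment.as_deref())
}
```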

l50 merged commit c33f9f1 into main on Apr 17, 2026
9 checks passed
l50 deleted the fix/blue-detection-gaps branch on April 17, 2026 14:00
l50 added a commit that referenced this pull request Apr 17, 2026
…res-core (#214)
l50 added a commit that referenced this pull request Apr 17, 2026
…res-core (#214)