Skip to content

refactor(eval): unify MatchResult and CorrelationResult into a single EvaluationResult#132

Merged
mostafa merged 4 commits into
mainfrom
feat/unify-evaluation-result
May 20, 2026
Merged

refactor(eval): unify MatchResult and CorrelationResult into a single EvaluationResult#132
mostafa merged 4 commits into
mainfrom
feat/unify-evaluation-result

Conversation

@mostafa
Copy link
Copy Markdown
Member

@mostafa mostafa commented May 20, 2026

Summary

Collapses the two parallel result types (MatchResult and CorrelationResult) into a single EvaluationResult built from a shared RuleHeader plus an #[serde(untagged)] ResultBody enum. The five fields that are shared today (rule_title, rule_id, level, tags, custom_attributes) live in RuleHeader along with a new optional enrichments map; the kind-specific fields stay in DetectionBody / CorrelationBody.

ProcessResult becomes a Vec<EvaluationResult> (detections first, correlations after, in evaluation order). The three duplicated sink loops in file.rs, stdout.rs, and nats_sink.rs collapse to one. A new ProcessResultExt extension trait gives call sites readable detections() / correlations() / detection_count() / correlation_count() views without forcing pattern matching.

Wire shape

NDJSON output keeps the same field set, same values, and same skip_serializing_if behavior. Both the header and the body flatten into the parent JSON object via #[serde(flatten)], so each line stays a single flat object. Downstream consumers continue to distinguish detection from correlation by the presence of correlation_type.

One cosmetic change: on rules with a non-empty custom_attributes map, custom_attributes is now emitted between the rule header fields and the kind-specific body fields rather than at the end of the line. JSON objects are unordered per spec, so this is invisible to compliant consumers. The new golden snapshot tests at crates/rsigma-eval/tests/wire_shape_golden.rs pin the new ordering for both kinds (with and without custom_attributes / enrichments).

Library API (breaking, pre-1.0)

Before After
m.rule_title, m.tags, ... m.header.rule_title, m.header.tags, ...
m.matched_fields, m.event m.as_detection().unwrap().matched_fields, .event
m.correlation_type, m.group_key, ... m.as_correlation().unwrap().correlation_type, ...
result.detections.len() result.detection_count()
result.correlations.iter() result.correlations()
result.detections[0] result.detections().next().unwrap()

Performance

A new Criterion bench at crates/rsigma-eval/benches/result_serialize.rs compares the new design's serialize throughput against a byte-for-byte copy of the old flat structs across four representative inputs (small/realistic, detection/correlation). The new design is within +/-4% of the baseline on every sample; the bench's equivalence preflight also asserts byte-identical wire shape across the two layouts on the sample inputs.

Test plan

  • cargo test --workspace --all-features (1341 tests pass)
  • cargo fmt --all -- --check clean
  • cargo clippy --workspace --all-targets --all-features -- -D warnings clean
  • cargo bench --bench result_serialize (within +/-4% of baseline on every sample; equivalence preflight passes)
  • Golden NDJSON snapshot tests for detection and correlation lines, including the non-empty custom_attributes case (crates/rsigma-eval/tests/wire_shape_golden.rs)

mostafa added 4 commits May 20, 2026 21:38
…onResult

Collapse the two parallel result types into a single `EvaluationResult`
built from a shared `RuleHeader` plus an `#[serde(untagged)]` `ResultBody`
enum (`Detection` | `Correlation`). The five fields shared today
(`rule_title`, `rule_id`, `level`, `tags`, `custom_attributes`) move into
`RuleHeader`, plus a new optional `enrichments` map. Kind-specific fields
live in `DetectionBody` / `CorrelationBody`. The wire NDJSON shape is
preserved exactly: both bodies are flattened into the parent JSON object
via `#[serde(flatten)]`, no `result_kind` discriminator is added.
Downstream consumers continue to disambiguate variants via
`correlation_type` presence (today's contract).

`ProcessResult` collapses to a `pub type ProcessResult = Vec<EvaluationResult>`.
The engine still emits detections first, then correlations, in the same
order as today. A new `ProcessResultExt` trait exposes `detections()` /
`correlations()` iterators and `detection_count()` / `correlation_count()`
methods so call sites stay readable.

A new Criterion bench at `crates/rsigma-eval/benches/result_serialize.rs`
compares the new design's serialize throughput against a byte-for-byte
copy of the old types across four representative inputs (V1 baseline vs
V2 derived `#[serde(flatten)]` vs V3 hand-written `Serialize`). V2 lands
within +/-4% of V1 on every sample, so the derive path is what ships.
The bench also asserts byte-identical wire shape across all three
variants on the sample inputs before any timing is collected.

Library API is breaking but pre-1.0: `MatchResult` and `CorrelationResult`
are replaced by `EvaluationResult`, `ResultBody`, `RuleHeader`,
`DetectionBody`, `CorrelationBody`, plus `ProcessResultExt`. The three
duplicated `for m in &result.detections / .correlations` loops in the
file, stdout, and NATS sinks collapse to one `for m in result`.

1336 workspace tests pass with --all-features; cargo fmt clean; clippy
--workspace --all-targets --all-features clean.
- CHANGELOG: Unreleased entry for the unified result type, breaking
  library API change, consumer migration table, and wire-shape
  preservation note.
- crates/rsigma-eval/README.md: rewrite the "Output Types" section
  around EvaluationResult, RuleHeader, DetectionBody, CorrelationBody,
  FieldMatch, EventRef, and the new ProcessResultExt trait.
- docs/library/eval.md: update the type table, detection doctest, and
  correlation example to use the new types and iterators.
- README.md and assets/architecture.mmd: replace the old result-types
  output block with the unified composition.
- crates/rsigma-eval/tests/wire_shape_golden.rs: new golden NDJSON
  snapshot tests pinning the byte-exact serialization for one
  detection line and one correlation line, plus the downstream
  disambiguation contract (correlation_type only on correlations,
  matched_fields only on detections) and the enrichments-None skip
  behavior.
A non-empty `custom_attributes` map now serializes between the rule
header fields and the kind-specific body fields, not at the end of the
line as it used to. JSON objects are unordered per spec, so this is
invisible to compliant consumers; pin the actual byte ordering with a
new golden snapshot so a future change is intentional, not silent.

Tighten the wording in the CHANGELOG entry and the result.rs module
docstring to record that the field set, values, and skip_serializing_if
behavior all match the previous layout, with custom_attributes being
the lone position change.
@mostafa mostafa merged commit 0ddc744 into main May 20, 2026
14 checks passed
@mostafa mostafa deleted the feat/unify-evaluation-result branch May 20, 2026 20:02
mostafa added a commit that referenced this pull request May 21, 2026
Records the shipped surface of PR #134 under `## [Unreleased]`,
immediately above the existing `### Unified evaluation result type
(#132)` entry. Covers the new `--enrichers <PATH>` flag, the four
primitive `type:` values (`template`, `lookup`, `http`, `command`),
the strict kind separation and namespace-validated templates, the
`scope` filtering axes and `on_error` policies, the six new
Prometheus metric families (all pre-registered at startup), the
public library API surface (`Enricher`, `EnrichmentPipeline`, the
four primitive types, `HttpResponseCache`, `register_builtin`), and
the new dependencies pulled in (`humantime`, `arc-swap`, `globset`,
`jaq-core`, `jaq-std`, plus `wiremock` as a dev-dep).

Style notes:
- All repo-relative paths use plain inline code rather than markdown
  links, matching the rest of the CHANGELOG and avoiding broken
  cross-links when `docs/release-notes.md` pulls the file in via
  `include-markdown`.
- The `lookup` row notes "as declared today via pipeline `sources:`"
  so the entry reads accurately for the release that ships PR #134,
  without committing the wording to pipeline-only declaration
  forever.

Verification: mkdocs build --strict is clean (release-notes.md
embeds the CHANGELOG via include-markdown).
mostafa added a commit that referenced this pull request May 21, 2026
Audit pass over `git log v0.12.0..main` plus this branch's own
commits caught four user-visible items that were not yet recorded in
the Unreleased section. Adds a small `### Drop reserved 'attack'
subcommand` section for the namespace removal (commit 6369ac8, direct
to main with no PR), and a new `### Other changes` bulleted list
covering:

- PR #131 docs cleanup (version macros now self-serve from
  Cargo.toml, the new "Rule loading at scale" section in the
  performance-tuning guide, the rsigma-parser README intro count
  bumped from 65 to 66 to match the rest of the docs).
- The option-B wording fix shipped in 778c83e (`lookup` enricher
  error and docs say "configured on the daemon" instead of "declared
  in your pipeline `sources:` block").
- Detection Engineering Weekly #157 added to the featured-in list
  in README and docs/index.md.
- CONTRIBUTING.md gains a Documentation section that lists the
  docs/ MkDocs site as a release deliverable alongside crate READMEs,
  with a page-to-change matrix.

The four items each ship in the same release as #134
post-evaluation enrichment and #132 unified result types, so they
fit naturally under the current Unreleased heading.

Verification: mkdocs build --strict clean (release-notes.md
embeds CHANGELOG.md via include-markdown).
@mostafa mostafa mentioned this pull request May 26, 2026
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant