v0.15.0
TL;DR
RSigma v0.15.0 is the "new conversion target and Sigma extensions" release:
- Fibratus conversion backend: convert Sigma rules into Fibratus rule YAML for the first endpoint-sensor target, with a fibratus_windows field-mapping pipeline, idiomatic macro recognition, ATT&CK label flattening, and sequence-DSL correlation lowering (#191).
- Array matching: [any]/[all]/[all_or_empty]/[none] object-scope blocks, implicit any-member matching, and positional indexing (args[0], negative indices), evaluated in the engine and lowered to PostgreSQL JSONB (#159).
- Declarable correlation window modes: sliding/tumbling/session windows plus a session gap, end to end across the parser, runtime evaluator, and PostgreSQL conversion, with pySigma-style correlation_method selection at convert time (#192).
- sigma-version: an optional top-level spec-major attribute that gates breaking spec changes by the declared version (array matching now activates only at major 3), plus cross-document reference lints (#188).
- rstix: a new STIX 2.1 + TAXII 2.1 library crate; Phase 1 lands the core foundation (validated typed IDs, timestamps, deterministic SCO IDs, controlled vocabularies) (#185), thanks to @SecurityEnthusiast.
- Gated match-detail enrichment: a new MatchDetailLevel (off/summary/full) that explains why each field matched, off by default so the default wire shape is byte-for-byte unchanged (#186).
- RFC 5424 syslog now strips a leading UTF-8 BOM by default, fixing corrupted _raw fields, broken anchored matchers, and BOM-blocked embedded-JSON detection (#187).
- Daemon shutdown fix: SIGINT/SIGTERM handlers are now installed before the API listener is announced, closing a startup race that could hard-kill the process instead of draining cleanly.
Fixed
- Daemon startup signal race. The daemon now installs its SIGINT/SIGTERM handlers eagerly, before the API listener is announced and reachable, and reuses those same streams for the serve task's graceful shutdown. Previously the handlers were installed lazily on the serve task's first poll, so a signal arriving in the window between the socket becoming connectable (the kernel completes handshakes from the listen backlog) and that first poll hit the default disposition and killed the process instead of draining cleanly.
Fibratus conversion backend (#191)
Convert Sigma rules into rule YAML for Fibratus, an Apache-2.0 kernel-event detection and EDR engine. Fibratus is the first conversion target aimed at an endpoint sensor rather than a centralized log store; rules emitted by rsigma backend convert -t fibratus drop into a Fibratus installation's Rules/ directory and load with the same parser as the upstream rules library.
Output formats. Four format names cover two output shapes. default (alias yaml, rule) emits a complete YAML rule document per Sigma rule (name, id, description, labels, condition, min-engine-version, optional action) with --- separators between multi-rule output so the whole stream is a valid YAML document set. expr strips the envelope and emits the bare filter expression only, for piping into ad-hoc Fibratus commands.
Modifier coverage. Sigma's case-insensitive default flips to Fibratus's case-insensitive operators (icontains/istartswith/iendswith); the |cased modifier or -O case_sensitive=true flips to the bare forms. Plain literal equality (no wildcards) uses the dedicated string-equality operators ~= (case-insensitive default) and = (|cased) rather than a wildcard match, which evaluates more efficiently and reads the way the upstream rules library writes literal equality; the evt.name event discriminator always uses the exact =. Wildcard-bearing values lower to imatches/matches. Multi-value OR lists collapse into a single Fibratus list-operator clause (field iin ('a', 'b'), field imatches ('a*', 'b?'), field icontains ('a', 'b'), ...); a |all list stays AND-joined because a list right-hand side is OR-only. Regex (|re) lowers to the regex(field, 'pat1', 'pat2', ...) = true filter function, with multi-value lists collapsing into a single call and negation expressed as a leading not; patterns that use lookarounds or backreferences are rejected with a structured UnsupportedModifier rather than emitting something Fibratus's RE2 engine would reject at load time. CIDR (|cidr) lowers to cidr_contains(field, '...'), with multi-value lists collapsing into a single variadic call. Numeric comparisons map to </<=/>/>=. exists lowers to field != false / field = false and a null value to field = '' (Fibratus has no null token). Field references are native (field1 = field2). Keywords return UnsupportedKeyword because Sigma keywords have no bound field and Fibratus operators require one.
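The operator selection described above can be sketched as a small lookup. This is a hedged illustration in Python, not the backend's code: the function names, the naive value quoting, and the cased list operator `in` are assumptions; only the operator names `icontains`, `istartswith`, `iendswith`, `imatches`, `matches`, `~=`, `=`, and `iin` come from the paragraph above.

```python
# Illustrative only: operator names come from the release notes; the
# function names and the value quoting below are assumptions.
CASE_INSENSITIVE = {
    "contains": "icontains",
    "startswith": "istartswith",
    "endswith": "iendswith",
}


def pick_operator(modifier, value, cased=False):
    """Choose a Fibratus operator for one Sigma modifier/value pair."""
    if modifier in CASE_INSENSITIVE:
        return modifier if cased else CASE_INSENSITIVE[modifier]
    if modifier is None:
        if "*" in value or "?" in value:      # wildcard-bearing value
            return "matches" if cased else "imatches"
        return "=" if cased else "~="         # plain literal equality
    raise ValueError(f"modifier not covered by this sketch: {modifier}")


def render_clause(field, modifier, values, cased=False):
    """Render one field clause; several OR values collapse into one list."""
    ops = {pick_operator(modifier, v, cased) for v in values}
    if len(ops) != 1:
        raise ValueError("mixed literal/wildcard lists are outside this sketch")
    op = ops.pop()
    quoted = [f"'{v}'" for v in values]       # naive quoting, no escaping
    if len(quoted) == 1:
        return f"{field} {op} {quoted[0]}"
    if op in ("~=", "="):
        # Equality switches to the list operator; "in" for the cased
        # form is an assumption, only "iin" appears in the notes.
        op = "in" if cased else "iin"
    return f"{field} {op} ({', '.join(quoted)})"
```

For example, `render_clause("ps.name", None, ["a*", "b?"])` yields the `ps.name imatches ('a*', 'b?')` shape quoted in the notes, while a single literal value uses `~=`.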
Field naming. A new fibratus_windows builtin pipeline (registered alongside ecs_windows and sysmon) maps Sigma's PascalCase Windows fields to the lowercase-dotted Fibratus vocabulary and adds the right evt.name discriminator per logsource category (process_creation -> CreateProcess, network_connection -> Connect, dns_query -> QueryDns, registry_set -> RegSetValue, ...). Most categories map Image -> ps.exe, CommandLine -> ps.cmdline, TargetFilename -> file.path, TargetObject -> registry.path, DestinationIp -> net.dip, ImageLoaded -> module.path, QueryName -> dns.name. Field names target the Fibratus 3.0.0 registry: DNS fields live under dns.*, loaded executables/DLLs under module.* (the legacy image.* namespace is deprecated), and Sigma fields with no 3.0.0 equivalent (SignatureStatus/Hashes/Imphash under image_load/driver_load, DestinationHostname/Initiated under network_connection) are intentionally unmapped so a dependent rule fails conversion instead of emitting a field the loader rejects. The evt.name discriminator is injected as the first condition (the new add_condition prepend: true option), so the emitted rule leads with the cheapest, most selective predicate and Fibratus short-circuits before the rule body. On a Fibratus 3.0.0 process_creation (CreateProcess) event ps.* is the created (child) process, so Image/CommandLine/ProcessId/User -> ps.exe/ps.cmdline/ps.pid/ps.username and the spawning process is ParentImage/ParentCommandLine/ParentProcessId -> ps.parent.exe/ps.parent.cmdline/ps.parent.pid (Fibratus 3.0.0 decommissioned the legacy ps.sibling.* namespace and unified process attributes under ps.*). For process_access (OpenProcess) the caller is ps.* and the opened process is exposed as event arguments, so TargetImage/TargetProcessId -> evt.arg[exe]/evt.arg[pid] (matching the upstream LSASS-access rule) and GrantedAccess -> ps.access.mask.names. 
file_event (file creation) excludes the OPEN disposition (the create_file macro semantics) so it does not fire on plain file access, and registry_set/registry_event map Details -> registry.data. The pipe_created logsource is intentionally not mapped because Fibratus has no named-pipe visibility without a kernel driver. Use the fibratus_windows pipeline whenever you convert SigmaHQ Windows rules: rsigma backend convert rules/windows/ -t fibratus -p fibratus_windows.
ATT&CK tags in tags: flatten into Fibratus's labels: block via a static MITRE lookup: attack.<tactic_short_name> -> tactic.id/tactic.name/tactic.ref, attack.t<NNNN> -> technique.id/technique.ref, and attack.t<NNNN>.<sub> -> subtechnique.id/subtechnique.ref (the base technique and sub-technique live in separate label namespaces matching the upstream Fibratus rules library convention). Unknown tags pass through as tag.<original>: <original>.
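Returning to the field-naming pipeline described above, the following Python sketch shows the two behaviours that matter most to rule authors: renaming fields, and prepending the evt.name discriminator. The mapping values are copied from the notes; the function name, the exception name, and the (field, operator, value) triple representation are hypothetical and not the pipeline's real data model.

```python
# Mapping values are copied from the notes above; the function and
# exception names are hypothetical.
EVENT_NAME = {
    "process_creation": "CreateProcess",
    "network_connection": "Connect",
    "dns_query": "QueryDns",
    "registry_set": "RegSetValue",
}

PROCESS_CREATION_FIELDS = {
    "Image": "ps.exe",
    "CommandLine": "ps.cmdline",
    "ProcessId": "ps.pid",
    "User": "ps.username",
    "ParentImage": "ps.parent.exe",
    "ParentCommandLine": "ps.parent.cmdline",
    "ParentProcessId": "ps.parent.pid",
}


class UnmappedField(Exception):
    """Raised instead of emitting a field the loader would reject."""


def map_process_creation(conditions):
    """Rename (field, op, value) triples and prepend the discriminator."""
    mapped = []
    for field, op, value in conditions:
        if field not in PROCESS_CREATION_FIELDS:
            raise UnmappedField(field)
        mapped.append((PROCESS_CREATION_FIELDS[field], op, value))
    # The discriminator goes first and always uses the exact "=".
    return [("evt.name", "=", EVENT_NAME["process_creation"])] + mapped
```

Failing on an unmapped field, rather than passing the Sigma name through, mirrors the stated design choice that a dependent rule should fail conversion instead of producing a rule the loader rejects.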
Correlation. Sigma correlation rules lower to Fibratus's inline sequence ... maxspan ... by <fields> | stage | | stage | DSL (the form Fibratus 1.10 introduced when it decommissioned policy: sequence). The group-by fields, shared across every referenced rule, are emitted once as a sequence-level by field1, field2, ... clause (the upstream rules-library style) instead of repeated per stage, so multi-field group-by needs no inline bindings. temporal_ordered and temporal (ordered fallback) emit one |...| stage per referenced rule; small-threshold event_count and value_count expand into N repeated or N distinct stages capped at -O max_repeated_slots (default 5), with value_count distinctness expressed via positional pattern bindings (field != $1.field and field != $2.field and ...). The four math-aggregate types (value_sum, value_avg, value_percentile, value_median), thresholds above the cap, range/equality predicates, and multi-rule event_count/value_count all return UnsupportedCorrelation with structured rationales the operator can act on; the coverage matrix in the new Fibratus backend reference is the source of truth.
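The value_count expansion can be illustrated with a short Python sketch. It assumes a threshold of N distinct values becomes N stages, each of which must differ from every earlier stage through the positional `$k.field` bindings quoted above. The function name and the plain-string stage representation are hypothetical, and the exception class is only a stand-in for the backend's UnsupportedCorrelation error.

```python
class UnsupportedCorrelation(Exception):
    """Stand-in for the backend's structured error of the same name."""


def value_count_stages(base_condition, field, threshold, max_repeated_slots=5):
    """Expand 'threshold distinct values of field' into sequence stages."""
    if threshold > max_repeated_slots:
        raise UnsupportedCorrelation(
            f"threshold {threshold} exceeds max_repeated_slots={max_repeated_slots}"
        )
    stages = []
    for slot in range(1, threshold + 1):
        # Stage `slot` must differ from the value bound by each earlier stage.
        distinct = [f"{field} != ${prev}.{field}" for prev in range(1, slot)]
        stages.append(" and ".join([base_condition] + distinct))
    return stages
```

With a threshold of 3 on net.dip, the third stage carries both `net.dip != $1.net.dip` and `net.dip != $2.net.dip`. The number of inequality clauses grows quadratically with the threshold, which is one plausible reason for a small default cap.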
Backend options. -O action=kill,isolate appends an action: block to every rule envelope. -O min_engine=3.0.0 sets min-engine-version:. -O emit_metadata=false drops the description: and labels: blocks for a minimal envelope. -O max_repeated_slots=N raises the correlation cap. -O case_sensitive=true forces the bare operators globally. -O temporal_permute=true expands a temporal (any-order) correlation into one ordered sequence document per permutation of the referenced rules (capped at N <= 3, so 1/2/6 documents per correlation; each permutation gets distinct title and id suffixes), so any matching order alerts; larger correlations return UnsupportedCorrelation. -O use_macros (default true) walks top-level and clauses and replaces recognized runs with idiomatic Fibratus macro calls (spawn_process, create_thread, write_file, read_file, open_file, create_file, set_value, open_process, open_thread, ...), greedy-longest-match so a full three-clause open_file triple beats the standalone evt.name = 'CreateFile' prefix; each clause is matched against both the exact (=) and case-insensitive (~=) operator forms, so recognition is independent of -O case_sensitive, and clauses that match no macro pass through verbatim.
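The document counts quoted for -O temporal_permute follow directly from the number of orderings of N rules. A minimal Python sketch, assuming hypothetical title suffixes and a plain dict per emitted document:

```python
from itertools import permutations

MAX_PERMUTED_RULES = 3  # cap stated in the notes (N <= 3)


def permuted_sequences(title, rule_names):
    """Return one ordered-sequence description per permutation of the rules."""
    if len(rule_names) > MAX_PERMUTED_RULES:
        # The backend reports UnsupportedCorrelation here; a ValueError
        # stands in for it in this sketch.
        raise ValueError("too many referenced rules to permute")
    documents = []
    for index, order in enumerate(permutations(rule_names), start=1):
        documents.append({
            "title": f"{title} (order {index})",  # suffix format is an assumption
            "stages": list(order),
        })
    return documents
```

One, two, and three referenced rules give 1, 2, and 6 documents respectively (N factorial), and a fourth rule would already require 24.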
Correlation window modes. The backend honors the rsigma.window / rsigma.gap extension attributes added by the Correlation window modes entry (#192). The native sequence ... maxspan DSL is itself a sliding total-span constraint per stage, so rsigma.window: sliding (the default) is a faithful pass-through and adds no warnings. rsigma.window: tumbling returns UnsupportedCorrelation because Fibratus has no calendar-aligned bucket primitive. rsigma.window: session (the SEP's "should warn" degraded case) still emits a sliding sequence with the rule's timespan as maxspan, but pushes a warning to the conversion warnings channel noting that the requested per-step rsigma.gap is not enforced because Fibratus has no maxpause-style inactivity timeout. Two new -O options: correlation_method (sliding/session; pySigma-style override that takes precedence over the rule's own rsigma.window) and gap (default session gap for rules that do not declare their own rsigma.gap, used in the warning text). The backend advertises these via the new Backend::correlation_methods and Backend::default_correlation_method trait methods; tumbling is intentionally absent from the advertised list and a -O correlation_method=tumbling override is rejected up-front.
CLI integration. rsigma backend targets and rsigma backend formats fibratus list the new target and its formats. The CLI's per-rule output joining now defers to the backend's finalize_output, which fixes a latent bug for PostgreSQL's view/continuous_aggregate formats (they wanted ;\n\n between statements but got \n) and is what makes the Fibratus --- document separator land correctly on stdout.
Correlation window modes: declarable sliding/tumbling/session windows and a session gap (#192)
rsigma correlation rules can now declare how their timespan is anchored to the event stream, via an optional window attribute, plus a gap field for dynamic session windows, end to end across the parser, runtime evaluator, and PostgreSQL conversion. This is an rsigma-specific extension: a portable-spec version was proposed upstream and declined (sigma-specification #214), on the grounds that the window strategy is a backend-and-deployment concern rather than portable detection logic. rsigma keeps the capability where it is reliable (a stateful streaming engine has no global transaction caps) and follows the upstream guidance: rule-level window/gap live in the rsigma.* extension namespace, and conversion exposes the choice to the converting user the way pySigma's correlation_methods do.
- New window attribute with three values: sliding (the default, equal to today's trailing per-event window, so no existing rule changes meaning), tumbling (fixed, boundary-aligned, non-overlapping buckets of size timespan), and session (a dynamic window that extends while consecutive in-group events stay within gap, capped by timespan as the maximum total span).
- New gap attribute reusing the existing timespan grammar (Xs/Xm/Xh/Xd/Xw/XM/Xy). It is required when window: session and rejected for the other modes. The parser errors on a session window without a gap, a gap without a session window, and an unknown window mode.
- rsigma.* extension namespace. window/gap are accepted both via the rsigma.* engine-extension keys (rsigma.window, rsigma.gap, alongside rsigma.suppress and friends), which is the primary spelling, and via the first-class correlation.window/correlation.gap keys, kept as aliases. The rsigma.* spelling wins when both are present. The parser and linter resolve either spelling.
- Conversion method selection (pySigma-style). The Backend trait gains correlation_methods and default_correlation_method. The PostgreSQL backend advertises sliding/tumbling/session (default sliding), and rsigma backend convert -O correlation_method=NAME lets the converting user pick the strategy per backend, overriding a rule's own window hint for that conversion. -O gap=5m supplies the default session gap for rules that declare none (a rule's own gap always wins), so correlation_method=session works across whole rulesets. Invalid methods and malformed gaps are rejected both up front in the CLI and per-rule in the backend; rsigma backend formats <target> lists the available methods.
- Runtime evaluation. The correlation engine honors all three modes: sliding keeps the existing trailing per-event window, tumbling resets per-group state on epoch-aligned bucket boundaries, and session keeps a window open while consecutive in-group events stay within gap, restarting after a gap of inactivity or once the total span would exceed timespan. A late arrival belonging to an earlier tumbling bucket is discarded rather than allowed to reset the active bucket, so out-of-order stragglers cannot wipe an accumulating count. Engine-level state eviction is window-mode aware: sliding state trims by the trailing cutoff as before, while tumbling/session groups are only dropped whole once stale (trimming their front would forget the bucket/session start and silently weaken the timespan cap). The same window logic applies to chained correlations and to event-inclusion buffers. Window bookkeeping is derived from the existing per-group timestamps, so persisted daemon state (snapshots) stays format-compatible and survives upgrades.
- PostgreSQL conversion. The backend renders the windowing strategy from the rule's window attribute: tumbling emits boundary-aligned buckets (time_bucket on TimescaleDB, date_bin on plain PostgreSQL) sized to the rule's timespan, and session emits a gaps-and-islands query (LAG + a running session id) that honors the gap exactly and enforces the timespan cap as a post-aggregation filter (recorded as a warning). An absent or sliding window keeps the existing per-output_format SQL unchanged, so no existing query changes. Tumbling and session cover every correlation type, including temporal/temporal_ordered, which bucket or sessionize the combined detections and count distinct referenced rules (order is not enforced, matching the existing temporal path).
- Conversion warnings channel. ConversionResult gains a warnings field and Backend::convert_correlation_rule_with_warnings, so a backend can emit non-fatal "should warn" diagnostics (the closest faithful approximation) while still converting, distinct from a hard ConvertError. rsigma backend convert prints these to stderr.
- Lint rules. Four new checks: invalid_window_mode, missing_session_gap, gap_without_session, and invalid_gap_format. Both the parser and the linter treat a window/gap key set to a non-string value (e.g. an unquoted gap: 300) as a type error with a quoting hint, rather than silently reading it as absent. The lint catalogue now lists 74 built-in checks plus the 1 reserved enum value (empty_filter_rules).
- API. rsigma-parser gains a WindowMode enum (Sliding/Tumbling/Session, default Sliding) and window: WindowMode plus gap: Option<Timespan> fields on CorrelationRule, populated from either the rsigma.* or the first-class spelling. rsigma-eval adds window_mode and gap_secs to CompiledCorrelation and an apply_window_open helper. rsigma-convert adds the warnings channel described above plus Backend::correlation_methods/default_correlation_method and correlation_method/gap options on PostgresBackend. The LSP offers a new correlation-session snippet that emits the primary rsigma.window/rsigma.gap spelling.
- Backward compatible. window is optional and defaults to sliding; gap is only valid under window: session. No existing rule changes meaning or becomes invalid.
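To make the tumbling and session semantics concrete, here is a minimal Python sketch over integer timestamps in seconds. It is not the engine's implementation: the function names are hypothetical, input is assumed to be sorted, and treating "within gap" and the timespan cap as inclusive bounds is an assumption.

```python
def tumbling_bucket_start(ts, timespan):
    """Start of the epoch-aligned bucket of size `timespan` containing `ts`."""
    return ts - (ts % timespan)


def session_windows(timestamps, gap, timespan):
    """Split sorted timestamps into sessions bounded by gap and timespan."""
    sessions = []
    for ts in timestamps:
        current = sessions[-1] if sessions else None
        # Extend only if the event is close enough to the previous one...
        within_gap = current is not None and ts - current[-1] <= gap
        # ...and the session's total span stays under the timespan cap.
        within_cap = current is not None and ts - current[0] <= timespan
        if within_gap and within_cap:
            current.append(ts)
        else:
            sessions.append([ts])  # inactivity or cap exceeded: restart
    return sessions
```

With gap=30 and timespan=60, the events at 0, 20, 40, 60, and 80 seconds never pause for longer than the gap, yet the event at 80 starts a new session because the total span would exceed the cap. This is the role the notes assign to timespan under window: session.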
sigma-version: gate breaking spec changes by the declared specification major (#188)
rsigma now reads an optional top-level sigma-version attribute on a Sigma document: the Sigma specification MAJOR version the document targets (for example sigma-version: 3). It is the reference implementation of the rule-level spec-version mechanism proposed as SEP #213, split out of array matching so that every future breaking spec change is gated by one declared version rather than a per-feature escape.
- Fixed-floor default. When sigma-version is absent, the document resolves to a fixed floor (major 2, the v2.x line): a constant defined by the specification, not the latest version the tool supports. Existing rules keep their current semantics and are never silently reinterpreted. Only the major is significant (a release string like "2.1.0" is accepted and read for its major), since breaking changes occur only at major bumps.
- Array matching is now gated. Array-matching bracket selectors (field[any], args[0], ...) are active only at major 3 or higher. A rule that declares sigma-version: 3 reads a trailing [...] as an array selector; at the floor (absent or major 2) brackets are literal field-name characters, normalized to the escaped form (args\[0\]) so the escape-aware evaluator and converters resolve them literally. This is a behavior change to the (unreleased) always-on array matching, with no compatibility cost because the feature has not shipped.
- Lint rules. unsupported_sigma_version (error) flags a declared major newer than this build implements; array_matching_without_version (warning) flags a document that uses bracket-selector syntax but resolves below major 3, where the brackets would be read literally rather than as selectors. The linter also resolves cross-document references by id or name, across a whole directory: sigma_version_mismatch (warning) flags a correlation/filter and a rule it references that declare different majors, and unknown_rule_reference (warning) flags a correlation.rules or filter.rules entry that resolves to no rule in the linted set (directory scope only, where the index is complete). Directory linting now runs a two-pass index so references resolve across sibling files. The lint catalogue now lists 70 built-in checks plus the 1 reserved enum value (empty_filter_rules).
- API. rsigma-parser gains a version module (SPEC_VERSION_FLOOR, SPEC_VERSION_ARRAY_MATCHING, SPEC_VERSION_SUPPORTED, resolve_major, array_matching_enabled, is_unsupported), an optional sigma_version: Option<u32> field on SigmaRule, CorrelationRule, and FilterRule, and a fieldpath::escape_brackets helper. Gating happens at parse time, so the evaluator and converters consume the already-gated AST with no version logic of their own.
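The gating rule can be sketched in a few lines of Python. The names below echo the Rust items listed in the API bullet, but the signatures and the gate_field helper are assumptions made for illustration. The sketch only shows the floor default, major extraction from a release string, and bracket escaping below major 3.

```python
import re

SPEC_VERSION_FLOOR = 2           # used when sigma-version is absent
SPEC_VERSION_ARRAY_MATCHING = 3  # first major where [...] is a selector


def resolve_major(declared):
    """Return the spec major for a declared sigma-version, or the floor."""
    if declared is None:
        return SPEC_VERSION_FLOOR
    # Only the major matters, so "2.1.0" and 2 resolve identically.
    return int(str(declared).split(".")[0])


def array_matching_enabled(declared):
    return resolve_major(declared) >= SPEC_VERSION_ARRAY_MATCHING


def escape_brackets(field):
    """Escape unescaped brackets so they are read as literal characters."""
    return re.sub(r"(?<!\\)([\[\]])", r"\\\1", field)


def gate_field(field, declared):
    """Hypothetical helper: keep selectors at major 3+, escape them below."""
    return field if array_matching_enabled(declared) else escape_brackets(field)
```

A document without sigma-version therefore turns `args[0]` into the literal field name `args\[0\]`, while the same key under sigma-version: 3 is left intact for the selector parser.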
Array matching: [any]/[all]/[all_or_empty]/[none] blocks, implicit any-member, and positional indexing (#159)
rsigma can now match members of arrays in event data, an experimental extension proposed to the Sigma specification and accepted as a Sigma Enhancement Proposal (issue #158, sigma-specification Discussion #106, SEP #212). Arrays are first-class in cloud and audit logs (CloudTrail, GCP, Okta, Azure Activity, Kubernetes audit, Windows Event Logs) and there was previously no portable way to match a member. The feature is documented in the new Array Matching guide and ships marked experimental because the surface syntax is still being finalized upstream.
Three constructs, all expressed with [...] selectors on the field path.
- Implicit any-member. A plain field expression matches a scalar or any member of an array (connections: '1.2.3.1'), including through dotted paths into arrays of objects (connections.ip|cidr: '123.1.0.0/16'). This required fixing a first-match-wins bug in JsonEvent::get_field: a dotted path crossing an array now collects every element's leaf value, so any-member matching is correct rather than testing only the first element.
- Object-scope blocks. field[any]:, field[all]:, field[all_or_empty]:, and field[none]: open a nested detection evaluated against a single array member, for same-element correlation (one connection that is both protocol: TCP and in a suspicious CIDR). [any] requires at least one matching member; [all] requires a non-empty array where every member matches; [all_or_empty] is [all] but also matches an empty or missing array (the vacuously-true reading); [none] is the dual of [any] (no member matches) and matches an empty or missing array. The block body comes in two forms (the dual approach accepted in SEP #212): a basic conjunction map (the common case), and an extended nested detection with its own condition: plus named element-scoped sub-selections, for per-element and/or/not (for example "any connection in the CIDR that is not TCP"). The basic form is the implicit-AND degenerate case of the extended form. Inside a block body, a standalone . references the current scalar member (with modifiers, e.g. .|gte), so an array of scalars can carry multiple, named, or negated per-element predicates ("any 5xx response code that is not 504").
- Positional indexing. field[N] selects one element, for ordered arrays where each index carries meaning (args[0] is the process image, args[1..] are parameters). Indices may be negative (args[-1] is the last element, counting from the end). It is deterministic: a missing field, a non-array value, or an out-of-range index does not match. It composes with paths and quantifiers (connections[0].ip, rules[any].ip[0]).
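The quantifier and indexing semantics listed above can be written down as a small Python sketch. It is an illustration, not the evaluator: the function names are hypothetical, and treating a non-array value like a missing array inside the quantifier is an assumption the notes do not spell out.

```python
import ipaddress

SUSPICIOUS = ipaddress.ip_network("123.1.0.0/16")


def match_block(quantifier, value, predicate):
    """Evaluate one object-scope block against an event value."""
    # Assumption: anything that is not a list behaves like a missing array.
    members = value if isinstance(value, list) else None
    if quantifier == "any":
        return bool(members) and any(predicate(m) for m in members)
    if quantifier == "all":
        return bool(members) and all(predicate(m) for m in members)
    if quantifier == "all_or_empty":
        return not members or all(predicate(m) for m in members)
    if quantifier == "none":
        return not members or not any(predicate(m) for m in members)
    raise ValueError(f"unknown quantifier: {quantifier}")


def index_member(value, index):
    """Return (found, member) for field[N]; negative N counts from the end."""
    if not isinstance(value, list) or not -len(value) <= index < len(value):
        return (False, None)  # missing, non-array, or out of range: no match
    return (True, value[index])


def tcp_in_suspicious_cidr(conn):
    """Same-element predicate: both conditions must hold on one member."""
    return (conn.get("protocol") == "TCP"
            and ipaddress.ip_address(conn["ip"]) in SUSPICIOUS)
```

Given one TCP connection to 10.0.0.1 and one UDP connection to 123.1.2.3, `match_block("any", connections, tcp_in_suspicious_cidr)` is false because no single member satisfies both conditions. The object-scope block guarantees this same-element reading, and flattened sibling keys do not.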
Array selectors are kept strictly distinct from the existing all value-list modifier. Only an unescaped trailing [...] is a selector; a literal bracket in a field name is escaped as \[ / \] (mirroring the existing \* / \? wildcard escaping), so args\[0\] matches a field literally named args[0] rather than index 0.
New lint rule. flattened_array_correlation (warning) flags two or more sibling keys that share a quantified array prefix (e.g. connections[any].protocol and connections[any].ip); they open independent scopes and do not correlate on the same element, so the rule points authors at the object-scope block form. The lint catalogue now lists 66 built-in checks plus the 1 reserved enum value (empty_filter_rules).
Conversion. A new Backend::convert_array_match hook lowers the constructs where a backend can express them and errors with UnsupportedArrayMatching otherwise, never emitting a query with different semantics. The PostgreSQL/TimescaleDB backend lowers object-scope blocks to EXISTS / NOT EXISTS over jsonb_array_elements (guarded by jsonb_typeof(...) = 'array') and positional indices to ->n / ->>n (negative subscripts on PG 11+), in JSONB mode. Because [none] and [all_or_empty] must match an empty or missing array, they lower to a CASE that only unnests an actual array and treats a missing/null value as a match, so jsonb_array_elements is never applied to a scalar. The extended block body lowers to the same per-element primitive with a boolean inner predicate (the nested condition: becomes OR / parenthesized NOT over the element alias), so it costs no backend coverage. A backend that cannot lower a positional field[N] index rejects it (via the new Backend::supports_field_index capability) rather than emitting a literal field reference that would diverge from the evaluator; LynxDB, other text backends, and PostgreSQL flat-column mode report the construct as unsupported. Positional indexing is unexpressible in Elasticsearch query DSL because Lucene arrays are unordered sets, which is the strongest argument for evaluating the index in the engine.
AST and API. rsigma-parser gains ArrayQuantifier (Any, All, AllOrEmpty, None) and the Detection::ArrayMatch / Detection::And / Detection::Conditional variants, plus a fieldpath module of shared escape-aware helpers (bracket unescaping and unescaped-bracket detection) reused by the evaluator and converter; rsigma-eval gains the matching CompiledDetection variants. The rule index, bloom filter, and cross-rule Aho-Corasick prefilters no longer prune array-valued fields.
Tests. New parser, evaluator, and converter tests cover the flat-array, object-array fan-out, any/all/none/all_or_empty correlation and empty-array semantics, scalar-member, nested-quantifier, mixed-map, positional-index (including negative indices), extended-block (per-element negation, disjunction, and [all] with a nested condition), . scalar-element marker, and escaped-bracket literal-field cases, plus PostgreSQL golden SQL and unsupported-backend errors.
Strip the UTF-8 BOM from RFC 5424 syslog messages (#187)
RFC 5424 section 6.4 mandates that a UTF-8 MSG begin with a byte order mark (U+FEFF, bytes EF BB BF) as an encoding marker, not as content. syslog_loose preserves it verbatim, and str::trim() does not remove it (U+FEFF is not Unicode White_Space), so the BOM previously leaked into the parsed event: it corrupted the _raw field and anchored matchers (startswith, exact equality), and it blocked embedded-JSON detection because serde_json errors on a leading BOM, silently degrading a BOM-prefixed JSON payload to a key/value event.
- The syslog adapter now strips a single leading BOM from the message body by default, gated by a new SyslogConfig.strip_bom field (defaults to true).
- Opt out with rsigma engine eval --syslog-strip-bom false / rsigma engine daemon --syslog-strip-bom false, or the input.syslog_strip_bom / eval.syslog_strip_bom config keys, to keep the message byte-for-byte.
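The failure mode and the fix are easy to reproduce outside rsigma. The Python sketch below is an analogy, not the adapter's code: Python's str.strip() and json.loads happen to behave like the Rust str::trim() and serde_json behaviour described above, and the function names and the strip flag are hypothetical stand-ins for the strip_bom setting.

```python
import json

BOM = "\ufeff"  # U+FEFF, encoded as EF BB BF in UTF-8


def strip_bom(message, strip=True):
    """Remove a single leading BOM when stripping is enabled."""
    if strip and message.startswith(BOM):
        return message[len(BOM):]
    return message


def parse_json_payload(message):
    """Return the decoded JSON payload, or None if it is not valid JSON."""
    try:
        return json.loads(message)
    except json.JSONDecodeError:
        return None


raw = BOM + '{"user": "alice"}'
# Whitespace trimming does not help: U+FEFF is not whitespace.
still_prefixed = raw.strip().startswith(BOM)
# The BOM-prefixed payload is rejected; the stripped one parses.
before = parse_json_payload(raw)
after = parse_json_payload(strip_bom(raw))
```

Only one BOM is removed, so a message that genuinely begins with two keeps the second as content, matching the "single leading BOM" wording above.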
rstix: STIX 2.1 + TAXII 2.1 library crate, Phase 1 core foundation (#185)
Introduces rstix, a new workspace library crate for native STIX 2.1 and TAXII 2.1 support. This first phase lands the core foundation only; the object model, serialization dispatch, pattern engine, validation pipeline, and graph/marking/store/TAXII runtime behaviours are deferred to later phases.
- Core primitives (rstix::core): a validated StixId in {type}--{uuid} form with 42 typed-ID wrappers and SDO/SCO/SRO/Meta kind discriminants; StixTimestamp and TaxiiTimestamp, where StixTimestamp preserves fractional-second precision for round-tripping but compares and hashes by instant so the same moment with different digit widths is treated as equal; Confidence plus six interchange scales (None/Low/Medium/High, Admiralty, 0-10, WEP, DNI, MISP); SpecVersion; LanguageTag; and the QueryableStixObject/QueryValue query traits.
- Deterministic SCO IDs (rstix::id): generate_sco_id derives UUIDv5 identifiers from RFC 8785 (JCS) canonicalized contributing properties under the STIX namespace. Per-type property selection follows STIX 2.1, including single-hash selection by preference order, the spec-mandated UUIDv4 fallback for process and for objects with no contributing properties present, and a first-available-hash fallback for non-preferred algorithms. The generated IDs are pinned against python-stix2 golden vectors.
- Vocabulary tables (rstix::vocab): open and closed STIX controlled vocabularies and the ordered OpinionValue enum, backed by compile-time phf sets.
- Surface: #![forbid(unsafe_code)], a single default serde feature, and parse_bundle reserved as a NotImplemented entry point for the next phase. The workspace crate map, architecture page, and feature-flags reference are updated for the new crate.
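The deterministic-ID scheme can be approximated in a few lines of Python. This is a sketch under stated assumptions, not rstix's generate_sco_id: the sco_id name is hypothetical, the caller is assumed to have already picked the contributing properties, and json.dumps with sorted keys and compact separators only approximates RFC 8785 canonicalization (it matches for simple string values, not for every number or key-ordering edge case). The namespace UUID is the one STIX 2.1 defines for SCO identifiers.

```python
import json
import uuid

# Namespace defined by STIX 2.1 for deterministic SCO identifiers.
STIX_SCO_NAMESPACE = uuid.UUID("00abedb4-aa42-466c-9c01-fed23315a9b7")


def sco_id(object_type, contributing):
    """Build a '{type}--{uuid}' ID from already-selected properties."""
    if not contributing:
        # No contributing properties present: fall back to a random UUIDv4.
        return f"{object_type}--{uuid.uuid4()}"
    # Approximation of JCS: sorted keys, no insignificant whitespace.
    canonical = json.dumps(
        contributing, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return f"{object_type}--{uuid.uuid5(STIX_SCO_NAMESPACE, canonical)}"
```

Calling `sco_id("ipv4-addr", {"value": "198.51.100.3"})` twice returns the same identifier, which lets independent producers deduplicate the same observable without coordination.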
Gated match-detail enrichment for detection results (#186)
matched_fields entries can now explain why each field matched, gated behind a new opt-in verbosity level so the default wire shape is byte-for-byte unchanged.
- New MatchDetailLevel { Off, Summary, Full } on rsigma-eval, configured via Engine::set_match_detail (and the CorrelationEngine passthrough). Off is the default and preserves the historical {field, value} shape exactly; all new keys are Option/skipped on serialization, so existing sinks, the daemon NDJSON wire format, and the golden tests are unaffected unless a caller opts in.
- Summary adds selection (the originating named detection), matcher (a new MatcherKind enum: exact, contains, startswith, endswith, regex, one_of, cidr, numeric, exists, fieldref, null, bool, expand, timestamp, keyword), and case_sensitive. Full additionally records pattern, the value the matcher tested against (truncated for very long pattern sets). Negated matchers set negated: true.
- Closed two long-standing reporting gaps (visible only at Summary/Full, so Off is untouched): keyword detections, which previously contributed nothing to matched_fields, are now reported under the sentinel field "keyword"; and null-on-absent matches, previously invisible because the field had no value, are now reported with value: null.
- New CompiledMatcher::describe() (returning MatchDescriptor) produces the structural description used to populate these fields. It runs only when a rule matches and only above Off, so the non-matching hot path is unchanged.
- CLI/runtime plumbing: rsigma engine eval --match-detail <off|summary|full>, rsigma engine daemon --match-detail <…> plus the daemon.engine.match_detail config key, and RuntimeEngine::set_match_detail (carried across hot reloads).
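The "skipped on serialization" behaviour is what keeps the default wire shape stable. The Python sketch below shows the idea with a hypothetical render_match helper: the key names follow the bullets above, but the function, its parameters, and the exact set of keys emitted per level are assumptions, not the rsigma-eval types.

```python
import json


def render_match(field, value, level="off", *, selection=None, matcher=None,
                 case_sensitive=None, pattern=None, negated=False):
    """Build one matched_fields entry for the requested detail level."""
    entry = {"field": field, "value": value}
    if level == "off":
        return entry  # historical {field, value} shape, nothing else
    entry.update(selection=selection, matcher=matcher,
                 case_sensitive=case_sensitive)
    if level == "full":
        entry["pattern"] = pattern
    if negated:
        entry["negated"] = True
    # Drop unset optional keys; "value" stays even when it is null.
    return {k: v for k, v in entry.items() if v is not None or k == "value"}


default_line = json.dumps(render_match("CommandLine", "whoami",
                                       selection="sel", matcher="contains"))
```

Because the detail arguments are ignored at the off level, `default_line` serializes to the same two-key object as before, which is why existing sinks see no change unless a caller opts in.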