v0.18.0

@mostafa mostafa released this 01 Jul 10:29
e06e745

TL;DR
RSigma v0.18.0 is the "post-engine alerting and detection lifecycle" release: the daemon grows an Alertmanager-style processing stage and an entity risk layer, the toolkit gains the triage, hygiene, and ADS pieces that close the detection lifecycle, content-based schema and logsource routing lands, a data-aware diagnostics toolkit ships, and rstix completes its STIX 2.1 data model and gains a pattern engine.

  • Post-engine alert processing: an alert pipeline adds deduplication, silencing, inhibition, and incident grouping to the daemon sink path (#255); risk-based alerting scores entities and emits a risk-incident layer (#264); the webhook sink can HMAC-sign every request (#266).
  • Detection lifecycle: a triage feedback loop turns analyst dispositions into a live per-rule false-positive ratio (#263); a rule hygiene report surfaces retirement and clean-up candidates (#262); optional ADS metadata gains linter enforcement and an authoring command (#261).
  • Schema and logsource routing: content-based schema recognition (#245) and per-schema pipeline routing (#246), plus opt-in conflict-based logsource pruning in the evaluator (#249).
  • Diagnostics: an explain-and-introspect toolkit adds detection explain, pipeline transform diff, and correlation window introspection (#270).
  • Daemon transport: Unix domain socket support for the input source, output sink, and API listener (#273).
  • rstix (threat-intel library, not yet independently releasable): completes the STIX 2.1 data model and serialization (#248, #254, #265, #268) and adds a pattern engine that parses and type-checks STIX patterning Levels 1-3 (#272), thanks to @SecurityEnthusiast.
  • Fixes, security, and dependencies: jq halt/halt_error can no longer terminate the process (#247); a transitive anyhow bump clears RUSTSEC-2026-0190 (#271); a rolled-up dependency bump (#257); and the CI/CD guide documents rsigma-action (#260).

rstix Pattern Engine: parse and type-check (Levels 1–3) (#272)

Adds the pattern feature to rstix with a hand-written lexer, recursive-descent parser for STIX patterning Levels 1–3, and an SCO schema type-checker for all 18 cyber-observable types:

  • Pattern::parse — lex, parse, and type-check a pattern string; returns PatternError with byte offset (lex/parse) or path (type-check).
  • Grammar — single observations, top-level AND / OR / FOLLOWEDBY, and Level 3 WITHIN, REPEATS, and START/STOP qualifiers.
  • Type-checker — validates property paths (including extensions.'…', _ref.type, dictionary dot keys, custom SCO types), comparison operators, and constant types against per-SCO schemas.
  • Tests — STIX §9.8 fixture files under tests/fixtures/pattern/, acceptance test modules, gap-table regression coverage.

Evaluation, canonical printer, and IndicatorPattern AST wiring are deferred to later Pattern Engine work (documented in crates/rstix/README.md and docs/library/rstix.md).

Unix domain socket support for the daemon (input source, output sink, API listener) (#273)

The daemon now speaks unix:// on three surfaces (Unix targets only; gated behind the new runtime uds feature, which the daemon feature enables):

  • --input unix:///path/to.sock ingests newline-delimited events over a Unix domain socket, so co-located log shippers (rsyslog omuxsock, syslog-ng unix-stream, Vector, Fluent Bit) can feed the daemon without a TCP port or the HTTP-ingest overhead. One reader task per connection feeds the bounded event channel (the same back-pressure model as stdin), with a 1 MiB per-line cap so an unterminated line cannot exhaust memory.
  • --output unix:///path/to.sock (also accepted by --dlq) writes NDJSON detections and incidents to a collector listening on a local socket, reconnecting once on a transient write failure before routing to the DLQ.
  • --api-addr unix:///path/to.sock serves the health, metrics, and /api/v1/* API (plus OTLP ingestion when built with daemon-otlp) over a permission-gated local socket. The socket is created 0600 and unlinked on clean shutdown, and a stale socket left by a crashed run is reclaimed on the next start. TLS terminates on TCP only, so --tls-cert/--tls-key combined with a unix:// address is rejected at startup, and a unix:// address is exempt from the non-loopback plaintext-bind refusal (the socket file is the trust boundary).

On non-Unix targets unix:// is rejected with the existing unsupported-scheme config error.
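
As a rough illustration of the input side, a co-located producer could write newline-delimited JSON to the daemon's socket along these lines. This is a minimal sketch, not part of RSigma: the helper names encode_ndjson and send_events are hypothetical, and the producer-side size check only mirrors the 1 MiB per-line cap described above (whether the daemon counts the trailing newline toward the cap is an assumption).

```python
import json
import socket

# Mirrors the 1 MiB per-line cap described above; counting the newline
# toward the cap is a conservative assumption of this sketch.
MAX_LINE_BYTES = 1024 * 1024


def encode_ndjson(events):
    """Serialize events as newline-delimited JSON, one compact object per line."""
    lines = []
    for event in events:
        line = json.dumps(event, separators=(",", ":")).encode("utf-8")
        if len(line) + 1 > MAX_LINE_BYTES:
            raise ValueError("event exceeds the 1 MiB line cap")
        lines.append(line + b"\n")
    return b"".join(lines)


def send_events(socket_path, events):
    """Connect to a Unix domain stream socket and write the events as NDJSON."""
    payload = encode_ndjson(events)
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as conn:
        conn.connect(socket_path)
        conn.sendall(payload)
    return len(payload)
```

With the daemon started with --input unix:///path/to.sock, a call such as send_events("/path/to.sock", [{"EventID": 1}]) would deliver one event per line over the socket.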

Security: transitive anyhow bump for RUSTSEC-2026-0190 (#271)

cargo deny flagged RUSTSEC-2026-0190, an unsoundness in anyhow's Error::downcast_mut(): adding context via Error::context and then calling downcast_mut on the returned error violates Rust's borrowing rules and triggers undefined behavior. anyhow reaches the dependency tree transitively. A targeted cargo update -p anyhow moves the workspace lockfile from 1.0.102 to the fixed 1.0.103 with no other dependency changes.

Explain and introspect toolkit: detection explain, pipeline diff, correlation introspection (#270)

A data-aware diagnostics suite that answers questions static tooling structurally cannot. Validation, linting, and the LSP operate on rule files with no event data, so they answer "is this rule well-formed?" A rule can be valid and still silently fail to match the event it was written for. This toolkit answers the orthogonal question: given this rule, this event, this pipeline, and this correlation state, why did I get this result? It ships in three independent tiers, all additive with no hot-path change.

  • Detection explain. A new rsigma engine explain --rules <path> --event <json|@file|-> runs a non-short-circuiting, bloom-free recording evaluator over one rule and one event and reports, for every condition node and field, whether it matched and why not (field absent, value mismatch with the actual value, case mismatch, existence, no keyword match). The default is an indented human tree with pass/fail markers and a one-line reason per failed leaf; --output-format json|ndjson serializes the full trace and csv|tsv emit a flat per-leaf table. Optional -p/--pipeline, --rule-id, and --show-pipeline flags mirror the eval surface. The verdict is computed from the same eval primitives the engine uses, so it can never disagree with engine eval (pinned by a property test). The new public rsigma_eval::explain_rule plus the RuleExplanation/ConditionTrace/DetectionTrace/ItemTrace/MatchReason model expose the same trace as a library API, reusing the CompiledMatcher::describe() helper from match-detail (#186).
  • Pipeline transform diff. A new rsigma pipeline diff --rules <path> -p <pipeline> serializes the rule AST before and after apply_pipelines_with_state, prints a unified diff, and lists the applied transformation ids, so a field rename or an AllOf to AnyOf expansion that a pipeline performs is visible before evaluation. --output-format json emits { before, after, applied_items, changed }. The same transformation summary prints before each trace under engine explain --show-pipeline. The pipeline command group no longer requires the daemon feature; only pipeline resolve stays gated.
  • Correlation window introspection. A new read-only CorrelationEngine::introspect() (and an id/group filtered variant) projects, per correlation and group, the current aggregate versus the threshold (the gap made explicit), the window contents, the last alert and remaining suppression, and the seconds until the next eviction. It is surfaced offline by engine eval --dump-correlation-state (print the final snapshot after replaying an NDJSON file, to stderr so stdout stays machine consumable) and live by two read-only daemon endpoints, GET /api/v1/correlations (compiled correlation list with per-group counts) and GET /api/v1/correlations/state (per-group window snapshot, filterable by ?id= and ?group=). Both compose with schema routing through the shared correlation store.

Complete rstix Data Model + Serialization phase (#268)

Closes the Data Model + Serialization phase for rstix with semantic validation, spec-audit alignment on MUST vs SHOULD rules, and remaining model gaps:

  • Bundle::validate() — returns ValidationReport with advisory warnings: STIX-W0031 (TLP v1 encoding), SCO deterministic id mismatch, granular-marking selector semantics, language-content field/type/list-length checks, ISO 3166-1 alpha-2 country codes, region open vocabulary, CAPEC/CVE external references, relationship endpoint matrix, and encryption-algorithm closed vocabulary. Parse remains permissive for these SHOULD-level rules.
  • Removed Bundle::raw_object() — unmodeled properties round-trip via extra_properties and common.extra only.
  • ObservedDataEmbeddedObject — deprecated observed-data objects map accepts embedded SCO or SRO members.
  • common.extra on SdoSroCommonProps / ScoCommonProps — captures unknown top-level keys on standalone leaf deserialize; drained into bundle extra_properties during bundle parse.
  • email-message — added subject_enc and body_enc fields (STIX §6.6.2).
  • Tests: tests/validation.rs with negative fixtures under tests/fixtures/validation/.

STIX domain objects, bundle parse, and reference validation (rstix) (#265)

The Data Model + Serialization phase adds the full STIX 2.1 domain-object layer and bundle ingestion to rstix:

  • 19 SDO types under model::sdo with SdoObject enum dispatch, per-field rustdoc, IndicatorPattern and ObservedDataForm enums, typed ref unions, and strict "type" deserialize on every leaf type.
  • StixObject top-level enum (SDO / SCO / SRO / Meta / Custom) with QueryableStixObject delegation and x_* top-level property capture during bundle parse.
  • Bundle::parse / parse_bundle() — typed bundle container, duplicate-id rejection, bundle-scoped reference existence and kind checks, relationship matrix validation, x_* merge on serialize, bundle id/spec_version rules (STIX §8), and extra_properties() for vendor extensions.
  • model/validate.rs — shared validators for common props, ref kinds, relationship endpoints (55 STIX 2.1 matrix entries), CAPEC/CVE external refs, and SCO format checks.
  • Fixtures and tests: rich spec-based fixtures for all 19 SDOs, bundle integration tests, 304 crate tests with roundtrip_strict coverage.

Spec-audit alignment: removed stricter-than-spec empty-name and empty-collection checks; added missing ref-kind, meta-object, extension, and bundle rules documented in the crate README invariant table.

Webhook HMAC request signing (#266)

The webhook sink can now HMAC-sign every outbound request so a receiving endpoint can verify the delivery's authenticity and integrity, and reject replays. Signing is opt-in per webhook through a signing: block and is computed over the exact rendered body bytes. It is most useful for the custom and internal relay endpoints an operator controls; the public chat and paging services do not verify a sender HMAC, so it complements the existing per-webhook TLS and bearer-token options rather than replacing them.

  • The default standard scheme follows the cross-industry Standard Webhooks convention, emitting webhook-id, webhook-timestamp, and webhook-signature: v1,<base64 HMAC-SHA256 of "{id}.{timestamp}.{body}">. A github scheme emits X-Hub-Signature-256: sha256=<hex> over the body, and a custom scheme exposes the header name, algorithm (sha256/sha512), encoding (hex/base64), value format, and signed-payload template for receivers like Stripe.
  • The HMAC key is read from the environment (signing.secret_env), resolved once at startup so a missing key fails the daemon at boot, and is never stored in the webhook YAML. secret_encoding: base64 decodes a svix-issued whsec_ secret, and rotate_secret_env emits a second signature for the duration of a key rollover (the standard and custom schemes).
  • The id, timestamp, and signature are minted once per delivery and reused on every retry, so a receiver dedupes redeliveries on webhook-id and enforces a replay window on webhook-timestamp. This rides on a new per-delivery DeliveryContext threaded through the shared sink delivery layer (a DeliverySink::deliver signature change).
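
On the receiving side, verification of the standard and github schemes can be sketched as follows. This is an illustrative receiver, not RSigma code: the function names are hypothetical, the 300-second replay tolerance is an assumed value, and treating webhook-signature as a space-separated list during a key rollover follows the Standard Webhooks convention rather than anything stated above.

```python
import base64
import hashlib
import hmac


def standard_signature(secret, msg_id, timestamp, body):
    """Return the `v1,<base64>` value computed over "{id}.{timestamp}.{body}"."""
    signed = msg_id.encode() + b"." + str(timestamp).encode() + b"." + body
    digest = hmac.new(secret, signed, hashlib.sha256).digest()
    return "v1," + base64.b64encode(digest).decode()


def verify_standard(secret, headers, body, now, tolerance_seconds=300):
    """Check webhook-signature and enforce a replay window on webhook-timestamp.

    `now` and the header timestamp are Unix seconds; the tolerance is an
    assumption of this sketch.
    """
    timestamp = int(headers["webhook-timestamp"])
    if abs(now - timestamp) > tolerance_seconds:
        return False
    expected = standard_signature(secret, headers["webhook-id"], timestamp, body)
    # Assumed: several space-separated signatures may be present during a rollover.
    candidates = headers["webhook-signature"].split(" ")
    return any(
        hmac.compare_digest(expected.encode(), candidate.encode())
        for candidate in candidates
    )


def github_signature(secret, body):
    """Return the X-Hub-Signature-256 value: sha256=<hex HMAC-SHA256 of the body>."""
    return "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
```

The signature must be checked against the raw request bytes, before any JSON parsing or re-serialization, because it is computed over the exact rendered body. Deduplicating redeliveries on webhook-id would sit on top of this check.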

Risk-based alerting: per-entity risk scoring and a risk-incident layer (#264)

A new optional post-engine daemon capability that shifts the unit of alerting from the individual detection to the entity it touches, modeled on Splunk RBA and Entity Risk Scoring. It runs in the sink path after enrichment and before the alert pipeline, so the evaluation hot path is untouched, and it is off until --risk <path> (or daemon.risk) is set.

  • Stage one annotates every in-scope firing with an integer risk score and one or more risk objects (entities such as user, host, src_ip). The score follows a documented precedence: the rsigma.risk_score custom attribute, then a tag_scores map (exact tag or a prefix.* wildcard, reduced by sum or max), then a level_scores map, then default_score. Risk objects are extracted with the shared field-selector namespace (rule, level, event.<path>, match.<field>, enrichment.<path>, correlation.group_key.<field>), so one firing can raise risk on several entities and enrichers can supply entity context. The score and objects are injected into header.enrichments under the reserved risk.score / risk.objects keys.
  • With emit_risk_events: true, the layer also emits a compact risk event per (detection, risk object) pair, disambiguated on the wire by a risk_event key and optionally routed to a dedicated NATS subject. event.<path> selectors require a retained event, with the same strip_event escape hatch as the alert pipeline.
  • Stage two (the incident block) runs a per-entity sliding-window accumulator that sums risk and tracks the distinct ATT&CK tactic count (from attack.<tactic> tags) and the distinct contributing-source count. A RiskIncidentResult fires when an entity crosses score_threshold or tactic_count_threshold over the window, subject to a per-entity cooldown. The incident is one flat NDJSON object disambiguated by a risk_incident_id (UUIDv4), carrying the entity, the window score, the contributing tactics and sources, the window bounds, the trigger, and the top contributing detections (include: refs | results); it is delivered through the existing incident sink path with an optional dedicated subject. The accumulator is bounded by max_open_entities, max_sources_per_entity, and max_results_per_incident with eviction accounting, and ages entries out on the sink-task tick.
  • Config hot-reloads on SIGHUP, file-watcher changes, and POST /api/v1/reload, keeping the previous config on a failed reload; in-flight accumulators survive the swap. Open entities are readable at GET /api/v1/risk.
  • State persists across restarts when --state-db is set: a versioned RiskStateSnapshot is saved to the SQLite store in its own rsigma_risk_state table on the periodic and shutdown hooks beside the correlation and alert-pipeline snapshots, and restored on boot with window-aware pruning. --clear-state skips the restore; a version mismatch starts fresh with a warning.
  • Nine pre-registered Prometheus metrics: rsigma_risk_annotations_total{action}, rsigma_risk_annotation_score, rsigma_risk_objects_total, rsigma_risk_entities_open, rsigma_risk_state_entries, rsigma_risk_evictions_total, rsigma_risk_incidents_emitted_total{trigger}, rsigma_risk_incident_results_total, and rsigma_risk_layer_duration_seconds.
  • The field-selector resolver moved to a shared crate-level rsigma_runtime::selector module so the alert pipeline and the risk layer share one implementation; rsigma_runtime::Selector and rsigma_runtime::alert_pipeline::Selector are unchanged.
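
The stage-one score precedence can be pictured with a short sketch. It is not the daemon's implementation: the dictionary shapes, the tag_reduce key name, and the choice to count a tag once per matching tag_scores entry are assumptions made for illustration.

```python
def resolve_risk_score(rule, config):
    """Resolve a firing's integer score using the documented precedence."""
    # 1. An explicit rsigma.risk_score custom attribute wins.
    explicit = rule.get("custom_attributes", {}).get("rsigma.risk_score")
    if explicit is not None:
        return int(explicit)

    # 2. tag_scores: an exact tag or a `prefix.*` wildcard, reduced by sum or max.
    #    (Assumed: a tag contributes once per matching entry.)
    matched = []
    for tag in rule.get("tags", []):
        for pattern, score in config.get("tag_scores", {}).items():
            is_wildcard = pattern.endswith(".*") and tag.startswith(pattern[:-1])
            if pattern == tag or is_wildcard:
                matched.append(score)
    if matched:
        # `tag_reduce` is a hypothetical key name for the sum/max choice.
        return sum(matched) if config.get("tag_reduce", "max") == "sum" else max(matched)

    # 3. level_scores, then 4. default_score.
    level_scores = config.get("level_scores", {})
    if rule.get("level") in level_scores:
        return level_scores[rule["level"]]
    return config.get("default_score", 0)
```

For example, a rule tagged attack.t1059 with tag_scores of {"attack.t1059": 40, "attack.*": 10} would score 50 under sum and 40 under max in this sketch, and level_scores would only be consulted when no tag entry matches.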

Triage feedback loop: analyst dispositions and a per-rule false-positive ratio (#263)

A new opt-in daemon capability that captures analyst verdicts on the alerts a ruleset produces and turns them into a live per-rule false-positive ratio, the canonical SOC detection-quality metric. It is a measurement loop, not a case manager: it ingests a verdict and emits a ratio. Enabled with --enable-dispositions or daemon.dispositions.enabled: true; off by default, so existing deployments are unchanged.

  • A disposition is one JSON object: rule_id (required, with the title fallback the per-rule metrics use), verdict (true_positive, false_positive, or benign_true_positive), an optional scope (detection default or incident), optional fingerprint and incident_id alert identities, an optional RFC 3339 timestamp (default ingest time), and optional analyst and note for traceability. An incident-scoped verdict with no rule_id resolves to the incident's contributing rules through the live alert-pipeline incident map.
  • POST /api/v1/dispositions accepts a single object, a JSON array, or NDJSON, and returns an ingest summary (accepted, duplicate, rejected, plus per-record errors). GET /api/v1/dispositions returns the per-rule view (counts and the ratio) plus the active window, numerator, and minimum sample.
  • The ratio per rule is false_positive / total_dispositioned over a rolling window (daily buckets, default 30 days), suppressed until the rule reaches daemon.dispositions.min_sample (default 5) so a single false positive cannot publish a misleading 100%. Whether benign_true_positive counts toward the numerator is the daemon.dispositions.numerator knob (fp_only default, or fp_and_btp).
  • Two ingestion paths: the POST endpoint, and an optional pull source (--disposition-source, or daemon.dispositions.source) that reads a dynamic-source file (file, HTTP, or NATS) whose payload is the disposition records, refreshed per the source's policy. Redelivery is idempotent: dispositions dedup on (fingerprint or incident_id, verdict), falling back to (rule_id, timestamp, analyst) when no alert identity is carried, so a file re-read, a NATS redelivery, or an HTTP re-poll never double counts.
  • Four Prometheus metrics: rsigma_rule_false_positive_ratio{rule_title} (gauge, absent until the minimum sample), rsigma_dispositions_total{rule_title,verdict}, rsigma_disposition_ingest_total{source,result}, and rsigma_disposition_ingest_errors_total{reason}.
  • State persists across restarts when --state-db is set: a versioned DispositionSnapshot is saved to the SQLite store in its own rsigma_disposition_state table on the periodic and shutdown hooks beside the correlation and alert-pipeline snapshots, and restored on boot with window-aware pruning (buckets past the window are dropped). --clear-state skips the restore; a version mismatch starts fresh with a warning.
  • The same ratio feeds the rule scorecard command: the GET /api/v1/dispositions view deserializes directly as the scorecard's --triage input (a rules array keyed by rule_id carrying the true/false-positive counts and the derived fp_ratio). This loop owns and finalizes that schema.
  • The store is orthogonal to the eval and sink paths (fed only by its ingestion paths), so it cannot affect detection throughput. Config lives under daemon.dispositions (enabled, source, window, numerator, min_sample).
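
The ratio and the idempotency key described above reduce to a few lines of arithmetic. The sketch below is illustrative only: the function names and the per-rule counts dictionary are hypothetical, the rolling window is not modeled, and it assumes total_dispositioned is the sum of all three verdicts.

```python
def false_positive_ratio(counts, numerator="fp_only", min_sample=5):
    """Return false_positive / total_dispositioned, or None below the minimum sample."""
    total = (
        counts.get("true_positive", 0)
        + counts.get("false_positive", 0)
        + counts.get("benign_true_positive", 0)
    )
    if total < min_sample:
        return None  # suppressed, so one false positive cannot publish 100%
    fp = counts.get("false_positive", 0)
    if numerator == "fp_and_btp":
        fp += counts.get("benign_true_positive", 0)
    return fp / total


def dedup_key(record):
    """Build the idempotency key: alert identity first, record fields as a fallback."""
    identity = record.get("fingerprint") or record.get("incident_id")
    if identity is not None:
        return (identity, record["verdict"])
    return (record["rule_id"], record.get("timestamp"), record.get("analyst"))
```

With three true positives, one false positive, and one benign true positive, the sketch yields 0.2 under fp_only and 0.4 under fp_and_btp; with a single disposition it yields no ratio at all.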

Rule hygiene and retirement report (#262)

A new rsigma rule hygiene subcommand assembles the signals rsigma already produces into one report of retirement and clean-up candidates, the detection-lifecycle phase the toolkit did not yet touch. It runs no evaluation: the static signals are read off the parsed rules, and the data-driven signals are joined from optional snapshots. The feature is additive and ships with no new dependencies.

  • Signals. Seven, in one report: never-fired (silence) and noisy (a robust median-plus-MAD outlier test, with an absolute --noisy-threshold override) over a Prometheus snapshot or endpoint window; untagged (the same attack.* notion rule coverage uses, via a shared extractor); no-owner (from a custom_attributes owner key or the author field); incomplete-ads (a stable detection rule missing required ADS sections, mirroring the lint default bar); broken-fields (a rule whose referenced fields are all in a field-observability snapshot's never-seen set); and deprecated/stale (status: deprecated/unsupported, or a modified/date older than --stale-threshold).
  • Command. rsigma rule hygiene --rules <PATH>... [--metrics <FILE|URL>] [--metrics-window <DUR>] [--corpus <PATH>] [--fields <FILE>] [--silent-threshold <DUR>] [--stale-threshold <DUR>] [--noisy-threshold <N>] [--report <FILE>] [--fail-on <COND>]..., under the rule group. Only --rules is required; the static signals need nothing else, and --corpus is the offline alternative to --metrics for the silence and noisy signals. The report renders through the global output-format layer (TTY table, json/ndjson/csv/tsv) plus a --report JSON file, and a repeatable --fail-on (silent, noisy, untagged, no-owner, incomplete-ads, broken-fields, deprecated, or any) exits 1 under the house exit-code scheme. A hygiene config section carries the inputs, thresholds, and the gate default.
  • Internals. The ATT&CK tag extraction and the Prometheus exposition reader plus metrics loader were lifted into shared crate::rule_meta and crate::metrics_source modules that rule coverage and rule scorecard now consume, so the commands cannot drift on what "untagged" means or on how the per-rule counters are parsed. Behavior-neutral; the existing coverage and scorecard tests (and the promtext fuzz target) are unchanged.
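
For readers unfamiliar with the median-plus-MAD test behind the noisy signal, a generic version looks like the sketch below. The multiplier k=3.0, the strict greater-than comparison, and the function name are assumptions; the release notes do not state the constants the command uses.

```python
from statistics import median


def noisy_rules(fire_counts, k=3.0, absolute_threshold=None):
    """Flag rules whose fire count is an outlier against the ruleset.

    `fire_counts` maps a rule id to its fire count over the window.
    `k` is an assumed multiplier; `absolute_threshold` stands in for an
    explicit override and bypasses the robust test.
    """
    if absolute_threshold is not None:
        return sorted(r for r, n in fire_counts.items() if n > absolute_threshold)
    values = list(fire_counts.values())
    if not values:
        return []
    med = median(values)
    # Median absolute deviation: robust to the outliers being searched for.
    mad = median(abs(v - med) for v in values)
    cutoff = med + k * mad
    return sorted(r for r, n in fire_counts.items() if n > cutoff)
```

Because both the center and the spread are medians, a single rule firing hundreds of times does not inflate the cutoff the way a mean and standard deviation would. Note that when most rules fire equally often the MAD is zero and anything above the median is flagged, which is one reason an absolute override is useful.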

ADS detection-strategy metadata and lint (#261)

Optional Palantir Alerting and Detection Strategy (ADS) metadata on Sigma rules, with enforcement in the linter and a new authoring command. The whole feature is additive metadata plus reads over it: no engine, eval, or hot-path changes, and no new dependencies.

  • Schema. The nine ADS sections map onto a rule's existing fields where they fit (goal from description, categorization from attack.* tags, false positives from falsepositives, priority from level) and carry the rest under a new rsigma.ads.* custom-attribute namespace (strategy, technical_context, blind_spots, validation, priority rationale, response). Values are pure documentation the engine never interprets. A per-rule rsigma.ads.exempt: true opts a rule out of enforcement. A single source-of-truth catalogue lives in rsigma-parser (ads::ads_catalogue()), modeled on the lint catalogue.
  • Lint. Eleven new lint rules in the built-in linter, opt-in via an ads: block in the layered .rsigma-lint.yml (enforce_status, required, severity): one ads_missing_* per section, ads_empty_section for a present-but-blank section, and ads_unknown_section for a typo under rsigma.ads.* (with a safe --fix rename). The checks fire only on detection rules whose status is in the configured enforce set (default [stable]) and reuse the existing catalogue, severity model, suppression, and tag_namespaces setting. The lint catalogue grows to 86 rules.
  • Command. A new rsigma rule doc subcommand reports each rule's present and missing ADS sections through the global --output-format layer or as a canonical --format markdown document, with --missing-only for the CI view and --scaffold/--in-place to prefill the rsigma.ads.* sections. --fail-on-missing makes it a standalone CI gate under the house exit-code scheme; a doc config section carries the gate default.
  • MCP. A new author_ads tool returns a rule's current and missing ADS sections plus a scaffold for an agent to complete, and a rsigma://ads/schema resource exposes the section catalogue alongside rsigma://lint/catalogue.

Documentation: rsigma-action in the CI/CD guide (#260)

The CI/CD guide and the README now document timescale/rsigma-action, the one-step GitHub Actions gate that wraps rule lint, rule validate, a merge-base fields-drift diff, rule backtest, and rule coverage into a single pull-request check with diff annotations, a sticky summary comment, and SLSA-attestation-verified cached binary installs. The manual multi-job workflow stays as the no-third-party-action and other-CI fallback.

Alert pipeline (#255)

A new optional post-engine stage in the daemon sink path, between enrichment and the sinks, configured with --alert-pipeline <path> (or the daemon.alert_pipeline config key) and hot-reloaded on SIGHUP, file-watcher changes, and POST /api/v1/reload; a failed reload keeps the previous pipeline active. It deduplicates results by a configurable fingerprint, modeled on Alertmanager: the first fire passes through and opens an active alert, subsequent fires fold into it, the alert re-emits on repeat_interval carrying the accumulated fire count, and it emits a final resolved record after resolve_timeout and is evicted.

  • Fingerprints are built from a shared field-selector namespace over EvaluationResult: rule, level, event.<path>, match.<field>, enrichment.<path>, and correlation.group_key.<field>. A malformed selector fails daemon startup with an error naming the offending selector.
  • scope (rules / tags / levels) restricts which results the layer acts on; out-of-scope results pass through untouched. strip_event retains the event for selector resolution then drops raw event payloads before delivery. repeat_interval: 0 gives pure suppression with a single resolved summary on expiry.
  • The active-alert store is bounded by dedup.max_active_alerts (default 100000): once full, a first-fire for a new fingerprint passes through un-deduped rather than growing the store, so a high-cardinality fingerprint cannot exhaust memory.
  • Re-emit and resolved records ride the existing NDJSON wire shape, disambiguated by a dedup_state key in enrichments (alongside dedup_fingerprint, dedup_fire_count, dedup_first_seen, dedup_last_seen, and dedup_fields).
  • The Scope filter moved to a shared crate-level rsigma_runtime::scope module; the enrichment module re-exports it, so rsigma_runtime::Scope and rsigma_runtime::enrichment::Scope are unchanged.
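
The dedup lifecycle (first fire, fold, re-emit, resolve, bounded store) can be modeled in a few lines. This is a simplified sketch under stated assumptions, not the daemon's code: the class and return labels are hypothetical, times are plain seconds, and resolve_timeout is measured from the last fire.

```python
class DedupStore:
    """Toy model of an Alertmanager-style active-alert store."""

    def __init__(self, repeat_interval, resolve_timeout, max_active_alerts=100_000):
        self.repeat_interval = repeat_interval
        self.resolve_timeout = resolve_timeout
        self.max_active_alerts = max_active_alerts
        self.active = {}  # fingerprint -> alert state

    def observe(self, fingerprint, now):
        """Classify one result as a first fire, a fold, or an un-deduped pass."""
        alert = self.active.get(fingerprint)
        if alert is None:
            if len(self.active) >= self.max_active_alerts:
                # Store is full: pass through rather than grow memory.
                return "pass_undeduped"
            self.active[fingerprint] = {
                "first_seen": now, "last_seen": now, "last_emit": now, "fire_count": 1,
            }
            return "first_fire"
        alert["last_seen"] = now
        alert["fire_count"] += 1
        return "folded"

    def tick(self, now):
        """Emit (kind, fingerprint, fire_count) records that are due at `now`."""
        out = []
        for fingerprint, alert in list(self.active.items()):
            if now - alert["last_seen"] >= self.resolve_timeout:
                out.append(("resolved", fingerprint, alert["fire_count"]))
                del self.active[fingerprint]
            elif self.repeat_interval > 0 and now - alert["last_emit"] >= self.repeat_interval:
                alert["last_emit"] = now
                out.append(("repeat", fingerprint, alert["fire_count"]))
        return out
```

Setting repeat_interval to 0 in this model disables the repeat branch, which matches the pure-suppression behavior described above: the only later record is the resolved summary.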

A second stage groups dedup survivors into incidents. It assigns each survivor to an incident, annotates the pass-through result with incident_id in enrichments, and emits a higher-level IncidentResult on the Alertmanager timers.

  • Two modes: group_by (default) groups by equality on a selector list with a deterministic incident id stable across restarts; an opt-in entity_graph union-find merges incidents sharing an entity value, guarded against the giant-component failure by a stop_values list and a per-value max_value_cardinality ceiling.
  • Incidents emit on group_wait (initial batch), group_interval (updates), and repeat_interval (re-emit), and emit a final resolved record after resolve_timeout. include: refs | results controls how much contributing detail is embedded, bounded by per-incident caps.
  • IncidentResult is one flat NDJSON object disambiguated by an incident_id key, delivered via an additive Sink::send_incident across stdout/file/NATS (with an optional nats_subject override routing incidents to a dedicated subject); OTLP and webhook sinks do not receive incidents. Open incidents are readable at GET /api/v1/incidents.
  • Nine Prometheus metrics across both stages: rsigma_dedup_results_total{action}, rsigma_dedup_store_entries, rsigma_dedup_evictions_total, rsigma_dedup_summaries_emitted_total, rsigma_incidents_open, rsigma_incidents_emitted_total{trigger}, rsigma_incident_results_total, rsigma_incident_overmerge_total{guard}, and rsigma_alert_pipeline_duration_seconds.
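
The entity_graph mode's union-find merge, together with its two over-merge guards, can be illustrated as follows. The class is a hypothetical sketch: in particular, treating max_value_cardinality as "stop merging on a value once more than N alerts have carried it" is an assumed reading of the per-value ceiling.

```python
class IncidentGraph:
    """Union-find over alerts, merging alerts that share an entity value."""

    def __init__(self, stop_values=(), max_value_cardinality=1000):
        self.parent = {}
        self.stop_values = set(stop_values)
        self.max_value_cardinality = max_value_cardinality
        self.value_owner = {}  # entity value -> first alert that carried it
        self.value_count = {}  # entity value -> number of alerts seen with it

    def _find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def _union(self, a, b):
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a

    def add(self, alert_id, entity_values):
        """Register an alert and return the root that identifies its incident."""
        self._find(alert_id)
        for value in entity_values:
            if value in self.stop_values:
                continue  # guard 1: never merge on a stop value
            count = self.value_count.get(value, 0) + 1
            self.value_count[value] = count
            if count > self.max_value_cardinality:
                continue  # guard 2 (assumed semantics): value is too common
            if value in self.value_owner:
                self._union(self.value_owner[value], alert_id)
            else:
                self.value_owner[value] = alert_id
        return self._find(alert_id)

    def same_incident(self, a, b):
        return self._find(a) == self._find(b)
```

Without the guards, one ubiquitous value such as a shared service account would link every alert into a single giant incident, which is the failure the stop_values list and the cardinality ceiling exist to prevent.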

A silencing stage mutes results matching operator-defined matchers before dedup, modeled on Alertmanager silences.

  • A matcher is selector <op> value over the field-selector namespace, with the =, !=, =~, !~ operators (regex anchored); a matcher set is ANDed. The matcher engine is shared with the inhibition stage.
  • Silences carry a time window (optional RFC 3339 starts_at/ends_at), a derived pending/active/expired state, and an origin: static silences declared under silences: in the config (re-seeded on hot-reload) and api silences created at runtime over POST /api/v1/silences. Expired silences are garbage-collected.
  • New endpoints: GET/POST /api/v1/silences and DELETE /api/v1/silences/{id}. A muted result is acked and dropped before dedup, so it neither emits nor opens an incident. Dynamic (API) silences are bounded by max_silences (default 1000); creation past the cap returns 429.
  • Two metrics: rsigma_silenced_total and rsigma_silences_active.
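
Matcher semantics and the derived silence state can be sketched briefly. This is an illustration rather than the daemon's matcher engine: the tuple representation, the treatment of a missing field as a non-match for =~, and the exclusive ends_at bound are assumptions, and times are plain comparable numbers instead of parsed RFC 3339 strings.

```python
import re


def matcher_matches(op, expected, actual):
    """Evaluate one `selector <op> value` matcher against a resolved field value."""
    if op == "=":
        return actual == expected
    if op == "!=":
        return actual != expected
    # Anchored regex: the whole value must match, not just a substring.
    found = actual is not None and re.fullmatch(expected, actual) is not None
    if op == "=~":
        return found
    if op == "!~":
        return not found
    raise ValueError(f"unknown matcher operator: {op}")


def silence_matches(matchers, fields):
    """A matcher set is ANDed: every (selector, op, value) must hold."""
    return all(
        matcher_matches(op, value, fields.get(selector))
        for selector, op, value in matchers
    )


def silence_state(starts_at, ends_at, now):
    """Derive pending/active/expired from an optional time window."""
    if ends_at is not None and now >= ends_at:
        return "expired"
    if starts_at is not None and now < starts_at:
        return "pending"
    return "active"
```

Anchoring matters in practice: a matcher such as event.host =~ web-.* matches web-01 but not my-web-01, so a silence does not quietly widen to hosts that merely contain the pattern.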

An inhibition stage mutes a target result while a matching source is active, modeled on Alertmanager inhibit_rules.

  • Config-driven inhibit_rules, each { source_match, target_match, equal, duration } reusing the matcher engine. While a result matching source_match has been seen within duration, any result matching target_match sharing the same equal selector values is muted.
  • Carries Alertmanager's self-inhibition guard (a result matching both sides does not inhibit itself) and is non-transitive: a silenced source still inhibits its targets (the active-source index is updated from every non-inhibited result before silencing), but an inhibited target does not become a source.
  • Two metrics: rsigma_inhibited_total{rule} and rsigma_inhibit_sources_active.
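
A toy model of the inhibition logic may help make the guard and the non-transitivity concrete. Everything here is an assumption for illustration: the class name, matchers expressed as plain callables, times in seconds, and the simplification that a result matching both sides of a rule is never inhibited by that rule.

```python
class Inhibitor:
    """Mute target results while a matching source is active."""

    def __init__(self, rules):
        # Each rule: {"source_match": callable, "target_match": callable,
        #             "equal": [selectors], "duration": seconds}
        self.rules = rules
        self.sources = {}  # (rule index, equal values) -> last time a source was seen

    def _key(self, index, rule, fields):
        return (index, tuple(fields.get(selector) for selector in rule["equal"]))

    def check(self, fields, now):
        """Return True when the result is inhibited; otherwise record it as a source."""
        inhibited = False
        for index, rule in enumerate(self.rules):
            is_source = rule["source_match"](fields)
            is_target = rule["target_match"](fields)
            # Self-inhibition guard (simplified): skip results matching both sides.
            if is_target and not is_source:
                seen = self.sources.get(self._key(index, rule, fields))
                if seen is not None and now - seen <= rule["duration"]:
                    inhibited = True
        if not inhibited:
            # Non-transitive: only non-inhibited results can become sources.
            for index, rule in enumerate(self.rules):
                if rule["source_match"](fields):
                    self.sources[self._key(index, rule, fields)] = now
        return inhibited
```

In this model, a host-down result on one host mutes low-level results that share the same event.host for the rule's duration, while results from other hosts pass through.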

The alert pipeline persists its state across restarts when --state-db is set.

  • A versioned AlertPipelineSnapshot (active dedup alerts, open incidents, dynamic silences, and the inhibition active-source index) is saved to the existing SQLite store in its own rsigma_alert_pipeline_state table, on the periodic and shutdown hooks beside the correlation snapshot, and restored on boot.
  • Restore is window-aware: dedup alerts past resolve_timeout, incidents past their resolve_timeout, silences past ends_at, and inhibition sources past their rule's duration are pruned. Deterministic group_by incident ids survive the restart; a version mismatch starts fresh with a warning. --clear-state skips the restore and --keep-state forces it, matching the correlation-state flags.

rstix: SCO per-field rustdoc (#254)

  • Per-field documentation on all 18 SCO types, 12 predefined extensions, and nested public structs (EmailMimePart, WindowsRegistryValue, X509V3Extensions, PE header/section types, etc.).
  • Removed #![allow(missing_docs)] from model::sco and model::sco::extensions; strict cargo doc now enforced for the SCO surface.
  • Runnable # Examples on representative types using spec fixtures.

rstix: STIX cyber-observable (SCO) model (#248)

All 18 STIX 2.1 cyber-observable types land in model::sco with strict fixture-backed round-trips:

  • Types: artifact, autonomous-system, directory, domain-name, email-addr, email-message, file, ipv4-addr, ipv6-addr, mac-addr, mutex, network-traffic, process, software, url, user-account, windows-registry-key, x509-certificate.
  • Dispatch: ScoObject (#[non_exhaustive]) delegates QueryableStixObject; created() / modified() always None for SCO arms.
  • Typed ref unions: DomainNameResolvesToRef, DirectoryContainsRef, NetworkTrafficEndpointRef, EmailMimeBodyRawRef with cross-type negative fixtures.
  • Extensions: 12 predefined SCO extensions under model::sco::extensions (archive-ext, ntfs-ext, pdf-ext, raster-image-ext, windows-pebinary-ext, http-request-ext, icmp-ext, socket-ext, tcp-ext, unix-account-ext, windows-process-ext, windows-service-ext) validated from parent validate().
  • Invariants: ModelError variants for enforced SCO rules; integration tests in tests/spec.rs use roundtrip_strict.

Logsource-aware evaluation (#249)

Opt-in, conflict-based logsource pruning in the evaluation engine (rsigma-eval). Engine::set_logsource_extractor installs a LogSourceExtractor that derives each event's logsource from configurable fields (defaulting to product/service/category) plus optional static defaults, and the engine then skips any candidate rule whose logsource conflicts with the event's before matching. Disabled by default with the hot path unchanged, and fail-open: an event with no extractable logsource is evaluated against every rule.

  • Conflict-based, not subset: a rule is skipped only when a dimension (product, service, or category) is set on both the rule and the event and the values differ, so an event tagged only product: windows skips product: linux rules while still evaluating Windows-category and logsource-less rules. This is distinct from the existing subset logsource_matches, which is unchanged.
  • Backed by a product-partitioned rule index, so always-evaluated rules of a conflicting product are never iterated at all, instead of being visited and then filtered out; service and category remain a residual filter. The evaluation cost of a product-tagged event against a ruleset split across products drops roughly in proportion to the conflicting fraction.
  • --logsource-routing on engine eval and engine daemon enables pruning; --logsource-field-map product=...,service=...,category=... remaps the event field names each dimension is read from; --event-logsource product=windows,... sets a static logsource for a single-source pipeline. The same keys live under a logsource_routing block in the daemon and eval config sections, with the usual CLI > env > file precedence. Schema routing and logsource pruning compose: each routed per-schema engine prunes its own candidates.
  • EVTX-only format default: engine eval -e @file.evtx supplies product: windows when no explicit or static product is configured. Ambiguous wire formats (JSON, syslog, logfmt, CEF, OTLP) never infer a product, so a conflict-based misprune cannot silently drop rules.
  • Two Prometheus counters on the daemon: rsigma_rules_pruned_by_logsource_total and rsigma_events_without_logsource_total (fail-open visibility).
  • Correlation inherits the pruning, since CorrelationEngine evaluates through the same engine; hot-reload carries the extractor across engine swaps.
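
The conflict rule and the fail-open behaviour described above can be sketched in a few lines. This is a minimal illustration, not the rsigma-eval implementation: the `LogSource` struct and the `conflicts` and `should_evaluate` functions are hypothetical stand-ins, values are compared with exact string equality as an assumption, and the product-partitioned index is left out.

```rust
/// Hypothetical stand-in for a logsource; not the rsigma-eval type.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct LogSource {
    pub product: Option<String>,
    pub service: Option<String>,
    pub category: Option<String>,
}

impl LogSource {
    pub fn new(product: Option<&str>, service: Option<&str>, category: Option<&str>) -> Self {
        Self {
            product: product.map(str::to_owned),
            service: service.map(str::to_owned),
            category: category.map(str::to_owned),
        }
    }
}

/// A dimension conflicts only when it is set on both sides and the values differ.
/// Assumption: exact string comparison.
fn dimension_conflicts(rule: Option<&str>, event: Option<&str>) -> bool {
    matches!((rule, event), (Some(r), Some(e)) if r != e)
}

/// True when any of product, service, or category conflicts.
pub fn conflicts(rule: &LogSource, event: &LogSource) -> bool {
    dimension_conflicts(rule.product.as_deref(), event.product.as_deref())
        || dimension_conflicts(rule.service.as_deref(), event.service.as_deref())
        || dimension_conflicts(rule.category.as_deref(), event.category.as_deref())
}

/// Fail-open: an event with no extractable logsource (`None`) is evaluated
/// against every rule.
pub fn should_evaluate(rule: &LogSource, event: Option<&LogSource>) -> bool {
    match event {
        Some(ev) => !conflicts(rule, ev),
        None => true,
    }
}
```

With an event tagged only product: windows, a product: linux rule is skipped, while a Windows rule that also sets a category and a rule with no logsource are both still evaluated, because an unset dimension on either side never counts as a conflict.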

Schema-aware routing (#246)

--schema-routing on engine eval and engine daemon classifies each event and routes it to the pipeline-set bound to its schema, instead of applying one pipeline set to every event. Bindings come from the routing: section of --schema-config (bindings, default_pipelines, on_unknown); --on-unknown overrides the unknown-handling policy (warn, drop, passthrough, error).

  • Multi-engine dispatch: one detection engine is built per distinct pipeline-set; each event is classified, then evaluated against the engine for its schema's bound pipelines, with a default-set fallback for known-but-unbound and unknown schemas. Batch detection across events runs in parallel (under the parallel feature); correlation stays sequential.
  • Unified cross-schema correlation: detections from every per-schema engine feed one shared correlation store, and group-by extraction is schema-aware, so the same entity (a user, host, or IP) correlates across schemas even when each schema names the field differently (for example ECS user.name versus User).
  • Hot-reload rebuilds the per-schema engines and carries the shared correlation state across the swap. Dynamic (${source.*}) pipelines bound to a schema are resolved at load time and on hot-reload, with the same fail-closed policy as non-routing pipelines.
  • Config-file support: the schema flags map to a schema block under both daemon (observe, routing, config, on_unknown) and eval (routing, config, on_unknown) in the layered config; a flag always wins over the file.
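
The multi-engine dispatch can be pictured as interning pipeline sets, so that schemas bound to the same set share one engine slot and anything unbound falls back to the default set. The sketch below is a hedged illustration only: the `Router` type, its methods, and the pipeline names used with it are hypothetical, two sets are treated as equal only when they list the same pipelines in the same order (an assumption), and the on_unknown policy is not modelled.

```rust
use std::collections::HashMap;

/// Hypothetical router from schema name to an engine slot.
pub struct Router {
    /// Distinct pipeline sets; one engine would be built per entry.
    pub pipeline_sets: Vec<Vec<String>>,
    /// Schema name -> index into `pipeline_sets`.
    by_schema: HashMap<String, usize>,
    /// Index of the default pipeline set.
    default_idx: usize,
}

impl Router {
    pub fn new(default_pipelines: &[&str]) -> Self {
        let mut router = Router {
            pipeline_sets: Vec::new(),
            by_schema: HashMap::new(),
            default_idx: 0,
        };
        router.default_idx = router.intern(default_pipelines);
        router
    }

    /// Returns the slot of an identical, already-known pipeline set, or adds a new one.
    fn intern(&mut self, set: &[&str]) -> usize {
        let owned: Vec<String> = set.iter().map(|s| s.to_string()).collect();
        if let Some(i) = self.pipeline_sets.iter().position(|p| *p == owned) {
            return i;
        }
        self.pipeline_sets.push(owned);
        self.pipeline_sets.len() - 1
    }

    /// Binds a schema to a pipeline set, reusing an existing engine slot when possible.
    pub fn bind(&mut self, schema: &str, pipelines: &[&str]) {
        let idx = self.intern(pipelines);
        self.by_schema.insert(schema.to_string(), idx);
    }

    /// Engine slot for a classified event. `None` stands for an unknown schema;
    /// known-but-unbound and unknown schemas both fall back to the default set.
    pub fn engine_for(&self, schema: Option<&str>) -> usize {
        schema
            .and_then(|s| self.by_schema.get(s).copied())
            .unwrap_or(self.default_idx)
    }
}
```

Binding two schemas to the same pipeline list yields one shared slot, so the number of engines tracks the number of distinct pipeline sets rather than the number of schemas.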

Schema-aware log source recognition (#245)

Content-based schema classification that recognizes the structure of each event from its marker fields and values rather than its wire format, so a mixed JSON stream of ECS, flat Sysmon, rendered Windows Event Log, CEF, and OCSF events can be told apart.

  • engine classify: a diagnostic that reads a single event, an NDJSON file, or stdin and reports the recognized schema (or unknown) per event plus a per-schema summary, rendered through the global output-format layer. --schema-config merges user-defined signatures over the built-ins.
  • Daemon schema observability: --observe-schemas classifies every event and exposes the per-schema breakdown and unknown rate over GET /api/v1/schemas and the rsigma_events_by_schema_total{schema} and rsigma_events_unknown_schema_total metrics. An optional --schema-config merges user signatures over the built-ins.
  • Declarative signatures (field present/absent, any-of, equals, regex) live in rsigma-eval; built-ins cover ECS, OCSF, rendered Windows Event Log, Sysmon, CEF, and a low-specificity generic_json fallback. An event matching no signature is reported as unknown, the signal for an unsupported schema.
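
Declarative signatures of this kind can be sketched as a small predicate tree evaluated against each event. Everything in the block is an assumption made for illustration: the `Predicate`, `Signature`, `classify`, `event`, and `example_signatures` names are hypothetical, events are flattened to string maps, the marker fields (ecs.version, class_uid, Channel) are examples rather than the built-in signatures, first-match-wins ordering is assumed, and the regex predicate is omitted to keep the example dependency-free.

```rust
use std::collections::HashMap;

/// Simplified event: a flat map of field name to string value.
pub type Event = HashMap<String, String>;

/// Hypothetical predicate kinds mirroring present/absent, equals, and any-of.
pub enum Predicate {
    Present(&'static str),
    Absent(&'static str),
    Equals(&'static str, &'static str),
    AnyOf(Vec<Predicate>),
}

impl Predicate {
    pub fn matches(&self, event: &Event) -> bool {
        match self {
            Predicate::Present(field) => event.contains_key(*field),
            Predicate::Absent(field) => !event.contains_key(*field),
            Predicate::Equals(field, value) => {
                event.get(*field).map(String::as_str) == Some(*value)
            }
            Predicate::AnyOf(preds) => preds.iter().any(|p| p.matches(event)),
        }
    }
}

/// A signature matches when every one of its predicates matches.
pub struct Signature {
    pub schema: &'static str,
    pub all_of: Vec<Predicate>,
}

/// Returns the first matching schema, or `None` for an unknown event.
pub fn classify(signatures: &[Signature], event: &Event) -> Option<&'static str> {
    signatures
        .iter()
        .find(|sig| sig.all_of.iter().all(|p| p.matches(event)))
        .map(|sig| sig.schema)
}

/// Builds an event from field/value pairs.
pub fn event(pairs: &[(&str, &str)]) -> Event {
    pairs.iter().map(|(k, v)| (k.to_string(), v.to_string())).collect()
}

/// Illustrative signatures only; not the built-ins shipped in rsigma-eval.
pub fn example_signatures() -> Vec<Signature> {
    vec![
        Signature { schema: "ecs", all_of: vec![Predicate::Present("ecs.version")] },
        Signature {
            schema: "ocsf",
            all_of: vec![Predicate::Present("class_uid"), Predicate::Absent("ecs.version")],
        },
        Signature {
            schema: "sysmon",
            all_of: vec![Predicate::AnyOf(vec![
                Predicate::Equals("Channel", "Microsoft-Windows-Sysmon/Operational"),
                Predicate::Present("UtcTime"),
            ])],
        },
    ]
}
```

An event that satisfies no signature yields `None`, which corresponds to the unknown result reported by the classifier.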

Fixed

  • jq extract expressions can no longer terminate the process. The halt and halt_error filters are implemented in jaq-std with std::process::exit, so a single source or enrichment expression could take the whole engine down; both are now removed from the supported filter surface and surface as an ordinary expression error instead (#247).

Dependency bumps (#257)

Rolls up the open Dependabot PRs into a single merge, regenerating the lockfiles against current main rather than replaying stale lockfile bases.

  • Rust (workspace Cargo.lock): opentelemetry_sdk 0.32.0 to 0.32.1 (#256), jaq-core 3.0.0 to 3.1.0 (#235), jaq-json 2.0.0 to 2.0.1 and jaq-std 3.0.0 to 3.0.1 (#258), insta 1.47.2 to 1.48.0 (#233), evtx 0.11 to 0.12.2 (#232), and the patch-updates group (#253): regex 1.12.3 to 1.12.4, daachorse 3.0.1 to 3.0.2, time 0.3.47 to 0.3.49, prost 0.14.3 to 0.14.4, getrandom 0.4.2 to 0.4.3, and uuid 1.23.2 to 1.23.3. The getrandom bump drops the wit-bindgen/wasi build toolchain that 0.4.2 pulled in.
  • Rust (fuzz/Cargo.lock): regex 1.12.3 to 1.12.4 (#229).
  • CI (all repinned by commit SHA, batched via the actions-updates group, #252): actions/checkout v6.0.3 to v7.0.0, taiki-e/install-action v2.81.4 to v2.82.0, and rust-lang/crates-io-auth-action v1.0.4 to v1.0.5.
  • VS Code extension: @types/node 25.9.1 to 25.9.3 and @types/vscode 1.120.0 to 1.125.0 (#251), plus the transitive form-data 4.0.5 to 4.0.6 (#224), js-yaml 4.1.1 to 4.2.0 (#225), markdown-it 14.1.1 to 14.2.0 (#226), and undici 7.25.0 to 7.28.0 (#236).
  • Held back: the rusqlite 0.39 to 0.40.1 bump (#234) pulls libsqlite3-sys 0.38.1, whose build script needs the cfg_select! macro that is unavailable on the pinned MSRV (1.88.0).

v0.17.0...v0.18.0