v0.14.0

@mostafa mostafa released this 05 Jun 09:56
· 53 commits to main since this release
b5334eb

TL;DR
RSigma v0.14.0 is the "layered config, structured output, and correctness/hardening" release:

  • Layered YAML configuration with explicit precedence (flag > env > project > user > system > default) plus a new rsigma config group (init, validate, show, schema, path, reload).
  • Structured output everywhere: a global --output-format <json|ndjson|table|csv|tsv> selector with a TTY-aware default, plus global --color, --quiet, and --no-stats.
  • Custom linter tag namespaces via a repeatable --tag-namespace flag and a tag_namespaces config key, so organisation-specific tags no longer force disabling unknown_tag_namespace wholesale, thanks to @fwosar.
  • Sigma correctness: multi-field value_count composite keys, compile-time rejection of multi-field numeric aggregations, empty value_median returns None, cross-crate detection-name selector consistency, and convert-side rejection of modifiers it cannot express.
  • Runtime hardening: a category-based HTTP egress policy (SSRF/cloud-metadata defense applied at DNS resolution), a 10 MiB enricher response cap, hot-reload that preserves engine tuning, and fail-closed dynamic-source resolution.
  • Evaluator and parser robustness: compile-time rejection of conflicting detection-modifier combinations, allocation-free JsonEvent dot-path traversal, and CLI diagnostics that stop silently swallowing invalid status / level / related: metadata.
  • Detached dynamic sources: pipeline-embedded sources: now warns louder on stderr and through the daemon hot-reload path.
  • Release pipeline, CI, Docker, and supply-chain hardening before publish, two batched Dependabot rollups, and a docs-accuracy sweep across the site.

Documentation accuracy: TLS, feature flags, metric and lint counts, CLI surface, endpoint inventory, benchmark freshness (#181)

A docs-only sweep that closes the accuracy gaps that accumulated over the v0.13.x line. No source code changes; every fix points the documentation at the actual behaviour that ships in the binary.

  • Daemon TLS is no longer described as roadmap. docs/reference/http-api.md and docs/reference/architecture.md previously told operators that in-process TLS termination was planned and linked to issue #128. The daemon-tls Cargo feature, the --tls-cert / --tls-key / --tls-client-ca / --tls-min-version flag set, and the SIGHUP cert hot-reload all shipped in the v0.14.0 release window; both pages now point at the existing security.md#tls-termination-for-the-api-listener write-up instead.
  • Feature flag catalogue matches the manifest again. docs/reference/feature-flags.md opened by claiming a workspace of seven crates (it has been six since the binary / rsigma-cli split). The daemon-tls row listed rustls-pemfile as a pulled-in dependency; the actual manifest pulls rustls, tokio-rustls, rustls-pki-types, x509-parser, hyper, hyper-util, and tower-service. The "per-feature CI matrix" section described a per-feature opt-in matrix that does not exist in .github/workflows/ci.yml today (CI runs --all-features plus the three-OS test matrix). All three drifts are corrected, and the production-recommended cargo install recipe now includes daemon-tls.
  • Metric counts agree across the three pages that publish them. docs/reference/metrics.md headlined "30 metric names across four concerns" while its own section headings summed to 37 rows; the actual registry in crates/rsigma-cli/src/daemon/metrics.rs exposes 38 metric names under --all-features (33 always-present plus 3 OTLP and 2 TLS gated on the matching build features), grouped into seven concerns. Engine core is 17 metrics, not 16. docs/guide/streaming-detection.md and docs/guide/observability.md propagated the stale "27" number; both are now aligned, and observability gains the previously missing enrichment (6) and TLS (2) rows.
  • Lint rule counts are honest. docs/reference/lint-rules.md claimed 66 built-in checks; one of them (empty_filter_rules) is enum-only and not emitted in production. The page now reads "65 built-in checks plus 1 reserved enum value". The "Filter rules (7)" heading was actually a table of 8 rows including the reserved variant; it is relabelled "Filter rules (8 IDs, 7 emitted)". The "Detection-modifier hygiene (5)" heading listed 7 rows that are not duplicates of the detection section above; it is relabelled "Detection-modifier hygiene (7)" with the misleading "subset of the detection rules above" wording removed.
  • CLI global flags are fully documented. docs/cli/index.md listed only --log-format and asserted "every subcommand accepts one global flag", missing the other four globals (--output-format, --color, --quiet, --no-stats) that have shipped alongside it. The overview now describes all five with their defaults, accepted values, effect, and the layered flag > env > config > default precedence model. The command tree gains the previously omitted rule migrate-sources entry, and docs/cli/rule/lint.md drops the stale command-local --color flag (color is global now) and documents the four machine renderers (json, ndjson, csv, tsv) the lint command honours when --output-format is set explicitly.
  • Command-group overviews list every group. docs/getting-started/concepts.md claimed "the five command groups" but the table only listed four (engine, rule, backend, pipeline); the table now gains the missing config row with its six subcommands (init, validate, show, schema, path, reload). The rule row picks up migrate-sources. docs/reference/output.md drops rule validate from the table output consumers (the command always prints its bespoke per-file summary regardless of --output-format) and spells that out so operators are not surprised when the selector does nothing on that command.
  • POST /api/v1/sources/resolve/{source_id} is in the HTTP API inventory. The daemon registers both the body variant (/api/v1/sources/resolve with a JSON body that names one source) and the path-parameter variant (/api/v1/sources/resolve/{source_id} with no body). Only the body variant was documented; the path variant now appears in both the summary table and a short body section with the success response (200 {"status":"resolve_triggered","source_id":"..."}) and the two failure responses (404 when no dynamic sources are configured, 429 when a refresh for the same source is still in flight).
  • Benchmark figures are labelled as captured on v0.9.0. BENCHMARKS.md (and the docs-site mirror docs/benchmarks.md that includes it) carried Date: 2026-05-07 / Version: 0.9.0 headers; the workspace has since shipped through v0.13.0 and parts of the hot path have moved. The headers are now relabelled "Date captured" / "Captured on version", and a one-paragraph freshness admonition asks anyone refreshing the numbers to update the metadata block in the same commit.
  • Site-level loose ends. The llmstxt plugin block in mkdocs.yml now lists rule/migrate-sources, every cli/config/* page, reference/output.md, reference/configuration.md, and guide/enrichers.md -- five public pages that an LLM consuming the generated llms.txt had no way to surface before. docs/developers/testing.md had a stale CLI E2E table ("12 files / 167 tests") that missed seven files added since (cli_config.rs, cli_daemon_enrichment.rs, cli_daemon_fields_observer.rs, cli_daemon_tls.rs, cli_migrate_sources.rs, cli_output_format.rs, cli_sources_deprecation.rs); the page now lists 19 files with their per-file test counts and asks readers to verify the exact total against their tree rather than copy a stale number forward.

Eval and convert internals: modifier validation, dot-path perf, golden routing (#180)

Three independent quality fixes for the evaluator and converter that all surface bugs the previous code silently swallowed or paid an avoidable allocation for.

Conflicting modifier combinations are now rejected at compile time. compile_detection_item previously turned the parsed modifier list into a flat boolean context and dispatched through compile_value in a fixed precedence order. Whichever flag the dispatch checked first won, so a rule declared as Field|cidr|contains silently produced a CIDR match with contains dropped, Field|re|contains produced a regex match with contains dropped, Field|gt|contains ran the numeric comparison and dropped contains, Field|exists|contains collapsed to an existence check that dropped both the substring matcher and the value, Field|wide|utf16 silently picked whichever UTF-16 dialect the dispatch implemented first, and Field|i with no |re silently became a no-op. The rules still compiled, still matched something, but the semantics were never what the author wrote.

A new validate_modifiers pass runs before compile_value and rejects five categories of contradiction:

  • more than one operator per item (the operator set spans contains / startswith / endswith / re / cidr / exists / fieldref / gt / gte / lt / lte and every timestamp part);
  • more than one UTF-16 encoding from wide / utf16 / utf16be;
  • base64 together with base64offset;
  • any value transformation (base64 / base64offset / wide / utf16 / utf16be / windash / expand) on a field that also carries a non-string operator that does not consume the transformed value;
  • the regex flag modifiers (|i / |m / |s) without |re.

Legal combinations stay legal: |re|i|m|s, |base64|wide, |contains|cased, |contains|all with multiple values, |contains|neq, |re|neq, and a single timestamp part all continue to compile. Errors flow through the existing EvalError::InvalidModifiers variant with a message that lists every offending modifier so the rule author can pick which one to drop. The full SigmaHQ corpus (rules/ plus rules-compliance/ plus rules-emerging-threats/ plus rules-placeholder/ plus rules-threat-hunting/, ~3.7k detection rules at the pinned CI SHA) compiles unchanged.
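
The validation pass described above can be sketched roughly as follows. This is a minimal illustration, not the rsigma implementation: the Modifier enum is a hypothetical, trimmed-down set, the error is a plain String rather than EvalError::InvalidModifiers, and the fourth category (value transformations combined with a non-string operator) is left out for brevity.

```rust
/// Hypothetical, trimmed-down modifier set for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Modifier {
    Contains, Re, Cidr, Exists, Gt, // operators
    Wide, Utf16, Utf16be,           // UTF-16 encodings
    Base64, Base64Offset,           // base64 transformations
    I, M, S,                        // regex flags
    All, Cased,                     // helpers that never conflict here
}

impl Modifier {
    fn is_operator(self) -> bool {
        matches!(
            self,
            Modifier::Contains | Modifier::Re | Modifier::Cidr | Modifier::Exists | Modifier::Gt
        )
    }
    fn is_utf16(self) -> bool {
        matches!(self, Modifier::Wide | Modifier::Utf16 | Modifier::Utf16be)
    }
    fn is_regex_flag(self) -> bool {
        matches!(self, Modifier::I | Modifier::M | Modifier::S)
    }
}

/// Collect every modifier in `mods` that satisfies `pred`.
fn pick(mods: &[Modifier], pred: fn(Modifier) -> bool) -> Vec<Modifier> {
    mods.iter().copied().filter(|m| pred(*m)).collect()
}

/// Reject contradictory combinations and name every offending modifier.
pub fn validate_modifiers(mods: &[Modifier]) -> Result<(), String> {
    let mut problems = Vec::new();

    let operators = pick(mods, Modifier::is_operator);
    if operators.len() > 1 {
        problems.push(format!("more than one operator: {operators:?}"));
    }
    let encodings = pick(mods, Modifier::is_utf16);
    if encodings.len() > 1 {
        problems.push(format!("more than one UTF-16 encoding: {encodings:?}"));
    }
    if mods.contains(&Modifier::Base64) && mods.contains(&Modifier::Base64Offset) {
        problems.push("base64 together with base64offset".to_string());
    }
    let flags = pick(mods, Modifier::is_regex_flag);
    if !flags.is_empty() && !mods.contains(&Modifier::Re) {
        problems.push(format!("regex flags without re: {flags:?}"));
    }

    if problems.is_empty() { Ok(()) } else { Err(problems.join("; ")) }
}
```

Running the check before any dispatch means the dispatch order no longer decides which modifier wins.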

JsonEvent dot-path traversal no longer allocates per lookup. JsonEvent::get_field is called once per detection item per event for every nested-field rule (process.command_line, actor.id, …) and also drives keyword scans, group-key extraction for correlation, value-count and numeric aggregation field reads, FieldRef matchers, and timestamp extraction. The dot-notation branch previously did let parts: Vec<&str> = path.split('.').collect(); and walked the slice, allocating a small vector on every lookup whose only purpose was to be sliced once per recursion. The walker now consumes the leading segment with str::split_once('.') on each recursion and re-passes the unconsumed path on the array branch (matching the existing OR semantics for events.actors.name style lookups). The pathological trailing-dot case (a.b.) and consecutive-dot case (a..b) keep matching None rather than falsely returning the leaf or panicking; two regression tests cover both inputs.
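
A minimal sketch of the allocation-free walk, assuming a small stand-in Value enum instead of the JSON type the real JsonEvent wraps. The names get_path and obj are hypothetical, and "first matching array element wins" is one plausible reading of the OR semantics mentioned above.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for a JSON value (assumption: the real code walks a full JSON type).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Str(String),
    Array(Vec<Value>),
    Object(BTreeMap<String, Value>),
}

/// Hypothetical helper for building objects in examples.
pub fn obj(pairs: Vec<(&str, Value)>) -> Value {
    Value::Object(pairs.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}

/// Walk `path` one segment at a time without collecting the segments into a Vec.
pub fn get_path<'a>(value: &'a Value, path: &str) -> Option<&'a Value> {
    match value {
        // Arrays: re-pass the unconsumed path to each element; the first hit wins.
        Value::Array(items) => items.iter().find_map(|item| get_path(item, path)),
        Value::Object(map) => match path.split_once('.') {
            // Consume the leading segment and recurse on the remainder.
            Some((head, rest)) => get_path(map.get(head)?, rest),
            // No dot left: this is the leaf key.
            None => map.get(path),
        },
        // A scalar cannot be descended into.
        Value::Str(_) => None,
    }
}
```

With this shape, a.b. ends up looking for an empty key below the leaf and a..b looks for an empty key in the middle, so both return None without special-casing.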

Postgres and LynxDB goldens are now routed through convert_collection. Both runners previously parsed the SigmaCollection and called Backend::convert_rule in a loop, bypassing the orchestration layer that the rsigma backend convert CLI uses. The gap meant that pipeline-state plumbing, per-rule error collection, and the _rule_tables / _rule_schemas / _rule_queries correlation map injection were never exercised by the goldens. Both tests/golden_postgres.rs and tests/golden_lynxdb.rs now invoke convert_collection(&backend, &collection, &[], "default") and assert on the flattened query output, with a hard assertion on output.errors.is_empty() so a silent partial conversion now fails the test instead of producing an empty actual string. The 20 existing goldens (11 Postgres, 9 LynxDB) pass unchanged.

Parser and CLI diagnostics: invalid metadata, output controls, panic-free migrate (#179)

Tightens five small but visible cracks in the parser and CLI surface that all silently swallowed problems an operator was almost certainly trying to catch.

Invalid status / level and malformed related: entries are now surfaced. parse_detection_rule, parse_correlation_rule, and parse_filter_rule previously coerced any unparseable status: or level: into a silent None (get_str(m, "status").and_then(|s| s.parse().ok())), and parse_related filter_mapped away any item that was not a mapping, was missing id/type, or carried an unknown type. A typo such as status: stabel or type: derved round-tripped to the in-memory rule with the field absent and no diagnostic. The parsers now thread a &mut Vec<String> for warnings, push index-qualified messages (related[2] invalid type 'derved' (expected one of: derived, obsolete, merged, renamed, similar)), and let parse_sigma_yaml extend SigmaCollection.errors with the result. Existing CLI surfaces (rule parse, rule validate, the "Loaded rules" path) already render collection.errors, so the new warnings flow through unchanged.

New SigmaCollection ergonomics. Three helpers cover the "treat any error as failure" path that downstream callers were re-implementing each time: SigmaCollection::has_errors(), SigmaCollection::error_count(), and SigmaCollection::into_result() (consumes the collection and returns Err(Vec<String>) when anything failed, Ok(self) on a clean parse). The stale doc on the errors field that referenced a non-existent collect_errors flag is replaced with the actual contract.
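
The warning plumbing and the three helpers from the two paragraphs above might look roughly like this. It is a sketch under stated assumptions: Collection stands in for SigmaCollection, related entries are modelled as (id, type) string pairs rather than YAML mappings, and parse_collection is a hypothetical wrapper.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RelationType { Derived, Obsolete, Merged, Renamed, Similar }

fn parse_relation_type(s: &str) -> Option<RelationType> {
    match s {
        "derived" => Some(RelationType::Derived),
        "obsolete" => Some(RelationType::Obsolete),
        "merged" => Some(RelationType::Merged),
        "renamed" => Some(RelationType::Renamed),
        "similar" => Some(RelationType::Similar),
        _ => None,
    }
}

/// Stand-in for SigmaCollection (assumption: the real struct carries far more).
#[derive(Debug, Default)]
pub struct Collection {
    pub related: Vec<(String, RelationType)>,
    pub errors: Vec<String>,
}

impl Collection {
    pub fn has_errors(&self) -> bool { !self.errors.is_empty() }
    pub fn error_count(&self) -> usize { self.errors.len() }
    /// Consume the collection: Ok(self) on a clean parse, Err(errors) otherwise.
    pub fn into_result(self) -> Result<Self, Vec<String>> {
        if self.errors.is_empty() { Ok(self) } else { Err(self.errors) }
    }
}

/// Keep valid entries and push an index-qualified warning for each invalid one.
pub fn parse_related(
    items: &[(&str, &str)],
    warnings: &mut Vec<String>,
) -> Vec<(String, RelationType)> {
    let mut out = Vec::new();
    for (index, (id, kind)) in items.iter().enumerate() {
        match parse_relation_type(kind) {
            Some(t) => out.push((id.to_string(), t)),
            None => warnings.push(format!(
                "related[{index}] invalid type '{kind}' \
                 (expected one of: derived, obsolete, merged, renamed, similar)"
            )),
        }
    }
    out
}

/// Hypothetical top-level parse that folds the warnings into `errors`.
pub fn parse_collection(items: &[(&str, &str)]) -> Collection {
    let mut warnings = Vec::new();
    let related = parse_related(items, &mut warnings);
    let mut collection = Collection { related, errors: Vec::new() };
    collection.errors.extend(warnings);
    collection
}
```

A caller that wants "any error is a failure" can then write parse_collection(items).into_result()? instead of checking the errors field by hand.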

rule lint honours --quiet and --no-stats. Both global flags had no effect on the human renderer, so CI scrapers that piped only findings still got a "Loaded lint config: …" progress line on stderr and a "Checked N file(s): … passed, … failed" trailing summary on stdout. The summary block is now gated by OutputCtx::show_stats() and the config-load progress by show_progress(); findings still print under both flags. The structured tracing::info!("Lint summary", …) event continues to fire so log-based consumers still see the per-run totals.

Invalid global.output_format / global.color config values now warn instead of silently falling back. A typo like output_format: xml or color: rainbow in the YAML config used to bypass the OutputFormat::parse / ColorChoice::parse filter, return None, and revert the operator to the TTY-aware defaults with no signal. A new output::warn_invalid_global_output wrapper between config::discovered_global_output and OutputCtx::resolve validates both strings, emits a stderr warning that lists the accepted alternatives, and strips the bad value so the resolver still falls back cleanly. The command itself still succeeds because the warning is informational.

rule migrate-sources no longer panics on a pipeline read race. After writing the extracted sources.yml, the rewrite loop in cmd_migrate_sources re-read each pipeline file with std::fs::read_to_string(path).unwrap(). A file deleted between scan and rewrite (or a permission flip on a flaky filesystem) crashed the CLI. The read now matches the soft-error pattern the std::fs::write call below it already used: print a warning: line, skip the offending pipeline, and keep going. The extracted sources file is already on disk at that point, so a single unreadable pipeline does not invalidate the other rewrites.
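
The soft-error pattern can be illustrated as below. This is a sketch, not the body of cmd_migrate_sources: read_pipelines is a hypothetical helper and the warning text is illustrative.

```rust
use std::fs;
use std::path::PathBuf;

/// Read every pipeline file, warning about and skipping any that cannot be read
/// instead of unwrapping and crashing the whole run.
pub fn read_pipelines(paths: &[PathBuf]) -> Vec<(PathBuf, String)> {
    let mut readable = Vec::new();
    for path in paths {
        match fs::read_to_string(path) {
            Ok(text) => readable.push((path.clone(), text)),
            Err(err) => {
                // Soft error: report on stderr and keep going with the other files.
                eprintln!("warning: skipping {}: {err}", path.display());
            }
        }
    }
    readable
}
```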

Release pipeline, CI, Docker, and supply-chain hardening

Tightens every link in the release chain before v0.14.0 ships so the act of publishing itself does not undermine the correctness work that already landed.

publish.yml no longer masks cargo publish failures with || echo "::warning::… already published or failed". A new pre-flight step dry-runs every crate in dependency order before any side-effecting publish; authentication, lockfile drift, and dependency-resolution issues now abort the workflow before any crate hits crates.io. Every actual publish passes --locked. The workflow_dispatch trigger keeps the dry-run rehearsal path; only release: published touches the real registry. The Swatinem/rust-cache step was removed to close a cache-poisoning vector against the signed artifacts.

release-binaries.yml pins the toolchain through dtolnay/rust-toolchain@... stable with targets: set inline (replacing the unpinned rustup update plus follow-up rustup target add). After downloading the per-target archives, the release job generates a SHA256SUMS manifest and uploads it as a release asset, covered by the same actions/attest-build-provenance subject path as the archives. Consumers who do not pull the SLSA attestation can still verify download integrity against the manifest.

Workspace. rust-toolchain.toml pins ambient cargo invocations to the MSRV of 1.88.0, so contributors who do not pass +stable build against the same Rust the MSRV CI job uses. A new [profile.release] block enables lto = "thin", codegen-units = 1, and strip = true. The pin surfaced an existing 1.88.0 clippy / rustdoc backlog (collapsible-else, uninlined-format-args, duplicated #[cfg(feature = "daemon-tls")], broken intra-doc links, unclosed <host> HTML tags); this release ships the cleanup so cargo clippy --workspace --all-targets --all-features -- -D warnings and cargo doc --workspace --no-deps -D warnings -D rustdoc::broken-intra-doc-links are gate-green.

CI. The Sigma corpus regression job now fetches SigmaHQ/sigma at a pinned commit (bumped by editing SIGMA_CORPUS_SHA in ci.yml) instead of master tip, so an upstream rule edit cannot turn the workspace red without a deliberate commit here. The MSRV check gains --all-targets so a test- or example-only dependency that requires a newer Rust cannot slip past the MSRV gate. The coverage job uploads lcov.info as an artifact for external trackers. A new doc job runs cargo doc --workspace --all-features --locked --no-deps with the strict rustdoc gate so a future broken intra-doc link fails CI rather than landing silently.

Docker. Both cargo build stages in the Dockerfile add --locked, and rust-toolchain.toml is part of the dependency-cache layer so the toolchain version is part of the layer's cache key. .dockerignore excludes /fuzz, /tests, /docs, /docs-drafts, /site, and /benches from the build context (root-anchored so per-crate tests/ subdirectories are unaffected). The Grype vulnerability scan runs as a 2-leg matrix (amd64 × arm64) so arch-specific CVEs cannot land on a release image unnoticed; SARIF uploads label findings by arch in the Security tab.

Supply chain. deny.toml denies wildcard dependency versions (wildcards = "deny"), surfaces unmaintained-crate advisories via unmaintained = "workspace", and grows a quarterly-review note plus "Last reviewed" date on the RUSTSEC-2021-0153 ignore. The audit workflow now triggers on deny.toml and its own workflow file in addition to manifest and lockfile changes, and pulls cargo-audit from taiki-e/install-action prebuilds instead of compiling it from source on every invocation. Dependabot picks up two new ecosystems: docker against / (so a new digest for the pinned rust:1-alpine base image surfaces as a PR) and npm against /editors/vscode (so the extension's TypeScript / @vscode/vsce / eslint deps follow the same weekly batching as the Cargo deps).

Runtime hardening: HTTP egress policy, body cap, hot-reload tuning, fail-closed dynamic sources (#167)

A cluster of P0 hardening fixes for the daemon's HTTP surfaces and rule hot-reload. None of these were exploitable in a default deployment, but each silently produced behavior different from what the operator (or rule author) wrote, and all of them ship together before v0.14.0.

HTTP egress policy for sources and enrichers. Both the dynamic-source HTTP resolver and the HTTP enricher previously accepted any URL declared by a rule or pipeline, including the cloud-metadata IMDS endpoint at 169.254.169.254, IPv6 link-local (fe80::/10), and the AWS IPv6 metadata address fd00:ec2::254. The new rsigma_runtime::EgressPolicy describes a category-based deny list applied at DNS resolution time (EgressFilteredResolver implements reqwest::dns::Resolve), so DNS rebinding cannot defeat host-string checks. Three presets ship: default (block link-local + cloud metadata, allow loopback + private), strict (also block loopback + RFC1918 private), and permissive. Per-category builders (with_block_link_local, with_block_cloud_metadata, with_block_loopback, with_block_private) cover the in-between cases. The policy is selectable on rsigma engine daemon via --egress-policy <default|strict|permissive> and on the layered YAML config via daemon.engine.egress_policy. Default is default.
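
A category-based check in the spirit of the three presets could be sketched like this, using only std::net. The Policy struct and its field names are hypothetical stand-ins for rsigma_runtime::EgressPolicy, and treating IPv6 unique-local addresses (fc00::/7) as "private" is an assumption of this sketch. In the real design the check runs on the addresses a DNS lookup returns, which is what defeats rebinding.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

const METADATA_V4: Ipv4Addr = Ipv4Addr::new(169, 254, 169, 254);
const METADATA_V6: Ipv6Addr = Ipv6Addr::new(0xfd00, 0x0ec2, 0, 0, 0, 0, 0, 0x0254);

/// Hypothetical stand-in for the egress policy: one switch per address category.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Policy {
    pub block_link_local: bool,
    pub block_cloud_metadata: bool,
    pub block_loopback: bool,
    pub block_private: bool,
}

impl Policy {
    pub const DEFAULT: Policy = Policy {
        block_link_local: true, block_cloud_metadata: true,
        block_loopback: false, block_private: false,
    };
    pub const STRICT: Policy = Policy {
        block_link_local: true, block_cloud_metadata: true,
        block_loopback: true, block_private: true,
    };
    pub const PERMISSIVE: Policy = Policy {
        block_link_local: false, block_cloud_metadata: false,
        block_loopback: false, block_private: false,
    };

    /// Decide on a resolved address, never on the host string.
    pub fn allows(&self, ip: IpAddr) -> bool {
        match ip {
            IpAddr::V4(v4) => self.allows_v4(v4),
            // IPv4-mapped IPv6 (::ffff:a.b.c.d) is judged as the embedded IPv4 address.
            IpAddr::V6(v6) => match v6.to_ipv4_mapped() {
                Some(v4) => self.allows_v4(v4),
                None => self.allows_v6(v6),
            },
        }
    }

    fn allows_v4(&self, ip: Ipv4Addr) -> bool {
        !((self.block_cloud_metadata && ip == METADATA_V4)
            || (self.block_link_local && ip.is_link_local())
            || (self.block_loopback && ip.is_loopback())
            || (self.block_private && ip.is_private()))
    }

    fn allows_v6(&self, ip: Ipv6Addr) -> bool {
        let first = ip.segments()[0];
        !((self.block_cloud_metadata && ip == METADATA_V6)
            || (self.block_link_local && (first & 0xffc0) == 0xfe80) // fe80::/10
            || (self.block_loopback && ip.is_loopback())
            || (self.block_private && (first & 0xfe00) == 0xfc00)) // fc00::/7 (assumption)
    }
}
```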

HTTP enricher response body cap. HttpEnricher::enrich used to consume the upstream body via reqwest::Response::bytes, which buffers the entire response into memory. A misbehaving enrichment endpoint streaming an unbounded body could OOM the daemon. The fetch path now checks Content-Length up-front and streams chunks with a 10 MiB ceiling (DEFAULT_ENRICHER_MAX_RESPONSE_BYTES, matching the existing source-side cap), configurable per enricher via with_max_response_bytes.
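
The two-stage cap (Content-Length up front, then a running total while streaming) can be sketched without any HTTP client. Here read_capped is a hypothetical function, the chunk iterator stands in for the response body stream, and errors are plain strings.

```rust
/// 10 MiB, the ceiling described above.
pub const MAX_RESPONSE_BYTES: usize = 10 * 1024 * 1024;

/// Accumulate `chunks` into a body, refusing anything that would exceed `max` bytes.
pub fn read_capped<I>(
    content_length: Option<u64>,
    chunks: I,
    max: usize,
) -> Result<Vec<u8>, String>
where
    I: IntoIterator<Item = Vec<u8>>,
{
    // Stage 1: reject early when the server declares an oversized body.
    if let Some(len) = content_length {
        if len > max as u64 {
            return Err(format!("declared body of {len} bytes exceeds cap of {max}"));
        }
    }
    // Stage 2: the declared length may be absent or wrong, so count while streaming.
    let mut body = Vec::new();
    for chunk in chunks {
        if body.len() + chunk.len() > max {
            return Err(format!("streamed body exceeds cap of {max} bytes"));
        }
        body.extend_from_slice(&chunk);
    }
    Ok(body)
}
```

The second stage is what covers chunked responses that carry no Content-Length at all.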

Engine tuning survives hot-reload. LogProcessor::reload_rules rebuilt the RuntimeEngine through RuntimeEngine::new, which defaults bloom_prefilter off, bloom_max_bytes to None, and (with daachorse-index) cross_rule_ac off. Daemons that enabled those flags at startup silently lost them on every reload. The reload path now snapshots those settings on the old engine via three new accessors (RuntimeEngine::bloom_prefilter, bloom_max_bytes, cross_rule_ac) and replays them on the new engine before load_rules runs.

Dynamic-source reload fails closed. RuntimeEngine::load_rules resolved dynamic sources inside block_in_place and, when resolution returned an error, logged a warning and continued with the captured pipelines as-is. ${source.*} placeholders stayed unexpanded, producing rules with semantics different from the operator's intent. On a hot-reload this silently replaced a healthy engine with a broken one. Both the "resolver failed" and "no tokio runtime available" branches now return an error from load_rules, so LogProcessor::reload_rules propagates it and skips the engine swap. The captured pipelines are restored before the error returns so a retry sees the same input state.
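
Taken together, the tuning snapshot and the fail-closed swap from the two paragraphs above amount to "build the replacement fully, then swap". A minimal sketch, with a hypothetical Engine struct standing in for RuntimeEngine and an unexpanded ${source.*} placeholder used as the failure condition:

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Engine {
    pub bloom_prefilter: bool,
    pub bloom_max_bytes: Option<usize>,
    pub rules: Vec<String>,
}

impl Engine {
    /// A fresh engine starts with tuning off, as described above.
    pub fn new() -> Self {
        Engine { bloom_prefilter: false, bloom_max_bytes: None, rules: Vec::new() }
    }

    /// Fail instead of loading rules that still contain a `${source.*}` placeholder.
    pub fn load_rules(&mut self, rules: &[&str]) -> Result<(), String> {
        if let Some(bad) = rules.iter().find(|r| r.contains("${source.")) {
            return Err(format!("unresolved dynamic source in rule: {bad}"));
        }
        self.rules = rules.iter().map(|r| r.to_string()).collect();
        Ok(())
    }
}

/// Replace `current` only if the new engine loads cleanly.
pub fn reload(current: &mut Engine, rules: &[&str]) -> Result<(), String> {
    let mut next = Engine::new();
    // Replay the tuning the operator enabled at startup before loading rules.
    next.bloom_prefilter = current.bloom_prefilter;
    next.bloom_max_bytes = current.bloom_max_bytes;
    // On error we return here, and the healthy `current` engine stays in place.
    next.load_rules(rules)?;
    *current = next;
    Ok(())
}
```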

Shared HTTP client for dynamic sources. resolve_http_with_limit constructed a fresh reqwest::Client on every call. Under a refresh storm (a dynamic-source pipeline polling several feeds every 30 seconds) this rebuilt TLS state, DNS resolvers, and connection pools each iteration. A process-wide OnceLock<Arc<reqwest::Client>> exposed through sources::http::shared_http_source_client now backs every source fetch; per-call timeouts ride along via RequestBuilder::timeout(...). HTTP enrichers already shared an Arc<reqwest::Client> (build_default_http_client), and the policy resolver above is wired through both clients.

API additions. pub use from rsigma-runtime: EgressDenial, EgressFilteredResolver, EgressPolicy, default_egress_policy, set_default_egress_policy, enrichment::http::DEFAULT_ENRICHER_MAX_RESPONSE_BYTES, sources::http::shared_http_source_client. EngineStats now derives Debug / Clone / Copy. HttpEnricher::with_max_response_bytes is the only new method on an existing public type.

Tests. Cumulative: 12 new unit + integration tests across rsigma-runtime (egress policy categories, IPv4-mapped IPv6 recursion, builder overrides, filtered resolver against literal 169.254.169.254 and 8.8.8.8, body-cap rejection via Content-Length, body-cap rejection via chunked-stream overflow, reload tuning preservation, fail-closed dynamic source, shared client Arc identity) plus 2 new integration tests in rsigma-cli (engine.egress_policy config-to-flag flow and clap rejection of an invalid policy value). The 14 enrichment integration tests (wiremock on 127.0.0.1) and the 35 source integration tests continue to pass under the default policy.

Sigma correctness: multi-field correlations, empty median, unsupported convert modifiers (#166)

Closes a cluster of silently-wrong evaluation and conversion behaviors so v0.14.0 ships none of them.

Multi-field value_count now uses a composite distinct-key. Previously the engine read fields.first() and ignored the rest, so field: [User, SrcIp] over events with the same user from different source IPs counted as one distinct value. The fix joins the rendered field values with the ASCII Unit Separator (\u{1f}) and counts distinct tuples; a missing field on any component drops the event (matching the prior single-field behavior). The single-field hot path keeps its old allocation profile.
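
The composite-key idea can be sketched as follows, assuming events are simple string maps. The function names are hypothetical and the single-field fast path is not modelled.

```rust
use std::collections::{HashMap, HashSet};

/// ASCII Unit Separator, used to join the per-field values into one key.
const UNIT_SEP: &str = "\u{1f}";

/// Build the distinct-key for one event; None if any component field is missing.
pub fn composite_key(event: &HashMap<&str, &str>, fields: &[&str]) -> Option<String> {
    let mut parts = Vec::with_capacity(fields.len());
    for field in fields {
        parts.push(*event.get(field)?);
    }
    Some(parts.join(UNIT_SEP))
}

/// Count distinct tuples across events, dropping events that lack a field.
pub fn value_count(events: &[HashMap<&str, &str>], fields: &[&str]) -> usize {
    events
        .iter()
        .filter_map(|event| composite_key(event, fields))
        .collect::<HashSet<_>>()
        .len()
}
```

For two events with the same User but different SrcIp values, counting over [User, SrcIp] gives 2, while counting over User alone gives 1.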

Multi-field value_sum / value_avg / value_percentile / value_median are now rejected at compile time. The Sigma spec does not define how to combine several numeric fields under one of these aggregations. The previous behavior silently used only the first field and dropped data. The compiler now returns a structured CorrelationError listing the offending fields.

Empty value_median windows now return None. They used to return 0.0, which spuriously satisfied predicates like lte: 0 and eq: 0. The behavior now mirrors value_percentile, which already returned None for empty windows.
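
A small sketch of why the return type matters. The median and fires_lte functions are hypothetical, and averaging the two middle values for even-length windows is an assumption of this example rather than something the notes specify.

```rust
/// Median of a window; None when the window is empty.
pub fn median(values: &[f64]) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mid = sorted.len() / 2;
    Some(if sorted.len() % 2 == 0 {
        // Even length: average the two middle values (assumption of this sketch).
        (sorted[mid - 1] + sorted[mid]) / 2.0
    } else {
        sorted[mid]
    })
}

/// A `lte` predicate only fires when there is an aggregate to compare.
pub fn fires_lte(aggregate: Option<f64>, threshold: f64) -> bool {
    aggregate.is_some_and(|value| value <= threshold)
}
```

With the old 0.0 sentinel, fires_lte(Some(0.0), 0.0) is true for an empty window; with None it is false.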

Detection-name selector matching is now consistent across crates. The evaluator's pattern_matches lacked the middle-* branch (sel*main) that the converter had, so the same selector pattern silently resolved to different detection sets in eval vs convert. A single detection_name_matches (plus SelectorPattern::matches_detection_name) is now hoisted into rsigma-parser and reused from both crates, with cross-crate tests covering exact, full wildcard, prefix wildcard, suffix wildcard, and middle wildcard cases.
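
One way to cover all five cases with a single code path is to split the pattern at its wildcard. The sketch below reuses the name detection_name_matches for readability, but its signature and body are assumptions, and it handles at most one * per pattern.

```rust
/// Match a detection name against a selector pattern with at most one `*`.
/// Covers exact, `*`, `sel*`, `*main`, and `sel*main`.
pub fn detection_name_matches(pattern: &str, name: &str) -> bool {
    match pattern.split_once('*') {
        // No wildcard: exact comparison.
        None => pattern == name,
        // One wildcard: the name must start with the prefix and end with the suffix,
        // and be long enough that the two do not overlap.
        Some((prefix, suffix)) => {
            name.len() >= prefix.len() + suffix.len()
                && name.starts_with(prefix)
                && name.ends_with(suffix)
        }
    }
}
```

The length check is what keeps a pattern such as ab*ba from matching aba, where the prefix and suffix would otherwise share a character.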

The rsigma-convert default item dispatch rejects modifiers it cannot express. default_convert_detection_item previously fell through to Backend::convert_field_eq_str for any modifier it did not handle explicitly, so a rule using |neq, |base64, |base64offset, |wide, |utf16, |utf16be, |windash, |expand, regex flags without re (|m, |s), or timestamp parts (|minute/|hour/|day/|week/|month/|year) shipped SQL/SPL with different semantics from what the author wrote. The dispatch now returns ConvertError::UnsupportedModifier before the fall-through. Backends that handle one of these modifiers natively can override Backend::convert_detection_item and bypass the default. A defensive ok_or_else replaces the last unwrap() on the selector-dispatch path.

Dependency cleanup. base64 and ipnet were declared in crates/rsigma-convert/Cargo.toml but never referenced from anywhere under crates/rsigma-convert/src/. Dropped.

Docs. crates/rsigma-eval/README.md now explains that percentile selects which percentile to compute (not the threshold), that an empty window does not fire, and that the four numeric aggregations require a single field.

Custom tag namespaces for the linter (#161, #162)

rsigma rule lint no longer forces teams to disable unknown_tag_namespace wholesale just to use organisation-specific tags. A repeatable --tag-namespace <ns> flag and a tag_namespaces list in .rsigma-lint.yml register extra namespaces that are recognised alongside the built-in spec set (attack, car, cve, d3fend, detection, stp, tlp). Namespace values are normalised to lowercase, and when unknown_tag_namespace does fire its message lists the full combined set of known namespaces.

On the library side, LintConfig gains a tag_namespaces field that layers through the same merge path as disabled_rules and exclude_patterns. Both list-valued fields, exclude_patterns and tag_namespaces, are now de-duplicated (first occurrence wins) when a config file and CLI flags are layered, so overlapping entries no longer accumulate. Docs cover the new flag and config key across the lint CLI reference, the linting guide, the lint-rules reference, and the linter developer guide; the CLI and root READMEs list the flag.
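
The normalise-then-deduplicate merge can be sketched as below. merge_namespaces is a hypothetical helper, and the config-before-CLI ordering is an assumption of this example.

```rust
use std::collections::HashSet;

/// Layer config-file namespaces and CLI namespaces: lowercase each entry and
/// keep only the first occurrence of every name.
pub fn merge_namespaces(config: &[&str], cli: &[&str]) -> Vec<String> {
    let mut seen = HashSet::new();
    config
        .iter()
        .chain(cli)
        .map(|ns| ns.to_lowercase())
        // `insert` returns false for a name already seen, which drops the duplicate.
        .filter(|ns| seen.insert(ns.clone()))
        .collect()
}
```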

TTY-aware output + structured output formats (#157)

Every rsigma subcommand can now emit its structured output in one of five formats, selected by a new global --output-format <json|ndjson|table|csv|tsv> flag. The default is TTY-aware: pretty JSON when stdout is a terminal, plain NDJSON when piped or redirected, so rsigma engine eval … | jq does the right thing without any extra flag and rsigma engine eval in a terminal is finally readable.

New global flags. Three more global knobs ride alongside --output-format and the existing --log-format:

  • --color <auto|always|never> honours NO_COLOR under auto (the default).
  • --quiet / -q suppresses every non-data line (progress, summary, fallback warnings); errors still go to stderr.
  • --no-stats suppresses only the trailing summary line; progress messages still appear.

All four resolve through the same layered precedence as the rest of the config: flag > RSIGMA_GLOBAL__* env > global.* in the YAML config > TTY-aware default.
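
The precedence chain for the format selector can be sketched as a pure function. The names here are hypothetical, and skipping an unparseable value at any layer is a simplification (in the real CLI an invalid flag value would be rejected by argument parsing).

```rust
use std::io::IsTerminal;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OutputFormat { Json, Ndjson, Table, Csv, Tsv }

pub fn parse_format(s: &str) -> Option<OutputFormat> {
    match s {
        "json" => Some(OutputFormat::Json),
        "ndjson" => Some(OutputFormat::Ndjson),
        "table" => Some(OutputFormat::Table),
        "csv" => Some(OutputFormat::Csv),
        "tsv" => Some(OutputFormat::Tsv),
        _ => None,
    }
}

/// flag > env > config > TTY-aware default.
pub fn resolve_format(
    flag: Option<&str>,
    env: Option<&str>,
    config: Option<&str>,
    stdout_is_tty: bool,
) -> OutputFormat {
    flag.and_then(parse_format)
        .or_else(|| env.and_then(parse_format))
        .or_else(|| config.and_then(parse_format))
        // Nothing set explicitly: JSON for a terminal, NDJSON when piped or redirected.
        .unwrap_or(if stdout_is_tty { OutputFormat::Json } else { OutputFormat::Ndjson })
}

/// How a caller might obtain the TTY bit at runtime.
pub fn stdout_is_tty() -> bool {
    std::io::stdout().is_terminal()
}
```

Passing the TTY state in as a parameter keeps the resolver deterministic and easy to test.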

Per-command rendering.

  • engine eval is the showcase. table renders a LEVEL | RULE | TYPE | DETAIL summary (numeric columns right-aligned). csv and tsv stream a header line plus one row per match. --pretty is preserved as a backwards-compatibility alias for "pretty JSON" and wins over the TTY default.
  • rule fields folds its --json flag into the new selector; --json is kept as a hidden deprecated alias for --output-format json. The legacy table view stays the default even when piped, so existing pipelines are unchanged. --output-format ndjson streams one field record per line.
  • rule lint keeps the coloured human view as the default. --output-format json emits a {summary, findings} envelope; ndjson streams one Finding per line; csv / tsv write a PATH,SEVERITY,RULE,LINE,MESSAGE table. The per-command --color flag is gone in favour of the global one; behaviour is identical.
  • rule parse, rule condition, rule stdin: routed through the shared JSON renderer; --pretty still defaults to on (the AST is small and human-friendly is the default).
  • backend convert: keeps its existing -f, --format for the backend query format and -o, --output for the output file unchanged. --output-format json wraps the queries in a {target, format, queries: [{rule_title, rule_id, query}, …]} envelope. The non-JSON tabular formats are not meaningful for free-form query text, so the command prints a stderr warning and falls back to raw text (the warning is itself suppressible with --quiet).

Output module. A new crates/rsigma-cli/src/output/ module owns the OutputFormat and ColorChoice enums, the OutputCtx resolver, the Tabular trait + width-aligning render_table (with auto-right-align for numeric columns), the streaming DelimitedWriter for CSV/TSV (hand-rolled RFC 4180-style escaping, no new dependency), and the Painter previously in commands/lint.rs (now reused by every command). The lint Painter is gone; the shared one resolves color from --color plus NO_COLOR plus TTY detection just like before.
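
RFC 4180-style escaping is small enough to hand-roll, which is presumably why no dependency was added. A minimal sketch with hypothetical function names, not the actual DelimitedWriter:

```rust
/// Quote a field when it contains a comma, a double quote, or a line break,
/// doubling any embedded double quotes.
pub fn escape_csv_field(field: &str) -> String {
    let needs_quoting = field.contains(|c: char| matches!(c, ',' | '"' | '\n' | '\r'));
    if needs_quoting {
        format!("\"{}\"", field.replace('"', "\"\""))
    } else {
        field.to_string()
    }
}

/// Join already-escaped fields into one CSV row (without the line terminator).
pub fn csv_row(fields: &[&str]) -> String {
    fields
        .iter()
        .map(|field| escape_csv_field(field))
        .collect::<Vec<_>>()
        .join(",")
}
```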

Config schema. global.format is renamed to global.output_format (the old key was reserved for this work and was inert), and eval.format is dropped (it was inert too). The committed template, the JSON Schema emitted by rsigma config schema, and the schema drift-guard test all reflect the rename.

Tests. New unit tests in crates/rsigma-cli/src/output/mod.rs cover format / color parsing, TTY default resolution, --quiet / --no-stats semantics, CSV/TSV escaping edge cases, and the Tabular row shape. A new crates/rsigma-cli/tests/cli_output_format.rs integration suite (19 tests) exercises every format end to end on engine eval, rule lint, rule fields, and backend convert, plus the env-layer and config-file resolution and the flag-beats-env precedence.

Docs. New canonical docs/reference/output.md page (registered in docs/reference/.pages) covering formats, TTY behaviour, color, quiet/no-stats, precedence, and per-command behaviour. The eval CLI doc gains an output-format section and a table-view example; the env-vars doc lists the two new variables; root README and CLI README gain a Global flags section. The configuration reference example shows the new global.output_format / global.color keys.

Layered YAML configuration + rsigma config group (#152)

engine daemon and engine eval are now driven by an optional layered YAML config file with explicit precedence CLI flag > env > project file > user file > system file > compiled default, applied per leaf. The same machinery is exposed through a new rsigma config command group for scaffolding, validation, introspection, and reload.

Discovery (lowest to highest precedence): compiled defaults, /etc/rsigma/config.yaml, $XDG_CONFIG_HOME/rsigma/config.yaml (defaulting to ~/.config/rsigma/config.yaml), the nearest .rsigmarc walked up from the current directory, ./rsigma.yaml, the environment layer, and finally CLI flags. --config <PATH> replaces the discovery chain entirely with one explicit file. The XDG path is computed by honouring XDG_CONFIG_HOME directly, not dirs::config_dir(), so macOS stays under ~/.config/rsigma to match the rsigma install layout.
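The user-level path rule can be sketched as follows. The function name `user_config_path` is hypothetical, and treating an empty XDG_CONFIG_HOME as unset follows the XDG Base Directory convention rather than anything stated in these notes. Inputs are passed explicitly so the logic does not depend on the process environment.

```rust
use std::path::PathBuf;

/// Resolve the user-level config path. An unset or empty XDG_CONFIG_HOME
/// falls back to <home>/.config on every platform, macOS included.
/// (Hypothetical helper; the empty-means-unset rule is an assumption.)
fn user_config_path(xdg_config_home: Option<&str>, home: &str) -> PathBuf {
    let base = match xdg_config_home {
        Some(dir) if !dir.is_empty() => PathBuf::from(dir),
        _ => PathBuf::from(home).join(".config"),
    };
    base.join("rsigma").join("config.yaml")
}
```

A caller would typically feed it `std::env::var("XDG_CONFIG_HOME").ok().as_deref()` and the user's home directory.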

Schema. A single typed RsigmaConfigPartial (Option-typed partial structs merged by a generic merge) covers three sections: global (currently log_format; color/format are reserved for the output-format work), daemon (mirrors every non-secret daemon flag, with nested api/api.tls/input/output/correlation/state/engine/nats sub-sections), and eval (mirrors the eval flag surface). Secret-bearing daemon settings (NATS creds/token/password/nkey, TLS key password) are deliberately absent from the schema; they remain env/flag-only.
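As a rough illustration of why the partial structs are Option-typed, the sketch below merges layers leaf by leaf. `EvalPartial`, its two fields, and the hand-written `merge` are hypothetical stand-ins; the real RsigmaConfigPartial uses a generic merge and covers far more keys.

```rust
/// Hypothetical two-field partial. Every leaf is an Option so that "unset"
/// is distinguishable from "set to the default value".
#[derive(Debug, Clone, Default, PartialEq)]
struct EvalPartial {
    rules: Option<String>,
    log_format: Option<String>,
}

impl EvalPartial {
    /// Overlay `higher` on top of `self`: a leaf set in the higher layer
    /// wins, otherwise the lower layer's value is kept.
    fn merge(self, higher: EvalPartial) -> EvalPartial {
        EvalPartial {
            rules: higher.rules.or(self.rules),
            log_format: higher.log_format.or(self.log_format),
        }
    }
}

/// Fold layers given in lowest-to-highest precedence order.
fn resolve(layers: Vec<EvalPartial>) -> EvalPartial {
    layers
        .into_iter()
        .fold(EvalPartial::default(), EvalPartial::merge)
}
```

Because each leaf is merged independently, a flag layer that sets only `rules` does not erase a `log_format` supplied by a file layer.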

Resolution. The resolver folds each layer's partial into a serde_json::Value with a generic deep-merge and tracks the winning layer per leaf (default, file, env, flag). CLI flag wins are detected via clap ArgMatches::value_source; the env layer reads a uniform RSIGMA_<SECTION>__<KEY> scheme (the __ separator deliberately leaves the existing single-underscore clap-bound names like NATS_CREDS and RSIGMA_CONSUMER_GROUP untouched). Values are parsed as YAML scalars so ints/bools/lists coerce naturally. A defaults module of named constants is the single source of every default; clap's default_value attributes reference those constants, and a drift-guard test pins the two together.
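Two pieces of the resolution step lend themselves to a short sketch: recognising names in the RSIGMA_<SECTION>__<KEY> scheme, and recording which layer supplied a leaf. Both functions and the `Source` enum are hypothetical. The sketch splits on the first `__` only and does not attempt to model how nested sub-sections are encoded, and it skips YAML scalar parsing.

```rust
/// Split a name of the form RSIGMA_<SECTION>__<KEY> into lowercase
/// (section, key). Single-underscore names such as RSIGMA_CONSUMER_GROUP
/// are outside the scheme and yield None. (Hypothetical helper.)
fn parse_env_key(name: &str) -> Option<(String, String)> {
    let rest = name.strip_prefix("RSIGMA_")?;
    let (section, key) = rest.split_once("__")?;
    if section.is_empty() || key.is_empty() {
        return None;
    }
    Some((section.to_ascii_lowercase(), key.to_ascii_lowercase()))
}

/// The layer that supplied a leaf's winning value.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Source {
    Default,
    File,
    Env,
    Flag,
}

/// Pick the highest-precedence value for one leaf and report its source.
fn resolve_leaf<T>(default: T, file: Option<T>, env: Option<T>, flag: Option<T>) -> (T, Source) {
    if let Some(v) = flag {
        return (v, Source::Flag);
    }
    if let Some(v) = env {
        return (v, Source::Env);
    }
    if let Some(v) = file {
        return (v, Source::File);
    }
    (default, Source::Default)
}
```

For example, RSIGMA_GLOBAL__LOG_FORMAT maps to section `global` and key `log_format`, while NATS_CREDS is ignored by this scheme and left to its existing clap binding.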

rsigma config subcommand group. Six subcommands, all agent-friendly (data to stdout, diagnostics to stderr):

  • config init [-o PATH] [--force] writes a commented template (default ./rsigma.yaml) with a # yaml-language-server: $schema= header. Refuses to overwrite without --force.
  • config validate [-c PATH] [--format text|json] [--strict] deserializes every layer, warns on unknown keys via serde_ignored, warns on sections set but inert in this build (daemon.api.tls without daemon-tls, daemon.nats without daemon-nats, daemon.engine.cross_rule_ac without daachorse-index), and prints a structured envelope ({ ok, sources, unknown_keys, inactive_sections }) in JSON mode. --strict upgrades unknown keys to exit 3.
  • config show [-c PATH] [--for global|daemon|eval] [--format text|json|yaml] prints the effective config (defaults < file < env) with the source of each leaf.
  • config schema emits a JSON Schema (draft 2020-12) derived from the same partial structs the loader uses, via schemars::JsonSchema. The schema is what powers editor autocomplete (yaml-language-server) and what agents/CI can validate against.
  • config path [-c PATH] lists the config files that would be loaded.
  • config reload [--addr ADDR] [-c PATH] triggers a daemon hot-reload via POST /api/v1/reload, mapping 0.0.0.0/[::] bind addresses to loopback so the client can actually connect. Cross-platform (works on Windows, where SIGHUP does not exist); kill -HUP <pid> still works on unix.
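The bind-address mapping used by config reload can be sketched with the standard library alone. `connect_addr` and `reload_url` are hypothetical names, the plain `http` scheme is an assumption (a TLS-enabled daemon would differ), and mapping `[::]` to `[::1]` rather than `127.0.0.1` is a guess at the behaviour.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr, SocketAddr};

/// Turn a bind address into one a client can connect to: the unspecified
/// addresses 0.0.0.0 and [::] become the matching loopback address.
/// (Hypothetical helper.)
fn connect_addr(bind: SocketAddr) -> SocketAddr {
    let ip = match bind.ip() {
        IpAddr::V4(v4) if v4.is_unspecified() => IpAddr::V4(Ipv4Addr::LOCALHOST),
        IpAddr::V6(v6) if v6.is_unspecified() => IpAddr::V6(Ipv6Addr::LOCALHOST),
        other => other,
    };
    SocketAddr::new(ip, bind.port())
}

/// Build the reload URL for the endpoint named above.
/// The http scheme is an assumption of this sketch.
fn reload_url(bind: SocketAddr) -> String {
    format!("http://{}/api/v1/reload", connect_addr(bind))
}
```

Concrete addresses pass through unchanged, so only wildcard binds are rewritten.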

Command wiring. DaemonArgs and EvalArgs both gain --config <PATH> and --dry-run. --rules is now optional on both: it can be supplied via daemon.rules / eval.rules instead, with a clear error if neither layer provides it. main() now goes through Cli::command().get_matches() + from_arg_matches so the daemon and eval dispatch paths (including the deprecated flat daemon and eval aliases) can hand the sub-ArgMatches to the resolver. global.log_format from a discovered config file (or RSIGMA_GLOBAL__LOG_FORMAT) drives the CLI log subscriber when --log-format is not passed, so the global section is no longer inert.

Dependencies. Adds the tiny serde_ignored 0.1 (unknown-key detection) and promotes schemars 1.x (already present transitively via jsonschema) to a direct dependency. No new top-level versions in Cargo.lock beyond serde_ignored. config reload reuses the existing ureq dependency.

Tests. Unit tests cover layered file discovery, serde_ignored unknown-key collection, per-field precedence on both daemon and eval (CLI > env > file > default), JSON deep-merge with Null no-op, the RSIGMA_*__* env scheme, and the daemon-defaults drift guard. A new crates/rsigma-cli/tests/cli_config.rs integration suite (10 tests) exercises the real binary end to end: config init round-trips (the committed template validates clean and carries zero unknown keys), --force guard, unknown-key warnings + --strict exit, missing-file error, JSON schema emission, config show JSON source annotations, config path, and the config-to-command flow (engine eval reading rules from config, an explicit --rules overriding the config, and engine daemon --dry-run printing config values).

Docs. New docs/cli/config/{init,validate,show,schema,path,reload}.md pages with a .pages nav entry, a canonical docs/reference/configuration.md page (precedence, discovery, schema, env scheme, secrets policy, --dry-run semantics, version: 1 migration field) registered in docs/reference/.pages, --config and --dry-run rows added to docs/cli/engine/daemon.md and docs/cli/engine/eval.md, the top-level docs/cli/index.md updated to include the new config group in the quick-nav and command tree, and docs/reference/environment-variables.md rewritten to document the uniform RSIGMA_<SECTION>__<KEY> scheme alongside the legacy single-underscore names. Root README gains a Configuration section and the CLI README a config block under Subcommands.

Pipeline-embedded sources: deprecation gets louder (#140, closes #136)

Phase 3 of the detached-dynamic-sources cycle. Pipeline files that declare an inline sources: block now print a warning: line on stderr in addition to the existing tracing::warn! event:

warning: pipeline '<name>' (<path>) declares an inline 'sources:' block, which is deprecated and will be removed in v1.0. Migrate with `rsigma rule migrate-sources -p <path> -o sources.yml` and load via `--source sources.yml` on `rsigma engine daemon`.

The structured warning keeps its existing message (the event now also carries a path field), so log aggregators that already parse the message keep working. The emission moves out of commands/daemon.rs into a new public rsigma_runtime::warn_pipeline_inline_sources helper that two paths share:

  • CLI startup. The CLI's load_pipelines (the entry point for engine eval, engine daemon, rule validate, rule fields, backend convert) and pipeline resolve both call the helper directly for every pipeline file loaded at startup.
  • Daemon hot-reload. RuntimeEngine::load_rules -> reload_pipelines in rsigma-runtime now calls the helper too, so a SIGHUP, file-watcher event, or POST /api/v1/reload that re-reads a deprecated pipeline surfaces the warning even though the daemon's reload path does not go back through the CLI's load_pipelines. Library consumers that drive RuntimeEngine themselves inherit the same behaviour.

Canonical-path deduplication via a process-wide OnceLock<Mutex<HashSet<PathBuf>>> inside the helper keeps the daemon from re-spamming the same pipeline path on every reload tick once the warning has already fired for it.
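A minimal sketch of this dedup pattern, using only the standard library, is shown below. `WARNED`, `should_warn`, and `warn_once` are hypothetical names rather than the internals of warn_pipeline_inline_sources, and the fallback to the raw path when canonicalization fails is an assumption.

```rust
use std::collections::HashSet;
use std::path::{Path, PathBuf};
use std::sync::{Mutex, OnceLock};

/// Process-wide set of pipeline paths that have already been warned about.
static WARNED: OnceLock<Mutex<HashSet<PathBuf>>> = OnceLock::new();

/// Returns true the first time a given path is seen, false afterwards.
/// Falls back to the path as given when canonicalization fails, for
/// example because the file does not exist (an assumption of this sketch).
fn should_warn(path: &Path) -> bool {
    let canonical = path.canonicalize().unwrap_or_else(|_| path.to_path_buf());
    let set = WARNED.get_or_init(|| Mutex::new(HashSet::new()));
    let mut guard = set.lock().expect("dedup set lock poisoned");
    // HashSet::insert returns false when the value was already present.
    guard.insert(canonical)
}

/// Print the stderr warning at most once per canonical path.
fn warn_once(name: &str, path: &Path) {
    if should_warn(path) {
        eprintln!(
            "warning: pipeline '{name}' ({}) declares an inline 'sources:' block, which is deprecated",
            path.display()
        );
    }
}
```

Keying on the canonical path means the same file reached through two different relative paths still warns only once.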

Doc and README sweep. Every example for dynamic sources now declares them in a standalone YAML file loaded via --source. The pipeline-embedded form is documented only as a short "Deprecated" callout that points operators at rsigma rule migrate-sources and the v1.0 removal issue (#137). The reference page (docs/reference/dynamic-sources.md), the user guide (docs/guide/processing-pipelines.md and docs/guide/enrichers.md), the daemon CLI page (docs/cli/engine/daemon.md), the top-level README, the CLI README, and the runtime README all switch to the external-file form. The CLI README's recipe-catalog refresh values also switch from the unsupported { interval: ... } mapping form to the literal-duration syntax (1h, 24h) that the parser actually accepts.

Deprecation timeline. v0.13.0 (#135) introduced the tracing::warn!. This release adds the louder stderr warning, plumbs the warning through the daemon hot-reload path, and hides the deprecated form from docs. v1.0 (#137) turns it into a hard parse error and removes the Pipeline.sources field.

Tests. A new cli_sources_deprecation.rs integration suite pins the stderr emission across rule validate, engine eval, and pipeline resolve, plus the dedup invariant when the same pipeline is passed twice via -p, the negative case (pipelines without inline sources do not warn), and the migration-command suggestion (the warning embeds the actual pipeline path so the suggested rsigma rule migrate-sources invocation is copy-pasteable). Three new unit tests in crates/rsigma-runtime/src/engine.rs exercise the runtime path directly: a RuntimeEngine::load_rules call records the canonical pipeline path in the dedup set, a clean pipeline does not, and a hot-reload (second load_rules call) leaves the dedup set unchanged. Two more in crates/rsigma-runtime/src/pipeline_deprecation.rs cover the dedup primitive in isolation.

Dependency bumps (#156)

Rolls up four open Dependabot PRs into a single merge. Rust: serde_json 1.0.149 to 1.0.150 and tower-http 0.6.10 to 0.6.11 in the workspace Cargo.lock (#154), with the same serde_json bump applied to fuzz/Cargo.lock alongside a resync of that stale lockfile to the current workspace state (the jaq 3.0 migration to jaq-core / jaq-json / jaq-std and the internal crate versions catching up from 0.11.0 to 0.13.0) (#153). CI: taiki-e/install-action 2.78.0 to 2.79.3, docker/build-push-action 7.1.0 to 7.2.0, github/codeql-action 4.35.4 to 4.35.5, and zizmorcore/zizmor-action 0.5.5 to 0.5.6, all repinned by commit SHA (#155). VS Code extension: the tmp dev dependency bumps 0.2.5 to 0.2.7, picking up the upstream security fix that rejects non-string and relative prefix / postfix / template values.

Dependency bumps (#178)

Rolls up eight open Dependabot PRs into a single merge. Docker: the rust:1-alpine base image digest moves from 606fd31 to 66f48b1 (#169). CI (all repinned by commit SHA, batched via the actions-updates group, #177): taiki-e/install-action 2.79.3 to 2.79.12, EmbarkStudios/cargo-deny-action 2.0.18 to 2.0.20, docker/setup-buildx-action 4.0.0 to 4.1.0, docker/login-action 4.1.0 to 4.2.0, github/codeql-action 4.35.5 to 4.36.0, and docker/metadata-action 6.0.0 to 6.1.0. Rust (workspace Cargo.lock): log 0.4.29 to 0.4.30 in the patch-updates group (#173), async-nats 0.48.0 to 0.49.0 (#174), and hyper 1.9.0 to 1.10.0 (#175). VS Code extension: @types/vscode ^1.116.0 to ^1.120.0, @vscode/vsce ^3.9.0 to ^3.9.1, and esbuild ^0.27.7 to ^0.28.0 in the npm-updates group (#170); typescript ^5.9.3 to ^6.0.3 (#171); @types/node ^20.19.39 to ^25.9.1 (#172). The three VS Code PRs all touched editors/vscode/package.json and package-lock.json; the resulting conflicts were resolved by keeping the newest version from each PR and regenerating the lockfile with npm install --package-lock-only --ignore-scripts. The rusqlite 0.39.0 to 0.40.0 bump (#176) is deliberately deferred: it pulls in libsqlite3-sys 0.38.0, whose build.rs uses the cfg_select! macro, which is not stable on the workspace MSRV of 1.88.0. It will be re-batched once the MSRV is raised.

v0.13.0...v0.14.0