v0.16.0

@mostafa released this 15 Jun 08:27
fa246de

TL;DR
RSigma v0.16.0 is the "MCP server" release:

  • MCP server: a new Model Context Protocol integration that exposes the Sigma toolchain to AI agents.
    • rsigma-mcp crate and rsigma mcp serve (opt-in mcp feature): typed tools (parse, lint, validate, evaluate, convert, fix) plus field/backend/pipeline introspection and reference resources, with enrichment-aware evaluation (#208).
    • Remote transport and config: Streamable HTTP (rsigma mcp serve --http), constant-time bearer-token auth, in-process TLS, and a new mcp config section wired through rsigma config and the environment layer (#209).
    • Smoke harness: scripts/mcp-smoke.py drives a built server end to end over stdio and HTTP across every tool and resource as a standard-library CI job (#210).
    • Prerequisite refactor: the auto-fix applier, modifier/MITRE reference data, and the 75-rule lint catalogue move into rsigma-parser so the CLI, the LSP, and the MCP server share one implementation, behavior unchanged (#207).
  • backend convert per-rule file output: point --output at a directory to write one file per converted rule, named from the rule title with the backend's native extension (#205).
  • Configurable correlation state caps: --max-state-entries exposes the global entry cap and a new --max-group-entries bounds a single group's window state, with matching config keys and a per-rule attribute (#200).
  • Fibratus conversion fixes: corrected process_creation/process_termination/create_remote_thread field mappings and registry_event scoping against the Fibratus 3.0.0 vocabulary, thanks to @rabbitstack (#202).
  • Correlation window-mode benchmarks: a throughput suite plus a non-Criterion peak-memory stress target for the sliding/tumbling/session modes shipped in v0.15.0 (#199).
  • rstix: data-model skeleton and common property containers for the STIX 2.1 library, with leaf-type serde, thanks to @SecurityEnthusiast (not yet releasable on its own) (#201).
  • Dependency and security bumps: rolls up six Dependabot PRs and patches three RustSec PostgreSQL advisories (#206).

Developer tooling: MCP smoke harness (#210)

scripts/mcp-smoke.py drives a built rsigma mcp serve binary end to end over stdio and Streamable HTTP (with bearer auth), exercising all 11 tools and 3 resources as a post-build sanity check, and runs as the MCP Smoke CI job. Standard-library only.
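
As a rough illustration of what a harness like this sends over stdio, the sketch below frames the opening JSON-RPC messages of an MCP session as newline-delimited JSON. It is not taken from scripts/mcp-smoke.py: the helper names, the client name, and the protocol version string are assumptions made for the example.

```python
import json


def frame(message: dict) -> bytes:
    """Encode one JSON-RPC message as a single newline-terminated line."""
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def handshake(protocol_version: str = "2025-06-18") -> list:
    """Messages a client sends before it can exercise tools and resources.

    The protocol version and clientInfo values are assumptions; a real
    harness would use whatever the server under test negotiates.
    """
    return [
        {
            "jsonrpc": "2.0",
            "id": 1,
            "method": "initialize",
            "params": {
                "protocolVersion": protocol_version,
                "capabilities": {},
                "clientInfo": {"name": "smoke-sketch", "version": "0"},
            },
        },
        # Notifications carry no "id" and expect no response.
        {"jsonrpc": "2.0", "method": "notifications/initialized"},
        {"jsonrpc": "2.0", "id": 2, "method": "tools/list"},
        {"jsonrpc": "2.0", "id": 3, "method": "resources/list"},
    ]


wire = b"".join(frame(m) for m in handshake())
```

A real run would write these bytes to the stdin of the server process and read back one JSON line per request id, then compare the listed tools and resources against the expected 11 and 3.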

MCP server: Streamable HTTP transport, bearer auth, and mcp config keys (#209)

Adds a remote transport and configuration to the MCP server.

  • Streamable HTTP transport. rsigma mcp serve --http <addr> serves the MCP endpoint at /mcp over HTTP (stdio stays the default). Built on rmcp's StreamableHttpService mounted on axum.
  • Bearer-token auth. --auth-token <token> (or RSIGMA_MCP_AUTH_TOKEN) requires a static token on every request, compared in constant time; requests without it get 401. The token is flag/env-only and never read from config files.
  • TLS. --tls-cert/--tls-key terminate TLS in-process using the daemon's rustls loader (requires the daemon-tls feature). Plaintext binds on non-loopback addresses are refused unless --allow-plaintext is passed.
  • Config keys. A new mcp config section (mcp.http_addr, mcp.lint_config, mcp.rules_dir) is wired through rsigma config init/validate/show/schema and the RSIGMA_MCP__* environment layer. The auth token stays flag/env-only by design.
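
The auth and bind rules above can be sketched in a few lines. This is a minimal illustration, not the server's Rust implementation: the function names are hypothetical, and Python's hmac.compare_digest stands in for whatever constant-time comparison the server uses.

```python
import hmac
import ipaddress


def token_ok(authorization_header, expected_token: str) -> bool:
    """Accept only 'Bearer <expected_token>'.

    hmac.compare_digest avoids the early exit of a plain == comparison,
    so timing does not reveal how many leading characters matched.
    """
    if not authorization_header:
        return False
    scheme, _, presented = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    return hmac.compare_digest(presented.encode(), expected_token.encode())


def status_for(authorization_header, expected_token: str) -> int:
    """Requests without a valid token get 401, as described above."""
    return 200 if token_ok(authorization_header, expected_token) else 401


def plaintext_bind_allowed(host: str, has_tls: bool, allow_plaintext: bool) -> bool:
    """Refuse plaintext on non-loopback hosts unless explicitly allowed.

    Treating the literal name 'localhost' as loopback is an assumption
    of this sketch.
    """
    if has_tls or allow_plaintext:
        return True
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False
```

Under these assumptions, binding 127.0.0.1 without TLS passes, while 0.0.0.0 without TLS is refused until the plaintext override is set.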

MCP server: rsigma mcp serve and the rsigma-mcp crate (#208)

A new Model Context Protocol server exposes the rsigma Sigma toolchain to AI agents (Cursor, Claude Code, ...) as structured tools. Instead of scraping CLI text, an agent calls typed tools and gets back JSON: ASTs, lint findings with spans and fix availability, evaluation matches, backend queries, and field inventories.

  • rsigma mcp serve. A new command group (Commands::Mcp) running the server over stdio, gated behind a new opt-in mcp Cargo feature (build with --features mcp; the prebuilt binaries and Docker image include it). Flags: --lint-config (applied by the lint tool) and --rules-dir (a default root for relative path-based tool calls).
  • rsigma-mcp crate. A new library crate built on rmcp 1.7 with the RsigmaMcp handler and serve_stdio. Ten core tools: parse_rule, parse_condition, lint_rules, validate_rules, evaluate_events, convert_rules, list_backends, list_fields, resolve_pipeline, and list_builtin_pipelines. Every tool accepts inline content (yaml/condition/events) xor a file path; stdout is reserved for the transport and diagnostics go to stderr.
  • fix_rules tool. Applies safe auto-fixes to Sigma YAML (lowercase keys, status/level typos, duplicate removal, ...) preserving comments and formatting, and returns the fixed YAML plus applied/failed/skipped-unsafe counts. Unsafe fixes are never auto-applied. write: true (only valid with a file path) persists the change to disk; an optional lint_rules filter restricts which lint rules are fixed.
  • MCP resources. rsigma://lint/catalogue (the 75-rule catalogue as JSON), rsigma://reference/modifiers, and rsigma://reference/mitre-tactics let agents ground themselves on the exact lint vocabulary and modifier semantics without spending tool calls.
  • Enrichment-aware evaluate_events. An optional enrichers (inline YAML/JSON) or enrichers_path parameter builds an enrichment pipeline and enriches results before returning; loader validation errors (including template-namespace checks) come back as structured errors, so the tool doubles as an enricher-config validator.
  • rsigma_runtime::enrichment::config. The enrichers YAML loader (load_enrichers_file, build_enrichers, build_enrichers_full, EnrichersFile) moves from the CLI daemon into rsigma-runtime so the daemon and the MCP server share one loader. The daemon is rewired to the moved loader with behavior and error text unchanged.
  • Docs. A new MCP server guide, the mcp serve CLI page, an rsigma-mcp library page, and the mcp feature entry in the feature-flags reference.
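
To make the "inline content xor a file path" rule concrete, here is a small client-side sketch that builds MCP tools/call requests and rejects invalid combinations before they reach the server. The tool names come from the list above; the helper itself and the argument name path are assumptions, since these notes do not spell out the exact parameter names.

```python
def tool_call(request_id, tool, *, inline=None, path=None, inline_key="yaml", **extra):
    """Build a JSON-RPC tools/call request with exactly one input source.

    'path' as the argument name for file input is an assumption; the
    inline key defaults to 'yaml' (the notes also mention 'condition'
    and 'events').
    """
    if (inline is None) == (path is None):
        raise ValueError("provide exactly one of inline content or a file path")
    if extra.get("write") and path is None:
        # The notes state write: true is only valid with a file path.
        raise ValueError("write=True requires a file path")
    arguments = dict(extra)
    if inline is not None:
        arguments[inline_key] = inline
    else:
        arguments["path"] = path
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


lint_request = tool_call(1, "lint_rules", inline="title: Test\n")
fix_request = tool_call(2, "fix_rules", path="rules/a.yml", write=True)
```

The server performs its own validation; checking on the client side simply turns a round trip into an immediate error.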

MCP server prerequisites: shared fix applier, reference data, and lint catalogue (#207)

Internal refactors that lift three pieces of lint and reference machinery into rsigma-parser so the CLI, the LSP, and the upcoming MCP server share one implementation. Behavior is unchanged for existing commands.

  • rsigma_parser::lint::fix. The string-level auto-fix applier (json_pointer_to_route, apply_single_fix_patch, apply_rename_key) moves from rsigma-cli into the parser, with a new apply_fixes_to_source(source, &[&LintWarning]) -> SourceFixOutcome entry point that applies every safe fix to a YAML string and reports applied/failed counts. The yamlpath/yamlpatch dependencies move with it. rsigma rule lint --fix keeps its file-on-disk behavior through a thin wrapper.
  • rsigma_parser::reference. The MODIFIERS and MITRE_TACTICS tables move out of the LSP binary (where they were unreachable cross-crate) into a public parser module; the LSP re-exports them so hover/completion are unchanged.
  • rsigma_parser::lint::catalogue. A new catalogue() returns per-rule metadata (id, default severity, fix disposition, one-line description) for all 75 lint rules, generated from a single list whose exhaustive match makes adding a rule without a catalogue entry a compile error.
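
For readers unfamiliar with the pointer-to-route step, the sketch below splits an RFC 6901 JSON pointer into key and index segments, which is the kind of conversion a function named json_pointer_to_route presumably performs before a patch is applied. It is an illustration of the idea only: the function name and the choice to treat all-digit segments as sequence indexes are assumptions, not the parser's actual behavior.

```python
def pointer_segments(pointer: str) -> list:
    """Split an RFC 6901 JSON pointer into unescaped segments.

    All-digit segments become ints (assumed to be sequence indexes);
    everything else stays a string key.
    """
    if pointer == "":
        return []  # the empty pointer addresses the whole document
    if not pointer.startswith("/"):
        raise ValueError("a JSON pointer is empty or starts with '/'")
    segments = []
    for raw in pointer[1:].split("/"):
        # RFC 6901 order matters: decode ~1 first, then ~0.
        token = raw.replace("~1", "/").replace("~0", "~")
        if token.isascii() and token.isdigit():
            segments.append(int(token))
        else:
            segments.append(token)
    return segments
```

For example, a pointer such as /detection/selection/0 yields the keys detection and selection followed by index 0.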

backend convert: per-rule file output when --output is a directory (#205)

rsigma backend convert can now write one file per converted rule instead of a single concatenated stream. When --output points at a directory (an existing directory, or a path with a trailing separator that is created on demand), each rule is written to its own file named after a snake_case slug of the rule title, with the backend's native extension. This was prompted by Fibratus rule-deployment ergonomics: the engine loads one YAML rule per file from its Rules/ directory, so the split output drops straight in without hand-separating the '---'-joined stream.

  • Naming. File stems are a slug of the rule title (Detect Whoami becomes detect_whoami), falling back to the rule id and then a rule literal when the title slugifies to nothing. Colliding names get a numeric suffix (same.yml, same_2.yml) so two rules never overwrite each other. A rule that converts to several documents (for example a temporal correlation expanded with -O temporal_permute=true) keeps them together in its one file, finalized through the backend so the format-aware separators land inside.
  • Extensions. A new Backend::output_file_extension hook picks the per-rule extension: yml for the Fibratus YAML envelope (txt for its bare-expression expr format), sql for PostgreSQL, and txt by default. Single-file and stdout output are unchanged.
  • Docs. The Fibratus backend reference, the rule-conversion guide, and the README document the directory-output workflow (rsigma backend convert rules/ -t fibratus -p fibratus_windows -o ./Rules/).
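
The naming rules can be sketched as follows. This is a minimal approximation rather than the converter's code: the exact slug character set and the use of the raw rule id as the fallback stem are assumptions.

```python
import re


def slugify(title: str) -> str:
    """Lowercase and collapse runs of non-alphanumerics into underscores."""
    return re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")


def file_stem(title, rule_id) -> str:
    """Title slug first, then the rule id, then the literal 'rule'."""
    return slugify(title or "") or (rule_id or "") or "rule"


def assign_names(rules, extension: str) -> list:
    """Give every (title, rule_id) pair a unique file name.

    Collisions get a numeric suffix starting at _2, so two rules never
    map to the same file.
    """
    taken = set()
    names = []
    for title, rule_id in rules:
        stem = file_stem(title, rule_id)
        candidate, n = stem, 1
        while candidate in taken:
            n += 1
            candidate = f"{stem}_{n}"
        taken.add(candidate)
        names.append(f"{candidate}.{extension}")
    return names


names = assign_names(
    [("Detect Whoami", None), ("Same", None), ("same", None), ("!!!", "abc-123"), ("", None)],
    "yml",
)
```

With these inputs the two "same" titles land in same.yml and same_2.yml, the unsluggable title falls back to its id, and the rule with neither becomes rule.yml.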

Fibratus conversion: corrected field mappings and registry event scoping (#202)

Three correctness fixes to the fibratus_windows pipeline shipped in #191, found while converting more of the upstream Fibratus rules library.

  • Process field coverage. process_creation and process_termination gain the field mappings they were missing against the Fibratus 3.0.0 vocabulary: OriginalFileName -> ps.pe.file.name, CurrentDirectory -> ps.cwd, ProcessGuid -> ps.uuid, ParentProcessGuid -> ps.parent.uuid, IntegrityLevel -> ps.token.integrity_level, Company -> ps.pe.company, Description -> ps.pe.description, Product -> ps.pe.product, and FileVersion -> ps.pe.file.version. process_termination additionally picks up the CommandLine, User, LogonId, and Parent* mappings it previously lacked entirely, so a process-exit rule that touches any of those fields now converts instead of failing.
  • Thread events. create_remote_thread maps TargetImage -> evt.arg[exe], so a rule that scopes the injected-into process converts rather than dropping the field.
  • Registry event scoping. The registry_event logsource category now prepends an evt.category = 'registry' discriminator as its first condition, the same treatment the other categories already get. Fibratus rejects a rule at load time when it has no event-type scoping by name or category, so without this the converted registry_event rules would not load.
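
A simplified picture of the two behaviors: a lookup table that fails loudly on unmapped fields, and a discriminator prepended as the first condition. The mapping below is a subset copied from the list above; the function names and the exact shape of the emitted expression (the "and" join and the parentheses) are assumptions of this sketch.

```python
# Subset of the process field mappings listed above.
PROCESS_FIELDS = {
    "OriginalFileName": "ps.pe.file.name",
    "CurrentDirectory": "ps.cwd",
    "ProcessGuid": "ps.uuid",
    "ParentProcessGuid": "ps.parent.uuid",
    "IntegrityLevel": "ps.token.integrity_level",
}


def map_field(sigma_field: str) -> str:
    """Translate a Sigma field name, failing conversion when unmapped."""
    try:
        return PROCESS_FIELDS[sigma_field]
    except KeyError:
        raise ValueError(f"no Fibratus mapping for {sigma_field!r}") from None


def scope_registry(condition: str) -> str:
    """Prepend the category discriminator as the first condition.

    The output formatting here is illustrative, not the backend's
    exact rendering.
    """
    return f"evt.category = 'registry' and ({condition})"
```

Before the fix, a rule touching a field missing from the table failed in the same way map_field does for an unknown name; after it, the lookup succeeds and the rule converts.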

rstix: Phase 2 slice 1 — model skeleton and common properties (#201)

Phase 2 (Data Model + Serialization) begins with slice 1 of ~7. This slice is not releasable on its own.

  • model module: ModelError and model::common property containers — SdoSroCommonProps (required spec_version, created, modified; confidence as Option<Confidence>), ScoCommonProps (SCO-only fields), ExternalReference (STIX §2.5.2: non-empty source_name plus at least one of description, url, or external_id enforced on construction and deserialization), GranularMarking (marking_ref XOR lang; non-empty selectors), and ExtensionMap / ExtensionType.
  • Leaf-type serde: serde_impls/ for StixId, timestamps, and Confidence; typed-ID serde in the define_typed_id! macro; inline LanguageTag serde.
  • Tests: fixture-backed integration tests in tests/spec.rs (tests/fixtures/spec/common/); core serde unit tests in src/core/.
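
The construction-time invariants described for ExternalReference and GranularMarking can be expressed compactly. The sketch below mirrors them with Python dataclasses purely for illustration; the real types are Rust structs in rstix, and the field types used here are assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ExternalReference:
    source_name: str
    description: Optional[str] = None
    url: Optional[str] = None
    external_id: Optional[str] = None

    def __post_init__(self):
        # Non-empty source_name plus at least one descriptive property.
        if not self.source_name:
            raise ValueError("source_name must be non-empty")
        if self.description is None and self.url is None and self.external_id is None:
            raise ValueError("need at least one of description, url, external_id")


@dataclass(frozen=True)
class GranularMarking:
    selectors: Tuple[str, ...]
    marking_ref: Optional[str] = None
    lang: Optional[str] = None

    def __post_init__(self):
        if not self.selectors:
            raise ValueError("selectors must be non-empty")
        # Exactly one of marking_ref or lang may be set (XOR).
        if (self.marking_ref is None) == (self.lang is None):
            raise ValueError("set exactly one of marking_ref or lang")
```

Enforcing the checks in the constructor means an invalid value can never exist, which is the same guarantee the notes describe for construction and deserialization.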

Configurable correlation state caps: --max-state-entries and a new per-group entry cap (#200)

The correlation engine's memory bounds are now fully operator-configurable. Previously the global (correlation, group-key) entry cap (max_state_entries, default 100,000) was a library-only setting with no CLI surface, and nothing bounded the growth of a single group's window state, which grows with timespan x event rate on chatty groups.

  • --max-state-entries <N> on engine eval and engine daemon (config key daemon.correlation.max_state_entries) exposes the existing global hard cap. When reached, the stalest entries are evicted down to 90% capacity and a warning is logged, as before. A drift-guard test pins the CLI default to the engine's CorrelationConfig default across the two crates.
  • --max-group-entries <N> (config key daemon.correlation.max_group_entries, per-rule rsigma.max_group_entries custom attribute) is a new, opt-in cap on retained entries within a single group's window state: timestamps for event_count, (timestamp, value) pairs for value_count and the numeric aggregations, and per-referenced-rule hits for the temporal types. On overflow the oldest entries are dropped, which can only under-count (aggregates saturate; a correlation that needed the evicted entries may not fire). Session windows always keep their oldest entry as the span anchor, so truncation cannot silently extend the timespan cap. Unset means unbounded, the historical behavior, so existing deployments are unaffected.
  • API. CorrelationConfig gains max_group_entries: Option<usize>, CompiledCorrelation gains the matching per-rule override, and WindowState gains truncate_oldest(cap, preserve_front). The per-rule attribute follows the same resolution order as the other rsigma.* correlation attributes: rule override wins over the engine default.
  • Docs. New flags documented on the engine eval and engine daemon CLI pages, the configuration file reference, the custom-attributes reference, the processing-pipelines attribute table, and the Performance Tuning guide's correlation-memory section (which also now states that the global cap bounds group count, not bytes within a group, and is global rather than per-rule).
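
The two caps behave differently, and a small model may help. The sketch below approximates stalest-first eviction down to 90% of the global cap and oldest-first truncation within one group, including the preserved front entry for session windows. It is a Python approximation with hypothetical function signatures, not the engine's WindowState code.

```python
from collections import deque


def evict_stalest(last_seen: dict, max_state_entries: int) -> list:
    """Global cap: once reached, evict least recently seen keys down to 90%."""
    if len(last_seen) < max_state_entries:
        return []
    target = (max_state_entries * 9) // 10
    victims = sorted(last_seen, key=last_seen.get)[: len(last_seen) - target]
    for key in victims:
        del last_seen[key]
    return victims


def truncate_oldest(entries: deque, cap: int, preserve_front: bool) -> int:
    """Per-group cap: drop oldest entries until len(entries) <= cap.

    With preserve_front the first entry survives as the span anchor and
    dropping starts at the second-oldest entry. Returns the drop count.
    """
    if cap < 1:
        raise ValueError("cap must be at least 1")  # assumption of this sketch
    excess = len(entries) - cap
    if excess <= 0:
        return 0
    anchor = entries.popleft() if preserve_front else None
    for _ in range(excess):
        entries.popleft()
    if preserve_front:
        entries.appendleft(anchor)
    return excess


groups = {f"k{i}": i for i in range(10)}  # key -> last-seen timestamp
evicted = evict_stalest(groups, max_state_entries=10)

session = deque([1, 2, 3, 4, 5])  # timestamps, oldest first
dropped = truncate_oldest(session, cap=3, preserve_front=True)
```

In the session example the window keeps timestamp 1 as its anchor and drops 2 and 3, so the measured span still starts at the original first event while the retained count can only under-report.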

Correlation window-mode benchmarks: throughput and peak-memory stress suite (#199)

Two new benchmark surfaces for the correlation window modes shipped in #192, prompted by the SEP #214 discussion on memory becoming the bottleneck in stateful window correlation (high-cardinality group keys, long-lived sessions).

  • correlation_window_modes Criterion group (cargo bench -p rsigma-eval --bench correlation -- correlation_window_modes): sliding vs tumbling vs session on an identical event_count workload. All three modes run at ~1.4-1.5 Melem/s — the window decision in apply_window_open is O(1), so declaring window: session is free at evaluation time.
  • correlation_memory bench target (cargo bench -p rsigma-eval --bench correlation_memory): not a Criterion suite — it installs a counting global allocator and reports peak/settled heap deltas, which Criterion cannot observe. Three scenario families: high-cardinality session keys against the max_state_entries cap (1M unique keys held to a 39.8 MiB peak by stalest-first eviction; ~256 B per live session group uncapped), long-lived chatty sessions (8 B per in-window event_count event; value_count with distinct strings costs ~92 B per event and drops to 63 Kelem/s at 1,800 distinct values per window because the distinct count is recomputed per event), and a three-mode comparison on identical load (identical memory and throughput).
  • Documentation: results recorded in BENCHMARKS.md (new Window Modes and Window-Mode Memory Stress sections plus Key Observations); the Performance Tuning guide's correlation-memory section now documents what the cap does and does not bound, per-event state costs by correlation type, the value_count distinct-count hot spot, and the cardinality-flood eviction caveat — and fixes a long-standing inaccuracy (the max_state_entries cap is global across all correlation rules, not per rule). The streaming-detection guide links the window-mode semantics to the measured numbers, and the developer testing guide documents the non-Criterion bench target.
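
The per-event figures above lend themselves to a back-of-the-envelope estimate of one group's window state: entries retained is roughly event rate times timespan, multiplied by the per-event cost. The sketch below does that arithmetic; it uses the measured 8 B and ~92 B figures from these notes, ignores container overhead, and the function name is hypothetical.

```python
EVENT_COUNT_BYTES = 8    # per in-window event_count event (measured above)
VALUE_COUNT_BYTES = 92   # approx. per value_count event with distinct strings


def group_window_bytes(events_per_second, timespan_seconds, bytes_per_event,
                       max_group_entries=None):
    """Rough upper bound on one group's window state in bytes.

    max_group_entries models the optional per-group cap; None means
    unbounded, the historical behavior.
    """
    entries = events_per_second * timespan_seconds
    if max_group_entries is not None:
        entries = min(entries, max_group_entries)
    return entries * bytes_per_event


# A chatty group at 50 events/s over a one-hour timespan.
uncapped_event_count = group_window_bytes(50, 3600, EVENT_COUNT_BYTES)
uncapped_value_count = group_window_bytes(50, 3600, VALUE_COUNT_BYTES)
capped_event_count = group_window_bytes(50, 3600, EVENT_COUNT_BYTES, max_group_entries=10_000)
```

Under these assumptions the group holds 180,000 entries: about 1.44 MB for event_count and about 16.6 MB for value_count, while a 10,000-entry group cap brings the event_count case down to 80 KB.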

Dependency and security bumps (#206)

Rolls up six open Dependabot PRs into a single merge and patches three RustSec advisories.

  • Rust (workspace Cargo.lock), batched via the patch-updates group (#197): log 0.4.30 to 0.4.32, chrono 0.4.44 to 0.4.45, daachorse 3.0.0 to 3.0.1, async-nats 0.49.0 to 0.49.1, hyper 1.10.0 to 1.10.1, and uuid 1.23.1 to 1.23.2; libfuzzer-sys 0.4.12 to 0.4.13 in fuzz/Cargo.lock (#195).
  • CI (all repinned by commit SHA, batched via the actions-updates group, #198): actions/checkout v6.0.2 to v6.0.3, github/codeql-action v4.36.0 to v4.36.2, and taiki-e/install-action v2.79.12 to v2.81.4.
  • VS Code extension: vscode-languageclient 9.0.1 to 10.0.0 (#196), @vscode/vsce 3.9.1 to 3.9.2 (#194), and esbuild 0.28.0 to 0.28.1 (#203); the three all touched editors/vscode/package.json and package-lock.json, resolved by keeping the newest of each and regenerating the lockfile.
  • Security: the transitive PostgreSQL client stack pulled in through rsigma-convert moves postgres-protocol 0.6.11 to 0.6.12 and postgres-types 0.2.13 to 0.2.14 (RUSTSEC-2026-0179, RUSTSEC-2026-0180) and tokio-postgres 0.7.17 to 0.7.18 (RUSTSEC-2026-0178), closing the three denial-of-service advisories published 2026-06-12.

v0.15.0...v0.16.0