Normalize CLI v0.3.1

@github-actions github-actions released this 08 May 09:42
· 120 commits to master since this release

Removed

  • Semantic embedding search dropped from the shipping binary. normalize structure search, normalize context --semantic, the [embeddings] config section, daemon incremental re-embedding, markdown/commit/context-block embedding during structure rebuild, the embeddings cargo feature flag, and the semantic_compat shim are all removed from the normalize binary. The normalize-semantic crate remains published on crates.io for standalone use; symbol search is being redesigned around discrete tags for a future release. Release builds simplify to a single cargo build step (musl no longer needs a no-default-features carve-out).

Added

  • First-run grammar install. The first time a user runs any
    normalize subcommand other than grammars/--help/--version,
    normalize checks whether tree-sitter grammars are installed in the
    user's config directory (~/.config/normalize/grammars/). If not, and
    the session is non-interactive (piped or driven by an agent), grammars
    are downloaded and installed automatically with a stderr notice. In an
    interactive session the check is deferred to normalize init, which
    prompts implicitly by installing on first invocation. Subsequent
    invocations short-circuit on a .installed-version stamp file, so the
    check has zero cost after first use. Users who already have a populated
    grammar directory (e.g. from cargo xtask build-grammars or
    NORMALIZE_GRAMMAR_PATH) are detected and stamped without a download.
  • normalize structure test-fixtures — language extraction fixture runner. Discovers <lang>/<case>/input.<ext> + expected.json pairs under crates/normalize-languages/tests/fixtures/, extracts symbols/imports/calls via SymbolParser, and diffs against expected JSON. Flags: --lang <lang> (filter to one language), --fixture-dir <dir> (custom fixture root), --update (write actual output as expected — bootstrap mode). Schema: {exhaustive, symbols: [{name, kind, line}], imports: [{module, name, line}], calls: [{callee, line}]}. All fields optional; subset matching by default; "exhaustive": true for exhaustive checking.
  • normalize rules test-fixtures — fixture-based test format for native/syntax rules. Discovers test cases with input source + expected diagnostics under .normalize/rule-tests/ (configurable via --fixture-dir) and runs them through the engine with subset or exhaustive matching.
  • normalize view chunk <file> --chunk N and --around <pattern> — large-file navigation for agents. --chunk N divides a file into fixed-size chunks (default 100 lines, configurable via --chunk-size) and shows chunk N (1-indexed). --around <pattern> finds the first regex/substring match and shows ±50 lines of context (configurable via --context-lines). Multiple matches: --match-index I to navigate. Returns ChunkedViewReport with file, chunk, total_chunks, line_start, line_end, content, and (for --around) match_line, total_matches, match_index. Works with --json/--jq/--jsonl.
  • normalize view import-path <from> <to> — find the shortest import chain between two files via BFS over the resolved import graph (requires normalize structure rebuild). --all returns all simple paths up to --limit (default 5); --reverse finds paths from <to> to <from>. Emits file → file → file chains or "No import path found between A and B" when unreachable.
  • normalize edit add-parameter — add a parameter to a function signature and update all call sites. normalize edit add-parameter <file> <function> --param <name> --default <value> [--type <type>] [--position <N>] [--dry-run]. Uses tree-sitter to locate the function and argument lists; finds callers via the facts index (falls back gracefully when unavailable). Supports Rust, TypeScript/JavaScript, Python.
  • normalize edit inline-function <file> <line>:<col> — inline a single-use function at its call site within the same file. Locates the function definition (or resolves the name from a call-site position), substitutes arguments for parameters using whole-word replacement, strips the return keyword when the call is in expression position, replaces the call with the inlined body, and removes the now-dead definition. Supports JS/TS function declarations and arrow const bindings, Python def, and Rust fn. Conservative: aborts if the function has multiple return statements or if the argument count doesn't match. --force to inline a function with multiple call sites (only the first is inlined).
  • normalize edit introduce-variable <file> <range> <name> — extract an expression at the given line:col-line:col range into a named binding inserted before the containing statement. Supports Rust (let), JS/TS (const), Python (bare assignment).
  • normalize edit inline-variable — inverse of introduce-variable: replace all uses of a variable with its initializer and delete the binding.
  • Post-edit index invalidation for refactoring commands. After any refactoring edit (rename, introduce-variable, inline-variable, add-parameter, move, inline-function) writes files to disk, the command immediately notifies the daemon via the new FilesChanged request. The daemon broadcasts FileChanged events for each path and triggers an incremental index refresh — no need to wait for inotify or run normalize structure rebuild manually. Non-fatal: silently skipped when the daemon is not running.
  • normalize sync — copy a project (and its AI agent session metadata) to a destination for portability. Excludes target/, node_modules/, .git/objects/, .normalize/findings-cache.sqlite by default. After copying, rewrites absolute paths in the index DB so the copy works from its new location. Supports --dry-run, --verbose, --all (sync all known projects), --active N (only projects with activity in the last N days), --repo <glob>, --exclude <glob>. Subsequent runs are incremental via a manifest of previously synced files.
  • normalize sessions parallelization [session-id] — finds turns with sequential same-type tool calls that could be parallelized (e.g. Read(foo.rs) → Read(bar.rs)). --threshold N controls minimum group size (default: 2). Works per-session or aggregated across filtered sessions.
  • normalize sessions heatmap [session-id] — per-file read/write counts across a session. Classifies files as hot (>5 writes), read_only (reads with 0 writes — potential test gap), or normal. Sorted by write count, --top N to limit rows.
  • normalize sessions cost [session-id] — per-turn token cost breakdown using model-specific Anthropic pricing. Shows input/output/cache-read/cache-write tokens and estimated USD per turn. Summary includes total cost, cost without cache, cache savings (USD), and cache efficiency %.
  • Composable sessions messages filters and review state. sessions messages now accepts --has-tool, --errors-only, --exclude-interrupted, --turn-range, --min-chars, --max-chars. New sessions mark/sessions unmark subcommands write/remove session IDs from .normalize/sessions-reviewed; sessions list accepts --reviewed/--unreviewed to filter accordingly.
  • high-fan-out and high-fan-in native rules — coupling smell detection. high-fan-out flags files that import from more than threshold (default: 20) distinct resolved modules; high-fan-in flags files imported by more than threshold distinct files. Both are default disabled, require normalize structure rebuild, and are configurable via [rules.rule."high-fan-out"] threshold = N / [rules.rule."high-fan-in"] threshold = N. Tagged architecture and coupling.
  • boundary-violations native rule — configurable directory-level import boundary enforcement. Declare pairs like "services/ cannot import cli/" under [rules.rule."boundary-violations"] boundaries = [...]; the rule queries the structural index's imports table and reports each resolved import that crosses a boundary. Default disabled; requires normalize structure rebuild. Uses glob matching with services/ treated as services/**.
  • dead-parameter native rule — flags function parameters that are never referenced in the function body. Uses normalize-scope's ScopeEngine with @local.definition.parameter captures in locals.scm. Underscore-prefixed names (_x) are excluded. Supported: Rust, Python, JavaScript, TypeScript, TSX, Go, Java, C, C++, C#. Default disabled.
  • Re-export tracing in import resolution. pub use path::Item in Rust and export { X } from './y' in TypeScript/JavaScript are now extracted as re-exports (is_reexport = 1 in the imports table). After resolve_all_imports(), a new trace_reexports() pass follows re-export chains (up to 10 hops, with cycle detection) so imports in file A that land on an intermediate re-exporter B are updated to point to B's source file C. This improves call graph accuracy and find_callers/find_callees results across module boundaries. Schema bumped v11→v12.
  • #match?/#eq? predicate evaluation for tree-sitter queries. normalize_languages::satisfies_predicates now evaluates the standard predicates (#match?, #not-match?, #eq?, #not-eq?) so .scm query authors can filter captures by text content or equality. Unknown predicates continue to pass (forward-compatible).
  • normalize-surface-syntax pattern matching / destructuring as first-class IR nodes. Pat enum (Ident, Object(Vec<PatField>), Array(Vec<Option<Pat>>, Option<String>), Rest) and PatField { key, pat, default } added to the IR; Stmt::Destructure { pat, value, mutable } replaces the old lowering of const { a, b } = obj to multiple Stmt::Let bindings. TypeScript reader produces Stmt::Destructure with full structural fidelity (shorthand, renamed, nested, defaults, object rest, array holes, array rest); TypeScript writer emits const { a, b: c } = obj and const [x, y, ...rest] = arr. Python reader maps a, b = f(), [x, y] = arr, (a, *rest) = items. Lua writer lowers to local a, b = table.unpack(expr).
  • normalize-surface-syntax type annotations and template literals. Param struct replaces String for function parameters and carries type_annotation: Option<String>; Function gains return_type; Stmt::Let gains type_annotation. TypeScript and Python readers populate these fields; writers emit them. Expr::TemplateLiteral(Vec<TemplatePart>) added for backtick strings; TypeScript writer emits backtick syntax, Python writer emits f-strings, Lua writer lowers to .. concatenation.
  • normalize-surface-syntax import/export and class definitions in IR. Imports and exports are now first-class IR nodes (previously elided). Class definitions carry methods, fields, and inheritance.
  • normalize-surface-syntax Lua reader improvements. obj:method(args) desugars to obj.method(obj, args) with implicit self. Table constructor parsing fixes: ["string"] = value computed string keys properly extract the string value; multi-variable generic for (for k, v in pairs(t)) preserves all loop variables; numeric for reads the optional step field. Fixed elseif chaining bug. Object key emission uses idiomatic bare identifier syntax for valid Lua identifiers. String escaping handles null bytes.
  • normalize-surface-syntax JavaScript reader and TypeScript reader gap fixes. A dedicated JavaScript reader joins the existing TypeScript reader; remaining gaps in the TypeScript reader (comments preservation, additional grammar nodes) are filled.
  • normalize-typegen IR improvements. Field gains nullable: bool, default: Option<DefaultValue>, and constraints: Option<FieldConstraints> (min/max, min/max_length, pattern, format). Schema::validate() checks for well-formedness: valid identifiers, no duplicate type/field names, unresolved Ref targets, and circular reference detection via DFS.
  • normalize-typegen JSON Schema, GraphQL SDL, and Protobuf output backends (backend-jsonschema, backend-graphql, backend-proto features). JSON Schema draft 2020-12 with $defs/$ref, anyOf/oneOf, required, additionalProperties: false. GraphQL SDL: structs → type/input, string enums → UPPER_CASE variants, tagged unions → union. Protobuf proto3: structs → message, string enums with mandatory _UNSPECIFIED = 0, tagged unions → oneof, arrays → repeated, optional fields → optional keyword.
  • normalize-typegen Protobuf and GraphQL SDL input parsers — round-trip support for both formats; combined with the writers above, schemas can flow between any pair of supported representations.
  • normalize-typegen --split and --dry-run CLI; IR source locations. Schemas can be emitted as one file per type via --split. IR carries source locations for error reporting.
  • Daemon-cached diagnostics for all engines. The daemon caches syntax, fact, and native rule diagnostics and serves them instantly on normalize rules run. Cache is primed eagerly on file changes (incremental for syntax/fact, full re-run for native) and lazily on first request. Falls back to local evaluation transparently when the daemon is unavailable.
  • Daemon push of diagnostic deltas via binary subscribe. New Event::DiagnosticsUpdated { root, updates } variant carries per-file deltas (only files whose issues changed since the last refresh; empty issues Vec = file became clean). Subscribe connections opened with the 0x01 magic byte stream events as length-prefixed rkyv binary frames; the JSON-line subscribe path is unchanged for backward compatibility. New DaemonClient::watch_events_binary decodes the binary frames. Eliminates the LSP poll-after-IndexRefreshed pattern. Wire-schema change: Event path fields are now String (previously PathBuf) so the enum is rkyv-serializable.
  • Per-file diagnostic storage in the daemon + JSON mirror. Schema bump v9→v10 adds a daemon_diagnostics_per_file table (path PRIMARY KEY, rkyv Vec<Issue> blob) populated on every prime/refresh. Per-file pulls (e.g. normalize rules run path/to/file.rs) hit this table directly via a new filter_files field on RunRules instead of fetching and filtering the whole "all" blob. The daemon also writes .normalize/diagnostics.json atomically on every prime/refresh as a canonical-state artifact for ephemeral consumers.
  • Daemon live-reloads .normalize/config.toml and .normalize/rules/** on change. Previously the daemon's notify watcher saw config edits but routed them nowhere, so cached RunRules results stayed under the old config until either a source file changed or the daemon was restarted. The dispatch loop now classifies config and rule-definition edits as a fourth route; the handler clears every cached diagnostic blob and triggers a reprime against the freshly-loaded NormalizeConfig. [daemon] startup keys (enabled, auto_start) still require a restart by design.
  • Cross-daemon-restart cache validity (config_hash gate). The SQLite-backed daemon diagnostic blobs (daemon_diagnostics, daemon_diagnostics_per_file) carry the hash of the inputs that produced them (binary version + .normalize/config.toml + .normalize/rules/**). On load, a hash mismatch is treated as a cache miss; the daemon reprimes under the current config. Schema bump v10→v11.
  • Tier 1 daemon config reload — filter-only changes apply at serve, no reprime. When .normalize/config.toml edits affect only severity, allow-lists, or enabled = false, the daemon updates its cached config snapshot, sets a serve_filter_pending flag, and lets the next RunRules re-filter the cached findings in place. No blob clear, no rule re-evaluation.
  • Tier 2 daemon config reload — per-rule surgical re-eval. When config edits affect only specific rules (a rule is newly enabled, or its extra threshold/field changed), the daemon re-runs only those rules through the syntax, fact, and native engines using the existing filter_ids path, splices updated findings into the per-engine blobs, rebuilds the "all" blob and per-file rows, and broadcasts a DiagnosticsUpdated event. ConfigDiff::rules_to_rerun names the affected rule IDs; surgical_rerun_rules implements the splice. [walk] exclude changes still trigger a full reprime.
  • ConfigDiff for surgical daemon cache invalidation. normalize-rules-config exports ConfigDiff::compute(old, new) which classifies a config change as filter-only, per-rule re-run, or full reprime. Consumed by the daemon tiers above.
  • Daemon .scm file hash tracking. Custom rule edits (.normalize/rules/**/*.scm) participate in the Tier 2 surgical re-eval path via content-hash tracking, not just mtime.
  • LSP native rule diagnostics. The LSP server publishes diagnostics from native rules (missing-summary, stale-summary, check-refs, etc.) alongside syntax and fact rule diagnostics. Native rules run debounced and workspace-wide. They re-trigger on .git/index changes so git add events immediately refresh stale-summary results.
  • No git binary required. All git operations now use gix (pure-Rust gitoxide): git blame (ownership, provenance, view history), git status --porcelain, path-filtered commit counts, git rev-list --count, budget metrics diff, ratchet ref-based check/measure. A git binary in $PATH is no longer a runtime dependency.
  • Configurable walker exclusions — new [walk] section in .normalize/config.toml controls directory walking. ignore_files configures which gitignore-format files are respected (default: [".gitignore"]; [] to disable). exclude accepts gitignore-style globs (default: [".git", ".claude/worktrees/"]). Threaded through native rules, syntax rules, the unified rules runner, the daemon, and the LSP server.
  • Co-change edge index. normalize structure rebuild populates a co_change_edges table in the SQLite index with file pairs that frequently change together (co-change count ≥ 2, commits touching >50 files skipped as noise, per-file fanout capped at top 20 partners). Incremental: only new commits since the last rebuild are processed. normalize analyze coupling-clusters queries this table instead of re-walking git history; falls back transparently to the git walk when the table is empty. Rebuild output now includes a co_change_edges count.
  • stale-doc native rule — flags documentation files that are likely stale because strongly co-changed code files have been updated more recently. Queries the co_change_edges index for each doc file (**/*.md, **/*.rst, docs/**/*), finds code files it historically changes with, and flags the doc if any partner was committed more recently. SUMMARY.md is excluded (covered by stale-summary). Configurable via [rules.rule."stale-doc"] with min_co_changes (default 3), min_lag_days (default 0), and doc_patterns. Default disabled; requires normalize structure rebuild.
  • missing-test fact rule — flags public functions that are never called from a test function (a function with a test attribute such as #[test], @test, @Test, or @pytest.mark). Default disabled. Entry-point and module-boundary files excluded via the default allow list.
  • stale-mock fact rule — flags mock/stub functions (identified by attributes such as @Mock, @patch, @stub, mock, stub, fake) that call a callee which no longer exists as a symbol in the index. Catches mocks that were not updated after a rename or deletion. Default disabled.
  • normalize edit move — moves a symbol's definition to another file and rewrites import statements in every file that imported it from the old location. Per-language module-path derivation is best-effort: Python, Go, and JavaScript/TypeScript imports are rewritten when a new path can be derived; Rust and unsupported cases emit warnings and skip the import site rather than fabricating wrong paths. --reexport (Python only) leaves a re-export stub at the source location. Supports --dry-run and shadow-history --message. Leading decorations (doc comments, attributes, decorators, annotations, pragmas) preceding the symbol are included in the move, classified by tree-sitter node.kind() rather than text patterns.
  • normalize config validate deep validation. Runs four phases (TOML syntax, JSON Schema compliance, serde deserialization, rules config parsing) on both project and global config files. Reports errors with file path, line/column when available, and validation phase. Exits non-zero on errors for CI/hook use.
  • normalize grep <path> — optional positional path argument scopes the search tree (consistent with view, edit, rank). The existing --root flag is preserved for backward compatibility; path takes precedence when both are given.
  • normalize rules run --files — accept an explicit list of file paths, bypassing the file tree walker entirely. Critical for hook-grade latency where the caller already knows which files changed. Composes with --only/--exclude for further filtering.
  • normalize rules run --only/--exclude — glob pattern filtering. --only "*.rs" restricts to Rust files; --exclude "tests/" skips test directories. Applied pre-walk for syntax rules and advisory native rules; post-walk for fact rules.
  • normalize structure rebuild --only/--exclude — glob pattern filtering for which files get indexed. Files not matching the filter are removed from the index after the walk.
  • normalize analyze architecture --limit — caps the number of cross_imports entries in the output (default 20, --limit 0 disables). Reduces default JSON response from ~196KB to ~10KB.
  • Command aliases. search/find → grep, lint → rules run, check → ci, index → structure rebuild, refactor → edit. Users from other tools find familiar names work transparently.
  • Tiered help output. normalize --help groups commands into four sections (Core, Analysis, Utilities, Infrastructure) instead of a flat alphabetical list. Core commands (view, grep, edit, rules, structure, init) appear first.
  • Daemon-cache timing diagnostic. normalize rules run prints [timings] daemon-cache: ... to stderr when diagnostics are served from the daemon's pre-warmed cache, making it visible that the fast path was taken.
  • NORMALIZE_DAEMON_CONFIG_DIR test override — when set, this env var redirects the daemon's daemon.sock, daemon.lock, and daemon-spawn.lock to the named directory. Lets integration tests spawn isolated daemons without contending with the user's running instance. Production behavior is unchanged when unset.
  • normalize trend — top-level subcommand for time-series health metrics. Replaces normalize analyze complexity-trend, analyze length-trend, analyze density-trend, analyze test-ratio-trend, and analyze trend. New names: normalize trend complexity, normalize trend length, normalize trend density, normalize trend test-ratio, normalize trend multi.
  • normalize package tree --depth N — caps the dependency tree at depth N (0 = roots only). Limits both text and JSON output. Default: unlimited.
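
The re-export tracing entry in the list above (follow re-export chains for up to 10 hops, with cycle detection) can be illustrated with a minimal sketch. It is not the actual trace_reexports() implementation: the map shape, the function name trace_reexport, and the choice to stop at the last file reached when a cycle is detected are all assumptions made for illustration.

```rust
use std::collections::{HashMap, HashSet};

/// Hop limit taken from the release notes (up to 10 hops).
const MAX_HOPS: usize = 10;

/// Follow re-export edges starting at `start` and return the file where the
/// chain ends. `reexports` maps a re-exporting file to the file it re-exports
/// from; this simplified shape is hypothetical, not the real index schema.
fn trace_reexport<'a>(start: &'a str, reexports: &HashMap<&'a str, &'a str>) -> &'a str {
    let mut current = start;
    let mut seen: HashSet<&str> = HashSet::new();
    seen.insert(current);

    for _ in 0..MAX_HOPS {
        match reexports.get(current) {
            Some(&next) => {
                // `insert` returns false if `next` was already visited: a cycle.
                if !seen.insert(next) {
                    break;
                }
                current = next;
            }
            // No further re-export: `current` is the source file.
            None => break,
        }
    }
    current
}
```

With a map containing only b.rs → c.rs, an import that lands on b.rs resolves to c.rs, while a file with no re-export edge resolves to itself.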

Changed

  • All tree-sitter grammars now load uniformly via dlopen(). Previously
    normalize-surface-syntax and normalize-typegen statically linked four
    arborium-* grammar crates (TypeScript, JavaScript, Lua, Python — and
    GraphQL in typegen), while every other grammar loaded dynamically through
    the shared GrammarLoader. The static linking is gone: those readers now
    request grammars from the process-wide grammar_loader() singleton like
    the rest of the codebase. Net effect on the binary: the four (five for
    typegen) compiled-in parsers are removed; the runtime requirement that
    the relevant .so files live in a search path was already true for all
    other languages and is unchanged. Cargo features read-typescript,
    read-javascript, read-lua, read-python, input-typescript, and
    input-graphql continue to work — they now gate the reader source code,
    not a grammar dependency.
  • musl release build is now fully self-contained — no system runtime dependencies. Previously the x86_64-unknown-linux-musl artifact was a static-pie binary (couldn't dlopen() grammar .so files), then briefly a dynamic binary that required the system to provide ld-musl-x86_64.so.1 and libc.musl-x86_64.so.1. The release tarball now bundles its own musl loader and libc alongside a tiny POSIX-sh wrapper script that invokes the bundled loader explicitly (exec "$DIR/runtime/ld-musl-x86_64.so.1" --library-path "$DIR/runtime" "$DIR/runtime/normalize.elf" "$@"). The artifact runs on any Linux x86_64 system — Alpine, distroless, NixOS without pkgs.musl, glibc-only distros — with no installed musl required. install.sh extracts the tarball under ~/.local/share/normalize/ and symlinks the wrapper into ~/.local/bin/. The musl artifact is now also the safe default on systems without glibc (and the only choice on NixOS).
  • rusqlite → libsql migration across the workspace. normalize-facts (CA cache), normalize-native-rules (findings cache), normalize-syntax-rules (findings cache), and the normalize sync path-rewrite all moved off rusqlite onto libsql. Resolves a sqlite link conflict that previously forced workspace-wide symbol coordination. No user-visible behavior change; cache files at ~/.config/normalize/ca-cache.sqlite and .normalize/findings-cache.sqlite are still SQLite and remain compatible.
  • *-allow files removed in favor of config.toml entries. The 7 legacy .normalize/*-allow files (large-files-allow, hotspots-allow, duplicate-blocks-allow, duplicate-functions-allow, duplicate-types-allow, similar-blocks-allow, similar-functions-allow) are no longer loaded. Their entries now live directly in config.toml: large-files-allow → [rules.rule."long-file"] allow = [...]; hotspots-allow → [analyze] hotspots_exclude; duplicate/similar command allowlists → [analyze.<subcommand>] allow = [...]. Migration: if you have custom entries in any *-allow file, move them to the appropriate config.toml section.
  • Per-rule config moved to [rules.rule."<id>"]. Previously [rules] hosted both engine-wide bare keys (e.g. global-allow, sarif-tools) and per-rule sub-tables (e.g. [rules."rust/dbg-macro"]) — a TOML namespace collision waiting to happen. Per-rule overrides are now nested under a dedicated rule sub-table; the bare-key namespace under [rules] is reserved for engine-wide configuration. The legacy layout is still parsed for one release with a stderr deprecation warning. Migration: rename every [rules."<id>"] to [rules.rule."<id>"]. Engine-wide keys stay where they are.
  • [walk] exclude now accepts gitignore-style glob patterns. Previously each entry was matched only against directory entry basenames. Patterns are now compiled via a gitignore matcher anchored at the project root, so any pattern that works in .gitignore works here. Existing configs (e.g. [".git", "worktrees"]) keep working unchanged because gitignore patterns without slashes still match at any depth.
  • normalize init --setup detects scratch directories. When .claude/worktrees/ is present, the bootstrap step adds it to [walk] exclude by default. The Default::default() for NormalizeConfig is now genuinely empty; the opinionated bootstrap config lives separately and is only applied during init.
  • normalize rules run --only/--exclude pre-walk scoping. Glob patterns are applied before file parsing and walking, not just after. Single-file --only runs are now proportional to the matched file count, not the full tree.
  • normalize rules run routes through daemon when running. If normalize daemon start is active, normalize rules run (and any invocation that hits fact rules) sends the request to the daemon via Unix socket and receives pre-warmed Datalog evaluation results instead of cold-evaluating from scratch (~45 seconds on large codebases). Falls back transparently when no daemon is running.
  • normalize structure rebuild defaults to incremental mode (mtime-based). Only files changed since the last build are re-indexed. Pass --full to force a complete rebuild. When no files have changed, the command prints "Index up to date". The --json output includes an incremental: true field when incremental mode was used.
  • normalize view --dir-context accepts an integer N instead of a boolean flag. N selects context files using Python list[:N] semantics on the target→root ordered list: 1 = target dir only, 2 = target + parent, -1 = all ancestors, 0 = none.
  • normalize view --dir-context JSON output includes a dir_context field in ViewReport containing the merged context content. Previously the context was only prepended to text output; agents using --json received no context.
  • normalize rules tags always populates the rules array in JSON output. The --show-rules flag has been removed.
  • normalize syntax ast default depth changed from unlimited (-1) to 5. Pass --depth -1 to restore the old unlimited behavior.
  • normalize analyze docs --json by_language field serializes as named objects {"documented": N, "total": N} instead of positional arrays.
  • normalize grammars list --json returns objects with name and path fields instead of bare strings. Text output is unchanged.
  • normalize analyze architecture compact output no longer truncates hub and symbol paths with opaque worktree-hash prefixes. Paths are shown as clean workspace-relative paths.
  • normalize context compact output includes <!-- source --> file path comments and --- separators between blocks when multiple context files are merged. Single-block output is unchanged.
  • normalize ci / normalize rules run compact output (N files) header now reads (N files checked) to clarify it is the number of files scanned, not files with issues.
  • normalize grep consecutive matches within the same symbol are grouped under a single (SymbolName L48-61): header rather than repeating the symbol tag on every line.
  • normalize view <file>:N-M header no longer duplicates the line range (was file.rs:10-20:10-20, now file.rs:10-20).
  • normalize analyze length → normalize rank length; normalize analyze test-gaps → normalize rank test-gaps. Ranking commands moved under rank.
  • normalize analyze node-types removed — duplicate of normalize syntax node-types. Use the latter.
  • large-file rule renamed to long-file — consistency with long-function. Update [rules.rule."large-file"] to [rules.rule."long-file"] in your config.
  • long-file, high-complexity, long-function available as native rules. Threshold-based health findings from analyze health are now usable in normalize rules run --type native. Default disabled (advisory); enable via [rules.rule."long-file"] enabled = true or --rule long-file. Defaults: 500 lines (long-file), complexity 20 (high-complexity), 100 lines (long-function). Thresholds configurable via threshold key.
  • FileRule trait for native file-based rules. long-function, high-complexity, and long-file implement a FileRule trait providing automatic SQLite caching and parallel execution. New file-based rules get caching for free by implementing check_file() and to_diagnostics().
  • validate-calls-scm moved out of SARIF tools — the internal .calls.scm capture-name validation is no longer a [[rules.sarif-tools]] entry. It is now a direct step in scripts/pre-commit using a single grep invocation (~5 ms vs ~762 ms).
  • Mtime-based cache for SARIF tools. [[rules.sarif-tools]] entries support an optional watch field (list of glob patterns). When set, normalize rules run caches the tool's output in the SQLite findings cache keyed by the max mtime of all matching files, skipping the tool on warm runs where nothing changed.
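
The max-mtime keying used by the SARIF tool cache in the last entry above can be sketched roughly as follows. This assumes the caller has already expanded the watch globs into modification times; cache_key and can_skip_tool are hypothetical names, and the real cache lives in the SQLite findings cache rather than in memory.

```rust
use std::time::SystemTime;

/// Cache key: the newest modification time among all watched files.
/// Returns None when no file matched the watch patterns.
fn cache_key(mtimes: &[SystemTime]) -> Option<SystemTime> {
    mtimes.iter().copied().max()
}

/// The tool run can be skipped only when a key was stored previously and it
/// equals the key computed for the current state of the watched files.
fn can_skip_tool(stored: Option<SystemTime>, current: Option<SystemTime>) -> bool {
    matches!((stored, current), (Some(a), Some(b)) if a == b)
}
```

Touching any watched file so that it becomes the newest one raises the maximum, so the stored key no longer matches and the tool runs again.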

Performance

  • Persistent symbol cache for single-file commands. Extractor checks the CA cache (~/.config/normalize/ca-cache.sqlite) before running tree-sitter on a file. On cache hit (same blake3 content hash, grammar, and include_private setting), the stored Vec<Symbol> is returned immediately — no parse, no query execution. Cross-file resolver paths are excluded since their results depend on other files.
  • Persistent tree-sitter query cache for facts extraction. Tree-sitter query results are cached in SQLite keyed by content hash and grammar version; normalize structure rebuild reuses extraction output for unchanged files instead of re-running queries.
  • normalize context v2 daemon caching. Context queries hit the daemon cache for near-zero latency on warm runs.
  • rkyv binary IPC for daemon rules cache. normalize rules run communicates with the daemon using a zero-copy binary protocol instead of three JSON round-trips: magic byte (0x01), 5-byte binary frame header ([type_byte][4-byte LE len]), rkyv-serialized payload. The daemon pre-builds an "all" blob on every refresh so the common unfiltered case is a direct blob read. Schema bump v8→v9. Round-trip: ~6–8 ms vs ~82 ms previously.
  • Parallel fact rule evaluation. run_rules_batch evaluates all enabled Datalog rules in parallel using rayon. On a typical 8-core machine with 7 enabled rules, normalize rules run --type fact drops from ~5.5 s to ~2.5 s wall time. JIT is not used in the parallel path (the ascent-interpreter JIT internals are not thread-safe under concurrent engine initialization); the sequential JIT path is still used for the incremental/daemon-cached path via run_rule_with_cache.
  • Syntax and native rules findings cache: WAL mode + single transaction. The per-file SQLite findings cache used by syntax rules and native rules opens with PRAGMA journal_mode=WAL; PRAGMA synchronous=NORMAL; and wraps cache-write operations in a single BEGIN/COMMIT transaction per run. Cold run: ~1.6 s (vs ~19 s before); warm run: ~0.84 s total.
  • SQLite findings cache for native and syntax rules. Warm runs of long-file, high-complexity, and long-function skip unchanged files entirely using a SQLite-backed per-file cache stored at .normalize/findings-cache.sqlite. The syntax rules engine migrates from the JSON syntax-cache.json to the same SQLite store. Cache keys include (path, mtime_nanos, config_hash, engine); changing the threshold or rule set invalidates only the affected entries.
  • Batched uncommitted-changes check for stale/missing-summary. stale-summary and missing-summary open the git repository once and collect all changed paths into a HashSet before the directory loop, replacing hundreds of per-directory gix calls. Warm pre-commit runs drop from ~2.2 s per rule to ~170 ms.
  • Incremental git walk for stale/missing-summary cache. The stale-summary and missing-summary rules walk only the commits since the last cached HEAD instead of re-walking all of git history on every pre-commit run. Warm runs after a single commit now take milliseconds.
  • Parallel effective-files walk for advisory native rules. When --only/--exclude filters are active, the file walk used to build the effective file list runs concurrently with the first group of native rules instead of sequentially after them.
  • Daemon file watcher respects [walk] exclude. Previously the file watcher ignored [walk] exclude and registered tens of thousands of inotify watches under directories like .claude/worktrees/. The watcher now consults the same gitignore-style matcher as the walker, eliminating ~50k spurious inotify watches on typical projects.
  • Native rule timing diagnostics. RUST_LOG=debug normalize rules run --type native logs per-rule and total elapsed time to stderr via tracing::debug!.
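
The 5-byte frame header mentioned for the daemon rules cache ([type_byte][4-byte LE len]) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the daemon's code: encode_frame and decode_frame are hypothetical names, the payload is treated as opaque bytes (the real one is rkyv-serialized), and the magic byte is left out because the notes do not say where it sits relative to the header.

```rust
/// Header size from the release notes: one type byte plus a 4-byte length.
const HEADER_LEN: usize = 5;

/// Encodes [type_byte][4-byte little-endian length] followed by the payload.
fn encode_frame(type_byte: u8, payload: &[u8]) -> Vec<u8> {
    assert!(payload.len() <= u32::MAX as usize, "payload too large for a u32 length");
    let len = payload.len() as u32;
    let mut out = Vec::with_capacity(HEADER_LEN + payload.len());
    out.push(type_byte);
    out.extend_from_slice(&len.to_le_bytes());
    out.extend_from_slice(payload);
    out
}

/// Decodes one frame from the front of `buf`.
/// Returns (type_byte, payload, remaining bytes), or None if the frame is incomplete.
fn decode_frame(buf: &[u8]) -> Option<(u8, &[u8], &[u8])> {
    if buf.len() < HEADER_LEN {
        return None; // header not fully received yet
    }
    let type_byte = buf[0];
    let len = u32::from_le_bytes([buf[1], buf[2], buf[3], buf[4]]) as usize;
    let rest = &buf[HEADER_LEN..];
    if rest.len() < len {
        return None; // payload not fully received yet
    }
    let (payload, remaining) = rest.split_at(len);
    Some((type_byte, payload, remaining))
}
```

A length-prefixed frame lets the reader know exactly how many bytes to wait for before handing the payload to the deserializer, which is what makes a single round-trip practical.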

Fixed

  • musl release tarball now ships musl-linked grammars. The release workflow previously built grammars once on the glibc host runner and bundled the same .so files into both the gnu and musl tarballs. The musl-linked normalize binary uses the musl loader, which cannot dlopen() glibc-linked shared objects, so any grammar load on musl would fail at runtime. The workflow now invokes cargo xtask build-grammars --target x86_64-unknown-linux-musl --cc musl-gcc for the musl matrix entry and packs the resulting target/x86_64-unknown-linux-musl/grammars/*.so into the musl grammar artifact. A readelf -d check in CI fails the build if any grammar still depends on libc.so.6. cargo xtask build-grammars gained --target <triple> and --cc <compiler> flags; both default to host behavior so existing invocations are unchanged.
  • libsql block_on shim no longer deadlocks/panics when called from a tokio worker. The cache layers in normalize-facts, normalize-native-rules, and normalize-syntax-rules cache an owned current-thread runtime when constructed from sync code (the common case). Previously the helper used that cached runtime unconditionally, which caused Cannot start a runtime from within a runtime panics whenever a #[tokio::test] (or any other tokio task) hit a cache that had been initialized earlier from a sync test in the same process. The helper now inspects the call site's tokio context first — only falling back to the cached runtime when not already inside one — so the public API stays synchronous while remaining safe to call from any context.
  • normalize update no longer recomputes SHA-256 in O(n²). The hand-rolled hash implementation in the self-update path was quadratic and could take minutes on macOS releases (~45 MB binary). It has been replaced with the sha2 crate, and verification now takes milliseconds.
  • DaemonClient socket path captured at construction, not at every method call. DaemonClient::new() previously re-read NORMALIZE_DAEMON_CONFIG_DIR on every method invocation. Under any concurrent use this raced. DaemonClient::new() now resolves the env var once and stores the resolved PathBuf. New DaemonClient::with_socket_path(PathBuf) constructor lets callers (tests, LSP servers talking to multiple workspaces, library embedders) target a specific socket without touching env vars.
  • Daemon refresh events suppressed during first 60 s after add_root. FileIndex::incremental_refresh() short-circuited via a needs_refresh() staleness gate (a cold-CLI optimization). The daemon's refresh_root called the gated variant, killing LSP push diagnostics during the first minute of a daemon session. The fix adds FileIndex::incremental_refresh_force(), which skips the gate; the daemon now uses it.
  • Daemon refresh_root deadlock on second incremental refresh. refresh_root held the per-root FileIndex mutex across the entire match arm, including the save helpers which all re-acquire the same mutex. The lock is now released after the last direct idx use and the save helpers re-acquire it as designed.
  • Grammar load failure is a loud warning, not silent empty results. When normalize structure rebuild encounters a file whose grammar .so is unavailable, it emits a tracing::warn! (once per grammar per run) and skips the file entirely rather than indexing it as having zero symbols. End-to-end fix: SymbolParser::parse_file returns Option<Vec<FlatSymbol>> (None = grammar unavailable); both the full and incremental paths skip files that return None and never write them to the CA cache or SQLite.
  • JIT compilation re-enabled for Datalog rules. ascent-interpreter upgraded to 0.2.0-alpha.1, which fixes the packed-tuple arity mismatch bug that caused aborts on non-trivial relations. JIT is active by default on x86_64; aarch64 continues to use interpreted evaluation.
  • Daemon memory leak. WatchedRoot no longer holds diagnostics or the reverse-dep graph in memory. After each refresh, issues are persisted to the daemon_diagnostics table in the SQLite index, then dropped from heap. The reverse-dep graph is derived transiently from the imports table on each refresh and discarded after use. Steady-state daemon RSS reduced from ~2.3 GB (after 10 days) to near-zero.
  • syntax-rules walker uses gitignore-style exclude. Previously the syntax-rules walker matched [walk] exclude against filename basenames only, ignoring the gitignore semantics that the rest of the system uses. Now consistent across the codebase.
  • Auto-build index for commands that need it. Commands that depend on the structural index (test-gaps, blame, coupling-clusters) auto-build the index when it's empty instead of silently returning degraded results. ensure_ready_or_warn prints a hint to stderr when the index can't be built.
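
The shape of the refresh_root deadlock fix (release the lock before calling helpers that take it again) can be sketched as follows. This is a simplified illustration, not the daemon's code: it assumes std::sync::Mutex, and the FileIndex fields and the save helper here are hypothetical stand-ins.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for the per-root index state.
struct FileIndex {
    refreshes: u32,
    saved: u32,
}

/// Stand-in for a save helper: it acquires the mutex itself.
fn save(index: &Mutex<FileIndex>) {
    let mut idx = index.lock().unwrap();
    idx.saved = idx.refreshes;
}

/// Refreshes, then saves. The guard lives in an inner scope so it is dropped
/// before save() locks the same mutex again.
fn refresh_root(index: &Mutex<FileIndex>) {
    {
        let mut idx = index.lock().unwrap();
        idx.refreshes += 1;
    } // guard dropped here, after the last direct use of idx
    save(index);
}
```

Had the guard stayed alive across the call to save, the second lock() on the same thread would never succeed: std::sync::Mutex is not reentrant, and its documentation says such a call may deadlock or panic.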

Internal

  • RuleOverride typed per-rule config. Rule-specific fields (filenames, paths, threshold) moved out of the flat RuleOverride struct into typed per-rule config structs. Common fields (severity, enabled, allow, tags) stay shared; rule-specific TOML keys land in extra via #[serde(flatten)] and are deserialized by each rule via RuleOverride::rule_config::<T>().
  • Refactoring engine (refactor/) — three-layer architecture for composable code transformations: semantic actions (query/mutation primitives), recipes, and a shared executor with dry-run/shadow support. Foundation for move, add-parameter, inline-function, introduce-variable, inline-variable.
  • normalize-refactor crate — refactoring engine extracted from the main crate. Clean dependency boundary on normalize-edit, normalize-facts, normalize-languages, normalize-shadow.
  • normalize-syntax-rules fix feature gate: apply_fixes and expand_fix_template are gated behind a fix feature that is enabled by default (default = ["fix"]). Read-only rules consumers can disable it with default-features = false.

Installation

curl -fsSL https://rhi.zone/normalize/install.sh | sh
irm https://rhi.zone/normalize/install.ps1 | iex

Manual download: pick the archive for your platform from the assets below and verify with SHA256SUMS.txt.