Normalize CLI v0.3.1
Removed
- Semantic embedding search dropped from the shipping binary.
normalize structure search, normalize context --semantic, the [embeddings] config section, daemon incremental re-embedding, markdown/commit/context-block embedding during structure rebuild, the embeddings cargo feature flag, and the semantic_compat shim are all removed from the normalize binary. The normalize-semantic crate remains published on crates.io for standalone use; symbol search is being redesigned around discrete tags for a future release. Release builds simplify to a single cargo build step (musl no longer needs a no-default-features carve-out).
Added
- First-run grammar install. The first time a user runs any normalize subcommand other than grammars, --help, or --version, normalize checks whether tree-sitter grammars are installed in the user's config directory (~/.config/normalize/grammars/). If not, and the session is non-interactive (piped or driven by an agent), grammars are downloaded and installed automatically with a stderr notice. In an interactive session the check is deferred to normalize init, which prompts implicitly by installing on first invocation. Subsequent invocations short-circuit on a .installed-version stamp file, so the check has zero cost after first use. Users who already have a populated grammar directory (e.g. from cargo xtask build-grammars or NORMALIZE_GRAMMAR_PATH) are detected and stamped without a download.
- normalize structure test-fixtures — language extraction fixture runner. Discovers <lang>/<case>/input.<ext> + expected.json pairs under crates/normalize-languages/tests/fixtures/, extracts symbols/imports/calls via SymbolParser, and diffs against expected JSON. Flags: --lang <lang> (filter to one language), --fixture-dir <dir> (custom fixture root), --update (write actual output as expected — bootstrap mode). Schema: {exhaustive, symbols: [{name, kind, line}], imports: [{module, name, line}], calls: [{callee, line}]}. All fields optional; subset matching by default; "exhaustive": true for exhaustive checking.
- normalize rules test-fixtures — fixture-based test format for native/syntax rules. Discovers test cases with input source + expected diagnostics under .normalize/rule-tests/ (configurable via --fixture-dir) and runs them through the engine with subset or exhaustive matching.
- normalize view chunk <file> --chunk N and --around <pattern> — large-file navigation for agents. --chunk N divides a file into fixed-size chunks (default 100 lines, configurable via --chunk-size) and shows chunk N (1-indexed). --around <pattern> finds the first regex/substring match and shows ±50 lines of context (configurable via --context-lines). Multiple matches: --match-index I to navigate. Returns ChunkedViewReport with file, chunk, total_chunks, line_start, line_end, content, and (for --around) match_line, total_matches, match_index. Works with --json/--jq/--jsonl.
- normalize view import-path <from> <to> — find the shortest import chain between two files via BFS over the resolved import graph (requires normalize structure rebuild). --all returns all simple paths up to --limit (default 5); --reverse finds paths from <to> to <from>.
Emits file → file → file chains or "No import path found between A and B" when unreachable.
- normalize edit add-parameter — add a parameter to a function signature and update all call sites. Usage: normalize edit add-parameter <file> <function> --param <name> --default <value> [--type <type>] [--position <N>] [--dry-run]. Uses tree-sitter to locate the function and argument lists; finds callers via the facts index (falls back gracefully when unavailable). Supports Rust, TypeScript/JavaScript, Python.
- normalize edit inline-function <file> <line>:<col> — inline a single-use function at its call site within the same file. Locates the function definition (or resolves the name from a call-site position), substitutes arguments for parameters using whole-word replacement, strips the return keyword when the call is in expression position, replaces the call with the inlined body, and removes the now-dead definition. Supports JS/TS function declarations and arrow const bindings, Python def, and Rust fn. Conservative: aborts if the function has multiple return statements or if the argument count doesn't match. Pass --force to inline a function with multiple call sites (only the first is inlined).
- normalize edit introduce-variable <file> <range> <name> — extract an expression at the given line:col-line:col range into a named binding inserted before the containing statement. Supports Rust (let), JS/TS (const), Python (bare assignment).
- normalize edit inline-variable — inverse of introduce-variable: replace all uses of a variable with its initializer and delete the binding.
- Post-edit index invalidation for refactoring commands. After any refactoring edit (rename, introduce-variable, inline-variable, add-parameter, move, inline-function) writes files to disk, the command immediately notifies the daemon via the new FilesChanged request. The daemon broadcasts FileChanged events for each path and triggers an incremental index refresh — no need to wait for inotify or run normalize structure rebuild manually. Non-fatal: silently skipped when the daemon is not running.
- normalize sync — copy a project (and its AI agent session metadata) to a destination for portability. Excludes target/, node_modules/, .git/objects/, .normalize/findings-cache.sqlite by default. After copying, rewrites absolute paths in the index DB so the copy works from its new location. Supports --dry-run, --verbose, --all (sync all known projects), --active N (only projects with activity in the last N days), --repo <glob>, --exclude <glob>. Subsequent runs are incremental via a manifest of previously synced files.
- normalize sessions parallelization [session-id] — finds turns with sequential same-type tool calls that could be parallelized (e.g. Read(foo.rs) → Read(bar.rs)). --threshold N controls minimum group size (default: 2). Works per-session or aggregated across filtered sessions.
- normalize sessions heatmap [session-id] — per-file read/write counts across a session. Classifies files as hot (>5 writes), read_only (reads with 0 writes — potential test gap), or normal. Sorted by write count; use --top N to limit rows.
- normalize sessions cost [session-id] — per-turn token cost breakdown using model-specific Anthropic pricing. Shows input/output/cache-read/cache-write tokens and estimated USD per turn. Summary includes total cost, cost without cache, cache savings (USD), and cache efficiency %.
- Composable sessions messages filters and review state. sessions messages now accepts --has-tool, --errors-only, --exclude-interrupted, --turn-range, --min-chars, --max-chars. New sessions mark / sessions unmark subcommands add session IDs to, or remove them from, .normalize/sessions-reviewed; sessions list accepts --reviewed / --unreviewed to filter accordingly.
- high-fan-out and high-fan-in native rules — coupling smell detection. high-fan-out flags files that import from more than threshold (default: 20) distinct resolved modules; high-fan-in flags files imported by more than threshold distinct files. Both are default disabled, require normalize structure rebuild, and are configurable via [rules.rule."high-fan-out"] threshold = N / [rules.rule."high-fan-in"] threshold = N. Tagged architecture and coupling.
- boundary-violations native rule — configurable directory-level import boundary enforcement. Declare pairs like "services/ cannot import cli/" under [rules.rule."boundary-violations"] boundaries = [...]; the rule queries the structural index's imports table and reports each resolved import that crosses a boundary. Default disabled; requires normalize structure rebuild. Uses glob matching with services/ treated as services/**.
- dead-parameter native rule — flags function parameters that are never referenced in the function body. Uses normalize-scope's ScopeEngine with @local.definition.parameter captures in locals.scm. Underscore-prefixed names (_x) are excluded. Supported: Rust, Python, JavaScript, TypeScript, TSX, Go, Java, C, C++, C#. Default disabled.
- Re-export tracing in import resolution.
pub use path::Item in Rust and export { X } from './y' in TypeScript/JavaScript are now extracted as re-exports (is_reexport = 1 in the imports table). After resolve_all_imports(), a new trace_reexports() pass follows re-export chains (up to 10 hops, with cycle detection) so imports in file A that land on an intermediate re-exporter B are updated to point to B's source file C. This improves call graph accuracy and find_callers/find_callees results across module boundaries. Schema bumped v11→v12.
- #match?/#eq? predicate evaluation for tree-sitter queries. normalize_languages::satisfies_predicates now evaluates the standard predicates (#match?, #not-match?, #eq?, #not-eq?) so .scm query authors can filter captures by text content or equality. Unknown predicates continue to pass (forward-compatible).
- normalize-surface-syntax pattern matching / destructuring as first-class IR nodes — Pat enum (Ident, Object(Vec<PatField>), Array(Vec<Option<Pat>>, Option<String>), Rest) and PatField { key, pat, default } added to the IR; Stmt::Destructure { pat, value, mutable } replaces the old lowering of const { a, b } = obj to multiple Stmt::Let bindings. TypeScript reader produces Stmt::Destructure with full structural fidelity (shorthand, renamed, nested, defaults, object rest, array holes, array rest); TypeScript writer emits const { a, b: c } = obj and const [x, y, ...rest] = arr. Python reader maps a, b = f(); [x, y] = arr; and (a, *rest) = items. Lua writer lowers to local a, b = table.unpack(expr).
- normalize-surface-syntax type annotations and template literals. Param struct replaces String for function parameters and carries type_annotation: Option<String>; Function gains return_type; Stmt::Let gains type_annotation. TypeScript and Python readers populate these fields; writers emit them. Expr::TemplateLiteral(Vec<TemplatePart>) added for backtick strings; TypeScript writer emits backtick syntax, Python writer emits f-strings, Lua writer lowers to .. concatenation.
- normalize-surface-syntax import/export and class definitions in IR.
Imports and exports are now first-class IR nodes (previously elided). Class definitions carry methods, fields, and inheritance.
- normalize-surface-syntax Lua reader improvements — obj:method(args) desugars to obj.method(obj, args) with implicit self. Table constructor parsing fixes: ["string"] = value computed string keys properly extract the string value; multi-variable generic for (for k, v in pairs(t)) preserves all loop variables; numeric for reads the optional step field. Fixed elseif chaining bug. Object key emission uses idiomatic bare identifier syntax for valid Lua identifiers. String escaping handles null bytes.
- normalize-surface-syntax JavaScript reader and TypeScript reader gap fixes. A dedicated JavaScript reader joins the existing TypeScript reader; remaining gaps in the TypeScript reader (comment preservation, additional grammar nodes) are filled.
- normalize-typegen IR improvements — Field gains nullable: bool, default: Option<DefaultValue>, and constraints: Option<FieldConstraints> (min/max, min/max_length, pattern, format). Schema::validate() checks well-formedness: it flags invalid identifiers, duplicate type/field names, unresolved Ref targets, and circular references (detected via DFS).
- normalize-typegen JSON Schema, GraphQL SDL, and Protobuf output backends (backend-jsonschema, backend-graphql, backend-proto features). JSON Schema draft 2020-12 with $defs/$ref, anyOf/oneOf, required, additionalProperties: false. GraphQL SDL: structs → type/input, string enums → UPPER_CASE variants, tagged unions → union. Protobuf proto3: structs → message, string enums with mandatory _UNSPECIFIED = 0, tagged unions → oneof, arrays → repeated, optional fields → optional keyword.
- normalize-typegen Protobuf and GraphQL SDL input parsers — round-trip support for both formats; combined with the writers above, schemas can flow between any pair of supported representations.
- normalize-typegen --split and --dry-run CLI; IR source locations. Schemas can be emitted as one file per type via --split.
IR carries source locations for error reporting.
- Daemon-cached diagnostics for all engines. The daemon caches syntax, fact, and native rule diagnostics and serves them instantly on normalize rules run. Cache is primed eagerly on file changes (incremental for syntax/fact, full re-run for native) and lazily on first request. Falls back to local evaluation transparently when the daemon is unavailable.
- Daemon push of diagnostic deltas via binary subscribe. New Event::DiagnosticsUpdated { root, updates } variant carries per-file deltas (only files whose issues changed since the last refresh; empty issues Vec = file became clean). Subscribe connections opened with the 0x01 magic byte stream events as length-prefixed rkyv binary frames; the JSON-line subscribe path is unchanged for backward compatibility. New DaemonClient::watch_events_binary decodes the binary frames. Eliminates the LSP poll-after-IndexRefreshed pattern. Wire-schema change: Event path fields are now String (previously PathBuf) so the enum is rkyv-serializable.
- Per-file diagnostic storage in the daemon + JSON mirror. Schema bump v9→v10 adds a daemon_diagnostics_per_file table (path PRIMARY KEY, rkyv Vec<Issue> blob) populated on every prime/refresh. Per-file pulls (e.g. normalize rules run path/to/file.rs) hit this table directly via a new filter_files field on RunRules instead of fetching and filtering the whole "all" blob. The daemon also writes .normalize/diagnostics.json atomically on every prime/refresh as a canonical-state artifact for ephemeral consumers.
- Daemon live-reloads
.normalize/config.toml and .normalize/rules/** on change. Previously the daemon's notify watcher saw config edits but routed them nowhere, so cached RunRules results stayed under the old config until either a source file changed or the daemon was restarted. The dispatch loop now classifies config and rule-definition edits as a fourth route; the handler clears every cached diagnostic blob and triggers a reprime against the freshly-loaded NormalizeConfig. [daemon] startup keys (enabled, auto_start) still require a restart by design.
- Cross-daemon-restart cache validity (config_hash gate). The SQLite-backed daemon diagnostic blobs (daemon_diagnostics, daemon_diagnostics_per_file) carry the hash of the inputs that produced them (binary version + .normalize/config.toml + .normalize/rules/**). On load, a hash mismatch is treated as a cache miss; the daemon reprimes under the current config. Schema bump v10→v11.
- Tier 1 daemon config reload — filter-only changes apply at serve, no reprime. When .normalize/config.toml edits affect only severity, allow-lists, or enabled = false, the daemon updates its cached config snapshot, sets a serve_filter_pending flag, and lets the next RunRules re-filter the cached findings in place. No blob clear, no rule re-evaluation.
- Tier 2 daemon config reload — per-rule surgical re-eval. When config edits affect only specific rules (a rule is newly enabled, or its extra threshold/field changed), the daemon re-runs only those rules through the syntax, fact, and native engines using the existing filter_ids path, splices updated findings into the per-engine blobs, rebuilds the "all" blob and per-file rows, and broadcasts a DiagnosticsUpdated event. ConfigDiff::rules_to_rerun names the affected rule IDs; surgical_rerun_rules implements the splice. [walk] exclude changes still trigger a full reprime.
- ConfigDiff for surgical daemon cache invalidation. normalize-rules-config exports ConfigDiff::compute(old, new), which classifies a config change as filter-only, per-rule re-run, or full reprime. Consumed by the daemon tiers above.
- Daemon
.scm file hash tracking. Custom rule edits (.normalize/rules/**/*.scm) participate in the Tier 2 surgical re-eval path via content-hash tracking, not just mtime.
- LSP native rule diagnostics. The LSP server publishes diagnostics from native rules (missing-summary, stale-summary, check-refs, etc.) alongside syntax and fact rule diagnostics. Native rules run debounced and workspace-wide. They re-trigger on .git/index changes so git add events immediately refresh stale-summary results.
- No git binary required. All git operations now use gix (pure-Rust gitoxide): git blame (ownership, provenance, view history), git status --porcelain, path-filtered commit counts, git rev-list --count, budget metrics diff, ratchet ref-based check/measure. A git binary in $PATH is no longer a runtime dependency.
- Configurable walker exclusions — new [walk] section in .normalize/config.toml controls directory walking. ignore_files configures which gitignore-format files are respected (default: [".gitignore"]; [] to disable). exclude accepts gitignore-style globs (default: [".git", ".claude/worktrees/"]). Threaded through native rules, syntax rules, the unified rules runner, the daemon, and the LSP server.
- Co-change edge index —
normalize structure rebuild populates a co_change_edges table in the SQLite index with file pairs that frequently change together (co-change count ≥ 2, commits touching >50 files skipped as noise, per-file fanout capped at top 20 partners). Incremental: only new commits since the last rebuild are processed. normalize analyze coupling-clusters queries this table instead of re-walking git history; falls back transparently to the git walk when the table is empty. Rebuild output now includes a co_change_edges count.
- stale-doc native rule — flags documentation files that are likely stale because strongly co-changed code files have been updated more recently. Queries the co_change_edges index for each doc file (**/*.md, **/*.rst, docs/**/*), finds code files it historically changes with, and flags the doc if any partner was committed more recently. SUMMARY.md is excluded (covered by stale-summary). Configurable via [rules.rule."stale-doc"] with min_co_changes (default 3), min_lag_days (default 0), and doc_patterns. Default disabled; requires normalize structure rebuild.
- missing-test fact rule — flags public functions that are never called from a test function (a function with a test attribute such as #[test], @test, @Test, or @pytest.mark). Default disabled. Entry-point and module-boundary files excluded via the default allow list.
- stale-mock fact rule — flags mock/stub functions (identified by attributes such as @Mock, @patch, @stub, mock, stub, fake) that call a callee which no longer exists as a symbol in the index. Catches mocks that were not updated after a rename or deletion. Default disabled.
- normalize edit move — moves a symbol's definition to another file and rewrites import statements in every file that imported it from the old location.
Per-language module-path derivation is best-effort: Python, Go, and JavaScript/TypeScript imports are rewritten when a new path can be derived; Rust and unsupported cases emit warnings and skip the import site rather than fabricating wrong paths. --reexport (Python only) leaves a re-export stub at the source location. Supports --dry-run and shadow-history --message. Leading decorations (doc comments, attributes, decorators, annotations, pragmas) preceding the symbol are included in the move, classified by tree-sitter node.kind() rather than text patterns.
- normalize config validate deep validation. Runs four phases (TOML syntax, JSON Schema compliance, serde deserialization, rules config parsing) on both project and global config files. Reports errors with file path, line/column when available, and validation phase. Exits non-zero on errors for CI/hook use.
- normalize grep <path> — optional positional path argument scopes the search tree (consistent with view, edit, rank). The existing --root flag is preserved for backward compatibility; path takes precedence when both are given.
- normalize rules run --files — accept an explicit list of file paths, bypassing the file tree walker entirely. Critical for hook-grade latency where the caller already knows which files changed. Composes with --only/--exclude for further filtering.
- normalize rules run --only / --exclude — glob pattern filtering. --only "*.rs" restricts to Rust files; --exclude "tests/" skips test directories. Applied pre-walk for syntax rules and advisory native rules; post-walk for fact rules.
- normalize structure rebuild --only / --exclude — glob pattern filtering for which files get indexed. Files not matching the filter are removed from the index after the walk.
- normalize analyze architecture --limit — caps the number of cross_imports entries in the output (default 20, --limit 0 disables). Reduces default JSON response from ~196KB to ~10KB.
- Command aliases —
search/find → grep, lint → rules run, check → ci, index → structure rebuild, refactor → edit. Users coming from other tools will find that familiar names work transparently.
- Tiered help output — normalize --help groups commands into four sections (Core, Analysis, Utilities, Infrastructure) instead of a flat alphabetical list. Core commands (view, grep, edit, rules, structure, init) appear first.
- Daemon-cache timing diagnostic — normalize rules run prints [timings] daemon-cache: ... to stderr when diagnostics are served from the daemon's pre-warmed cache, making it visible that the fast path was taken.
- NORMALIZE_DAEMON_CONFIG_DIR test override — when set, this env var redirects the daemon's daemon.sock, daemon.lock, and daemon-spawn.lock to the named directory. Lets integration tests spawn isolated daemons without contending with the user's running instance. Production behavior is unchanged when unset.
- normalize trend — top-level subcommand for time-series health metrics. Replaces normalize analyze complexity-trend, analyze length-trend, analyze density-trend, analyze test-ratio-trend, and analyze trend. New names: normalize trend complexity, normalize trend length, normalize trend density, normalize trend test-ratio, normalize trend multi.
- normalize package tree --depth N — caps the dependency tree at depth N (0 = roots only). Limits both text and JSON output. Default: unlimited.
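
The shortest-chain search described for normalize view import-path above can be pictured as an ordinary breadth-first search. The sketch below is illustrative only: the function names and the in-memory adjacency dict are assumptions made for the example, not the tool's index schema or API. Only the two output formats are taken from the entry itself.

```python
from collections import deque


def shortest_import_path(graph, src, dst):
    """Return the shortest import chain from src to dst, or None.

    graph maps a file to the list of files it imports. It is a
    hypothetical in-memory stand-in for the resolved import graph.
    """
    if src == dst:
        return [src]
    parents = {src: None}  # doubles as the visited set
    queue = deque([src])
    while queue:
        current = queue.popleft()
        for nxt in graph.get(current, []):
            if nxt in parents:
                continue  # already reached by an equal or shorter chain
            parents[nxt] = current
            if nxt == dst:
                # Walk parent links back to src, then reverse.
                chain = [dst]
                while parents[chain[-1]] is not None:
                    chain.append(parents[chain[-1]])
                return chain[::-1]
            queue.append(nxt)
    return None


def format_chain(graph, src, dst):
    """Render a chain in the 'file → file → file' style, or the miss message."""
    chain = shortest_import_path(graph, src, dst)
    if chain is None:
        return f"No import path found between {src} and {dst}"
    return " → ".join(chain)
```

With graph = {"a.py": ["b.py", "c.py"], "b.py": ["d.py"], "c.py": ["d.py"], "d.py": []}, format_chain(graph, "a.py", "d.py") returns "a.py → b.py → d.py". Because visited files are never re-queued, import cycles terminate naturally.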
Changed
- All tree-sitter grammars now load uniformly via dlopen(). Previously normalize-surface-syntax and normalize-typegen statically linked four arborium-* grammar crates (TypeScript, JavaScript, Lua, Python — and GraphQL in typegen), while every other grammar loaded dynamically through the shared GrammarLoader. The static linking is gone: those readers now request grammars from the process-wide grammar_loader() singleton like the rest of the codebase. Net effect on the binary: the four (five for typegen) compiled-in parsers are removed; the runtime requirement that the relevant .so files live in a search path was already true for all other languages and is unchanged. Cargo features read-typescript, read-javascript, read-lua, read-python, input-typescript, and input-graphql continue to work — they now gate the reader source code, not a grammar dependency.
- musl release build is now fully self-contained — no system runtime dependencies. Previously the
x86_64-unknown-linux-musl artifact was a static-pie binary (couldn't dlopen() grammar .so files), then briefly a dynamic binary that required the system to provide ld-musl-x86_64.so.1 and libc.musl-x86_64.so.1. The release tarball now bundles its own musl loader and libc alongside a tiny POSIX-sh wrapper script that invokes the bundled loader explicitly (exec "$DIR/runtime/ld-musl-x86_64.so.1" --library-path "$DIR/runtime" "$DIR/runtime/normalize.elf" "$@"). The artifact runs on any Linux x86_64 system — Alpine, distroless, NixOS without pkgs.musl, glibc-only distros — with no installed musl required. install.sh extracts the tarball under ~/.local/share/normalize/ and symlinks the wrapper into ~/.local/bin/. The musl artifact is now also the safe default on systems without glibc (and the only choice on NixOS).
- rusqlite → libsql migration across the workspace.
normalize-facts (CA cache), normalize-native-rules (findings cache), normalize-syntax-rules (findings cache), and the normalize sync path-rewrite all moved off rusqlite onto libsql. Resolves a sqlite link conflict that previously forced workspace-wide symbol coordination. No user-visible behavior change; cache files at ~/.config/normalize/ca-cache.sqlite and .normalize/findings-cache.sqlite are still SQLite and remain compatible.
- *-allow files removed in favor of config.toml entries. The 7 legacy .normalize/*-allow files (large-files-allow, hotspots-allow, duplicate-blocks-allow, duplicate-functions-allow, duplicate-types-allow, similar-blocks-allow, similar-functions-allow) are no longer loaded. Their entries now live directly in config.toml: large-files-allow → [rules.rule."long-file"] allow = [...]; hotspots-allow → [analyze] hotspots_exclude; duplicate/similar command allowlists → [analyze.<subcommand>] allow = [...]. Migration: if you have custom entries in any *-allow file, move them to the appropriate config.toml section.
- Per-rule config moved to
[rules.rule."<id>"]. Previously [rules] hosted both engine-wide bare keys (e.g. global-allow, sarif-tools) and per-rule sub-tables (e.g. [rules."rust/dbg-macro"]) — a TOML namespace collision waiting to happen. Per-rule overrides are now nested under a dedicated rule sub-table; the bare-key namespace under [rules] is reserved for engine-wide configuration. The legacy layout is still parsed for one release with a stderr deprecation warning. Migration: rename every [rules."<id>"] to [rules.rule."<id>"]. Engine-wide keys stay where they are.
- [walk] exclude now accepts gitignore-style glob patterns. Previously each entry was matched only against directory entry basenames. Patterns are now compiled via a gitignore matcher anchored at the project root, so any pattern that works in .gitignore works here. Existing configs (e.g. [".git", "worktrees"]) keep working unchanged because gitignore patterns without slashes still match at any depth.
- normalize init --setup detects scratch directories. When .claude/worktrees/ is present, the bootstrap step adds it to [walk] exclude by default. The Default::default() for NormalizeConfig is now genuinely empty; the opinionated bootstrap config lives separately and is only applied during init.
- normalize rules run --only / --exclude pre-walk scoping. Glob patterns are applied before file parsing and walking, not just after. The cost of a single-file --only run is now proportional to the matched file count, not the full tree.
- normalize rules run routes through the daemon when one is running. If a daemon started via normalize daemon start is active, normalize rules run (and any invocation that hits fact rules) sends the request to the daemon via Unix socket and receives pre-warmed Datalog evaluation results instead of cold-evaluating from scratch (~45 seconds on large codebases). Falls back transparently when no daemon is running.
- normalize structure rebuild defaults to incremental mode (mtime-based). Only files changed since the last build are re-indexed. Pass --full to force a complete rebuild.
When no files have changed, the command prints "Index up to date". The --json output includes an incremental: true field when incremental mode was used.
- normalize view --dir-context accepts an integer N instead of a boolean flag. N selects context files using Python list[:N] semantics on the target→root ordered list: 1 = target dir only, 2 = target + parent, -1 = all ancestors, 0 = none.
- normalize view --dir-context JSON output includes a dir_context field in ViewReport containing the merged context content. Previously the context was only prepended to text output; agents using --json received no context.
- normalize rules tags always populates the rules array in JSON output. The --show-rules flag has been removed.
- normalize syntax ast default depth changed from unlimited (-1) to 5. Pass --depth -1 to restore the old unlimited behavior.
- normalize analyze docs --json by_language field serializes as named objects {"documented": N, "total": N} instead of positional arrays.
- normalize grammars list --json returns objects with name and path fields instead of bare strings. Text output is unchanged.
- normalize analyze architecture compact output no longer truncates hub and symbol paths with opaque worktree-hash prefixes. Paths are shown as clean workspace-relative paths.
- normalize context compact output includes <!-- source --> file path comments and --- separators between blocks when multiple context files are merged.
Single-block output is unchanged.
- normalize ci / normalize rules run compact output: the (N files) header now reads (N files checked) to clarify it is the number of files scanned, not files with issues.
- normalize grep: consecutive matches within the same symbol are grouped under a single (SymbolName L48-61): header rather than repeating the symbol tag on every line.
- normalize view <file>:N-M header no longer duplicates the line range (was file.rs:10-20:10-20, now file.rs:10-20).
- normalize analyze length → normalize rank length; normalize analyze test-gaps → normalize rank test-gaps — ranking commands moved under rank.
- normalize analyze node-types removed — duplicate of normalize syntax node-types. Use the latter.
- large-file rule renamed to long-file — consistency with long-function. Update [rules.rule."large-file"] to [rules.rule."long-file"] in your config.
- long-file, high-complexity, long-function available as native rules. Threshold-based health findings from analyze health are now usable in normalize rules run --type native. Default disabled (advisory); enable via [rules.rule."long-file"] enabled = true or --rule long-file. Defaults, in the order listed: 500 lines, complexity 20, 100 lines. Thresholds configurable via the threshold key.
- FileRule trait for native file-based rules — long-function, high-complexity, and long-file implement a FileRule trait providing automatic SQLite caching and parallel execution. New file-based rules get caching for free by implementing check_file() and to_diagnostics().
- validate-calls-scm moved out of SARIF tools — the internal .calls.scm capture-name validation is no longer a [[rules.sarif-tools]] entry. It is now a direct step in scripts/pre-commit using a single grep invocation (~5 ms vs ~762 ms).
- Mtime-based cache for SARIF tools —
[[rules.sarif-tools]] entries support an optional watch field (list of glob patterns). When set, normalize rules run caches the tool's output in the SQLite findings cache keyed by the max mtime of all matching files, skipping the tool on warm runs where nothing changed.
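
The watch-based caching for SARIF tools above keys on the newest modification time among the matching files. A minimal sketch of that idea follows, assuming a plain dict as the cache and hypothetical helper names (max_mtime_ns, run_cached). The real cache lives in the SQLite findings cache, and its key layout is not described in these notes.

```python
from pathlib import Path


def max_mtime_ns(root, patterns):
    """Newest mtime (in ns) among files under root matching any glob pattern.

    Returns 0 when nothing matches.
    """
    newest = 0
    for pattern in patterns:
        for path in Path(root).glob(pattern):
            if path.is_file():
                newest = max(newest, path.stat().st_mtime_ns)
    return newest


def run_cached(cache, tool_name, root, patterns, run_tool):
    """Call run_tool() only when the watched files changed since the last run.

    cache is a plain dict standing in for the findings cache (an assumption
    made for this sketch).
    """
    key = max_mtime_ns(root, patterns)
    entry = cache.get(tool_name)
    if entry is not None and entry[0] == key:
        return entry[1]  # warm run: reuse stored output, skip the tool
    output = run_tool()
    cache[tool_name] = (key, output)
    return output
```

One property of a max-mtime key worth keeping in mind: deleting a watched file that was not the newest leaves the key unchanged. Whether the tool accounts for that separately is not stated here.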
Performance
- Persistent symbol cache for single-file commands. Extractor checks the CA cache (~/.config/normalize/ca-cache.sqlite) before running tree-sitter on a file. On cache hit (same blake3 content hash, grammar, and include_private setting), the stored Vec<Symbol> is returned immediately — no parse, no query execution. Cross-file resolver paths are excluded since their results depend on other files.
- Persistent tree-sitter query cache for facts extraction. Tree-sitter query results are cached in SQLite keyed by content hash and grammar version; normalize structure rebuild reuses extraction output for unchanged files instead of re-running queries.
- normalize context v2 daemon caching. Context queries hit the daemon cache for near-zero latency on warm runs.
- rkyv binary IPC for daemon rules cache. normalize rules run communicates with the daemon using a zero-copy binary protocol instead of three JSON round-trips: magic byte (0x01), 5-byte binary frame header ([type_byte][4-byte LE len]), rkyv-serialized payload. The daemon pre-builds an "all" blob on every refresh so the common unfiltered case is a direct blob read. Schema bump v8→v9. Round-trip: ~6–8 ms vs ~82 ms previously.
- Parallel fact rule evaluation. run_rules_batch evaluates all enabled Datalog rules in parallel using rayon. On a typical 8-core machine with 7 enabled rules, normalize rules run --type fact drops from ~5.5 s to ~2.5 s wall time. JIT is not used in the parallel path (the ascent-interpreter JIT internals are not thread-safe under concurrent engine initialization); the sequential JIT path is still used for the incremental/daemon-cached path via run_rule_with_cache.
- Syntax and native rules findings cache: WAL mode + single transaction. The per-file SQLite findings cache used by syntax rules and native rules opens with
PRAGMA journal_mode=WAL; PRAGMA synchronous=NORMAL;and wraps cache-write operations in a singleBEGIN/COMMITtransaction per run. Cold run: ~1.6 s (vs ~19 s before); warm run: ~0.84 s total. - SQLite findings cache for native and syntax rules. Warm runs of
`long-file`, `high-complexity`, and `long-function` skip unchanged files entirely using a SQLite-backed per-file cache stored at `.normalize/findings-cache.sqlite`. The syntax rules engine migrates from the JSON `syntax-cache.json` to the same SQLite store. Cache keys include `(path, mtime_nanos, config_hash, engine)`; changing the threshold or rule set invalidates only the affected entries.
- Batched uncommitted-changes check for stale/missing-summary.
`stale-summary` and `missing-summary` open the git repository once and collect all changed paths into a `HashSet` before the directory loop, replacing hundreds of per-directory gix calls. Warm pre-commit runs drop from ~2.2 s per rule to ~170 ms.
- Incremental git walk for stale/missing-summary cache. The
`stale-summary` and `missing-summary` rules walk only the commits since the last cached HEAD instead of re-walking all of git history on every pre-commit run. Warm runs after a single commit now take milliseconds.
- Parallel effective-files walk for advisory native rules. When
`--only`/`--exclude` filters are active, the file walk used to build the effective file list runs concurrently with the first group of native rules instead of sequentially after them.
- Daemon file watcher respects
`[walk] exclude`. Previously the file watcher ignored `[walk] exclude` and registered tens of thousands of inotify watches under directories like `.claude/worktrees/`. The watcher now consults the same gitignore-style matcher as the walker, eliminating ~50k spurious inotify watches on typical projects.
- Native rule timing diagnostics.
`RUST_LOG=debug normalize rules run --type native` logs per-rule and total elapsed time to stderr via `tracing::debug!`.
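The frame layout described in the rkyv IPC entry above (magic byte `0x01`, then `[type_byte][4-byte LE len]`, then the payload) can be sketched with the standard library alone. This is only an illustration of the byte layout, not the daemon's actual code: the function names are hypothetical, the payload is treated as opaque bytes rather than an rkyv archive, and placing the magic byte in front of every message is an assumption the notes do not confirm.

```rust
/// Magic byte quoted in the release notes; everything else here is illustrative.
const MAGIC: u8 = 0x01;
/// Header size: one type byte plus a 4-byte little-endian payload length.
const HEADER_LEN: usize = 5;

/// Encode one message as: magic byte, [type_byte][4-byte LE len], payload.
/// (Assumption: the magic byte prefixes each message.)
fn encode_frame(type_byte: u8, payload: &[u8]) -> Vec<u8> {
    let len = u32::try_from(payload.len()).expect("payload exceeds u32::MAX bytes");
    let mut out = Vec::with_capacity(1 + HEADER_LEN + payload.len());
    out.push(MAGIC);
    out.push(type_byte);
    out.extend_from_slice(&len.to_le_bytes());
    out.extend_from_slice(payload);
    out
}

/// Decode one message, returning the type byte and a borrowed payload slice.
/// Returns `None` on a wrong magic byte or a truncated buffer.
fn decode_frame(buf: &[u8]) -> Option<(u8, &[u8])> {
    let (&magic, rest) = buf.split_first()?;
    if magic != MAGIC || rest.len() < HEADER_LEN {
        return None;
    }
    let type_byte = rest[0];
    let len = u32::from_le_bytes([rest[1], rest[2], rest[3], rest[4]]) as usize;
    let end = HEADER_LEN.checked_add(len)?;
    let payload = rest.get(HEADER_LEN..end)?;
    Some((type_byte, payload))
}

fn main() {
    // 0x02 is a made-up message type for the demo.
    let frame = encode_frame(0x02, b"hello");
    let (ty, payload) = decode_frame(&frame).expect("valid frame");
    assert_eq!(ty, 0x02);
    assert_eq!(payload, &b"hello"[..]);
}
```

Because the payload is returned as a borrowed slice, a reader can hand it to a zero-copy deserializer without first copying it out of the receive buffer.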
Fixed
- musl release tarball now ships musl-linked grammars. The release workflow previously built grammars once on the glibc host runner and bundled the same
`.so` files into both the gnu and musl tarballs. The musl-linked `normalize` binary uses the musl loader, which cannot `dlopen()` glibc-linked shared objects, so any grammar load on musl would fail at runtime. The workflow now invokes `cargo xtask build-grammars --target x86_64-unknown-linux-musl --cc musl-gcc` for the musl matrix entry and packs the resulting `target/x86_64-unknown-linux-musl/grammars/*.so` into the musl grammar artifact. A `readelf -d` check in CI fails the build if any grammar still depends on `libc.so.6`. `cargo xtask build-grammars` gained `--target <triple>` and `--cc <compiler>` flags; both default to host behavior so existing invocations are unchanged.
- libsql
`block_on` shim no longer deadlocks/panics when called from a tokio worker. The cache layers in `normalize-facts`, `normalize-native-rules`, and `normalize-syntax-rules` cache an owned current-thread runtime when constructed from sync code (the common case). Previously the helper used that cached runtime unconditionally, which caused `Cannot start a runtime from within a runtime` panics whenever a `#[tokio::test]` (or any other tokio task) hit a cache that had been initialized earlier from a sync test in the same process. The helper now inspects the call site's tokio context first and falls back to the cached runtime only when it is not already inside one, so the public API stays synchronous while remaining safe to call from any context.
- `normalize update` no longer recomputes SHA-256 in O(n²). The hand-rolled hash implementation in the self-update path was quadratic and could take minutes on macOS releases (~45 MB binary). Replaced with the `sha2` crate; verification now takes milliseconds.
- `DaemonClient` socket path captured at construction, not at every method call. `DaemonClient::new()` previously re-read `NORMALIZE_DAEMON_CONFIG_DIR` on every method invocation, which raced under any concurrent use. `DaemonClient::new()` now resolves the env var once and stores the resolved `PathBuf`. A new `DaemonClient::with_socket_path(PathBuf)` constructor lets callers (tests, LSP servers talking to multiple workspaces, library embedders) target a specific socket without touching env vars.
- Daemon refresh events suppressed during first 60 s after
`add_root`. `FileIndex::incremental_refresh()` short-circuited via a `needs_refresh()` staleness gate (a cold-CLI optimization). The daemon's `refresh_root` called the gated variant, killing LSP push diagnostics during the first minute of a daemon session. Adds `FileIndex::incremental_refresh_force()`, which skips the gate; the daemon now uses it.
- Daemon
`refresh_root` deadlock on second incremental refresh. `refresh_root` held the per-root `FileIndex` mutex across the entire match arm, including the save helpers which all re-acquire the same mutex. The lock is now released after the last direct `idx` use and the save helpers re-acquire it as designed.
- Grammar load failure is a loud warning, not silent empty results. When
`normalize structure rebuild` encounters a file whose grammar `.so` is unavailable, it emits a `tracing::warn!` (once per grammar per run) and skips the file entirely rather than indexing it as having zero symbols. End-to-end fix: `SymbolParser::parse_file` returns `Option<Vec<FlatSymbol>>` (`None` = grammar unavailable); both the full and incremental paths skip files that return `None` and never write them to the CA cache or SQLite.
- JIT compilation re-enabled for Datalog rules.
`ascent-interpreter` upgraded to 0.2.0-alpha.1, which fixes the packed-tuple arity mismatch bug that caused aborts on non-trivial relations. JIT is active by default on x86_64; aarch64 continues to use interpreted evaluation.
- Daemon memory leak.
`WatchedRoot` no longer holds diagnostics or the reverse-dep graph in memory. After each refresh, issues are persisted to the `daemon_diagnostics` table in the SQLite index, then dropped from heap. The reverse-dep graph is derived transiently from the `imports` table on each refresh and discarded after use. Steady-state daemon RSS reduced from ~2.3 GB (after 10 days) to near-zero.
- `syntax-rules` walker uses gitignore-style exclude. Previously the syntax-rules walker matched `[walk] exclude` against filename basenames only, ignoring the gitignore semantics that the rest of the system uses. Now consistent across the codebase.
- Auto-build index for commands that need it. Commands that depend on the structural index (
`test-gaps`, `blame`, `coupling-clusters`) auto-build the index when it's empty instead of silently returning degraded results. `ensure_ready_or_warn` prints a hint to stderr when the index can't be built.
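The grammar-load fix above hinges on telling "grammar unavailable" (`None`) apart from "parsed, but no symbols" (`Some(vec![])`). A minimal sketch of that distinction, including the once-per-grammar warning, might look like the following. The types and functions here are simplified stand-ins written for illustration, not the real `SymbolParser` API, and warnings are collected into a `Vec` instead of going through `tracing`.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for an extracted symbol.
#[derive(Debug, Clone, PartialEq)]
struct FlatSymbol {
    name: String,
}

/// Illustrative parser: `None` means the grammar is unavailable,
/// `Some(vec![])` means the file parsed but contains no symbols.
fn parse_file(grammar: &str, source: &str, available: &HashSet<&str>) -> Option<Vec<FlatSymbol>> {
    if !available.contains(grammar) {
        return None;
    }
    // Toy "extraction": every whitespace-separated word is a symbol.
    Some(
        source
            .split_whitespace()
            .map(|w| FlatSymbol { name: w.to_string() })
            .collect(),
    )
}

/// Index `(path, grammar, source)` triples. Files whose grammar is missing
/// are skipped entirely, and each missing grammar is reported only once.
fn index_files(
    files: &[(&str, &str, &str)],
    available: &HashSet<&str>,
) -> (Vec<(String, Vec<FlatSymbol>)>, Vec<String>) {
    let mut indexed = Vec::new();
    let mut warned: HashSet<String> = HashSet::new();
    let mut warnings = Vec::new();
    for &(path, grammar, source) in files {
        match parse_file(grammar, source, available) {
            Some(symbols) => indexed.push((path.to_string(), symbols)),
            None => {
                // `insert` returns true only the first time a grammar is seen.
                if warned.insert(grammar.to_string()) {
                    warnings.push(format!("grammar `{grammar}` unavailable; skipping its files"));
                }
            }
        }
    }
    (indexed, warnings)
}

fn main() {
    let available = HashSet::from(["rust"]);
    let files = [
        ("a.rs", "rust", "foo bar"),
        ("b.zig", "zig", "x"),
        ("c.zig", "zig", "y"),
        ("empty.rs", "rust", ""),
    ];
    let (indexed, warnings) = index_files(&files, &available);
    // Two Rust files are indexed (one with zero symbols); both zig files are skipped.
    assert_eq!(indexed.len(), 2);
    assert_eq!(warnings.len(), 1);
}
```

Note that `empty.rs` is still indexed with an empty symbol list, while the two `.zig` files never reach the index at all, so a later run with the grammar installed does not see a stale "zero symbols" entry for them.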
Internal
- `RuleOverride` typed per-rule config. Rule-specific fields (`filenames`, `paths`, `threshold`) moved out of the flat `RuleOverride` struct into typed per-rule config structs. Common fields (`severity`, `enabled`, `allow`, `tags`) stay shared; rule-specific TOML keys land in `extra` via `#[serde(flatten)]` and are deserialized by each rule via `RuleOverride::rule_config::<T>()`.
- Refactoring engine (
`refactor/`): three-layer architecture for composable code transformations: semantic actions (query/mutation primitives), recipes, and a shared executor with dry-run/shadow support. Foundation for `move`, `add-parameter`, `inline-function`, `introduce-variable`, `inline-variable`.
- `normalize-refactor` crate: refactoring engine extracted from the main crate. Clean dependency boundary on `normalize-edit`, `normalize-facts`, `normalize-languages`, `normalize-shadow`.
- `normalize-syntax-rules` `fix` feature gate: `apply_fixes` and `expand_fix_template` gated behind `default = ["fix"]`. Read-only rules consumers can disable with `default-features = false`.
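The `RuleOverride` split described at the top of this section (shared fields on one struct, rule-specific keys collected into `extra` and decoded on demand) can be sketched without serde as follows. This is an assumption-laden illustration rather than the crate's code: `RuleConfig`, `from_extra`, and `LongFileConfig` are hypothetical names, `extra` is modeled as a plain string map, and the real implementation deserializes through serde instead of a hand-written trait.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the shared override: one common field plus
/// the leftover rule-specific keys (the role `extra` plays in the notes).
struct RuleOverride {
    enabled: bool,
    extra: HashMap<String, String>,
}

/// Hypothetical trait standing in for serde deserialization in this sketch.
trait RuleConfig: Sized {
    fn from_extra(extra: &HashMap<String, String>) -> Result<Self, String>;
}

impl RuleOverride {
    /// Decode the rule-specific keys into the typed config a rule asks for.
    fn rule_config<T: RuleConfig>(&self) -> Result<T, String> {
        T::from_extra(&self.extra)
    }
}

/// Hypothetical typed config for a rule with a `threshold` key.
#[derive(Debug, PartialEq)]
struct LongFileConfig {
    threshold: Option<usize>,
}

impl RuleConfig for LongFileConfig {
    fn from_extra(extra: &HashMap<String, String>) -> Result<Self, String> {
        // An absent key is fine (None); a present but malformed value is an error.
        let threshold = extra
            .get("threshold")
            .map(|raw| {
                raw.parse::<usize>()
                    .map_err(|e| format!("invalid threshold `{raw}`: {e}"))
            })
            .transpose()?;
        Ok(Self { threshold })
    }
}

fn main() {
    let mut extra = HashMap::new();
    extra.insert("threshold".to_string(), "300".to_string());
    let ov = RuleOverride { enabled: true, extra };
    assert!(ov.enabled);
    let cfg: LongFileConfig = ov.rule_config().expect("valid config");
    assert_eq!(cfg.threshold, Some(300));
}
```

The appeal of this shape is that the shared struct never needs to know which rules exist: each rule owns its config type and its own validation errors.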
Installation
`curl -fsSL https://rhi.zone/normalize/install.sh | sh`
`irm https://rhi.zone/normalize/install.ps1 | iex`
Manual download: pick the archive for your platform from the assets below and verify with `SHA256SUMS.txt`.