v0.2.23

github-actions released this 30 May 03:30
· 40 commits to main since this release

sdivi-rust v0.2.23 — Three new pattern categories and callee-text classification

This release lands the M29–M33 milestone batch, which extends the pattern
measurement surface from five categories to eight and promotes classification
from node-kind-only to node-kind + callee-text. The headline additions are the
data_access, logging, and class_hierarchy categories, a new
classify_hint API that inspects the source text of a call to decide its
category, and the native-pipeline switchover that puts classify_hint behind
Pipeline::snapshot.

snapshot_version stays at "1.0". The PatternCatalog JSON shape, the
pattern_metrics field shapes, and the DivergenceSummary structure are all
unchanged. What changes is the per-category instance distribution on the next
snapshot after upgrade; see "Impact on existing baselines" below.

Highlights

  • M29 — data_access category. call_expression (TS/JS/Go) and call
    (Python) nodes are now bucketed under data_access. TS/JS/Go already
    collected these as PatternHints; Python's adapter gains "call" in its
    PATTERN_KINDS so every Python call now emits a hint.
  • M30 — logging category (catalog-only at introduction). Added to the
    contract so foreign extractors can emit category: "logging" and round-trip
    through compute_pattern_metrics/compute_delta. category_for_node_kind
    deliberately never returns it — the relevant node kinds overlap with
    data_access and resource_management, and only the callee name
    distinguishes them.
  • M31 — class_hierarchy category. A node-kind-routed category covering
    class_declaration / class_definition / abstract_class_declaration /
    interface_declaration / impl_item across TypeScript, JavaScript, Python,
    Rust, and Java. Go is skipped — it has no class/interface AST shape — so the
    category exists in the catalog but produces zero Go hits.
  • M32 — classify_hint(hint, language) -> Vec<&'static str>. The
    callee-text-aware classifier. Per-language regex tables on data_access,
    logging, and async_patterns (matches_callee) plus an inverted
    resource_management::excludes_callee for Rust macro disambiguation. Exposed
    through sdivi-core and the @geoffgodwin/sdivi-wasm WASM surface alongside
    a new WASM-safe PatternHintInput { node_kind, text } struct. The native
    pipeline is intentionally left on category_for_node_kind in M32 so this
    milestone ships pure-additive (no snapshot diff).
  • M33 — native pipeline switchover. crates/sdivi-patterns/src/catalog.rs
    now classifies hints via classify_hint. logging becomes natively
    populated, data_access narrows to actual data-access callees, TS/JS Promise
    chains (.then()/.catch()/.finally()) route to async_patterns, and Rust
    tracing::*!/log::*!/println!-family macros land in logging instead of
    resource_management. The multi-category return is honoured (a hint can land
    in more than one bucket); v0's per-language regex tables are disjoint so in
    practice each hint lands in at most one.
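
To make the node-kind versus callee-text distinction concrete, here is a minimal sketch for TypeScript-style call nodes. It is not the sdivi-core implementation: the Hint struct, both function names, and the prefix/suffix checks (which stand in for the real per-language regex tables) are hypothetical, and the callee extraction is deliberately naive.

```rust
// Hypothetical sketch only; names and matching rules are assumptions,
// not the sdivi-core API. Plain string checks stand in for regex tables.

/// Mirrors the { node_kind, text } shape described for PatternHintInput.
struct Hint<'a> {
    node_kind: &'a str,
    text: &'a str,
}

/// Node-kind-only routing: every call node lands in data_access.
fn by_node_kind(node_kind: &str) -> Option<&'static str> {
    match node_kind {
        "call_expression" | "call" => Some("data_access"),
        "class_declaration" | "class_definition" | "interface_declaration" => {
            Some("class_hierarchy")
        }
        _ => None,
    }
}

/// Callee-text-aware routing: call nodes are bucketed by what is called.
/// Returns a Vec because a hint may belong to more than one category.
fn by_callee_text(hint: &Hint) -> Vec<&'static str> {
    if hint.node_kind != "call_expression" {
        // Non-call nodes keep their node-kind routing.
        return by_node_kind(hint.node_kind).into_iter().collect();
    }
    // Naive callee extraction: everything before the first '('.
    // Chained calls such as fetch(url).then(...) are not handled here.
    let callee = hint.text.split('(').next().unwrap_or("").trim();
    let mut out = Vec::new();
    if callee.starts_with("console.") || callee.starts_with("logger.") {
        out.push("logging");
    }
    if callee.ends_with(".then") || callee.ends_with(".catch") || callee.ends_with(".finally") {
        out.push("async_patterns");
    }
    if callee.starts_with("db.") {
        out.push("data_access");
    }
    out
}

fn main() {
    let hints = [
        Hint { node_kind: "call_expression", text: "console.log(\"ready\")" },
        Hint { node_kind: "call_expression", text: "promise.then(render)" },
        Hint { node_kind: "call_expression", text: "db.query(sql)" },
        Hint { node_kind: "call_expression", text: "render(view)" },
    ];
    for h in &hints {
        println!(
            "{:<24} node-kind: {:?}  callee-text: {:?}",
            h.text,
            by_node_kind(h.node_kind),
            by_callee_text(h)
        );
    }
}
```

Under node-kind-only routing all four calls count as data_access. With callee text, only db.query stays there, and render(view) matches no table and drops out, which is the kind of redistribution described under "Impact on existing baselines".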

list_categories() now returns 8 entries:
async_patterns, class_hierarchy, data_access, error_handling,
logging, resource_management, state_management, type_assertions.

Impact on existing baselines

The category contract is unchanged, but the per-category instance counts and
entropy values shift on the first snapshot taken after upgrade:

  • data_access shrinks — only callees matching the per-language regex
    remain; structurally homogeneous non-data calls are dropped.
  • logging becomes non-zero on languages with a logging regex table
    (was catalog-only since M30).
  • async_patterns grows on TS/JS — Promise chains are now counted.
  • resource_management shrinks on Rust — logging macros leave the bucket.
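
A small numeric sketch shows why both counts and entropy move when instances leave a bucket. These notes do not state sdivi's exact entropy formula, so the example assumes plain Shannon entropy (in bits) over per-shape instance counts within one category; the counts themselves are made up.

```rust
// Illustrative only: assumes Shannon entropy over per-shape counts.
// sdivi's actual metric may be defined differently.

/// Shannon entropy in bits over a list of per-shape instance counts.
fn entropy_bits(counts: &[u32]) -> f64 {
    let total: u32 = counts.iter().sum();
    if total == 0 {
        return 0.0;
    }
    counts
        .iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = f64::from(c) / f64::from(total);
            -p * p.log2()
        })
        .sum()
}

fn main() {
    // Hypothetical data_access bucket before the switchover: one dominant
    // shape of non-data calls (40) plus two real data-access shapes.
    let pre: [u32; 3] = [40, 10, 10];
    // After the switchover the 40 non-data calls no longer match.
    let post: [u32; 2] = [10, 10];

    println!(
        "instances: {} -> {}",
        pre.iter().sum::<u32>(),
        post.iter().sum::<u32>()
    );
    println!(
        "entropy:   {:.3} -> {:.3}",
        entropy_bits(&pre),
        entropy_bits(&post)
    );
}
```

With these numbers the bucket drops from 60 instances to 20 and its entropy moves from about 1.25 bits to exactly 1 bit, a shift far larger than any floating-point tolerance would absorb.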

Threshold gates (sdivi check) tuned against pre-M33 baseline numbers may trip
on the next snapshot. The escape hatch is unchanged: set
[thresholds.overrides.<category>] with an expires date inside your migration
window to cover the gap until you have retuned. The M20 cross-architecture
threshold epsilon is far smaller than these instance-count shifts and will not
absorb them. A side-by-side pre/post worked example is in MIGRATION_NOTES.md.

What did not change

  • snapshot_version is still "1.0". The PatternCatalog JSON shape,
    pattern_metrics field names, and DivergenceSummary structure are
    unchanged.
  • No new .sdivi/config.toml keys. [thresholds.overrides.<new-category>]
    blocks are legal under the existing category-agnostic override loader and the
    existing expires-required rule.
  • Public API is additive: classify_hint, PatternHintInput, the per-category
    matches_callee/excludes_callee helpers, and the three new category names.
    category_for_node_kind is unchanged and preserved for callers that have a
    node kind but no source text.
  • Foreign extractors that emit PatternInstanceInput directly are unaffected —
    their inputs determine their outputs.
  • WASM dependency invariant (Rule 21) holds: regex is the only new entry in
    the sdivi-core wasm32 dependency tree; tree-sitter, walkdir, ignore,
    rayon, and tempfile are still absent.
  • Snapshot atomic-write, retention, exit-code, and determinism contracts are
    unchanged. Same repo state + same seed still produces bit-identical output
    (though that output differs from the pre-M33 output).

Install

# crates.io
cargo install sdivi-cli

# pre-built binary (Linux x86_64 example)
curl -Lo sdivi https://github.com/GeoffGodwin/sdivi-rust/releases/download/v0.2.23/sdivi-x86_64-unknown-linux-gnu
chmod +x sdivi && mv sdivi ~/.local/bin/

# WASM / npm
npm install @geoffgodwin/sdivi-wasm@0.2.23

Released under Apache 2.0.