Releases: jcreinhold/lean-semantic-search
Releases · jcreinhold/lean-semantic-search
v0.3.1
Changed
- Upgraded the
lean-rsworkspace crates (lean-rs-worker-protocol,lean-toolchain) to 0.2.2 and bumped the Lean
toolchain pin toleanprover/lean4:v4.31.0-rc2(header-identical to-rc1). RoleFeatures.factsFromStatementnow deduplicates role features in O(n) via a hash set on the (injective) feature
sort key, replacing the previous per-insert linear scan (O(n²) in a statement's feature count). The emitted feature
rows are byte-identical —featuresJson/sortedFeaturesre-sort by the same key, so only the distinct set, not
insertion order, was ever observable — so the change is a pure performance fix with no cache-key or version impact.
(runtime_source_digestupdated for the source edit.)
v0.3.0
Added
lean-semantic-search-runtime: a package-owned runtime crate that ships theLeanSemanticSearchLean capability
payload, materializes it in a caller-owned per-toolchain cache with a generated downstreamlean-toolchain, records
provenance, builds viaCargoLeanCapability::lean_sysroot, and returns aLeanBuiltCapabilityfor downstream hosts.
Changed
lean-semantic-search-runtimenow delegates source-package materialization to the sharedlean-toolchainhelper
while preserving its public runtime API. Cache keys remain digest/toolchain based, provenance sidecars and generated
downstreamlean-toolchainfiles are still recorded, and the runtime payload remains a zero-dependency Lake package.
v0.2.0
Added
lean-semantic-search-retrieval: aCorpustrait — the storage seam a later persistent store implements — with the
in-memory inverted index as the reference implementation, andretrieve_acrossfor fanning one anchor across a slice
of corpora into one bounded, ranked list.lean-semantic-search-store: a persisted, on-diskCorpusover SQLite — a streaming, order-agnostic build with a
query-bounded resident set and an atomic single-file publish.Store::open_freshreuses a corpus only on a matching
opaquecorpus_tokenand matchingschema_version/policy_version, reporting every mismatch or corruption as a
structuredCacheMissrather than an error;set_latest/cleanupare neutral, latest-pointer-protecting,
dry-run-by-default primitives over content-addressed corpus directories. The store records the versions and the opaque
token but never interprets the token's contents. Seedocs/architecture/05-sqlite-store.mdand
docs/architecture/06-cache-lifecycle.md.
Changed
lean-semantic-search-retrieval: bounded selection now bounds a fingerprint/statement lane and a role/binder lane
separately and unions them, so a selective role match is not crowded out behind a fingerprint cohort.
RETRIEVAL_POLICY_VERSIONmoves tolean-semantic-search.retrieval.v2. Ranking accumulates bydeclaration_id
rather than a dense row index, so a non-contiguous backend can implementCorpus. See
docs/architecture/04-persistence.md.
v0.1.0
Initial release of the shared semantic-search package for Lean tooling.
Added
lean-semantic-search-contract: stable serde DTOs, opaque keys, diagnostics, version constants, and
response envelopes — the cross-repository JSON contract.lean-semantic-search-capability: worker-facing command names, export names, advertised facts, and
empty-diagnostic helpers.lean-semantic-search-retrieval: storage-neutral semantic candidate generation over feature rows,
carrying its ownRETRIEVAL_POLICY_VERSION.- Lean feature-extraction package (
lean/LeanSemanticSearch): canonical traversal, role features, module
and declaration extraction, proof-goal features, JSON envelopes, and the five@[export]capability
entry points.