Open
Conversation
Wire the Lean 4 kernel (type-checker, WHNF reduction, definitional
equality, inductive checking) into the main codebase as `ix::kernel`,
ported from ix_old. This is the foundation for aux_gen, which will
regenerate canonical auxiliary constants (.rec, .below, .brecOn) for
alpha-collapsed and SCC-split mutual blocks.
Restructure ConstantMeta to support this:
- Rename the flat enum to ConstantMetaInfo, wrap it in a ConstantMeta
struct with an `info` field (matching the kernel's expected interface)
- Add ConstantMetaInfo::Muts variant for mutual block equivalence classes
- Add resolve_kvmap for kernel ingress of KVMap metadata
- Add Named.original field for aux_gen roundtrip decompilation
- Add Address::from_hex for kernel test fixtures
Update all downstream consumers (compile, decompile, FFI, serialization)
to use the new ConstantMeta wrapper and ConstantMetaInfo enum. Add
mutual inductive test fixtures in Tests/Ix/Compile/Mutual.lean covering
alpha-collapse, over-merge, and SCC-split patterns.
Extract the compile_env work-stealing scheduler from compile.rs into
compile/env.rs (Phase 2b-2c from TODO.md). The new scheduler uses
idempotent dependency tracking via Mutex<FxHashSet<Name>> instead of
AtomicUsize counters, preventing silent double-decrement corruption.
Adds a DashSet<Name> guard against duplicate block processing, and
drains aux_gen_extra_names after each block to unblock dependents of
"bonus" names compiled during parent inductive blocks.
Extend CompileState with fields needed for aux_gen (Phase 2a): kenv,
kintern, ungrounded, aux_gen_extra_names, aux_name_to_addr. Add
resolve_addr / resolve_addr_aux for two-tier name resolution (compiled
names first, aux_gen fallback second). Change blocks field from
DashSet<Address> to DashMap<Name, Vec<Vec<Name>>> to store canonical
class ordering for downstream aux_gen use.
Resolve all warnings under cargo xclippy -D warnings:
- Replace ~100 u64-to-usize casts with fallible u64_to_usize() helper
for future 32-bit target support
- Collapse identical match arms in congruence, ingress, def_eq, check
- Remove unused imports, variables, unnecessary mut across kernel
- Fix map().unwrap_or() → map_or() / is_some_and() patterns
- Rename to_node → node_for_key, to_ctor_when_k → synth_ctor_when_k
for clippy self-convention compliance
- Convert loop/match/break → while let in expr, tc, inductive
- Change add_eq_axioms(&mut KEnv) → (&KEnv) since DashMap is
interior-mutable
- Remove unnecessary path qualifications throughout kernel
- Add crate-level allows for type_complexity, too_many_arguments,
unnecessary_wraps
When sort_consts collapses N mutual inductives into fewer equivalence
classes, Lean's auto-generated auxiliaries (.rec, .recOn, .casesOn,
.below, .brecOn, .noConfusion) have the wrong arity. Rather than
surgically patching them, this commit regenerates them from the
canonical class structure, producing identical output regardless of
source declaration order.
Core additions:
- src/ix/compile/aux_gen/ — Full auxiliary regeneration module with
submodules for each auxiliary kind: recursor (3.5k lines, following
lean4/src/kernel/inductive.cpp), below, brecOn, casesOn, recOn,
noConfusion, plus expr_utils for FVar-based intermediate computation
and nested occurrence detection
- src/ix/compile/mutual.rs — Orchestrates aux_gen output into Ixon
blocks via compile_aux_block and generate_and_compile_aux_recursors
- src/ix/congruence.rs — Alpha-equivalence checker between Lean
expressions/constants, used to verify aux_gen output matches Lean's
native constants
Scheduler and compilation changes:
- Pre-compile PUnit, PProd, Eq, True into aux_name_to_addr so aux_gen
can reference them before the main scheduler processes their blocks
- Add compile_const_no_aux path to capture original Lean form; promote_aux
moves aux_gen constants to name_to_addr while storing the original
(addr, meta) in Named.original for decompilation roundtrip
- CompileError::MissingConstant now carries a `caller` field; scheduler
prints full dependency status on failure for easier debugging
- BlockCache.compiling tracks current constant name for error context
Soundness fix:
- Remove addr_to_name reverse index from Env — it was unsound for
alpha-equivalent constants where multiple names map to the same
Expr::Ref nodes instead of silently falling back to reverse lookup
Decompiler hardening:
- Remove fallback resolution paths that silently masked metadata
mismatches; missing Ref metadata is now a hard error
- Expand decompilation support for aux_gen-produced constants
Utilities and infrastructure:
- Name: add NameComponent, components(), strip_prefix(),
append_components(), last_str() for aux name manipulation
- Expr::pretty() for debugging expression trees
- ConstantMetaInfo::kind_name() diagnostic helper
Testing:
- New ValidateAux.lean test with 6-phase validation: compilation,
no ephemeral leaks, alpha-equivalence canonicity, decompilation
with/without debug info, aux congruence verification
- Mutual.lean: make all declarations `public` for cross-module access
- rs_compile_validate_aux FFI entry point with phased logging
- Restructure rs_tmp_decode_const_map into phased output
Also applies cargo fmt and clippy fixes across kernel files (mode.rs,
inductive.rs, congruence.rs, egress.rs, env.rs, tutorial/basic.rs,
meta.rs).
…he keys
Moves all shared type-checker state — intern table, WHNF/infer/def-eq/
ingress/recursor caches, and resolved primitives — from TypeChecker into
KEnv. TypeChecker becomes a lightweight thread-local handle (`TypeChecker<M>`
instead of `TypeChecker<'env, M>`) that holds only local context, equiv
manager, and control flags, making it cheap to create and safe to run in
parallel over a shared `Arc<KEnv>`.
Replaces pointer-based `ptr_key()` cache keys with blake3 `hash_key()` keys
throughout WHNF, def-eq, infer, and ingress caches, fixing the ABA problem
where deallocated Arc pointers could alias semantically different expressions.
All caches now use DashMap/DashSet for lock-free concurrent access.
Also:
- Add KernelCtx wrapper for compile-side aux_gen sort-level inference
- Extend aux_gen and decompile to handle nested _N-suffixed auxiliaries
(rec_N, below_N, brecOn_N, etc.)
- Improve ingress error handling: universe index, blob, and UTF-8 errors
are now explicit instead of silently defaulting
- Add Display impl for TcError and is_explicit() for KUniv
1. **Call-site surgery for alpha-collapsed recursors.** When
`sort_consts` collapses mutual inductives into fewer equivalence
classes, `.rec` gets regenerated with canonical motive/minor
layout, but user code still applies it with source-order arguments.
Add a surgery pipeline that reorders arguments at compile time
and reconstructs the source-order App spine at decompile time.
- New `src/ix/compile/surgery.rs`: `CallSitePlan` (per-recursor
permutation + keep mask), `collect_{lean,ixon}_telescope`
helpers, and `compute_call_site_plans` derivation.
- New `ExprMetaData::CallSite { name, entries }` arena node
carrying source-order `Kept { canon_idx, meta }` /
`Collapsed { sharing_idx, meta }` entries at the outermost
App position.
- Extend `ConstantMeta` with `meta_sharing`/`meta_refs`/
`meta_univs` extension tables forming a virtual address space
appended to the block cache during decompile. Serialization is
always-on (zero-length for pre-surgery constants).
- `compile_expr` adds a `BuildCallSite` frame that splits
telescopes into kept/collapsed args and emits a normal App
spine with CallSite metadata; `decompile_expr` adds a
`BuildTelescope` frame that walks entries in source order,
resolving kept positions from the canonical spine and
collapsed positions from the extension tables.
2. **aux_gen scope reduction and correctness fixes.** Empirically
confirmed (via 25k+ constant validate-aux roundtrip) that only
auxiliaries whose *value* references `.rec` directly need
regeneration. `.casesOn`'s public binder arity is invariant under
alpha collapse, so `.noConfusion`, `.ctorIdx`, `.ctor.inj*`,
`._sizeOf_*`, etc. correctly bind to the regenerated `.casesOn`
at address-resolution time with no patching.
- Delete `src/ix/compile/aux_gen/no_confusion.rs` and drop the
broken `.noConfusion` regeneration attempt from `mutual.rs`.
Remove unused `PatchedConstant::_NoConfusion{,Type}` variants.
- Expand the aux_gen module docs into a full taxonomy of which
auxiliaries need regeneration vs. inherit correctness.
- Add nested-inductive expand/restore model (`ExpandedBlock`,
`generate_recursors_from_expanded`) so recursors for
inductives with nested occurrences (e.g., `Array (Part α)`)
build uniformly against auxiliary `_nested.*` consts and
restore the originals afterwards.
- Propagate `is_reflexive` on `BelowIndc` so `.below` content
hashes for reflexive inductives like `Acc` match Lean's.
- Add `instantiate_rev` matching Lean C++ semantics for
multi-arg reverse instantiation.
- New `src/ix/compile/nat_conv.rs` with `try_nat_to_usize` /
`nat_to_usize` / `nat_to_u64` to make Lean-Nat-to-Rust-int
conversions explicit rather than silently producing 0 on
overflow. Threaded through all aux_gen modules.
- Harden `promote_aux`: the name-mismatch branch now returns
`CompileError::InvalidMutualBlock` instead of just logging —
silently continuing would splice foreign metadata into the
wrong Named entry.
3. **Env behind Arc<ConstantInfo>, with ref-graph precompile of
aux_gen prereqs.** Kernel caches were lifted into `Arc<KEnv>`
in the prior commit; this finishes the migration by changing
`Env::get` to return `Option<Arc<ConstantInfo>>` rather than
borrowed refs. Updated call sites in ground.rs (remove
`GroundError<'a>` lifetime), graph.rs, decompile.rs, ffi/,
aux_gen/, and throughout. Also fixes a nondeterministic
`MissingConstant` race in the compile scheduler:
- `precompile_aux_gen_prereqs` in `compile/env.rs` computes the
transitive SCC closure of aux_gen seed names (PUnit, PProd,
Eq{.refl,.symm,.ndrec}, rfl, HEq{,.refl}, eq_of_heq, True)
and compiles them in reverse-topo order into
`aux_name_to_addr` before any block's aux_gen fires. Prevents
work-stealing races where `.brecOn.eq`'s emitted `Eq.symm`
ref could race with `Eq`'s own SCC.
- Any prereq compile failure is now a hard error (was silent
fallback, which left names unresolved and reintroduced the
race).
- Scheduler uses `notify_one` per completion and `notify_all`
only at total completion — removes thundering herd on every
block.
- Don't register failed aux_gen names in
`aux_gen_extra_names` — downstream should get
`MissingConstant` rather than silently referencing broken
data.
Additional correctness and diagnostics:
- Gate slow-block / aux_gen timing diagnostics on `IX_TIMING`
env var (default off).
- Add `IX_CONGRUENCE_DUMP` env var in the congruence patch-check
to dump generated vs. original types/values on mismatch,
optionally filtered by name substring.
- `Named` gains `name_refs: Vec<Vec<Name>>` parallel to
`Constant.refs`, populated by every compile path. Currently
reserved — the decompiler resolves Ref names via the arena's
`name_addr` metadata (content-hashed, so alpha-collapse is
already disambiguated). Field kept as a schema-stable
extension point; documented accordingly.
- `decompile_expr` replaces `arena.nodes.get(i).unwrap_or(&Leaf)`
with `arena_lookup` that accepts `u64::MAX` as the "no
metadata" sentinel (returns Leaf) but errors with
`BadConstantFormat` on any other out-of-bounds index. Arena
corruption now fails loudly instead of silently stripping
metadata. Add `pop_result` helper for the same reason on
result-stack underflow.
- Sharing analyzer: `analyze_block` now returns topo_order as a
third tuple element, threaded through `decide_sharing`,
`build_sharing_vec`, and `analyze_sharing_stats` instead of
each re-sorting. `rewrite_expr` checks `hash_to_idx` before
the cache so stale cache entries don't block sharing
replacement — removes the need for two `cache.clear()` calls
in `build_sharing_vec`.
- Fix `decode_const_map` patch validation: `num_indices` was
hardcoded to 1 and `is_reflexive` to false; both now come from
the decoded `BelowIndcVal`.
- Clean up dead `_mk_unit_type` / `_mk_unit_val` Prop helpers in
cases_on.rs (only PUnit path is live).
Tests:
- `ValidateAux.lean`: extend test scope to `State` and `Lean`
prefixes for wider aux_gen coverage.
a97014c to
f0bceac
Compare
Makes `Benchmarks/Compile/CompileMathlib.lean` pass end-to-end through the
validator by closing every remaining aux_gen hash divergence and wiring
up the two diagnostic roundtrips + the `ix validate` CLI they share.
### New: `ix validate --path <file>` subcommand
Runs the full 8-phase aux_gen validation pipeline (compile + decompile +
roundtrip + alpha-equivalence + nested-detect checks) on any Lean file,
reusing the same Rust FFI (`rs_compile_validate_aux`) the `validate-aux`
test runner already calls. Separate from `ix compile` because validation
is expensive and primarily a correctness gate; separate from `lake test`
so Mathlib-scale validation doesn't force the test binary to rebuild on
every Mathlib update. Supports `--ns Aesop,SetTheory.PGame` namespace
filtering with transitive-dep closure.
Moves `buildFile` / `fetchMathlibCache` from `Cli/CompileCmd.lean` to
`Ix/Meta.lean` so both CLI entry points share the same Mathlib-cache
bootstrap, and drops `test-ffi` gating on `rs_compile_validate_aux` so the
production `ix` binary can call it.
### New: two roundtrip diagnostic tests
- `kernel-ixon-roundtrip` (replaces the old `kernel-roundtrip`): now runs
`Lean → compile → ixon_ingress → KEnv → ixon_egress → decompile_env →
Lean` rather than a second ad-hoc `KEnv → Lean` decompiler. Passing
through the validated `decompile_env` lets aux_gen auxiliaries
(`.brecOn*`, `.below`, `.brecOn_N.eq`) regenerate from the
kernel-canonicalized Ixon form — closing the binder-name / alpha-
collapse drift the old direct path couldn't see.
- `kernel-lean-roundtrip` (new): `Lean → lean_ingress → KEnv → lean_egress
→ Lean`, skipping compile + Ixon entirely. Used to bisect between
compile-pipeline bugs and ingress/egress bugs — if this is clean but
`kernel-ixon-roundtrip` fails, compile is losing info.
### New: `ixon_egress` (`src/ix/kernel/egress.rs`)
Inverse of `ixon_ingress`: `KEnv<Meta> + original IxonEnv → IxonEnv'`.
Reuses `apply_sharing_*` from the compile side so the output is ready for
`decompile_env`, and preserves each `Named.original` so aux_gen entries
survive the roundtrip. Parallelized over DashMap partitions (muts blocks
vs. standalone). Meta-only by design — generalizing to `Anon` requires
address-keyed lookups.
### New: `lean_ingress` (`src/ix/kernel/ingress.rs`)
Direct `LeanEnv → KEnv<Meta>`, bypassing `compile_env` + `ixon_ingress`.
Uses `lean_name_to_addr` for every `KId.addr`, populates `kenv.intern`
in-place, and emits `kenv.blocks` only for mutuals with >1 member. Only
needed by the bisecting `kernel-lean-roundtrip` test.
### Aux_gen correctness
- **Unsafe propagation** — plumbed `is_unsafe` through `.rec`, `.below`
(Type + Prop), `.brecOn` / `.brecOn.go` / `.brecOn.eq`, `.casesOn`,
`.recOn`. Matches Lean's `mkDefinitionValInferringUnsafe` +
`mkThmOrUnsafeDef`: unsafe `.brecOn.eq` flips from `Thm` to unsafe
`Defn` with `Opaque` hints; all type-level aux get `Unsafe`/`Opaque`
safety whenever the parent inductive is unsafe. Previously hardcoded
`Safe`, breaking content hashes on unsafe inductives.
- **Kernel WHNF in `find_rec_target` / IH builders** — `build_minor_type`,
`build_ih_type_fvar`, `build_rule_ih_fvar`, and `find_rec_target` now
delta-unfold field-domain heads via `TcScope::whnf_lean` instead of
pure `beta_reduce`. Matches Lean's `kernel/inductive.cpp::
is_rec_argument`; fixes `reduceCtorParam*` regressions where an
inductive appears under a definition head like `constType (n α) (n α)`.
- **K-target via WHNF-reduced sort** — `compute_is_large_and_k` uses the
kernel-derived `is_prop` rather than syntactic `peek_result_sort`, so
reducible-alias target types like `Presieve X := ∀ Y, (Y ⟶ X) → Prop`
correctly qualify for K.
- **`.brecOn.eq` indexed-Eq construction** — per-index sort levels
computed via `TcScope::get_level` (was hardcoded `Sort 1`); per-index
`Eq` vs `HEq` decided via `TcScope::is_def_eq` matching Lean's
`mkEqAndProof`; dependent index types substitute `outer_idx → new_idx`;
major's sort level threaded explicitly. Fixes `TRBTree.brecOn.eq`,
`Quiver.Path.brecOn.eq`, `PGame.Relabelling.brecOn.eq`,
`Monoid.CoprodI.NeWord.brecOn.eq`, `NFA.Path.brecOn.eq`, and similar
indexed-inductive cases.
- **`level_normalize`** — ported from Lean's `Level.normalize`
(`Lean/Level.lean:379-401`), applied by `TcScope::get_level` before
returning sort levels for forall types. Matches `inferForallType`;
without it, level trees stayed in our kernel's `mkLevelMax'` local-simp
form and produced alpha-equivalent-but-not-hash-equal PProd level args
(e.g., `SetTheory.PGame.brecOn.go` d=9 PProd.mk.lvl[1]).
- **Nested-inductive false-positive** — `try_detect_nested_fvar` no longer
matches against `flat`-stored external aux names (`Array`, `Option`,
…). Matching only against the block's original inductives mirrors the
kernel's `is_nested_inductive_app`; the old behavior flagged innocent
occurrences (e.g., `Option (Array Script.LazyStep)` inside
`Aesop.RappData`) and cascaded into phantom `_nested.Option_N` /
`.rec_N+1` / `.below_N+1` / `.brecOn_N+1` constants.
### Kernel ingress/egress
- **`RecRule.ctor` metadata field** — added `ctor: M::MField<Name>` to
kernel `RecRule` so ingress ↔ egress roundtrips the Lean rule's ctor
name. Unused by the kernel itself (dispatch is positional via `cidx`);
erased in `Anon` mode. Threads through `ingress_recursor`,
`egress_constant`, and every recursor-building site in
`kernel/inductive.rs`, including all tutorial tests.
- **Mdata preservation** — `lean_expr_to_zexpr_raw` accumulates consecutive
`Mdata(kv, _)` layers into a single `Vec<MData>` attached via the
`_mdata` constructors. Previously silently discarded, which lost every
`_recApp` / `_inaccessible` / `noImplicitLambda` / `borrowed` /
`sunfoldMatch` / `save_info` annotation. `kernel-lean-roundtrip`
guards against regressing this.
- **Binder-name tracking** — `lean_expr_to_zexpr_raw` now threads a
`binder_names` stack so `ExprData::Var`'s display name is populated by
de Bruijn lookup (cosmetic for pretty-printing; doesn't affect
type-checking).
- **`resolve_all` correctness** — errors on missing Named entries instead
of synthesizing a name-hash fallback. The old fallback produced ghost
KConsts (KIds referring to addresses where no KConst was ever stored),
causing obscure downstream lookup failures and alpha-collapse
confusion. `ingress_muts_inductive` also requires per-ctor Named
metadata to avoid synthesizing junk binder names from missing arenas.
- **Nat blob address fix** — uses `to_le_bytes()` (full BigInt width)
rather than `to_u64().unwrap_or(0).to_le_bytes()` which truncated values
≥ 2⁶⁴ to 0, hash-consing distinct Nats to the same KExpr.
- **Rename**: `ixon_to_zenv → ixon_ingress`, `egress_env → lean_egress`.
Disambiguates the four ingress/egress directions now that the Ixon and
Lean variants coexist.
### Mathlib-scale memory + parallelism
- **`Env::put` streaming** — collect only keys up front, `par_sort_unstable`,
look up each value via `DashMap::get` in the write loop so only one
value is live beyond the DashMap's own storage. Saves ~30 GB peak RSS
on Mathlib (Section 4 `Named` had the clone-into-Vec dominating peak).
`topological_sort_names` keeps the tuple-clone path: benchmarked 22s
faster than keys-only DFS because the Arc parent chain avoids 4.7M
shard lookups.
- **`validate-aux` parallelism** — Phases 1, 3, 4, 6, 7b, and Phase 2's
aux_gen+alpha step all run via `rayon::par_iter` with atomic
pass/fail counters and a 20-entry mutex-guarded failure sample. Phase
2 is restructured into three passes (serial collect → serial
pre-ingress → parallel aux_gen) because `p2_kctx` can't be populated
concurrently with aux_gen's reads.
- **Phase 7 memory accounting** — `std::mem::take(&mut stt.env)` extracts
the Ixon env before dropping the rest of stt + dstt in parallel via
`rayon::join`; serialize's 3 GB buffer dropped immediately after
deserialize; `fresh_stt` destructor offloaded to a background thread
so Phase 7b's parallel scan doesn't wait on it.
- **CLI-path destructor skip** — `rs_compile_env` and
`rs_compile_validate_aux` `std::mem::forget` their final state by
default (escape hatch: `IX_SKIP_DROPS=0`). On Mathlib this trims 60–90
seconds of shard-by-shard `Arc<NameData>` refcount chains at process
exit, where the OS reclaims the pages immediately anyway.
### Removed: `Named.name_refs`
The `Vec<Vec<Name>>` per-address name table was populated by every
compile path but never read by the decompiler — the arena's
`ExprMetaData::Ref { name }` already distinguishes alpha-collapsed Refs
via its name-content hash. Deletes the field, its `with_name_refs`
builder, every compile-site population, and the two `name_refs`
serialization loops. Small serialized-size win; no behavioral change.
…rant scheduler
This pass (1) closes three P1 soundness gaps in the kernel, (2) splits the
compile-time kernel context into pristine-originals and canonical halves
so original-constant verification is fully isolated from the aux_gen
pipeline, (3) adds native Int reduction parallel to Nat, (4) makes
compile_env tolerant of per-block failure so one bad inductive doesn't
stop the batch, and (5) wires a full-environment kernel-check FFI
(`CheckError = kernelException | compileError`) that lets Lean-side tests
distinguish compile-side rejections from kernel-side rejections.
### P1 soundness gaps — closed
- **P1-1 recursor rule verification** (`src/ix/kernel/inductive.rs`): the
"both gen and stored empty → error" guard was spuriously rejecting
`Empty.rec` / `False.rec` / `PEmpty.rec`. Zero-rule agreement is
*vacuous equality*, not a generator failure. Replaced with an
element-wise `is_def_eq(&gen_rule.rhs, &stored_rule.rhs)` gate
(also checks `fields` count); the one-sided `is_empty` branches
remain as honest mismatches. Regression test
`reject_bool_rec_with_swapped_rules` exercises the defeq gate with
type-correct-but-semantically-swapped minor bodies.
- **P1-2 mutual peer agreement** (same file): `check_inductive` now
enforces S3b — all peers must share parameter count AND
parameter-domain types, verified by a new `check_param_agreement`
walker that whnf-peels n_params foralls on both sides and
`is_def_eq`s each domain. Without this, `build_rec_type` would take
the shared param prefix uniformly from `ind_infos[0]` and produce
generated recursors whose param binders misalign with a peer's ctor
arguments — de-Bruijn-shifted iota, ill-typed stored terms. The
check is memoized on successful block completion in the new
`KEnv::block_peer_agreement_cache` (DashSet<KId>): peer agreement
is transitive, so O(N²) naive per-peer becomes O(N) per block.
References lean4 `check_inductive_types` and lean4lean's `Inductive/Add`.
Regression tests cover mismatched domains, mismatched counts, and a
happy-path sanity check.
- **P1-3 universe substitution out-of-range** (`tc.rs`, `error.rs`):
`subst_univ` / `instantiate_univ_params` now return `Result<_,TcError>`
and a new `TcError::UnivParamOutOfRange { idx, bound }`. Previously
any `Param(i)` with `i >= us.len()` silently produced an orphan
`Param` node. Arity is validated upstream (at `infer`'s Const node);
this is defense-in-depth so any code path reaching substitution
without that check fails loudly. All ~10 internal call sites
threaded with `?`. Regression test
`subst_univ_rejects_out_of_range_param`.
### Two-env split: `KernelCtx { kenv, orig_kenv }`
Add a second, pristine kernel environment populated once via
`lean_ingress(&lean_env)` at `compile_env` startup and never mutated.
Holds every Lean-original constant with all type references
self-consistent — no alpha-collapse, no aux rewriting, no staleness.
The existing `kctx.kenv` remains the incrementally-populated canonical
env.
- `mutual.rs` gains a new Phase 0 `check_originals` that typechecks
each Lean-stored inductive/ctor/recursor against `orig_kenv` before
any aux_gen work runs. Aux-recursor probe `<ind>.rec_1..16` to catch
recursors that live in their own SCC. Failures are recorded in
`stt.ungrounded` so the scheduler keeps running.
- Runs even for non-inductive blocks (`MutConst::Recr`-only SCCs from
`bad_raw_consts` can carry adversarial recursors that otherwise
wouldn't ever be checked).
- Old compile-overlay-polluted syntactic-compare false positives
(Array vs `_nested.Array_1`) are resolved by the split: Phase 0
runs against `orig_kenv`, the post-compile FFI check runs against
the aux-restored canonical env. Neither env sees the in-flight
compile overlay.
### LEON content-hash addressing in `orig_kenv`
`lean_ingress` now addresses every KId by `ConstantInfo::get_hash()`
(Blake3 over name + level params + type expression + variant-specific
fields: ctors, rules, `all`, value, hints, etc.) rather than
`lean_name_to_addr`. Two properties the name-hash scheme lacked:
- **Content-distinguishing**: a rogue Lean env can't shadow a primitive
by naming its own declaration `Nat`.
- **Compatible with `PrimOrigAddrs`**: the new hardcoded-LEON-hash
primitive table resolves addresses cleanly against `orig_kenv`,
avoiding the synthetic `@<hex>` KId fallback that otherwise
cascaded into `AppTypeMismatch` on every Nat literal reduction,
Bool literal, String coercion, etc.
Supporting machinery:
- `build_leon_addr_map` builds the `Name → Address` map in parallel
via rayon (returned as `DashMap` for signature compatibility with
`aux_n2a` downstream).
- Pass-1 ingress parallelized via `par_iter` (thread-safe: DashMap
intern, DashMap ingress_cache, unique LEON-hashed KIds mean no
shard contention on the insert).
- Block seeding (Phase A/B) uses the constant's **declaration order**
from `all` rather than FxHashMap iteration order, so
`discover_block_inductives` produces the order Lean's stored
recursors were generated against. Fixes spurious
`check_recursor: type mismatch` on `Lean.Xml.Content.rec`,
`Lean.Compiler.LCNF.Code.rec`, every
`Grind.Arith.*.*Cnstr*.rec`, etc.
- `ingress.rs::lean_ingress` pre-caches `Primitives::from_env_orig`
via new `KEnv::set_prims` so any `TypeChecker::new(orig_kenv)` sees
LEON-addressed primitives (the default `prims()` lazily initializes
to the canonical table, which would miss here).
### Native Int reduction
`try_reduce_int` in `src/ix/kernel/whnf.rs` (wired into `whnf`,
`whnf_no_delta`, and `lazyDeltaReduction` in `def_eq.rs`, parallel to
`try_reduce_nat`):
- Handles `Int.{ofNat,negSucc,add,sub,mul,neg,emod,ediv,bmod,bdiv,natAbs}`.
- Runs BEFORE delta so bodies like `Int.bmod`'s `Decidable.rec (LT.lt Int …)`
never expose the stuck `Int.decLt = decNonneg (b - a)` cascade.
- `IntVal = BigInt`; `extract_int_lit` reads canonical
`Int.ofNat n` / `Int.negSucc n` forms; `intern_int_lit` round-trips
back to ctor-headed shape; `int_ediv_emod` normalizes
truncated-div into Euclidean (non-negative remainder); `bmod`/`bdiv`
follow Lean's `[-(m/2), (m+1)/2)` window and the `Int.bmod x 0 = x`
/ `Int.bdiv x 0 = 0` corner cases.
- Lean's C++ kernel lacks a parallel `reduce_int` and reduces
symbolically through `Int.rec` + native Nat ops, which gets stuck
when any link of the chain is missing. Our kernel short-circuits.
Extends `Primitives<M>` with 12 Int fields, threads both `PrimAddrs`
tables with canonical + LEON hashes for each.
### `get_major_inductive_id` resilience
`src/ix/kernel/whnf.rs`: after peeling the stored
`params + motives + minors + indices` foralls, if the next domain
head isn't a `Const` resolving to `KConst::Indc`, scan up to
`MAX_EXTRA_FORALLS = 8` additional foralls for the first whose head
IS an inductive. Handles nested-inductive recursor shapes where
Lean's stored counts don't align with the kernel's view of the
forall structure after WHNF (e.g., extra instance/motive binders
not captured by `num_params/num_motives/...`).
### `is_rec_field` depth handling — corrected
The old path mixed (a) head-addr match and (b) a structural
same-head-Const check on the first `own_params` args — which silently
returned false whenever a spec_param was a bare `Var` (block param),
dropping the IH for any recursive field whose nested type used the
block's params directly (e.g. `head : Entry α β (Node α β)` in a
nested `List (Entry α β …)` scan).
Now:
- Caller passes `spec_params_lift_by: u64` explicitly, because stored
aux spec_params live at `depth = n_rec_params` and the lift required
depends on context:
- `build_minor_at_depth` pushes field locals → lift by
`self.depth() - n_rec_params`.
- `build_rule_rhs` uses virtual `Var(total_lams - 1 - j)` positions
without pushing → lift by `total_lams - n_rec_params`.
- Comparison uses `is_def_eq` on each (arg, lifted spec_param) pair —
handles alpha, whnf, beta, Var equality in a single shot.
- Same fix threaded through `build_minor_type` and `build_rule_rhs`
call sites; new doc header in the function explains the depth
contract and the interim-fix history.
### Tolerant scheduler: `stt.ungrounded` as DashMap
`compile_env` no longer aborts on per-block failure. Failures
(ill-formed inductives, cascading `MissingConstant`, …) are recorded
in `stt.ungrounded` (type changed from `FxHashMap` to `DashMap` for
concurrent writes); the rest of the env still compiles; dependents
hit `MissingConstant` and also land in `ungrounded`. Setup-phase
timing gated on `IX_QUIET` flag. `block_info` / `reverse_deps`
initialization is now parallel via rayon's `try_for_each`. Log spam
for cascading failures is gated on `IX_LOG_BLOCKS`.
### FFI: `CheckError` two-variant enum + `rs_kernel_check_consts`
- `CheckError` (Lean-side) gains a `compileError` ctor. Two variants
needed both (a) to disambiguate compile-side rejections from
kernel-side rejections in test output, and (b) to prevent Lean's
LCNF `hasTrivialStructure?` optimization from eliding a
single-ctor-single-field inductive to `String` — the heap ctor the
FFI allocates would otherwise be decoded as a string header and
SIGSEGV.
- `KERNEL_EXCEPTION_TAG = 0`, new `COMPILE_ERROR_TAG = 1`. New
`ErrKind` enum + `CheckRes` type on the Rust side.
- `run_checks_on_large_stack` / `check_consts_loop` now accept an
`ungrounded: FxHashMap<String, String>` snapshot; any constant
present there is reported as `compileError` without invoking the
kernel (matches the ix_old handling and lets `bad_raw_consts`
tests — e.g. `inductBadNonSort` failing `compute_is_large_and_k`
— roundtrip correctly).
- `build_uniform_error` now emits `ErrKind::Compile` (the setup-phase
failure happened before the kernel was consulted).
- `format_tc_error` catch-all uses `{other}` (hand-written `Display`)
instead of `{other:?}` to avoid dumping raw KExpr internals.
- `Array Bool` decoding now goes through `unbox_usize() >> 1`,
matching Lean's `lean_box(n) = (n << 1) | 1` tagged-scalar convention.
### Equiv-manager hot-path alloc reduction
`EquivManager::is_equiv` and `find_root_key` take `&EqKey` instead of
`EqKey`, eliminating Arc-clones on each call. `add_equiv` stays
by-value (insert requires ownership). `is_def_eq` builds a single
`a_key`/`b_key` binding up front and reuses it across is_equiv +
find_root_key + (at most one) add_equiv. The equiv-root second-chance
branch is the only remaining clone pair, and it's mutually exclusive
with the main-path add_equiv.
### Infer cache unification
Drop `infer_only_cache`; keep a single `infer_cache` that only stores
full-mode results. Infer-only reads happily consume them (validation
is strictly more than infer-only needs). Removes the
cache-duplication overhead and the subtle invariant that infer-only
results weren't supposed to leak into full-mode readers.
### `check_recursor_coherence` + coherence gating
New `check_recursor_coherence(id)`:
- `check_inductive` on the major (catches strict-positivity, bad ctor
return shape, field-universe violations — all of which the recursor
inherits).
- `compute_k_target(ind_id) == declared k` (K-reduction is sound only
for a narrow class of inductives; a mismatch is a soundness bug).
Plus `check_recursor` (the full gen-vs-stored path) now also gates
with `check_inductive` on the major before comparing rules. Cycle
invariant documented: `check_inductive` never calls
`check_recursor[_coherence]`, only `generate_block_recursors`.
### Lean-side
- `Ix/Ixon.lean`: `putConstantMetaIndexed` / `getConstantMetaIndexed`
always emit/consume three trailing length-prefixed extension tables
(`meta_sharing` / `meta_refs` / `meta_univs`) as zero-length on the
Lean side. Matches Rust's always-on wire format; Lean drops any
payload (it doesn't model call-site surgery data).
- `Ix/Meta.lean::getCompileEnv`: `loadExts := true` +
`enableInitializersExecution`, so persistent env extensions (e.g.
`registerTestCase` state) are hydrated from imported `.olean`s.
Without this, extensions silently initialize empty — breaks
extension-state reads via `get_env!`.
- `Ix/CompileM.lean`: new `rsLeonHashesFFI` opaque for the LEON hash
dump, consumed by the build-prim-origs test.
- `src/ix/address.rs`: `Address::to_unique_name` / `from_unique_name`
synthetic-`Name` codec (`Ix._#.<hex>`). Mirrors Lean-side
`Ix.Address.toUniqueName`; intended for KId/Named entries at
synthetic addresses that must not collide with Lean-originated names.
- FFI:
- `rs_leon_hashes`: hash every ConstantInfo in place, return
`Array (Ix.Name × Ix.Address)` — cheap relative to
`rs_compile_env_to_ixon`; used by test dumps.
- `Ixon.Named` FFI build/decode now handles the 3-field Lean
structure (addr, meta, original). Build allocates a 3-slot ctor
(was 2 — the missing slot caused Lean-side reads of slot 2 to
walk past the ctor header and SIGSEGV). `original` encodes
`Option (Address × ConstantMeta)` via the standard boxed
tagged-union pattern with scalar-optimized `None` handling on
decode.
- `rs_eq_env_serialization`: prints section-identifying diagnostics
under `IX_DEBUG_SERDE` — invaluable for opaque property-test
counter-examples.
### Tests
- `Tests/Ix/Kernel/BuildPrimitives.lean` (new): dumps the canonical
`(name, content_addr_hex)` pairs for paste into
`PrimAddrs::new`. Registered as `rust-kernel-build-primitives`.
- `Tests/Ix/Kernel/BuildPrimOrigs.lean` (new): dumps LEON-hash pairs
for `PrimAddrs::new_orig`. Shares `kernelPrimitives` +
`parseNameToLean` + `collectDeps` with BuildPrimitives as a single
source of truth. Registered as `rust-kernel-build-prim-origs`.
- `Tests/Ix/Kernel/CheckEnv.lean` (new): full-env typecheck via
`rsCheckConstsFFI`. Registered as `kernel-check-env`. Focus-mode
sister `kernel-check-const` walks a curated `focusConsts` list of
known-problematic names (Int64/Int32 lemmas stuck in
AppTypeMismatch; IR-scheme recursors currently rejected at
compile time; a single suspected WHNF loop commented out).
- `Tests/Ix/Kernel/Tutorial.lean`: transitive-closure seeding via
`collectDepsWithExtras` (seeds include both constant names and
`bad_raw_consts` names; walks refs through `env.constants` with
fallback to the raw-consts map). Turns a 45s test into a 5s test
by filtering ~200k unrelated Mathlib blocks. Error reporting
unpacks `CheckError` by ctor rather than `repr err` — derived
`Repr` is seconds-slow on multi-line kernel messages.
- `Tests/FFI/Lifecycle.lean`: `deferIO` + `mkSerdeRoundtripTest` put
all FFI calls under a lambda so they fire at test-execution time,
not `TestSeq` construction time. Previous code eagerly called
`rsSerEnvFFI` during test enumeration.
- `Tests/Gen/Ixon.lean`: `genNamed` now samples both `none` and
`some (addr, meta)` for the `original` field (3:1 frequency), so
the FFI `Named` roundtrip test actually exercises the `original`
encoding path.
- `Tests/Main.lean`: register the three new kernel test suites.
- Regression tests in `src/ix/kernel/inductive.rs` for all three P1
closures.
### Diagnostic env vars (all default off)
- `IX_TYPE_DIFF`: emit the `[type diff]` / `[rule rhs diff]` walk
from `check_recursor` mismatch. Default off — every mismatch in an
alpha-collapse regime produces thousands of lines. Uses
`KExpr::Display` so the format matches `TcError::AppTypeMismatch`.
- `IX_APP_DIFF`: print f/a/a_ty/dom and their whnf forms when infer's
App path rejects.
- `IX_DEF_EQ_TRACE=<prefix>`: trace every `is_def_eq` where either
side's head-const display form contains the prefix.
- `IX_DEBUG_SERDE`: section-level mismatch info for
`rs_eq_env_serialization`.
- `IX_QUIET`: suppress `[compile_env]` / `[lean_ingress]` phase
timings.
- `IX_LOG_BLOCKS`: gate the verbose dep-status block dump inside
the scheduler's failure path.
… cross-namespace twins
Ship the "anonymous canonicity" property end-to-end — compile, decompile,
congruence, and surgery now round-trip α-equivalent / cross-namespace /
source-reordered Lean mutuals to byte-equal Ixon. See the new
`docs/ix_canonicity.md` for the authoritative spec.
- `docs/ix_canonicity.md` (new, 1601 lines): theory, block layout,
metadata sidecar, compile/decompile pipeline, testing plan, and
open work.
- `Tests/Ix/Compile/Canonicity.lean` (new): 13 cross-namespace twin
families — simple/nested α-collapse, nested-aux hash ordering,
parameter rename, Prod binary-arity nesting, self-ref collapse,
over-merge + partial collapse, HO recursive fields, structures.
- `Tests/Ix/Compile/Mutual.lean`: new `NestedAuxOrdering{,Alpha,Prod}`
fixtures proving hash-sorted aux order is source-order-independent.
- `Tests/Ix/Compile/ValidateAux.lean`: seeds from the new Canonicity
namespace in addition to Mutual.
- New `AuxLayout { perm: Vec<usize>, source_ctor_counts: Vec<usize> }`
in `ix/ixon/env.rs`, where `perm[source_j] = canonical_i` maps Lean
source-walk aux positions to hash-sorted canonical positions.
- `ConstantMetaInfo::Muts` gains optional `aux_layout`, serialized in
the Rust Ixon format via `put_aux_layout` / `get_aux_layout`. Not
carried across the FFI — Lean's `ConstantMeta` has no `muts` variant,
so it's Rust-internal only.
- `CompileState.aux_perms: DashMap<Name, AuxLayout>` is populated per
block in `generate_and_compile_aux_recursors`, consumed by
`compute_call_site_plans` and `compile_aux_block_with_rename`.
- `aux_gen::generate_aux_patches` now returns `AuxPatchesOutput`
carrying `patches`, `aliases`, the hash-sort `perm`, and class/aux
counts so callers don't duplicate the canonical-layout logic.
- New module (1841 lines) replaces `aux_gen::canonicalize`. Walks
Lean-source-order originals and canonical aux_gen output in lockstep
with a `PermCtx` encoding (a) source→canonical aux positions and
(b) α-collapse const-name rewrites; FVar correspondence is
established at outer binder chains and apps are compared via
`app_spine_alpha_eq_ctx`, which peeks at rec heads to apply the
arg permutation.
- Fixes three failure modes of the old canonicalize-and-compare
approach: stale Const references for α-aliased auxes, BVar
arithmetic in rule-rhs bodies, opaque handling of `.recOn` /
`.brecOn.go` / `.brecOn.eq` value shapes.
- Wired into validate-aux Phase 2 via `build_perm_ctx_1b` in
`ffi/lean_env.rs` (also used by `rs_tmp_decode_const_map`).
- `compute_call_site_plans` now takes an `Option<&AuxLayout>` and
reads `num_motives` / `num_minors` directly from the Lean source
recursor (Lean's `numMotives = all.size + numNested`). Previously
deriving these from `original_all.len()` undercounted by
`numNested`, landing aux motives in the minor slice and producing
AppTypeMismatches like "Code minor in Array-Alt motive slot" on
`_sizeOf_N` bodies of nested mutuals (LCNF et al.).
- Source→canonical perm drives reordering of motive/minor args in
surgered `.rec` / `.below` / `.brecOn` spines.
- New `compile_aux_block_with_rename` maps canonical aux names to
Lean's source names when registering in the env, so user code
referencing `X.rec_1` resolves to the canonical aux at the right
semantic position under non-identity `perm`.
- `generate_recursors_from_expanded` threads
`source_of_canonical[canonical_i] = source_j` so aux recs emit as
`<all0>.rec_{source_j + 1}`, matching Lean's `.rec_N`. `below.rs`
and `brecon.rs` derive `below_N` / `brecOn_N` / `brecOn_N.{go,eq}`
from the rec's already-source-indexed suffix via
`aux_rec_suffix_idx` — below/brecon stay in lockstep with rec
under α-collapse.
- `aux_gen/nested.rs`:
- Aux `pre_flat` entries carry identity `spec_params` so
`find_rec_target` correctly identifies nested recursive fields
(otherwise fields like `List (A α)` missed their nested IH).
- `type_name_set` mirror of `types` for O(1) membership.
- `aux_seen` `Vec<(Hash, Name)>` → `FxHashMap` for O(1) dedup.
- Memoized `replace_all_nested` (per-constructor cache).
- Members track `source_owner` for downstream discovery.
- Aux ordering now runs `sort_consts` on temporary aux `Indc` values
instead of a separate name-based tiebreaker.
- New `CompileOptions { check_originals, max_workers }`. FFI
compile-only callers (`rs_compile_env_full`, `rs_compile_env`,
`rs_compile_phases`, `rs_compile_env_to_ixon`,
`rs_tmp_decode_const_map`) opt out of `check_originals` to avoid
retaining a second kernel-form copy of the full env. Kernel-check
path (`rs_kernel_check_consts`) enables it only when `expect_pass`
contains any `false` (the adversarial raw-constant case).
- New `preseed_expr_tables` / `collect_mut_const_exprs` walks block
exprs up-front to populate `cache.refs` and `cache.univs`
deterministically before per-constant compile, keeping content
hashes stable across scheduler orderings.
- Content-hash DAG memoization in `subst` (key `(Addr, depth)`) and
`inst_univ_inner` (key `Addr`, `us` fixed per call). Mirrors
lean4lean's `replaceM` / `PtrMap Expr Expr`. Unblocks hot paths
where the same subtree is walked hundreds of times (`_sizeOf_*`,
dependent-motive recursors).
- Removed the `2^20` Nat-literal cap on iota. `Nat.rec motive base
step N` expands ONE level per iota step and only actually recurses
if `step` forces `ih`; outer `MAX_WHNF_FUEL` correctly bounds
pathological cases. The old cap rejected legitimate reductions
inside `Int.Linear.Poly.combine_mul_k'` et al.
- `Int.pow` primitive added to `Primitives<M>` / `PrimAddrs` (anon
and meta hashes baked in) and to the `kernelPrimitives` dump list.
- `level::norm_add_node` now takes the current succ-accumulator `k`
— the earlier port silently dropped it, mis-normalizing
`Succ^n(imax(u, Param v))` for `n > 0`. `normalize_level` is now a
line-by-line port of lean4lean's `Level.Normalize`; `norm_level_le`
is a documented soundness-preserving strengthening (lean4lean's
`NormLevel.subsumption_eval` is `sorry` in `Verify/Level.lean:545`).
- Richer error variants and diagnostics:
`TcError::AppTypeMismatch { .., depth }`, `UnivParamOutOfRange`,
`TypeChecker::fuel_used()`.
- Gated diagnostic env vars for perf / reduction debugging:
`IX_IOTA_STUCK`, `IX_NAT_EXPAND_LOG`, `IX_WHNF_COUNT_LOG`,
`IX_DEF_EQ_COUNT_LOG`, `IX_INFER_COUNT_LOG`, `IX_SUBST_COUNT_LOG`,
`IX_DECL_DIFF`, `IX_PHASE_TIMING`.
- `ingress_expr` gains a `CallSite` arm that walks the IXON App
telescope and distributes per-arg arena indices from
`CallSiteEntry` across canonical positions. A plain App descent
propagated the single CallSite arena to every child and lost
per-arg binder names / failed the head Ref metadata lookup.
- `BlockCache` splits `sharing` (block-level, target of
`Expr::Share(idx)`) from `meta_sharing` (per-constant, target of
`CallSiteEntry::Collapsed.sharing_idx`). Treating them as one
silently returned the wrong subtree on any mutual block combining
`apply_sharing` output with surgered call-sites — the root cause
of the "Binder arena vs Expr::Ref Ixon" mismatch on surgered
`_sizeOf_N`.
- Projection decompile and recursor-rule decompile now call
`load_meta_extensions` so `Collapsed.sharing_idx` resolves against
the right table.
- `BuildTelescope` reverses popped args so App spines are rebuilt
in source order — fixes asymmetric-arg hash instability under
surgery.
- New `decode_name_array` decodes `Array Lean.Name` structurally
via a fresh `GlobalCache` (pointer-identity dedup of shared
subnames). Replaces the fragile `Name.toString` + `parse_name`
round-trip: Lean's escaped `Lean.Order.«term_⊑_»` no longer fails
lookup against the kernel's unescaped `Lean.Order.term_⊑_`.
- `rs_kernel_check_consts` signature change:
List (Name × CI) → Array Lean.Name → Array Bool → Bool →
IO (Array (Option CheckError))
with the trailing `Bool` toggling ephemeral per-constant progress
(in-place `[i/N] name ...` label; only slow/failing/not-found
constants get persistent log lines). Results are position-paired
with input names (no `(name, result)` tuple).
- `Tests/Ix/Kernel/Tutorial.lean` and `Tests/Ix/Kernel/CheckEnv.lean`
refactored to pass `Lean.Name` structurally throughout and use
`quiet = true` for full-env runs.
- Kernel modules gain `#[cfg(test)]` suites: `tc.rs` (context,
subst_univ), `def_eq.rs` (proof-irrelevance tier), `congruence.rs`
(level / expr / const), `infer.rs` (error paths), `error.rs`,
`primitive.rs`.
- `aux_gen/cases_on.rs` and `aux_gen/rec_on.rs` gain arity /
binder-name / structural tests.
First green run of `lake test -- kernel-check-env` end-to-end. The
kernel can now independently revalidate every stored mutual block in a
compiled environment without trusting compile-side metadata, the FFI
drives a parallel batch checker with progress / ETA / in-flight
reporting, and `lake exe ix check` ships the same pipeline as a
production CLI command.
The marquee change is §4.4 of the canonicity story (`docs/ix_canonicity.md`):
the kernel now ports `sort_consts` into `src/ix/kernel/canonical_check.rs`
(~1200 lines) and uses it as an independent oracle. The compiler's claim
that a block is canonical is verified, not trusted.
- New `src/ix/kernel/canonical_check.rs`: faithful Anon-mode port of
`compare_level`, `compare_expr`, `compare_indc`, `compare_ctor`,
`compare_recr`, `compare_defn`, plus `KMutCtx` block-local class map.
Two operating modes:
1. `validate_canonical_block_single_pass` — fast adjacent-pair
strong-`Less` validation against the stored Muts partition.
Rejects `Greater` (wrong order) and `Equal` (uncollapsed
alpha-equivalence). Falls back to full iterative refinement
when an adjacent pair is only weak `Less` and accepts only when
refinement returns the same singleton class order.
2. `sort_kconsts` / `sort_kconsts_with_seed_key` — iterative
partition refinement (sort → group → re-sort under updated
`KMutCtx`) for rediscovered nested auxiliaries.
- `TcError::NonCanonicalBlock { block, pos, ordering }`: rejection
variant emitted by primary-block validation. Display reports the
failing pair offset and direction.
- `ingress_muts_block` now calls `validate_canonical_block_single_pass`
on every stored Indc block.
- `canonical_aux_order` in `src/ix/kernel/inductive.rs`: synthesizes
`KConst::Indc` views of each rediscovered aux (instantiating ext
type with `spec_params`, replacing aux ctor result heads with the
synthetic aux KId) and runs `sort_kconsts_with_seed_key` to compute
the kernel-canonical aux order. The seed key uses a compiler-shaped
`<all0>._nested.<Ext>_<N>` name so alpha-equivalent aux remain
distinct synthetic members until partition refinement collapses them
— matching compile-side `sort_consts` behaviour.
- `RecursorAuxOrder::{Source, Canonical}` on `KEnv`: Lean's original
recursors use source/queue aux order, so `lean_ingress` builds the
orig_kenv with `Source` and skips canonical re-sorting; compiled
Ixon environments use `Canonical`. `build_flat_block` consumers
branch on this flag.
- Compiled Ixon `Muts(Indc, …)` blocks now contain only user reps —
positions 0..n-1. Aux inductives are transient compile-time
entities; the only persistent footprint is the recursor block
(`<addr>.rec_N`) and downstream aux derivatives (`.below_N`,
`.brecOn_N`).
- The kernel rediscovers aux from primary ctor walks and recomputes
the canonical aux order itself via `sort_kconsts`. Stored aux
recursors validate by position against the kernel-canonical aux
via `is_def_eq` on the recursor type.
- `docs/ix_canonicity.md` rewritten: §4 is now four operational
invariants (4.4 = kernel-side canonicity), §6.0 / §6.2 / §10.3
document the no-aux-in-inductive-block layout and `CallSite`
metadata alignment, §15 / §17 / §18 update the table-of-properties.
`sort_aux_by_content_hash` renamed to
`sort_aux_by_partition_refinement` everywhere.
- `ExprMetaData::CallSite` gains `canon_meta: Vec<u64>` — one arena
root per canonical-order arg in the IXON App spine. `entries`
stays in source order for decompile; `canon_meta` is the
canonical-order metadata sidecar that kernel ingress uses to
attach binder/reference metadata to each canonical argument.
Required for split-SCC minor adaptation: source-order minors stored
as `Collapsed` no longer have a Kept entry from which ingress could
recover the canonical wrapper's metadata.
- `ingress_expr` CallSite arm walks the IXON spine through
`IxonExpr::Share` indices when distributing canonical-arg arenas;
earlier impl crashed with "invalid Share index" on shared subtrees.
- `decompile_expr` mirrors the Share-walk via
`collect_ixon_telescope_expanding_shares`. The
Kept-entries-vs-canonical-args invariant relaxes from `==` to
`<=` because compile-side may synthesize a canonical wrapper arg
for which there is no source-order Kept entry (split-SCC minor).
- `decompile_inductive` scopes per-constructor `meta_sharing` /
`meta_refs` / `meta_univs` so aux-generated `.below` constructor
metadata cannot leak across sibling constructor arenas.
- `KEnv` gains `block_check_results: DashMap<KId, Result<(), TcError>>`,
`block_checks_in_progress: Mutex<HashSet<KId>>`,
`block_check_cv: Condvar`, plus `BlockCheckStart::{Cached, Owner}`
and `BlockCheckToken`. Concurrent checkers cooperate: the first
caller for a homogeneous block becomes owner; siblings park on the
condvar until the owner publishes a cached result.
- `check_const` classifies a constant's block as Defn / Inductive /
Recursor and routes through whole-block coordination when possible.
Mixed blocks fall back to legacy per-member checking. Ctor / aux
members of an inductive block also dispatch to the parent block
check so a single member request validates the whole block.
- `populate_recursor_rules_from_block` replaces `try_late_rule_generation`:
rule RHSs are populated once from the recursor block's peers using
full major-premise signature matching (not just inductive address),
which disambiguates duplicate nested auxiliaries at the right
recursor positions.
- `block_peer_agreement_cache` collapses O(N²) peer iteration to O(N)
per block; `infer_only_cache` now isolates infer-only synthesis
results so unchecked entries never contaminate the validated
`infer_cache`.
- New BitVec definitional reducer: `BitVec.toNat (BitVec.ofNat w n)`
→ `n % 2^w`, `BitVec.ult` → `Nat.ble (succ x.toNat) y.toNat`,
`Decidable.decide (x < y)` for BitVec routes through the same path.
- New String primitive reducer: `String.back ""` → `Char.ofNat 65`,
`String.utf8ByteSize` → byte length, `String.toByteArray ""` →
`ByteArray.empty`.
- Symbolic Nat reduction: predicate-by-ctor (`Nat.beq` / `Nat.ble`
on `Nat.succ` chains stay in literal/ctor form), peek through
symbolic `Nat.add`, lower-bound proofs for `Nat.mod` literal
numerator vs symbolic denominator, `Nat.sub` / `Nat.add` peeling on
literal RHS. New `extract_nat_value` recovers numerals from
`OfNat.ofNat Nat <lit>` and `Nat.succ <lit>` forms exposed by iota.
- Stuck-predicate detection: `Nat.beq` / `Nat.ble` between an unknown
argument and a recursive Lean model (`Nat.rec`, `Nat.casesOn`,
`BitVec.toNat`, `Fin.val` projections) stays stuck instead of
unfolding the model and peeling huge literals.
- Nat-literal iota runaway guard: replaces the old static `2^20`
literal cap with a per-recursor "consecutive predecessor" detector
(`nat_iota_run`), bounded by `MAX_CONSECUTIVE_NAT_LITERAL_IOTA`
(8192) and `MAX_LARGE_NAT_LITERAL_IOTA` (16384). Lean fuel-style
literals (`hugeFuel := 100_000_000`) still reduce when actual
recursion is bounded by data structure.
- Projection-definition rewriting: `Subtype.val` and the rest of
Lean's `Defn { kind = Definition, val = λ … Prj _ _ (Var n) }`
wrappers are reduced to bare `Prj` so cheap primitive recognizers
still match the head.
- `Fin.val` over `Decidable.rec`: `(Decidable.rec false_minor true_minor d).val`
reduces to a `Decidable.rec` whose minors return the projected Nat,
unlocking cases-on style defs that produce `Fin` values from a
`Decidable` decision.
- `System.Platform.numBits` reduces directly to `64` — Lean now
defines it via `Subtype.val (System.Platform.getNumBits ())` whose
inner `getNumBits` is opaque/extern. `PUnit._sizeOf_1` and
`Unit._sizeOf_1` reduce to `1` directly, matching the closed-form
Lean SizeOf instance that's otherwise stuck on an open unit var.
- Multi-arg beta uses `simul_subst` (one substitution pass instead of
N), and the WHNF main loop adds cycle detection (`seen` history)
so a reduction that loops without making progress breaks instead
of fueling out.
- def_eq cheap exits run before `tick`: `ptr_eq`, `hash_eq`, and
structural `compare_kexpr` all short-circuit before charging
recursive fuel. Beta/iota/zeta-only app congruence (Tier 1d) runs
before exposing recursive Lean models; congruence after
`whnf_no_delta` keeps primitive wrappers stuck when both sides
share the same head.
- `MAX_REC_FUEL` raised from 200K to 1.5M for BVDecide-style
generated mutual proofs that genuinely exceed a million kernel
steps after cache hits stop consuming fuel. `IX_MAX_REC_FUEL`
env var lets workers override.
- New diagnostic dumps gated by env vars: `IX_DELTA_TRACE`,
`IX_PROJ_TRACE`, `IX_NAT_TRACE`, `IX_ETA_TRACE`,
`IX_PROJ_DELTA_TRACE`, `IX_DEF_EQ_MAX_DUMP`,
`IX_WHNF_FUEL_DUMP`, `IX_RECURSOR_DUMP`. All gated on
`debug_label_matches_env` so a single label-filtered run can drill
into one constant without the rest of the env's traffic.
- New `Ix/Cli/CheckCmd.lean` (production CLI command) and
`Ix/KernelCheck.lean` (shared FFI binding + `CheckError` enum).
The single `@[extern "rs_kernel_check_consts"]` declaration moves
out of `Tests/Ix/Kernel/Tutorial.lean` to `Ix.KernelCheck`, with
Tutorial.lean / CheckEnv.lean re-exporting for backward compat.
- Flags: `--path <file>` (required), `--ns <prefixes>` (transitive
closure for filtered runs), `--verbose` (per-constant lines vs
default ephemeral progress).
- Emits a machine-readable `##check##` summary line for CI.
- `rs_kernel_check_consts` ungated from `feature = "test-ffi"` and
promoted to a production FFI. Test-only roundtrip helpers
(`rs_kernel_roundtrip{,_no_compile}`) stay cfg-gated.
- `run_checks_parallel_on_large_stacks` spawns N large-stack workers
with a cooperative scheduler that batches block-coordinated members
into a single owner task. `ParallelProgress` reports periodic
done/total, rate, ETA, and oldest in-flight constants.
- Tunables: `IX_KERNEL_CHECK_PROGRESS_MS`, `IX_KERNEL_CHECK_SLOW_MS`,
`IX_KERNEL_CHECK_ACTIVE_SLOW_MS`, `IX_KERNEL_CHECK_INFLIGHT`,
`IX_KERNEL_CHECK_NAME_CHARS`. Slow threshold default raised from
1s to 7s (Mathlib full-env runs swamp the log otherwise).
- `KernelCtx::orig_kenv` is now populated only when
`CompileOptions::check_originals` is set. Trusted compile paths
leave it empty so production builds don't retain a second
kernel-form copy of the entire env.
- `eager_reduce` is a synthetic kernel-only marker because Lean's
`eagerReduce` shares a canonical content address with `id`;
address-only dispatch on the real constant would be unsound.
- `KEnv::insert` panics on insertion at any address in
`PrimAddrs::reserved_marker_addrs()` so user envs cannot smuggle
a constant into the synthetic-marker slot.
- `Lean.reduceBool` and `Lean.reduceNat` are real primitives now
(address-dispatched, not synthetic), and the primitive table grows
to include `System.Platform.getNumBits`, `Subtype.val`,
`String.toByteArray`, `ByteArray.empty`, `Decidable.rec`, `Fin`.
- `safe definition references unsafe X` rejection extended from
unsafe defs to unsafe inductives, ctors, and recursors. The
`is_unsafe` flag participates in `compare_kindc` so alpha-collapse
cannot merge a safe inductive with an unsafe one.
- `check_inductive_member` skips strict-positivity (A3) for unsafe
inductives, matching Lean's behaviour.
- `KernelMode::meta_name`: extracts the underlying Lean `Name` from a
metadata field in Meta mode. Used by canonicity-validation and
diagnostic paths that need the source name (e.g. for the
compiler-shaped seed key in `canonical_aux_order`).
- `src/ix/graph.rs` no longer treats `InductiveVal.all` /
`RecursorVal.all` as outgoing edges. Mutual-block members that
don't structurally reference each other now split into minimal
SCCs as they should. New regression tests
(`inductive_all_members_are_not_graph_edges`,
`recursor_all_is_metadata_not_graph_edge`) lock the behaviour in.
- New `BRecOnCallSitePlan` and `brec_on_call_site_plans` /
`below_call_site_plans` on `CompileState`. `.brecOn`'s telescope
is `params, motives, indices, major, handlers` (vs `.rec`'s
`params, motives, minors, indices, major`) and motive permutation
/ drop is shared with the recursor plan. `.below` is the
motive-only `params, motives, indices, major` variant.
- `rec_name_to_brecon_name` / `rec_name_to_below_name` derive
matching aux names per source/canonical layout.
- `Tests/Ix/Kernel/CheckEnv.lean`: skips constants from
`Tests.Ix.Kernel.TutorialDefs` (pure-Lean fixtures that intentionally
don't roundtrip through Ix). Uses parallel quiet mode by default.
- `Tests/Ix/Kernel/Tutorial.lean` / `BuildPrimitives.lean`: import
`Ix.KernelCheck`, expose new primitives in `kernelPrimitives`.
- New whnf / def_eq unit tests covering each new reducer:
`whnf_string_legacy_back_empty_literal`,
`whnf_string_utf8_byte_size_literal`,
`def_eq_string_to_byte_array_empty`,
`whnf_bitvec_ult_zero_rhs_is_false`,
`whnf_bitvec_to_nat_ofnat_zero_is_zero`,
`whnf_decide_bitvec_lt_zero_is_false`,
`whnf_nat_ble_symbolic_succ_stays_stuck`,
`whnf_nat_predicates_reduce_one_symbolic_ctor_layer`,
`whnf_nat_mod_literal_by_symbolic_lower_bound`,
`def_eq_nat_add_literal_lhs_not_succ_chain`, plus block-coordination
tests `checking_one_definition_checks_sibling_block` and
`concurrent_definition_block_checks_share_result`.
- Replace two-pass standalone+muts buffered ingress with a single parallel
try_for_each that inserts directly into KEnv, bounding peak memory by
in-flight worker outputs instead of materializing every converted constant
before assembly.
- Add ixon_ingress_owned that consumes IxonEnv so it can be dropped before
the heavy kernel check loop runs. rs_kernel_check_consts adopts it and
also drops the ungrounded map and rust_env_arc early.
- Workers resolve Named metadata on demand via lookup_name instead of
cloning every payload during partition.
- IX_QUIET-gated phase timings (validate, lookups, partition, stream).
- Drive-by: bulk let-chain brace reformatting across compile, decompile,
kernel, and aux_gen to match current rustfmt.
Lays the groundwork for measuring the kernel performance plan (plans/okay-let-s-write-a-lucky-dolphin.md) per audit §10. Counters live in a new kernel::perf module; KEnv carries a PerfCounters field that is dumped from a Drop impl when IX_PERF_COUNTERS=1 is set. Unset is the production default — every increment short-circuits via a LazyLock<bool> so the cost is a single cached branch on the hot path. Wired into the cache get sites the audit identified: - whnf_cache hits/misses (whnf.rs around 201/211) - whnf_no_delta_cache hits/misses (whnf.rs around 437/446) - infer_cache and infer_only_cache hits/misses (infer.rs around 45/51) - def_eq_cache hits/misses (def_eq.rs around 137/149) - def_eq_failure set hits/inserts (def_eq.rs around 360) - per-constant peak/avg MAX_REC_FUEL consumption (recorded in TypeChecker::reset before the next constant resets it) Verified against lake test -- kernel-check-env --ignored: 192156/192156 constants pass in 251s (4% over the 242s baseline; overhead is the LazyLock branch + atomic load in the disabled path). P1: def_rank_id replaces def_weight_id for hint-priority comparison Audit Tier 1 #3 (kernel-perf-adversarial-audit-2026-04-26.md, §4.2): the prior u32 encoding mapped Abbrev to u32::MAX-1 and saturating-added Regular(h) to h+1, which collide at h ≥ u32::MAX-2. When that happens the delta-direction logic treats Abbrev and a maximally-heavy Regular as "same height" and unfolds both, instead of preferring Abbrev as Lean does (compare(d_t->get_hints(), d_s->get_hints()) at type_checker.cpp:910). Replace the u32 weight with a (class: u8, height: u32) tuple compared lexicographically: - Opaque / Theorem / unknown → (0, 0) - Regular(h) → (1, h) (height ordering preserved within class) - Abbrev → (2, 0) (strictly above every Regular) Update the two call sites (is_def_eq lazy-delta height comparison and lazy_delta_step). The map_or default for "missing head" is preserved as (u8::MAX, u32::MAX) — the branch is dead in practice (a_delta && b_delta imply both heads are present) but kept consistent with the prior u32::MAX sentinel. Two regression tests: - def_rank_abbrev_above_saturated_regular: Abbrev outranks Regular(u32::MAX) (the previous saturation collision). - def_rank_regular_orders_by_height: height monotonically orders Regular ranks within the class. Verified with lake test -- kernel-check-env --ignored: 192156/192156 in 256s (no regression vs the 242s pre-perf-counters baseline; the +14s is from the IX_PERF_COUNTERS=unset LazyLock branch added in the prior commit, not this change). P2: peel_proj_forall fast-paths syntactic Pi in projection inference Audit Tier 1 #2 (kernel-perf-adversarial-audit-2026-04-26.md, §7.2): infer_proj's two parameter-consuming loops (param peel and field peel) called self.whnf(&r)? unconditionally per iteration, on a body mutated by subst at the previous step. The whnf cache rarely hits between iterations and each call re-traverses the substituted body. Extract a peel_proj_forall(&r, err) helper that: - tries ExprData::All(..) syntactically first (no WHNF call), and - falls back to full self.whnf(e) only when the binder isn't already syntactic Pi. This mirrors Lean's inferProj at type_checker.cpp:582–610. Both projection-inference loops now call peel_proj_forall instead of unconditional whnf. Behaviorally equivalent — same WHNF semantics on miss, no semantic change otherwise. Verified with lake test -- kernel-check-env --ignored: 192156/192156 in 258s (parity with the post-pre-work, post-P1 baseline; no measurable regression and the cache-hit-rate counters will move on tactic-heavy workloads under IX_PERF_COUNTERS=1). P3a: WhnfFlags substrate (no behavior change) Lays the foundation for the Lean4Lean architectural alignment described in plans/okay-let-s-write-a-lucky-dolphin.md. Phase 3a is substrate-only — no call site is migrated to cheap mode yet, so behavior is unchanged. Adds: - WhnfFlags { cheap_rec, cheap_proj } with FULL and CHEAP consts and is_full(). CHEAP is currently equal to FULL until Phase 3c wires it. - whnf_core_with_flags (private): the existing whnf_core impl, now threading flags into recursive calls and try_iota_with_flags. - whnf_core / whnf_core_cheap (super): FULL/CHEAP wrappers. - whnf_no_delta_with_flags (private): the existing whnf_no_delta impl with the Prj branch gated on cheap_proj — falls back to full whnf on the projected value when not cheap. - whnf_no_delta (pub) / whnf_no_delta_cheap (super): wrappers. - try_iota_with_flags: gates major-premise WHNF and string-literal constructor reduction on cheap_rec. - try_proj_app_reduce_with_flags: gates projected-value WHNF on cheap_proj. Cache reads/writes (whnf_no_delta_cache, equiv-manager second-chance) are gated on flags.is_full(): cheap callers neither read nor write the cache, preserving the invariant that any cached entry is a fully-reduced normal form. Phase 3b will inline the projection branch into whnf_core to match Lean4Lean's two-layer architecture (refs/lean4lean/Lean4Lean/ TypeChecker.lean:266, 297). Phase 3c will flip CHEAP to enable cheap_proj and migrate specific def-eq sites. Verified with lake test -- kernel-check-env --ignored: 192156/192156 in 243s (matches the 242s baseline; substrate adds no measurable overhead when CHEAP == FULL). P3b: inline projection into whnf_core (Lean4Lean architectural alignment) Move the Prj branch from whnf_no_delta_with_flags into whnf_core_with_flags so our whnf_core matches Lean4Lean's whnfCore semantics exactly (refs/lean4lean/Lean4Lean/TypeChecker.lean:284-292, 337-341). Before this commit: whnf_core — beta + zeta + iota + cheap projection (recursive whnf_core on val, no delta) whnf_no_delta — whnf_core + FULL projection (full whnf on val) + native primitives + projection_definition + quotient whnf — whnf_no_delta + delta Lean4Lean's architecture has no whnf_no_delta layer. Their whnfCore includes projection, with the cheap_proj flag deciding whether the projected value uses whnfCore (cheap) or whnf (full). After this commit: whnf_core (with WhnfFlags) — beta + zeta + iota + projection (cheap_proj controls val reduction) whnf_no_delta — whnf_core(_, FULL) + native primitives + projection_definition + quotient whnf — whnf_no_delta + delta The bare-Prj branch in whnf_no_delta_with_flags is removed — whnf_core now handles it directly. The App-of-Prj branch stays in whnf_no_delta because whnf_core's loop returns once the outermost Prj is resolved; try_proj_app_reduce_with_flags gives one more attempt at the same cheap_proj policy when the outer expression is App(Prj, ...). Pure refactor, no semantic change with CHEAP == FULL. Verified with lake test -- kernel-check-env --ignored: 192156/192156 in 270s (within noise of the 243s pre-refactor baseline). Phase 3c will flip CHEAP to enable cheap_proj=true and migrate def-eq's lazy-delta sites surgically per Lean4Lean's pattern. P3c (postponed): document the HeaderParsedSnapshot regression P3a (substrate) and P3b (Lean4Lean architectural alignment) are committed. Phase 3c — flipping CHEAP to enable cheap_proj=true and migrating the def-eq lazy-delta sites — was attempted but reproduced 5 failures on chained projections in Lean.Language.Lean.HeaderParsedSnapshot.* even after P3b inlined the projection branch into whnf_core. The substrate is left in place; CHEAP stays equal to FULL until the regression's root cause is understood. Notes on the regression for the next investigator: - Failures: HeaderParsedSnapshot.{stx,result?,metaSnap,toSnapshot,ictx}, all with 'projection: type mismatch with declared struct'. - The struct `extends` a parent, so each projection is a chained Prj whose val is itself a Prj into the parent. - The error comes from infer.rs's infer_proj at the head-vs-struct_id address compare, after FULL whnf on val_ty. That whnf is FULL, but val_ty was inferred via paths that may have consulted a def-eq cache populated under cheap mode. Possible cache-poisoning suspect: def_eq_cache writes a `false` result for inputs whose lazy-delta loop bottomed out under cheap projections. The cache key uses raw a/b hashes, not cheap-reduced shapes, so a stored `false` is indistinguishable from a FULL `false` by future readers. - Lean4Lean does not have an analogous wide def_eq_cache; their failure cache is keyed only on same-spine pairs in lazyDeltaReductionStep. Future P3c iterations should either prove the cache poisoning theory incorrect or restrict def_eq_cache writes to FULL-derived results.
Port the cheapProj path for def-eq WHNF without letting cheap reduction results pollute full-mode caches. Def-eq now uses a private `WhnfFlags::DEF_EQ_CORE` path for structural and no-delta WHNF, matching Lean/lean4lean's cheap projection behavior while leaving public WHNF semantics full. Cheap def-eq calls are tracked with a depth counter so cheap `false` results are cached only in a cheap-only cache; cheap `true` remains safe to promote to the full def-eq cache. WHNF caches no longer lookup or mirror through def-eq equivalence roots. That relation is too broad for WHNF cache sharing because proof irrelevance, eta, and structure eta establish def-eq without guaranteeing identical WHNF. Add focused regression coverage for `HeaderParsedSnapshot` extended-structure projections.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
new kernel approach, closing #300