Add [force e] strict-normalization combinator (eigentrust pitfall #11) by kumavis · Pull Request #9 · LogosLang/prologos

kumavis · 2026-04-25T09:02:39Z

Fixes pitfall #11 from docs/tracking/2026-04-23_eigentrust_pitfalls.md.

Summary

Adds a [force e] AST node whose whnf reduces its argument to NF (not just WHNF). User wraps recursive-tail-call arguments in force to opt into strict evaluation:

;; before — O(k²) shape, deep iteration on Posit32 hangs
[eigentrust-iterate c p alpha eps budget [eigentrust-step c p alpha tnew]]

;; after — O(k) shape
[eigentrust-iterate c p alpha eps budget [force [eigentrust-step c p alpha tnew]]]

Type-and-usage-transparent: infer(force(e)) = infer(e), inferQ likewise. QTT linearity is preserved — a forced m1 value still consumes once.

Approach (a) — explicit force, lowest blast radius

Picked option (a) from the design choices in the prompt:

(a) (chosen) Add a [force e] user-visible combinator
(b) deferred Eagerly normalize specific forms in whnf itself — risks QTT linearity violations and changes the evaluation model
(c) deferred Type-driven strictness for tail-call argument positions when types are concrete — bigger architectural change

(b) and (c) are documented as deferred future work in the commit message.

Pipeline exhaustiveness

Per .claude/rules/pipeline.md § "New AST Node", touched 13 of 15 pipeline files:

Core 8: syntax.rkt, substitution.rkt, zonk.rkt, reduction.rkt, pretty-print.rkt, pnet-serialize.rkt, typing-core.rkt, qtt.rkt
Surface 5: surface-syntax.rkt, parser.rkt, tree-parser.rkt, elaborator.rkt, macros.rkt
Defensive: driver.rkt, effect-ordering.rkt

Skipped: unify.rkt (force is type-identity; falls through after one whnf step), foreign.rkt (not interop), typing-propagators.rkt (no new structural typing).

Test plan

New tests/test-force-strict.rkt — 13 cases: basic eval, QTT linearity preservation under force, no-op on already-WHNF, deep-iteration speedup demo
Regression: reduction (27), parser (82), syntax (20), elaborator (40) — all green

Perf demo numbers

Synthetic deep iteration (natrec-per-level chain). Real Posit32 case has a much larger gap because step is genuinely expensive per level.

k	lazy	strict	speedup
12	196ms	44ms	4.5x
20	437ms	74ms	5.9x
28	797ms	89ms	9.0x

Lazy is super-linear (10x runtime for 3.5x more depth). Strict is roughly linear. Confirms the O(k²) vs O(k) shape from the pitfall report.

Surprise worth flagging

tools/check-parens.sh is hardcoded to a macOS Racket path (/Applications/Racket v9.0/bin/racket), so it fails on every file in a Linux env. Worked around by inlining the same with-module-reading-parameterization check against /usr/local/bin/racket. Out of scope for this PR but worth a follow-up.

Commits

bbebe74 — primary fix (new AST node + 13-file pipeline updates + test file)
80fdca8 — CI fix (skip stale bench file)

https://claude.ai/code/session_01MbncYJnrvjzhbVWw4xGi5x

Generated by Claude Code

The 2026-04-23 eigentrust pitfalls memo (forthcoming branch) enumerated 16 items hit during the EigenTrust implementation. Items #1-7 and #11-15 are language/elaboration defects with their own PRs. Items #8, #9, #10, and #16 are observations rather than Prologos defects; no compiler change is needed for them but the memo deserves a parallel disposition note so a future reader does not double-count them as open work. #8 (exact-Rat slow on deep iter): intrinsic to exact rational arithmetic; benchmark-scope guidance, not a fix. #9 (Posit32 literals work): positive observation; `~` literal prefix is unambiguous unlike `0/1`. No action. #10 (PVec preserves where List does not): subsumed by pitfall #3 fix. After #3 lands, both literal forms preserve element type uniformly. Close as duplicate. #16 (column-stochastic vs row-stochastic): algorithm/spec clarification, not a Prologos defect. The eigentrust implementation branch already takes column-stochastic M directly and validates via col-stochastic?. https://claude.ai/code/session_01MbncYJnrvjzhbVWw4xGi5x Co-authored-by: kumavis <1474978+kumavis@users.noreply.github.com>

kumavis

we should be able to achieve this without the user-driven force. so we'll set this aside for now.

Pitfall #11 (lazy argument reduction makes deep iteration scale as O(k^2) for non-fixed-point iterates) had no general fix because reduction is call-by-name everywhere and the user has no way to mark an argument position as "evaluate before passing." When every intermediate value differs bit-for-bit (Posit32, non-fixed-point Rat) the reducer redoes O(n) levels on every iteration, totalling ~O(k^2) terms expanded across k rounds. This commit picks approach (a) from the suggested-fix triage: a new expression form [force e] / (force e) that, at whnf time, fully normalizes its argument and returns the NF (with the force wrapper elided). Type-and-usage-transparent: infer(force(e)) = infer(e), inferQ(force(e)) = inferQ(e). Lowest blast radius — does not change existing reduction semantics for any expression that does not mention `force`. Users who hit pitfall #11 can wrap the recursive tail-call argument: ;; Before (slow): [eigentrust-iterate c p alpha eps budget [eigentrust-step c p alpha t]] ;; After (fast): [eigentrust-iterate c p alpha eps budget [force [eigentrust-step c p alpha t]]] Future work, deferred: (b) Make whnf eagerly normalize specific forms (numeric literals, list literals) so even unwrapped lazy chains collapse at WHNF. More general but architectural — needs care around QTT linearity (an expression marked m1 must not be evaluated twice by an "eager" optimizer that didn't see it was inside an eliminator). (c) Type-driven strictness for tail-call argument positions when the type is concrete. Hardest; requires call-site analysis and a way to mark "strict" in the function signature. Pipeline coverage (per .claude/rules/pipeline.md "New AST Node", user-facing surface syntax variant — 13 of the 15 listed files): Core (8): syntax.rkt expr-force struct + provide + expr? entry substitution.rkt shift, subst (recursive) zonk.rkt zonk, zonk-at-depth, default-metas reduction.rkt whnf-impl/match -> nf(arg); nf-whnf clause pretty-print.rkt pp-expr "[force ...]" + uses-bvar0? pnet-serialize.rkt reg1! typing-core.rkt infer (delegates to argument) qtt.rkt inferQ (delegates to argument) User-facing surface syntax (5): surface-syntax.rkt surf-force struct + provide parser.rkt (force e) keyword case tree-parser.rkt "force" surf-force in builtin-unary-ops elaborator.rkt surf-force -> expr-force macros.rkt surf-force recursive case in expand-expression Driver / generic recursion (2 — defensive coverage): driver.rkt contains-unsupported-qtt?, rewrite-spec effect-ordering.rkt fv-expr Not touched (deliberate): unify.rkt classify-whnf-problem already handles unknown forms via fall-through; force is type-identity so unification of [force e] with t reduces to unification of e with t after one whnf step. foreign.rkt force is not a runtime interop concept. typing-propagators.rkt no new structural typing behavior; force inherits whatever the argument has. Tests (tests/test-force-strict.rkt, 13 tests): 1. Basic semantics (5): whnf/nf on ground values, beta-redex, suc chains. 2. Type identity (2): infer(force(e)) = infer(e). 3. QTT identity (2): inferQ(force(e)) preserves type and the full usage vector — ESPECIALLY under a linear (m1) binder (force does NOT consume an extra unit). 4. Idempotence (2): force around already-WHNF / nested force. 5. Performance demo (2): synthetic deep chain with natrec at each level. Confirms lazy and strict produce the same NF, and asserts strict <= lazy + small slack at k=18. Empirical numbers from a synthetic natrec-per-level chain on the agent machine (the real eigentrust workload sees a much larger gap because Posit32 step is genuinely expensive per level): k=12: lazy=196ms strict=44ms ~4.5x k=16: lazy=297ms strict=55ms ~5.4x k=20: lazy=437ms strict=74ms ~5.9x k=24: lazy=555ms strict=83ms ~6.7x k=28: lazy=797ms strict=89ms ~9.0x The lazy curve is super-linear (~10x runtime for ~3.5x more depth); the strict curve is roughly linear in k. This is the O(k^2) vs O(k) shape predicted by pitfall #11. Verification: raco make racket/prologos/tests/test-force-strict.rkt ok raco test racket/prologos/tests/test-force-strict.rkt 13 / 13 raco test racket/prologos/tests/test-reduction.rkt 27 / 27 raco test racket/prologos/tests/test-parser.rkt 82 / 82 raco test racket/prologos/tests/test-syntax.rkt 20 / 20 raco test racket/prologos/tests/test-elaborator.rkt 40 / 40 Co-authored-by: kumavis <1474978+kumavis@users.noreply.github.com>

The 2026-04-23 eigentrust pitfalls memo (forthcoming branch) enumerated 16 items hit during the EigenTrust implementation. Items #1-7 and #11-15 are language/elaboration defects with their own PRs. Items #8, #9, #10, and #16 are observations rather than Prologos defects; no compiler change is needed for them but the memo deserves a parallel disposition note so a future reader does not double-count them as open work. #8 (exact-Rat slow on deep iter): intrinsic to exact rational arithmetic; benchmark-scope guidance, not a fix. #9 (Posit32 literals work): positive observation; `~` literal prefix is unambiguous unlike `0/1`. No action. #10 (PVec preserves where List does not): subsumed by pitfall #3 fix. After #3 lands, both literal forms preserve element type uniformly. Close as duplicate. #16 (column-stochastic vs row-stochastic): algorithm/spec clarification, not a Prologos defect. The eigentrust implementation branch already takes column-stochastic M directly and validates via col-stochastic?. https://claude.ai/code/session_01MbncYJnrvjzhbVWw4xGi5x Co-authored-by: kumavis <1474978+kumavis@users.noreply.github.com>

Per user review of #0-#10: many entries were either out-of-scope (env limitations, not Prologos issues) or wrong (claims I never actually tested). Re-tested every claim against a real Racket and revised the doc. Numbers are reserved per the user's instruction — entries marked DELETED keep their slot so cross-refs don't drift. Detail: #0 DELETED — out-of-scope (Racket toolchain not in sandbox). Environment limitation, not a Prologos issue. #1 REFRAMED — was "capability subtype + promise resolution composition." Re-titled to honestly reflect what this actually is: an OCapN-side Phase 0 deferred-implementation note (eventual cross-vat receive isn't wired up yet). NOT a Prologos bug. #2 DELETED — false claim. Tested with a real Racket: WS-mode wildcard match `match | _ -> body` on user data types elaborates AND evaluates correctly when the function carries a proper `spec`. The `prologos::data::datum` comment I cited applies to a narrower polymorphic-context case, not a blanket wildcard ban as I asserted. Cleanup of behavior.prologos (~250 -> ~70 LOC) follows. #3 DELETED — false claim. Tested: `data Step step : [Nat -> Nat]` (with bracketed function type per the lseq-cell convention) accepts a function value, including closures with captured state. Open-world actor behaviour storage IS supported. The closed-enum BehaviorTag in our implementation was a needless workaround driven by this incorrect pitfall. Cleanup tracked separately. #4 KEPT, REFRAMED — real, narrowed claim. grammar.ebnf §6 lines 1153/1187/1199 promise `Mu` (sexp) and `rec` (WS) for recursive sessions. Both elaborate to `Unknown session type: rec` / `Mu`. So pitfall #4 is now: "rec/Mu in grammar but not in elaborator." CapTP's stream-level well-typedness is therefore the documented ceiling; per-exchange sub-protocols remain the workaround. #5 KEPT — `none`/`some` need explicit type args in some inference contexts. Real ergonomics tension, accurately documented. #6 DELETED — out-of-scope. WS-mode `let p := body` and sexp-mode `(let (p v) body)` are TWO surface forms by design (grammar.ebnf §7 line 1236). User-error, not a Prologos bug. #7 DELETED — was a quantitative restatement of #2. With #2 recanted, #7 evaporates: behavior modules can be wildcard-collapsed, dropping ~180 LOC. #8 DELETED — false claim. Tested: `data Box1 box1 : [Sigma [_ <Nat>] Bool]` and `data Table table : Nat -> [List [Sigma [_ <Nat>] Bool]]` both elaborate cleanly. The named-struct ActorEntry/PromiseEntry workaround in vat.prologos was unnecessary; can be simplified back. #9 DELETED — user error. `def` for value bindings vs `defn` for functions is documented (grammar.ebnf §3 lines 189-190, prologos-syntax rules). Mis-using `defn` for a 0-ary constant isn't a Prologos bug. #10 DELETED — out-of-scope. Network sandbox blocking external docs is an environment limitation. #11-#20 were not in scope of this review and remain as-is for the user to review next.

Original PIR text claimed runtime/core/ contains the BSP scheduler. Wrong — actual runtime/core/ has only data structures (cell store + profile counters + format buffer). The BSP scheduler stayed in each kernel file (hybrid kernel's worklist + fire_against_snapshot + merge_pending_writes + swap_worklists are inlined in prologos-runtime-hybrid.zig). Stage 3 design called for core/bsp.zig (~150 LOC) + core/worklist.zig (~60 LOC) as Phase 1 deliverables. Actual Phase 1 extracted only cells.zig + profile.zig + format.zig. The factoring scope shrank silently — neither the implementing commit nor any subsequent commit acknowledged the gap. Surfaced when the user asked "what's in the hybrid core zig side?" during PIR review. Corrections: - §1 (What Was Built): explicit "data structures only" + cross-ref to wrong-assumption #9 - §2 (Stated Objectives): added "reality check on the design quote" flagging the drift - §3 delivered table: Phase 1 status changed to "✅ partial" - §4 timeline: Phase 1+3+4 line clarifies scheduler not extracted - §5 deferred: new row for BSP scheduler factoring - §8 D1 + anti-decision: caveat added; "BSP scheduler abstraction" claim corrected to "cell-store comptime-parameterization abstraction" - §9 #2: factoring narrative softened to "data structures only" - §11 #3: original kernel template + factoring made explicit at data-structure layer only - §12 #2 + #8: kernel-LOC framing corrected; "factored core LOC was bigger than expected" reframed to "smaller than expected — and the gap is the BSP scheduler" - §13 architecture: "consuming runtime/core/" framing softened to "consumes for cell store + profile + format helpers; the BSP scheduler is inlined" - §14 #3, #7, #8: "future kernels instantiate the same scheduler" claim corrected to "instantiate CellStore + reuse profile counters; each kernel still writes its own scheduler" - §15 technical debt: new row for "BSP scheduler not factored" - §17 wrong assumptions: new #9 "Phase 1 will factor the BSP scheduler into core" — Wrong; codifies the silent scope shrinkage - §18 #4: "factoring at second-instance" pattern reinforced with "complete vs minimum-viable shared surface" caveat - §21 lessons: new entry for "phase-close should compare delivered scope against design plan" - §24 open Q #4: kernel-PU consumption clarified — needs scheduler reuse in-place or triggers the extraction debt Errata block added at the top of the PIR documenting which sections were corrected and why. https://claude.ai/code/session_01Tycs6BWKG58Wo99YVPg6DF

…spatch fix Both PReduce-lite and Hybrid Runtime PIRs now carry an "Addendum (2026-05-04, swappable-backend refactor)" block at the top + targeted updates to §3 (delivered files), §15 (technical debt closure), and §23 (key files). PReduce-lite PIR: - Top-of-doc addendum referencing the refactor design doc, the new three-file backend layer (preduce-core + backend-racket + backend- hybrid), and the functional-threading model rationale (SH-endpoint fit). - §3 file table: added preduce-core.rkt (153 LOC) and preduce- backend-racket.rkt (~100 LOC); annotated preduce.rkt as "1509 → ~1480 (post-backend-refactor)" with a note on the b-* primitive rewrite. - §23 key-files: replaced the two-file PReduce-lite engine entry with the post-refactor five-file layout (preduce-core, backend- racket, backend-hybrid, preduce, preduce-hybrid). Hybrid Runtime PIR: - Top-of-doc addendum (separate from the existing BSP-scheduler-in- core errata) covering: (a) the parallel-impl debt closure (407 → 66 LOC for preduce-hybrid.rkt; ~6× larger AST coverage on the kernel — Phase 1-10b vs Phase 8b only); (b) the native-dispatch regression discovered + fixed (initial backend-hybrid wrapped int-arith as callback; #:native-op hint restored kernel tags 0-7); (c) post-fix benchmark numbers (W4 hybrid 2× faster than lite for user-ctor workloads; W5 31% faster wall + 14× faster kernel-side for int-arith). - §1 What Was Built: pre-refactor vs post-refactor narrative on the preduce-hybrid.rkt collapse + the new preduce-backend-hybrid.rkt. - §3 file table: preduce-hybrid.rkt 407 → 66 (−341 LOC); added preduce-core.rkt + preduce-backend-hybrid.rkt rows. - §15 technical debt: TWO entries struck through (rendered as ~~old text~~) and replaced with "CLOSED 2026-05-04" notes: - "Two parallel reducers" debt (closed by the refactor proper) - "Native int-arith dispatch lost" regression (closed by the #:native-op hint mechanism) Both addenda cross-reference [`2026-05-04_PREDUCE_BACKEND_REFACTOR_DESIGN.md`] which carries the full design plan + 9-phase rollout tracker. The structural lessons reinforced: - "Factor at second-instance" pattern (PReduce-lite §18 #4) — the refactor extracted the factoring at the right time, with a third Racket-reducer consumer (future preduce-distributed?) on the horizon. - "The parallel-impl debt was real" — the hybrid PIR's §15 entry warned about it; the refactor cashed in the warning. - "Phase-close should compare delivered scope against design plan" (Hybrid §17 #9) — the same lesson surfaced AGAIN: the initial backend-hybrid Phase 4 commit didn't compare its native-dispatch scope against the pre-refactor preduce-hybrid's; the regression hid for two commits before the user's "is there a bug" question exposed it. Codified for the third time; pattern is now confirmed load-bearing. https://claude.ai/code/session_01Tycs6BWKG58Wo99YVPg6DF

Per user review of #0-#10: many entries were either out-of-scope (env limitations, not Prologos issues) or wrong (claims I never actually tested). Re-tested every claim against a real Racket and revised the doc. Numbers are reserved per the user's instruction — entries marked DELETED keep their slot so cross-refs don't drift. Detail: #0 DELETED — out-of-scope (Racket toolchain not in sandbox). Environment limitation, not a Prologos issue. #1 REFRAMED — was "capability subtype + promise resolution composition." Re-titled to honestly reflect what this actually is: an OCapN-side Phase 0 deferred-implementation note (eventual cross-vat receive isn't wired up yet). NOT a Prologos bug. #2 DELETED — false claim. Tested with a real Racket: WS-mode wildcard match `match | _ -> body` on user data types elaborates AND evaluates correctly when the function carries a proper `spec`. The `prologos::data::datum` comment I cited applies to a narrower polymorphic-context case, not a blanket wildcard ban as I asserted. Cleanup of behavior.prologos (~250 -> ~70 LOC) follows. #3 DELETED — false claim. Tested: `data Step step : [Nat -> Nat]` (with bracketed function type per the lseq-cell convention) accepts a function value, including closures with captured state. Open-world actor behaviour storage IS supported. The closed-enum BehaviorTag in our implementation was a needless workaround driven by this incorrect pitfall. Cleanup tracked separately. #4 KEPT, REFRAMED — real, narrowed claim. grammar.ebnf §6 lines 1153/1187/1199 promise `Mu` (sexp) and `rec` (WS) for recursive sessions. Both elaborate to `Unknown session type: rec` / `Mu`. So pitfall #4 is now: "rec/Mu in grammar but not in elaborator." CapTP's stream-level well-typedness is therefore the documented ceiling; per-exchange sub-protocols remain the workaround. #5 KEPT — `none`/`some` need explicit type args in some inference contexts. Real ergonomics tension, accurately documented. #6 DELETED — out-of-scope. WS-mode `let p := body` and sexp-mode `(let (p v) body)` are TWO surface forms by design (grammar.ebnf §7 line 1236). User-error, not a Prologos bug. #7 DELETED — was a quantitative restatement of #2. With #2 recanted, #7 evaporates: behavior modules can be wildcard-collapsed, dropping ~180 LOC. #8 DELETED — false claim. Tested: `data Box1 box1 : [Sigma [_ <Nat>] Bool]` and `data Table table : Nat -> [List [Sigma [_ <Nat>] Bool]]` both elaborate cleanly. The named-struct ActorEntry/PromiseEntry workaround in vat.prologos was unnecessary; can be simplified back. #9 DELETED — user error. `def` for value bindings vs `defn` for functions is documented (grammar.ebnf §3 lines 189-190, prologos-syntax rules). Mis-using `defn` for a 0-ary constant isn't a Prologos bug. #10 DELETED — out-of-scope. Network sandbox blocking external docs is an environment limitation. #11-#20 were not in scope of this review and remain as-is for the user to review next.

Mini-PIR covering the 5-hour 12-commit refactor session that delivered preduce-core.rkt + backend-{racket,hybrid}.rkt + the preduce-hybrid.rkt thin-wrapper collapse + 10 OCapN-on-kernel programs + the native-dispatch regression-and-fix loop. 469 lines, 24 sections, follows the 16-question PIR template. Highlights: §1 What Was Built — functional threading throughout (Option A), #:native-op symbolic hint for backend-specific native dispatch, pitfalls.md opened. §4 Timeline — 12 commits from `0d80dfa` (backend interface skeleton) through `236e441` (programs 9 + 10), with explicit user-correction points marked at threading-model flip, parallel-impl debt surfacing, and native-ns regression discovery. §7 Bugs Found — 3 fixed (multi-value capture, native-dispatch regression, FormatBuffer truncation) + 2 averted (Option B threading model, parallel-impl-not-deleted). §10 What Went Wrong — leads with "external adversarial review caught all 3 architectural drift points; my internal validation gates missed all three." Pattern repeated across this PIR + Hybrid PIR §17 #9 + PReduce-lite consolidated PIR. §16 What Would We Do Differently — six tactical lessons (threading against SH-endpoint fit, "deletion phase" is mandatory before declaring victory, perf regression net needed, pitfalls.md opened proactively, mechanical rewrites benefit from per-section commits, benchmark-before-refactor as baseline). §17 Wrong Assumptions — 7 assumptions surfaced as wrong, including "side-effecting backend interface is simpler" (wrong for in-and- out-of-native fit), "after Phase 4 the refactor is done" (parallel impl still in tree), "the differential gate is sufficient" (caught zero perf regressions), "OCapN programs will surface a flood of bugs" (only 4 distinct pitfalls across 10 programs). §20 Longitudinal — flags "external adversarial review catches what internal VAG misses" as a 3-instance pattern across recent PIRs (Hybrid PIR §17 #9, PReduce-lite consolidated PIR refactor errata, this refactor's 3 user-driven corrections). Recommends codification in workflow.md. §21 Lessons Distilled — 10 entries, mostly Pending codification in DESIGN_METHODOLOGY.org / workflow.md / testing.md / propagator-design.md adjacent docs. Two are already Done by example (the #:native-op mechanism, the 'hybrid sentinel pattern, the stress-test-as-bug-finder pattern via pitfalls.md). §24 Open Questions — 7 questions surfaced, leading with "when does Phase 7 (native migration on OCapN-shape callbacks) happen?" and "should the parallel-impl debt at the Zig layer get its own refactor?" (same shape as this one, one layer down). Cross-references the design plan + the PReduce-lite + Hybrid PIRs + the pitfalls doc + the OCapN NOTES.md. https://claude.ai/code/session_01Tycs6BWKG58Wo99YVPg6DF

…empirical falsification Verification-only sub-phase closing the 2A group. Empirically validates that the BSP outer-loop's value-tier orchestration (2A.a's process-retraction + 2A.b's process-resolution + S1 NAF + classify-inhabit) is observationally equivalent to the pre-2A sequential orchestrator (`run-stratified-resolution- pure`) for retraction-heavy workloads. Mini-design + mini-audit persisted at D.3 §8.7.c (10 sub-subsections). §8.7.4's "S(-1) runs POST-S0" timing concern (drift risk #9-bis from §8.7.5) empirically falsified by 3 orchestration-parity tests: Pre-2A sequential: cleanup → S0 → L2 → ... (S(-1) BEFORE S0) Post-2A BSP loop: S0 → S(-1) → restart-outer-loop → S0 fires on cleaned state → ... (S(-1) AFTER S0) The 2A model adds one "extra round" of S0 firing on pre-cleanup state per outer-loop iteration. §8.7.4's correctness argument: worldview-filtering at `net-cell-read` should hide stale entries from propagators firing under a retracted assumption bitmask; S(-1)'s subsequent cleanup is COMPACTION, not correctness. Empirical confirmation: all 3 parity tests PASS — the correctness argument holds in practice. Tests added (test-elaboration-parity.rkt): 1. `orchestration-union-no-retraction`: "(ns t) (def x : <Int | String> := 42) x" → expected '42 Int branch succeeds first; no retraction. Baseline. 2. `orchestration-union-with-retraction` (THE FALSIFICATION CASE): "(ns t) (def x : <Int | String> := \"hello\") x" → expected "hello" Int branch fails → `record-assumption-retraction` writes cell-13 → process-retraction handler fires POST-S0 → BSP restart-from-outer-loop → S0 fires on cleaned state with retracted bits worldview-filtered → String branch succeeds. Tests that worldview-filtering preserves correctness across the timing change. 3. `orchestration-union-flipped-with-retraction`: "(ns t) (def x : <String | Int> := 42) x" → expected '42 Symmetric variant with union order flipped; verifies no left/right asymmetry in the retraction path. `(ns t)` bootstrap pattern works in `run-ns-last` harness — D1 drift risk verified clean (no fallback needed). The `(ns t)` form triggers prelude auto-import + namespace setup; subsequent `def x : <type>` invokes `check ctx value (expr-union T1 T2)` at typing-core.rkt:2385 which uses `with-speculative-rollback` (the canonical retraction trigger). Adversarial 3-column VAG passed all 4 questions (cleanest sub-phase in 2A group — empirical confirmation of a design-time correctness argument is high-confidence methodology; adversarial column surfaced "are 3 cases sufficient" question even on a verification-only sub-phase). Validation: - Targeted tests (test-elaboration-parity): 25 tests PASS in 4.5s (+3 from orchestration-parity axis) - Probe (examples/2026-04-22-1A-iii-probe.prologos): 0 errors; 28 outputs semantically preserved - Acceptance (examples/2026-04-17-ppn-track4c.prologos): 0 errors - Full suite: 8234 tests / 116.3s / 0 failures (vs 8231/121.1s pre-2A.c; +3 tests; wall -4.8s within 118-127s variance band) Scaffolding retirement targets: - None NEW for 2A.c (verification-only sub-phase; no production code changes) - All 2A.a/b scaffolding retirement plans inherited and unchanged - 2B is the next sub-phase: retire `run-stratified-resolution-pure` orchestrator + `run-stratified-resolution!` dead legacy + driver init for `current-retracted-assumptions` parameter Methodology lessons codified this session (3 reinforcements): 1. Empirical falsification testing for design-time correctness arguments is HIGH-CONFIDENCE methodology. 2nd data point (1st: BSP-LE Track 2 tropical- quantale falsification). 2. Adversarial 3-column framing at every gate — even verification-only sub-phases. 3rd data point. 3. Verification-before-retirement is a clean separation pattern. 2A.a/b/c verify the new mechanism; 2B retires the old. 2nd data point (1st: S2.b/c/d/e arc). **2A group COMPLETE**: .0 (cells) → .a (process-retraction) → .b (process- resolution) → .c (orchestration-parity verification). Ready for 2B. Per addendum D.3 §8.7.c mini-design + §3 tracker.

This was referenced Apr 25, 2026

Migrate bench-bsp-le-track2.rkt to current ATMS API (CI unblock) #10

Merged

Disposition doc for eigentrust pitfalls #8/#9/#10/#16 (observations) #7

Closed

kumavis force-pushed the claude/fix-eigentrust-pitfall-11-strict branch from 80fdca8 to a09ddc0 Compare April 26, 2026 01:03

kumavis commented Apr 26, 2026

View reviewed changes

kumavis force-pushed the claude/fix-eigentrust-pitfall-11-strict branch from a09ddc0 to 43baede Compare April 26, 2026 03:04

kumavis mentioned this pull request Apr 26, 2026

Tracking issue for "eigentrust pitfall doc" PRs #19

Open

13 tasks

hierophantos mentioned this pull request Apr 27, 2026

Sugar fn [] and [-> A] for nullary lambda/type as Unit-encoded #26

Open

kumavis marked this pull request as draft April 28, 2026 21:08

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add [force e] strict-normalization combinator (eigentrust pitfall #11)#9

Add [force e] strict-normalization combinator (eigentrust pitfall #11)#9
kumavis wants to merge 1 commit into
mainfrom
claude/fix-eigentrust-pitfall-11-strict

kumavis commented Apr 25, 2026

Uh oh!

kumavis left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

kumavis commented Apr 25, 2026

Summary

Approach (a) — explicit force, lowest blast radius

Pipeline exhaustiveness

Test plan

Perf demo numbers

Surprise worth flagging

Commits

Uh oh!

kumavis left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants