v0.9.0 — binding hardening + trigger-vs-truth evidence

@ajaysurya1221 released this 14 Jun 11:55
· 58 commits to main since this release
4d3de74

This release tightens symbol-binding correctness and adds honest evidence for it. Binding only widens the conditions under which a claim is re-checked; the checker still decides truth, so a watched file changing never makes a claim BROKEN by itself.
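The trigger-vs-truth split can be sketched in a few lines. This is a minimal illustration, not dorian's implementation: `Verdict`, `recheck`, and every parameter name are hypothetical, and only the TRUSTED / BROKEN / ERRORED labels come from these notes.

```python
from enum import Enum
from typing import Callable


class Verdict(Enum):
    # Labels taken from the release notes; the enum itself is hypothetical.
    TRUSTED = "TRUSTED"
    BROKEN = "BROKEN"
    ERRORED = "ERRORED"


def recheck(
    changed_files: set[str],
    watched_files: set[str],
    checker: Callable[[], bool],
    previous: Verdict,
) -> Verdict:
    """Hypothetical sketch: a file change only triggers; the checker decides."""
    # Trigger layer: no watched file changed, so nothing is re-checked.
    if not (changed_files & watched_files):
        return previous
    # Truth layer: only the checker's result can produce BROKEN.
    try:
        ok = checker()
    except Exception:
        # A checker that cannot run is reported separately, not as an alarm.
        return Verdict.ERRORED
    return Verdict.TRUSTED if ok else Verdict.BROKEN
```

In this sketch a changed watched file with a passing checker stays TRUSTED, which is the property the release emphasizes: widening the watched set can cause more re-checks, never more BROKEN verdicts on its own.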

Highlights

  • Symbol→defining-file binding (5 limitations closed) + 3 TDD-hardened precision nits (C4 nodeid whitespace parity; backticked common-word over-binding guard; ambiguous pyproject-script target rejection).
  • Binding-lifecycle benchmark — 808 known-truth (artifact, mutation) pairs over 63 domains, scored in two layers:
    • selection (re-check trigger): recall rises from 0.54 to 1.00 against a pre-binding checker-path watcher, at 1.00 precision (vs 0.92 for the rejected "any file with the token" shortcut). This is the false-TRUSTED trigger reduction: fewer claims stay TRUSTED only because no re-check fired.
    • verdict (BROKEN) precision 1.00, zero false BROKEN; ERRORED reported separately, never an alarm.
    • the gutted-body ceiling is shown, not solved: an existence checker fires the trigger but yields 0 BROKEN; only a behavior checker catches it.
      dorian bench binding-lifecycle · docs/BENCHMARK_BINDING_LIFECYCLE.md
  • Offline public-case reproductions of still-open problem classes — solved 2 / partial 1 / not_solved 2; labels derived from dorian's actual behavior. dorian bench realworld-usecases · docs/REALWORLD_USECASES.md
  • README + roadmap refreshed; CodeRabbit review (1 critical, 1 major, 2 minor) addressed; a CI rmtree-race in the bench teardown fixed.
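For readers who want the selection-layer metrics spelled out, here is a minimal sketch of how precision and recall over re-check triggers could be computed. The function name and the toy mutation ids are assumptions made for illustration; the numbers are not the benchmark's data, and the actual scoring lives in `dorian bench binding-lifecycle`.

```python
def precision_recall(selected: set[str], relevant: set[str]) -> tuple[float, float]:
    """Score a trigger strategy (hypothetical helper).

    selected: mutations for which the strategy fired a re-check.
    relevant: mutations that truly affect the claim (known truth).
    """
    true_pos = len(selected & relevant)
    # Convention assumed here: an empty set scores 1.0 rather than dividing by zero.
    precision = true_pos / len(selected) if selected else 1.0
    recall = true_pos / len(relevant) if relevant else 1.0
    return precision, recall


# Toy data only: four mutations truly affect the claim, "m5" does not.
relevant = {"m1", "m2", "m3", "m4"}

path_watcher = {"m1", "m2"}                      # misses half the relevant mutations
token_shortcut = {"m1", "m2", "m3", "m4", "m5"}  # fires on an irrelevant file too
binding = {"m1", "m2", "m3", "m4"}               # fires exactly where it should

for name, selected in [
    ("path watcher", path_watcher),
    ("token shortcut", token_shortcut),
    ("binding", binding),
]:
    p, r = precision_recall(selected, relevant)
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

The toy sets mirror the shape of the reported comparison: a narrow watcher loses recall, an over-broad shortcut loses precision, and the goal is to reach full recall without giving up precision.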

In-fixture, synthetic results — a reproducible demonstration of the mechanism on these suites, not a claim about any real repository. Full gate green; matrix CI 3.11/3.12/3.13 + CodeRabbit pass.
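The gutted-body ceiling mentioned in the highlights can be illustrated with a small, self-contained sketch. The two checkers below are hypothetical stand-ins, not dorian's checkers: one only confirms that a function is still defined, the other actually calls it.

```python
import ast

# A toy function and a "gutted" mutation that keeps the signature but not the logic.
ORIGINAL = "def add(a, b):\n    return a + b\n"
GUTTED = "def add(a, b):\n    return None\n"


def existence_check(source: str, name: str) -> bool:
    """Pass if a function called `name` is defined anywhere in the source."""
    tree = ast.parse(source)
    return any(
        isinstance(node, ast.FunctionDef) and node.name == name
        for node in ast.walk(tree)
    )


def behavior_check(source: str, name: str) -> bool:
    """Pass only if the function still returns the expected result."""
    namespace: dict = {}
    exec(source, namespace)  # acceptable here: the source is a trusted fixture
    return namespace[name](2, 3) == 5


# Both mutations touch the defining file, so a re-check would be triggered either way.
print(existence_check(GUTTED, "add"))  # still defined, so this checker passes
print(behavior_check(GUTTED, "add"))   # the call no longer returns 5, so this fails
```

The trigger fires in both cases; what differs is the checker's ability to notice. An existence-style check cannot report BROKEN for a body that was emptied out, which is the ceiling the benchmark shows rather than solves.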