Releases: ajaysurya1221/dorian

dorian v1.1.1 — golden path now blocks broken promises

18 Jun 19:09
27037d0

dorian v1.1.1 — the golden path now blocks broken promises

A small, focused follow-up to v1.1.0. No breaking changes — this changes a single
scaffold default; the warrant format, checker grammar, exit codes, and trust semantics are unchanged.

What changed

  • dorian init's starter claim is now load-bearing. In v1.1.0 the scaffolded starter claim sealed
    as non-load-bearing, so when a later change broke it the warrant folded to DEGRADED (exit 3).
    The scaffolded GitHub Action (fail_on: revoked) does not block on DEGRADED, so a first-time
    user following the golden path could break the promise and watch it silently ship. The starter is
    now load-bearing: breaking it folds to REVOKED (exit 4) and the default Action blocks the PR.
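
The difference between the two releases can be sketched in a few lines. This is a simplified model, not dorian's fold implementation: the Claim record, the fold() and action_blocks() helpers, and the three-state table are hypothetical and only cover the states and exit codes quoted in these notes.

```python
from dataclasses import dataclass

# Exit codes quoted in these notes: 0 = sealed green, 3 = DEGRADED, 4 = REVOKED.
EXIT_CODES = {"WARRANTED": 0, "DEGRADED": 3, "REVOKED": 4}


@dataclass
class Claim:
    # Hypothetical minimal claim record; not dorian's real schema.
    name: str
    load_bearing: bool
    broken: bool


def fold(claims: list[Claim]) -> str:
    """Fold per-claim results into one warrant state (simplified)."""
    if any(c.broken and c.load_bearing for c in claims):
        return "REVOKED"
    if any(c.broken for c in claims):
        return "DEGRADED"
    return "WARRANTED"


def action_blocks(state: str, fail_on: str = "revoked") -> bool:
    """Model only the default fail_on: revoked gate described above."""
    return fail_on == "revoked" and state == "REVOKED"


old_starter = [Claim("package-name", load_bearing=False, broken=True)]
new_starter = [Claim("package-name", load_bearing=True, broken=True)]

for label, claims in (("v1.1.0", old_starter), ("v1.1.1", new_starter)):
    state = fold(claims)
    verdict = "blocks" if action_blocks(state) else "ships"
    print(label, state, EXIT_CODES[state], verdict)
```

The same broken starter claim ships under the v1.1.0 default and blocks under the v1.1.1 default; only the load_bearing flag differs.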

Why it matters

dorian's whole pitch is "broken promises do not silently ship." The golden path (dorian init) is the
first thing a new user runs, and it should demonstrate the gate — not degrade quietly under the
default Action. This release makes the out-of-the-box experience match the promise. It also resolves an
internal inconsistency: the scaffolded change note already called these "load-bearing facts."

This was caught by the v1.1.0 public-install smoke test (install from PyPI → init → verify → break →
revalidate), which observed WARRANTED → DEGRADED (exit 3) where the product promise calls for a
block.

Install

pip install -U dorian-vwp        # 1.1.1
dorian --version

Quickstart (the golden path, now blocking)

cd your-repo
dorian init                                                # claims.json + change note + workflow
dorian verify dorian-change-note.md --claims claims.json   # seals green — exit 0
# ...later, a change breaks the sealed fact...
dorian revalidate --since <base>                           # REVOKED, exit 4 — the Action blocks the PR

GitHub Action

Unchanged. The scaffolded workflow pins ajaysurya1221/dorian/action@v1.1.1 and uses the default
fail_on: revoked, which now blocks when the starter claim is broken.

Security notes

No change to the security boundary. dorian init still writes files only — it never runs a checker,
never executes code, never writes outside the repo root, and never overwrites without --force. The
starter claim remains a read-only C3 checker (config-value:/path:). --deny-exec/--deny-shell
and checker_trust: base are fail-closed policies, not a sandbox.

Known limitations

  • dorian verifies explicit, checkable claims — not arbitrary correctness; it is not a sandbox.
  • One documented, reproduced real cross-PR catch (encode/httpx) — not broad real-world validation.
  • A load-bearing starter means that if you keep the example claim and legitimately change the fact it
    pins (e.g. rename your package), the next PR is REVOKED until you update the claim or supersede the
    warrant — which is the intended "you changed a sealed fact, acknowledge it" behavior. Replace the
    starter with your real load-bearing facts.

Tests / gates

Full suite green at 1.1.1 (the 1.1.0 suite plus 2 new dorian init tests pinning the load-bearing
starter and the end-to-end break → REVOKED → exit 4), ruff clean, wheel/sdist build + twine check
pass, and a public-install smoke test from PyPI.

Upgrade notes

pip install -U dorian-vwp. Already-sealed warrants are unaffected. If you want a previously
scaffolded starter claim to block, set its load_bearing to true in claims.json and re-seal with
dorian verify, or re-run dorian init --force to regenerate the scaffold.
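
For the manual route, a small script can flip the flag. This is a sketch under two assumptions taken from these notes rather than from a schema reference: claims.json has a top-level "claims" list, and each claim carries a load_bearing key. The make_load_bearing() helper is hypothetical, and the demo runs on a throwaway file.

```python
import json
import tempfile
from pathlib import Path


def make_load_bearing(claims_path: Path, index: int = 0) -> dict:
    """Set load_bearing to true on one claim and write the file back."""
    data = json.loads(claims_path.read_text(encoding="utf-8"))
    claim = data["claims"][index]  # assumed layout: {"claims": [...]}
    claim["load_bearing"] = True
    claims_path.write_text(json.dumps(data, indent=2) + "\n", encoding="utf-8")
    return claim


# Demo on a temporary file so no real claims.json is touched.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "claims.json"
    path.write_text(json.dumps({"claims": [{"load_bearing": False}]}), encoding="utf-8")
    updated = make_load_bearing(path)
    reread = json.loads(path.read_text(encoding="utf-8"))
    print(updated["load_bearing"], reread["claims"][0]["load_bearing"])
```

After editing the real file, re-seal with dorian verify dorian-change-note.md --claims claims.json as described above.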

dorian v1.1.0 — CI truth gate: dorian init + clearer PR comment

18 Jun 18:30
a27b540

dorian 1.1.0

A productization release that makes the first run easy. No breaking changes: the warrant
format, checker grammar, exit codes, and trust semantics are unchanged. 1.1.0 adds the missing
golden-path onboarding command, makes the PR-comment output customer-readable, and cleans up a
packaging-hygiene defect — it changes a command surface and output formatting, not verification.

What's new

  • dorian init (new command) — first-run scaffolding so a new user reaches a sealed warrant in
    minutes instead of hand-writing JSON. It writes three files:

    • a born-verifiable starter claims.json (a config-value: claim about the pyproject package
      name when available — the same checker family that caught encode/httpx #3592 — otherwise a
      path: existence claim about a file that is present, so the first dorian verify seals green, not red);
    • dorian-change-note.md, the change note those claims back;
    • .github/workflows/dorian.yml, the GitHub Action workflow, pinned to this package's version.

    It writes files only — it never runs a checker, never executes code, never writes outside the
    repo root, and never overwrites an existing file without --force (re-running is idempotent).
    Supports --dry-run (print the plan, write nothing) and the global --json (machine-readable
    summary). The scaffolded starter checker is always a read-only C3 family, never an executable
    C4/C5 — safe by default.

  • Customer-readable PR comment: dorian revalidate --format md (the body the GitHub Action
    posts) now leads with an explicit Status: Blocked / Passed / Errored verdict, an aggregate
    trust-change counts table (how many touched warrants this change moved to REVOKED / DEGRADED /
    TRUSTED / UNKNOWN), a sealed in <artifact>.warrant line under each affected artifact, and a
    verdict-keyed What to do: remediation line. The existing per-claim verdict table, fold
    transitions, recall section, and stats footer are unchanged; the comment stays deterministic (no
    timestamps, no absolute paths) and keeps the 160-char content-carryover bound on every detail cell.

Packaging hygiene

  • Guards against editor/file-sync duplicate artifacts. A .gitignore rule and a Hatch build
    exclude now keep stray … 2.py sync-duplicate files (the kind macOS / cloud sync leaves behind,
    e.g. suggestclaims 2.py) from ever being tracked or packaged into a wheel — even from a dirty
    working tree. To be precise: these files were never tracked in git, so they were never in a
    CI-built wheel or on PyPI; the guards additionally make a local uv build from a dirty tree
    provably clean (33 modules, no space-named files).

Tests & gates

Full suite green at 1.1.0 (883 tracked tests: the 1.0.2 suite plus 8 new dorian init tests, and
new assertions pinning the enhanced PR-comment output), ruff clean, wheel/sdist build +
twine check pass. The new dorian init golden path is covered end to end (init → verify
exits 0 and writes a warrant — a tool whose pitch is "don't ship false claims" must not ship a
false scaffold).

The reproducible benchmark suites are not re-run here: 1.1.0 adds a command, output formatting,
and a packaging cleanup, none of which touch the checker, binding, or fold code the suites measure,
so the recorded figures stand unchanged (last executed at 1.0.2; see
docs/BENCHMARK_CURRENT.md).

Honest scope (unchanged)

dorian has one documented, reproduced real cross-PR catch on frozen public SHAs (encode/httpx
requires-python floor; see docs/REAL_CATCH_LOG.md) — not broad
real-world validation. The benchmark suites are reproducibility evidence on frozen fixtures only.
--deny-exec/--deny-shell are fail-closed policies, not sandboxes; checker_trust: base is a
checker-source trust root, not a sandbox. dorian init and suggest-claims scaffold starter claims
for review — existence/value checks, not behavior (a gutted body keeps a symbol: claim green). A
warrant id is content-addressed and tamper-evident, but its body includes the seal timestamp, so a
fresh seal yields a different id — what reproduces is the outcome, not the id.
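
The last point can be illustrated with a generic content-addressing sketch. It assumes SHA-256 over canonical JSON and made-up body fields (artifact, state, sealed_at); dorian's actual hashing scheme and warrant body layout are not described in these notes, so treat every name here as hypothetical.

```python
import hashlib
import json


def warrant_id(body: dict) -> str:
    """Content-address a body: hash of its canonical JSON form."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical bodies: identical outcome, sealed one second apart.
first = {"artifact": "dorian-change-note.md", "state": "WARRANTED",
         "sealed_at": "2000-01-01T00:00:00Z"}
second = dict(first, sealed_at="2000-01-01T00:00:01Z")

print(first["state"] == second["state"])        # the outcome reproduces
print(warrant_id(first) == warrant_id(second))  # the id does not

# Tamper evidence: changing the body changes the id.
tampered = dict(first, state="REVOKED")
print(warrant_id(first) == warrant_id(tampered))
```

Because the timestamp is inside the hashed body, two seals of the same outcome are expected to produce different ids, while any edit to a sealed body is detectable.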

Install

pip install dorian-vwp        # 1.1.0 on PyPI

dorian init                                                # scaffold a starter setup
dorian verify dorian-change-note.md --claims claims.json   # seal the warrant — exit 0

dorian v1.0.2

17 Jun 09:41
8790329

dorian 1.0.2

An announcement-readiness hotfix on top of 1.0.1. No breaking changes: the warrant format,
checker grammar, exit codes, and trust semantics are unchanged. The point of this release is
public-facing coherence — one version across PyPI, README, the Action docs, and the GitHub
release — plus two edge-case bug fixes, a real SCA-scope fix, and CI credential hardening.

It resolves the post-1.0.1 use-and-see validation findings (Codex GPT-5.5 HOTFIX_BEFORE_ANNOUNCE).

Why this release exists

1.0.1 was real on GitHub but pip install dorian-vwp still served 1.0.0, while the README
documented 1.0.1-only commands (suggest-claims, export --in-toto). 1.0.2 is published to
PyPI so the documented install path and command surface agree with the package a new user gets.

Public-trust fixes

  • PyPI coherence (FINDING-01) — 1.0.2 is published to PyPI via the Trusted-Publisher
    workflow, so pip install dorian-vwp provides the documented command surface. README,
    release docs, and install examples point to one coherent version.
  • Immutable Action ref (FINDING-02) — the README Getting-Started snippet pinned
    dorian/action@main (a moving target). It and the action docs now use @v1.0.2. A new
    version-sync guard fails CI if dorian/action@main reappears in public copy-paste.
  • SCA audits the project, not the tool (FINDING-03): security.yml ran uvx pip-audit,
    which audited pip-audit's own isolated environment, not dorian's dependencies. It now
    exports the resolved project set (uv export --all-extras --dev --no-emit-project) and audits
    that, with a step that asserts the project deps (duckdb/anthropic/pytest) are actually present.
  • Checkout credential hardening (FINDING-04) — every actions/checkout step now sets
    persist-credentials: false; none of these workflows perform authenticated git operations
    (the release/publish lanes mint short-lived OIDC tokens, not git credentials).
  • Release-gate determinism — the release-gate test job disables the uv cache
    (enable-cache: false) so release validation runs from a clean resolve.

Bug fixes

  • export of an artifact literally named *.warrant (FINDING-05): dorian export
    unconditionally stripped a .warrant suffix, so dorian export foo.warrant looked for the
    wrong sidecar and failed. It now prefers reading the input as the artifact (so foo.warrant
    exports its own foo.warrant.warrant sidecar) and only treats a .warrant-suffixed input as
    a sidecar path when the artifact has no sidecar of its own. Regression-tested.
  • suggest-claims on PEP 263 (non-UTF8) Python (FINDING-06) — the file was read with a
    hardcoded UTF-8 decode, so valid Python declaring e.g. # -*- coding: latin-1 -*- was
    rejected. It now parses the file's bytes so the encoding cookie is honored; an unknown/declared-
    wrong codec surfaces as a clear usage error, not a traceback. Regression-tested.
  • symbol_index non-git robustness: pyproject_script_definers reached git ls-files
    unguarded, so with a precomputed definer map plus a [project.scripts] table a non-git
    checkout raised GitError instead of degrading to {} (breaking the documented "non-git
    yields {}" contract of claim_symbol_watch_paths). The git call is now guarded; the symbol
    binding is preserved, only the git-dependent script-target resolution degrades. Regression-tested.
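
The PEP 263 fix above relies on a general Python behavior: handing raw bytes to the parser lets it read the coding cookie itself. The sketch below shows that behavior with the standard library only. The UsageError class and parse_python_bytes() helper are hypothetical names, not dorian's internals.

```python
import ast


class UsageError(Exception):
    """Hypothetical stand-in for a clean CLI usage error."""


def parse_python_bytes(raw: bytes, filename: str = "<source>") -> ast.Module:
    """Parse source bytes so a PEP 263 coding cookie is honored."""
    try:
        return ast.parse(raw, filename=filename)
    except (SyntaxError, ValueError, LookupError) as exc:
        # An unknown or mismatched codec ends up here instead of a traceback.
        raise UsageError(f"cannot parse {filename}: {exc}") from exc


latin1_source = "# -*- coding: latin-1 -*-\nGREETING = 'café'\n".encode("latin-1")

# A hardcoded UTF-8 decode rejects this valid file...
try:
    latin1_source.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# ...while parsing the bytes honors the declared encoding.
tree = parse_python_bytes(latin1_source)
greeting = tree.body[0].value.value
print(utf8_ok, greeting)

# An undeclared or unknown codec becomes a usage error.
try:
    parse_python_bytes(b"# -*- coding: no-such-codec -*-\nx = 1\n")
    unknown_rejected = False
except UsageError:
    unknown_rejected = True
print(unknown_rejected)
```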

Docs / guards

  • Attestation-interop example (FINDING-07) — the in-toto example pinned a fixed
    "dorianVersion": "1.0.0"; it is now version-neutral, with a guard against a hardcoded value.
  • Stronger determinism test (FINDING-08) — the in-toto determinism test now asserts both
    CLI invocations succeed before comparing output bytes.
  • Stale-wording guard (FINDING-09) — the version-sync guard now also catches "until the PyPI
    release" (the narrower variant that slipped past the "first PyPI release" family).

Tests & gates

Full suite green at 1.0.2 (874 tracked tests, +5 new regression tests for the fixes above), ruff
clean, bandit clean, project-scope pip-audit clean, wheel/sdist build + twine check pass.
Both reproducible benchmark suites were re-run at 1.0.2 and reproduce the 1.0.1 figures exactly
(large-mutation P=R=0.93; binding-lifecycle to the same content-derived run_id), so
the hotfix touches no checker numeric behavior. The documented encode/httpx real catch was
independently reproduced on this build (verify exit 0 → revalidate exit 4 → REVOKED,
httpx-python-floor-38 BROKEN).

Honest scope (unchanged)

dorian has one documented, reproduced real cross-PR catch on frozen public SHAs — not broad
real-world validation. The benchmark suites are reproducibility evidence on frozen fixtures only.
--deny-exec/--deny-shell are fail-closed policies, not sandboxes; checker_trust: base
is a checker-source trust root, not a sandbox. suggest-claims checks existence/value, not
behavior (a gutted body keeps a symbol: claim green); the in-toto export is experimental and
not a registered in-toto predicate. A warrant id is content-addressed and tamper-evident, but
its body includes the seal timestamp, so a fresh seal yields a different id — what reproduces is
the outcome, not the id.

Install

pip install dorian-vwp        # 1.0.2 on PyPI

dorian v1.0.1

17 Jun 04:57
84d6e05

dorian 1.0.1

A hardening, DX, and interop patch on top of 1.0.0. No breaking changes; the warrant format,
checker grammar, exit codes, and trust semantics are unchanged. The headline addition is the
first documented, reproducible cross-PR catch on a public repo.

Proof

  • docs/REAL_CATCH_LOG.md — one documented catch on encode/httpx
    (BSD-3): a load-bearing claim sealed when requires-python was ">=3.8" was flipped
    WARRANTED → REVOKED (exit 4) by a real later upstream PR (#3592,
    "Drop Python 3.8 support") while httpx's own test suite stayed green and no stateless per-PR
    review would have re-opened the original claim. From-scratch reproduction included. This is
    one documented catch with honest scope, not a validation claim.

Security

  • C4 hardening: a pytest: checker nodeid whose file part is empty or starts with -
    (e.g. pytest:-pevil, pytest:--collect-only) is now rejected as ERROR(bad_program)
    before any subprocess spawns — it can no longer reach pytest as an option. Red/green tested.
  • C5 sqlite reconcile timeout: a pathological reconcile query (e.g. an infinite recursive
    CTE the read-only authorizer permits) is now bounded by a per-query wall-clock deadline and
    returns ERROR(query_timeout) instead of hanging the process — closing a DoS that survived
    --deny-exec (typed C5 reads are deliberately not exec-gated). Red/green tested.
  • Supply chain: every third-party GitHub Action is pinned to an immutable commit SHA (each
    verified via git ls-remote); a new security.yml runs pip-audit (SCA) and bandit
    (SAST), and Dependabot keeps the pins and deps fresh. bandit excludes only dorian's
    documented, policy-gated execution primitives, with a reason per check.
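
One common way to bound a runaway SQLite query by wall-clock time is the progress handler in Python's sqlite3 module. The sketch below uses that technique as an illustration of the C5 timeout idea. It is not necessarily how dorian implements its deadline, it does not model the read-only authorizer, and run_bounded_query() is a hypothetical helper; only the query_timeout label comes from these notes.

```python
import sqlite3
import time


def run_bounded_query(conn: sqlite3.Connection, sql: str, timeout_s: float):
    """Run one query under a wall-clock deadline via the progress handler."""
    deadline = time.monotonic() + timeout_s

    def past_deadline() -> int:
        # A non-zero return asks SQLite to abort the running statement.
        return 1 if time.monotonic() > deadline else 0

    conn.set_progress_handler(past_deadline, 1000)  # poll every 1000 VM steps
    try:
        return ("OK", conn.execute(sql).fetchall())
    except sqlite3.OperationalError as exc:
        if time.monotonic() > deadline:
            return ("ERROR", "query_timeout")
        return ("ERROR", str(exc))
    finally:
        conn.set_progress_handler(None, 0)  # clear the handler


# A recursive CTE with no termination condition: read-only, but never finishes.
INFINITE_CTE = (
    "WITH RECURSIVE n(x) AS (SELECT 1 UNION ALL SELECT x + 1 FROM n) "
    "SELECT count(*) FROM n"
)

conn = sqlite3.connect(":memory:")
print(run_bounded_query(conn, "SELECT 1 + 1", timeout_s=5.0))
print(run_bounded_query(conn, INFINITE_CTE, timeout_s=0.2))
```

The timeout is reported as an error rather than a broken claim, which matches the "ERROR is not BROKEN" rule stated elsewhere in these notes.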

Performance

  • dorian verify now builds the whole-repo Python-symbol and config-key indexes once per
    run instead of 2×/3×; output is byte-identical (pinned by a call-count spy + the existing
    watch/read-set assertions).

Features (additive, opt-in)

  • dorian suggest-claims <file.py> — a deterministic, zero-model C3 counterpart to
    suggest-data-checks. Proposes symbol: claims for non-private defs/classes and py-const:
    claims for literal module constants, runs each, and emits only the passing ones, so the
    {"claims": [...]} fragment seals unmodified. load_bearing defaults to false; ambiguous
    symbols are skipped. Scaffolding for review (existence/value, not behavior) — see
    docs/design/SUGGEST_CLAIMS.md.
  • dorian export --in-toto <artifact> — project a sealed .warrant into an experimental
    in-toto ClaimVerification Statement (deterministic, no signing, no network, zero deps).
    Experimental interop — see docs/ATTESTATION_INTEROP.md.

Docs / DX

  • The runnable "Try it in 30 seconds" demo is promoted above the fold and the Demo badge points
    at it; the illustrative /login story is clearly labeled.
  • New: docs/WRITING_GOOD_CLAIMS.md (worked good/bad claim pairs + the gutted-body ceiling),
    docs/SECURITY_AND_SAFE_RUNNERS.md (one safe public-fork recipe), a sharpened
    docs/USE_WITH_CLAUDE_CODE.md, and the public benchmark protocol reconciled with what shipped.

Honest scope (unchanged from 1.0.0)

The public benchmark is reproducibility evidence on frozen SHAs only, not general real-world
validation. Trigger and truth layers are reported separately, and ERROR is not BROKEN.
--deny-exec/--deny-shell are fail-closed policies, not sandboxes; checker_trust: base
is a checker-source trust root, not a sandbox. suggest-claims checks existence/value, not
behavior (a gutted body keeps a symbol: claim green); the in-toto export is experimental.
A warrant id is content-addressed and tamper-evident, but its body includes the seal
timestamp, so a fresh seal yields a different id — what reproduces is the outcome, not the id.

Install

pip install dorian-vwp

PyPI publishing is a separate step and is not performed by this GitHub Release; pip will
serve 1.0.0 until 1.0.1 is published to PyPI via the Trusted Publisher workflow.

dorian 1.0.0

16 Jun 09:05

dorian 1.0.0 includes deterministic, token-free claim revalidation for trusted repositories; structural checkers including py-signature:, py-const:, code:, and config-value:; a public micro-benchmark with machine-derived structural claims reproduced on named repositories pinned at frozen SHAs; and release provenance through GitHub artifact attestations.

The public benchmark is reproducibility evidence on frozen SHAs only, not a general real-world validation claim. Trigger and truth layers are reported separately, and ERROR is not BROKEN. --deny-exec and --deny-shell are fail-closed policies, not sandboxes. trusted-base is a checker-source trust root, not a sandbox.

Artifacts from the successful release gate (build + 3.11/3.12/3.13 test matrix + SHA-256 + Sigstore build-provenance attestation):

  • dorian_vwp-1.0.0-py3-none-any.whl
  • dorian_vwp-1.0.0.tar.gz
  • SHA256SUMS

PyPI publishing is separate and is not performed by the GitHub Release step.

dorian 1.0.0rc1

15 Jun 14:49

dorian 1.0.0rc1 — V1 release candidate

Prerelease. A release candidate, not final 1.0.0. dorian is a local-first,
deterministic, token-free verifier of the claims a change makes about its sources.

This RC lands the V1 strengthening program (research-report driven), independently
audited before tagging. All additions are additive and backward-compatible; default
behavior is unchanged unless you opt in.

Highlights

  • Python structural checkers: py-signature: and py-const: (C3 subgrammars, AST-based)
    close the symbol: existence ceiling and the string:/regex: comment-survival false-pass
    for Python signatures and constants. py-const: compares value and type (30 != 30.0,
    1 != True).
  • Semantic-context search: code: runs a regex over comment/docstring-stripped Python.
  • Checker-strength / claim-risk diagnostics: dorian bindings (human + JSON) classifies
    each checker's truth strength and flags kind-vs-strength adequacy mismatches; advisory only.
  • Multi-index binding — config keys in tracked .toml/.json widen re-check triggers
    (TOML/JSON only; YAML excluded to keep zero runtime deps), with provenance and ambiguity skip.
  • Trusted-base checker-source mode: revalidate --checker-source base / Action
    checker_trust: base: runs only base-approved checker specs for public/fork PRs.
  • dorian bench warrant-quality — offline per-claim mutation scoring (trigger vs verdict).
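
The py-const: idea (compare value and type, and read the constant from the AST rather than from raw text) can be sketched with the standard library. This is an illustration of the technique, not dorian's checker: strict_const_equal() and module_constant() are hypothetical helpers, and the sketch only handles simple top-level assignments and does not recurse into containers.

```python
import ast


def strict_const_equal(actual: object, expected: object) -> bool:
    """Equal only when both the type and the value match."""
    return type(actual) is type(expected) and actual == expected


def module_constant(source: str, name: str) -> object:
    """Return the literal value of a top-level `name = <literal>` assignment."""
    for node in ast.parse(source).body:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name) and target.id == name:
                    return ast.literal_eval(node.value)
    raise KeyError(name)


# The old value survives only in a comment; the real constant has changed.
SOURCE = "# TIMEOUT = 30\nTIMEOUT = 60\n"

text_search_passes = "TIMEOUT = 30" in SOURCE  # comment-survival false pass
ast_value = module_constant(SOURCE, "TIMEOUT")
print(text_search_passes, ast_value, strict_const_equal(ast_value, 30))

# Plain == would accept these pairs; the typed comparison does not.
print(30 == 30.0, strict_const_equal(30, 30.0))
print(1 == True, strict_const_equal(1, True))
```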

Security

  • checker_trust: base is a checker-source trust root, not a sandbox: a base-approved
    pytest: checker can still execute PR-head code — for untrusted forks pair it with
    deny_exec: true (or external isolation). --deny-exec/--deny-shell are fail-closed, not
    sandboxes. No pull_request_target; no secrets required or exposed.
  • The trusted-base exploit matrix (tests/test_trusted_base.py, 10 cases) proves PR-added /
    PR-modified executable checkers never execute (sentinel-verified) and a missing/tampered base
    sidecar fails closed (ERRORED, never BROKEN, never green).

Benchmark scope

Synthetic-suite reproducibility, not broad real-world validation. Numbers
(docs/BENCHMARK_CURRENT.md, measured at commit 33e9eaf): large-mutation 240 pairs P=R=0.93
(11.6× / 10.4× false-positive reduction vs file watchers); binding-lifecycle 808 pairs,
selection recall 0.54 → 1.00, alarm precision/recall 1.00, 0 errored; realworld 5 cases
(2 solved / 1 partial / 2 not_solved). Binding improves selection; it does not prove
semantic behavior (the gutted-body ceiling is shown, not solved). Historical v0.7.0 / 0.9.0
docs are preserved and labeled historical.

Remaining non-goals (post-V1, why this is an RC not final 1.0.0)

Real-repo public micro-benchmark (protocol-only); declarative/route/SQL binding indices;
YAML config binding; audit-event/state single-transaction atomicity; --extract stays
draft/experimental. See docs/V1_SCOPE.md.

Verification (release commit 24ae7c8)

  • uv run pytest → 735 passed (incl. slow: wheel build, real pytest subprocess, regex timeout)
  • uv run ruff check / ruff format --check → clean
  • uv build + clean-venv install → dorian 1.0.0rc1
  • benchmarks re-run identical; trusted-base exploit matrix passes
  • independently re-audited (6 read-only auditor lenses): 2 release-blocking doc-drift issues
    and several should-fixes found and repaired before tagging.

Invariants preserved: ERROR is never BROKEN; checkers are read-only (except C4/C5-shell);
binding selects re-check candidates only; zero runtime dependencies.

v0.11.0 — security hardening: deny-exec + C3 regex ReDoS backstop

15 Jun 05:02
78dcd1a

Security-hardening release. Opt-in, fail-closed controls over the executable checker families, a real fix for catastrophic-regex stalls, and honest security/validation docs — all backward-compatible (trusted/internal repos are unchanged).

Highlights

  • deny-exec / deny-shell execution policy: --deny-exec / --deny-shell (env DORIAN_DENY_EXEC / DORIAN_DENY_SHELL) on seal, verify, revalidate, and rebind. The executable families (C4 pytest:, C5 shell:) ERROR instead of running, gated at the single run_checker choke point. A blocked claim never seals (born-verifiable) and never silently passes revalidate (ERRORED, never VERIFIED/BROKEN). Fail-closed; not a sandbox.
  • C3 regex ReDoS backstop — the match runs in a spawned worker killed at spec.timeout_s, so catastrophic backtracking ERRORs (regex_timeout) instead of stalling. No new core runtime dependency.
  • Drift guards: test_version_sync (pyproject == __init__ == CLI) and test_cli_docs_sync (every README command resolves).
  • Honesty & onboarding docs: SECURITY.md, docs/SECURITY_BOUNDARY.md, validation-honesty / release-checklist / dependency / benchmark-reproducibility / shadow-pilot docs, 6 issue templates, a manual OIDC PyPI publish workflow, and a roadmap backlog with an explicit "do not build" list.
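
The ReDoS backstop follows a general pattern: run the match in a separate process and kill it at a deadline, because a backtracking regex cannot be interrupted from inside the same thread. The sketch below uses subprocess from the standard library. dorian's actual worker mechanism may differ, regex_with_timeout() is a hypothetical helper, and worker_failed is an invented label; only regex_timeout comes from these notes.

```python
import json
import subprocess
import sys

# Child program: read {"pattern", "text"} as JSON on stdin, print match result.
_WORKER = (
    "import json, re, sys\n"
    "job = json.load(sys.stdin)\n"
    "print(json.dumps(re.search(job['pattern'], job['text']) is not None))\n"
)


def regex_with_timeout(pattern: str, text: str, timeout_s: float):
    """Run re.search in a child process that is killed at the deadline."""
    try:
        done = subprocess.run(
            [sys.executable, "-c", _WORKER],
            input=json.dumps({"pattern": pattern, "text": text}),
            capture_output=True,
            text=True,
            timeout=timeout_s,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return ("ERROR", "regex_timeout")
    if done.returncode != 0:
        return ("ERROR", "worker_failed")  # e.g. an invalid pattern
    return ("OK", json.loads(done.stdout))


# A benign search finishes quickly.
print(regex_with_timeout(r"requires-python", "requires-python = '>=3.8'", 10.0))

# Nested quantifiers on a non-matching input backtrack catastrophically.
print(regex_with_timeout(r"(a+)+$", "a" * 40 + "b", 1.0))
```

Returning an error instead of a verdict keeps a stalled regex from being read as either a pass or a break.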

Adversarial audit

A five-lens review caught a real escape: dorian rebind re-runs checkers but did not receive the policy and had no flag, so it executed code under DORIAN_DENY_EXEC=1. Fixed, with a red-green-verified regression test.

Caveat

deny-exec removes code execution but not the self-attested-verdict problem; the public-fork-PR story remains the deferred trusted-base Action mode (designed, not built). dorian is for trusted/internal repositories, or --deny-exec everywhere else.

Verification

CI green on Python 3.11 / 3.12 / 3.13; 636 tests pass. Core runtime dependencies: none.

v0.10.0 — opt-in weak-binding gate

15 Jun 02:01

A pre-release adding an opt-in, seal-time review gate plus Claude Code onboarding. No change to default behavior, trust/claim state, the sidecar schema, fold policy, checker grammar, dependencies, or the GitHub Action.

New: --binding-gate off | warn | fail

On dorian verify and dorian seal (default off):

  • warn — seal, then print weak-binding diagnostics (exit 0).
  • fail — refuse the seal before writing any sidecar (atomic no-write, exit 4) when a claim carries a high-risk weak-binding flag: unbacked, short-literal, ambiguous-mention, trigger-only-symbol, unwatched-mention.
  • single-file is warn-only — the expected shape of an honest one-checker C3 path:/symbol:/regex: claim.
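
The three gate modes can be summarized as a small decision function. This is a sketch of the rules listed above, not dorian's code: binding_gate() and its return shape are hypothetical, and the exit code 0 on the non-blocking paths assumes the seal itself otherwise succeeds.

```python
# The five high-risk flags named above; single-file is deliberately absent.
HIGH_RISK = {
    "unbacked",
    "short-literal",
    "ambiguous-mention",
    "trigger-only-symbol",
    "unwatched-mention",
}


def binding_gate(mode: str, flags_by_claim: dict[str, set[str]]):
    """Return (write_sidecar, exit_code, diagnostics) for one seal attempt."""
    if mode not in {"off", "warn", "fail"}:
        raise ValueError(f"unknown gate mode: {mode}")
    if mode == "off":
        return True, 0, []
    diagnostics = [
        f"{claim}: {flag}"
        for claim, flags in sorted(flags_by_claim.items())
        for flag in sorted(flags)
    ]
    blocking = any(flags & HIGH_RISK for flags in flags_by_claim.values())
    if mode == "fail" and blocking:
        return False, 4, diagnostics  # refuse before writing any sidecar
    return True, 0, diagnostics


print(binding_gate("warn", {"claim-1": {"unbacked"}}))
print(binding_gate("fail", {"claim-1": {"unbacked"}}))
print(binding_gate("fail", {"claim-1": {"single-file"}}))
```

Note that the function never changes a claim's truth or trust state; it only decides whether the sidecar is written.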

Weak binding is a false-confidence smell: the gate never marks a claim false and never touches trust or claim state. dorian bindings stays a pure linter.

Also in this release

  • Claude Code onboarding: docs/USE_WITH_CLAUDE_CODE.md + a runnable examples/claude-code/ pack.
  • docs/PUBLIC_BENCHMARK_PROTOCOL.md (pre-registered, protocol-only) and docs/TRUSTED_BASE_ACTION_DESIGN.md (design only, not implemented).
  • docs/START_HERE.md navigation index.

CI green on Python 3.11 / 3.12 / 3.13. Not published to PyPI.

🤖 Generated with Claude Code

v0.9.1 — evidence wording polish

14 Jun 14:06

A patch release — no product behavior change. It makes the v0.9.0 evidence harder to misread.

  • README badge/version updated: the stale hardcoded v0.8 badge is replaced with a dynamic GitHub-release badge (reads the latest release, so it can't drift again).
  • "proof" softened to "evidence": the throwaway self-host demo line now reads "evidence that the mechanism can catch this kind of checked break on real code," preserving the caveat that it was a throwaway demo, not a committed artifact or benchmark figure.
  • "5 limitations closed" clarified: v0.9.0 closed five binding-hardening issues, not all of dorian's strategic limitations. These remain open and visible: ambiguous symbols are skipped unless disambiguated; Python-only definition indexing; revalidate stays sidecar/store-driven; auto-binding is verify-only; trigger coverage is not behavior proof; real-world reproductions are scoped, not universal validation.
  • Benchmark counts were re-checked against the machine outputs (summary JSON / generated doc / README) and agree (808 pairs · 63 domains · 122 artifacts · 122 claims · 408 mutations; checker_path recall 0.54 → bound 1.00; verdict precision 1.00, zero false BROKEN). The full benchmark was not rerun — this patch changed docs/version only — so the historical run provenance honestly stays dorian 0.9.0.
  • Real-world scope clarified: a cited discovery catalog (12 public problems across 5 categories) was added to the protocol so the 5 report cases (3 hermetic reproductions + 2 documented) read as a selection, not a cherry-pick.
  • Trigger-vs-truth framing preserved: binding widens the re-check trigger; the checker still decides truth; a watched file changing never makes a claim BROKEN by itself.
  • Added a docs-polish guard test; full gate green (582 passed), ruff clean. The v0.9.0 tag was not modified.

v0.9.0 — binding hardening + trigger-vs-truth evidence

14 Jun 11:55
4d3de74

Symbol-binding correctness + honest evidence for it. Binding widens when a claim is re-checked; the checker still decides truth — a watched file changing never makes a claim BROKEN by itself.

Highlights

  • Symbol→defining-file binding (5 limitations closed) + 3 TDD-hardened precision nits (C4 nodeid whitespace parity; backticked common-word over-binding guard; ambiguous pyproject-script target rejection).
  • Binding-lifecycle benchmark — 808 known-truth (artifact, mutation) pairs over 63 domains, scored in two layers:
    • selection (re-check trigger) recall 0.54 → 1.00 vs a pre-binding checker-path watcher, at 1.00 precision (vs 0.92 for the rejected "any file with the token" shortcut) — the false-TRUSTED trigger reduction.
    • verdict (BROKEN) precision 1.00, zero false BROKEN; ERRORED reported separately, never an alarm.
    • the gutted-body ceiling is shown, not solved: an existence checker fires the trigger but yields 0 BROKEN; only a behavior checker catches it.
      dorian bench binding-lifecycle · docs/BENCHMARK_BINDING_LIFECYCLE.md
  • Offline public-case reproductions of still-open problem classes — solved 2 / partial 1 / not_solved 2; labels derived from dorian's actual behavior. dorian bench realworld-usecases · docs/REALWORLD_USECASES.md
  • README + roadmap refreshed; CodeRabbit review (1 critical, 1 major, 2 minor) addressed; a CI rmtree-race in the bench teardown fixed.

In-fixture, synthetic results — a reproducible demonstration of the mechanism on these suites, not a claim about any real repository. Full gate green; matrix CI 3.11/3.12/3.13 + CodeRabbit pass.