feat(proof): backend-assurance harness for prim__eqChar (1/4)#129
Merged
Conversation
Adds the first backend-assurance harness deliverable for the epic #87 Tier C class-(J) believe_me axioms in src/abi/Boj/SafetyLemmas.idr. Covers two axioms — charEqSound (SafetyLemmas.idr:53) and charEqSym (SafetyLemmas.idr:60) — both of which ride on Idris2's prim__eqChar primitive, which the 2026-05-18 audit classified as class (J), irreducible in-language. The harness shrinks the trusted base by replacing "we trust the backend" with two complementary artefacts: • elixir/test/backend_assurance/prim_eq_char_test.exs — BEAM-native property tests via stream_data over the legal codepoint space (0..0xD7FF + 0xE000..0x10FFFF, surrogates excluded). Covers soundness (=:= true ⇒ same value), reflexivity, symmetry, and a distinct-codepoints negative-edge invariant guarding against an over-accepting Char-equality (e.g. case-insensitive collation leakage). Plus an explicit boundary table (0, 0x7F, 0x80, 0xD7FF, 0xE000, 0xFFFF, 0x10000, 0x10FFFF). All pass on Elixir 1.18.4 / OTP 27 locally; 4 properties + 1 test in 0.09s. • docs/backend-assurance/prim__eqChar.md — trusted-extraction prose argument citing the Idris2 0.8.0 lowering for both shipping backends: Chez (R6RS char=? is a total equivalence relation per §11.11) and BEAM (=:= on codepoint integers per erlang(3)). Discusses surrogates (excluded), normalisation (out of scope), and case folding (out of scope — and explicitly guarded by the negative-edge property). The in-language proof does NOT change. The %unsafe / believe_me sites in SafetyLemmas.idr stay; PROOF-NEEDS.md now cites the harness in a new Backend-assurance-evidence column on the axiom audit table, with the other three primitives (prim__strToCharList / prim__strAppend / prim__strSubstr) marked _pending_ for follow-on PRs in the same shape. CI: new .github/workflows/backend-assurance.yml runs the harness on PRs touching SafetyLemmas.idr or the harness itself. Pinned to erlef/setup-beam v1.24.0 (commit fc68ffb9) per estate post-rotation workflow-pinning policy. Concurrency-cancel + path-filtered to keep the shared Actions pool cheap. Constraints honoured (per PROOF-NEEDS.md backend-assurance framing): • No new believe_me — harness is purely external evidence. • No in-language discharge — ruled out by 2026-05-18 audit. • Two backends — Chez argued from R6RS, BEAM exercised by tests. • Honest framing — if the primitive ever fails its axiom, adjust the axiom; don't paper over. Refs #87 (Tier C backend-assurance campaign). One PR per primitive; this is 1/4. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
🔍 Hypatia Security ScanFindings: 30 issues detected
View findings[
{
"reason": "Stale AI session file -- delete",
"type": "stale",
"file": "GEMINI.md",
"action": "delete",
"rule_module": "root_hygiene",
"severity": "medium"
},
{
"reason": "Issue in quality.yml",
"type": "missing_workflow",
"file": "quality.yml",
"action": "create",
"rule_module": "workflow_audit",
"severity": "high"
},
{
"reason": "Issue in security-policy.yml",
"type": "missing_workflow",
"file": "security-policy.yml",
"action": "create",
"rule_module": "workflow_audit",
"severity": "medium"
},
{
"reason": "Action hyperpolymath/standards/.github/workflows/governance-reusable.yml@main needs attention",
"type": "unpinned_action",
"file": "governance.yml",
"action": "pin_sha",
"rule_module": "workflow_audit",
"severity": "high"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/sanctify-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/academic-workflow-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/fireflag-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/ephapax-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/bofig-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/hesiod-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
}
]Powered by Hypatia Neurosymbolic CI/CD Intelligence |
🔍 Hypatia Security ScanFindings: 30 issues detected
View findings[
{
"reason": "Stale AI session file -- delete",
"type": "stale",
"file": "GEMINI.md",
"action": "delete",
"rule_module": "root_hygiene",
"severity": "medium"
},
{
"reason": "Issue in quality.yml",
"type": "missing_workflow",
"file": "quality.yml",
"action": "create",
"rule_module": "workflow_audit",
"severity": "high"
},
{
"reason": "Issue in security-policy.yml",
"type": "missing_workflow",
"file": "security-policy.yml",
"action": "create",
"rule_module": "workflow_audit",
"severity": "medium"
},
{
"reason": "Action hyperpolymath/standards/.github/workflows/governance-reusable.yml@main needs attention",
"type": "unpinned_action",
"file": "governance.yml",
"action": "pin_sha",
"rule_module": "workflow_audit",
"severity": "high"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/sanctify-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/academic-workflow-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/fireflag-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/ephapax-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/bofig-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/hesiod-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
}
]Powered by Hypatia Neurosymbolic CI/CD Intelligence |
🔍 Hypatia Security ScanFindings: 30 issues detected
View findings[
{
"reason": "Stale AI session file -- delete",
"type": "stale",
"file": "GEMINI.md",
"action": "delete",
"rule_module": "root_hygiene",
"severity": "medium"
},
{
"reason": "Issue in quality.yml",
"type": "missing_workflow",
"file": "quality.yml",
"action": "create",
"rule_module": "workflow_audit",
"severity": "high"
},
{
"reason": "Issue in security-policy.yml",
"type": "missing_workflow",
"file": "security-policy.yml",
"action": "create",
"rule_module": "workflow_audit",
"severity": "medium"
},
{
"reason": "Action hyperpolymath/standards/.github/workflows/governance-reusable.yml@main needs attention",
"type": "unpinned_action",
"file": "governance.yml",
"action": "pin_sha",
"rule_module": "workflow_audit",
"severity": "high"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/sanctify-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/academic-workflow-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/fireflag-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/ephapax-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/bofig-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/hesiod-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
}
]Powered by Hypatia Neurosymbolic CI/CD Intelligence |
This was referenced May 20, 2026
Merged
hyperpolymath
added a commit
that referenced
this pull request
May 21, 2026
…im__strSubstr (4/4) (#139) ## Summary Continues the backend-assurance campaign for epic #87 Tier C. Slice 1/4 (`prim__eqChar`, covering `charEqSound` + `charEqSym`) landed via #129. This PR bundles **slices 3/4 and 4/4** per explicit owner direction — both `prim__strAppend` and `prim__strSubstr` are length-arithmetic on opaque strings and share the same Chez-vs-BEAM trusted-extraction argument shape, so a combined slice trades the nominal "one PR per primitive" rule for review ergonomics. Slice 2/4 (`prim__strToCharList`, covering `unpackLength`) remains pending; ships as a follow-on PR. Discharges **external evidence** for two class-(J) `believe_me` axioms in `src/abi/Boj/SafetyLemmas.idr`: | Site | Axiom | Type | |------|-------|------| | `:226` | `appendLengthSum` | `length (s ++ t) = length s + length t` | | `:233` | `substrLengthBound` | `LTE (length (substr start len s)) len` | The in-language proof does **not** change. The `%unsafe` / `believe_me` sites stay in source. The harness shrinks the trusted base from "we trust the backend" to "we read the lowering and randomly tested the operation against the property". ## Artefacts (matching the #129 template) - `elixir/test/backend_assurance/prim_str_append_test.exs` — BEAM property tests over the legal codepoint space (surrogates excluded). 4 properties + 1 boundary table covering all four UTF-8 encoding widths (ASCII / Latin-1 / CJK / astral). - `elixir/test/backend_assurance/prim_str_substr_test.exs` — BEAM property tests for the length bound. 4 properties + boundary table covering len=0, start-past-end, full-string slice tightness, and multi-byte codepoint slices. - `docs/backend-assurance/prim__strAppend.md` — trusted-extraction prose. Chez argued from R6RS §6.7 (string model) + §11.12 (`string-append`); BEAM argued from UTF-8 prefix-free codepoint counting via Elixir `<>` + `String.length/1`. - `docs/backend-assurance/prim__strSubstr.md` — trusted-extraction prose. Chez argued from R6RS §11.12 (`substring`) + codegen clamp; BEAM argued from `String.slice/3` codepoint-counting clamp. - `docs/backend-assurance/README.md` — coverage-table rows flipped from _pending_ for both primitives. - `PROOF-NEEDS.md` — backend-assurance evidence column filled. Class column updated to **J ✓** marker for the externally- validated rows (a bare **J** indicates harness not yet landed). Audit-table rationales updated to note the harness; remaining- axiomatic-surface justifications reworded; standing-note "primitives landed" line bumped to reflect 3/4 done. The existing CI workflow `.github/workflows/backend-assurance.yml` already path-filters on `elixir/test/backend_assurance/**` and `docs/backend-assurance/**`, so it picks up the new test modules automatically — no workflow change in this PR. ## Constraints honoured Per `docs/backend-assurance/README.md`: - **No new `believe_me`.** External evidence only. - **No in-language discharge.** Ruled out by 2026-05-18 audit. - **Two backends.** Chez argued from R6RS; BEAM exercised by tests. - **Honest framing.** Surrogates excluded; normalisation out of scope and documented; negative `len`/`start` ruled out by upstream `Nat` typing. ## Test plan - [x] Elixir parse-check on both new test files (syntax-only, via `Code.string_to_quoted!`) - [x] BEAM operation smoke on boundary strings — `String.length` / `String.slice/3` direct, outside the test harness. 25 `appendLengthSum` cases + 80 `substrLengthBound` cases all hold. - [ ] CI backend-assurance workflow: BEAM property tests green - [ ] CI: existing workflows unaffected (no path-filter collisions) > **Note:** this session could not run `mix test --only backend_assurance` > locally — the asdf `elixir 1.18.4-otp-25` install on the work > machine is empty (only `.mix` archives, no binary), and system > `elixir 1.14` fails the project's `~> 1.15` requirement. Same gap > pattern slice 1/4 followed; CI is the gate. ## Heads-up for the reviewer `PROOF-NEEDS.md` now uses a **J ✓** marker on the audit-table `Class` column for the three externally-validated axioms (`charEqSound`, `charEqSym` from #129; `appendLengthSum`, `substrLengthBound` from this PR). The fifth axiom (`unpackLength`) remains bare **J** until slice 2/4 lands. The wording change is deliberate per owner direction — the underlying classification is still class (J); the marker reflects external validation. Refs #87 (Tier C backend-assurance campaign). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
hyperpolymath
added a commit
that referenced
this pull request
May 21, 2026
…— campaign complete (#140) ## Summary **Closes the backend-assurance campaign** for epic #87 Tier C. With this slice merged alongside #129 (slice 1/4) and #139 (slices 3/4 + 4/4), all 5 class-(J) `believe_me` axioms in `src/abi/Boj/SafetyLemmas.idr` have external evidence. Discharges external evidence for the final class-(J) axiom: | Site | Axiom | Type | |------|-------|------| | `:218` | `unpackLength` | `(s : String) -> length (unpack s) = length s` | The in-language proof does **not** change. The `%unsafe` / `believe_me` site stays in source. The harness shrinks the trusted base from "we trust the backend" to "we read the lowering and randomly tested the operation against the property". ## Artefacts (matching the #129 / #139 template) - `elixir/test/backend_assurance/prim_str_to_char_list_test.exs` — BEAM property tests over the legal codepoint space (surrogates excluded). 4 properties + 1 boundary table: - length preservation: `length(String.to_charlist(s)) == String.length(s)` - empty-string corner: `String.to_charlist("") == []` - round-trip identity: `to_string(to_charlist(s)) == s` - charlist element-type sanity (legal codepoints, no surrogates) - `docs/backend-assurance/prim__strToCharList.md` — trusted-extraction prose. Chez argued from R6RS §6.7 (string model) + §11.12 (`string->list`); BEAM argued from UTF-8 prefix-free codepoint counting via Elixir `String.to_charlist/1` + `String.length/1`. - `docs/backend-assurance/README.md` — coverage-table row flipped from _pending_. - `PROOF-NEEDS.md` — backend-assurance evidence column filled for `unpackLength`. Class column updated to **J ✓** marker (convention introduced by #139). Site line number corrected from `:211` to `:218` (the file has shifted since the 2026-05-18 audit; previous PRs didn't sync this row). The existing CI workflow `.github/workflows/backend-assurance.yml` already path-filters on `elixir/test/backend_assurance/**` and `docs/backend-assurance/**`, so it picks up the new test module automatically — no workflow change in this PR. ## Constraints honoured Per `docs/backend-assurance/README.md`: - **No new `believe_me`.** External evidence only. - **No in-language discharge.** Ruled out by 2026-05-18 audit. - **Two backends.** Chez argued from R6RS; BEAM exercised by tests. - **Honest framing.** Surrogates excluded; normalisation out of scope and documented; round-trip property guards against a backend whose charlist conversion drops or duplicates codepoints in a way that happens to preserve length. ## Test plan - [x] Elixir parse-check on the new test file (syntax-only, via `Code.string_to_quoted!`) - [x] BEAM operation smoke on boundary strings — 9 strings (empty, ASCII, Latin-1, CJK, astral, max codepoint, etc.); direct `String.to_charlist/1` + `String.length/1` + `to_string` calls outside the test harness. Length preservation and round-trip identity both hold across all 9. - [ ] CI backend-assurance workflow: BEAM property tests green - [ ] CI: existing workflows unaffected (no path-filter collisions) > **Note:** this session could not run `mix test --only backend_assurance` > locally — the asdf `elixir 1.18.4-otp-25` install on the work > machine is empty (only `.mix` archives, no binary), and system > `elixir 1.14` fails the project's `~> 1.15` requirement. Same gap > pattern slices 1/4 and 3/4+4/4 followed; CI is the gate. ## Expected conflicts with #139 Both this PR and #139 branch off `origin/main` (per saved stacked-PR-avoidance feedback), so they share a base. Conflicts on merge are limited and trivial: | File | Conflict shape | Resolution | |---|---|---| | `docs/backend-assurance/README.md` | This PR flips the `prim__strToCharList` row; #139 flips the `prim__strAppend` and `prim__strSubstr` rows. | Different rows of the same table — line-level conflict only; accept both. | | `PROOF-NEEDS.md` audit table | This PR updates row 3; #139 updates rows 1/2/4/5 and adds the **J ✓** explanation paragraph. | Different rows; accept both. | | `PROOF-NEEDS.md` remaining-axiomatic-surface table | This PR updates `unpackLength` row; #139 updates the other four. | Different rows; accept both. | | `PROOF-NEEDS.md` standing note "Primitives validated so far" | #139 bumps it to 3/4; reviewer should apply 4/4 (campaign complete) once both merge. | Manual one-line edit during merge of whichever PR lands second. | The **J ✓** marker is used on row 3 here even though the explanation paragraph is contributed by #139 — post-#139-merge the explanation is in place, and the marker's meaning is self-evident from the value. ## Heads-up for the reviewer After this PR + #139 merge, the 4-slice backend-assurance harness is **complete**. All 5 class-(J) axioms in `SafetyLemmas.idr` have external evidence: - `charEqSound`, `charEqSym` via `prim__eqChar` (#129) - `unpackLength` via `prim__strToCharList` (this PR) - `appendLengthSum` via `prim__strAppend` (#139) - `substrLengthBound` via `prim__strSubstr` (#139) Future trusted-extraction work would only be needed if BoJ ships a new codegen backend (e.g. JavaScript, native ML). Refs #87 (Tier C backend-assurance campaign — closes the harness sequence). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
First backend-assurance deliverable for epic #87 Tier C — the
class-(J)
believe_meaxioms insrc/abi/Boj/SafetyLemmas.idrthatthe 2026-05-18 audit ruled irreducible in Idris2 0.8.0 (opaque
Char/
Stringprimitives, foreign-backed). CoverscharEqSoundandcharEqSym— both ride onprim__eqChar.The in-language proof does not change. The
%unsafe/believe_mesites stay inSafetyLemmas.idr. The harness shrinks thetrusted base from "we trust the backend" to "we read the backend
lowering and randomly tested the operation".
Artefacts
elixir/test/backend_assurance/prim_eq_char_test.exs— BEAM-nativeproperty tests via
stream_dataover the legal codepoint space(
0..0xD7FF∪0xE000..0x10FFFF, surrogates excluded). Coverssoundness, reflexivity, symmetry, distinct-codepoints negative
edge, plus an explicit boundary table. 4 properties + 1 unit test;
all pass locally on Elixir 1.18.4 / OTP 27 in ~0.09 s.
docs/backend-assurance/prim__eqChar.md— trusted-extraction proseargument citing Idris2 0.8.0's lowering for each shipping backend:
Chez (R6RS
char=?is a total equivalence relation per §11.11) andBEAM (
=:=on codepoint integers pererlang(3)). Discussessurrogates (excluded), normalisation (out of scope), case folding
(out of scope and explicitly guarded by the negative-edge property).
docs/backend-assurance/README.md— campaign overview, coveragetable (1 done / 3 pending), and constraints.
tests/backend-assurance/README.md— pointer to where the runnableharness lives.
PROOF-NEEDS.md— new Backend-assurance-evidence column on theaxiom audit table; other three primitives marked pending.
.github/workflows/backend-assurance.yml— path-filtered CI jobrunning
mix test --only backend_assuranceon PRs touchingSafetyLemmas.idr, the harness, or workflow itself. Pinned toerlef/setup-beam@fc68ffb9(v1.24.0) per estate post-rotationworkflow-pinning policy. Concurrency-cancel guard included.
Scope & coordination
This PR is one slice of four:
prim__eqChar(this PR) — backscharEqSound,charEqSymprim__strToCharList— backsunpackLength(pending follow-on)prim__strAppend— backsappendLengthSum(pending)prim__strSubstr— backssubstrLengthBound(pending)One PR per primitive keeps review chunks small. Estimate ~3 months
end-to-end.
Constraints honoured
Per the backend-assurance framing in
PROOF-NEEDS.md:believe_me— purely external evidence.the axiom; don't paper over.
Test plan
cd elixir && mix test --only backend_assurance— 4 properties + 1 test passRefs #87 (Tier C backend-assurance campaign). One PR per primitive;
this is 1/4.
🤖 Generated with Claude Code