Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
82 changes: 82 additions & 0 deletions .github/workflows/backend-assurance.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,82 @@
# SPDX-License-Identifier: MPL-2.0
# Copyright (c) 2026 Jonathan D.A. Jewell (hyperpolymath) <j.d.a.jewell@open.ac.uk>
#
# Backend-Assurance Harness (epic #87 Tier C)
#
# Runs the BEAM-native property tests that validate the backend
# operations underlying the 5 class-(J) believe_me axioms in
# src/abi/Boj/SafetyLemmas.idr. See docs/backend-assurance/ and
# PROOF-NEEDS.md for the campaign framing.
#
# This is *external* evidence — it does not change the in-language
# proof; the believe_me sites stay in source. The harness shrinks the
# trusted base from "we trust the backend" to "we read the lowering and
# randomly tested the operation".
#
# Triggers only on changes that could affect the harness or the
# proof contract it backs.

name: Backend-Assurance Harness

on:
push:
branches: [main]
paths:
- 'src/abi/Boj/SafetyLemmas.idr'
- 'elixir/test/backend_assurance/**'
- 'elixir/mix.exs'
- 'elixir/mix.lock'
- 'docs/backend-assurance/**'
- '.github/workflows/backend-assurance.yml'
pull_request:
branches: [main]
paths:
- 'src/abi/Boj/SafetyLemmas.idr'
- 'elixir/test/backend_assurance/**'
- 'elixir/mix.exs'
- 'elixir/mix.lock'
- 'docs/backend-assurance/**'
- '.github/workflows/backend-assurance.yml'

concurrency:
group: backend-assurance-${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: ${{ github.event_name == 'pull_request' }}

permissions:
contents: read

jobs:
property-test:
name: BEAM property tests
runs-on: ubuntu-latest
timeout-minutes: 10
defaults:
run:
working-directory: elixir
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2

- name: Set up BEAM (Elixir + OTP)
uses: erlef/setup-beam@fc68ffb90438ef2936bbb3251622353b3dcb2f93 # v1.24.0
with:
elixir-version: '1.18.4'
otp-version: '27.0'

- name: Cache deps + _build
uses: actions/cache@d4323d4df104b026a6aa633fdb11d772146be0bf # v4.2.2
with:
path: |
elixir/deps
elixir/_build
key: backend-assurance-${{ runner.os }}-${{ hashFiles('elixir/mix.lock') }}
restore-keys: |
backend-assurance-${{ runner.os }}-

- name: Fetch deps
run: mix deps.get

- name: Compile
run: mix compile --warnings-as-errors

- name: Run backend-assurance property tests
run: mix test --only backend_assurance --color
39 changes: 32 additions & 7 deletions PROOF-NEEDS.md
Original file line number Diff line number Diff line change
Expand Up @@ -72,19 +72,44 @@ All five are class **(J)** — genuinely unavoidable in Idris2 0.8.0
(opaque `Char`/`String` primitives, foreign-backed operations). See the
**Axiom Audit (2026-05-18)** table above for per-site detail.

| Axiom | Site | Justification |
|-------|------|---------------|
| `charEqSound` | `SafetyLemmas.idr:53` | Soundness of `prim__eqChar` — backend primitive correctness |
| `charEqSym` | `SafetyLemmas.idr:60` | Symmetry of `prim__eqChar` — backend primitive correctness |
| `unpackLength` | `SafetyLemmas.idr:211` | `prim__strToCharList` preserves length — backend primitive correctness |
| `appendLengthSum` | `SafetyLemmas.idr:219` | `prim__strAppend` length semantics — not reducible at type level |
| `substrLengthBound` | `SafetyLemmas.idr:226` | `prim__strSubstr` length bound — not reducible at type level |
| Axiom | Site | Justification | Backend-assurance evidence |
|-------|------|---------------|----------------------------|
| `charEqSound` | `SafetyLemmas.idr:53` | Soundness of `prim__eqChar` — backend primitive correctness | `docs/backend-assurance/prim__eqChar.md` + `elixir/test/backend_assurance/prim_eq_char_test.exs` |
| `charEqSym` | `SafetyLemmas.idr:60` | Symmetry of `prim__eqChar` — backend primitive correctness | `docs/backend-assurance/prim__eqChar.md` + `elixir/test/backend_assurance/prim_eq_char_test.exs` |
| `unpackLength` | `SafetyLemmas.idr:211` | `prim__strToCharList` preserves length — backend primitive correctness | _pending_ |
| `appendLengthSum` | `SafetyLemmas.idr:219` | `prim__strAppend` length semantics — not reducible at type level | _pending_ |
| `substrLengthBound` | `SafetyLemmas.idr:226` | `prim__strSubstr` length bound — not reducible at type level | _pending_ |

Note: `logSafeBounded` in SafeAPIKey.idr no longer uses `believe_me` directly;
it calls the documented SafetyLemmas axioms via structural proof.
`charEqSound` is consumed by `Safety.idr` (`shellContra`, `sqlContra`);
`unpackLength` by `SafeAPIKey.idr` (`sufficientEntropyNonEmpty`).

### Backend-assurance harness

The "Backend-assurance evidence" column above cites the external
evidence reducing each class-(J) axiom's trusted base. Two artefact
shapes per primitive:

- **Trusted-extraction validation** under `docs/backend-assurance/<primitive>.md`
— prose argument citing each shipping backend's lowering of the
primitive (Idris2 0.8.0 sources for Chez, OTP/R6RS for the runtime
operations) and why the lowering satisfies the axiom.
- **Property-test harness** under `elixir/test/backend_assurance/`
— BEAM-native property tests via `stream_data` covering the
codepoint / string spaces. Run via `mix test --only backend_assurance`
and gated in CI by `.github/workflows/backend-assurance.yml`.

This harness does not change the in-language proof — the `believe_me`
sites stay in `SafetyLemmas.idr`. The harness shrinks the trusted base
from "we trust the backend" to "we read the lowering and randomly
tested the operation".

First primitive landed: `prim__eqChar` (covering both `charEqSound`
and `charEqSym`). Remaining three (`prim__strToCharList`,
`prim__strAppend`, `prim__strSubstr`) tracked under epic #87 Tier C
backend-assurance campaign — one PR per primitive.

## Priority Going Forward

**LOW** — All named proof obligations closed. Future work:
Expand Down
62 changes: 62 additions & 0 deletions docs/backend-assurance/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,62 @@
# Backend-Assurance Harness

External evidence for the class-(J) `believe_me` axioms in
`src/abi/Boj/SafetyLemmas.idr`. None of these documents or tests change
the in-language proof — the `believe_me` sites stay in source. The
harness shrinks the trusted base by replacing "we trust the backend"
with "we have read the backend lowering and randomly tested the
operation against the claimed property".

Per `PROOF-NEEDS.md` (Axiom Audit 2026-05-18):

> The 5 string/char-primitive axioms are **irreducible within Idris2**
> (opaque primitive types). They cannot be "proved away" in-language;
> the only reduction path is to shrink the trusted base by validating
> `prim__eqChar` / `prim__strToCharList` / `prim__strAppend` /
> `prim__strSubstr` against the chosen backend (Chez/BEAM) via an
> external trusted-extraction or property-test harness, then citing
> that evidence here.

This directory is where the trusted-extraction validation lives. The
companion property-test harness lives under
`elixir/test/backend_assurance/` (BEAM half) and is wired into CI via
`.github/workflows/backend-assurance.yml`.

## Coverage

| Primitive | Axioms covered | Trusted-extraction | Property test |
|--------------------|--------------------------------------|----------------------------------|----------------------------------------------------------------|
| `prim__eqChar` | `charEqSound`, `charEqSym` | `prim__eqChar.md` | `elixir/test/backend_assurance/prim_eq_char_test.exs` |
| `prim__strToCharList` | `unpackLength` | _pending_ | _pending_ |
| `prim__strAppend` | `appendLengthSum` | _pending_ | _pending_ |
| `prim__strSubstr` | `substrLengthBound` | _pending_ | _pending_ |

Each row is delivered as one PR per primitive. `prim__eqChar` is the
first; the other three follow the same shape.

## Constraints

- **No new `believe_me`.** External evidence only. The 5 axioms stay.
- **No in-language discharge.** Already ruled out by the 2026-05-18 audit.
- **Two backends, not one.** The trusted-extraction doc must cite both
Chez Scheme (idris2's default codegen) and BEAM (Erlang/Elixir, where
BoJ's REST surface runs). Property tests cover BEAM; Chez is argued
prose-side from R6RS semantics.
- **Honest framing.** If a primitive turns out to *not* satisfy its
axiom under some input class, document the failure and adjust the
axiom — do not paper over it.

## Running locally

cd elixir
mix deps.get
mix test --only backend_assurance

The harness has no Idris2 dependency; it validates the *backend*
operations the Idris2 axioms ultimately depend on.

## References

- `PROOF-NEEDS.md` — axiom audit and class-(J) framing.
- `src/abi/Boj/SafetyLemmas.idr` — the `believe_me` declarations.
- epic #87 Tier C — campaign tracking.
121 changes: 121 additions & 0 deletions docs/backend-assurance/prim__eqChar.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,121 @@
# Backend-Assurance: `prim__eqChar`

Trusted-extraction validation for the two class-(J) axioms over
Idris2's `prim__eqChar` primitive:

- `charEqSound : (c1, c2 : Char) -> c1 == c2 = True -> c1 = c2`
(`src/abi/Boj/SafetyLemmas.idr:53`)
- `charEqSym : (x, y : Char) -> (x == y) = (y == x)`
(`src/abi/Boj/SafetyLemmas.idr:60`)

Both are `%unsafe` `believe_me` declarations in Idris2 0.8.0 because
`Char` is an opaque primitive type with no in-language induction
principle. This document argues — by inspecting the backend lowerings
that BoJ actually ships against — that the believed properties hold.

The companion property test
(`elixir/test/backend_assurance/prim_eq_char_test.exs`) exercises the
BEAM half of this argument over the codepoint space.

## What `prim__eqChar` is

`prim__eqChar : Char -> Char -> Int` is a primitive arithmetic
operation declared in `Core.Primitives` in the Idris2 compiler. The
`==` method on `Char` calls it via the `Eq Char` instance in
`Prelude.EqOrd`:

public export
Eq Char where
x == y = boolOp prim__eqChar x y
x /= y = not (x == y)

so the question reduces to: does `prim__eqChar` satisfy soundness and
symmetry on each shipping backend?

## Chez Scheme backend (Idris2 default codegen)

In `Compiler.Scheme.Chez`, `prim__eqChar` is one of the arithmetic
primitives lowered directly to the R6RS predicate `char=?` (via the
generic `op` translation table that maps `EQ CharType` to
`char=?`). On Chez 9.x:

- **Soundness.** R6RS §11.11 specifies `char=?` returns `#t` iff its
arguments denote the same Unicode codepoint. The Char value is the
codepoint; two Chars for which `char=?` returns `#t` are the same
value. Hence `c1 == c2 = True -> c1 = c2` holds.
- **Symmetry.** R6RS §11.11 specifies `char=?` is a total
equivalence relation. Symmetry is part of "equivalence relation";
hence `(x == y) = (y == x)` holds.

Both properties are part of the Scheme standard and are upheld by the
Chez implementation. No further evidence needed beyond citing the
standard.

## BEAM backend (Erlang / Elixir, where BoJ runs)

BoJ's REST surface is Elixir on the BEAM. The Idris2 proofs are
compile-time-only artefacts; the runtime characters that flow through
the system are BEAM codepoints (Erlang integers in the range
`0..0x10FFFF` excluding the surrogate gap `0xD800..0xDFFF`). On BEAM:

- **Char encoding.** Erlang represents a character as an
integer codepoint. Strings are either lists-of-integers ("traditional"
Erlang strings) or UTF-8 binaries — but the per-character equality
operation that matters for the axioms is integer equality on
codepoints.
- **Lowering of `==`.** Integer equality on the BEAM is `=:=`
(the strict-equality operator). It returns `true` iff both
operands are the same term; for two integer codepoints `a` and `b`
this is exactly value equality.
- **Soundness.** `a =:= b = true` implies `a` and `b` are the same
integer codepoint, hence the same `Char`. Trivially.
- **Symmetry.** `=:=` is documented in OTP `erlang(3)` as a total
commutative operator on terms; for any `a`, `b`, `a =:= b` ⟺
`b =:= a`.

The property test exercises both properties over random codepoints
sampled from the legal range (excluding surrogates), plus explicit
boundary codepoints (`0`, ASCII boundary `0x7F`, BMP boundaries
`0xD7FF`/`0xE000`, BMP/astral boundary `0xFFFF`/`0x10000`, max
`0x10FFFF`).

## Why this isn't circular

The harness does not call `prim__eqChar`. It calls Erlang `=:=`
directly on integers. The argument is: *the operation that Idris2
lowers `prim__eqChar` to on the BEAM is `=:=` on the codepoint
integer*, so demonstrating `=:=` satisfies the properties is
sufficient. The trusted-extraction step is reading the lowering; the
property-test step is verifying the operation behaves as the lowering
claims.

For Chez, we do not run a Scheme harness — R6RS is sufficient
documentary evidence that `char=?` is a total equivalence relation. If
BoJ ever ships a backend whose Char equality is not a built-in
equivalence-checked primitive, this document gets a new section and a
matching property test.

## Edge cases considered

- **Surrogates** (`0xD800..0xDFFF`): excluded from the codepoint
generator. These are illegal as standalone Char values per Unicode;
if the test layer ever needs to assert behaviour on them, that is a
bug in the system under test, not in `prim__eqChar`.
- **Normalisation** (`é` as one codepoint vs two): not in scope. The
axiom is about codepoint equality, not grapheme-cluster equality.
`prim__eqChar` is per-codepoint; `é` (`U+00E9`) and `e` (`U+0065`) +
combining acute (`U+0301`) are *correctly* different chars under the
axiom.
- **Case folding.** Not in scope; `prim__eqChar` is case-sensitive by
spec, and the property-test `distinct codepoints are not =:=`
invariant guards against a backend slipping case-insensitive
collation into char equality.

## References

- Idris2 0.8.0 `src/Core/Primitives.idr` — primitive operation table.
- Idris2 0.8.0 `src/Compiler/Scheme/Chez.idr` — Chez codegen lowerings.
- R6RS §11.11 "Characters" — `char=?` specification.
- OTP `erlang(3)` — `=:=`/`=/=` specification.
- `PROOF-NEEDS.md` — axiom audit (2026-05-18).
- `src/abi/Boj/SafetyLemmas.idr` — axiom declarations (lines 53, 60).
Loading
Loading