feat(web): Astro site + 6-slide deck + WASM-compiled rewriter demo by debuggingfuture · Pull Request #2 · debuggingfuture/secure-program-synthesis

debuggingfuture · 2026-05-24T20:37:50Z

Summary

Adds the public face of the Postern artifact — an Astro site that
renders the paper, embeds a six-slide deck, and runs a WASM-compiled
column-grant rewriter from the browser. Lands on top of PR #1 (the
core proof + code stays focused there).

Routes:

/ landing — claim, contributions, theorem status with honest
proved / partial / stated classification, repo pointers.
/paper — paper/paper.md rendered to HTML at build time with
KaTeX (server-side), Mermaid (client-side), and BibTeX citations
resolved against paper/references.bib.
/slides — six reveal.js slides: title/thesis, ETL collapse
problem, architecture (verbatim mermaid from paper §Design), Lean
theorem status table, Rust components, open problems.
/demo — interactive form for the financial-institution scenario.
Raw tables (sensitive columns flagged), policy in surface syntax
alongside its Datalog desugaring (one right(p,r,c). fact per
grant), and a try-it form that invokes the WASM rewrite_plan
with a structured input → structured output (refusal reason
surfaced verbatim).

Key actions

New prototype/crates/postern-wasm/ wraps postern-core::rewrite
behind a single rewrite_plan(JsValue) -> JsValue. Workspace
member registered; cargo check --workspace clean.
pnpm prebuild / predev invokes wasm-pack build --target web;
output drops to web/src/wasm/ (gitignored).
Vite raw imports (?raw) inline paper/paper.md,
paper/references.bib, and the .postern policy at SSR — keeps
source-of-truth in the repo root, not duplicated under web/.
astro.config.mjs sets vite.{server,preview}.allowedHosts: [".ts.net"] per repo convention; server.host: true so tailnet
proxy reaches the dev server.
wrangler.toml configures Cloudflare Pages project postern-web
(placeholder name); pnpm deploy:cf runs wrangler pages deploy dist. Not run from this PR — user approves first.

Disclosures the site surfaces (and is honest about)

The plan rewriter theorems in verifier/lean/Postern.lean are
sorry-free with axioms bounded by {propext, Quot.sound}.
eval_monotone in verifier/lean/Datalog.lean has a written
proof that reduces to herbrandBound_mono — a labelled sorry
exposed rather than hidden. eval_sound and eval_terminates
are stated with sorry. The site classifies these as partial
and stated rather than proved.
The WASM today is the column-grant rewriter (postern-core); the
Biscuit-Datalog migration is task #4 of the pivot and is disclosed
in the demo's "About this demo" panel.
Conformance: 18/18 cases pass (15 accept, 3 refuse) per
cargo run -p postern-diff -- corpus/postern-corpus.json.

Rationale

Astro picked over Slidev-as-iframe — single TypeScript build for
all four routes, mermaid + reveal.js as ordinary client deps, and
KaTeX runs server-side so search/print is content-faithful.
KaTeX SSR + a small BibTeX parser instead of pulling pandoc into
the build — keeps deploy reproducible and lets us own the
citation-numbering output. The bibtex npm package on the registry
is unmaintained; a 100-line regex parser handles the 15 entries in
references.bib cleanly.
WASM via wasm-pack --target web so the demo runs offline with
zero backend; deploys as static files to Cloudflare Pages.
Synthetic 5-row tables in scenario-data.ts match the schema of
the Kaggle transactions-fraud-datasets referenced in paper §5,
without shipping a third-party dataset.

Insight

Vite's ?raw import is the cleanest way to bring repo-root files
into the Astro SSR context — earlier fs.readFileSync(..., __dirname/...) worked in dev but broke in astro build because
the bundle output is re-located.
markdown-it's typographer rewrites whitespace around
text-token placeholders into U+FFFD replacement chars; switching
math to direct <span class="math-inline"> HTML emission +
typographer: false is the robust fix.
is:raw on <pre class="mermaid"> is required so Astro's JSX
parser doesn't choke on Mermaid's node{{label}} syntax.

What's deferred

wrangler pages deploy (user approves first).
Wiring the Biscuit-Datalog evaluator into the WASM surface (paper
task #4 — separate work).
A CI workflow for web/ (no actions yet exist on this branch;
flagged here so it's tracked).
Cross-relation joins / aggregation pages in the demo (paper §6
open problems).

Test plan

pnpm build (Astro) clean
pnpm astro check clean (0 errors, 0 warnings)
cargo check --workspace clean
cargo test -p postern-core 9/9 pass
cargo run -p postern-diff -- corpus/postern-corpus.json 18/18 pass
WASM smoke: init() + rewrite_plan from Node returns
accept / refuse-unknown / refuse-forbidden-filter
Dev server serves /, /paper, /slides, /demo (HTTP 200)
In-browser verification of mermaid renders + WASM demo
submission flow (eyeball test by user)

Notes

Branch off research/initial-scaffold; merge target is the same.
This branch carries no edits to paper/paper.md or
verifier/lean/; an in-progress paper revision and a Datalog
Lean lemma series were stashed locally so they stay associated
with PR Postern: Lean-verified access gateway for agentic data lakehouses #1.

Adds a feature-gated module `postern_core::datalog_biscuit` that evaluates `Policy::allowed prin rel` through `biscuit-auth`'s public Datalog evaluator (`biscuit_auth::AuthorizerBuilder` → `Authorizer::query`) — the same evaluator the production token-verification surface uses, only without the token-handling layers we put out of scope in paper §6 (attenuation, audience, expiry, key rotation). API surface (off by default): postern_core::datalog_biscuit::allowed_via_biscuit(pol, prin, rel) -> Result<Vec<String>, DatalogError> Compiles each `Grant { principal, relation, columns }` to one `right(p, r, c).` ground fact, builds an unauthenticated `Authorizer`, and queries `data($c) <- right(prin, rel, $c)`. Three smoke tests in the new module (run with `cargo test -p postern-core --features datalog-biscuit`): conforms_to_column_grant_on_crm_users_data -- biscuit-auth result == `Policy::allowed` (mem-set) refuses_cardops_on_users_data -- empty result for cross-department query union_of_grants_for_fraudrisk_on_users_data -- two grants on same (prin, rel) flat-union correctly The feature defaults OFF so the WASM crate (postern-wasm in PR #2) stays slim; turning it on pulls in ed25519-dalek, p256, prost, etc. The default build path (`Policy::allowed`) is unchanged — this is strict additive scaffolding. `cargo fmt --all --check` clean. `cargo clippy --features postern-core/datalog-biscuit -- -D warnings` clean. Full workspace tests still 9 + 6 + 3 + 18/18 conformance.

@mem0

* feat(postern): research artifact — Lean spec, Rust mirror, diff-test, paper Track 3 of the Secure Program Synthesis Hackathon 2026-05-22. Acceptance criteria met: - Fully proved Lean 4 theorem. `Postern.rewrite_sound` (no `sorry`). CheckAxioms.lean confirms dependencies are only Lean's standard `propext` and `Quot.sound`. Two supporting theorems (`rewrite_touched`, `rewrite_schema_subset`) also fully proved. - Differential testing. `lake exe postern-corpus` emits a JSON corpus of (catalog, policy, principal, plan, expected_rewrite). The Rust harness (`postern-diff`) replays each case through `postern_core` and asserts byte-equal plans + schemas + touched relations. 10 / 10 cases pass on the financial-institution scenario. Repo layout: paper/ Pandoc-md paper + BibTeX, ~6 pages verifier/lean/ Lean 4 spec, theorems, corpus emitter Postern.lean Plan IR, policy DSL, rewriter, proofs CheckAxioms.lean #print axioms audit Main.lean postern-corpus exe prototype/ Rust workspace crates/postern-core Types + rewrite mirroring Lean crates/postern-diff Differential test runner scenarios/financial-institution/ Kaggle transactions-fraud-datasets demo (CRM / CardOps / FraudRisk) scripts/reproduce.sh One-shot Lean → corpus → Rust → diff Single-relation Scan / Project / Filter IR for now; joins and filter-predicate side-channels are paper §6 (open challenges). * feat(postern): land review fixes — filter soundness, refusals, +8 corpus cases Response to a five-agent multi-perspective review (research scientist, Principal SWE, Lean 4 expert, academic-presentation expert, security researcher). All P0s addressed, most P1s landed. # Lean spec - Rewriter signature now returns `Option Plan` — `none` is an explicit refusal. Closes the "unknown relation = silently empty schema" catalog-drift bypass and the filter side-channel that the previous IR admitted. - New theorems (all `sorry`-free): * `rewrite_filter_sound` — predicate columns ⊆ allowed (closes the `WHERE ssn = ?` exfil path). * `rewrite_refuses_unknown` — unknown relation ⇒ explicit refusal. * `rewrite_refuses_forbidden_filter` — filter on forbidden col ⇒ explicit refusal. * `rewrite_no_new_columns` — contrapositive of schema-subset. * `rewrite_idempotent` — rewriter is a closure operator on schemas. * `rewrite_monotone` — strengthening the policy widens the output. - Proof discipline tightened: `by_cases` + `if_pos`/`if_neg` instead of fragile `simp [List.mem_filter]` chains. Per-theorem axiom set reported by `CheckAxioms.lean`; two theorems depend on no axioms. - JSON encoder in Main.lean is now RFC 8259 §7 -compliant (escapes \t, \r, \b, \f, and U+0000–U+001F as \uXXXX). # Rust + conformance - `postern-core` mirrors the new IR. Smart constructors for Plan/Grant/Catalog/Policy, `Hash` derive on every type, private fields with `new`/`from_*` constructors, `#[must_use]` on read paths. Nine unit tests including the new refusal cases. - `postern-diff` consumes the new `Option<Plan>` corpus shape. - Corpus expanded from 10 to 18 cases: * 3 refusal regressions (unknown relation, forbidden filter, nested forbidden filter). * Edge cases: empty policy, duplicate grants, catalog-absent columns, case-sensitive principal, trailing-whitespace principal, nonexistent project column. * All 18 pass. # CI / hygiene - `prototype.yml` rewritten from the deleted-Python form to cargo fmt + clippy -D warnings + test + conformance run. - `lean.yml` adds a corpus-drift check (regenerate, diff against committed copy, fail if changed). - `Cargo.lock` is now committed (binary workspace). - New `prototype/README.md` and `verifier/README.md`. - `scripts/reproduce.sh` actually runs the axiom audit and gates on corpus drift. # Paper - Anonymous citations (ocsql2023, memarchitect, ifc-ml) dropped; replaced with the verifiable Schoepe et al. faceted-execution paper and the AgentDojo / CaMeL line on LLM-agent threat models. - Cedar bib metadata replaced with a conservative `Proc. ACM Program. Lang.` entry — the prior author list and DOI were not verifiable. - "Every column visible at the gateway boundary" overclaim removed; theorem statements in the paper now match the Lean source exactly. - Axiom audit claim updated to "bounded by `{propext, Quot.sound}`, two theorems depend on none." - §2 (threat model) gains explicit TCB rows for catalog integrity, plan-to-executor lowering, and principal-string extraction. - §3 ASCII diagram replaced with Mermaid (per workspace convention). - §5 demo table gains an "outcome" column distinguishing refusal from accept-with-empty-schema. - §6 condensed from 5 mixed items to 3 load-bearing open theorems. - "Differential testing" → "reference-conformance testing" throughout — honest about the hand-curated corpus. - Three flagged marketing-ese sentences rewritten. - Reproducibility section pins toolchain versions and runtime budget. * docs: ground the lake-collapse pitch in concrete upstream-RBAC examples The "per-source RBAC dies at the lake boundary" phrasing was too abstract. Anchor it to Slack channel ACLs + Salesforce field-level security + Stripe customer-scoped tokens collapsing into a single DuckDB service-account role on ingest. Fix in paper abstract + introduction, README, SUMMARY, and PR body. * docs: add capability-tracking guardrail layer (Odersky et al. 2026, arXiv:2603.00991) User flagged a missing piece in the original prompt — guardrails via a 'tactic system' (Scala 3 capture-checking, in this case). The paper proposes capabilities as first-class program variables in a capture- checked language so agent-emitted code cannot exfiltrate data it does not hold a capability for. This composes with Postern. Postern enforces at the *plan boundary* (rewriter); capture-checking enforces at the *agent-code boundary*. The two layers stack without re-verifying each other's TCB. Changes: - references.bib: arxiv 2603.00991 entry - paper.md §3: new 'Defense-in-depth with capability tracking' subsection - paper.md §7 (related work): explicit stack-complementary note - README.md: defense-in-depth pointer - SUMMARY.md: extended item 10 with the layered model * ci(lean): commit lake-manifest.json so lean-action can build on fresh checkout The leanprover/lean-action workflow runs config.sh which errors out when no lake-manifest.json is present ("Run lake update to generate manifest"). Manifest is trivial (empty packages array — no deps) but must exist; treat it like Cargo.lock and pin it. Fixes the 18:23 UTC run of lean.yml which failed at the Configure step before any build started. * docs: tighten problem/solution framing — non-compositional authorization under ETL fusion Replaces the journalistic 'concrete' framing with academic prose across paper abstract + intro, README, SUMMARY. The phenomenon is named (non-compositional authorization), stated formally (materialized lakehouse L = ⋃ Sᵢ has authorization denotation A_L determined by the ingest IAM role alone; the originals {Aᵢ} have no semantic representative in A_L), and the consequence is sharp (effective permission = union of upstream principals, not intersection under querying identity). Concrete examples (Slack/ Salesforce/Stripe) remain as a one-sentence illustration. * feat(guardrail): Rust capability-tracking layer + rewrite paper Introduction Closes the missing piece from the original prompt: a guardrail layer ensuring no compute on data can be exfiltrated. Inspired by Odersky et al. 2026 'Tracking Capabilities for Safer Agents' (arXiv:2603.00991), which uses Scala 3 capture-checking; we mechanize the same intuition in Rust via sealed types. New crate: prototype/crates/postern-guardrail - Cap<C> — unforgeable single-use capability token, sealed inner field, pub(crate) constructor - Tagged<T, C> — data + capability needed to release, private value, sanctioned compute via map / and_then - gateway::issue_plan — combines verified rewrite with capability issuance; None iff Layer 1 refused - 7 unit tests covering legal flows + refusals - 2 compile_fail doctests pinning bypass attempts: * forging a Cap (sealed field private) * reading Tagged::value directly (field private) Paper §1 (Introduction) rewritten from scratch around the two-layer thesis: Layer 1 — plan boundary, Lean-verified (postern-core) Layer 2 — agent-code boundary, Rust capability-tracking (postern-guardrail) §3 'Defense-in-depth' replaced with 'Layer 2' — now describes the actual Rust mechanization, not just a forward-pointer to Odersky. Contributions list updated: 4 items, Layer 2 is contribution 2. Workspace: 9 (postern-core) + 7 (postern-guardrail) unit tests + 2 doctests + 18 conformance cases — all green. * feat(guardrail): branded scope + opaque-receipt sinks + no_std partition Three additive hardenings to postern-guardrail that move Layer 2 from a privacy-only perimeter towards a closer pure-Rust analog of capture-checking [Odersky et al. 2026, arXiv:2603.00991]. 1. Invariant brand 'sc on Cap and Tagged via PhantomData<fn(&'sc ()) -> &'sc ()>. The sole entry point is run<T, C, R, F>(value, f) -> R with F: for<'sc> FnOnce(Cap<'sc, C>, Tagged<'sc, T, C>) -> R, R: 'static. Universal quantification of 'sc + R: 'static forbids the closure from returning anything that references 'sc — Cap/Tagged cannot lexically escape the scope. (The ghost-cell pattern.) Verified by a third compile_fail doctest. 2. Public Tagged::release(cap) -> T removed. Extraction goes through sinks::* functions that consume both Cap and Tagged and return opaque receipts (LlmAck, AuditAck), not raw T. The agent never holds naked T across the scope boundary. 3. Crate is #![cfg_attr(not(test), no_std)] + extern crate alloc. postern-core dependency feature-gated under 'gateway' (on by default). A downstream agent crate with default-features = false can compile #![no_std] against just core + alloc, which means no link-level access to std::println!, std::process::exit, network sockets, or filesystem APIs — the only side-effect channels available are the sanctioned sinks. Paper rewrite (per user feedback): - Academic register throughout. Drop business buzzwords ('headline', 'load-bearing', 'closes the side-channel', 'we exercise', etc.) for restrained PL/security paper voice. - Abstract reframed around three components (mechanised core, capability layer, conformance harness) rather than aspirational marketing. - Introduction restructured around two deployed alternatives: (a) Postgres RLS/CLS — does not transfer to lakehouse, per-source policies don't compose across ETL paths (b) tenant segregation — silo case, forfeits cross-source analytics Followed by the property-gap statement (A_L determined by ingest IAM role alone; effective permission = union of upstream principals, not intersection). - §3 Layer 2 section rewritten to match the new hardened design (branded scope, opaque-receipt sinks, no_std partition, residual side-channels). - Theorems restated with consistent notation ($\sigma(q)$ for output schema, etc.). - Related work restructured into four landmark groups. - §6 open problems restated as research questions with conjectural theorem statements where appropriate. Workspace: 9 + 6 unit tests + 3 doctests + 18 conformance + clean no_std build (without gateway feature). * style(guardrail): apply rustfmt — CI rustfmt version disagreed on closure layouts * docs(paper): tighten abstract + condense §Existing-approaches to 4 sentences Per user feedback: - Abstract rewritten as a single problem-setting → contribution arc with three concrete components. Drops the prior "three points in this design space" wind-up and the implementation-detail asides; reads as a single ~280-word claim block. - §Existing approaches condensed from 13 sentences to 4: one lead, one for RLS/CLS, one for tenant segregation, one conclusion noting neither offers a compositional authorization guarantee over heterogeneous ingest. * docs(paper): bulk Introduction into a single motivation paragraph Per user: collapse Introduction + Existing approaches + Property gap + Approach subsections into one continuous paragraph carrying the full motivation arc — lakehouse + data-mesh + local-first + on-prem/edge LLM, agent-issued SQL with short/contextual access patterns that TRBAC cannot scale to, the two deployed responses (Postgres RLS and tenant segregation) and their limits, the union-not-intersection consequence, and the proposal of plan-level rewriting as a third design point. The formal property-gap restatement and the explicit two-requirement breakdown are folded into the prose. Contributions subsection remains as the structured list. * docs(paper): improve Introduction readability — same single paragraph, shorter sentences The bulk-paragraph version had a 150-word first sentence and three 40-50 word sentences. Same content, same paragraph, but split at natural clause boundaries: - First mega-sentence (lakehouse setup + access-control consequence + TRBAC-doesn't-scale) split into four shorter sentences. - RLS + tenant-segregation single sentence split into two, one per approach. - Closing 'we propose / Postern has three components' split into two sentences so the proposal lands cleanly before the artifact breakdown. No content additions or removals; word count down ~5%, average sentence length down from ~50 to ~25. * docs(paper): drop unsupported exemplar list in Introduction The prior paragraph claimed mem0, Notion AI, and Airbyte-class ETL workers exemplify the Parquet-on-S3 + DuckDB + LLM-agent-at-inference pattern. None cleanly do: mem0 is vector/graph-store backed by default, Notion AI's storage is not publicly Parquet+DuckDB, and Airbyte is producer-side (writes to lakehouses, doesn't query them as an agent). Replaced with the actual claim — lakehouse reads at inference time are spreading; DuckDB is one in-process engine example. Drops the orphaned @mem0 bib entry. * feat(verifier): Datalog scaffold — Horn-fragment evaluator skeleton Lands `verifier/lean/Datalog.lean` per the thesis pivot (2026-05-25): policies expressed in Biscuit-Datalog rather than the legacy column-grant DSL. Scope is path-B-narrow: Horn fragment only (no attenuation, audience, expiry, or key rotation — those are paper §6). Includes: - `Term`, `Atom`, `Rule`, `Program` syntax over `String` symbols. - `matchTerm` / `matchAtom` / `allMatches` — pattern matching against ground facts with substitution accumulation. - `step` — one immediate-consequence iteration; `iterate` — bounded. - `Program.herbrandBound` — `|preds| × (|consts|+1)^maxArity + 1` conservative LFP-depth bound. - `eval` — `iterate` for `herbrandBound` steps. - `Program.allowed prin rel` — the rewriter-facing bridge: queries `eval` for ground `right(prin, rel, c)` atoms. Theorems are stated with `sorry` placeholders to land alongside in a follow-up commit: - `eval_monotone` — bigger program ⇒ bigger fact set. - `eval_sound` — derived facts are rule-supported. - `eval_terminates` — iteration past `herbrandBound` is stable. CI: `lake build Datalog` succeeds; warnings flag the three `sorry`s. `Datalog` added to `defaultTargets` so the existing `lean` workflow exercises it. * feat(verifier): prove eval_monotone + 8 support lemmas Fills in one of three Datalog headline theorems and the eight supporting lemmas it rests on. The proofs use only stdlib list-membership lemmas; no Mathlib, no `simp` doing semantic work. Style follows Postern.lean. Newly proved (all `sorry`-free, axioms ⊆ {propext, Quot.sound}): step_extensive : a ∈ F → a ∈ step R F allMatches_subset_facts : F ⊆ F' ⇒ matches lift step_subset_facts : F ⊆ F' ⇒ step R F ⊆ step R F' step_subset_rules : R ⊆ R' ⇒ step R F ⊆ step R' F step_subset : joint monotonicity of step iterate_succ_extensive : iterate R F n ⊆ iterate R F (n+1) iterate_subset_le : n ≤ m ⇒ iterate R F n ⊆ iterate R F m iterate_subset_program : iterate is monotone in (R, F) at fixed n Newly proved (modulo one isolated cardinality obligation): eval_monotone : P ⊆ P' ⇒ eval P ⊆ eval P' -- via iterate_subset_program + -- iterate_subset_le + -- herbrandBound_mono (sorry) Open obligations (named explicitly so the residual proof surface is audited rather than buried): herbrandBound_mono : combinatorial argument about |L.eraseDups| under set inclusion. Intuitively monotone; the proof is mechanical list-Nodup-length arithmetic, no semantic content. eval_sound : every derived atom is rule-supported by a grounding whose body lies in eval. Proof is induction on iteration depth + case analysis on `step`. eval_terminates : iteration past herbrandBound is no-op. Knaster–Tarski-style Herbrand-base saturation argument. Other changes: - step: dropped the dedup-filter so the mem-set characterisation is clean (`a ∈ step R F ↔ a ∈ F ∨ a ∈ derived`). Output list may contain duplicates; downstream callers only consult `∈ eval`. - CheckAxioms: now imports Datalog and audits all 12 new declarations. The audit output names every `sorryAx` user (eval_monotone, herbrandBound_mono, eval_sound, eval_terminates) so the open obligations are visible in CI logs. * feat(verifier): rule-free Datalog specialisation — 4 lemmas Adds four `sorry`-free lemmas that cover the motivating-examples regime (financial-institution scenario: ground `right(_,_,_)` facts only, no derivation rules): eval_fact_mem : every P.fact is in eval P (extensiveness) step_no_rules : a ∈ step [] F ↔ a ∈ F iterate_no_rules : a ∈ iterate [] F n ↔ a ∈ F (every n) eval_no_rules : P.rules = [] → eval P ≡ P.facts (mem-set) `eval_no_rules` is the soundness direction of `eval_sound` unconditionally for the rule-free case — i.e. all our current scenarios. The general-case `eval_sound` (and `eval_terminates`) still need the Knaster–Tarski-style argument; tracked separately. Axiom audit (all four): `[propext]` only — no `Quot.sound`, no `sorryAx`. `CheckAxioms.lean` updated to print the dependencies. * docs(paper): tighten Introduction + add sample policy Two changes that have been on the working tree but unpushed since the structural/sentence-level review pass: 1. **Concise Introduction**: drop the bulky agent-dojo/CaMeL citation parenthetical and the "Two responses dominate" throat-clearing; collapse RLS-vs-segregation analysis to two crisp sentences each. Word count down ~30%; the load-bearing union-vs-intersection claim stays intact. 2. **Sample policy in the Introduction**: a concrete column-grant excerpt from the financial-institution scenario (§5), so the reader sees a tangible artifact before the formal contribution list rather than after. Note: the abstract's broader Datalog-thesis rewrite lives on branch `docs/paper-abstract` and merges separately. * docs(paper): rewrite abstract for Datalog thesis pivot Reflect the 2026-05-25 pivot to Biscuit-Datalog as the policy language. The column-grant surface now compiles to ground `right(principal, relation, column)` facts in the Horn fragment; the Lean development mechanises both the rewriter (nine theorems, sorry-free) and an underlying Datalog evaluator (partly proved — eval_monotone modulo one isolated obligation, eval_sound and eval_terminates carry sorryAx obligations named in CheckAxioms.lean). The abstract is now honest about what is and isn't proved, references @biscuit and @lean4 inline, and replaces the threat-model recap with the union-vs-intersection ACL-collapse attack. Body of the paper (§Contributions, §4, §Related-work) is untouched and inconsistent with the new abstract in places — to be reconciled in a follow-up pass. * docs(paper): build paper.pdf — pandoc + tectonic 11-page PDF rendered from paper.md via Pandoc 3.9 + Tectonic 0.16 (replaces the placeholder xelatex default in build.sh — tectonic is self-contained and auto-fetches TeX Live packages, so reviewers don't need a full mactex install). header.tex carries two small shims: - \xmpquote passthrough (pandoc emits it for pdfkeywords; comes from hyperxmp which tectonic doesn't auto-pull), - \newunicodechar mappings for the check marks and math symbols used in tables / verbatim code (✓ ✗ ⊆ ∩ ∅), since Latin Modern doesn't cover them. PDF_ENGINE env var lets you override the engine (e.g. PDF_ENGINE=xelatex). .gitignore: paper.pdf is now committed so reviewers don't need the toolchain locally — rebuild with `bash paper/build.sh`. * docs(paper): rebuild PDF (12 pages) off Datalog-thesis abstract + tightened Introduction * docs(paper): reconcile body with Datalog-thesis abstract Five surgical edits flagged by the abstract-rewrite review: 1. §Contributions item 1 — now mentions Biscuit-Datalog policy language, the Datalog evaluator lemmas in Datalog.lean, the single 'eval_monotone' theorem proved modulo herbrandBound_mono, and the two stated meta-theorems with named sorryAx. 2. §Contributions item 3 — discloses biscuit-auth::datalog::World as the Rust gateway's target evaluator, with the column-grant DSL as in-progress byte-equivalent stand-in. 3. §Formal model — new paragraph for Datalog.lean separately audits eight support lemmas + four rule-free specialisations (all sorry-free) + one partial + two stated. 4. §Related work, capability runtimes — Biscuit framing expanded: it is both the deployed token mechanism AND our policy-language layer, with the explicit out-of-scope items (attenuation/audience/expiry/key-rotation) called out. 5. §Implementation and conformance — adds a 'Datalog backend' paragraph naming the public World API and queuing the second Lean↔biscuit-auth conformance corpus. PDF (paper/paper.pdf) rebuilt via tectonic — 12 pages, 117 KB. * feat(prototype): postern-core gains datalog-biscuit feature Adds a feature-gated module `postern_core::datalog_biscuit` that evaluates `Policy::allowed prin rel` through `biscuit-auth`'s public Datalog evaluator (`biscuit_auth::AuthorizerBuilder` → `Authorizer::query`) — the same evaluator the production token-verification surface uses, only without the token-handling layers we put out of scope in paper §6 (attenuation, audience, expiry, key rotation). API surface (off by default): postern_core::datalog_biscuit::allowed_via_biscuit(pol, prin, rel) -> Result<Vec<String>, DatalogError> Compiles each `Grant { principal, relation, columns }` to one `right(p, r, c).` ground fact, builds an unauthenticated `Authorizer`, and queries `data($c) <- right(prin, rel, $c)`. Three smoke tests in the new module (run with `cargo test -p postern-core --features datalog-biscuit`): conforms_to_column_grant_on_crm_users_data -- biscuit-auth result == `Policy::allowed` (mem-set) refuses_cardops_on_users_data -- empty result for cross-department query union_of_grants_for_fraudrisk_on_users_data -- two grants on same (prin, rel) flat-union correctly The feature defaults OFF so the WASM crate (postern-wasm in PR #2) stays slim; turning it on pulls in ed25519-dalek, p256, prost, etc. The default build path (`Policy::allowed`) is unchanged — this is strict additive scaffolding. `cargo fmt --all --check` clean. `cargo clippy --features postern-core/datalog-biscuit -- -D warnings` clean. Full workspace tests still 9 + 6 + 3 + 18/18 conformance. * ci(prototype): exercise the datalog-biscuit feature in CI Adds two CI steps that run with `--features datalog-biscuit` enabled, so the biscuit-auth integration smoke tests (`conforms_to_column_grant`, `refuses_cardops_on_users_data`, `union_of_grants_for_fraudrisk`) regression-protect the in-progress migration.

…owser demos Wraps `postern-core::rewrite` behind a single `rewrite_plan(JsValue) -> JsValue` function so the Astro site can run the same Lean-mirrored rewriter in the browser via `wasm-pack build --target web`. The response shape is structured (`ok`/`reason`/`input_plan`/`output_plan`) so the demo UI doesn't need to reproduce the refusal-reason logic. Workspace member registered; `cargo check --workspace` clean. The `opt-level=s`/`lto` profile knobs are commented out (sub-crate profile warnings under workspace).

@key

Scaffolds `web/` with four routes: - `/` Landing — claim, contributions, repo pointers, theorem status (proved / partial / stated, with honest disclosures about `eval_*` sorries and the column-grant-only WASM). - `/paper` Paper rendered from `paper/paper.md` at build time — custom BibTeX parser resolves `[@key]` to numbered superscripts and emits the references section; KaTeX SSR for `$...$` and `$$...$$`; mermaid blocks pass through for client-side render. - `/slides` Six reveal.js slides — title/thesis, ETL collapse, architecture (verbatim mermaid from paper §Design), Lean theorem status table, Rust components, open problems. Mermaid renders per-slide on `slidechanged`. - `/demo` Interactive form for the financial-institution scenario: raw tables (sensitive cols flagged), policy in surface syntax alongside Datalog desugaring, and a Try-it form calling `rewrite_plan` from a WASM module compiled out of `postern-core`. Refusal reasons surfaced verbatim ("unknown relation" / "filter on forbidden column"). Wiring: - `pnpm prebuild` / `predev` invoke `wasm-pack build --target web` against `prototype/crates/postern-wasm`; output drops to `web/src/wasm/` (gitignored, regenerated each build). - Vite raw imports (`?raw`) inline `paper/paper.md`, `references.bib`, and the .postern policy at SSR time — `fs.readFileSync` against __dirname doesn't survive bundle output relocation. - `astro.config.mjs` allows `.ts.net` hosts for tailnet exposure. - `wrangler.toml` configures Cloudflare Pages project `postern-web`; `pnpm deploy:cf` runs `wrangler pages deploy dist` (not run from here — user approves first). Disclosures the site must surface and does: - `eval_monotone` proof reduces to `herbrandBound_mono` (one labelled sorry); `eval_sound` and `eval_terminates` stated with `sorry`. Plan rewriter theorems are `sorry`-free with axioms bounded by `{propext, Quot.sound}`. - WASM today is the column-grant rewriter; Biscuit-Datalog dispatch is task #4 and not what this binary executes. Stack: Astro 6 / TypeScript strict / KaTeX SSR / markdown-it / mermaid (client) / reveal.js / wasm-pack. `astro check` clean, `astro build` produces 4 static pages + 3.4 MB `_astro/` (mostly KaTeX CSS, mermaid bundle, reveal.js theme).

Four small slide edits: - Drop the 'FractalBox · Track 3' byline from S1 (and its unused .reveal .byline CSS rule). - S2 heading: 'The collapse at ETL' → 'Problem: IAM collapse at the lakehouse'. - S5 heading: 'What Rust does' → 'Learning from Cedar'. - S5 body: drop the 'migration is task #4, not yet in this artifact' aside — the migration landed on main in PR #1; the paragraph now reads as a stable description rather than a future-tense disclaimer.

debuggingfuture mentioned this pull request May 24, 2026

Postern: Lean-verified access gateway for agentic data lakehouses #1

Merged

debuggingfuture marked this pull request as ready for review May 24, 2026 22:06

debuggingfuture added 3 commits May 25, 2026 06:08

debuggingfuture force-pushed the web/astro-scaffold branch from c381171 to e52d76e Compare May 24, 2026 22:08

debuggingfuture changed the base branch from research/initial-scaffold to main May 24, 2026 22:08

debuggingfuture merged commit 10337e4 into main May 24, 2026
2 checks passed

debuggingfuture deleted the web/astro-scaffold branch May 24, 2026 22:09

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(web): Astro site + 6-slide deck + WASM-compiled rewriter demo#2

feat(web): Astro site + 6-slide deck + WASM-compiled rewriter demo#2
debuggingfuture merged 3 commits into
mainfrom
web/astro-scaffold

debuggingfuture commented May 24, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

debuggingfuture commented May 24, 2026

Summary

Key actions

Disclosures the site surfaces (and is honest about)

Rationale

Insight

What's deferred

Test plan

Notes

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant