Postern

A Lean-verified access gateway for agentic data lakehouses.

Research artifact for Track 3 of the Apart Research Secure Program Synthesis Hackathon, 2026-05-22 → 2026-05-24.

Problem. Per-source authorization is non-compositional under ETL fusion. When $n$ heterogeneous sources are materialized into a single columnar lakehouse, the per-source guards (channel ACLs, field-level security, OAuth-scoped tokens) have no representative in the lake's authorization surface. The effective permission of a query-issuing agent then becomes the union of the upstream principals' permissions rather than the intersection under the querying identity.
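To make the union-versus-intersection gap concrete, here is a minimal sketch using plain column sets. The column names, the two ETL principals, and the helper functions are all hypothetical and are not taken from the Postern code base.

```rust
use std::collections::BTreeSet;

/// Helper: build a column set from string literals (illustrative only).
fn cols(names: &[&str]) -> BTreeSet<String> {
    names.iter().map(|s| s.to_string()).collect()
}

/// What the lake exposes after fusion: every column any ETL principal could read.
fn fused_surface(etl_reads: &[BTreeSet<String>]) -> BTreeSet<String> {
    etl_reads.iter().flatten().cloned().collect()
}

/// What the querying identity should see: the fused columns it could
/// already read at the sources under its own identity.
fn intended_surface(fused: &BTreeSet<String>, own: &BTreeSet<String>) -> BTreeSet<String> {
    fused.intersection(own).cloned().collect()
}

fn main() {
    // Two hypothetical upstream ETL principals, one per source.
    let chat_etl = cols(&["chat.body", "chat.channel"]);
    let crm_etl = cols(&["crm.email", "crm.ssn"]);
    // Hypothetical source-level rights of the agent issuing the query.
    let agent = cols(&["chat.channel", "crm.email"]);

    let fused = fused_surface(&[chat_etl, crm_etl]);
    let intended = intended_surface(&fused, &agent);
    // Columns the agent gains only because the lake lost the per-source guards.
    let leaked: Vec<&String> = fused.difference(&intended).collect();
    println!("fused = {:?}", fused);
    println!("intended = {:?}", intended);
    println!("leaked = {:?}", leaked);
}
```

In this toy setup the `leaked` set is what a plan-boundary check has to remove before a query runs.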

Solution. Postern mediates every read at the plan boundary against a column-grant policy. Its core — Plan IR, policy DSL, rewriter — is mechanized in Lean 4, with nine sorry-free theorems certifying that every accepted plan's output schema and filter-predicate read-set are contained in the policy-allowed columns. A Rust implementation mirrors the algorithm and is conformance-tested against the Lean reference (18 / 18 cases).

What's in the box

| Path | What |
|---|---|
| `paper/` | Pandoc-Markdown paper + BibTeX. Build with `paper/build.sh` (needs pandoc + xelatex). |
| `verifier/lean/` | Lean 4 spec, nine fully-proved theorems, axiom audit, corpus emitter. |
| `prototype/` | Rust workspace mirroring the Lean types (`postern-core`) and the conformance harness (`postern-diff`). |
| `scenarios/financial-institution/` | Kaggle transactions-fraud-datasets case study, three departments. |
| `scripts/reproduce.sh` | One-shot reproduction. |

Acceptance criteria, met

Fully proved Lean 4 theorem(s). Nine theorems span output-column soundness, filter-predicate soundness (closes the `WHERE ssn = ?` side-channel), schema subset, monotonicity in policy, idempotence, and explicit-refusal lemmas for unknown relations and forbidden filter columns. No `sorry`. `CheckAxioms.lean` reports that each theorem's axiom set is contained in {propext, Quot.sound}, both built-in Lean axioms; two theorems depend on no axioms at all.
Conformance testing. postern-diff runs the Rust rewriter against the Lean-emitted reference corpus; 18 / 18 cases pass on the demo scenario (15 accept, 3 refuse — including regression cases for known attack shapes).
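The shape of a differential conformance check can be sketched as follows. This is a stand-in, not `postern-diff` itself: the `Case` struct, the string encoding of plans, and `toy_rewriter` are assumptions made for illustration, and the real corpus format is not shown in this README.

```rust
/// Hypothetical corpus entry: an input plan (as text) and the outcome
/// the reference implementation emitted for it. `None` means "refuse".
struct Case {
    name: &'static str,
    input: &'static str,
    expected: Option<&'static str>,
}

/// Run every case through the implementation under test and compare
/// against the recorded reference outcome.
fn run_corpus(
    cases: &[Case],
    rewriter: impl Fn(&str) -> Option<String>,
) -> (usize, Vec<&'static str>) {
    let mut passed = 0;
    let mut failures = Vec::new();
    for case in cases {
        let got = rewriter(case.input);
        if got.as_deref() == case.expected {
            passed += 1;
        } else {
            failures.push(case.name);
        }
    }
    (passed, failures)
}

/// Toy implementation under test: refuses anything mentioning `ssn`,
/// otherwise wraps the plan text in a projection.
fn toy_rewriter(input: &str) -> Option<String> {
    if input.contains("ssn") {
        None
    } else {
        Some(format!("Project({})", input))
    }
}

/// Three illustrative cases: two accepts and one refusal.
fn demo_cases() -> Vec<Case> {
    vec![
        Case { name: "scan", input: "Scan(tx)", expected: Some("Project(Scan(tx))") },
        Case {
            name: "project",
            input: "Project(Scan(tx), amount)",
            expected: Some("Project(Project(Scan(tx), amount))"),
        },
        Case { name: "ssn-probe", input: "Filter(Scan(tx), ssn)", expected: None },
    ]
}

fn main() {
    let cases = demo_cases();
    let (passed, failures) = run_corpus(&cases, toy_rewriter);
    println!("{}/{} cases pass", passed, cases.len());
    for name in failures {
        println!("MISMATCH: {}", name);
    }
}
```

Recording refusals as explicit expected outcomes, rather than skipping them, is what lets a regression case for a known attack shape fail loudly if the implementation starts accepting it.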

Reproduce

scripts/reproduce.sh

Expected tail:

18/18 cases pass (Lean reference == Rust impl)
==> All green.

The script also runs the axiom audit (#print axioms on every load-bearing theorem) and checks the committed corpus matches what Lean emits today (catches drift).

Toolchains: Lean 4.29.1 (pinned in verifier/lean/lean-toolchain), Rust stable (tested with 1.93). The full run takes under two minutes on an M-series Mac with a warm cache.

Design summary

Policy is a list of column-grants $\langle p, r, C\rangle$ — "principal $p$ may read columns $C$ on relation $r$". Fail-closed, monotone grant-only (no deny-lists by design; see paper §6).
Plan IR is Scan(rel) | Project(plan, cols) | Filter(plan, col) — single-relation by design so soundness stays small.
Rewriter returns Option Plan. Refuses on unknown relation or forbidden filter column; on accept wraps the plan in a Project of schema(q) ∩ allowed(P, prin, touched(q)).
Capability distribution is via biscuit tokens (prototype-side, outside the proof).
Surface in the prototype is an MCP server over Polars / DuckDB.
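The policy, Plan IR, and rewriter described above can be sketched in a few dozen lines of Rust. This is a minimal illustration under stated assumptions, not the `postern-core` source: the type and function names, the `Catalog` map, and the reading of "unknown relation" as "absent from the catalog" are all hypothetical.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Plan IR: single-relation by construction.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan(String),
    Project(Box<Plan>, Vec<String>),
    Filter(Box<Plan>, String),
}

/// One column-grant <p, r, C>: principal p may read columns C on relation r.
struct Grant {
    principal: String,
    relation: String,
    columns: BTreeSet<String>,
}

/// Assumed catalog: relation name -> its full column list.
type Catalog = BTreeMap<String, Vec<String>>;

/// The single relation a plan scans.
fn relation(plan: &Plan) -> &str {
    match plan {
        Plan::Scan(rel) => rel.as_str(),
        Plan::Project(inner, _) | Plan::Filter(inner, _) => relation(inner),
    }
}

/// Output schema of a plan; None if the scanned relation is unknown.
fn schema(plan: &Plan, catalog: &Catalog) -> Option<Vec<String>> {
    match plan {
        Plan::Scan(rel) => catalog.get(rel).cloned(),
        Plan::Project(inner, cols) => {
            let child = schema(inner, catalog)?;
            Some(cols.iter().filter(|c| child.contains(*c)).cloned().collect())
        }
        Plan::Filter(inner, _) => schema(inner, catalog),
    }
}

/// Collect every column read by a filter predicate anywhere in the plan.
fn filter_columns(plan: &Plan, out: &mut BTreeSet<String>) {
    match plan {
        Plan::Scan(_) => {}
        Plan::Project(inner, _) => filter_columns(inner, out),
        Plan::Filter(inner, col) => {
            out.insert(col.clone());
            filter_columns(inner, out);
        }
    }
}

/// Grant-only and fail-closed: the union of matching grants, empty if none match.
fn allowed(policy: &[Grant], principal: &str, rel: &str) -> BTreeSet<String> {
    policy
        .iter()
        .filter(|g| g.principal == principal && g.relation == rel)
        .flat_map(|g| g.columns.iter().cloned())
        .collect()
}

/// Refuse on unknown relation or forbidden filter column; otherwise wrap
/// the plan in a Project of schema(q) ∩ allowed(...).
fn rewrite(policy: &[Grant], catalog: &Catalog, principal: &str, q: &Plan) -> Option<Plan> {
    let out_schema = schema(q, catalog)?;
    let ok = allowed(policy, principal, relation(q));
    let mut read_set = BTreeSet::new();
    filter_columns(q, &mut read_set);
    if !read_set.is_subset(&ok) {
        return None;
    }
    let visible: Vec<String> = out_schema.into_iter().filter(|c| ok.contains(c)).collect();
    Some(Plan::Project(Box::new(q.clone()), visible))
}

/// Illustrative policy and catalog: `analyst` may read `id` and `name` on `customers`.
fn demo() -> (Vec<Grant>, Catalog) {
    let s = |x: &str| x.to_string();
    let policy = vec![Grant {
        principal: s("analyst"),
        relation: s("customers"),
        columns: vec![s("id"), s("name")].into_iter().collect(),
    }];
    let catalog: Catalog =
        vec![(s("customers"), vec![s("id"), s("name"), s("ssn")])].into_iter().collect();
    (policy, catalog)
}

fn main() {
    let (policy, catalog) = demo();
    let scan = Plan::Scan("customers".to_string());
    // Accepted: the output is narrowed to the granted columns.
    println!("{:?}", rewrite(&policy, &catalog, "analyst", &scan));
    // Refused: a filter on an ungranted column would leak it through the predicate.
    let probe = Plan::Filter(Box::new(scan), "ssn".to_string());
    println!("{:?}", rewrite(&policy, &catalog, "analyst", &probe));
}
```

Refusing the `ssn` filter outright, rather than silently dropping the predicate, mirrors the explicit-refusal behavior described in the design summary; a principal with no grants still gets an accepted plan, but its projection is empty.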

See paper/paper.md §3–§5 for the full design and §6 for open challenges (joins, aggregation + DP, biscuit attenuation in-proof).

Defense-in-depth

Postern is the plan-boundary layer — last chance to bound what a query can read. Pair it with an agent-code-boundary layer like the Scala-3-capture-checking approach from Odersky et al. 2026 (arXiv:2603.00991), which makes capabilities first-class program variables so agent-emitted code cannot exfiltrate data it doesn't hold a capability for. The two layers compose without re-verifying each other's TCB; see paper §3 "Defense-in-depth with capability tracking".

Status & scope

This is a research artifact, not production code. The Lean spec covers single-relation plans; the Rust impl carries joins by per-leg rewriting but those compositions are not yet under proof. Aggregation, filter-timing side-channels, and the planner→executor lowering step are out of scope for the current theorem set — see paper §2 (threat model) and §6.

License

MIT OR Apache-2.0 (workspace-wide).
