Skip to content

ghuntley/tMIR

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

232 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

tMIR — Trusted Mid-level Intermediate Representation

A shared proof-carrying IR for the t* verified compilation stack.

tMIR is the shared intermediate representation that tRust, tSwift, and tC all compile to. Verification tools operate on tMIR. LLVM2 consumes tMIR for verified codegen. One IR, many languages, composable proofs.

tRust (Rust+proofs)     tSwift (Swift+proofs)     tC (C+proofs)
       \                       |                      /
        \                      |                     /
         \                     |                    /
          v                    v                   v
       +------------------------------------------+
       |                    tMIR                   |
       |                                           |
       |  - State machine semantics (TLA+)         |
       |  - Proof obligations on every node        |
       |  - Information flow labels                |
       |  - Trust boundary markers                 |
       |  - Typed (maps to source types)           |
       +------------------------------------------+
                     |                |
                     v                v
              tla2 + z4           LLVM2
              (verify)          (codegen)
                                   |
                                   v
                        Verified Machine Code

Why tMIR Exists

Every verified language frontend reinvents the wheel: Prusti, Creusot, Kani each build their own IR with incompatible proof formats. Proofs don't compose across languages. Codegen is unverified (proofs stop at LLVM).

tMIR solves this:

  • Write verification once, apply to all languages. tRust, tSwift, and tC share one verification pipeline.
  • Cross-language proofs compose. A Rust library called from Swift inherits its proofs.
  • Single codegen backend to verify. LLVM2 is purpose-built for tMIR.

Proof-Carrying IR

tMIR is not just an IR with optional annotations. Every node has attached proof obligations and every edge has a proof that the source's postcondition implies the target's precondition.

Traditional IR:
  instruction -> instruction -> instruction

tMIR:
  (instruction, precondition, postcondition, proof) ->
  (instruction, precondition, postcondition, proof) ->
  ...

TLA+ Semantics

tMIR embeds state machine semantics for whole-program verification:

tMIR Construct TLA+ Meaning
State block VARIABLES
Action function TLA+ action
Guard (where) Enabling condition
Effect (ensures) Next-state relation
Invariant Safety property
Temporal property Liveness

This enables verification beyond single functions: "this system satisfies these temporal properties."

Installation

git clone https://github.com/andrewdyates/tMIR
cd tMIR
cargo build

Quick Start

Run the public CLI against a tMIR module and format it for stable diffs:

cargo run -p tmir-cli -- validate module.tmir
cargo run -p tmir-cli -- dump module.tmir
cargo run -p tmir-fmt -- module.tmir

For library use, depend on tmir for the core IR and on tmir-build for the builder and validation API. In self-contained consumers such as tRust, use local path dependencies so the frontend and tMIR update atomically in one workspace:

tmir = { path = "../tMIR/crates/tmir", features = ["serde", "parser"] }
tmir-build = { path = "../tMIR/crates/tmir-build", features = ["serde"] }

Supported Formats

Format Surface
Text tMIR Parser and pretty-printer for human-readable modules.
Binary tMIR Compact round-trip format for generated modules.
JSON Serde-backed interchange format for tools and reports.
MessagePack Serde-backed binary interchange format for tool pipelines.
Proof lineage Typed sidecar manifests for composed certificates across transforms.
Lean 4 semantics Public Lean package for operational semantics and proof work.

Repository Layout

Package Description
crates/tmir Core IR: types (ty), instructions (inst), proof annotations (proof), constants (constant), typed value IDs (value), instruction nodes (node), display (display). Feature-gated modules: parser, binary, serde, dialect, fmt, diff.
crates/tmir-build Builder API (ModuleBuilder, FunctionBuilder) and validator (validate_module, validate_function).
crates/tmir-z4 Non-default Z4 solver adapter for tMIR proof obligations and proof certificates. Uses the owned sibling ../z4 checkout, not a Git dependency.
crates/tmir-cli tmir binary: validate, dump, convert, show-gpu-eligibility.
crates/tmir-fmt tmir-fmt binary: canonical text pretty-printer for diff stability.
crates/tmir-diff tmir-diff binary: semantic module diff (ID/order-insensitive); --json and --ignore-proofs flags.
lean/tmir-semantics Lean 4 operational semantics, safety proofs, and z4 bridge (TMir/Semantics/, TMir/State/, TMir/Proofs/).

Dialects

tMIR supports MLIR-style dialects — namespaced operation sets that ride inside the core Inst::DialectOp variant and lower progressively into core tMIR via a fixpoint driver. Dialects round-trip through every serialization format (text, binary, JSON, MessagePack) even when the consumer has never heard of the dialect, so modules are portable across frontends, optimizers, and backends.

  • Registry: tmir::dialect::DialectRegistry
  • Op payload: tmir::dialect::DialectInst
  • Lowering: tmir::dialect::{LoweringPass, lower_module}
  • Opt-in helper surface: tmir::dialect::examples::verif (bfs_step, frontier_drain, fingerprint_batch) — gated behind the dialect-verif-example feature. The stable cross-repo contract is the verif.* op names, payload version, and lowering shape; the Rust helper module itself remains opt-in.

The dialect API documents the serialization format, lowering semantics, and verif worked example in the public Rustdoc. LLVM2#390 tracks the backend-side contract.

CLI Tools

tmir validate module.tmir              # validate well-formedness
tmir dump module.tmir                   # pretty-print module
tmir convert --from bin --to json a.tmir a.json
tmir show-gpu-eligibility module.tmir   # per-function GPU gate status
tmir-fmt module.tmir                    # canonical text pretty-printer
tmir-diff a.tmir b.tmir                 # semantic diff
tmir-diff --json --ignore-proofs a.tmir b.tmir

Reference Docs

Development

cargo test
cargo test --all-features
cargo test -p tmir-z4
./scripts/ci-self-verify.sh
cargo run -p tmir-cli -- --help
cd lean/tmir-semantics && lake build

Dependency Source Policy

Core tMIR workspace builds must not depend on upstream Git sources. Solver integrations live behind explicit adapter crates; tmir-z4 is excluded from default workspace members and resolves z4 from the owned sibling checkout at ../z4/crates/z4.

Self-Verify Gate

Run ./scripts/ci-self-verify.sh before refreshing a tRust snapshot. The gate checks that tMIR manifests and Cargo.lock contain no active Git dependencies, then runs locked default tests, locked all-features tests, the conformance fixture/canonical text corpus, and the non-default tmir-z4 adapter tests. The adapter step expects the sibling ../z4 checkout to have no tracked local edits; set TMIR_ALLOW_DIRTY_Z4=1 only for local diagnosis. If lake is on PATH, the gate also builds lean/tmir-semantics; set TMIR_REQUIRE_LEAN=1 to make a missing Lean toolchain fail the gate, or TMIR_SKIP_LEAN=1 to skip that step explicitly.

Testing

  • cargo test --all-features — 806 tests in tmir, 202 in tmir-build, all passing.
  • crates/tmir/tests/aggregate_constant_fuzz.rs — 4 seeded property tests × 4096 cases covering parser, binary, JSON, and MessagePack roundtrips on randomized aggregate constants (Part of #34).
  • crates/tmir/tests/binding_frame_dialect_roundtrip.rs — binary + serde (JSON + MessagePack) roundtrip for the binding-frame lifecycle (OpenFrame/BindSlot/LoadSlot/CloseFrame) and opaque DialectOp roundtrip including unknown dialects (Part of #58).
  • crates/tmir-build/tests/gpu_eligibility.rs — 20-row matrix over is_safe_for_gpu and is_gpu_eligible covering the two GPU-gate contract (Part of #35).
  • cd lean/tmir-semantics && lake build — Lean operational semantics and proofs. 2 documented sorrys remain in MemorySafety.lean (tracked in follow-up issues).

Status

Preview — Core type system and instruction definitions implemented. Active development.

Verification Goal

The long-term contract is that a tMIR module verified against its proof obligations and compiled by LLVM2 can carry source-level specifications through to generated code. The current repository provides the IR, builders, validators, formats, CLI tools, and Lean semantics package that support that contract; it is still a preview implementation rather than a finished source-to-machine proof chain.

Related Projects

Project Role
tRust Rust frontend (compiles Rust to tMIR)
tSwift Swift frontend (compiles Swift to tMIR)
tC C verification tool (generates tMIR from ACSL)
LLVM2 Verified codegen (compiles tMIR to machine code)
z4 SMT solver backend
tla2 Temporal logic verification

License

Apache 2.0 — see LICENSE.

Copyright 2026 Dropbox

About

Trusted MIR - Universal IR for the t* verified compilation stack (tRust, tSwift, tC)

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages

  • Rust 81.6%
  • Lean 18.1%
  • Other 0.3%