Auditable research production system for papers that need traceable claims, admissible sources, and executable verification.
TruthWeave turns a paper repository into an audit spine: research intent in brief.yml, source admissibility in data_sources.yml, claim bindings in evidence.yml, reviewer-facing packets, and deterministic verification outputs. It is built for teams that want a paper workflow they can inspect, rerun, and hand off without losing the evidence trail.
Audit spine • provenance gate • reviewer packet • verification harness • domain profiles
TruthWeave is not a one-prompt autonomous paper generator. It is a system for producing research artifacts under explicit contracts: what was claimed, which sources were admissible, how evidence was bound, what reviewers should inspect, and how major claims can be replayed.
Many AI paper workflows optimize for autonomy; TruthWeave optimizes for admissibility and verification. It keeps traceable claims, source provenance, deterministic checks, reviewer-facing packets, executable verification paths, and domain-specific policy enforcement inside the repository rather than in undocumented process.
brief -> refs -> provenance -> claims -> review -> packet -> verification -> build
This workflow keeps the paper, its evidence, and its rerun path aligned instead of relying on prose-only handoffs.
These are the fastest ways to understand the product from the repository itself:
- Benchmark / failure corpus: `benchmarks/cases/` and `artifacts/benchmarks/benchmark_report.md` show positive cases, negative cases, and blocked shortcuts as executable contract tests.
- Finance exemplar: `papers/finance_exemplar/` plus its `packet.md` and `verification_report.md` show a finance ML workflow with explicit temporal protocol and admissibility rules.
- Formal methods exemplar: `papers/formal_methods_exemplar/` plus its `packet.md` and `verification_report.md` show proof-bundle declarations and exact/file-presence verification.
- Resume / development guide: `docs/TRUTHWEAVE_RESUME_GUIDE.md` is the canonical restart and handoff guide for contributors and coding agents.
If you want to run one thing first, start with `uv run truthweave benchmark-contracts --format md` or `make exemplars`.
TruthWeave ships with `finance_ml`, `formal_methods`, and `simulation_abm` profiles. These profiles enforce domain-specific admissibility rules, forbidden substitutes, required declarations, evaluation requirements, and verification expectations so the same workflow can carry different research contracts honestly.
This repository already includes concrete proof assets rather than a proposal-only skeleton:
- A deterministic benchmark corpus with positive and negative contract cases under `benchmarks/cases/`.
- Reviewer packet export under `artifacts/packets/` for inspection and handoff.
- A verification harness under `artifacts/verification/` with replay targets and reports.
- Canonical exemplar papers under `papers/finance_exemplar/` and `papers/formal_methods_exemplar/`.
- Profile reports under `artifacts/profiles/` showing domain-policy enforcement in practice.
TruthWeave is for researchers, labs, and engineering-heavy paper workflows that care about reproducibility, auditable claims, domain-valid evidence, and handoff-ready research contracts. It is not primarily for users who want one-prompt autonomous paper generation without provenance, policy checks, or verification.
The full quickstart below walks the generic paper path end to end using the `example` paper; the faster product-level entry points are the benchmark corpus and the two exemplars.
uv sync
uv run truthweave validate-brief --paper example
uv run truthweave validate-profile --paper example
uv run truthweave validate-provenance --paper example
uv run truthweave run exp=example
uv run truthweave discover
uv run truthweave build-paper-assets --paper example
uv run truthweave sync-refs --paper example
uv run truthweave validate-evidence --paper example
uv run truthweave provenance-report --paper example --format md
uv run truthweave claim-report --paper example --format md
uv run truthweave review-thread --paper example --phase draft_reviewed --format md
uv run truthweave reviewer-packet --paper example --format md
uv run truthweave verify-paper --paper example --format md
uv run truthweave profile-report --paper example --format md
uv run truthweave check --paper example
uv run truthweave benchmark-contracts --format md

TruthWeave keeps the research contract, generated assets, and verification outputs in one repository-native layout:
brief.yml / references.yml / data_sources.yml / evidence.yml
-> runs/ + artifacts/
-> papers/<paper_id>/auto
-> reviewer packet + verification report + PDF
This repository intentionally includes four papers with different roles:
- `example`: the minimal single-paper baseline used by Quickstart commands.
- `demo_paper`: a second lightweight paper used to demonstrate multi-paper operations (`discover`, `*-all` Make targets, and cross-paper checks).
- `finance_exemplar`: a richer finance ML profile demo with explicit temporal protocol, baselines, admissibility declarations, packet export, and verification targets.
- `formal_methods_exemplar`: a richer formal methods profile demo with proof bundle declarations, exact-match/file-presence verification, packet export, and profile enforcement.
Keeping all four in the repository lets you test:
- single-paper onboarding (`--paper example`)
- multi-paper repository workflows (`make assets-all`, `make check-all`)
- profile-specific exemplar demos (`make exemplars`)
uv run truthweave create-paper <paper_id>
# Or copy from an existing paper:
uv run truthweave create-paper <paper_id> --from <base_paper_id>

Conference-specific `.cls`/`.sty` files should be placed in `papers/<paper_id>/styles/`.
uv run truthweave create-exp <exp_name>

AI Collaboration Contract (restrict editable files):
This repository has a fixed structure.
Allowed files to edit:
- conf/exp/<exp_name>.yaml
- src/truthweave/experiments/<exp_name>.py
Do not create or modify any other files/directories.
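The experiment config allowed by the contract above, `conf/exp/<exp_name>.yaml`, holds the experiment's parameters. As an illustration only (the field names below are assumptions, not TruthWeave's verified schema — consult the file scaffolded by `create-exp` for the real shape), such a config might look like:

```yaml
# conf/exp/myexp.yaml -- hypothetical sketch; field names are assumptions.
seed: 42                 # fixed seed so reruns stay deterministic
dataset: mydata          # dataset id expected under data/raw/
params:
  learning_rate: 0.001
  epochs: 10
```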
Run the experiment:
uv run truthweave run exp=<exp_name>

uv run truthweave create-dataset <dataset_id>

Place raw data files in `data/raw/<dataset_id>/`.
uv run truthweave create-analysis <analysis_name>
make analysis NAME=<analysis_name>
# Or run directly:
uv run python -m truthweave.analysis.<analysis_name>

Sync metrics, figures, and tables to the paper:
uv run truthweave build-paper-assets --paper <paper_id>

The paper should use `\input{auto/variables.tex}` and reference macros instead of hardcoded numbers.
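The macro-based referencing can be sketched as follows, assuming `build-paper-assets` emits macro definitions into `auto/variables.tex` (the macro name `\mainAccuracy` is hypothetical, for illustration only):

```latex
% In the paper preamble or body: pull in the generated values.
\input{auto/variables.tex}

% Reference a generated macro instead of typing the number by hand,
% so the prose stays in sync with the latest run.
Our model reaches an accuracy of \mainAccuracy{} on the held-out split.
```

Because the number comes from a generated file, rerunning the pipeline updates the paper without manual edits, and the inline-number check has nothing to flag.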
Select a deterministic domain contract in brief.yml:
research_profile: simulation_abm

Built-in profiles live under `profiles/` and strengthen the generic trust stack with domain-specific admissibility rules. Current starter packs include:
- `simulation_abm`: requires explicit simulation environment/seed/config declarations and run-backed evidence
- `finance_ml`: requires temporal split metadata, benchmark/baseline declarations, and leakage-sensitive evaluation fields
- `formal_methods`: requires proof checker/witness declarations and exact or manifest-based verification expectations
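Profile selection in `brief.yml` can be sketched roughly as follows; apart from `research_profile` and `data_regime` (both described in this README), the surrounding keys are illustrative assumptions, not the verified schema:

```yaml
# brief.yml -- illustrative sketch; keys other than research_profile
# and data_regime are assumptions about the schema.
research_profile: finance_ml
data_regime: synthetic_market   # makes synthetic data admissible
claims:
  - claim_id: C1
    statement: "Model X outperforms the declared baseline on the temporal split."
```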
Validate and export the profile compliance report with:
uv run truthweave validate-profile --paper <paper_id>
uv run truthweave profile-report --paper <paper_id> --format md

Profile reports are written to `artifacts/profiles/<paper_id>/profile_report.json` and `artifacts/profiles/<paper_id>/profile_report.md`.
Use these two papers as canonical onboarding references:
- `finance_exemplar` demonstrates a finance ML contract with an explicit temporal split, leakage controls, declared baselines, provenance-aware synthetic-market data, reviewer packet export, and replayable verification. Inspect `papers/finance_exemplar/brief.yml`, `artifacts/profiles/finance_exemplar/profile_report.md`, `artifacts/packets/finance_exemplar/packet.md`, and `artifacts/verification/finance_exemplar/verification_report.md`. The important product point is that synthetic data is admissible only because the brief explicitly sets `data_regime: synthetic_market`; the benchmark corpus shows the blocked real-market substitute case.
- `formal_methods_exemplar` demonstrates a formal methods contract with local checker/witness artifacts, exact-match witness verification, file-presence checks for proof objects, and packet/report exports tied to a declared proof bundle. Inspect `papers/formal_methods_exemplar/brief.yml`, `papers/formal_methods_exemplar/proofs/checker.txt`, `artifacts/profiles/formal_methods_exemplar/profile_report.md`, `artifacts/packets/formal_methods_exemplar/packet.md`, and `artifacts/verification/formal_methods_exemplar/verification_report.md`.
Build both canonical demos with:
make exemplars

Bind each brief claim to concrete repo artifacts:
uv run truthweave scaffold-evidence --paper <paper_id>
uv run truthweave validate-evidence --paper <paper_id>
uv run truthweave claim-report --paper <paper_id> --format md

`evidence.yml` references canonical `claim_id` values from `brief.yml` and points to deterministic artifacts such as:
- generated variables in `auto/variables.tex`
- manifest pointers in `auto/MANIFEST.json`
- concrete files under `runs/`, `papers/<paper_id>/figures/`, or `papers/<paper_id>/tables/`
The generated claim ledger is written to artifacts/claims/<paper_id>/claim_ledger.json.
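A single binding in `evidence.yml` might look like the following sketch. The `claim_id` value and the artifact paths follow the conventions described above; the remaining key names and the run path are assumptions, not the verified schema:

```yaml
# evidence.yml -- hypothetical sketch of one claim binding.
- claim_id: C1
  artifacts:
    - auto/variables.tex          # generated values backing the claim
    - auto/MANIFEST.json          # manifest pointer
    - runs/myexp/metrics.json     # hypothetical run output path
```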
Declare the admissible data acquisition contract before relying on evidence:
uv run truthweave scaffold-provenance --paper <paper_id>
uv run truthweave validate-provenance --paper <paper_id>
uv run truthweave provenance-report --paper <paper_id> --format md

`data_sources.yml` is repo-local and deterministic. It records each `source_id` declared by `brief.yml`, its acquisition mode, reproducibility level, and local pointers such as files, directories, and manifest references. The generated provenance ledger is written to `artifacts/provenance/<paper_id>/provenance_ledger.json`.
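Based on the fields just listed (acquisition mode, reproducibility level, local pointers), a `data_sources.yml` entry can be sketched roughly as follows; the exact key names and values here are assumptions, not the verified schema:

```yaml
# data_sources.yml -- illustrative sketch, not the verified schema.
- source_id: synthetic_market_v1
  acquisition_mode: generated       # how the data entered the repo
  reproducibility: deterministic    # declared reproducibility level
  pointers:
    - data/raw/synthetic_market_v1/
    - data/raw/synthetic_market_v1/MANIFEST.json
```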
Export a reviewer-facing trust packet that aggregates the current thesis, claims, evidence, sources, references, and reproducibility caveats:
uv run truthweave reviewer-packet --paper <paper_id> --format md

This writes:
- `artifacts/packets/<paper_id>/packet.json`
- `artifacts/packets/<paper_id>/packet.md`
- `artifacts/packets/<paper_id>/claims.csv`
- `artifacts/packets/<paper_id>/sources.csv`
- `artifacts/packets/<paper_id>/rerun_checklist.md`
The packet is intended for reviewers, coauthors, and future maintainers who need a compact audit bundle without reading the whole repo first.
Export and run a deterministic verification profile for major claims:
uv run truthweave verification-report --paper <paper_id> --format md
uv run truthweave verify-paper --paper <paper_id> --format md

This writes:
- `artifacts/verification/<paper_id>/verification_report.json`
- `artifacts/verification/<paper_id>/verification_report.md`
- `artifacts/verification/<paper_id>/verification_targets.csv`
- `artifacts/verification/<paper_id>/replay_profile.md`
The reviewer packet is for inspection. The verification harness is for executable replay and comparison against declared evidence targets.
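As a rough illustration of what a declared verification target could contain, an exact-match target for a formal methods paper might be declared along these lines. All field names and the placeholder hash below are hypothetical; inspect `artifacts/verification/<paper_id>/` for the real output format:

```yaml
# Hypothetical verification target declaration -- field names are
# assumptions, not the verified schema.
- target: proof_witness_check
  kind: exact_match                # compare file content exactly
  file: proofs/checker.txt
  expected_sha256: "<hash>"        # placeholder, not a real digest
```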
TruthWeave also ships a deterministic positive/negative benchmark corpus under benchmarks/cases/. These cases are not ordinary papers; they are product regression fixtures showing:
- admissible profiled papers that pass
- warning-only cases that remain inspectable
- blocked shortcuts that violate domain policy
Run the corpus with:
uv run truthweave benchmark-contracts --format md

This writes:
- `artifacts/benchmarks/benchmark_report.json`
- `artifacts/benchmarks/benchmark_report.md`
Each case includes an `expectation.yml` that records expected pass/fail behavior for `validate-profile`, `check --mode ci`, `verify-paper`, and `build-paper`, plus expected blocker categories where relevant.
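Given the behaviors `expectation.yml` records, a case expectation can be sketched as follows; the key names and the blocker category are assumptions chosen for illustration, not the verified schema:

```yaml
# expectation.yml -- illustrative sketch for one blocked-shortcut case.
validate_profile: fail
check_ci: fail
verify_paper: fail
build_paper: pass
expected_blockers:
  - inadmissible_data_regime   # hypothetical blocker category name
```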
uv run truthweave build-paper --paper <paper_id>

Requires `latexmk` or a similar LaTeX toolchain installed.
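Since `build-paper` uses the engine declared in `truthweave.yml` (as noted in the repository overview below), that declaration might look like the following sketch; the key name is an assumption:

```yaml
# truthweave.yml -- hypothetical sketch of the engine declaration.
latex_engine: latexmk   # or another installed LaTeX build tool
```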
uv run truthweave check --paper <paper_id> --mode dev
uv run truthweave check --paper <paper_id> --mode ci

- dev mode: STRUCTURE/PAPER_NUMBERS produce warnings only
- ci mode: STRUCTURE/PAPER_NUMBERS cause failures
| Symptom | Cause | Solution |
|---|---|---|
| MANIFEST is stale | Assets not regenerated | uv run truthweave build-paper-assets --paper <paper_id> |
| No runs found | Experiment not executed | uv run truthweave run exp=<exp_name> |
| Structure check fail | Repository layout violation | Use scaffolding commands to restructure |
| Manual inline numbers detected | Hardcoded numbers in `.tex` | Replace with macros or append `% truthweave-allow-number` |
- Papers live under `papers/<paper_id>/` with `truthweave.yml`, `brief.yml`, `references.yml`, `data_sources.yml`, and `evidence.yml`.
- `brief.yml` is the canonical source of claim IDs, workflow phase state, and optional `research_profile`.
- `data_sources.yml` records admissible source acquisition and provenance state for declared `source_id` values.
- `evidence.yml` binds each claim to concrete repo artifacts, generated values, manifests, and verification metadata.
- `truthweave discover` writes `artifacts/manifests/papers_index.json`.
- `truthweave build-paper-assets --paper <paper_id>` writes `papers/<paper_id>/auto/variables.tex` and `papers/<paper_id>/auto/MANIFEST.json`.
- `truthweave provenance-report`, `claim-report`, `reviewer-packet`, `verification-report`, and `profile-report` write the corresponding ledgers and human-readable exports under `artifacts/`.
- `truthweave verify-paper --paper <paper_id>` enforces deterministic claim verification and exits nonzero when required verification targets fail.
- `truthweave build-paper --paper <paper_id>` builds the LaTeX paper using the engine declared in `truthweave.yml`.
- Make targets include `make assets`, `make refs`, `make provenance`, `make profile`, `make claims`, `make review`, `make packet`, `make verify`, `make paper`, `make benchmarks`, and `make exemplars`.
uv run truthweave create-paper mypaper

- Fill `brief.yml`, optionally select `research_profile`, and run `uv run truthweave validate-brief --paper mypaper`
- Validate the domain contract with `uv run truthweave validate-profile --paper mypaper`
- Declare references in `references.yml` and run `uv run truthweave sync-refs --paper mypaper`
- Declare required source IDs in `brief.yml` and `data_sources.yml`, then run `uv run truthweave validate-provenance --paper mypaper`
- Run experiments with `uv run truthweave run exp=<exp_name>`
- Sync paper assets with `uv run truthweave build-paper-assets --paper mypaper`
- Bind claims with `uv run truthweave validate-evidence --paper mypaper`
- Build the provenance ledger with `uv run truthweave provenance-report --paper mypaper --format md`
- Build the claim ledger with `uv run truthweave claim-report --paper mypaper --format md`
- Run `uv run truthweave review-thread --paper mypaper --phase draft_reviewed --format md`
- Generate the reviewer packet with `uv run truthweave reviewer-packet --paper mypaper --format md`
- Verify major claims with `uv run truthweave verify-paper --paper mypaper --format md`
- Export profile compliance with `uv run truthweave profile-report --paper mypaper --format md`
- Run `uv run truthweave check --paper mypaper --mode ci`
- Build the PDF with `uv run truthweave build-paper --paper mypaper`
- Add a new case under `benchmarks/cases/<case_id>/`
- Provide `expectation.yml`
- Add the minimal `paper/` overlay files needed to express the case
- Add any repo-root support files under `support/` if the case needs them
- Run `uv run truthweave benchmark-contracts --case <case_id> --format md`
- Confirm the observed contract behavior matches the expectation file
This corpus is the main regression harness for domain-policy evolution. If a profile rule changes intentionally, update the relevant expectation file and keep the case minimal and explicit.
uv run truthweave create-exp myexp

- Ask AI to edit ONLY the created files
- Add or update the matching claim entry in `brief.yml`

uv run truthweave approve-phase --paper <paper_id> --phase experiment_ready
uv run truthweave run exp=myexp
uv run truthweave create-analysis my_analysis

- Ask AI to edit ONLY the created file
make analysis NAME=my_analysis
uv run truthweave create-dataset mydata

- Place raw files into `data/raw/mydata/`
This repository includes structured skills for AI agents (Codex, GitHub Copilot, etc.) in .codex/skills/:
- Purpose: Rebuild paper assets (figures/tables/variables) deterministically for a paper_id
- Usage: Automatically invoked when AI needs to regenerate paper outputs
- Key Commands: `uv run truthweave build-paper-assets --paper <paper_id>`
- Constraints: Never manually edit `papers/<paper_id>/auto/`
- Purpose: Run TruthWeave CI checks, diagnose failures, and propose fixes
- Usage: Automatically invoked for validation and troubleshooting
- Key Commands: `uv run truthweave check --paper <paper_id> --mode ci`
- Capabilities: Detects stale assets, missing metadata, structure violations
To add or modify skills:
- Create or edit the skill in `.codex/skills/<skill_name>/SKILL.md`
- Follow the frontmatter format:

      ---
      name: skill-name
      description: Brief description
      ---
- Include: Inputs, Rules, Output format, Remediation playbook
- Skills are automatically available to AI agents
See AGENTS.md for the agent contract and editing constraints.
When collaborating with AI agents, pass the scaffolded file list or refer the agent to AGENTS.md. The default contract is to edit only explicitly allowed files and never hand-edit generated outputs under papers/<paper_id>/auto/.
uv run truthweave create-paper demo_paper
uv run truthweave validate-brief --paper demo_paper
uv run truthweave validate-provenance --paper demo_paper
uv run truthweave run exp=example
uv run truthweave build-paper-assets --paper demo_paper
uv run truthweave sync-refs --paper demo_paper
uv run truthweave provenance-report --paper demo_paper --format md
uv run truthweave claim-report --paper demo_paper --format md
uv run truthweave review-thread --paper demo_paper --phase draft_reviewed --format md
uv run truthweave reviewer-packet --paper demo_paper --format md
uv run truthweave verify-paper --paper demo_paper --format md
uv run truthweave build-paper --paper demo_paper
uv run truthweave check --paper demo_paper
make assets-all
make refs-all
make provenance-all
make claims-all
make review-all
make packet-all
make verify-all
make paper-all
make check-all

`conf/pipeline.yaml` defines what counts as the latest run and which sources flow into assets.
See LICENSE file for details.