Verify-or-Trust v0.1.0
First public release of Verify-or-Trust, a verifiable-reward benchmark for calibrated verification in LLM-orchestrated biology foundation-model pipelines.
Included
- Installable Python package and vot CLI.
- GEARS/Norman shipped substrate for out-of-the-box LLM-free reproduction.
- Panel builder, baseline value proof, grading, and mocked environment tests.
- Benchmark card, schema documentation, release process, and results tables.
- Machine-readable artifact_manifest.json and JSON schemas.
- Strict generated JSONL handling: missing fm_log2FC values are emitted as JSON null, not non-standard NaN.
- CI matrix on Python 3.10 and 3.12 with lint, tests, K1 reproduction, and public-release validation.
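The strict JSONL item above can be illustrated with a minimal sketch. This is not the package's actual writer: the function name to_strict_jsonl and the sample rows are hypothetical, and only top-level float values are checked. It relies on the standard-library json module, whose allow_nan=False option raises ValueError instead of emitting the non-standard NaN or Infinity tokens.

```python
import json
import math


def to_strict_jsonl(records):
    """Serialize dicts to JSONL text, writing NaN values as JSON null.

    Hypothetical helper for illustration; not part of the released package.
    """
    lines = []
    for record in records:
        # Replace top-level NaN floats with None, which json.dumps writes as null.
        cleaned = {
            key: (None if isinstance(value, float) and math.isnan(value) else value)
            for key, value in record.items()
        }
        # allow_nan=False raises ValueError if any NaN/Infinity is still present,
        # so non-standard tokens can never reach the output.
        lines.append(json.dumps(cleaned, allow_nan=False))
    return "\n".join(lines)


# Made-up rows; only the fm_log2FC field name comes from the release notes.
rows = [
    {"gene": "GENE_A", "fm_log2FC": 1.25},
    {"gene": "GENE_B", "fm_log2FC": float("nan")},
]
jsonl = to_strict_jsonl(rows)
```

Each line of the result parses with any standards-compliant JSON reader, whereas a bare NaN token would be rejected by strict parsers.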
Linked Dataset
The Hugging Face dataset hosts the released substrate table and Norman cell subset for live run_de reproduction:
https://huggingface.co/datasets/jang1563/verify-or-trust
Latest dataset card snapshot observed during release prep:
- HF commit: 32c547f422b4a4963a9c39c3b82f40b4a20043a6
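To reproduce against exactly this snapshot, the download can be pinned to the commit above. The following is a sketch that assumes the huggingface_hub package is installed; the function name fetch_pinned_dataset is hypothetical, and it downloads the whole dataset repository rather than naming individual files, since the file layout is not listed here.

```python
DATASET_REPO = "jang1563/verify-or-trust"
DATASET_REVISION = "32c547f422b4a4963a9c39c3b82f40b4a20043a6"


def fetch_pinned_dataset(local_dir=None):
    """Download the dataset repo at the pinned commit and return its local path.

    Hypothetical helper for illustration; requires network access and
    the third-party huggingface_hub package.
    """
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=DATASET_REPO,
        repo_type="dataset",        # the repo is a dataset, not a model
        revision=DATASET_REVISION,  # pin to the commit observed at release prep
        local_dir=local_dir,
    )
```

Calling fetch_pinned_dataset() returns the directory containing the downloaded files. Pinning the revision keeps a reproduction stable even if the dataset repository is updated later.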
Snapshot
- Git commit: 718962d12959129593e3190ccf27eab372796435
- GitHub Actions CI: passing on main