Release Verify-or-Trust v0.1.0 · jang1563/verify-or-trust

Verify-or-Trust v0.1.0

First public release of Verify-or-Trust, a verifiable-reward benchmark for calibrated verification in LLM-orchestrated biology foundation-model pipelines.

Included

Installable Python package and vot CLI.
Shipped GEARS/Norman substrate for out-of-the-box, LLM-free reproduction.
Panel builder, baseline value proof, grading, and mocked environment tests.
Benchmark card, schema documentation, release process, and results tables.
Machine-readable artifact_manifest.json and JSON schemas.
Strict generated JSONL handling: missing fm_log2FC values are emitted as JSON null, not non-standard NaN.
CI matrix on Python 3.10 and 3.12 with lint, tests, K1 reproduction, and public-release validation.
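
The strict JSONL behavior noted in the list above can be illustrated with a minimal sketch. This is not the package's actual implementation: the helper name to_strict_jsonl and the gene field are hypothetical, and only the fm_log2FC field name comes from these notes. It assumes missing values arrive as float NaN at the top level of each record.

```python
import json
import math


def to_strict_jsonl(records):
    """Serialize dicts to JSONL, mapping top-level float NaN values to JSON null."""
    lines = []
    for record in records:
        cleaned = {
            key: None if isinstance(value, float) and math.isnan(value) else value
            for key, value in record.items()
        }
        # allow_nan=False makes json.dumps raise ValueError on any NaN or
        # Infinity that slipped through, instead of emitting non-standard JSON.
        lines.append(json.dumps(cleaned, allow_nan=False))
    return "\n".join(lines)


# Hypothetical records; only the fm_log2FC field name comes from the release notes.
records = [
    {"gene": "GENE_A", "fm_log2FC": 0.5},
    {"gene": "GENE_B", "fm_log2FC": float("nan")},
]
print(to_strict_jsonl(records))
```

Python's json module writes NaN by default, which strict JSON parsers reject, so passing allow_nan=False is a cheap guard against regressions.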

Linked Dataset

The Hugging Face dataset hosts the released substrate table and Norman cell subset for live run_de reproduction:

https://huggingface.co/datasets/jang1563/verify-or-trust

Latest dataset card snapshot observed during release prep:

HF commit: 32c547f422b4a4963a9c39c3b82f40b4a20043a6
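
For reproducibility, the dataset can be fetched at the exact commit recorded above rather than at the moving default branch. The following is a sketch that assumes the huggingface_hub package is installed; the helper name download_pinned_dataset is hypothetical, and the files inside the repository are not enumerated here.

```python
# Repository ID and commit hash are taken from the release notes above.
DATASET_REPO = "jang1563/verify-or-trust"
DATASET_REVISION = "32c547f422b4a4963a9c39c3b82f40b4a20043a6"


def download_pinned_dataset(local_dir=None):
    """Download the dataset repo at the pinned commit and return its local path."""
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=DATASET_REPO,
        repo_type="dataset",  # the repo is a dataset, not a model
        revision=DATASET_REVISION,  # pin to the commit observed at release time
        local_dir=local_dir,
    )


# Usage (requires network access):
# path = download_pinned_dataset()
```

Pinning the revision means a later update to the dataset card or files does not silently change what a reproduction run sees.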

Snapshot

Git commit: 718962d12959129593e3190ccf27eab372796435
GitHub Actions CI: passing on main
