Verify-or-Trust v0.1.0

@jang1563 released this 15 Jun 21:03
· 9 commits to main since this release

First public release of Verify-or-Trust, a verifiable-reward benchmark for calibrated verification in LLM-orchestrated biology foundation-model pipelines.

Included

  • Installable Python package and vot CLI.
  • Shipped GEARS/Norman substrate for out-of-the-box, LLM-free reproduction.
  • Panel builder, baseline value proof, grading, and mocked environment tests.
  • Benchmark card, schema documentation, release process, and results tables.
  • Machine-readable artifact_manifest.json and JSON schemas.
  • Strict generated JSONL handling: missing fm_log2FC values are emitted as JSON null, not non-standard NaN.
  • CI matrix on Python 3.10 and 3.12 with lint, tests, K1 reproduction, and public-release validation.
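
The strict JSONL behavior noted in the list above can be illustrated with a minimal sketch. This is not the package's actual code: the helper names `sanitize` and `to_jsonl_line` and the `perturbation` field are hypothetical, and mapping infinities to null alongside NaN is an assumption. Only the `fm_log2FC` field name and the null-instead-of-NaN rule come from the release notes.

```python
import json
import math


def sanitize(value):
    """Recursively replace non-finite floats (NaN, +/-inf) with None."""
    if isinstance(value, float) and not math.isfinite(value):
        return None
    if isinstance(value, dict):
        return {key: sanitize(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [sanitize(item) for item in value]
    return value


def to_jsonl_line(record):
    """Serialize one record as a strict JSON line.

    allow_nan=False makes json.dumps raise ValueError if a non-finite
    float slips through, instead of emitting the non-standard NaN token.
    """
    return json.dumps(sanitize(record), allow_nan=False)


# Hypothetical records: the second one has a missing fm_log2FC value.
records = [
    {"perturbation": "GENE_A", "fm_log2FC": 0.42},
    {"perturbation": "GENE_B", "fm_log2FC": float("nan")},
]
lines = [to_jsonl_line(record) for record in records]
print("\n".join(lines))
```

Because the missing value is written as null, any standards-compliant JSON parser can read the line, and `json.loads` returns `None` for that field.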

Linked Dataset

The Hugging Face dataset hosts the released substrate table and Norman cell subset for live run_de reproduction:

https://huggingface.co/datasets/jang1563/verify-or-trust

Latest dataset card snapshot observed during release prep:

  • HF commit: 32c547f422b4a4963a9c39c3b82f40b4a20043a6

Snapshot

  • Git commit: 718962d12959129593e3190ccf27eab372796435
  • GitHub Actions CI: passing on main