Skip to content

0.0.3 — 2026-05-15

Choose a tag to compare

@github-actions github-actions released this 15 May 04:21
cc6cbc2

Release Notes

This is the diagnostic-surfaces and scenario-slicing release: instance
gains an oLRP error decomposition (Oksuz et al.), a detection-family
calibration summarizer (ECE / MCE / reliability), and a manifest-driven
slice-and-aggregate lane that runs one matching pass across N scenario
cells. Panoptic picks up boundary PQ. No paradigm shifts, no
crates.io additions — every kernel slots into the existing
vernier-core / vernier-panoptic / vernier-semantic surface.

Added

  • LRP / oLRP error decomposition (ADR-0043, ADR-0044, ADR-0045) —
    Oksuz et al. (ECCV 2018 / TPAMI 2021) Localization Recall Precision
    as an opt-in metric alongside AP. vernier.instance.optimal_lrp(gt, dt, iou=Bbox()|Segm()|Boundary()|Keypoints()) decomposes detection
    performance into oLRP_Loc + oLRP_FP + oLRP_FN, minimised over a
    per-class confidence threshold tau. CLI gains --metric {ap,olrp}
    with ap preserving the existing headline-table contract. The Rust
    core lives in crates/vernier-core/src/lrp/; the ADR-0005 firewall
    is held (no edits to matching.rs / accumulate.rs / evaluate.rs).
    Pure-NumPy oracle is the correctness contract (ADR-0043);
    kemaloksuz/LRP-Error is an opt-in tripwire, not a parity gate.
    vernier.panoptic.optimal_lrp is a typed NotImplementedError stub
    — panoptic predictions carry no per-segment score so the tau sweep
    has nothing to scan; extension is a follow-up ADR.
  • Boundary Panoptic Quality (ADR-0025 §Z1/Z2 amendment) —
    PanopticEvaluator(boundary=True, dilation_ratio=0.02) now ships
    under both parity_mode="strict" (bit-exact reproduction of
    bowenc0221/boundary-iou-api's coco_panoptic_api/evaluation.py
    at SHA 37d25586a677) and parity_mode="corrected" (deterministic,
    snapshot-based; segment-id-sorted iteration). Composition is
    iou = min(mask_iou, boundary_iou) — identical to the instance
    Boundary case (the prior Q3 row of boundary-iou-quirks.md had
    miscalled this; corrected in the same amendment). FN/FP attribution
    is unchanged; U6/U7/V1-V7/W1/W7 stand. The streaming runner threads
    boundary state per image with BoundaryScratch reuse, and
    distributed-eval partials hash the dilation_ratio into
    params_hash so silent boundary/instance partial mixing is rejected
    at envelope-validation time. No FORMAT_VERSION bump. Cityscapes
    panoptic (Z3) remains deferred.
  • Detection-family calibration summarizer (ADR-0018) —
    ECE / MCE / reliability table for bbox / segm / boundary /
    keypoints. Opt-in via Evaluator.evaluate(..., calibration=True);
    the lazy result.calibration(iou=..., n_bins=15, binning="quantile", min_score=0.05, per_class=False, ...) re-fold
    returns a vernier.calibration.CalibrationResult (polars
    reliability / per_class plus scalar ece / mce). Re-folding
    with different params does not re-run matching. Streaming pairing:
    BackgroundEvaluator.finalize_with_cells() plus the
    vernier.calibration.StreamingSnapshot wrapper. Clean-room NumPy
    oracle is the correctness contract; 16/16 parity bit-equal at
    strict mode. Panoptic and semantic calibration are deferred
    (data-model prerequisites per the ADR's per-paradigm shape map).
  • Slice-and-aggregate (ADR-0046) — manifest-driven scenario
    slicing across all three paradigms plus the vernier aggregate
    fan-in verb. Python:
    Evaluator.evaluate(..., manifest=..., cross_axes=...) accepts a
    dict, JSON / CSV path, or Arrow PyCapsule manifest and returns
    EvalResult.slices as a polars DataFrame (one row per
    (axis, value) cell). CLI: vernier eval --manifest weather.json [--cross weather,time_of_day] [--label NAME] [--metric {ap,olrp}]
    emits a v2 envelope; un-partitioned vernier eval keeps emitting
    v1 verbatim. vernier.aggregate(results, manifest, *, baseline=None, metric=None) and vernier aggregate result1.json result2.json --manifest runs.json --baseline clean fan N runs
    into a comparative table with <metric> (mPC) and
    <metric>__rpc (rPC) columns when --baseline is set. The
    tables= + manifest= cross product is a deliberate non-feature
    with a client-side recipe at
    docs/how-to/per-class-by-slice.md.
    New reference schemas: manifest-schema.md,
    aggregate-schema.md.

Install vernier-cli 0.0.3

Install prebuilt binaries via shell script

curl --proto '=https' --tlsv1.2 -LsSf https://github.com/NoeFontana/vernier/releases/download/v0.0.3/vernier-cli-installer.sh | sh

Install prebuilt binaries via powershell script

powershell -ExecutionPolicy Bypass -c "irm https://github.com/NoeFontana/vernier/releases/download/v0.0.3/vernier-cli-installer.ps1 | iex"

Download vernier-cli 0.0.3

File Platform Checksum
vernier-cli-aarch64-apple-darwin.tar.xz Apple Silicon macOS checksum
vernier-cli-x86_64-pc-windows-msvc.zip x64 Windows checksum
vernier-cli-aarch64-unknown-linux-gnu.tar.xz ARM64 Linux checksum
vernier-cli-x86_64-unknown-linux-gnu.tar.xz x64 Linux checksum