v0.3.0

@justram released this 12 May 06:57

This release keeps pi-serini current with the upstream Pi package migration while preserving the benchmark workflow surface.

Added

  • Added a BrowseComp-Plus external-run adapter at src/adapters/import_search_jsonl_run.ts and the package script npm run adapt:search-jsonl-run for normalizing one-JSON-object-per-line search-session artifacts into native run directories.
  • Added focused coverage for the external-run importer and response-confidence calibration helpers.
  • Added README links, contributed by @ricky42613, to the Pi-Serini project page and to the released BrowseComp-Plus run datasets on Hugging Face.
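
For readers unfamiliar with the one-JSON-object-per-line input format that the new adapter consumes, the following is a minimal sketch of reading such a file's contents. It is not the code in src/adapters/import_search_jsonl_run.ts: the function name `parseJsonl`, the `SearchSessionRecord` type, and the error-collection behavior are all assumptions made for illustration, and the real adapter's record schema is not described in these notes.

```typescript
// Hypothetical record shape: the real adapter's schema is not documented here.
type SearchSessionRecord = Record<string, unknown>;

interface ParseResult {
  records: SearchSessionRecord[];
  errors: { line: number; message: string }[];
}

// Parse JSONL text: one JSON object per line, blank lines ignored.
// Malformed lines are collected with their 1-based line number instead of
// aborting the whole import (an assumed policy, not the adapter's).
function parseJsonl(text: string): ParseResult {
  const records: SearchSessionRecord[] = [];
  const errors: { line: number; message: string }[] = [];

  text.split(/\r?\n/).forEach((raw, index) => {
    const line = raw.trim();
    if (line === "") return; // tolerate blank lines and a trailing newline

    try {
      const value: unknown = JSON.parse(line);
      if (typeof value !== "object" || value === null || Array.isArray(value)) {
        errors.push({ line: index + 1, message: "expected a JSON object" });
        return;
      }
      records.push(value as SearchSessionRecord);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      errors.push({ line: index + 1, message });
    }
  });

  return { records, errors };
}
```

Reporting line numbers for rejected lines makes it easier to trace a bad record back to the external artifact before any native run directory is written.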

Changed

  • Migrated Pi package dependencies and source imports from @mariozechner/* to @earendil-works/*.
  • Updated @earendil-works/pi-coding-agent and @earendil-works/pi-tui to ^0.74.0 and refreshed package-lock.json.
  • Replaced Ajv-backed TypeBox validation with TypeBox v1 native compiler APIs while preserving protocol validation behavior and structured error metadata.
  • Updated judge-evaluation calibration to compare each response's self-reported confidence against gold-answer correctness.

Fixed

  • Fixed benchmark launches against the current Pi CLI by using the explicit-extension-compatible --no-builtin-tools behavior.
  • Fixed shared-BM25 liveness detection for root-relative log paths.
  • Fixed sharded shared-BM25 merge metadata handling so merged runs synthesize canonical merged-level metadata instead of failing on shard-local metadata differences.
  • Fixed calibration computation to include a final partial confidence bin.
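
To illustrate why the final partial bin matters, here is a minimal sketch of a binned calibration error that compares self-reported confidence with correctness. It assumes equal-count bins over confidence-sorted responses and an absolute-gap (expected calibration error) aggregate; the project's actual binning scheme, bin size, and norm are not stated in these notes, and the names `Judged` and `calibrationError` are hypothetical.

```typescript
// Hypothetical input shape: confidence is assumed to be a fraction in [0, 1].
interface Judged {
  confidence: number;
  correct: boolean;
}

// Expected calibration error with equal-count bins over sorted confidences.
// The loop steps by binSize, so the last slice may hold fewer than binSize
// items and is still counted, weighted by its share of all items.
function calibrationError(items: Judged[], binSize = 100): number {
  if (!Number.isInteger(binSize) || binSize <= 0) {
    throw new RangeError("binSize must be a positive integer");
  }
  if (items.length === 0) return 0;

  const sorted = [...items].sort((a, b) => a.confidence - b.confidence);
  let weightedGap = 0;

  for (let start = 0; start < sorted.length; start += binSize) {
    const bin = sorted.slice(start, start + binSize); // final bin may be partial
    const meanConfidence = bin.reduce((sum, x) => sum + x.confidence, 0) / bin.length;
    const accuracy = bin.filter((x) => x.correct).length / bin.length;
    weightedGap += (bin.length / sorted.length) * Math.abs(meanConfidence - accuracy);
  }

  return weightedGap;
}
```

With three responses and `binSize = 2`, a loop that only visited full bins would ignore the third, highest-confidence response; the sketch above gives it its own bin with weight 1/3.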

Upgrade notes

  • Install/update Pi to 0.74.0 or newer.
  • Use @earendil-works/pi-coding-agent and @earendil-works/pi-tui in extension or SDK imports; the old @mariozechner/* Pi package names are retired upstream.
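
For extensions with many files, the scope rename can be applied mechanically. The sketch below rewrites quoted module specifiers in a source string; it is an illustrative helper, not a tool shipped with pi-serini. The function name `migratePiImports` is hypothetical, and it deliberately rewrites only the two packages named in the notes above, leaving any other `@mariozechner/*` specifier untouched.

```typescript
const OLD_SCOPE = "@mariozechner/";
const NEW_SCOPE = "@earendil-works/";

// Assumption: only the two packages named in the upgrade notes are migrated.
const PI_PACKAGES = ["pi-coding-agent", "pi-tui"];

// Escape characters that are special inside a regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Rewrite old-scope specifiers that appear right after a quote character.
// The lookahead requires a closing quote or "/" next, so subpath imports are
// migrated while longer package names sharing the prefix are not.
function migratePiImports(source: string): string {
  let out = source;
  for (const name of PI_PACKAGES) {
    const pattern = new RegExp(`(["'])${escapeRegExp(OLD_SCOPE + name)}(?=["'/])`, "g");
    out = out.replace(pattern, `$1${NEW_SCOPE}${name}`);
  }
  return out;
}
```

Run it over each source file's text and review the resulting diff before committing; the matching entries in package.json still need to be updated separately.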