pi-serini v0.1.0

Initial public release of pi-serini as a reusable, benchmark-driven pi search-agent workspace.

v0.1.0 establishes the current core product shape: index-driven benchmark execution, BM25-backed agentic search, benchmark-aware evaluation, and reproducible run artifacts.

Highlights

- Index-driven benchmark and agentic search workflows for:
  - MS MARCO v1 Passage with dl19 and dl20
  - BrowseComp-Plus with q9, q100, q300, and qfull
- BM25-backed pi search extension with search and document-reading flows over Lucene indexes
- Shared BM25 RPC execution with single-process, shared-daemon, and sharded shared-daemon launch modes
- Benchmark-aware retrieval evaluation, judge evaluation, summarization, and Markdown reporting
- Reproducible run manifests via per-run benchmark_manifest_snapshot.json

Included in this release

Benchmarks

- browsecomp-plus: default packaged benchmark
- msmarco-v1-passage: MS MARCO v1 passage support for dl19 and dl20
- benchmark-template: tiny local end-to-end demo benchmark for development and validation

Platform capabilities

- Typed benchmark registry and benchmark-aware query/qrels/index resolution
- Node.js/TypeScript-first orchestration entrypoints under src/orchestration/
- Managed launch presets and operator surfaces under src/operator/
- Internal and trec_eval-backed retrieval evaluation paths
- Judge evaluation with benchmark-aware mode defaults and validation
- BM25 comparison and tuning tooling
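
To give a rough feel for what a typed benchmark registry with benchmark-aware query-set resolution can look like, here is a minimal TypeScript sketch. It is an illustration only, not pi-serini's actual code: the names `BenchmarkId`, `REGISTRY`, `DEFAULT_BENCHMARK`, and `resolveRun` are hypothetical, and only the benchmark and query-set names are taken from these notes. benchmark-template is left out because its query sets are not listed here.

```typescript
// Hypothetical sketch: these types and function names are assumptions made
// for illustration and are not taken from the pi-serini source tree.
type BenchmarkId = "browsecomp-plus" | "msmarco-v1-passage";

interface BenchmarkEntry {
  // Query sets that may be requested for this benchmark.
  querySets: readonly string[];
}

// Benchmark and query-set names as listed in these release notes.
const REGISTRY: Record<BenchmarkId, BenchmarkEntry> = {
  "browsecomp-plus": { querySets: ["q9", "q100", "q300", "qfull"] },
  "msmarco-v1-passage": { querySets: ["dl19", "dl20"] },
};

// browsecomp-plus is described as the default packaged benchmark.
const DEFAULT_BENCHMARK: BenchmarkId = "browsecomp-plus";

// Type guard: narrows an arbitrary string to a known benchmark id.
function isBenchmarkId(value: string): value is BenchmarkId {
  return Object.prototype.hasOwnProperty.call(REGISTRY, value);
}

// Validates a (benchmark, query set) pair before anything is launched,
// falling back to the default benchmark when none is given.
function resolveRun(
  benchmark: string | undefined,
  querySet: string,
): { benchmark: BenchmarkId; querySet: string } {
  const id = benchmark ?? DEFAULT_BENCHMARK;
  if (!isBenchmarkId(id)) {
    throw new Error(`Unknown benchmark: ${id}`);
  }
  if (!REGISTRY[id].querySets.includes(querySet)) {
    throw new Error(`Query set ${querySet} is not defined for ${id}`);
  }
  return { benchmark: id, querySet };
}
```

In this sketch, `resolveRun("msmarco-v1-passage", "dl19")` succeeds, while `resolveRun("msmarco-v1-passage", "q9")` throws, so a mismatched benchmark and query set is rejected before a run starts.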

Scope note

This release is intentionally index-driven: benchmark runs execute against prepared Lucene indexes.

Document-ingestion-first indexing workflows built around Anserini IndexCollection are planned next, but are not part of v0.1.0.

Getting started

npm run setup:browsecomp-plus
npm run setup:msmarco-v1-passage

BENCHMARK=msmarco-v1-passage \
QUERY_SET=dl19 \
MODEL=openai-codex/gpt-5.4-mini \
npm run run:benchmark:query-set
