feat(openreview): public adapter — search/venue/paper/reviews#1294

Merged
jackwener merged 3 commits into main from feat/openreview-adapter on May 4, 2026

@jackwener
Owner

Summary

OpenReview is the open peer-review platform used by ICLR / TMLR / COLM and many ML workshops. Its v2 API (api2.openreview.net) exposes everyone-readable submissions, reviews, and decisions without auth, so all four commands run with browser: false.

  • openreview search <query> — full-text search via /notes/search?term=...&type=terms
  • openreview venue <venue> — list venue submissions; accepts either a display name (matched against content.venue, e.g. "ICLR 2024 oral") or a full invitation id (e.g. "ICLR.cc/2025/Conference/-/Submission"); offset pagination
  • openreview paper <id> — single-paper detail with full abstract
  • openreview reviews <forum> — paper + threaded reviews/decisions/comments, ordered chronologically with paper lifted to row 0; classifies notes via invitation tail (REVIEW / DECISION / REBUTTAL / COMMENT / META_REVIEW / WITHDRAWAL)
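The invitation-tail classification used by `reviews` could be sketched roughly as follows. This is an illustration under assumptions, not the adapter's code: `classifyNote`, the pattern table, and the `OTHER` fallback are hypothetical, and the example invitation tails (`Official_Review`, `Meta_Review`, and so on) are assumed naming conventions rather than something this PR documents.

```typescript
type NoteType =
  | "REVIEW" | "DECISION" | "REBUTTAL" | "COMMENT"
  | "META_REVIEW" | "WITHDRAWAL" | "OTHER";

// Order matters: "Meta_Review" has to be tested before the generic "Review".
const TAIL_RULES: Array<[RegExp, NoteType]> = [
  [/meta_?review/i, "META_REVIEW"],
  [/review/i, "REVIEW"],
  [/decision/i, "DECISION"],
  [/rebuttal/i, "REBUTTAL"],
  [/withdraw/i, "WITHDRAWAL"],
  [/comment/i, "COMMENT"],
];

// Classify a note by the last path segment ("tail") of its invitation id.
function classifyNote(invitation: string): NoteType {
  const tail = invitation.split("/").pop() ?? "";
  for (const [pattern, type] of TAIL_RULES) {
    if (pattern.test(tail)) return type;
  }
  return "OTHER"; // hypothetical fallback for unrecognized tails
}
```

Matching on the tail only keeps the classifier independent of the venue prefix, so the same rules apply to any conference that follows the assumed naming.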

IDs returned by the listing commands round-trip into paper / reviews (id === forum for top-level submissions). PDF URLs are normalized to absolute https://openreview.net/pdf/... links. pdate falls back to cdate when missing, and dates are formatted as YYYY-MM-DD.
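A minimal sketch of the PDF and date normalization, for illustration only. The helper names (`normalizePdfUrl`, `formatDate`), the `NoteLike` shape, the assumption that `pdate` / `cdate` are epoch milliseconds, UTC formatting, and the empty-string result for missing values are all assumptions of this example, not details confirmed by the PR.

```typescript
interface NoteLike {
  pdate?: number; // publication time, assumed to be epoch milliseconds
  cdate?: number; // creation time, assumed to be epoch milliseconds
}

const OPENREVIEW_ORIGIN = "https://openreview.net";

// Turn a relative path such as "/pdf/<hash>.pdf" into an absolute URL.
// Already-absolute URLs pass through unchanged.
function normalizePdfUrl(pdf: string | undefined): string {
  if (!pdf) return "";
  if (/^https?:\/\//.test(pdf)) return pdf;
  return OPENREVIEW_ORIGIN + (pdf.charAt(0) === "/" ? pdf : "/" + pdf);
}

// Prefer pdate, fall back to cdate, and format as YYYY-MM-DD (UTC).
function formatDate(note: NoteLike): string {
  const ms = note.pdate ?? note.cdate;
  if (ms === undefined) return "";
  return new Date(ms).toISOString().slice(0, 10);
}
```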

Agent-native contract (lessons applied from #1292 / #1293)

  • All limits/offsets/ids fail-fast with typed errors (ArgumentError / EmptyResultError / CommandExecutionError) — no silent clamping, no Math.min/max, no empty-array fallbacks.
  • coerceInt accepts numeric strings but rejects floats / NaN / 'abc' so CLI argv coercions don't silently drift.
  • openreviewFetch wraps fetch rejections, non-2xx responses, and res.json() parse failures in CommandExecutionError; a 404 is mapped to null so callers can throw a contextual EmptyResultError. Network/API failures never look like empty results.
  • requireForumId validates [A-Za-z0-9_-]{6,20} so bad ids fail before any network call.
  • noteToRow falls back from content.authors to authorIds (with ~ / trailing-digit cleanup) only when the primary field is missing, so it never silently masks a partial response.
  • reviews per-row truncation (--max-length) has a 200-char floor; smaller values are rejected rather than raised with Math.max.
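The fail-fast argument handling described above might look something like this sketch. The signatures and error messages are assumptions made for illustration; only the names `coerceInt`, `requireForumId`, `ArgumentError`, and the `[A-Za-z0-9_-]{6,20}` pattern come from this description.

```typescript
class ArgumentError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ArgumentError";
  }
}

// Accept integers and integer-looking strings; reject floats, NaN, and text.
function coerceInt(value: unknown, name: string): number {
  if (typeof value === "number" && isFinite(value) && Math.floor(value) === value) {
    return value;
  }
  if (typeof value === "string" && /^-?\d+$/.test(value.trim())) {
    return Number(value.trim());
  }
  throw new ArgumentError(`${name} must be an integer, got ${String(value)}`);
}

const FORUM_ID = /^[A-Za-z0-9_-]{6,20}$/;

// Validate the id shape before any network call is made.
function requireForumId(id: string): string {
  if (!FORUM_ID.test(id)) {
    throw new ArgumentError(`invalid forum id: ${id}`);
  }
  return id;
}
```

Throwing instead of clamping means a caller that passes `--limit 3.5` or a malformed id sees an explicit error rather than a quietly adjusted request.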

Output Columns

| Command | Columns |
| --- | --- |
| search | rank, id, title, authors, venue, pdate, url |
| venue | rank, id, title, authors, keywords, primary_area, pdate, pdf, url |
| paper | id, title, authors, keywords, venue, venueid, primary_area, abstract, pdate, pdf, url |
| reviews | type, author, rating, confidence, text |

Test plan

  • npm run typecheck
  • npx vitest run clis/openreview/ — 1 file / 23 tests passing
  • npm run build — manifest regenerated, 655 entries, 4 new openreview entries
  • bash scripts/check-doc-coverage.sh --strict — 106/106
  • Live: openreview search "diffusion model" --limit 3 — returns ranked papers with venue + pdate
  • Live: openreview venue "ICLR 2024 oral" --limit 3 — returns ICLR 2024 orals with keywords/primary_area/pdf
  • Live: openreview paper KS8mIvetg2 — full abstract intact
  • Live: openreview reviews KS8mIvetg2 --max-length 800 — PAPER row first, then 3 REVIEW rows with rating/confidence
  • Live: openreview paper notarealid → EMPTY_RESULT; openreview venue "ICLR 2099 oral" → EMPTY_RESULT; empty search "" → ARGUMENT

jackwener added 3 commits May 4, 2026 19:15
OpenReview is the open peer-review platform used by ICLR / TMLR / COLM
and ML workshops. Its v2 API exposes everyone-readable submissions,
reviews, and decisions without auth, so all four commands run with
`browser: false`.

Commands:
- `openreview search <query>` — full-text search
- `openreview venue <venue>` — list submissions; accepts either a venue
  display name (matched against `content.venue`, e.g. "ICLR 2024 oral")
  or a full invitation id (e.g. "ICLR.cc/2025/Conference/-/Submission")
  via `/-/` heuristic; supports offset pagination
- `openreview paper <id>` — single-paper detail with full abstract
- `openreview reviews <forum>` — paper + threaded reviews/decisions/
  comments, ordered chronologically with paper lifted to row 0;
  classifies notes via invitation tail (REVIEW / DECISION / REBUTTAL /
  COMMENT / META_REVIEW / WITHDRAWAL); per-row truncation via
  `--max-length` (min 200)
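
The `/-/` dispatch for `venue` could be sketched as below. Everything here is an assumption for illustration: `buildVenueUrl` is a hypothetical helper, and the `/notes` path with `invitation` / `content.venue` query parameters is a guess at how the request is formed, since this description does not spell it out.

```typescript
const API_BASE = "https://api2.openreview.net";

// Treat anything containing "/-/" as an invitation id; otherwise treat it
// as a display name to match against content.venue.
function buildVenueUrl(venue: string, limit: number, offset: number): string {
  const key = venue.indexOf("/-/") !== -1 ? "invitation" : "content.venue";
  return `${API_BASE}/notes?${key}=${encodeURIComponent(venue)}` +
    `&limit=${limit}&offset=${offset}`;
}
```

A display name such as "ICLR 2024 oral" never contains `/-/`, so a substring check is enough to tell the two input forms apart.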

Listing IDs round-trip into `paper`/`reviews`. PDF URLs normalized to
absolute `https://openreview.net/pdf/...`. `pdate` falls back to
`cdate` when missing, formatted as `YYYY-MM-DD`.

All limits/offsets/ids fail-fast with typed errors (`ArgumentError`,
`EmptyResultError`, `CommandExecutionError`) — no silent clamping, no
empty-array fallbacks. fetch + json + non-2xx + 404 are wrapped so
network/API failures never look like empty results.
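
A rough sketch of the wrapping described here, under stated assumptions: the `fetchImpl` parameter, the `ResponseLike` shape, and the error messages are hypothetical and exist only to keep the example self-contained; the real `openreviewFetch` signature is not shown in this PR.

```typescript
class CommandExecutionError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "CommandExecutionError";
  }
}

interface ResponseLike {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}
type FetchLike = (url: string) => Promise<ResponseLike>;

// Returns parsed JSON, null for a 404, and throws CommandExecutionError for
// network failures, other non-2xx statuses, and unparseable bodies.
async function openreviewFetch<T>(url: string, fetchImpl: FetchLike): Promise<T | null> {
  let res: ResponseLike;
  try {
    res = await fetchImpl(url);
  } catch (err) {
    throw new CommandExecutionError(`network error for ${url}: ${String(err)}`);
  }
  if (res.status === 404) return null; // caller decides how to report "not found"
  if (!res.ok) {
    throw new CommandExecutionError(`HTTP ${res.status} for ${url}`);
  }
  try {
    return (await res.json()) as T;
  } catch (err) {
    throw new CommandExecutionError(`invalid JSON from ${url}: ${String(err)}`);
  }
}
```

Returning null only for 404 lets the caller raise an `EmptyResultError` that names the missing paper or venue, while every other failure stays a `CommandExecutionError`.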

Tests: 23 unit tests covering the column contract, content extraction,
date/PDF normalization, invitation-vs-venue dispatch, error paths
(network/JSON/HTTP), pagination offset accounting, and the reviews
classifier + section joiner + truncation.

Live-verified against api2.openreview.net for search ("diffusion
model"), venue ("ICLR 2024 oral"), paper (KS8mIvetg2), and reviews on
that paper's full thread.
@jackwener jackwener merged commit 3281409 into main May 4, 2026
11 checks passed
jackwener pushed a commit that referenced this pull request May 4, 2026
* docs(cases): add three researcher workflow examples

Add use cases under cases/ that exercise the recently-landed
researcher-friendly adapters:

- daily-rl-research-monitor.md uses arxiv recent + openreview venue
  + hf top to compress a morning paper-skim into one shell pipeline.
- find-paper-implementation.md chains arxiv search/paper + dblp
  search + hf top + openreview search to map a paper's canonical
  record, follow-ups, and community uptake.
- track-conference-papers.md walks openreview venue + reviews to
  shortlist accepted papers and digest review threads in batch.

Each file is a real workflow built on commands from #1289 (arxiv
recent), #1294 (openreview), and #1299 (dblp).

* docs(cases): correct venue ids and forum example to ones that return data

The first revision used "ICLR.cc/2026/Conference" and "ICLR 2026 oral"
as venue strings. Both return EMPTY_RESULT today because the venue is
not open. Update each case to use natural-language venue text that
OpenReview currently exposes ("ICLR 2024 oral", "NeurIPS 2025 oral")
and a real forum id (KS8mIvetg2, "Proving Test Set Contamination in
Black-Box Language Models") in the reviews / paper drill-down. Note
the arxiv free-text-search ranking quirk so the worked DPO example
makes sense.