Release The criterion compiler release (v3.1.0) · asystemoffields/interp-lab

v3.1.0
2a966df

asystemoffields tagged this 10 Jun 18:37

Rebuild the natural-language criterion front door around "generate big,
score tiny, verify everything":

- compile-criterion: criterion -> verified prompt dataset + preset, with
  agent (two-phase generation-request flow), llamacpp, and heuristic
  generators; NLI margin gate (pos >= 0.7, neg <= 0.3, at least 8 per
  side, recorded exclusions, balance trim) plus real assay validation.
- score-prompts: score any dataset against a criterion hypothesis with a
  tiny zero-shot NLI cross-encoder (new [criteria] extra, ~70M, nothing
  bundled); exact ScoredPrompt JSONL with per-row criterion_score_source.
- Honest degradation: hash-cosine fallback labeled weak everywhere,
  margin gate downgrades to advisory with explicit warnings.
- 2 new MCP tools (21 total): compile_criterion, score_prompts.
- AGENTS.md opens with "operationalize the criterion first".
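
As an illustration of the margin gate described above, here is a minimal
sketch, not the project's actual implementation. The thresholds (0.7, 0.3,
8 per side) come from these notes; the names `margin_gate` and `GateResult`,
the `(prompt, score)` input shape, and the reading of "balance trim" as
"drop the least confident rows from the larger side" are assumptions made
for the example.

```python
from dataclasses import dataclass

POS_MIN = 0.7       # from the notes: positives must score >= 0.7
NEG_MAX = 0.3       # from the notes: negatives must score <= 0.3
MIN_PER_SIDE = 8    # from the notes: at least 8 rows per side


@dataclass
class GateResult:
    """Hypothetical result container; field names are illustrative."""
    positives: list
    negatives: list
    excluded: list   # (prompt, score, reason) tuples, so nothing is dropped silently
    passed: bool


def margin_gate(scored, pos_min=POS_MIN, neg_max=NEG_MAX,
                min_per_side=MIN_PER_SIDE):
    """Split (prompt, score) pairs by margin, record exclusions, balance sides."""
    positives, negatives, excluded = [], [], []
    for prompt, score in scored:
        if score >= pos_min:
            positives.append((prompt, score))
        elif score <= neg_max:
            negatives.append((prompt, score))
        else:
            excluded.append((prompt, score, "inside margin"))

    # Assumed balance trim: keep the most confident rows on each side.
    positives.sort(key=lambda row: row[1], reverse=True)
    negatives.sort(key=lambda row: row[1])
    keep = min(len(positives), len(negatives))
    for prompt, score in positives[keep:] + negatives[keep:]:
        excluded.append((prompt, score, "balance trim"))
    positives, negatives = positives[:keep], negatives[:keep]

    # After trimming both sides have `keep` rows, so one comparison suffices.
    return GateResult(positives, negatives, excluded, keep >= min_per_side)
```

With ten prompts scored 0.9, eight scored 0.1, and three scored 0.5, this
sketch keeps eight rows per side, records five exclusions (three inside the
margin, two trimmed for balance), and reports the gate as passed.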

539 tests (up from 500), all heavy paths stubbed behind factory seams.
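
A rough sketch of what "stubbed behind factory seams" can look like, combined
with the labeled weak fallback mentioned above. Everything except the
`criterion_score_source` field name is an assumption for illustration: the
function names, the `prompt` and `criterion_score` keys, the source label
strings, and the choice of hashed bag-of-words cosine as the "hash-cosine"
fallback are not confirmed by these notes.

```python
import hashlib
import math


def hash_cosine_score(prompt, hypothesis, dims=256):
    """Assumed weak fallback: cosine similarity of hashed bag-of-words vectors."""
    def embed(text):
        vec = [0.0] * dims
        for token in text.lower().split():
            digest = hashlib.sha256(token.encode("utf-8")).hexdigest()
            vec[int(digest, 16) % dims] += 1.0
        return vec

    a, b = embed(prompt), embed(hypothesis)
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def default_scorer_factory():
    """Placeholder for loading the real NLI model; here it is never available."""
    raise ImportError("NLI cross-encoder not installed")


def score_prompts(prompts, hypothesis, scorer_factory=default_scorer_factory):
    """Score prompts; the factory argument is the seam that tests replace."""
    try:
        scorer = scorer_factory()
        source = "nli"
    except ImportError:
        # Degrade honestly: fall back, and say so on every row.
        scorer = hash_cosine_score
        source = "hash-cosine (weak)"
    return [
        {
            "prompt": p,
            "criterion_score": scorer(p, hypothesis),
            "criterion_score_source": source,
        }
        for p in prompts
    ]
```

A test can then pass `scorer_factory=lambda: (lambda p, h: 0.9)` to exercise
the scoring path without loading any model, while the default path shows the
fallback label reaching every row.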

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
