v0.71.6 — Synth data + build runner go live
Three deferred v0.69.0 stubs go live, IRT grows, and a real soup data augment crash is fixed.
What's New
soup buildmaterialises (#231) — the dbt-for-SFT DAG no longer just dry-runs.soup build manifest.yaml --output-dir out/runs five built-in transforms (identity/drop_empty/lowercase/strip/dedup_exact), rebuildstable/viewmodels from scratch, and re-transforms only the changed rows forincrementalmodels — tracked in a SQLite state store keyed on the row hash and the model's transform+config fingerprint, so a transform change re-runs everything. Custom transforms ride in via the Python API'stransforms=map. Outputs are written atomically;--output-diris symlink-checked before any directory is created.soup data gen-magpiegenerates (#232) — feed an aligned model its chat-template prefix (chatml / llama3 / gemma / mistral auto-detected) and harvest the self-generated instruction + response via raw completion. Live for--provider ollama(/api/generate) andvllm(/v1/completions), both loopback-only;anthropicis rejected (no raw-completion endpoint). Optional--quality-filterdrops low-quality rows; exact-duplicate instructions are de-duplicated.- 2PL / 3PL eval-cost models (#213) —
soup eval irt-subset --model 2pl|3pladds per-item discrimination (and a 3PL guessing floor) on top of the existing 1PL Rasch fit, via joint coordinate-ascent MLE. - Tokenizer-aware memorization probe (#167) —
score_memorization(..., tokenizer=...)/split_prefix(..., tokenizer=...)split on real token-id boundaries and measure echo-overlap over sub-word tokens, catching BPE-level memorization whitespace tokenisation misses (library surface;soup diagnose --tokenizerCLI lands with the live probe runner).
Fixed
soup data augment --provider ollama|vllmno longer crashes (#75) — the command imported a non-existentOllamaProviderand raisedImportErroron every non-OpenAI provider. It now routes through the shared, SSRF-hardened provider factory;--model/--base-urlare honoured, the output path is containment- and symlink-checked, and the write is atomic.
Security
validate_ollama_url/validate_vllm_urlreject the0.0.0.0bind-any wildcard (loopback set is nowlocalhost/127.0.0.1/::1), matching the newervalidate_hub_endpoint/validate_webhook_urlvalidators — reachable now that Magpie threads a user-supplied--base-urlthrough these providers.
Install / Upgrade
pip install --upgrade soup-cliKnown Limitations
soup buildcustom transforms are Python-API only — the 5 built-ins ship via the CLI; a manifest cannot yet name an importable dotted-path transform.gen-magpieisollama/vllmonly —anthropicis rejected (no raw-completion endpoint); OpenAI-compatible servers work via--provider vllm --base-url.- 3PL guessing is an empirical low-ability floor, not a jointly-estimated parameter (joint 3PL MLE is unstable without priors on small eval sets).
- Tokenizer-aware memorization is library-only this release — the
soup diagnose --tokenizerCLI flag lands with the live probe runner (#165). gen-magpiequality filter is the v0.47 keyword/heuristic baseline (Llama-Guard / FineWeb-Edu classifiers remain a[data-pro]deferral).
Full notes: CHANGELOG.md