dikw-core v0.5.2

@helebest released this 13 Jun 06:53
· 55 commits to main since this release
b178df1

synth-quality measurement + prompt tuning + post-synth self-check

This release is a synth-quality arc: it makes synth output measurable, then tunes the authoring prompt against those measurements, and adds a self-check the agent layer can gate on.

Highlights

Synth prompt-quality overhaul — Zettelkasten framing + worked English/Chinese examples (Phase 1), existing-page slug disambiguation + priority-create feedback to later fan-out groups (Phase 2), six targeted UP revisions (PR1), and an SP rewrite + cache-friendly prompt layout so the instruction prefix is byte-stable for OpenAI/codex prefix caching (PR2). The llm_max_tokens_synth default rises 2048 → 3072 (~768 tokens/page).

Synth-quality measurement foundation (Phase 0a/0b) — deterministic --eval synth diagnostics (source-chunk coverage, fallback/slug-merge ratios, category distribution) plus four opt-in, $0-by-default LLM judges with bootstrap 95% CIs: fact_entailment_ratio, category_correctness_ratio, wikilink_correctness_ratio, semantic_atomicity_ratio. Adds an A/B experiment harness (Welch t-test, no scipy) and a calibrated --judge-sample auto (n≥25 guarantees ≤±0.2 CI).
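The statistics named above can be sketched in plain Python. This is a minimal illustration, not dikw-core's implementation: the function names, the percentile bootstrap variant, and the resample count are assumptions. The normal-approximation helper shows one way the n≥25 figure could arise: at the widest case p = 0.5, the 95% half-width is 1.96 × sqrt(0.25 / 25) = 0.196, just under 0.2, while n = 24 lands slightly above it.

```python
import math
import random
from statistics import mean, variance


def bootstrap_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of per-item judge verdicts (0 or 1).

    Hypothetical helper: resamples with replacement and reads the
    alpha/2 and 1 - alpha/2 percentiles off the sorted resample means.
    """
    rng = random.Random(seed)
    n = len(values)
    means = sorted(mean(rng.choices(values, k=n)) for _ in range(n_resamples))
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi


def worst_case_half_width(n, z=1.96):
    """Normal-approximation CI half-width for a ratio at p = 0.5 (widest case)."""
    return z * math.sqrt(0.25 / n)


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Uses only the standard library; turning (t, df) into a p-value would
    additionally need the t-distribution CDF, which is omitted here.
    """
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


if __name__ == "__main__":
    verdicts = [1] * 20 + [0] * 5  # made-up sample: 20 of 25 items judged correct
    print(bootstrap_ci(verdicts))
    print(worst_case_half_width(25))
    print(welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))
```

Welch's test is the unequal-variance form of the two-sample t-test, which suits A/B arms whose judge scores may differ in spread.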

dikw client synth --verify — post-synth self-check over just this run's pages: persist / lint / semantic-duplicate legs emit one PASS/FAIL verdict, plus a report-only --judge grounding leg.

title_slug_quality lint — deterministic, zero-false-positive K-page title/slug hygiene (also a synth --verify gated leg).

dikw client eval --against / --write-baseline — machine-readable, direction-aware single-run regression gate.
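As a rough illustration of what "direction-aware" can mean for a regression gate, the sketch below compares one run against a baseline and flags only metrics that moved in their bad direction. Everything here is hypothetical: the function name, the `directions` mapping, the `fallback_ratio` key, and the tolerance are assumptions for illustration and do not describe the actual baseline file format or CLI behavior.

```python
def regression_gate(baseline, current, directions, tolerance=0.0):
    """Return {metric: delta} for metrics that worsened by more than tolerance.

    directions maps each metric name to "higher_is_better" or
    "lower_is_better" (hypothetical labels chosen for this sketch).
    """
    regressions = {}
    for name, direction in directions.items():
        delta = current[name] - baseline[name]
        # Express the change as "how much worse", whatever the direction.
        worsened = -delta if direction == "higher_is_better" else delta
        if worsened > tolerance:
            regressions[name] = delta
    return regressions


if __name__ == "__main__":
    baseline = {"fact_entailment_ratio": 0.90, "fallback_ratio": 0.10}
    current = {"fact_entailment_ratio": 0.85, "fallback_ratio": 0.05}
    directions = {
        "fact_entailment_ratio": "higher_is_better",
        "fallback_ratio": "lower_is_better",
    }
    failed = regression_gate(baseline, current, directions, tolerance=0.02)
    print("FAIL" if failed else "PASS", failed)
```

In this made-up run both metrics drop by 0.05, but only the entailment ratio counts as a regression, because a lower fallback ratio is an improvement.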

Provider robustness: anthropic_compat + openai_compat complete() now stream, so the read timeout applies per SSE event (fixes reasoning-model timeouts on long syntheses, e.g. MiniMax-M3) and SDK failures are classified transient vs permanent. dikw client check no longer false-fails reasoning-model LLMs/embeddings; openai_compat embeddings are re-ordered by response index.
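The embedding re-ordering can be illustrated with a short sketch. OpenAI-style embedding responses carry an `index` field on each item in `data`, identifying which input the vector belongs to; sorting on it restores input order even if a compatible server returns items out of order. The helper name below is hypothetical and this is not the project's actual code.

```python
def embeddings_in_input_order(data):
    """Return embedding vectors sorted by each item's `index` field.

    `data` is the list found under the "data" key of an OpenAI-style
    embeddings response: dicts with "index" and "embedding" keys.
    """
    ordered = sorted(data, key=lambda item: item["index"])
    return [item["embedding"] for item in ordered]


if __name__ == "__main__":
    # Made-up response items that arrived out of order.
    response_data = [
        {"index": 2, "embedding": [0.3]},
        {"index": 0, "embedding": [0.1]},
        {"index": 1, "embedding": [0.2]},
    ]
    print(embeddings_in_input_order(response_data))
```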

See the CHANGELOG for the complete, itemized list.

Install: pip install dikw-core==0.5.2 (or dikw-core[postgres]==0.5.2).