dikw-core v0.6.0

helebest released this 21 Jun 11:27
· 29 commits to main since this release
8eb13b3

0.6.0 — config-driven provider API-key env vars (BREAKING); DeepSeek V4 Pro + Gitee bge-m3; horizontal model comparison

Changed

  • BREAKING — provider API-key env var is now config-driven, and DIKW_EMBEDDING_API_KEY
    is removed.
    ProviderConfig gains two required fields, llm_api_key_env and
    embedding_api_key_env, naming the environment variable that holds each leg's key.
    The engine no longer hardcodes any key var name: anthropic_compat/openai_compat
    read exactly the var named in dikw.yml, with no fallback. The dikw-invented
    DIKW_EMBEDDING_API_KEY magic name is gone — embedding keys now use vendor-canonical
    names (OPENAI_API_KEY, GITEE_API_KEY, …) chosen via embedding_api_key_env. Keeping
    the LLM and embedding keys separate is now a matter of naming distinct vars
    (point both legs at one var to share a key, or at different vars to split vendors)
    rather than relying on a special name plus a no-fallback rule. Migration: add the two fields to
    every dikw.yml provider: block (a fresh dikw init scaffold writes them), and in
    .env rename DIKW_EMBEDDING_API_KEY → the vendor var your config names; a same-vendor
    Anthropic+MiniMax .env that reused ANTHROPIC_API_KEY for a MiniMax key should move
    the MiniMax key to MINIMAX_API_KEY and set llm_api_key_env: MINIMAX_API_KEY. Wipe
    the local evals/.cache/snapshots/ after upgrading (its snapshot dikw.ymls predate
    the fields). /v1/health's api_key_present and the dikw client check probe now key
    off the configured var; the tools/e2e_verify.py real-leg gate derives its required
    keys from the active profile's provider.{llm,embedding}_api_key_env.
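
The lookup rule described above can be sketched as follows. This is a minimal illustration, not the engine's code: `resolve_api_key` and `MissingApiKeyError` are hypothetical names, and only the two field names (`llm_api_key_env`, `embedding_api_key_env`) come from these notes.

```python
from typing import Mapping


class MissingApiKeyError(RuntimeError):
    """Raised when the env var named in the config is unset or empty."""


def resolve_api_key(
    provider: Mapping[str, str], leg: str, environ: Mapping[str, str]
) -> str:
    """Read the key for one leg ("llm" or "embedding") from the configured var."""
    field = f"{leg}_api_key_env"
    if field not in provider:
        # Both fields are required, so a missing one is a config error.
        raise KeyError(f"provider block is missing required field {field!r}")
    var_name = provider[field]
    value = environ.get(var_name, "")
    if not value:
        # No fallback: no other variable name is consulted.
        raise MissingApiKeyError(f"{var_name} is not set")
    return value


# Hypothetical provider block and environment, for illustration only.
provider = {
    "llm_api_key_env": "DEEPSEEK_API_KEY",
    "embedding_api_key_env": "GITEE_API_KEY",
}
env = {
    "DEEPSEEK_API_KEY": "sk-llm",
    "GITEE_API_KEY": "sk-emb",
    "DIKW_EMBEDDING_API_KEY": "ignored-legacy-value",
}
llm_key = resolve_api_key(provider, "llm", env)
embedding_key = resolve_api_key(provider, "embedding", env)
```

Pointing both fields at the same variable name shares one key between the two legs; the removed DIKW_EMBEDDING_API_KEY is never read, even when it is still set.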

Added

  • DeepSeek V4 Pro (LLM) + Gitee AI bge-m3 (embeddings) support — config-only. DeepSeek
    runs via the existing anthropic_compat protocol against its Anthropic-compatible
    endpoint (llm_base_url: https://api.deepseek.com/anthropic, llm_model: deepseek-v4-pro,
    key in DEEPSEEK_API_KEY); DeepSeek ignores the cache_control field the provider
    sends (no error — only the Anthropic prompt-cache discount is absent, same cost note as
    openai_compat). bge-m3 runs via openai_compat embeddings against Gitee
    (embedding_base_url: https://ai.gitee.com/v1, embedding_model: bge-m3,
    embedding_dim: 1024, embedding_batch_size: 16, key in GITEE_API_KEY). No engine
    code; a committed reference config ships at tests/fixtures/live-deepseek-gitee-bgem3.dikw.yml.
    See docs/providers.md.
  • Horizontal model-comparison harness (evals/tools/compare_models.py). A dev tool
    (not shipped in the wheel) that runs the same eval dataset against N model arms and emits
    an arm-by-metric comparison matrix + per-arm JSON. compare compares embedding models
    via retrieval eval (deterministic, 1 run/arm: hit@k / mrr / nDCG@10 / recall@100);
    compare-synth compares LLM models via synth eval (N runs/arm + a Welch t-test of each
    arm vs the baseline arm: grounding / atomicity / duplicate / wikilink / language, plus judge
    dims with --judge). Each arm carries a full provider: block, so two same-protocol
    vendors (DeepSeek + MiniMax) resolve distinct keys via their *_api_key_env. Reuses the
    tested statistics from ab_experiment.py and the direction rule from client/baseline.py.
    See evals/README.md and docs/providers.md.
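
For readers unfamiliar with the statistic compare-synth reports, a Welch t-test of one arm against the baseline can be sketched with the standard library alone. This is a generic textbook implementation under assumed inputs, not the code in ab_experiment.py; the function name `welch_t` and the sample scores are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(arm: list[float], baseline: list[float]) -> tuple[float, float]:
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n_a, n_b = len(arm), len(baseline)
    if n_a < 2 or n_b < 2:
        raise ValueError("each arm needs at least two runs")
    # Squared standard error of each mean: sample variance divided by n.
    se2_a = variance(arm) / n_a
    se2_b = variance(baseline) / n_b
    if se2_a + se2_b == 0:
        raise ValueError("both arms have zero variance")
    t = (mean(arm) - mean(baseline)) / math.sqrt(se2_a + se2_b)
    # Unequal-variance degrees of freedom (Welch-Satterthwaite approximation).
    df = (se2_a + se2_b) ** 2 / (se2_a**2 / (n_a - 1) + se2_b**2 / (n_b - 1))
    return t, df


# Made-up per-run metric values for one arm and the baseline arm.
t_stat, dof = welch_t([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 3.0, 4.0, 5.0, 6.0])
```

A negative t means the arm scored below the baseline on that metric; whether lower is better depends on the metric's direction, which the notes say is taken from client/baseline.py.
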
  • Real-environment end-to-end verification harness (tools/e2e_verify.py). A dev
    tool (not shipped in the wheel) that drives every dikw client verb against a
    live server in one of two throwaway environments, then destroys it: --mode local
    (temp-dir base + long-lived dikw serve on SQLite) and --mode docker (server +
    pgvector Postgres via a generated compose project, image built from the local
    working tree, not the released PyPI examples/docker/Dockerfile). CLI coverage is
    asserted against the live Typer tree, so adding a verb without a sequence step fails
    the run. Provider posture is tiered + skip-loud: structural legs (ingest --no-embed,
    pages/graph/lint/delete/tasks) run with no keys; real legs
    (check/embed/synth/vector-retrieve/eval) run when the keys named by the
    active profile's provider.{llm,embedding}_api_key_env are present (from .env)
    and SKIP loudly otherwise. Both modes use a free host port (never a fixed 8765) so
    concurrent runs don't collide; docker teardown is guaranteed (down -v --rmi local
    removes containers, volumes and the built image; --prune sweeps crashed-run leftovers
    by label/name). --observe wires the
    docs/observability OTel stack and surfaces a Jaeger trace link on failure. Registered
    as a cli/server/client leg in the dikw-core-verify skill; wrapped by
    tests/test_e2e_verify_{local,docker}.py (-m slow). Default provider profile is the
    committed MiniMax + Qwen3-Embedding-0.6B template; swap vendor/model via
    --provider-profile <dikw.yml>.
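
One common way to obtain a free host port, as the harness is described as doing, is to bind to port 0 and let the OS choose. The sketch below shows that general technique; it is an assumption about the approach, not a copy of tools/e2e_verify.py, and `find_free_port` is a hypothetical name.

```python
import socket


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a currently unused TCP port on `host`."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Port 0 tells the kernel to pick any free ephemeral port.
        sock.bind((host, 0))
        return sock.getsockname()[1]


port = find_free_port()
```

The socket is closed before the port is reused, so another process could in principle claim it in between; for short-lived test servers that window is usually acceptable.
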
  • dangling_provenance drift lint kind — flag a K/W page citing a deleted source
    (read-only). A new deterministic lint kind that flags a knowledge/ (K) or
    wisdom/ (W) page whose sources: provenance edge points at a source file that
    no longer exists on disk. It is read-only — surfaced, never auto-repaired: there
    is no fixer (like duplicate_title, lint propose reports it for human triage and
    lands every issue in skipped), because the sources: frontmatter is the user's to
    edit (ADR-0001's non-cascade design — delete never rewrites another page's content).
    Disk is the source of truth (ADR-0005), so detection stats the file, not the
    documents projection: a source present on disk but not yet ingest-ed (no active D
    row) is not dangling — there the fix is ingest, not editing frontmatter. A
    provenance path that escapes the base is dangling and its external target is never
    stat-ed. Runs in the default lint scan, sharing the per-page provenance read with
    missing_provenance (zero extra storage round-trips); suppressible per page via
    lint: {skip: [dangling_provenance]}. Final slice of ADR-0005
    (filesystem-as-source-of-truth) — the arc (the delete verb + missing_file /
    untracked_file / stale_index / dangling_provenance drift kinds) is now complete,
    and docs/design.md gains a "Disk is the source of truth" invariant section.
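
The detection rule above (check the file on disk, and treat a path that escapes the base as dangling without touching its target) might look roughly like this. It is a sketch under assumptions: `is_dangling` is a hypothetical helper, and the containment check is purely lexical, so it does not account for symlinks.

```python
import os
import tempfile


def is_dangling(base: str, source_rel: str) -> bool:
    """Return True when a provenance path should be reported as dangling."""
    base_abs = os.path.abspath(base)
    # Lexical normalisation only; the target is not touched on disk here.
    target = os.path.normpath(os.path.join(base_abs, source_rel))
    try:
        inside = os.path.commonpath([base_abs, target]) == base_abs
    except ValueError:  # e.g. paths on different drives on Windows
        inside = False
    if not inside:
        return True  # escapes the base: dangling, and never stat-ed
    return not os.path.isfile(target)  # the only filesystem check


with tempfile.TemporaryDirectory() as base:
    os.makedirs(os.path.join(base, "sources"))
    with open(os.path.join(base, "sources", "kept.md"), "w") as fh:
        fh.write("# kept\n")
    results = {
        "kept": is_dangling(base, "sources/kept.md"),
        "deleted": is_dangling(base, "sources/deleted.md"),
        "escaping": is_dangling(base, "../outside.md"),
    }
```

Because the check looks only at the disk, a source that exists but has not been ingested yet comes back as not dangling, matching the behaviour described above.
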
  • stale_index + untracked_file drift lint kinds — re-project hand-edited /
    hand-written K/W pages (and unlock hand-authored knowledge pages as first-class).
    Two new deterministic lint kinds, both fixed by one ReindexPageFixer:
    stale_index flags an active knowledge/ (K) or wisdom/ (W) row whose on-disk
    body hash no longer matches the indexed hash (a hand-edit outside dikw);
    untracked_file flags a .md / .markdown file under knowledge/ or wisdom/
    with no active row (hand-written, or restored outside dikw). Both propose a single
    reindex_page op that re-projects the current on-disk bytes through
    persist_knowledge / persist_wisdom — re-chunk, re-link, re-provenance,
    inline-or-deferred re-embed — without rewriting the file (disk is the source of
    truth, ADR-0005) and without re-running synth (so a hand-edit is preserved, not
    regenerated from the D-source). Run in the default lint scan; fix with
    dikw client lint propose --rule stale_index (or untracked_file) →
    dikw client lint apply <task_id>. untracked_file closes the "hand-write a K page,
    the engine never indexes it" gap and makes hand-authored pages first-class;
    stale_index closes the "edit a K/W file on disk, the storage projection silently
    drifts" gap. Detection is near-free: stale_index reuses the per-page read the
    other lexical checks already do (no separate mtime-prefiltered hashing pass), and
    untracked_file is a cheap disk walk (stat + membership, no read) rooted at
    knowledge/ + wisdom/ so the sibling trash/ / .dikw/ / assets/ trees are
    naturally excluded and .gitkeep / non-markdown files never trip. Both are K/W-only
    (D-layer adds/edits stay ingest's job); a page failing its re-projection is
    deactivated and surfaced via ApplyReport.persist_errors, successes under
    ApplyReport.reindexed_documents. Third slice of ADR-0005 (dangling_provenance
    is the fourth, above). This supersedes the never-built dikw client reindex <path> — the
    reindex story is now dikw client lint propose --rule stale_index (or
    --rule untracked_file) followed by dikw client lint apply <task_id>.
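
The two checks can be illustrated together in a simplified form. This sketch makes several assumptions that are not in the notes: it hashes the whole file with SHA-256 (the engine compares a body hash whose algorithm is not stated), it folds both checks into one walk for brevity, and `scan_drift` plus the `indexed` mapping are hypothetical stand-ins for the storage projection.

```python
import hashlib
import tempfile
from pathlib import Path

MARKDOWN_SUFFIXES = {".md", ".markdown"}
KW_ROOTS = ("knowledge", "wisdom")  # trash/, .dikw/, assets/ are never walked


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def scan_drift(base: Path, indexed: dict[str, str]) -> dict[str, list[str]]:
    """`indexed` maps base-relative POSIX paths of active rows to stored hashes."""
    untracked: list[str] = []
    stale: list[str] = []
    for root in KW_ROOTS:
        root_dir = base / root
        if not root_dir.is_dir():
            continue
        for path in sorted(root_dir.rglob("*")):
            if not path.is_file() or path.suffix not in MARKDOWN_SUFFIXES:
                continue  # .gitkeep and non-markdown files never trip
            rel = path.relative_to(base).as_posix()
            if rel not in indexed:
                untracked.append(rel)  # membership only; the file is not read
            elif digest(path.read_bytes()) != indexed[rel]:
                stale.append(rel)  # on-disk bytes differ from the indexed hash
    return {"untracked_file": untracked, "stale_index": stale}


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "knowledge").mkdir()
    (base / "wisdom").mkdir()
    (base / "knowledge" / "edited.md").write_bytes(b"edited by hand\n")
    (base / "knowledge" / "fresh.md").write_bytes(b"hand-written\n")
    (base / "knowledge" / ".gitkeep").write_bytes(b"")
    (base / "wisdom" / "same.md").write_bytes(b"unchanged\n")
    indexed = {
        "knowledge/edited.md": digest(b"original\n"),
        "wisdom/same.md": digest(b"unchanged\n"),
    }
    drift = scan_drift(base, indexed)
```

In this toy run the hand-edited page is reported as stale_index and the unindexed page as untracked_file; neither file is modified by the scan.
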
  • missing_file drift lint kind — purge orphaned document rows (D/K/W). A new
    deterministic lint kind (with MissingFileFixer) that detects an active
    documents row whose backing file is gone from disk — a sources/ (D),
    knowledge/ (K), or wisdom/ (W) file deleted outside dikw — and proposes a
    single purge_document op that drops the orphaned row + its outgoing edges via
    Storage.delete_document. Runs in the default lint scan; fix it with
    dikw client lint propose --rule missing_file → dikw client lint apply <task_id>.
    Closes the original gap where deleting a source file left its row stuck at
    active=True forever (run_lint never scanned D rows). Inbound [[wikilink]]s
    from live pages are left to surface as broken_wikilink (delete_document clears
    only outgoing edges; the kind never rewrites a user's page); a truly dangling edge
    (both ends purged) clears itself. The op carries the resolved layer, re-checks
    at apply time that the file is still absent and the row still exists (propose→apply
    race / restored-file safety), and reports purged paths under
    ApplyReport.purged_documents. Second slice of ADR-0005
    (filesystem-as-source-of-truth); untracked_file / stale_index /
    dangling_provenance land in follow-ups.
  • dikw client delete <path> — first-class document deletion (D/K/W). A new
    immediate verb (api.delete_page / POST /v1/base/delete) that deletes any
    registered document — a sources/ file, a knowledge/ page, or a wisdom/
    page — by path: it purges the storage row + its outgoing links/provenance
    (Storage.delete_document) and soft-deletes the on-disk file to
    <base>/trash/<layer>/<rel> with an audit trashed: block (recover with a plain
    mv back into place). It is symmetric with wisdom write: explicitly-targeted,
    immediate (no propose/apply — trash/ is the safety net), --wait by default,
    --reason for an audit note. Closes the gap where deletion existed only as a side
    effect of the lint orphan_page/non_atomic_page fixers (K-layer stubs only) —
    arbitrary K pages and all D/W documents were previously undeletable.
    Inbound [[wikilink]]s from live pages are left dangling and surface as
    broken_wikilink on the next dikw client lint — delete never rewrites another
    page. First slice of ADR-0005 (filesystem-as-source-of-truth); the drift lint
    kinds (missing_file / untracked_file / stale_index / dangling_provenance)
    land in follow-ups. Internally, the soft-delete primitive move_to_trash was
    promoted out of domains/knowledge/lint_fix.py into the shared, layer-agnostic
    domains/trash.py so D/W deletes reuse it.

Fixed

  • OTel validation stack now runs on arm64 (Apple Silicon). The
    docs/observability/docker-compose.yml collector was pinned to
    otel/opentelemetry-collector-contrib:0.116.0, whose arm64 binary is
    dynamically linked (interpreter /lib/ld-linux-aarch64.so.1) while the image
    is FROM scratch — so on Apple Silicon the container exited immediately with
    exec /otelcol-contrib: no such file or directory and the stack came up with
    jaeger/prometheus/grafana healthy but zero traces. Bumped to 0.117.0,
    the nearest release that restored the static arm64 build (verified: boots
    clean against the existing otel-collector-config.yaml); amd64 was
    unaffected. This also fixes tools/e2e_verify.py --observe on arm64, which
    drives this same compose file.

  • Synth front-matter is whitelisted to tags; write_page guards reserved keys.
    This enforces in code the forbidden-key policy 0.5.3 added to the synth prompt
    (the "Synth forbids sources/lint in emitted front-matter" entry below):
    that change only reworded the prompt — the parser still routed every non-tags
    key into extras and write_page merged it over the engine's authoritative
    fields, so a disobedient LLM (or a hand-edited file flowing through lint-apply's
    update_page) could still override sources/category/id, inject a lint:
    block that suppressed lint on a fresh page, or — via a handler/content key
    colliding with frontmatter.Post(**meta) — silently collapse the whole file to a
    literal string. Now: the synth parser (_parse_one_page_block) drops every
    non-tags front-matter key the LLM emits (title comes from the body # H1,
    category/slug from the <page> attributes, the rest engine-managed), covering
    every LLM-sourced page (synth fan-out + the lint grounded/split/merge fixers that
    share the parser) at one point; and the shared write_page sink filters caller
    extras against _RESERVED_FRONTMATTER_KEYS and assigns metadata via
    post.metadata.update, mirroring the W-layer write_wisdom_file guard. User
    extras (e.g. an Obsidian aliases: list) still pass through, and the lint:
    block written by orphan_page.mark_as_leaf is deliberately not reserved.
    Behaviour-preserving for conformant synth output (which emits only tags).
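
The two guards can be pictured with plain dictionaries. This is a sketch only: `RESERVED_KEYS` is an illustrative subset (the real contents of _RESERVED_FRONTMATTER_KEYS are not listed in these notes), and `keep_only_tags` / `merge_metadata` are hypothetical names, not the functions in the codebase.

```python
from typing import Any, Mapping

# Illustrative subset; the actual reserved set is an assumption here.
RESERVED_KEYS = frozenset({"sources", "category", "id"})


def keep_only_tags(llm_meta: Mapping[str, Any]) -> dict[str, Any]:
    """Parser-side guard: drop every LLM-emitted key except `tags`."""
    return {"tags": llm_meta["tags"]} if "tags" in llm_meta else {}


def merge_metadata(
    engine_fields: Mapping[str, Any], extras: Mapping[str, Any]
) -> dict[str, Any]:
    """Sink-side guard: filter reserved keys out of caller extras."""
    merged = {k: v for k, v in extras.items() if k not in RESERVED_KEYS}
    # Applied last, so engine-managed fields cannot be overridden by extras.
    merged.update(engine_fields)
    return merged


# A disobedient LLM response: only `tags` should survive the parser.
llm_meta = {
    "tags": ["rag"],
    "sources": ["sources/fake.md"],
    "lint": {"skip": ["orphan_page"]},
    "content": "boom",
}
synth_meta = keep_only_tags(llm_meta)

# User extras such as an Obsidian `aliases:` list still pass through the sink.
page_meta = merge_metadata(
    {"id": "K-1", "category": "concept", "sources": ["sources/real.md"]},
    {"aliases": ["RAG"], "sources": ["sources/fake.md"], **synth_meta},
)
```

Building the metadata as a dictionary and updating it, rather than splatting it into a constructor as keyword arguments, is what keeps a key such as `content` or `handler` from colliding with a parameter name.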

Security

  • Raise the python-multipart floor to >=0.0.31 (security floor) — clears the
    open Dependabot form-parsing advisories. The declared floor was >=0.0.26, which
    let the published wheel resolve a python-multipart vulnerable to the
    multipart/form-data resource-exhaustion / DoS chain (GHSA-5rvq-cxj2-64vf and the
    <0.0.31 follow-ups GHSA-v9pg-7xvm-68hf / GHSA-6jv3-5f52-599m / GHSA-vffw-93wf-4j4q).
    The lock was already bumped to 0.0.31 by Dependabot (#209), but the manifest floor
    still permitted a downstream install below the fix; raising it hardens the
    published-wheel contract and, by re-touching uv.lock, lets GitHub's dependency
    graph re-ingest the already-patched resolution (python-multipart 0.0.31,
    starlette 1.3.1) so the eight stale alerts auto-resolve. Starlette's matching
    request.form() limit-bypass / DoS fixes (≥1.3.1, GHSA-82w8-qh3p-5jfq and the
    <1.1.0 advisories) already ship transitively via fastapi (locked) — it is not a
    direct dependency, so no direct pin is added. Metadata-only: no resolved-version or
    code change (uv.lock diff is the recorded root specifier alone).