Release dikw-core v0.6.2 · OpenDIKW/dikw-core

0.6.2 — optional cross-encoder reranking; atomic writes + concurrency hardening; repositioned as a self-managed knowledge engine

Changed

Docs: repositioned as a self-managed knowledge base engine. README and the
user-facing docs now frame dikw-core as a client/server engine whose server
authoritatively manages the base's directory tree, index, and database — distinct
from Obsidian — with knowledge persisted as an open, portable Markdown format
(same lineage as Karpathy's LLM-Wiki pattern and Google's Open Knowledge Format).
docs/** no longer frames the base as "an Obsidian vault"; the on-disk format is
described as open Markdown openable in any editor. No engine behavior, on-disk
layout, or invariant changed.
README professionalized. Restructured with a documentation index, a
client/server + DIKW mermaid diagram, container/Ruff/status badges, and the
missing dikw client delete / dikw client wisdom write verbs; the maintainer
release mechanics moved out of the README into docs/releasing.md.
The header is now centered with the OpenDIKW logo
(.github/assets/opendikw-avatar.png).

Added

Optional cross-encoder reranking in the retrieval pipeline. A new
RerankProvider seam (providers/base.py) + build_reranker factory +
OpenAICompatReranker (providers/rerank.py, the Jina/Cohere-compatible
/rerank wire shape that Gitee AI / SiliconFlow / Jina / Cohere share) add a
rerank stage to HybridSearcher.search, between RRF fusion and the top-K
truncation: the top retrieval.rerank_candidate_k (default 40) fused
candidates are scored by (query, chunk) relevance, re-ordered, then cut to
the query limit. It recovers precision@k from the recall pool without
changing recall, and does not touch any storage adapter or retrieve's
response shape. Configure via provider.rerank / rerank_model /
rerank_base_url / rerank_api_key_env (+ rerank_timeout_seconds /
rerank_batch_size) and retrieval.rerank_enabled / rerank_candidate_k;
the candidate window is split into batches of rerank_batch_size documents, one
per /rerank call, so each call respects per-vendor document caps (Gitee: ≤25).
Reranking is on once a reranker is configured (rerank_enabled defaults to true; a
base that configures no reranker runs no rerank leg). A reranker is a
deterministic scoring model, in the same category as the embedding model and part
of scoping rather than reasoning, so it is consistent with the "LLMs only enter at
synth" invariant; an LLM-as-reranker is deliberately excluded. On the read path a
transient rerank failure degrades to the fused order, while a permanent one fails
loudly. See
docs/adr/0006-reranker-deterministic-scoping.md,
the docs/providers.md reranking cookbook, and the SciFact ablation in
evals/BASELINES.md.
Community-health files: CONTRIBUTING.md, SECURITY.md, CODE_OF_CONDUCT.md,
and GitHub issue-form templates under .github/ISSUE_TEMPLATE/.
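
The rerank stage described in the first entry above can be pictured with a minimal sketch. This is not the code in providers/rerank.py or HybridSearcher.search: the function name rerank_fused, the Scorer callable, and the TransientRerankError class are hypothetical, and appending the unscored tail after the reranked window is an assumption the notes do not confirm. Only the defaults (candidate window of 40, batches capped at 25) and the degrade-on-transient-failure behaviour come from the notes.

```python
from typing import Callable, Sequence

# Hypothetical scorer: takes a query and a batch of chunk texts and returns
# one relevance score per chunk (higher means more relevant).
Scorer = Callable[[str, Sequence[str]], list[float]]


class TransientRerankError(Exception):
    """Hypothetical marker for a retryable failure such as a timeout."""


def rerank_fused(
    query: str,
    fused: list[str],
    score_batch: Scorer,
    limit: int,
    candidate_k: int = 40,
    batch_size: int = 25,
) -> list[str]:
    """Rerank the top candidate_k fused candidates, then cut to limit."""
    window = fused[:candidate_k]
    tail = fused[candidate_k:]  # assumption: unscored tail keeps fused order
    scores: list[float] = []
    try:
        # One scorer call per batch so no call exceeds the vendor document cap.
        for start in range(0, len(window), batch_size):
            batch = window[start:start + batch_size]
            scores.extend(score_batch(query, batch))
    except TransientRerankError:
        # Transient failure: fall back to the fused (RRF) order.
        return fused[:limit]
    # Stable sort by descending score; ties keep their fused order.
    order = sorted(range(len(window)), key=lambda i: -scores[i])
    reranked = [window[i] for i in order]
    return (reranked + tail)[:limit]
```

Any other exception raised by the scorer propagates to the caller, which mirrors the "permanent failure fails loudly" rule in spirit only.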

Fixed

Atomic on-disk page writes. Knowledge and wisdom page writes now go through a
shared temp-file-then-os.replace helper (domains/_atomic.py), so a crash or a
full disk mid-write can no longer leave a half-written page at the visible path —
a reader sees the old bytes or the new bytes, never a truncated file. The
trash/ collision loop claims its destination name via O_CREAT | O_EXCL instead
of an exists() probe, closing the window where two concurrent trashings of the
same path could clobber each other.
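
As an illustration of the two techniques named in this entry, here is a minimal sketch using only the Python standard library. It is not the contents of domains/_atomic.py: the function names atomic_write_bytes and claim_unique_name, the numeric-suffix naming scheme, and the fsync call are assumptions made for the example.

```python
import os
import tempfile


def atomic_write_bytes(path: str, data: bytes) -> None:
    """Write data so a reader sees the old file or the new one, never a partial."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so os.replace never has to
    # cross a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # assumption: flush to disk before the swap
        os.replace(tmp_path, path)  # atomic rename over the visible path
    except BaseException:
        # On any failure the visible path is untouched; drop the temp file.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise


def claim_unique_name(directory: str, name: str) -> str:
    """Reserve a free name by creating it exclusively; return the claimed path."""
    suffix = 0
    while True:
        candidate = name if suffix == 0 else f"{name}.{suffix}"
        path = os.path.join(directory, candidate)
        try:
            # O_EXCL makes create-if-absent a single atomic step, so two
            # concurrent callers can never both claim the same name.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            suffix += 1
            continue
        os.close(fd)
        return path
```

The exclusive create replaces a check-then-act sequence (an exists() probe followed by a move), which is where the race window came from.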
Task status is terminal-immutable. update_status (both the SQLite and
Postgres task stores) now refuses to overwrite a row that is already
succeeded / failed / cancelled: a late write is a silent no-op, so a cancel that
lands first wins over a runner's trailing failure. An unknown task_id still
raises TaskNotFound.
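
One way to get this behaviour is a conditional UPDATE that only matches non-terminal rows. The sketch below uses the standard sqlite3 module and is an assumption-laden illustration, not the project's task store: the tasks table with id and status columns, the function signature, and the boolean return value are all hypothetical. Only the three terminal states and the TaskNotFound name come from the notes.

```python
import sqlite3

TERMINAL = ("succeeded", "failed", "cancelled")


class TaskNotFound(Exception):
    """Raised when no row exists for the given task id."""


def update_status(conn: sqlite3.Connection, task_id: str, status: str) -> bool:
    """Set status unless the row is already terminal; return True if written."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE tasks SET status = ? "
            "WHERE id = ? AND status NOT IN (?, ?, ?)",
            (status, task_id, *TERMINAL),
        )
    if cur.rowcount == 1:
        return True
    # Zero rows changed: the task is either terminal (silent no-op) or unknown.
    row = conn.execute("SELECT 1 FROM tasks WHERE id = ?", (task_id,)).fetchone()
    if row is None:
        raise TaskNotFound(task_id)
    return False
```

Because the terminal check lives in the WHERE clause, the check and the write happen in one statement, so a cancel that lands first cannot be overwritten by a runner's later failure report.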
Atomic base-instance-id creation. <base>/.dikw/base_id is now created with an
exclusive open(..., "x"), so two server processes cold-starting the same base
converge on one id instead of each minting a different one (which silently split
the shared-Postgres task-store scope). DIKW_BASE_INSTANCE_ID still overrides.
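
The exclusive-create pattern can be sketched as follows. The function name load_or_create_base_id, the UUID hex id format, the short retry loop for a loser that reads before the winner has finished writing, and the choice to skip the file entirely when the environment variable is set are all assumptions for illustration; the notes only state that the file is opened with mode "x" and that DIKW_BASE_INSTANCE_ID overrides.

```python
import os
import time
import uuid


def load_or_create_base_id(path: str, retries: int = 50) -> str:
    """Return the base id, minting it at most once across racing processes."""
    override = os.environ.get("DIKW_BASE_INSTANCE_ID")
    if override:
        return override
    new_id = uuid.uuid4().hex  # assumption: the real id format is not stated
    try:
        # Mode "x" fails with FileExistsError if another process got there first.
        with open(path, "x", encoding="utf-8") as f:
            f.write(new_id)
        return new_id
    except FileExistsError:
        pass
    # Lost the race: adopt the winner's id. The file may be momentarily empty
    # if the winner has created it but not yet written, so retry briefly.
    for _ in range(retries):
        with open(path, encoding="utf-8") as f:
            existing = f.read().strip()
        if existing:
            return existing
        time.sleep(0.01)
    raise RuntimeError(f"base id file {path!r} exists but is empty")
```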
Orphan import-staging cleanup is gated like task reaping. Startup no longer
wipes <base>/.dikw/staging/ unconditionally; it does so only when this process
owns the task store exclusively (per-base SQLite, or DIKW_TASK_REAP_ON_START=1),
so a replica sharing a Postgres task store can no longer delete a live peer's
in-flight import staging.
