feat(importer): manifest library + dkg-importer SKILL.md by branarakic · Pull Request #642 · OriginTrail/dkg

branarakic · 2026-05-25T11:51:55Z

Summary

Implements the resumable-import pattern from ADR 0002 (#641) and gives every agent / human importer a single canonical reference for chunked bulk writes against a DKG node.

`scripts/lib/manifest.mjs` — Resumable-import manifest helpers

`createImportManifest` writes the import as an RDF assertion in the project's `meta` sub-graph.
`markPartitionStatus` appends status events (append-only — latest event wins on read).
`loadImportManifest` + `pendingPartitions` resolve current per-partition status in one SPARQL round-trip, so an interrupted import resumes from the next pending partition.
Itself respects the ADR 0002 chunking contract by reusing `DkgClient.writeAssertion`.

`scripts/lib/tests/manifest.test.mjs` — Six unit tests for pure helpers (URI encoding, ontology constants, triple building, status filtering). Daemon-roundtrip coverage stays in the repro suite.

`packages/cli/skills/dkg-importer/SKILL.md` (NEW) — Agent-readable manual codifying:

Chunking contract from ADR 0002 (CHUNK=5000, ROOT_CHUNK=1000)
Worked examples in TypeScript + Python
Manifest pattern walk-through
Canonical URIs from ADR 0003
Anti-patterns + HTTP error handling
One-page cheat sheet

`packages/cli/skills/dkg-node/SKILL.md` — One-line cross-reference under the assertion-write tool table, pointing agents to `dkg-importer/SKILL.md` when pushing >5k quads in one go.

Test plan

`node --test scripts/lib/tests/manifest.test.mjs` → 6/6 pass
No new linter errors
Reviewer: confirm `DkgClient.writeAssertion`'s 500-quad default `batchSize` is the right conservative default for the docs to advertise
Reviewer: confirm the manifest's append-only status pattern is acceptable (SPARQL DELETE/INSERT would be tidier but the daemon doesn't expose it on assertions)

Stack

This PR is stacked on #641 (ADRs). Once #641 lands, this rebases against main.

Related: #596, #636, #640.

Made with Cursor

Update — round 3 (2026-05-25): known caps + manifest hardening

`dkg-importer/SKILL.md` — Known caps with exact error strings (§1.1)

Adds a new §1.1 reference table calling out the three independent daemon caps an importer must handle, with the verbatim daemon error text for each:

Endpoint	Cap	Constant	Error
`POST /api/assertion/<name>/write`	10 MB request body	`MAX_BODY_BYTES`	`HTTP 413` "Request body too large (>10485760 bytes)"
`POST /api/assertion/<name>/promote`	256 KB request body	`SMALL_BODY_BYTES`	`HTTP 413` "Request body too large (>262144 bytes)"
`POST /api/assertion/<name>/promote`	10 MB gossip message	hard-coded in gossipsub publish	`HTTP 500` "Promoted assertion too large for gossip (XXXX KB, limit 10 MB)"

The two /promote caps are independent — same status code (413) for the body cap, but a different limit from /write's — and the 10 MB gossip cap can trip even with a single root URI if that root's transitive triple set is large enough. §5 (Error handling) is rewritten so each failure mode has its own recipe instead of one generic "halve on 413" block.

Numbers and error strings validated empirically during the rc.10 Graphify import. Quoting them inline turns "discover via 413" into "look up before you start", which is the difference between a 5-minute import and a 20-minute import for non-trivial graphs.

`scripts/lib/manifest.mjs` — round-3 hardening

Three round-3 codex line comments fixed (each with a new unit test, 13/13 pass):

L222 — createImportManifest not idempotent on already exists: now catches the daemon's "already exists" error from /api/assertion/create and treats it as successful reuse, so a retry / second-process resume of the same import doesn't crash before reading progress.
L369 — same-millisecond status events: the SPARQL "latest event per partition" filter now includes a deterministic IRI tie-breaker (?ts2 = ?recordedAt && STR(?ev2) > STR(?ev)) so two writes within one ms produce a deterministic ordering instead of nondeterministic status selection.
L375 — empty bindings ≠ complete: loadImportManifest now throws loudly when the SPARQL returns zero bindings instead of silently returning {partitions: []} (which previously made resume callers think the import was complete when the manifest was actually missing / not yet visible in SWM).

Update — round 4 (2026-05-25 22:50): post-empirical-motivation Codex review

Round-4 review against the round-3 push surfaced five issues — three real bugs that would have shipped, one CodeQL finding, and one runnable-example fix. All addressed inline with new unit-test coverage (18/18 pass).

CRITICAL — `loadImportManifest` was reading the bare data graph, not SWM (L391)

/api/query with subGraphName alone routes to the bare data graph did:dkg:context-graph:<cg>/<sub>, NOT the SWM tier …/_shared_memory. Since the manifest is created via the assertion API (create → write → promote), it ONLY exists in SWM. So a peer/restart resume calling loadImportManifest would silently see zero bindings even on a healthy import — and the §L375 "loud failure" fix would then throw "manifest is missing" for a manifest that actually exists.

Fix: plumb graphSuffix (and view for completeness) through DkgClient.query. loadImportManifest now passes graphSuffix: '_shared_memory' so the query hits the right graph. The mock client in tests now also routes by graphSuffix, so a regression where the SWM hint is dropped fails the test suite immediately. A new dedicated test pins the call shape.

CRITICAL — `statusEventUri` same-millisecond ordering was non-deterministic (L84)

The previous shape <ts>-<6-char random> meant two events in the same millisecond would tie on recordedAt AND tie on a random token, leaving SPARQL's lexicographic tie-breaker non-deterministic.

Fix: new shape <ts>-<12-digit monotonic counter>-<6-char random>. The counter is process-local and zero-padded so lexicographic ordering matches call order for same-process same-ms events. Cross-process same-ms collisions fall back to the random component — no worse than before. New regression test asserts strictly-increasing IRIs across 50 back-to-back calls.

CodeQL #90 — double-unescape risk in `unquote` (L138)

The previous chain of four .replace() passes had a textbook double-unescape pitfall: \\\\n (escaped backslash + literal n) could collapse to a newline because the second pass would re-interpret the \ left behind by the first pass.

Fix: single-pass regex .replace(/\\(["\\nr])/g, replacer) so each escape sequence matches exactly once. New regression test round-trips literal \n, \\, \", mixed strings through lit() / unquote() byte-for-byte.

`importId` sanitization for default assertion name (L222)

importUri() / partitionUri() percent-encode any importId, but the daemon's validateAssertionName rejects /, whitespace, and other IRI-unsafe characters. A caller passing importId="corpus/2026-q1" would get valid URIs back but /api/assertion/create would 400.

Fix: new exported defaultManifestAssertionName() that maps unsafe character runs to -, length-clamps to the 256-char daemon limit, and throws a descriptive error if nothing valid remains (instead of crashing later with a cryptic 400). Two regression tests: parameterised sanitization assertions + an end-to-end "create/load with an unsafe importId works" check.

Python sample fix (L127)

The Python snippet in dkg-importer/SKILL.md was missing import os and used an undefined port. Fixed: added the import, defaulted port to int(os.environ.get('DKG_PORT', '9200')), switched to a with block for the token file.

Test coverage

5 new tests added (18/18 total pass) — one per fix, all targeted at the exact failure mode codex flagged, so a future regression on any of these would fail CI immediately.

github-actions · 2026-05-25T11:54:17Z

+    contextGraphId: cgId,
+    assertionName: assertion,
+    subGraphName,
+    entities: [importUri(importId)],


🔴 Bug: only the import root is promoted here. The partition subjects (urn:dkg:import:...#part:...) stay in WM, so after the promote loadImportManifest() cannot reliably read imp:key / imp:initialStatus from SWM and another node cannot resume the import. Promote the partition URIs alongside the import URI (chunked if needed).

Addressed on this branch: createImportManifest now promotes the import root and every partition URI, chunked by ROOT_CHUNK.

github-actions · 2026-05-25T11:54:17Z

+    { subject: evIri, predicate: IMPORT_P.status, object: lit(status) },
+    { subject: evIri, predicate: IMPORT_P.recordedAt, object: dt(nowIso) },
+  ];
+  await client.writeAssertion({


🔴 Bug: this appends the status event to the manifest assertion but never promotes it. That leaves progress updates in WM only, so peers will still see the old status and a resume from another node can restart already-finished partitions. Promote the partition/event roots after each status write, or use the direct SWM path if that's the intended visibility.

Addressed on this branch: markPartitionStatus writes the status event and promotes it so progress becomes visible from SWM.

github-actions · 2026-05-25T11:54:17Z

+            imp:initialStatus ?initial .
+      OPTIONAL {
+        {
+          SELECT ?part (SAMPLE(?status) AS ?latestStatus) (MAX(?recordedAt) AS ?latestRecordedAt) WHERE {


🔴 Bug: SAMPLE(?status) is not correlated with MAX(?recordedAt). Once a partition has multiple events, this subquery can return the newest timestamp with an older status, so loadImportManifest() may report pending/failed for a partition that is actually done. Select the row with the max timestamp (for example via ORDER BY DESC(?recordedAt) LIMIT 1 per partition) instead of mixing independent aggregates.

Addressed on this branch: loadImportManifest uses a latest-row selection pattern instead of SAMPLE plus MAX, so status and timestamp stay correlated.

github-actions · 2026-05-25T11:54:17Z

+  const bindings = res?.result?.bindings ?? res?.bindings ?? [];
+
+  const partitions = bindings.map((row) => {
+    const key = unquote(row.key);


🔴 Bug: /api/query returns SPARQL-JSON binding cells ({ value, type, ... }) on the normal path, but this code treats row.key, row.part, row.latestStatus, etc. as raw strings. The returned manifest state will contain objects instead of strings and the sort/comparisons will misbehave. Normalize bindings through .value/.id first, the same way the other daemon clients do.

Addressed on this branch: query bindings are normalized from SPARQL JSON cells through .value before sorting/comparing.

github-actions · 2026-05-25T12:15:53Z

+    contextGraphId: cgId,
+    assertionName: assertion,
+    subGraphName,
+    entities: [evIri],


🔴 Bug: promoting only evIri does not publish the partIri -> imp:statusEvent -> evIri triple you just wrote on line 266. assertion.promote({ entities: [...] }) only moves quads whose subject is one of those roots, so after a restart or a peer-side resume loadImportManifest() still sees no status events and treats every partition as pending. Promote both partIri and evIri here, or flip the edge so the event points at the partition.

Addressed on this branch: markPartitionStatus promotes both partIri and evIri so the partition-to-event edge is visible to loadImportManifest.

github-actions · 2026-05-25T12:15:53Z

+/**
+ * Unit tests for the pure helpers in `scripts/lib/manifest.mjs`.
+ *
+ * Daemon-roundtrip behaviour (createImportManifest / markPartitionStatus /


🟡 Issue: this test file only exercises pure helpers, but the risky behavior in this PR is the WM→SWM round-trip through markPartitionStatus()/loadImportManifest(). A regression in promote root selection will pass CI unnoticed. Please add an integration-style test that creates a manifest, marks a partition done, promotes it, and then verifies loadImportManifest() reads done back from shared memory.

Addressed on this branch: added mock WM/SWM round-trip tests covering create, mark done, promote, and reload from shared memory.

…ration tests Addresses Codex round-2 review on PR #642 (12:15Z): 1. markPartitionStatus now promotes BOTH `partIri` and `evIri`. Codex correctly flagged that the `partIri imp:statusEvent evIri` triple has subject `partIri` — `assertion.promote({entities})` only moves quads whose subject is in `entities`, so promoting only `evIri` left the partition→event edge in WM and a peer-side `loadImportManifest` couldn't see the new status. Re-promoting `partIri` is idempotent for its prior triples and ships the new edge. 2. `unquote()` now collapses SPARQL 1.1 results-JSON cells (`{value, type, datatype?, "xml:lang"?}`) to their `.value` before the literal-unquoting pass. The daemon's `bindingValue` helper already handles both shapes; this library was assuming only the flat-string shape Oxigraph emits. Added `bareUri()` for URI bindings and switched `loadImportManifest` to use it for `?part`. 3. New integration-style tests in scripts/lib/__tests__/manifest.test.mjs cover: - full WM→SWM round-trip with flat bindings (Oxigraph shape) - same round-trip with SPARQL-JSON cell bindings (sparql-http shape) - latest-event-wins across multiple status updates on one partition A mock client tracks the WM/SWM split as two in-memory sets and serves SPARQL queries from SWM, exactly matching the daemon's promote contract. Confirmed all three tests FAIL when markPartitionStatus reverts to promoting only `evIri` — that's the regression coverage Codex asked for. 4. SKILL.md updated to document the dual-promotion behaviour so downstream importers / agents implementing this contract don't re-introduce the bug. Co-authored-by: Cursor <cursoragent@cursor.com>

github-actions · 2026-05-25T13:09:51Z

+  }
+  const cgId = client.cgId;
+  const assertion = assertionName ?? `import-manifest-${importId}`;
+  await client.request('POST', '/api/assertion/create', {


🔴 Bug: this helper is meant for resumable imports, but /api/assertion/create returns already exists on retries. A transient failure after the create succeeds, or a second process resuming the same import, will now fail before it can load progress. Please treat an existing manifest assertion as a successful reuse (or add an explicit ensure flow) so createImportManifest() is idempotent.

Addressed in 3b04c99: createImportManifest now treats an existing manifest assertion as idempotent reuse and still runs the additive write/promote healing path.

github-actions · 2026-05-25T13:09:51Z

+        FILTER NOT EXISTS {
+          ?part imp:statusEvent ?ev2 .
+          ?ev2 imp:recordedAt ?ts2 .
+          FILTER (?ts2 > ?latestRecordedAt)


🔴 Bug: this "latest event" filter only excludes strictly later timestamps. recordedAt is only millisecond precision, so two status writes in the same ms both survive and you can return duplicate rows for one partition with nondeterministic status selection. Add a deterministic tie-breaker for equal timestamps (for example compare ?ev as a secondary key) or use a monotonic sequence instead of raw ISO timestamps alone.

Addressed in 3b04c99: latest-event selection now uses the event IRI as a deterministic tie-breaker for equal millisecond timestamps, with a same-ms regression test.

github-actions · 2026-05-25T13:09:51Z

+    }
+  `;
+  const res = await client.query({ sparql, contextGraphId: cgId, subGraphName });
+  const bindings = res?.result?.bindings ?? res?.bindings ?? [];


🔴 Bug: createImportManifest() rejects empty partition lists, so an empty binding set here means the manifest was missing or not visible in SWM, not that the import is complete. Returning { partitions: [] } will make resume callers skip all remaining work on a typo / wrong sub-graph / failed promote. Please fail loudly when no manifest rows are found, or explicitly verify that the import root exists before treating this as success.

Addressed in 3b04c99: loadImportManifest now throws when no manifest rows are visible instead of returning an empty partition list, with a regression test.

github-actions · 2026-05-25T20:34:04Z

+      }
+    }
+  `;
+  const res = await client.query({ sparql, contextGraphId: cgId, subGraphName });


🔴 Bug: client.query() defaults to working-memory when no view/graph suffix is supplied, so this read will miss the manifest/status triples you just promoted to SWM. On a restart or peer resume, loadImportManifest() will report the manifest as missing even though it exists in shared memory. Plumb view: 'shared-working-memory' (or graphSuffix: '_shared_memory') through DkgClient.query and use it here, then update the mock/test to match the daemon's real default.

github-actions · 2026-05-25T20:34:04Z

+/** Stable-ish URI for one StatusEvent. Exported so callers can promote it. */
+export function statusEventUri(importId, key) {
+  // Random-ish suffix keeps multiple events on the same partition unique.
+  const suffix = `${Date.now()}-${Math.random().toString(36).slice(2, 8)}`;


🔴 Bug: the same-millisecond tie-break in loadImportManifest() orders by event IRI, but this suffix is random, not monotonic. If two status updates for the same partition land in one millisecond, the older one can win nondeterministically and regress the visible status. Use an order-preserving suffix (for example timestamp + per-process counter) or persist an explicit sequence field and sort by that instead of a random token.

github-actions · 2026-05-25T20:34:04Z

+
+```python
+import requests
+token = open(os.path.expanduser('~/.dkg/auth.token')).read().strip()


🟡 Issue: This copy-paste Python example is not runnable as written: it uses os.path.expanduser(...) without importing os, and it relies on port below without defining it. Add the missing setup (import os, a default port = 9200 or a function parameter) so the sample works when readers try it.

…mport Adds a short "Empirical motivation" section between the Consequences and Rejected alternatives lists, with concrete data from importing this repo's Graphify output (26,960 nodes / 63,003 edges) on rc.10: - Hand-rolled importer from dkg-node SKILL.md alone: ~330 LOC, half of which is cap-discovery ceremony. - Same workload using rc.10's existing scripts/lib/dkg-daemon.mjs DkgClient: ~half the LOC, gaps remaining are the ones #642 and #643 close. - Four of 17 partitions exceeded the 10 MB gossip cap; total wall- clock in halve-and-retry: ~15 minutes on top of legitimate work. - Write latency degrades ~200x from baseline once SWM grows past ~100k triples; that cost is what motivates #643 but requires the chunking discipline this ADR formalises. Keeps the headlines so the contract's "why these numbers" question has a concrete answer for reviewers; the full per-partition data lives in the rc.10 test artifacts referenced in the PR description. Co-authored-by: Cursor <cursoragent@cursor.com>

…ration tests Addresses Codex round-2 review on PR #642 (12:15Z): 1. markPartitionStatus now promotes BOTH `partIri` and `evIri`. Codex correctly flagged that the `partIri imp:statusEvent evIri` triple has subject `partIri` — `assertion.promote({entities})` only moves quads whose subject is in `entities`, so promoting only `evIri` left the partition→event edge in WM and a peer-side `loadImportManifest` couldn't see the new status. Re-promoting `partIri` is idempotent for its prior triples and ships the new edge. 2. `unquote()` now collapses SPARQL 1.1 results-JSON cells (`{value, type, datatype?, "xml:lang"?}`) to their `.value` before the literal-unquoting pass. The daemon's `bindingValue` helper already handles both shapes; this library was assuming only the flat-string shape Oxigraph emits. Added `bareUri()` for URI bindings and switched `loadImportManifest` to use it for `?part`. 3. New integration-style tests in scripts/lib/__tests__/manifest.test.mjs cover: - full WM→SWM round-trip with flat bindings (Oxigraph shape) - same round-trip with SPARQL-JSON cell bindings (sparql-http shape) - latest-event-wins across multiple status updates on one partition A mock client tracks the WM/SWM split as two in-memory sets and serves SPARQL queries from SWM, exactly matching the daemon's promote contract. Confirmed all three tests FAIL when markPartitionStatus reverts to promoting only `evIri` — that's the regression coverage Codex asked for. 4. SKILL.md updated to document the dual-promotion behaviour so downstream importers / agents implementing this contract don't re-introduce the bug. Co-authored-by: Cursor <cursoragent@cursor.com>

…ivation Brings in: - #641: round-3 codex (portable scanner path; importer skill/manifest framed as future work) + empirical motivation subsection in ADR 0002 (~330 LOC hand-rolled importer, ~15 min wall-clock in halve-and-retry). - #642: round-3 codex (createImportManifest idempotency on 'already exists'; timestamp tie-breaker on equal-ms; loud failure on empty manifest bindings) + new §1.1 in dkg-importer/SKILL.md documenting the three hard caps and their verbatim error strings (10 MB /write body, 256 KB /promote body, 10 MB gossip). Co-authored-by: Cursor <cursoragent@cursor.com> # Conflicts: # packages/cli/skills/dkg-importer/SKILL.md # scripts/lib/__tests__/manifest.test.mjs # scripts/lib/manifest.mjs

github-actions · 2026-05-25T20:40:18Z

+      }
+    }
+  `;
+  const res = await client.query({ sparql, contextGraphId: cgId, subGraphName });


🔴 Bug: client.query({ sparql, contextGraphId, subGraphName }) uses the legacy data-graph route, which targets did:dkg:context-graph:<cg>/<subGraph> rather than the sub-graph's shared-memory graph. Since this manifest is only written via assertion WM→SWM promotion, a real daemon query here will come back empty even after a successful promote. Route this through the SWM graph explicitly (graphSuffix: '_shared_memory' or equivalent) and add a regression test that matches the real DkgClient.query semantics.

github-actions · 2026-05-25T20:40:18Z

+/** Stable-ish URI for one StatusEvent. Exported so callers can promote it. */
+export function statusEventUri(importId, key) {
+  // Random-ish suffix keeps multiple events on the same partition unique.
+  const suffix = `${Date.now()}-${Math.random().toString(36).slice(2, 8)}`;


🔴 Bug: the same-millisecond tie-break in loadImportManifest() relies on later events sorting lexicographically higher by IRI, but this suffix is random. Two markPartitionStatus() calls in the same millisecond can therefore resolve to either status arbitrarily. Use a monotonic suffix/counter (or persist an explicit sequence field) so equal-timestamp events preserve write order.

github-actions · 2026-05-25T20:40:18Z

+    throw new Error('createImportManifest requires `subGraphName` (typically "meta").');
+  }
+  const cgId = client.cgId;
+  const assertion = assertionName ?? `import-manifest-${importId}`;


🟡 Issue: the default assertion name embeds raw importId, but assertion names reject /, whitespace, and other unsafe characters. importUri()/partitionUri() already accept arbitrary IDs via percent-encoding, so a caller can pass an importId that produces valid manifest URIs but makes /api/assertion/create fail here. Either sanitize the default assertion name or validate importId up front and throw a clear error.

Adds a "What changed in round 3" section to INTEGRATION_NOTES_GRAPHIFY.md summarising the per-PR updates folded into integration after running the stack against a real Graphify codebase import on rc.10: - #641: empirical motivation in ADR 0002 + portable scanner ref + dead-link cleanup - #642: §1.1 caps reference with verbatim error strings + manifest hardening (idempotent create, ms tie-breaker, loud empty bindings) - #643: SMALL_BODY_BYTES correction + full assertion identity for uniqueness + lease-expiry shutdown + §10 empirical appendix Notes that rc.10 import test artifacts are kept local, not committed. Co-authored-by: Cursor <cursoragent@cursor.com>

Implements the resumable-import pattern from ADR 0002 and gives every agent / human importer a single canonical reference for chunked bulk writes against a DKG node. scripts/lib/manifest.mjs: - buildInitialManifestTriples / createImportManifest write the import graph as an RDF assertion in the project's `meta` sub-graph. - markPartitionStatus appends status events (append-only — no DELETE/INSERT needed; latest event wins on read). - loadImportManifest + pendingPartitions resolve the current per- partition status in one SPARQL round-trip, so an interrupted import can resume from the next pending partition without re-doing work. - Itself respects the ADR 0002 chunking contract by reusing DkgClient.writeAssertion. scripts/lib/__tests__/manifest.test.mjs: - Six unit tests covering URI encoding, ontology constants, and the buildInitialManifestTriples / pendingPartitions pure helpers. The daemon-roundtrip behaviour stays covered by the repro suite in scripts/repro/. packages/cli/skills/dkg-importer/SKILL.md (NEW): - The agent-readable manual for bulk imports. Codifies the chunking contract from ADR 0002 (CHUNK=5000, ROOT_CHUNK=1000), gives worked examples in TypeScript + Python, walks through the manifest pattern, and links to ADR 0003 for canonical URIs. - Includes anti-patterns ("don't bump MAX_BODY_BYTES to 1GB", "don't invent a graphify:* URI scheme"), HTTP error handling, and a one-page cheat sheet. packages/cli/skills/dkg-node/SKILL.md: - One-line cross-reference under the assertion-write tool table pointing agents to dkg-importer/SKILL.md when they're about to push >5k quads in one go. Stacked on docs/importer-adrs (#641). Related to #596, #636, #640. Co-authored-by: Cursor <cursoragent@cursor.com>

Three real correctness bugs Codex flagged on the manifest library, plus one documentation note clarifying a binding-format concern that turned out to be a false positive (the daemon's /api/query route returns flat JSON-encoded literal strings, not SPARQL-1.1 results-JSON cells — same shape the existing seed/drain scripts already consume via bareIri()). - createImportManifest now promotes the import root AND every partition URI (chunked at ROOT_CHUNK=1000 per ADR 0002). Promoting only the root left partition subjects in WM, so any peer reading the manifest back from SWM could not see imp:key / imp:initialStatus, breaking resume across nodes. - markPartitionStatus now promotes the new StatusEvent root after writing it. Previously the event sat in WM only, so SWM-visible reads of the manifest still reported the prior status and a resume on another node could restart an already-finished partition. - loadImportManifest's "latest event per partition" sub-query replaced SAMPLE+MAX (decorrelated — the status could come from one row and the timestamp from another) with the standard "max row" idiom (`FILTER NOT EXISTS { ... ?ts2 > ?latestRecordedAt }`). The status now genuinely corresponds to the chosen timestamp. - statusEventUri was renamed from a private eventUri to a public export so the promote step has a stable handle to the just-written event. Added a unit test covering its shape + uniqueness; existing 6 tests still green. - Added a doc note above `unquote()` explaining the daemon's flat binding format, with cross-refs to the other scripts that depend on it. This is to prevent a future reader (or a future Codex pass) from trying to swap in SPARQL-1.1-results-JSON parsing and silently breaking everything. - Updated dkg-importer SKILL.md to match the new promote-on-status behaviour and to call out the SAMPLE/MAX foot-gun for future importers writing similar queries. Co-authored-by: Cursor <cursoragent@cursor.com>

…ration tests Addresses Codex round-2 review on PR #642 (12:15Z): 1. markPartitionStatus now promotes BOTH `partIri` and `evIri`. Codex correctly flagged that the `partIri imp:statusEvent evIri` triple has subject `partIri` — `assertion.promote({entities})` only moves quads whose subject is in `entities`, so promoting only `evIri` left the partition→event edge in WM and a peer-side `loadImportManifest` couldn't see the new status. Re-promoting `partIri` is idempotent for its prior triples and ships the new edge. 2. `unquote()` now collapses SPARQL 1.1 results-JSON cells (`{value, type, datatype?, "xml:lang"?}`) to their `.value` before the literal-unquoting pass. The daemon's `bindingValue` helper already handles both shapes; this library was assuming only the flat-string shape Oxigraph emits. Added `bareUri()` for URI bindings and switched `loadImportManifest` to use it for `?part`. 3. New integration-style tests in scripts/lib/__tests__/manifest.test.mjs cover: - full WM→SWM round-trip with flat bindings (Oxigraph shape) - same round-trip with SPARQL-JSON cell bindings (sparql-http shape) - latest-event-wins across multiple status updates on one partition A mock client tracks the WM/SWM split as two in-memory sets and serves SPARQL queries from SWM, exactly matching the daemon's promote contract. Confirmed all three tests FAIL when markPartitionStatus reverts to promoting only `evIri` — that's the regression coverage Codex asked for. 4. SKILL.md updated to document the dual-promotion behaviour so downstream importers / agents implementing this contract don't re-introduce the bug. Co-authored-by: Cursor <cursoragent@cursor.com>

Adds §1.1 to dkg-importer/SKILL.md with the three hard caps an agent will hit at scale, plus the verbatim daemon error text for each one: - /write: MAX_BODY_BYTES = 10 MB body cap (HTTP 413) - /promote: SMALL_BODY_BYTES = 256 KB body cap (HTTP 413) — same status code as /write but a different limit, easy to confuse - /promote: 10 MB gossip-message cap (HTTP 500 "too large for gossip"), independent of the body cap and triggerable even with a single root URI Expands §5 Error handling so each of the three failure modes has its own recipe instead of one generic "halve and retry on 413" block. Calls out that 413-on-promote and 500-gossip are orthogonal recovery paths, both of which an rc.10 importer needs until #643's async promote queue lands. The cap numbers and error strings were validated empirically during the rc.10 Graphify import. Quoting them inline turns "discover via 413" into "look up before you start", which is the difference between a 5-minute import and a 20-minute import for non-trivial graphs. Co-authored-by: Cursor <cursoragent@cursor.com>

Five fixes from the round-4 review against the round-3 push: 1. SWM query routing (codex #642 L391, CRITICAL): the daemon's /api/query default routing for `subGraphName` alone reads the bare data graph, not the SWM tier. Since the manifest is created via the assertion API (create -> write -> promote), it only lives in SWM. Plumb `graphSuffix` (and `view`) through DkgClient.query, then have loadImportManifest pass `graphSuffix: '_shared_memory'`. Without this, a peer/restart resume would silently see zero manifest rows even on a healthy import. The mock client now routes by graphSuffix too, and a new regression test pins the call shape. 2. Monotonic statusEventUri suffix (codex L84, CRITICAL): `statusEventUri()` was `<ts>-<random>`. Two writes in the same millisecond therefore tied on both `recordedAt` AND a random token, so SPARQL's lexicographic tie-breaker picked a non-deterministic winner. New shape is `<ts>-<12-digit counter>-<random>` so within-process same-ms events sort by call order. Cross-process collisions still fall back to the random component, which is no worse than before. New regression test asserts strictly-increasing IRIs across 50 back-to-back calls. 3. Single-pass unquote() (CodeQL #90, L138): the previous four chained .replace() calls had a textbook double-unescape pitfall (`\\\\n` could collapse to a newline). Replaced with a single regex + replacer that matches each escape exactly once. New regression test round-trips literal `\n`, `\\`, `\"`, etc. through lit() / unquote() byte-for-byte. 4. importId sanitization for default assertion name (codex L222): `importUri()` / `partitionUri()` percent-encode any importId, but the daemon's `validateAssertionName` rejects `/`, whitespace, and other IRI-unsafe chars. The default assertion name now goes through a new exported `defaultManifestAssertionName()` that maps unsafe runs to `-`, length-clamps to the 256-char limit, and throws a descriptive error if nothing valid remains (instead of crashing /api/assertion/create with a cryptic 400). New regression tests cover sanitization, length cap, and the "all-unsafe" error case. 5. Python example fixes (codex L127): the README Python snippet was missing `import os` and referenced an undefined `port`. Added the import, defaulted port to `int(os.environ.get('DKG_PORT', '9200'))`, and used a `with` block for the token file. All 18 unit tests pass; 5 new tests cover each of the five fixes above. Co-authored-by: Cursor <cursoragent@cursor.com>

Round-4 codex review against the round-3 push surfaced five issues, all addressed inline: 1. CRITICAL: loadImportManifest queried the bare data graph instead of SWM, so resume from a peer/restart would silently see zero manifest rows on a healthy import. Plumb graphSuffix='_shared_memory' through DkgClient.query. 2. CRITICAL: same-millisecond statusEventUri suffixes used pure random; replaced with monotonic per-process counter so SPARQL tie-breaker matches call order. 3. CodeQL #90: unquote()'s chained .replace() had a textbook double- unescape pitfall; replaced with a single regex + replacer. 4. importId sanitization for the default assertion name (rejects /, whitespace, IRI-unsafe chars per validateAssertionName). 5. Python example: missing 'import os' + undefined 'port'. 5 new unit tests cover each fix; 18/18 pass. Co-authored-by: Cursor <cursoragent@cursor.com> # Conflicts: # packages/cli/skills/dkg-importer/SKILL.md # scripts/lib/__tests__/manifest.test.mjs # scripts/lib/manifest.mjs

github-actions · 2026-05-25T20:55:53Z

+      throw err;
+    }
+  }
+  const triples = buildInitialManifestTriples(importId, partitions, new Date().toISOString());


🔴 Bug: retries/resumes for the same importId will append a fresh imp:startedAt every time, so one manifest can end up with multiple conflicting start timestamps. Because this helper explicitly treats already exists as success, that retry path is normal. Preserve the original start time on reuse (or make it caller-supplied/stable) instead of regenerating it here.

Fixed in 59532a3. On assertion reuse, createImportManifest now detects the 409 and suppresses a fresh imp:startedAt while still healing the root/partition triples. Added a regression that reruns create against an existing assertion and asserts the original startedAt triple is unchanged.

github-actions · 2026-05-25T20:55:53Z

+  // `imp:initialStatus` and resume can't recover. Chunk by `ROOT_CHUNK` so
+  // very-large imports stay within the daemon's body-size budget — see
+  // ADR 0002.
+  const ROOT_CHUNK = 1000;


🔴 Bug: fixed ROOT_CHUNK = 1000 batches can still exceed the daemon's /promote 256 KB body cap when partition URIs are long (deep paths, percent-encoding, GitHub-style keys). That makes manifest creation fail on large imports even though the new skill documents 413-aware splitting as part of the contract. This promote loop should be size-aware or halve-and-retry on 413.

Fixed in 59532a3. Promote now batches roots by estimated JSON body size under the daemon cap and recursively splits/retries on 413, so long percent-encoded partition URIs do not blow up a nominal 1000-root chunk. Added a mock 413 regression for long partition keys.

github-actions · 2026-05-25T20:55:54Z

+    partitions: ['only.ts'],
+    subGraphName: 'meta',
+  });
+  // The mock records all `request()` calls; the create call should have


🟡 Issue: this regression test never asserts the sanitized assertion name it claims to cover. makeMockClient.request() drops the request body, so the test would still pass if createImportManifest sent the raw importId or omitted name entirely. Record the create payload in the mock and assert name === "import-manifest-corpus-2026-q1" so the regression is actually pinned.

Fixed in 59532a3. The mock client now records request() payloads, and the sanitized-name regression asserts the create body sends import-manifest-corpus-2026-q1.

Preserve the original imp:startedAt triple when createImportManifest reuses an existing assertion, and batch manifest promote roots by estimated JSON body size with recursive 413 split/retry. Extend the manifest mock to record create request bodies and simulate promote payload limits, covering sanitized assertion names and long partition URI promote batches. Co-authored-by: Codex <codex@openai.com>

github-actions · 2026-05-25T21:10:07Z

+  const prefix = 'import-manifest-';
+  const maxSlugLen = 256 - prefix.length;
+  const slug = importId
+    .replace(/[^A-Za-z0-9._-]+/g, '-')


🔴 Bug: This sanitization is no longer one-to-one. Different importIds such as a/b vs a b (and long IDs that only differ after the 240-char cutoff) both collapse to the same assertion name, so the later import will silently reuse the earlier manifest assertion and mix their progress/lifecycle state. Preserve uniqueness here, for example by appending a stable hash/encoded suffix whenever sanitization or truncation changes the original id.

github-actions · 2026-05-25T21:10:07Z

+  return `${partitionUri(importId, key)}/event/${ts}-${seq}-${rand}`;
+}
+
+function lit(value) {


🔴 Bug: The new literal codec only escapes \\, ", \n, and \r. Partition keys/status values containing other valid RDF control escapes such as tabs are emitted raw here, which can make the manifest write invalid or read back corrupted. Please reuse the existing shared literal helper or extend both the encode and decode paths to handle the full N-Triples escape set consistently.

github-actions · 2026-05-25T21:58:40Z

+  const evIri = statusEventUri(importId, partitionKey);
+  const nowIso = new Date().toISOString();
+  const triples = [
+    { subject: partIri, predicate: IMPORT_P.statusEvent, object: evIri },


🔴 Bug: markPartitionStatus() never checks that partitionKey was declared in the manifest. If a caller passes a typo or a new key, this write/promote succeeds, but loadImportManifest() will never surface that status because it only walks <import> imp:partition ?part edges created by createImportManifest(). That leaves the real partition looking permanently pending with no obvious error. Please validate membership before writing, or backfill/promote the partition declaration when it is missing.

github-actions · 2026-05-25T21:58:40Z

+    createdFresh = false;
+  }
+  const importIri = importUri(importId);
+  const triples = buildInitialManifestTriples(importId, partitions, new Date().toISOString())


🔴 Bug: on the already exists path we blindly append whatever partitions array the caller supplies into the existing assertion. Reusing an importId with a different partition set will silently produce the union of old and new partitions, so resume logic can keep retrying stale entries from a previous run. Consider loading the existing manifest on reuse and failing fast when the partition set differs, or make additive extension an explicit opt-in.

github-actions · 2026-05-25T21:58:40Z


 P2P tools fail gracefully when the peer is offline. `dkg_publish` (fresh quads + write + publish, two HTTP calls) and `dkg_shared_memory_publish` (publish existing SWM, one HTTP call) differ in intent: use the two-call helper for "I have quads, publish now"; use the canonical finalizer as step 4 of the stepwise write → promote → publish flow. `dkg_share` is a direct SWM convenience helper for quick team-visible notes, not a replacement for assertion lifecycle tracking.

+**Bulk imports (>5,000 quads in one logical operation):** the per-call `dkg_assertion_*` loop IS the chunked-write API; there is no `/api/import/bulk`. If you're pushing a code graph, a corpus, or any multi-thousand-quad import, read [`dkg-importer/SKILL.md`](../dkg-importer/SKILL.md) first — it codifies the chunking budgets, HTTP 413 recovery, resumability manifest, and canonical-URI rules so your importer converges with the rest of the workspace.


🟡 Issue: this cross-link is not reachable for installed users right now. The setup/install flows and the public skill endpoint still hardcode only skills/dkg-node/SKILL.md, so agents running via dkg mcp/openclaw/hermes setup will not have ../dkg-importer/SKILL.md available. Either ship the importer skill through those paths too, or keep the bulk-import guidance inline here until the extra skill is actually distributed.

github-actions · 2026-05-25T22:03:36Z

+try {
+  await client.writeOne(slice);
+} catch (err) {
+  if (err.statusCode !== 413) throw err;


🔴 Bug: These retry examples check err.statusCode, but scripts/lib/dkg-daemon.mjs sets err.status on failed requests. Copied verbatim, the 413 branch never runs and the sample rethrows instead of halving. Please switch this snippet (and the similar one below) to the actual error shape or document a wrapper that normalizes it.

github-actions · 2026-05-25T22:03:36Z

+### Connection errors / 5xx
+
+Standard retry with exponential backoff. The daemon does not implement
+idempotency tokens, but `assertion/create` returns 200 for both first and


🟡 Issue: The raw daemon route does not return 200 on duplicate POST /api/assertion/create; packages/cli/src/daemon/routes/assertion.ts maps already exists to HTTP 400. Only higher-level clients normalize that into an idempotent success. As written, this tells importer authors the wrong wire contract and can lead to broken retry handling.

github-actions · 2026-05-25T22:13:01Z

+  if (!subGraphName) throw new Error('markPartitionStatus requires `subGraphName`.');
+
+  const cgId = client.cgId;
+  const assertion = assertionName ?? defaultManifestAssertionName(importId);


🔴 Bug: markPartitionStatus() recomputes the assertion name from importId on every write. If the manifest was originally created with a custom assertionName, a resumed importer that only knows importId will target a different assertion here and /api/assertion/:name/write will fail with not found. Because loadImportManifest() doesn't persist or return the actual assertion name, callers have no way to recover it later. Please either persist the chosen assertion name in the manifest and resolve it here, or remove the public assertionName override so the name is always derivable.

github-actions · 2026-05-25T22:13:01Z

+
+  const cgId = client.cgId;
+  const assertion = assertionName ?? defaultManifestAssertionName(importId);
+  const manifest = await loadImportManifest({ client, importId, subGraphName });


🟡 Issue: this validates the partition key by loading and sorting the entire manifest on every status update. For the bulk imports this helper targets, that makes status tracking quadratic (in_progress + done across 10k partitions means 20k full SWM queries). Consider caching the declared partition set from createImportManifest()/loadImportManifest(), or replacing this with a narrower existence check / optional validation flag.

github-actions

Codex review skipped: filtered diff is 11370 lines (cap: 5,000). Please consider splitting this into smaller PRs for reviewability.

… via HTTP Two PR #642 follow-ups + the originally-scheduled SKILL.md update for the async promote queue (the bulk of PR #4's documentation work, deferred until #642 merged because that's where dkg-importer/SKILL.md first landed). 1. packages/cli/skills/dkg-importer/SKILL.md — new §6 "Async promote queue" - Route inventory (POST/GET/DELETE /api/assertion/.../promote-async + recover) - Async write loop (drop-in for §2's synchronous promote step) - Failure-classification table mapping attempt.lastError.classification (transient | cap_exceeded | fatal) to importer actions - Migration guidance vs. the sync /promote route (still supported) - Inspecting in-flight jobs alongside the manifest in §3 New cheat-sheet variant under §8 mirroring the async loop. §5's forward reference to "PR #643's async promote queue ... until that lands" rewritten to point at the now-shipping §6. 2. New endpoint GET /.well-known/skill-importer.md (Codex PR #642 follow-up) The cross-link from dkg-node/SKILL.md to dkg-importer/SKILL.md was unreachable for agents installed via the setup flow because only the former was served from packages/cli/src/daemon/manifest.ts's loadSkillTemplate. Adds a sibling loadImporterSkillTemplate + /.well-known/skill-importer.md route in status.ts (same auth-public, ETag-cacheable shape as /.well-known/skill.md), plus PUBLIC_GET_PATHS / PUBLIC_HEAD_PATHS / isLoopbackRateLimitExemptPath allowlist entries. Cross-link in dkg-node/SKILL.md now points at the runtime URL. 4 new tests (auth public path + 4 dkg-importer skill content pins). 3. scripts/lib/manifest.mjs — partitionDeclared helper (Codex PR #642 quadratic-validation follow-up). markPartitionStatus used to call loadImportManifest on every status write, materialising the full manifest from SWM. For a 10k-partition import marking each twice (in_progress + done), that's 20k full SWM round-trips. Replaced with a single-row SPARQL ASK: one query, one Boolean, same error semantics when a partition isn't declared. 4 new tests including a regression pin asserting markPartitionStatus issues exactly one query. Tests: 27/27 (scripts/lib manifest), 32/32 (publisher async-promote-queue), 190/190 (CLI unit suites incl. new skill-endpoint cases, async-promote routes/worker/E2E, auth allowlist). tsc green. Zero lints. Co-authored-by: Cursor <cursoragent@cursor.com>

branarakic requested review from Jurij89 and zsculac as code owners May 25, 2026 11:51

github-actions Bot reviewed May 25, 2026

View reviewed changes

branarakic mentioned this pull request May 25, 2026

docs(specs): RFC for async WM→SWM promote queue #643

Merged

5 tasks

branarakic force-pushed the feat/dkg-importer-helpers branch from 5d2a4e9 to 673dfca Compare May 25, 2026 12:13

github-actions Bot reviewed May 25, 2026

View reviewed changes

branarakic force-pushed the feat/dkg-importer-helpers branch from 673dfca to c5e6d51 Compare May 25, 2026 13:07

github-actions Bot reviewed May 25, 2026

View reviewed changes

branarakic pushed a commit that referenced this pull request May 25, 2026

merge: PR #642 (incl. #641) — ADRs + importer manifest + SKILL.md

f401e92

github-actions Bot reviewed May 25, 2026

View reviewed changes

branarakic force-pushed the feat/dkg-importer-helpers branch from 3b04c99 to bb8b16e Compare May 25, 2026 20:37

github-actions Bot reviewed May 25, 2026

View reviewed changes

branarakic mentioned this pull request May 25, 2026

docs(adr): importer chunking contract + code-graph ontology convergence #641

Merged

3 tasks

Branimir Rakic and others added 5 commits May 25, 2026 22:42

fix(importer): harden manifest resume semantics

c9e4d91

branarakic changed the base branch from docs/importer-adrs to main May 25, 2026 20:42

branarakic force-pushed the feat/dkg-importer-helpers branch from bb8b16e to 36e4c22 Compare May 25, 2026 20:42

github-advanced-security AI found potential problems May 25, 2026

View reviewed changes

Comment thread scripts/lib/manifest.mjs Fixed

github-actions Bot reviewed May 25, 2026

View reviewed changes

fix(importer): make manifest ids collision-safe

3a90293

github-actions Bot reviewed May 25, 2026

View reviewed changes

fix(importer): validate manifest partition reuse

752cbb4

github-actions Bot reviewed May 25, 2026

View reviewed changes

docs(importer): fix daemon retry examples

a526ee8

github-actions Bot reviewed May 25, 2026

View reviewed changes

fix(importer): derive manifest assertion names

884a444

github-actions Bot reviewed May 25, 2026

View reviewed changes

branarakic mentioned this pull request May 25, 2026

feat(daemon): PR #4 — async-promote queue config knobs + E2E test + SKILL.md docs #667

Closed

7 tasks

branarakic merged commit 6dc410c into main May 25, 2026
36 checks passed

branarakic mentioned this pull request May 27, 2026

docs(skill/importer): document Rule 4 root-entity uniqueness + blank-node rewrite recipe #721

Merged

3 tasks


		P2P tools fail gracefully when the peer is offline. `dkg_publish` (fresh quads + write + publish, two HTTP calls) and `dkg_shared_memory_publish` (publish existing SWM, one HTTP call) differ in intent: use the two-call helper for "I have quads, publish now"; use the canonical finalizer as step 4 of the stepwise write → promote → publish flow. `dkg_share` is a direct SWM convenience helper for quick team-visible notes, not a replacement for assertion lifecycle tracking.

		Bulk imports (>5,000 quads in one logical operation): the per-call `dkg_assertion_*` loop IS the chunked-write API; there is no `/api/import/bulk`. If you're pushing a code graph, a corpus, or any multi-thousand-quad import, read [`dkg-importer/SKILL.md`](../dkg-importer/SKILL.md) first — it codifies the chunking budgets, HTTP 413 recovery, resumability manifest, and canonical-URI rules so your importer converges with the rest of the workspace.

Conversation

branarakic commented May 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Stack

Update — round 3 (2026-05-25): known caps + manifest hardening

dkg-importer/SKILL.md — Known caps with exact error strings (§1.1)

scripts/lib/manifest.mjs — round-3 hardening

Update — round 4 (2026-05-25 22:50): post-empirical-motivation Codex review

CRITICAL — loadImportManifest was reading the bare data graph, not SWM (L391)

CRITICAL — statusEventUri same-millisecond ordering was non-deterministic (L84)

CodeQL #90 — double-unescape risk in unquote (L138)

importId sanitization for default assertion name (L222)

Python sample fix (L127)

Test coverage

Uh oh!

github-actions Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

github-actions Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

github-actions Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

github-actions Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

github-actions Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

github-actions Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

github-actions Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

github-actions Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

github-actions Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

github-actions Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

branarakic commented May 25, 2026 •

edited

Loading

`dkg-importer/SKILL.md` — Known caps with exact error strings (§1.1)

`scripts/lib/manifest.mjs` — round-3 hardening

CRITICAL — `loadImportManifest` was reading the bare data graph, not SWM (L391)

CRITICAL — `statusEventUri` same-millisecond ordering was non-deterministic (L84)

CodeQL #90 — double-unescape risk in `unquote` (L138)

`importId` sanitization for default assertion name (L222)