[Feature] Adopt llmtxt v2026.4.8 — catch up on Storage Evolution (v4.5→v4.8) + leverage unused capabilities #96

@kryptobaseddev

Feature Request

TL;DR

I maintain llmtxt (npm) + llmtxt-core (crates.io) and CLEO is my daily driver. I audited CLEO's shipped bundle against what llmtxt currently exposes and found seven concrete integration opportunities. This issue documents what to bump to, what you get, and suggests an adoption order.

The package pin in @cleocode/cleo@2026.4.91 is "llmtxt": "^2026.4.6", but my local CLEO install and the downstream behaviour I'm observing both indicate the effective runtime is v2026.4.5 (likely a build-time lockfile). You're two major feature releases behind: v2026.4.6 Storage Evolution (7 epics, 30+ commits) and the v2026.4.7/4.8 bundler compatibility fixes. v2026.4.8 is published (npm view llmtxt dist-tags → latest: 2026.4.8).
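For reference, a caret range on a version with a non-zero major accepts any version with the same major that is not lower than the stated one. The sketch below is a simplified check (not npm's semver package; the helper names are illustrative). It suggests that ^2026.4.6 cannot resolve to 2026.4.5 on a fresh install, and that it already admits 2026.4.8:

```typescript
// Simplified caret-range check for versions with a non-zero major component.
// Illustrative only: it ignores pre-release tags and the special 0.x rules.
type Version = [number, number, number];

function parseVersion(text: string): Version {
  const parts = text.replace(/^[\^v]/, "").split(".").map(Number);
  if (parts.length !== 3 || parts.some((n) => !Number.isInteger(n))) {
    throw new Error(`not a plain x.y.z version: ${text}`);
  }
  return [parts[0], parts[1], parts[2]];
}

function satisfiesCaret(range: string, candidate: string): boolean {
  const [baseMajor, baseMinor, basePatch] = parseVersion(range);
  const [major, minor, patch] = parseVersion(candidate);
  if (major !== baseMajor) return false; // a caret range never crosses a major
  if (minor !== baseMinor) return minor > baseMinor;
  return patch >= basePatch;
}

console.log(satisfiesCaret("^2026.4.6", "2026.4.5")); // false
console.log(satisfiesCaret("^2026.4.6", "2026.4.8")); // true
```

If that holds for CLEO's build, the v2026.4.5 behaviour would have to come from a lockfile or a bundled copy rather than from the declared range.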

Bundler fix that directly concerns CLEO: v2026.4.7's move of better-sqlite3 / drizzle-orm / postgres to peer-optional introduced a load-time ERR_MODULE_NOT_FOUND for lightweight consumers — confirmed broken on cleo docs generate (silently falls back). v2026.4.8 (just released) fixes this by lazy-loading LocalBackend / RemoteBackend inside createBackend(). Bump past v2026.4.7 directly to v2026.4.8.

Repo: https://github.com/kryptobaseddev/llmtxt
Specs: docs/specs/P1-loro-migration.md, P2-cr-sqlite.md, P3-p2p-mesh.md, ARCH-T426-ephemeral-agent-lifecycle.md, ARCH-T427-document-export-ssot.md, ARCH-T428-binary-blob-attachments.md, ARCH-T429-hub-spoke-topology.md
Live CI green on main: https://github.com/kryptobaseddev/llmtxt/actions
Live API: https://api.llmtxt.my/api/health (HTTP 200)


0. Bump to v2026.4.8 + trim externalize list

cd packages/cleo  # or wherever llmtxt lives in your monorepo
pnpm add llmtxt@^2026.4.8

Updated build.mjs externalize list (v2026.4.8 verified against esbuild 0.28):

// Minimum required — v2026.4.8+
// onnxruntime-node NO LONGER needs --external (runtime-opaque import in embeddings.ts)
external: [
  'better-sqlite3',       // native .node addon (standalone topology only)
  '@vlcn.io/crsqlite',    // native extension, ESM-only (mesh topology only)
  'drizzle-orm',          // transitively pulls mssql + @opentelemetry/api
  'drizzle-orm/*',        // subpath imports
  'postgres',             // hub-spoke Postgres backend only
  'mssql',                // drizzle-orm/v1-beta multi-dialect residual
  '@opentelemetry/api',   // drizzle-orm/v1-beta tracing
]

If CLEO never imports beyond generateOverview (the current call site), this works too — the lazy-load in v2026.4.8 means even lighter consumers like this succeed:

external: [
  'better-sqlite3', 'drizzle-orm', 'drizzle-orm/*',
  'postgres', 'mssql', '@opentelemetry/api', '@vlcn.io/crsqlite',
]
// then: const { generateOverview } = await import('llmtxt')
// works without any of the above installed

Per-topology install matrix is in packages/llmtxt/README.md.


1. What I actually tested in your current CLI

I exercised CLEO commands against a real task on my machine. Data, not assumptions:

| Command | Result | Observation |
| --- | --- | --- |
| cleo docs list --task T014 | {count:0, attachments:[]} | Interface works |
| cleo docs add T014 /tmp/file.txt | E_VALIDATION_FAILED: Path traversal | Absolute paths outside project root are rejected. llmtxt BlobOps doesn't have this restriction (content-addressed, no path concept). |
| cleo docs add T014 claudedocs/file.txt --desc "..." | {attachmentId, sha256, refCount:1, kind:"local-file"} | Works with project-relative path. SHA-256 + refCount tracking already aligns with BlobOps. |
| cleo docs generate --for T014 | {content:..., usedLlmtxtPackage:false, attachmentCount:1} | usedLlmtxtPackage: false means the dynamic import('llmtxt') at dist/cli/index.js:95141 is silently failing. The root cause is traced below the table. |
| cleo docs generate --task T014 | Missing required argument | Accepts --for <id>, not --task <id>. Minor CLI inconsistency between docs list --task vs docs generate --for. |

The silent fallback (usedLlmtxtPackage: false) is the most concerning finding. Your tryLoadGenerateOverview():

async function tryLoadGenerateOverview() {
  try { const mod = await import("llmtxt"); if (typeof mod.generateOverview === "function") return mod.generateOverview; return null; }
  catch { return null; }
}

This catches the import error. When I ran the import manually from CLEO's install dir:

$ cd ~/.npm-global/lib/node_modules/@cleocode/cleo-os/node_modules/@cleocode/cleo && \
  node -e "import('llmtxt').catch(e => console.error('FAILED:', e.message))"
FAILED: Cannot find package 'better-sqlite3' imported from \
  ~/.npm-global/lib/node_modules/@cleocode/cleo-os/node_modules/llmtxt/dist/local/local-backend.js

That's the regression v2026.4.8 fixes. After CLEO bumps, usedLlmtxtPackage will become true and you'll get structural section analysis from llmtxt instead of the built-in fallback.
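Independently of the bump, it might be worth making the fallback visible. Below is a minimal sketch of a loader that keeps the "function or nothing" contract but returns the reason for a failure. The name loadOptionalExport and the LoadResult shape are hypothetical, not CLEO's or llmtxt's API:

```typescript
// Hypothetical replacement for the silent loader: same "function or nothing"
// contract, but the reason for a fallback is returned instead of swallowed.
type AnyFn = (...args: unknown[]) => unknown;
type LoadResult = { ok: true; fn: AnyFn } | { ok: false; reason: string };

async function loadOptionalExport(specifier: string, exportName: string): Promise<LoadResult> {
  try {
    const mod: Record<string, unknown> = await import(specifier);
    const candidate = mod[exportName];
    if (typeof candidate === "function") {
      return { ok: true, fn: candidate as AnyFn };
    }
    return { ok: false, reason: `${specifier} has no function export "${exportName}"` };
  } catch (err) {
    // Node attaches a code such as ERR_MODULE_NOT_FOUND to resolution errors.
    const code = (err as { code?: string } | null)?.code ?? "UNKNOWN";
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, reason: `${code}: ${message}` };
  }
}

// Usage: report the reason once so a fallback shows up in CLI output.
loadOptionalExport("llmtxt", "generateOverview").then((result) => {
  if (!result.ok) console.warn(`llmtxt unavailable, using built-in fallback (${result.reason})`);
});
```

With something like this in place, the missing better-sqlite3 peer would have surfaced in the command output instead of only as usedLlmtxtPackage: false.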


2. Current CLEO surface using llmtxt (audit)

dist/cli/index.js:95141:    const mod = await import("llmtxt");

Exactly one site: cleo docs generate fallback. Everything else (brain, attachments, audit, NEXUS, sessions, lifecycle, roadmap, analysis) is homegrown.


3. High-value integration opportunities

Ordered low-risk / high-value first.

3.1 Replace cleo docs add/fetch/remove backing store with llmtxt BlobOps (T428)

CLEO today: on-disk store with refCount tracking. docs fetch returns base64 inline for ≤ 1 MB, path-only otherwise.

llmtxt offers (v2026.4.6 epic T428):

  • Content-addressed (SHA-256 = key) — automatic dedup across projects
  • Hash-verify-on-read (tamper detection) — mandatory per security mandate
  • LWW merge per attachment name — multi-agent-safe
  • 3 adapters: BlobFsAdapter (local), BlobPgAdapter S3/R2 + PG LOB
  • Lazy sync via changeset refs
  • Max size enforcement (100 MB default)
  • Sibling CLI: llmtxt attach / detach / blobs
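To make the first two bullets concrete, here is a toy in-memory store showing content addressing, dedup, and hash-verify-on-read. It is a sketch of the general technique under my own naming (ContentStore, put, get), not llmtxt's BlobOps implementation:

```typescript
import { createHash } from "node:crypto";

// Toy in-memory content-addressed store (illustrative; not llmtxt's BlobOps).
class ContentStore {
  private blobs = new Map<string, { data: Buffer; refCount: number }>();

  // Store bytes under their SHA-256 digest; identical content is stored once.
  put(data: Buffer): string {
    const hash = createHash("sha256").update(data).digest("hex");
    const existing = this.blobs.get(hash);
    if (existing) existing.refCount += 1;
    else this.blobs.set(hash, { data: Buffer.from(data), refCount: 1 });
    return hash;
  }

  // Re-hash on every read so corrupted or tampered bytes are never returned.
  get(hash: string): Buffer {
    const entry = this.blobs.get(hash);
    if (!entry) throw new Error(`unknown blob ${hash}`);
    const actual = createHash("sha256").update(entry.data).digest("hex");
    if (actual !== hash) throw new Error(`hash mismatch for ${hash}`);
    return entry.data;
  }

  refCount(hash: string): number {
    return this.blobs.get(hash)?.refCount ?? 0;
  }
}

const store = new ContentStore();
const first = store.put(Buffer.from("hello"));
const second = store.put(Buffer.from("hello")); // same bytes, same key
console.log(first === second, store.refCount(first)); // true 2
```

Because the key is derived from the bytes, the same file attached from two projects maps to one entry, which is where the cross-project dedup comes from.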

Integration pattern:

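import { basename } from 'node:path';
import { readFile } from 'node:fs/promises';
import { createBackend } from 'llmtxt';  // v2026.4.8+

// detectMime and currentAgent stand in for CLEO's own helpers.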
const backend = await createBackend({ topology: 'standalone', storagePath: '~/.cleo' });

// cleo docs add T014 <file>
const ref = await backend.attachBlob('T014', {
  name: basename(file),
  contentType: detectMime(file),
  data: await readFile(file),
  uploadedBy: currentAgent,
});

// cleo docs fetch <id-or-hash>
const { data, contentType, hash, size, uploadedBy, uploadedAt } = await backend.getBlob('T014', ref.name);

Drop-in for your current attachment store. Gains dedup + hash-verify-on-read + multi-device sync (§3.4). Absolute-path rejection you currently have can be kept as a CLI guard or dropped since blobs are hash-keyed, not path-keyed.
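If you keep the path rejection as a CLI guard, it could look roughly like the following. This is a generic containment check written for this issue (isInsideRoot is a hypothetical name, and CLEO's actual validation is not shown here), with results assuming POSIX-style paths:

```typescript
import * as path from "node:path";

// Returns true when `candidate` resolves to a location strictly inside `root`.
// Hypothetical helper; CLEO's real validation is not shown in this issue.
function isInsideRoot(root: string, candidate: string): boolean {
  const absoluteRoot = path.resolve(root);
  const resolved = path.resolve(absoluteRoot, candidate);
  const relative = path.relative(absoluteRoot, resolved);
  const escapes = relative === ".." || relative.startsWith(".." + path.sep);
  return relative !== "" && !escapes && !path.isAbsolute(relative);
}

console.log(isInsideRoot("/srv/project", "claudedocs/file.txt")); // true
console.log(isInsideRoot("/srv/project", "/tmp/file.txt"));       // false
console.log(isInsideRoot("/srv/project", "../other/file.txt"));   // false
```

The guard would then only decide which files the CLI is willing to read; the blob store itself never sees the path.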

Spec: docs/specs/ARCH-T428-binary-blob-attachments.md


3.2 Wrap cleo complete <id> / cleo session * with AgentSession (T426, signed receipts)

CLEO today: cleo complete mutates rows. Overrides go to .cleo/audit/force-bypass.jsonl. ADR-051 demands programmatic evidence on every gate.

llmtxt offers (T426 + signed-writes from v2026.4.5):

  • AgentSession state machine: open() → contribute(fn) → close()
  • close() returns ContributionReceipt: { sessionId, agentId, documentIds[], eventCount, sessionDurationMs, openedAt, closedAt, signature? }
  • Crash recovery: TTL-based lease/presence cleanup (50-worker swarm test — 17ms convergence, 70+ lease contentions, zero orphans)
  • Ed25519 signed writes: cryptographically attributable

import os from 'node:os';
import { AgentSession, createBackend } from 'llmtxt';

// Inside `cleo session start`:
const backend = await createBackend({ topology: 'standalone', storagePath: '~/.cleo' });
const session = new AgentSession({ backend, agentId: os.hostname() });
await session.open();

// Inside `cleo complete T123`:
await session.contribute(async () => await applyTaskCompletion('T123'));

// When the session ends, close() returns the ContributionReceipt:
const receipt = await session.close();
// Persist receipt as audit evidence; a uniform signed ledger replaces force-bypass.jsonl
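
To show what a signed receipt buys you as audit evidence, here is a sketch of signing and verifying a receipt with Ed25519 through Node's built-in crypto module. The field names follow the ContributionReceipt shape listed above. The canonical byte encoding (sorted-key JSON) and the sample values are assumptions for illustration, and this is not llmtxt's actual signing code:

```typescript
import { generateKeyPairSync, sign, verify } from "node:crypto";

// Field names follow the ContributionReceipt shape listed above; the canonical
// byte encoding used here (sorted-key JSON) is an assumption for illustration.
interface Receipt {
  sessionId: string;
  agentId: string;
  documentIds: string[];
  eventCount: number;
  sessionDurationMs: number;
  openedAt: string;
  closedAt: string;
}

function canonicalBytes(receipt: Receipt): Buffer {
  const entries = Object.entries(receipt).sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
  return Buffer.from(JSON.stringify(Object.fromEntries(entries)), "utf8");
}

const { publicKey, privateKey } = generateKeyPairSync("ed25519");

const receipt: Receipt = {
  sessionId: "s-1", agentId: "agent-a", documentIds: ["T123"],
  eventCount: 1, sessionDurationMs: 42,
  openedAt: "2026-01-01T00:00:00.000Z", closedAt: "2026-01-01T00:00:00.042Z",
};

// Ed25519 takes no separate digest algorithm, hence the `null` first argument.
const signature = sign(null, canonicalBytes(receipt), privateKey);

const genuine = verify(null, canonicalBytes(receipt), publicKey, signature);
const tampered = verify(null, canonicalBytes({ ...receipt, eventCount: 99 }), publicKey, signature);
console.log(genuine, tampered); // true false
```

Any later edit to a persisted receipt fails verification, which is the property a gate-evidence ledger needs.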

Spec: docs/specs/ARCH-T426-ephemeral-agent-lifecycle.md


3.3 backend.exportDocument for cleo roadmap / dash / briefing --format md (T427)

Four byte-deterministic formats (markdown / json / txt / llmtxt) with 100-iteration determinism test in the llmtxt repo.

await backend.exportDocument('roadmap', { format: 'md', outputPath: './ROADMAP.md', includeMetadata: true });

Plus llmtxt export-all for per-task markdown snapshots, and importDocument for round-trip (edit local md, re-ingest into CLEO brain).
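
Byte-determinism mostly comes down to removing incidental ordering from the output. As a rough illustration of the idea (my own sketch for JSON-compatible values, not llmtxt's exporter), a serialiser that sorts object keys gives identical bytes for logically equal inputs:

```typescript
// Serialises a JSON-compatible value with object keys in sorted order so that
// logically equal inputs yield identical bytes. Not llmtxt's exporter.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map((item) => stableStringify(item)).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const record = value as Record<string, unknown>;
    const body = Object.keys(record)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(record[key])}`)
      .join(",");
    return `{${body}}`;
  }
  return JSON.stringify(value);
}

const a = stableStringify({ title: "Roadmap", tasks: [{ id: "T1", done: true }] });
const b = stableStringify({ tasks: [{ done: true, id: "T1" }], title: "Roadmap" });
console.log(a === b); // true
console.log(a);       // {"tasks":[{"done":true,"id":"T1"}],"title":"Roadmap"}
```

That is what makes a generated ROADMAP.md safe to commit: re-running the export without data changes produces no diff.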

Spec: docs/specs/ARCH-T427-document-export-ssot.md


3.4 Make .cleo/brain.db CRDT via @vlcn.io/crsqlite (T385, optional peer)

Today: .cleo/brain.db is raw SQLite. Multi-device requires manual sync.

llmtxt offers (T385): LocalBackend optionally loads @vlcn.io/crsqlite at open. When present, DB becomes a CRDT: getChangesSince(dbVersion) → changeset, applyChanges(changeset) → merge. App-level Loro merge for crdt_state blobs (DR-P2-04). Graceful skip when native .so missing — DB still works local-only.

const a = await createBackend({ topology: 'standalone', storagePath: '~/.cleo' });
const b = await createBackend({ topology: 'standalone', storagePath: '/other-machine/.cleo' });
console.log(a.hasCRR, b.hasCRR);  // true iff @vlcn.io/crsqlite installed
await b.applyChanges(await a.getChangesSince(lastSync));  // converges
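
For readers new to changeset sync, the getChangesSince / applyChanges pair can be modelled in a few lines. The toy below resolves conflicts last-writer-wins on a logical timestamp. It is a teaching model I wrote for this issue, not cr-sqlite's actual algorithm, and a real system would also break timestamp ties deterministically (for example by site id):

```typescript
// Toy replica: every write bumps a local version counter, and a changeset is
// "all rows written after version N". Conflicts resolve last-writer-wins on a
// logical timestamp. A teaching model only, not cr-sqlite's algorithm.
interface Row { key: string; value: string; timestamp: number; version: number }

class Replica {
  private rows = new Map<string, Row>();
  private dbVersion = 0;

  set(key: string, value: string, timestamp: number): void {
    this.dbVersion += 1;
    this.rows.set(key, { key, value, timestamp, version: this.dbVersion });
  }

  get(key: string): string | undefined {
    return this.rows.get(key)?.value;
  }

  getChangesSince(version: number): Row[] {
    return Array.from(this.rows.values()).filter((row) => row.version > version);
  }

  applyChanges(changes: Row[]): void {
    for (const incoming of changes) {
      const local = this.rows.get(incoming.key);
      // Keep the local row unless the incoming write is strictly newer.
      if (local && local.timestamp >= incoming.timestamp) continue;
      this.set(incoming.key, incoming.value, incoming.timestamp);
    }
  }
}

const laptop = new Replica();
const desktop = new Replica();
laptop.set("T014.status", "done", 2);
desktop.set("T014.status", "active", 1);
desktop.set("T015.status", "pending", 3);

// Exchange changesets in both directions; both replicas end up identical.
desktop.applyChanges(laptop.getChangesSince(0));
laptop.applyChanges(desktop.getChangesSince(0));
console.log(laptop.get("T014.status"), desktop.get("T014.status")); // done done
console.log(laptop.get("T015.status"));                             // pending
```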

Spec: docs/specs/P2-cr-sqlite.md


3.5 cleo nexus → hub-spoke topology (T429)

Today: registry file + manual sync.

llmtxt offers (T429): typed TopologyConfig + createBackend() with tested failure modes (hub unreachable, partition + reconnect, split-brain LWW + Loro merge).

// Hosted CLEO instance
const hub = await createBackend({ topology: 'hub-spoke', mode: 'hub', hubUrl, storagePath, apiKey });

// Local dev workstation
const spoke = await createBackend({
  topology: 'hub-spoke', mode: 'spoke', hubUrl, apiKey,
  persistLocally: true,  // keeps local cr-sqlite mirror for offline
  storagePath: '~/.cleo',
});
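
One benefit of a typed config is that invalid combinations fail at compile time. The union below is a hypothetical reconstruction that uses only the option names appearing in this issue. llmtxt's real TopologyConfig may be shaped differently, and worksOffline is a made-up helper to show the narrowing:

```typescript
// Hypothetical discriminated union built only from the option names used in
// this issue; llmtxt's real TopologyConfig may be shaped differently.
type TopologyConfig =
  | { topology: "standalone"; storagePath: string }
  | { topology: "hub-spoke"; mode: "hub"; hubUrl: string; apiKey: string; storagePath: string }
  | { topology: "hub-spoke"; mode: "spoke"; hubUrl: string; apiKey: string;
      persistLocally?: boolean; storagePath?: string }
  | { topology: "mesh"; storagePath: string; peers: string[] };

// Says whether a node keeps a local copy it can read while the hub is unreachable.
function worksOffline(config: TopologyConfig): boolean {
  switch (config.topology) {
    case "standalone":
    case "mesh":
      return true; // data lives in the local store
    case "hub-spoke":
      if (config.mode === "hub") return true;
      return config.persistLocally === true && config.storagePath !== undefined;
  }
}

const spoke: TopologyConfig = {
  topology: "hub-spoke", mode: "spoke", hubUrl: "https://hub.example", apiKey: "k",
  persistLocally: true, storagePath: "~/.cleo",
};
console.log(worksOffline(spoke)); // true
```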

Spec: docs/specs/ARCH-T429-hub-spoke-topology.md


3.6 P2P mesh for agent swarms (T386, optional)

For on-prem teams / air-gapped: Ed25519 mutual handshake + cr-sqlite changeset exchange over UnixSocket/HTTP. 5-peer convergence integration test in repo.

await createBackend({
  topology: 'mesh', storagePath: '~/.cleo',
  peers: ['unix:///tmp/cleo-mate.sock', 'http://peer-laptop.lan:5150'],
  identityKey: await loadOrGenerateEd25519Key(),
});
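
The two peer address forms above can be told apart with the standard URL parser. This is only a sketch of how a caller might validate the peers list before handing it over. The PeerEndpoint shape and parsePeer are hypothetical and not part of llmtxt's API:

```typescript
// Splits peer addresses like the ones above into a transport description.
// The PeerEndpoint shape is hypothetical, not part of llmtxt's API.
type PeerEndpoint =
  | { transport: "unix"; socketPath: string }
  | { transport: "http"; host: string; port: number };

function parsePeer(address: string): PeerEndpoint {
  const url = new URL(address);
  if (url.protocol === "unix:") {
    return { transport: "unix", socketPath: url.pathname };
  }
  if (url.protocol === "http:") {
    // An empty port string means the scheme default, which is 80 for http.
    return { transport: "http", host: url.hostname, port: url.port ? Number(url.port) : 80 };
  }
  throw new Error(`unsupported peer scheme: ${url.protocol}`);
}

console.log(parsePeer("unix:///tmp/cleo-mate.sock"));
console.log(parsePeer("http://peer-laptop.lan:5150"));
```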

CLI (already shipped): llmtxt mesh start|stop|status|peers|sync.

Spec: docs/specs/P3-p2p-mesh.md


3.7 Progressive disclosure / retrieval planning for large task trees

Today: cleo find / list --parent stream subtrees. Token budgets manual.

llmtxt offers (pre-v2026.4.5, still relevant): generateOverview(content) + planRetrieval(overview, budget, query) — saves 60-80% tokens in practice. Exactly what your cleo docs generate already tries to use for attachment section analysis; same tooling can now drive a cleo briefing --token-budget 4000 --query X mode for LLM-prompt workflows.
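
To give a feel for budgeted retrieval, here is a greedy sketch: rank sections by a naive keyword score, then take them in rank order while the estimated token cost fits the budget. The scoring, the four-characters-per-token estimate, and every name below are my own assumptions; llmtxt's planRetrieval may work quite differently:

```typescript
// Greedy budgeted selection: rank sections by a naive keyword score, then take
// them in rank order while the estimated token cost still fits the budget.
// Illustrative only; llmtxt's planRetrieval may work quite differently.
interface Section { title: string; body: string }

// Rough heuristic: about four characters per token.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

function scoreSection(section: Section, query: string): number {
  const haystack = `${section.title} ${section.body}`.toLowerCase();
  return query.toLowerCase().split(/\s+/).filter((w) => w !== "" && haystack.includes(w)).length;
}

function planSections(sections: Section[], budget: number, query: string): Section[] {
  const ranked = sections.slice().sort((a, b) => scoreSection(b, query) - scoreSection(a, query));
  const chosen: Section[] = [];
  let remaining = budget;
  for (const section of ranked) {
    const cost = estimateTokens(section.title) + estimateTokens(section.body);
    if (cost > remaining) continue; // skip what does not fit, try smaller ones
    chosen.push(section);
    remaining -= cost;
  }
  return chosen;
}

const sections: Section[] = [
  { title: "Auth", body: "login flow".padEnd(400, ".") },   // about 101 tokens
  { title: "Storage", body: "blob adapters".padEnd(80, ".") }, // about 22 tokens
  { title: "Sync", body: "blob sync".padEnd(40, ".") },      // about 11 tokens
];
const plan = planSections(sections, 40, "blob storage");
console.log(plan.map((s) => s.title)); // [ 'Storage', 'Sync' ]
```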


4. Suggested adoption order

| Step | Change | Risk | Effort |
| --- | --- | --- | --- |
| 0 | Bump to llmtxt@^2026.4.8, rebuild (fixes silent usedLlmtxtPackage:false) | none | 5 min |
| 1 | Adopt BlobOps behind a feature flag for cleo docs add/fetch/remove | low | ~1 day |
| 2 | Wrap cleo session start + cleo complete with AgentSession | medium | ~2 days |
| 3 | Add cleo roadmap --format md via backend.exportDocument | none | ~2 hours |
| 4 | Opt-in CRR mode for multi-device brain | low | ~1 day |
| 5 | Replumb cleo nexus on hub-spoke topology | medium | ~3 days |
| 6 | Optional P2P mesh for on-prem | high | ~1 week |
| 7 | Progressive disclosure for large task trees (cleo briefing --token-budget) | low | ~4 hours |

5. What NOT to adopt

  • LlmtxtDocument + LlmtxtLifecycle — llmtxt's own user-content model. CLEO's task system is richer and battle-tested.
  • llmtxt RBAC (org/document roles) — not relevant to CLEO.

6. Why it's a good fit

CLEO is building a multi-agent collaboration tool. llmtxt is now a mature multi-agent collaboration SDK. Every capability listed has a homegrown equivalent in CLEO that can be replaced by a peer-installable, test-covered, signed-audit-trail primitive.

Numbers:

  • 470 SDK unit tests + 352 cargo tests + 257 backend tests, all green
  • Production API (api.llmtxt.my) running the same code, HTTP 200 as of filing
  • Published on npm + crates.io under OIDC provenance
  • 4 RFC 2119 specs + 3 updated for Storage Evolution + 6 MDX doc pages

Happy to pair on the first two adoption steps and iterate on rough edges.


Are you using an AI agent?

Yes - AI agent filed this issue


Environment

| Component | Version |
| --- | --- |
| CLEO | 2026.2.1 |
| Node.js | v24.13.1 |
| OS | linux 6.19.11-200.fc43.x86_64 x64 (x64) |
| Shell | /bin/bash |
| gh CLI | gh version 2.87.3 (2026-02-23) |
| Install | /home/keatonhoskins/.npm-global/bin/cleo |
