fix(runtime): make v0.9.1 dist actually start#184

Closed
rohitg00 wants to merge 4 commits into main from fix/runtime-iii-sdk-0.11

Conversation

@rohitg00
Owner

@rohitg00 rohitg00 commented Apr 21, 2026

Summary

v0.9.1 dist crashed on first import. Local reproduction during end-to-end slot testing surfaced four independent bugs that each block startup.

What was broken

  1. getContext re-imported in three files — PR feat: implement multimodal image memory #111's conflict resolution re-added `import { getContext } from "iii-sdk"` to `compress.ts`, `disk-size-manager.ts`, `image-quota-cleanup.ts`. iii-sdk v0.11 dropped that export (see the shim comment in `src/logger.ts`). Tests mocked iii-sdk so `npm test` passed, but `node dist/cli.mjs` died with `The requested module 'iii-sdk' does not provide an export named 'getContext'`.
  2. Old `registerFunction({ id, description }, handler)` shape — two of those files also used the pre-0.11 signature. 0.11 only accepts `registerFunction(id, handler, options?)`.
  3. `KV.state` missing — same conflict resolution pass dropped `state: "mem:state"` from `src/state/schema.ts`. `disk-size-manager.ts` uses that key for persistence.
  4. tsdown inlined onnxruntime-node + @xenova/transformers — the bundled `binding.js` rewrote the relative require for `../bin/napi-v3/darwin/arm64/onnxruntime_binding.node` so it no longer resolved from `dist/`. Even with `AGENTMEMORY_IMAGE_EMBEDDINGS` off, the module graph still evaluated the bundled binding on startup and threw.

What shipped

  • Dropped the three `getContext` imports, switched to module-level `logger`.
  • Collapsed `registerFunction({ id, description }, ...)` to `registerFunction(id, ...)`.
  • Restored `state: "mem:state"` in `src/state/schema.ts`.
  • `tsdown.config.ts` marks these as `external`: `@xenova/transformers`, `onnxruntime-node`, `onnxruntime-web`, `@anthropic-ai/claude-agent-sdk`, `@anthropic-ai/sdk`. Bundle 6.1 MB → 1.9 MB. CLIP + local embedding providers lazy-load them from `node_modules` where relative paths work.
  • Bumped `iii-sdk` 0.11.0 → 0.11.2 to match the shipping API (Logger / durable:subscriber / durable:publisher / TriggerAction.void).
  • `test/multimodal.test.ts` registerFunction mocks updated to match the real `(id, cb)` signature.
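
The logger and registration changes above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the iii-sdk API itself: `FakeSdk`, `Handler`, and the `logger` object are hypothetical stand-ins, and only the `registerFunction(id, handler, options?)` shape and the `mem::disk-size-delta` ID come from this PR.

```typescript
// Hypothetical stand-ins for illustration; the real types live in iii-sdk.
type Handler = (input: unknown) => Promise<unknown>;

// Module-level logger: replaces the per-call getContext().logger lookup.
const logLines: string[] = [];
const logger = {
  info: (msg: string, meta?: Record<string, unknown>): void => {
    logLines.push(`${msg} ${JSON.stringify(meta ?? {})}`);
  },
};

class FakeSdk {
  readonly registered = new Map<string, Handler>();

  // Flat 0.11-style signature: the ID is a plain string, not { id, description }.
  registerFunction(
    id: string,
    handler: Handler,
    _options?: Record<string, unknown>,
  ): void {
    if (typeof id !== "string") {
      throw new TypeError("registerFunction expects a string id");
    }
    this.registered.set(id, handler);
  }
}

const sdk = new FakeSdk();

// Before (pre-0.11 shape, no longer accepted):
//   sdk.registerFunction({ id: "mem::disk-size-delta", description: "..." }, handler)
sdk.registerFunction("mem::disk-size-delta", async (input) => {
  logger.info("disk size delta received", { input });
  return { ok: true };
});
```

Because the logger is a module-level value, the handler no longer depends on any export that the SDK might drop between releases.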

Verified end-to-end

```
$ AGENTMEMORY_SLOTS=true AGENTMEMORY_REFLECT=true node dist/cli.mjs
[agentmemory] Slots: enabled (pinned editable memory). Reflect on Stop hook: on
[agentmemory] Ready. BM25+Graph search active.
[agentmemory] Endpoints: 107 REST + 51 MCP tools + 6 MCP resources + 3 MCP prompts

$ curl :3111/agentmemory/livez → 200 { status: ok }
$ curl :3111/agentmemory/slots → 8 defaults seeded into correct scopes
$ curl -X POST :3111/agentmemory/slot/append -d '{"label":"persona","text":"..."}' → 200
$ curl -X POST :3111/agentmemory/slot -d '{"label":"x","sizeLimit":-1}' → 400 sizeLimit must be a positive integer
$ curl -X POST :3111/agentmemory/slot -d '{"label":"x","scope":"other"}' → 400 scope must be 'project' or 'global'
$ curl -X POST :3111/agentmemory/slot/append -d '{"label":"tight","text":"<600 chars>"}' → 413 append would exceed sizeLimit
$ curl -X POST :3111/agentmemory/slot/reflect -d '{"sessionId":"empty"}' → { applied: 0 }
$ curl -X POST :3111/agentmemory/vision-search -d '{"queryText":"x"}' → 503 image embeddings disabled
```

Test plan

  • `npm run build` clean
  • `npm test` — 812/812 pass
  • End-to-end smoke test above — all eight default slots seed to correct scopes, validation errors surface with specific messages, audit trail populated on every write and reflect.

Summary by CodeRabbit

  • Chores

    • Updated iii-sdk dependency to ^0.11.2
    • Changed CI/publish install steps and build settings to avoid bundling certain external packages
  • Refactor

    • Simplified internal function registration and unified logging across handlers
  • New

    • Added a persisted state key/schema entry for disk-size tracking
  • Tests

    • Updated tests/mocks to match the adjusted function registration signature

… onnx, restore KV.state

The v0.9.1 dist was dead on arrival. Four independent bugs layered (items 1-4 below), plus two supporting changes:

1) `getContext` was re-imported during PR #111 conflict resolution in
   three files (compress.ts, disk-size-manager.ts, image-quota-cleanup.ts).
   iii-sdk v0.11 dropped that export — the shim in src/logger.ts
   documents the fix. Tests mocked iii-sdk so `npm test` passed, but
   `node dist/cli.mjs` crashed on first import. Removed the imports,
   switched to the module-level logger.

2) The two PR #111 files also used the
   `registerFunction({ id, description }, handler)` shape. iii-sdk
   0.11.2 only accepts `registerFunction(id, handler, options?)`. The
   dist imports succeeded but the runtime registration threw "not a
   function" on the test mock. Collapsed to the flat form.

3) `KV.state` was dropped from src/state/schema.ts during the same
   conflict resolution, which broke the disk-size-manager's persistence
   key. Restored `state: "mem:state"` — used only by this manager.

4) tsdown was inlining onnxruntime-node and @xenova/transformers into
   dist/. That rewrites the relative require() inside
   `onnxruntime-node/dist/binding.js` so it can't find
   `../bin/napi-v3/darwin/arm64/onnxruntime_binding.node` from dist/.
   Even without AGENTMEMORY_IMAGE_EMBEDDINGS the module graph still
   evaluated the bundled binding.js on startup and blew up. Added both
   packages (plus onnxruntime-web and the two Anthropic SDKs) to
   `external:` in tsdown.config.ts. Bundle shrank from 6.1 MB to
   1.9 MB. The CLIP / local embedding providers lazy-load them from
   node_modules where relative paths work.

5) Bumped iii-sdk 0.11.0 → 0.11.2 to match the API currently shipped
   (Logger / durable:subscriber / durable:publisher / TriggerAction.void).

6) test/multimodal.test.ts used the old `{ id, description }` mock
   shape — rewrote the four registerFunction mocks to match the real
   `(id, cb)` signature. 812/812 pass.
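
For item 4, the externals change might look roughly like the sketch below. The package list is taken from this PR; the `entry` value is an assumed placeholder, and the real `tsdown.config.ts` likely carries more options than are shown here.

```typescript
// Sketch of the externals portion of tsdown.config.ts.
// Packages listed here are resolved from node_modules at runtime
// instead of being inlined into dist/.
const shared = {
  external: [
    "@xenova/transformers",
    "onnxruntime-node",
    "onnxruntime-web",
    "@anthropic-ai/claude-agent-sdk",
    "@anthropic-ai/sdk",
  ],
};

const config = [
  {
    ...shared,
    entry: ["src/cli.ts"], // assumed entry point, not confirmed by the PR
  },
];

export default config;
```

Keeping `onnxruntime-node` external means its `binding.js` still runs from its own package directory, so the relative path to the native `.node` binary resolves as the package authors intended.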

End-to-end smoke test with AGENTMEMORY_SLOTS=true AGENTMEMORY_REFLECT=true:
- livez → 200 ok
- GET /slots → all 8 defaults seeded into correct scopes (persona /
  user_preferences / tool_guidelines in global; rest in project)
- POST /slot (invalid sizeLimit / unknown scope) → 400 with specific
  error
- POST /slot/append overflow → 413 with currentSize + sizeLimit
- POST /slot/reflect on empty session → no-op
- GET /audit → every slot write + reflect emits an audit row
- POST /vision-search without flag → 503 disabled
@vercel

vercel Bot commented Apr 21, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
agentmemory Ready Ready Preview, Comment Apr 22, 2026 10:07am

@coderabbitai

coderabbitai Bot commented Apr 21, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2403b470-1dff-4df7-b796-03bcac6b6bf6

📥 Commits

Reviewing files that changed from the base of the PR and between 121322b and c718a03.

📒 Files selected for processing (2)
  • .github/workflows/ci.yml
  • .github/workflows/publish.yml
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/publish.yml

Walkthrough

Updated SDK dependency and build externals; removed unused context imports and switched to module-level logger; changed function registration to use string IDs; added KV state key and typed StateScope; test and CI workflow updates to match the API/installation changes.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Dependency & Build Configuration**<br>`package.json`, `tsdown.config.ts` | Bumped `iii-sdk` to `^0.11.2`. Added `shared.external` entries in `tsdown.config.ts` to keep several ML/SDK packages external (not bundled). |
| **Function files**<br>`src/functions/compress.ts`, `src/functions/disk-size-manager.ts`, `src/functions/image-quota-cleanup.ts` | Removed unused `getContext` imports; replaced `ctx.logger` calls with module-level `logger`; tightened typing for disk-size state key and KV get/set; switched `sdk.registerFunction` calls to use string IDs; adjusted some structured log shapes. |
| **State & Types**<br>`src/state/schema.ts`, `src/types.ts` | Added KV key `state: "mem:state"` and introduced `StateScope` with `"system:currentDiskSize": number` plus `StateScopeKey` for typed state usage. |
| **Tests**<br>`test/multimodal.test.ts` | Updated Vitest mock of `sdk.registerFunction` to accept `(id: string, cb)` and updated assertions to compare the registration ID string (e.g., `"mem::disk-size-delta"`). |
| **CI & Publish Workflows**<br>`.github/workflows/ci.yml`, `.github/workflows/publish.yml` | Changed install steps to a two-step lockfile generation + `npm ci` flow and added `--no-audit --no-fund` flags to the install commands. |

Sequence Diagram(s)

(omitted — changes are small, primarily refactors, typings, and config edits that do not introduce multi-component sequential flows)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Poem

🐰 I nibbled a stray import and tidied a line,
Moved logs to the burrow where module lights shine,
IDs whispered plainly, state keys set in rows,
Tests hopped in step, CI trimmed its throws,
A tiny rabbit patch — neat, quick, and fine.

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ❓ Inconclusive | The PR title 'fix(runtime): make v0.9.1 dist actually start' is vague and does not clearly communicate the specific issues being fixed or the main changes involved. | Consider a more descriptive title that specifies the core fixes, such as 'fix(runtime): update iii-sdk 0.11 API and restore state schema' or 'fix: resolve runtime import errors and bundle size issues in v0.9.1'. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/functions/image-quota-cleanup.ts (1)

56-68: ⚠️ Potential issue | 🟠 Major

Do not delete images when ref-count lookup fails.

Line 61 logs the failure, but refCount remains 0, so the cleanup continues and can delete a still-referenced image after a transient KV/read error.

🐛 Proposed fix

```diff
             try {
               refCount = await getImageRefCount(kv, f.filePath);
             } catch (err) {
               logger.error("Failed to read refCount", { filePath: f.filePath, error: err instanceof Error ? err.message : String(err) });
+              return;
             }
 
             if (refCount > 0) {
               return;
             }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/functions/image-quota-cleanup.ts` around lines 56 - 68, The code
currently treats a failed getImageRefCount call as refCount = 0 and proceeds to
deleteImage; change the logic in the withKeyedLock block so that when
getImageRefCount throws (caught in the catch), you do not continue with
deletion: for example, set a sentinel (e.g., refCount = null) or rethrow the
error and then return early if refCount is null/undefined; update the checks
around refCount (> 0) to also guard for the sentinel so deleteImage(f.filePath)
is only called when you successfully obtained a numeric refCount, referencing
the getImageRefCount, refCount variable, withKeyedLock, and deleteImage symbols.
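
The sentinel variant described in the prompt above could be sketched as below. This is an illustration only: `canDeleteImage` and its callbacks are hypothetical names, with `readRefCount` standing in for `getImageRefCount(kv, f.filePath)`.

```typescript
// Hypothetical helper: decides whether an image may be deleted.
async function canDeleteImage(
  readRefCount: () => Promise<number>,
  onError: (message: string) => void,
): Promise<boolean> {
  let refCount: number | null = null; // null means the lookup did not succeed
  try {
    refCount = await readRefCount();
  } catch (err) {
    onError(err instanceof Error ? err.message : String(err));
  }
  // Fail closed: only a successfully read count of zero permits deletion.
  return refCount !== null && refCount === 0;
}

const errors: string[] = [];
const outcomes = Promise.all([
  canDeleteImage(async () => 0, (m) => errors.push(m)), // unreferenced
  canDeleteImage(async () => 2, (m) => errors.push(m)), // still referenced
  canDeleteImage(
    async () => {
      throw new Error("kv unavailable"); // transient read failure
    },
    (m) => errors.push(m),
  ),
]);
```

Only the first case resolves to `true`; a failed lookup is treated the same as a referenced image, so a transient KV error cannot lead to a deletion.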
test/multimodal.test.ts (1)

4-11: ⚠️ Potential issue | 🟡 Minor

Remove the stale getContext mock from the SDK override.

Lines 8-10 mock getContext, but this export was removed in iii-sdk v0.11 and is not used anywhere in the codebase. The mock can mask future import regressions if getContext is accidentally reintroduced.

Proposed cleanup

```diff
 vi.mock("iii-sdk", async (importOriginal) => {
   const actual = await importOriginal<typeof import("iii-sdk")>();
   return {
     ...actual,
-    getContext: () => ({
-      logger: { info: vi.fn(), error: vi.fn(), warn: vi.fn() },
-    }),
   };
 });
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/multimodal.test.ts` around lines 4 - 11, Remove the stale getContext
mock inside the vi.mock override in the test (the block returning ...actual and
getContext); specifically, update the vi.mock import override that currently
returns getContext: () => ({ logger: { info: vi.fn(), error: vi.fn(), warn:
vi.fn() } }) to simply spread and return ...actual without defining getContext,
so the test no longer mocks the removed iii-sdk export and cannot mask future
import regressions.
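
Why a stale mock can hide this class of failure can be shown with plain objects. The sketch below involves no Vitest and no real SDK: `actualSdk`, `mockedSdk`, and `resolveNamedImport` are hypothetical stand-ins that only imitate how a named import resolves.

```typescript
type ModuleExports = Record<string, unknown>;

// Abridged, assumed view of the real package after getContext was dropped.
const actualSdk: ModuleExports = {
  registerFunction: (_id: string, _handler: () => void): void => {},
};

// What the tests see when the mock re-adds the removed export.
const mockedSdk: ModuleExports = {
  ...actualSdk,
  getContext: () => ({ logger: { info() {}, error() {}, warn() {} } }),
};

// Imitates an ESM named import: it resolves only if the export exists.
function resolveNamedImport(mod: ModuleExports, name: string): unknown {
  if (!(name in mod)) {
    throw new SyntaxError(`module does not provide an export named '${name}'`);
  }
  return mod[name];
}

let passesUnderMock = true;
let passesAgainstReal = true;
try {
  resolveNamedImport(mockedSdk, "getContext");
} catch {
  passesUnderMock = false;
}
try {
  resolveNamedImport(actualSdk, "getContext");
} catch {
  passesAgainstReal = false;
}
```

The lookup succeeds against the mocked module and fails against the real one, which mirrors how `npm test` passed while `node dist/cli.mjs` crashed.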
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/state/schema.ts`:
- Line 48: The project is missing a type interface for the KV scope `KV.state`,
causing callers like `disk-size-manager.ts` to use ad-hoc
`kv.get<number>(KV.state, ...)`; add a definition in `src/types.ts` that
declares the `state` scope mapping (e.g., include an interface or type entry for
keys such as "system:currentDiskSize": number) so that `KV.state` has a typed
shape; update the exported types so callers using `kv.get`/`kv.put` can
reference the new `state` interface instead of inline generics.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e6f54054-ee84-4582-96d8-09583f0bb83c

📥 Commits

Reviewing files that changed from the base of the PR and between 480d8e8 and a12767b.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (7)
  • package.json
  • src/functions/compress.ts
  • src/functions/disk-size-manager.ts
  • src/functions/image-quota-cleanup.ts
  • src/state/schema.ts
  • test/multimodal.test.ts
  • tsdown.config.ts

Comment thread src/state/schema.ts
Policy: never commit lock files. Downstream developers and CI
regenerate them at install time. Committing them creates churn on
every transitive-dep bump and masks real dependency changes in PRs.
@coderabbitai

coderabbitai Bot commented Apr 21, 2026

Only repository collaborators, contributors, or members can run CodeRabbit commands.

- CI (ci.yml + publish.yml): switch from `npm ci` to
  `npm install --legacy-peer-deps --no-audit --no-fund`. `npm ci`
  requires a committed lockfile; we gitignore lockfiles by policy
  (#184 follow-up b1bb2b2), so `npm install` is the correct path.
- types.ts: add `StateScope` interface documenting the KV.state
  scope shape (per CodeRabbit suggestion on schema.ts:48). Today it
  only holds `system:currentDiskSize: number`; future state-scope
  keys should register here to get compile-time key/value checks
  across the codebase.
- disk-size-manager.ts: use the new `StateScope` / `StateScopeKey`
  types for get/set so type safety flows through instead of the
  previous ad-hoc `kv.get<number>` annotation.
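
A rough sketch of how such a typed scope could work is shown below. `KV.state`, `StateScope`, `StateScopeKey`, and the `system:currentDiskSize` key come from this PR; `MemoryKv`, `getState`, and `setState` are hypothetical helpers added for illustration, and the real KV client may look different.

```typescript
// KV scope name restored by this PR.
const KV = { state: "mem:state" } as const;

// Shape of the state scope: each key maps to its value type.
interface StateScope {
  "system:currentDiskSize": number;
}
type StateScopeKey = keyof StateScope;

// Hypothetical in-memory store standing in for the real KV client.
class MemoryKv {
  private readonly data = new Map<string, unknown>();

  get<T>(scope: string, key: string): T | undefined {
    return this.data.get(`${scope}/${key}`) as T | undefined;
  }

  set<T>(scope: string, key: string, value: T): void {
    this.data.set(`${scope}/${key}`, value);
  }
}

// Typed accessors: the key selects the value type, so call sites
// no longer need an ad-hoc generic such as kv.get<number>(...).
function getState<K extends StateScopeKey>(
  kv: MemoryKv,
  key: K,
): StateScope[K] | undefined {
  return kv.get<StateScope[K]>(KV.state, key);
}

function setState<K extends StateScopeKey>(
  kv: MemoryKv,
  key: K,
  value: StateScope[K],
): void {
  kv.set(KV.state, key, value);
}

const kv = new MemoryKv();
setState(kv, "system:currentDiskSize", 4096);
const size = getState(kv, "system:currentDiskSize"); // typed as number | undefined

// Both of these would be rejected at compile time:
//   setState(kv, "system:currentDiskSize", "4096"); // string is not number
//   getState(kv, "system:unknown");                 // not a StateScopeKey
```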

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/types.ts (1)

853-857: Trim the WHAT-style JSDoc.

This comment mostly restates the adjacent StateScope type. Consider removing it or limiting it to non-obvious constraints that cannot be encoded in the type. As per coding guidelines, src/**/*.ts: “Avoid code comments explaining WHAT — use clear naming instead”.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/types.ts` around lines 853 - 857, Remove the redundant WHAT-style JSDoc
above the StateScope declaration: either delete the block entirely or replace it
with a short, specific note only for non-obvious constraints that the type
cannot encode (for example, "keys must be kept in sync with disk-size-manager
and other callers"). Ensure the comment no longer restates the type and only
documents constraints that need human attention while leaving the StateScope
type itself as the source of truth.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/publish.yml:
- Around line 28-30: The workflow currently runs a fresh npm install using "npm
install --legacy-peer-deps" which lets transitive deps vary between runs; change
the job to first generate a package-lock.json deterministically with "npm
install --package-lock-only --legacy-peer-deps" (or "npm shrinkwrap" if
preferred) and then use "npm ci --legacy-peer-deps" for the actual install
before running "npm run build" and "npm test", so the build+publish steps (and
the --provenance attestation) use the same resolved dependency graph and avoid
rebuilding dependencies unpinned between steps.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f293c430-4cc6-4afb-b807-c0a943de56a3

📥 Commits

Reviewing files that changed from the base of the PR and between a12767b and 121322b.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (4)
  • .github/workflows/ci.yml
  • .github/workflows/publish.yml
  • src/functions/disk-size-manager.ts
  • src/types.ts

Comment thread .github/workflows/publish.yml Outdated
…blish

Follow-up to the npm-install switch in 121322b. `npm install` resolves
transitive deps fresh each step, so the version graph under which we
`build` + `test` can differ from what the publish step attests with
`--provenance`. Split install into two steps:

  1. `npm install --package-lock-only ...` — generates a lockfile in
     the runner workspace (not committed, lockfiles stay gitignored).
  2. `npm ci --legacy-peer-deps ...` — installs deterministically from
     that lockfile.

Now every step in a single job run uses the same resolved dependency
graph. Still no cross-run pinning (that is by design — we do not
commit lockfiles), only intra-job consistency.

Addresses CodeRabbit feedback on publish.yml. Applies the same pattern
to ci.yml for consistency.
rohitg00 added a commit that referenced this pull request Apr 22, 2026
Findings verified against current code on this branch; all four valid.

1. config.ts loadFallbackConfig (L281) — user could set
   FALLBACK_PROVIDERS=agent-sdk and bypass the AGENTMEMORY_ALLOW_AGENT_SDK
   gate added to detectProvider. Filter it out at the fallback layer too,
   with the same warning pointing at the opt-in flag.

2. summarize.ts (L87-92) — the empty_provider_response branch returned
   without recording failure metrics or a diagnostic log, unlike the
   parse/validation paths. Record the same metricsStore failure event and
   log provider name, prompt size, system size, and observation count so
   empty responses are visible in telemetry.

3. providers/agent-sdk.ts (L14-45) — setting
   process.env.AGENTMEMORY_SDK_CHILD = '1' without restoring it caused
   every subsequent .query() in the same parent process to hit the
   short-circuit guard and return '' (classified as a SDK child it is
   not). Capture prev, set in try, restore in finally (delete if prev
   was undefined). Child processes spawned during the for-await loop
   still inherit the marker because env is inherited at spawn time; we
   only restore after the loop completes.

4. plugin/scripts/sdk-guard-DI1NUOS9.mjs — tsdown extracted the shared
   guard helper into a hashed chunk. Hash rotates on every rebuild and
   churns the diff. Stopped using the shared module from hooks entirely
   and inlined the 6-line guard function into each hook .ts file
   instead. sdk-guard.ts stays in the tree because the unit tests cover
   it directly. Deleted the tracked hashed .mjs and confirmed no new
   chunk is emitted.
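
The capture/set/restore pattern from item 3 could be sketched as follows. This is a minimal illustration, not the code in `providers/agent-sdk.ts`: `withEnvMarker` and the `Env` type are hypothetical, and the env object is passed in explicitly so the sketch does not depend on Node typings (the real provider would operate on `process.env`).

```typescript
// Minimal env shape; in the real provider this would be process.env.
type Env = Record<string, string | undefined>;

async function withEnvMarker<T>(
  env: Env,
  key: string,
  value: string,
  work: () => Promise<T>,
): Promise<T> {
  const prev = env[key]; // capture before mutating
  env[key] = value;
  try {
    // Anything spawned while `work` runs sees the marker.
    return await work();
  } finally {
    // Restore only after the work has completed.
    if (prev === undefined) {
      delete env[key];
    } else {
      env[key] = prev;
    }
  }
}

const env: Env = {};
const seenDuringWork: Array<string | undefined> = [];
const done = withEnvMarker(env, "AGENTMEMORY_SDK_CHILD", "1", async () => {
  seenDuringWork.push(env["AGENTMEMORY_SDK_CHILD"]);
  return "response";
});
```

The `delete` branch matters with `process.env`, because assigning `undefined` there stores the string `"undefined"` rather than removing the variable.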

Also applied the CI two-step install (npm install --package-lock-only
then npm ci) on this branch, matching #184. Without it, npm ci fails
because lockfiles are gitignored.

Tests: 74 files / 819 tests pass.
rohitg00 added a commit that referenced this pull request Apr 22, 2026
Pulls in #184 runtime fix, #188 viewer pipeline, lockfile gitignore,
CI two-step install, StateScope types, firstPrompt typeof guard,
content-addressed lesson/crystal IDs, image-quota fail-closed, and
onnxruntime optionalDependencies. Resolves workflow conflicts flagged
by CodeRabbit — ci.yml / publish.yml on main are already the correct
two-step install pattern.

# Conflicts:
#	.github/workflows/publish.yml
@rohitg00
Owner Author

Closing — content already landed on main:

git log origin/fix/runtime-iii-sdk-0.11 ^origin/main shows only the two ci-adjustment commits that are already semantically present. No net changes left. Closing to keep the PR list clean.

@rohitg00 rohitg00 closed this Apr 22, 2026