fix: use in-memory HNSW with auto-rebuild & add i18n.js web server route #49

Merged
tickernelz merged 3 commits into tickernelz:main from kui123456789:fix/hnsw-inmemory-i18n-route
Mar 2, 2026

Conversation

@kui123456789
Contributor

Summary

Follow-up fixes to PR #48. Resolves two issues discovered during runtime testing:

  1. hnswlib-wasm does not support Node.js/Bun filesystem — HNSW indexes must be kept in-memory
  2. Web UI i18n broken — missing static file route for i18n.js caused the entire web UI to fail to load

Changes

1. Fix: Use in-memory HNSW with auto-rebuild from SQLite (CRITICAL)

Problem: After PR #48 fixed the hnswlib-wasm API calls, runtime testing revealed that hnswlib-wasm is compiled with -sENVIRONMENT=web (Emscripten), making it browser-only. In Node.js/Bun:

  • Direct usage throws: "not compiled for this environment"
  • The Emscripten FS module is not exported, so the real filesystem cannot be bridged to Emscripten's virtual filesystem
  • writeIndex()/readIndex() cannot persist to disk

Fix (src/services/sqlite/hnsw-index.ts):

  • Add globalThis.window = globalThis polyfill before loading hnswlib-wasm to bypass environment check
  • Switch to purely in-memory HNSW indexes — no file persistence for .hnsw binary data
  • Keep .meta files (id mapping) on real filesystem as before
  • Use empty string for autoSaveFilename constructor arg to avoid unnecessary IDBFS sync noise
  • Pass correct API arguments: addPoint(vector, label, false), searchKnn(vector, k, null)
  • Add isPopulated() method to check if index has data
  • Add Math.min(k, count) guard to prevent searching empty indexes
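
The polyfill and the two guards listed above can be illustrated with a minimal sketch. It does not call hnswlib-wasm: `TinyIndex` and `cosine` are hypothetical stand-ins that use brute-force search, and only the `window` assignment, the `isPopulated()` name, and the `Math.min(k, count)` clamp are taken from this PR.

```typescript
// Polyfill from this PR: make `window` resolve in Node.js/Bun so that a
// library built for the web environment passes its environment check.
// It has to run before the library is loaded.
if (typeof (globalThis as any).window === "undefined") {
  (globalThis as any).window = globalThis;
}

// Cosine similarity between two vectors of equal length.
function cosine(a: Float32Array, b: Float32Array): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Hypothetical stand-in for the in-memory index (brute force, NOT the
// hnswlib-wasm API). It exists only to show the two guards.
class TinyIndex {
  private vectors = new Map<number, Float32Array>();

  addPoint(vector: Float32Array, label: number): void {
    this.vectors.set(label, vector);
  }

  // True once at least one vector has been added.
  isPopulated(): boolean {
    return this.vectors.size > 0;
  }

  searchKnn(query: Float32Array, k: number): { label: number; score: number }[] {
    // Never request more neighbours than the index holds.
    const limit = Math.min(k, this.vectors.size);
    if (limit === 0) return [];
    const scored = Array.from(this.vectors.entries()).map(([label, v]) => ({
      label,
      score: cosine(query, v),
    }));
    scored.sort((a, b) => b.score - a.score);
    return scored.slice(0, limit);
  }
}
```

With this shape, a search against an empty index returns an empty result instead of reaching the underlying library, and a caller can use `isPopulated()` to decide whether a rebuild is needed first.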

Fix (src/services/sqlite/vector-search.ts):

  • Add auto-rebuild logic: when searchInShard() detects an empty HNSW index, it automatically rebuilds from SQLite vectors via rebuildFromShard() before searching
  • This ensures indexes are transparently reconstructed after process restart

Architecture: All vectors are already stored in SQLite. HNSW indexes are now ephemeral acceleration structures rebuilt on-demand from the authoritative SQLite store.
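
The rebuild-on-empty flow might look roughly like the sketch below. The function names `searchInShard` and `rebuildFromShard` come from this PR, but their signatures here are assumptions, `EphemeralIndex` and `VectorRow` are hypothetical, and a plain array stands in for the SQLite store.

```typescript
// Hypothetical row shape; in the PR the vectors live in SQLite.
interface VectorRow {
  id: string;
  vector: number[];
}

// Stand-in for the ephemeral in-memory index (not hnswlib-wasm).
class EphemeralIndex {
  private rows: VectorRow[] = [];

  isPopulated(): boolean {
    return this.rows.length > 0;
  }

  insert(row: VectorRow): void {
    this.rows.push(row);
  }

  // Brute-force nearest neighbours by squared Euclidean distance.
  search(query: number[], k: number): string[] {
    const dist = (v: number[]): number =>
      v.reduce((sum, x, i) => sum + (x - query[i]) * (x - query[i]), 0);
    return this.rows
      .slice()
      .sort((a, b) => dist(a.vector) - dist(b.vector))
      .slice(0, Math.min(k, this.rows.length))
      .map((row) => row.id);
  }
}

let rebuilds = 0; // instrumentation for this sketch only

// Reload every stored vector into the in-memory index.
function rebuildFromShard(index: EphemeralIndex, store: VectorRow[]): void {
  rebuilds++;
  for (const row of store) index.insert(row);
}

// Search, rebuilding first when the index is empty (e.g. after a restart).
function searchInShard(
  index: EphemeralIndex,
  store: VectorRow[],
  query: number[],
  k: number,
): string[] {
  if (!index.isPopulated() && store.length > 0) {
    rebuildFromShard(index, store);
  }
  return index.search(query, k);
}
```

Only the first search after a restart pays the rebuild cost; later searches find the index populated and skip it.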

2. Fix: Reduce IDBFS stderr noise

loadHnswlib() always attempts to mount IDBFS (browser IndexedDB), which fails in Node.js/Bun and prints warnings to stderr. Using a non-empty autoSaveFilename causes additional IDBFS sync attempts on every addPoint()/markDelete(). Fix: use empty string as autoSaveFilename.

3. Fix: Add i18n.js static file route to web server

Problem: PR #48 added Chinese localization via src/web/i18n.js, but web-server.ts had no route to serve this file. Browser requests to /i18n.js returned 404, so the t() function was undefined and app.js initialization crashed. The web UI showed "Initializing..." with 0 memories even though the API was working correctly.

Fix (src/services/web-server.ts): Add static file route for /i18n.js alongside existing routes for app.js, styles.css, etc.
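
One way to picture the routing is a lookup table from request path to file name and content type. This is a sketch, not the code in web-server.ts: `STATIC_ROUTES` and `resolveStaticRoute` are hypothetical names, and the content type for styles.css is an assumption. Only the /i18n.js entry mirrors the arguments used in this PR.

```typescript
// Hypothetical route entry: which file to serve and with what content type.
interface StaticRoute {
  file: string;
  contentType: string;
}

// Hypothetical table of static routes. The /i18n.js entry matches the
// arguments this PR passes to serveStaticFile; the rest are illustrative.
const STATIC_ROUTES = new Map<string, StaticRoute>([
  ["/app.js", { file: "app.js", contentType: "application/javascript" }],
  ["/i18n.js", { file: "i18n.js", contentType: "application/javascript" }],
  ["/styles.css", { file: "styles.css", contentType: "text/css" }],
]);

// Returns the route for a request path, or undefined when the caller
// should fall through to a 404.
function resolveStaticRoute(path: string): StaticRoute | undefined {
  return STATIC_ROUTES.get(path);
}
```

A table like this could be imported by more than one server entry point, so that a newly added file needs to be registered in one place only.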

Verification

  • bun run typecheck — zero errors
  • bun test — 50 pass, 0 fail
  • Runtime tested: memory add success, memory search returns results with correct similarity scores
  • Web UI loads correctly with i18n support after fix

Files Changed

| File | Change |
| --- | --- |
| src/services/sqlite/hnsw-index.ts | In-memory HNSW, window polyfill, correct API args, empty autoSaveFilename |
| src/services/sqlite/vector-search.ts | Auto-rebuild HNSW from SQLite on first search |
| src/services/web-server.ts | Add /i18n.js static file route |

hnswlib-wasm requires browser environment (Emscripten -sENVIRONMENT=web)
and does not export FS for real filesystem bridging. This commit:

- Adds globalThis.window monkey-patch for Node.js/Bun compatibility
- Uses correct hnswlib-wasm API: initIndex(maxElements, m, efConstruction, seed),
  addPoint(vector, label, replaceDeleted), searchKnn(vector, k, filter)
- Keeps HNSW indexes purely in-memory (no file persistence)
- Auto-rebuilds indexes from SQLite vectors on first search after restart
- Adds isPopulated() method to HNSWIndex for rebuild detection
- Removes unused readFileSync import and getVirtualFilename helper

hnswlib-wasm's autoSave feature triggers IDBFS sync on every addPoint
and markDelete call. Since we use in-memory-only indexes (no file
persistence), pass empty string to disable auto-save and reduce
stderr noise from missing indexedDB in Node.js/Bun environment.
Copilot AI review requested due to automatic review settings March 1, 2026 17:26
Copilot AI left a comment

Pull request overview

This follow-up to PR #48 fixes two runtime-discovered issues: (1) hnswlib-wasm's Emscripten browser-only compilation preventing HNSW from working in Node.js/Bun, solved by switching to purely in-memory indexes rebuilt on-demand from SQLite; and (2) a missing /i18n.js static route in the web server causing the web UI to fail to load.

Changes:

  • hnsw-index.ts: Adds a globalThis.window polyfill to load the browser-only WASM library in Node.js/Bun, switches HNSW to in-memory-only with correct API arguments, persists only .meta id-mapping files, and adds isPopulated().
  • vector-search.ts: Adds auto-rebuild of the in-memory HNSW index from SQLite when a search is triggered on an empty index (i.e., after process restart).
  • web-server.ts: Adds a static file route for /i18n.js.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/services/sqlite/hnsw-index.ts | In-memory HNSW with window polyfill, correct API usage, meta-only persistence |
| src/services/sqlite/vector-search.ts | Auto-rebuild HNSW index from SQLite on first search after restart |
| src/services/web-server.ts | Add /i18n.js static route to fix web UI loading |

Comments suppressed due to low confidence (1)

src/services/sqlite/hnsw-index.ts:152

  • The .meta file (containing nextId, idMap, reverseMap) is still written to disk on every insert(), insertBatch(), and delete() call via save(), but is never read back anywhere in the codebase — readFileSync has been removed entirely. On process restart, the HNSW index and its id mappings are fully rebuilt from SQLite via rebuildFromShard(), which re-assigns fresh internal IDs. Writing the meta file is now dead I/O on every mutation. The save() method and all related filesystem code (writeFileSync, the dir existence check, the metaPath construction) should be removed to avoid unnecessary disk writes on every insert/delete.
  async save(): Promise<void> {
    if (!this.index) return;

    const dir = dirname(this.indexPath);
    if (!existsSync(dir)) {
      mkdirSync(dir, { recursive: true });
    }

    // Only persist id mapping (.meta file). HNSW index data lives in-memory
    // and is rebuilt from SQLite vectors on process restart.
    const metaPath = this.indexPath + ".meta";
    const meta = {
      nextId: this.nextId,
      idMap: Object.fromEntries(this.idMap),
      reverseMap: Object.fromEntries(this.reverseMap),
    };
    writeFileSync(metaPath, JSON.stringify(meta));
  }

Comment on lines +206 to +208
    if (path === "/i18n.js") {
      return this.serveStaticFile("i18n.js", "application/javascript");
    }
Copilot AI Mar 1, 2026

The /i18n.js route is added to web-server.ts, but an identical static file routing block exists in web-server-worker.ts (the Bun worker variant of the server). The worker file handles /, /styles.css, /app.js, and /favicon.ico but has no case for /i18n.js (confirmed by searching the file). If the server runs in worker mode (via web-server-worker.ts), requests to /i18n.js will still return 404, leaving the web UI broken just as before this fix.

@tickernelz tickernelz merged commit 38aa97e into tickernelz:main Mar 2, 2026
4 of 5 checks passed