feat(memory_tree): switch embed model to bge-m3 (1024-dim, 8K context)#1174

Merged
senamakel merged 1 commit into tinyhumansai:main from sanil-23:feat/bge-m3-embedder on May 4, 2026

Conversation

Contributor

@sanil-23 sanil-23 commented May 4, 2026

Summary

Migrates the Phase 4 embedder from nomic-embed-text (768-dim, 2048-token context) to bge-m3 (1024-dim, native 8192-token context).

Driver: nomic's hard 2048-token context cap was failing on long chunks. The chunker's char-based heuristic (DEFAULT_CHUNK_MAX_TOKENS = 3000) undercounts BERT-WordPiece tokens by ~1.5-2× for HTML-derived markdown — a 1500-chunker-token chunk routinely produces 2500+ real tokens, exceeding nomic's cap. Even with per-request num_ctx overrides, Ollama clamps to the model's GGUF-baked context_length, so the only fix is a model swap. bge-m3 has native 8192 context and is widely deployed for retrieval reranking.
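To make the undercount concrete, here is a back-of-the-envelope sketch. The ~4-chars-per-token heuristic and the 1.5-2× WordPiece inflation factor are taken from the description above as illustrative assumptions, not from the actual chunker code:

```rust
fn main() {
    // Chunker-side char heuristic: roughly 4 characters per "token" (assumed).
    let chunk_chars = 6000;
    let heuristic_tokens = chunk_chars / 4; // 1500 "chunker tokens", well under 3000

    // BERT-WordPiece on HTML-derived markdown inflates this by ~1.5-2x.
    let real_tokens_low = heuristic_tokens * 3 / 2; // ~2250
    let real_tokens_high = heuristic_tokens * 2;    // ~3000

    let nomic_cap = 2048;
    let bge_m3_cap = 8192;

    // A chunk the heuristic deems safe already blows past nomic's cap...
    assert!(real_tokens_low > nomic_cap);
    // ...but fits comfortably inside bge-m3's native window.
    assert!(real_tokens_high < bge_m3_cap);
    println!("heuristic {heuristic_tokens}, real {real_tokens_low}..{real_tokens_high}");
}
```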

⚠ Breaking (on-disk format)

mem_tree_chunks.embedding and mem_tree_summaries.embedding blobs from the 768-dim era are invalid against the new 1024-dim layout. unpack_embedding's post-call dim check will reject them on read. Operators upgrading need to either:

UPDATE mem_tree_chunks SET embedding = NULL;
UPDATE mem_tree_summaries SET embedding = NULL;

(workers re-embed on next access) — or wipe chunks.db entirely and re-sync.
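The rejection mechanism can be sketched like this. This is a hypothetical reconstruction of the dim check described above; the real `unpack_embedding` lives in `score/embed/mod.rs` and may differ in names and error type:

```rust
// Illustrative sketch only: blob layout (little-endian f32s) is assumed.
const EMBEDDING_DIM: usize = 1024;

fn unpack_embedding(blob: &[u8]) -> Result<Vec<f32>, String> {
    if blob.len() % 4 != 0 {
        return Err(format!("blob length {} is not a multiple of 4", blob.len()));
    }
    let floats: Vec<f32> = blob
        .chunks_exact(4)
        .map(|b| f32::from_le_bytes([b[0], b[1], b[2], b[3]]))
        .collect();
    if floats.len() != EMBEDDING_DIM {
        // A 768-dim blob from the nomic era lands here and is rejected on read.
        return Err(format!("expected {EMBEDDING_DIM} dims, got {}", floats.len()));
    }
    Ok(floats)
}

fn main() {
    let old_blob = vec![0u8; 768 * 4]; // legacy nomic-era row
    assert!(unpack_embedding(&old_blob).is_err());
    let new_blob = vec![0u8; 1024 * 4]; // freshly re-embedded row
    assert!(unpack_embedding(&new_blob).is_ok());
}
```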

The legacy fallback path in retrieval that drops Option::None embedding rows to the bottom of a semantic rerank handles the transitional state cleanly: stale rows still surface (via keyword/recency signals) until they're re-embedded.
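The fallback ordering amounts to a sort in which `None` embeddings always rank below scored rows. The struct and function names below are illustrative, not the actual retrieval code; note that `sort_by` is stable, so rows without embeddings keep their prior keyword/recency order relative to each other:

```rust
// Hypothetical sketch of the transitional ordering described above.
struct Row {
    id: u32,
    semantic: Option<f32>, // None until the row is re-embedded
}

fn rank(rows: &mut [Row]) {
    rows.sort_by(|a, b| match (a.semantic, b.semantic) {
        // Higher semantic score first.
        (Some(x), Some(y)) => y.partial_cmp(&x).unwrap(),
        // Embedded rows always beat stale (None) rows.
        (Some(_), None) => std::cmp::Ordering::Less,
        (None, Some(_)) => std::cmp::Ordering::Greater,
        // Stable sort preserves keyword/recency order among stale rows.
        (None, None) => std::cmp::Ordering::Equal,
    });
}

fn main() {
    let mut rows = vec![
        Row { id: 1, semantic: None },       // stale, awaiting re-embed
        Row { id: 2, semantic: Some(0.9) },
        Row { id: 3, semantic: Some(0.4) },
    ];
    rank(&mut rows);
    let order: Vec<u32> = rows.iter().map(|r| r.id).collect();
    assert_eq!(order, vec![2, 3, 1]); // stale row surfaces last, not dropped
}
```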

Changes

File Change
score/embed/mod.rs EMBEDDING_DIM 768 → 1024; module + struct docstrings updated; test fixture updated
score/embed/ollama.rs DEFAULT_MODEL nomic-embed-text → bge-m3; add EMBED_NUM_CTX = 8192 const; add num_ctx field to EmbedOptions request struct so Ollama doesn't clamp to its 4096 default; test assertions updated
score/embed/factory.rs Test fixture switched to "bge-m3"
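The ollama.rs change can be sketched as follows. Struct and field names are assumed from the PR text (the real code presumably derives its serialization with serde); the JSON here is hand-rolled purely for illustration:

```rust
// Hypothetical sketch of the request payload change; see score/embed/ollama.rs
// for the actual structs.
const DEFAULT_MODEL: &str = "bge-m3";
const EMBED_NUM_CTX: u32 = 8192;

struct EmbedOptions {
    num_ctx: u32,
}

struct EmbedRequest<'a> {
    model: &'a str,
    input: &'a str,
    options: EmbedOptions,
}

impl EmbedRequest<'_> {
    fn to_json(&self) -> String {
        format!(
            r#"{{"model":"{}","input":"{}","options":{{"num_ctx":{}}}}}"#,
            self.model, self.input, self.options.num_ctx
        )
    }
}

fn main() {
    let req = EmbedRequest {
        model: DEFAULT_MODEL,
        input: "hello",
        options: EmbedOptions { num_ctx: EMBED_NUM_CTX },
    };
    let json = req.to_json();
    // Without the options override, Ollama would fall back to num_ctx = 4096.
    assert!(json.contains(r#""num_ctx":8192"#));
    assert!(json.contains(r#""model":"bge-m3""#));
}
```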

Test plan

  • 26 unit tests pass in memory::tree::score::embed::*
  • cargo check --bin openhuman-core clean

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • Chores
    • Updated embedding model configuration and increased embedding dimensionality to improve semantic search accuracy.

@sanil-23 sanil-23 requested a review from a team May 4, 2026 06:40
Contributor

coderabbitai Bot commented May 4, 2026

No actionable comments were generated in the recent review. 🎉


📥 Commits

Reviewing files that changed from the base of the PR and between 644c5c8 and 984bbf7.

📒 Files selected for processing (3)
  • src/openhuman/memory/tree/score/embed/factory.rs
  • src/openhuman/memory/tree/score/embed/mod.rs
  • src/openhuman/memory/tree/score/embed/ollama.rs

📝 Walkthrough

The PR updates the default embedding model from nomic-embed-text to bge-m3 across three modules, increasing the embedding dimensionality constant from 768 to 1024, adding a num_ctx override option to Ollama requests, and updating corresponding tests and documentation.

Changes

Embedding Model Migration (nomic-embed-text → bge-m3)

Layer / File(s) Summary
Data Shape
src/openhuman/memory/tree/score/embed/mod.rs, src/openhuman/memory/tree/score/embed/ollama.rs
EMBEDDING_DIM constant updated from 768 to 1024. New EMBED_NUM_CTX constant (8192) added to Ollama module.
Core Implementation
src/openhuman/memory/tree/score/embed/ollama.rs
DEFAULT_MODEL changed to "bge-m3". EmbedRequest now includes options field with EmbedOptions struct containing num_ctx override. Request construction updated to pass num_ctx: EMBED_NUM_CTX to Ollama.
Documentation
src/openhuman/memory/tree/score/embed/mod.rs, src/openhuman/memory/tree/score/embed/ollama.rs
Module-level docs revised to reference bge-m3 model, native 8192 token context, and 1024 output dimensions. Migration notes added stating old 768-dimension embedding blobs must be wiped and re-embedded.
Integration & Tests
src/openhuman/memory/tree/score/embed/factory.rs, src/openhuman/memory/tree/score/embed/mod.rs, src/openhuman/memory/tree/score/embed/ollama.rs
Factory test ollama_chosen_when_endpoint_and_model_set updated to use bge-m3. Unit tests unpack_wrong_dim_errors, happy_path_returns_embedding, and dim_mismatch_rejected updated to assert 1024 dimensions and bge-m3 model with num_ctx == 8192 in request payload.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A rabbit hops through embeddings bright,
From nomic texts to bge-m3's light,
Eight thousand tokens, one-oh-two-four dims tall,
Context flows deeper, we've answered the call! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage — ⚠️ Warning: docstring coverage is 66.67%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.
✅ Passed checks (4 passed)
  • Description Check — ✅ Passed: check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check — ✅ Passed: the title directly describes the main change (nomic-embed-text → bge-m3) including the key technical details (1024-dim, 8K context).
  • Linked Issues Check — ✅ Passed: check skipped; no linked issues were found for this pull request.
  • Out of Scope Changes Check — ✅ Passed: check skipped; no linked issues were found for this pull request.


@senamakel senamakel merged commit b8113e9 into tinyhumansai:main May 4, 2026
19 checks passed
jwalin-shah added a commit to jwalin-shah/openhuman that referenced this pull request May 5, 2026