feat: add Tavily as configurable internet search backend#1357
Merged
CaralHsi merged 126 commits intoMemTensor:dev-20260323-v2.0.11from Mar 27, 2026
Merged
Conversation
added 30 commits
March 8, 2026 12:01
# Conflicts: # apps/memos-local-openclaw/tests/integration.test.ts
…licitly - createMemorySearchTool now accepts optional store and ctx parameters instead of accessing private members via (engine as any) - Add missing worker.flush() call in root index.ts stop handler - Fix shutdown-lifecycle test mocks for new module dependencies Made-with: Cursor
Prevents memory exhaustion from oversized request bodies. Returns 413 status code when the limit is exceeded. Made-with: Cursor
Centralized auth + rate limit check for all authenticated endpoints. - 60 req/min default, 30 req/min for search endpoints - Returns 429 with retryAfterMs when exceeded - Refactored handle() to authenticate once at the top Made-with: Cursor
Add getHubGroupById, deleteHubGroup, and listHubGroupMembers to support the existing Hub server group management API endpoints. Made-with: Cursor
- Viewer server: add proxy endpoints for group CRUD, member management, and user listing (all forwarded to Hub API) - Hub server: add GET /admin/users endpoint for listing active users - Viewer UI: wire up "Manage Groups" button with group/member panel (already had the JS functions, now connected via server routes) Made-with: Cursor
- Add local_shared_tasks table to track which tasks have been shared - task_share/task_unshare now mark/unmark tasks in local tracking table - After each agent_end, automatically sync new chunks for shared tasks to the Hub without requiring manual re-share Made-with: Cursor
- Add upsertHubEmbedding/getHubEmbedding/getAllHubEmbeddings to SqliteStore - Hub server accepts optional embedder; embeds shared chunks async on receive - Hub search now merges FTS and vector results via RRF (k=60) when embedder is available, falling back to FTS-only otherwise - Pass embedder instance to HubServer from root index.ts Made-with: Cursor
- config.ts: inject provider:"openclaw" when hostEmbedding/hostCompletion capabilities are enabled but no explicit provider is configured, so the fallback priority chain (user config → openclaw → local/rule) works correctly - openclaw-api.ts: rename class to OpenClawAPIClient and add explicit `implements OpenClawAPI` to avoid confusion with the interface in types.ts - ingest/providers/index.ts: rewrite all 5 openclaw summarizer methods with high-quality prompts aligned with openai.ts (language preservation, structured JSON output for filter/dedup), reuse parseFilterResult/parseDedupResult - viewer/server.ts: pass openclawAPI to Summarizer in migration code path - openai.ts: export parseFilterResult for reuse Made-with: Cursor
- memory_search/skill_search: read agentId from context (3rd arg) instead of params, matching the OpenClaw Plugin SDK calling convention - memory_timeline: add owner filtering so agents cannot access other agents' private chunks via timeline traversal - memory_get: add owner filtering so agents cannot read other agents' private chunks directly - memory_search: include ref and summary in details.hits for downstream tools (timeline/get) that need the full ChunkRef - service.stop(): reorder to flush worker before telemetry shutdown, ensuring data persistence completes before auxiliary services stop Made-with: Cursor
The plugin runs as ESM (type: module) but telemetry.ts relied on __dirname which is undefined in ESM. Credentials file existed on disk but was never found, silently disabling all telemetry since day one. Now accepts pluginDir from index.ts (resolved via import.meta.url) and uses it as primary search path for telemetry.credentials.json. Made-with: Cursor
…from repo - Delete the duplicated memos-memory-guide file under the faux ~/.openclaw runtime path - Ignore ~/.openclaw-style generated content inside the plugin directory to avoid future accidental commits Made-with: Cursor
… isolation, and viewer improvements (#1321) ## Description This PR brings the `native_memos` branch into `main` with a major upgrade to the `memos-local-openclaw` plugin, focused on team sharing, dual-instance isolation, viewer usability, recall quality, and documentation completeness. It solves several practical issues that appeared during real-world local OpenClaw usage, especially when running multiple instances on the same machine: - team sharing state transitions were fragile across join/leave/role-switch flows - Hub/Viewer ports and session state could conflict in dual-instance setups - admin and member notifications were incomplete or unclear - Viewer polling and sharing UX had refresh/state consistency issues - documentation and landing pages did not fully explain the Hub/Client architecture, multi-instance deployment, and collaboration workflow Implementation approach: - added and hardened the Hub/Client sharing architecture, including Hub auth, client connector flow, server-side status management, admin operations, and notification handling - improved dual-instance isolation by refining runtime/config behavior around gateway-derived ports, sharing state cleanup, and local-vs-hub behavior boundaries - enhanced recall quality with origin tracking, Hub result filtering, and local/hybrid search improvements - improved Viewer UX and stability across sharing setup, admin workflows, polling, notifications, and collaboration-related UI - added and updated automated test coverage for hub server, viewer sharing, connector logic, storage, integration, and skill runtime flows - expanded README, Team Sharing guide, docs site, troubleshooting docs, and landing pages to better document collaboration workflows and multi-OpenClaw capabilities Relevant dependencies / requirements: - no new mandatory external product dependency for end users beyond existing model/provider configuration - local runtime still depends on `better-sqlite3` native bindings being built for the active Node.js version - optional telemetry and model-provider related configuration continue to follow the existing plugin setup Related Issue (Required): Fixes #1320 ## Type of change - [x] Bug fix (non-breaking change which fixes an issue) - [x] New feature (non-breaking change which adds functionality) - [x] Breaking change (fix or feature that would cause existing functionality to not work as expected) - [x] Refactor (does not change functionality, e.g. code style improvements, linting) - [x] Documentation update ## How Has This Been Tested? The change was validated through a combination of targeted automated coverage added in this branch and manual end-to-end verification of multi-instance local OpenClaw scenarios. Test areas covered in this branch include: - Hub server and authentication behavior - viewer sharing flows and config handling - client connector and sharing state transitions - storage and integration flows - skill runtime and sharing-related scenarios Manual reproduction / verification steps: 1. Start two local OpenClaw instances with isolated `OPENCLAW_CONFIG_PATH`, `OPENCLAW_STATE_DIR`, and workspace directories. 2. Install and enable `memos-local-openclaw` in both instances. 3. Configure one instance as Hub and another as Client. 4. Verify gateway, viewer, and Hub startup behavior, including port isolation and no cross-instance conflict. 5. Submit a join request from the Client, approve it from the Hub, and verify status updates and notifications. 6. Test admin operations including promote, demote, remove member, and self-removal prevention. 7. Test client leave / disable / role-switch behavior and verify state cleanup and reconnect behavior. 8. Verify Viewer live updates do not cause unnecessary full refreshes or scroll-position loss. 9. Open the updated docs / landing pages and confirm installation instructions, Team Sharing content, and collaboration visuals are displayed correctly. Suggested test environment: - OS: macOS - Runtime: local OpenClaw dual-instance setup - Plugin: `memos-local-openclaw` - Database: SQLite / `better-sqlite3` - [ ] Unit Test - [x] Test Script Or Test Steps (please provide) - [ ] Pipeline Automated API Test (please provide) ## Checklist - [x] I have performed a self-review of my own code | 我已自行检查了自己的代码 - [x] I have commented my code in hard-to-understand areas | 我已在难以理解的地方对代码进行了注释 - [x] I have added tests that prove my fix is effective or that my feature works | 我已添加测试以证明我的修复有效或功能正常 - [x] I have created related documentation issue/PR in [MemOS-Docs](https://github.com/MemTensor/MemOS-Docs) (if applicable) | 我已在 [MemOS-Docs](https://github.com/MemTensor/MemOS-Docs) 中创建了相关的文档 issue/PR(如果适用) - [x] I have linked the issue to this PR (if applicable) | 我已将 issue 链接到此 PR(如果适用) - [x] I have mentioned the person who will review this PR | 我已提及将审查此 PR 的人 ## Reviewer Checklist - [x] closes #1320 - [x] Made sure Checks passed - [x] Tests have been provided
## Description add install or update scripts for memos openclaw local plugin Related Issue (Required): Fixes #issue_number ## Type of change Please delete options that are not relevant. - [ ] Bug fix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected) - [ ] Refactor (does not change functionality, e.g. code style improvements, linting) - [ ] Documentation update ## How Has This Been Tested? Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration - [ ] Unit Test - [ ] Test Script Or Test Steps (please provide) - [ ] Pipeline Automated API Test (please provide) ## Checklist - [ ] I have performed a self-review of my own code | 我已自行检查了自己的代码 - [ ] I have commented my code in hard-to-understand areas | 我已在难以理解的地方对代码进行了注释 - [ ] I have added tests that prove my fix is effective or that my feature works | 我已添加测试以证明我的修复有效或功能正常 - [ ] I have created related documentation issue/PR in [MemOS-Docs](https://github.com/MemTensor/MemOS-Docs) (if applicable) | 我已在 [MemOS-Docs](https://github.com/MemTensor/MemOS-Docs) 中创建了相关的文档 issue/PR(如果适用) - [ ] I have linked the issue to this PR (if applicable) | 我已将 issue 链接到此 PR(如果适用) - [ ] I have mentioned the person who will review this PR | 我已提及将审查此 PR 的人 ## Reviewer Checklist - [ ] closes #xxxx (Replace xxxx with the GitHub issue number) - [ ] Made sure Checks passed - [ ] Tests have been provided
## Description Please include a summary of the change, the problem it solves, the implementation approach, and relevant context. List any dependencies required for this change. Related Issue (Required): Fixes #issue_number ## Type of change Please delete options that are not relevant. - [ ] Bug fix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected) - [ ] Refactor (does not change functionality, e.g. code style improvements, linting) - [ ] Documentation update ## How Has This Been Tested? Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration - [ ] Unit Test - [ ] Test Script Or Test Steps (please provide) - [ ] Pipeline Automated API Test (please provide) ## Checklist - [ ] I have performed a self-review of my own code | 我已自行检查了自己的代码 - [ ] I have commented my code in hard-to-understand areas | 我已在难以理解的地方对代码进行了注释 - [ ] I have added tests that prove my fix is effective or that my feature works | 我已添加测试以证明我的修复有效或功能正常 - [ ] I have created related documentation issue/PR in [MemOS-Docs](https://github.com/MemTensor/MemOS-Docs) (if applicable) | 我已在 [MemOS-Docs](https://github.com/MemTensor/MemOS-Docs) 中创建了相关的文档 issue/PR(如果适用) - [ ] I have linked the issue to this PR (if applicable) | 我已将 issue 链接到此 PR(如果适用) - [ ] I have mentioned the person who will review this PR | 我已提及将审查此 PR 的人 ## Reviewer Checklist - [ ] closes #xxxx (Replace xxxx with the GitHub issue number) - [ ] Made sure Checks passed - [ ] Tests have been provided
…ld (#1312) ## Problem `before_agent_start` is a legacy hook that the OpenClaw framework invokes **twice** per agent run: 1. **Model resolve phase** ([`run.ts` L348](https://github.com/openclaw/openclaw/blob/main/src/agents/pi-embedded-runner/run.ts#L348)): passes only `{ prompt }` — no `messages` field 2. **Prompt build phase** ([`attempt.ts` L1438](https://github.com/openclaw/openclaw/blob/main/src/agents/pi-embedded-runner/run/attempt.ts#L1438)): passes `{ prompt, messages }` — messages available This causes the auto-recall handler to fire twice: the first invocation has no `messages` so topic-aware pre-filtering (if any) is bypassed, and a redundant embedding search + LLM filter call runs on every turn. ## Solution Use `before_prompt_build` instead ([docs](https://docs.openclaw.ai/tools/plugin#plugin-hooks)), which: - Fires once per turn, after session load - Always provides `{ prompt, messages }` — reliable access to conversation history - Is the framework-recommended hook for prompt injection (per OpenClaw docs: *"legacy compatibility hook... prefer the explicit hooks above"*) - Returns the same result fields: `prependContext`, `systemPrompt`, `prependSystemContext`, `appendSystemContext` ## Changes - `before_agent_start` → `before_prompt_build` (single line change in `index.ts` L931) - No other logic changes; return contract is identical ## References - OpenClaw agent loop: [`concepts/agent-loop.md`](https://github.com/openclaw/openclaw/blob/main/docs/concepts/agent-loop.md) - OpenClaw plugin docs: [`tools/plugin.md`](https://github.com/openclaw/openclaw/blob/main/docs/tools/plugin.md) - Related issue: [openclaw/openclaw#26914](openclaw/openclaw#26914) (ephemeralContext request — same root cause of prependContext accumulation)
…ion (#1310) ## Problem Zhipu AI models (glm-4.7, glm-5) consume `max_tokens` budget with internal reasoning tokens, leaving **no room for actual output**. This causes all summarizer LLM functions to return empty strings: - `filterRelevant` (auto-recall relevance filter) → returns empty → "no relevant hits" - `judgeNewTopic` (topic boundary detection) → returns empty → misclassifies topics - `judgeDedup` (memory deduplication) → returns empty → duplicate memories accumulate - `summarize` / `summarizeTask` → returns empty → poor memory quality ## Root Cause Zhipu's reasoning models use `reasoning_tokens` that count against `max_tokens`: | Model | max_tokens | reasoning_tokens | Actual output | |-------|-----------|-----------------|---------------| | glm-5 | 200 | 187 | Empty (truncated) | | glm-4.7 | 200 | 199 | Empty (truncated) | | glm-4.7 | 10 | 10 | Empty (truncated) | ## Solution Inject `{"thinking": {"type": "disabled"}}` for requests to zhipu endpoints (`bigmodel.cn` / `zhipuai`). This disables the built-in reasoning mode for all summarizer LLM calls. **Non-zhipu providers are completely unaffected** — the helper only activates for zhipu endpoints. ## Test Results (glm-4.7 + thinking disabled) | Function | Input | Expected | Got | Status | |----------|-------|----------|-----|--------| | judgeNewTopic | SAME (related topic) | SAME | SAME | ✅ | | judgeNewTopic | NEW (unrelated topic) | NEW | NEW | ✅ | | judgeDedup | Different topics | NEW | NEW (correct) | ✅ | | filterRelevant | 3 candidates | [2] | [2] | ✅ | ## Tested Models Summary | Model | Without fix | With fix (thinking disabled) | |-------|------------|------------------------------| | glm-5 | ❌ All empty |⚠️ No truncation, but poor instruction following | | glm-4.7 | ❌ All empty | ✅ All functions correct | | qwen-turbo | ✅ (no reasoning) | N/A (not affected) | | OpenAI/other | ✅ (no reasoning) | ✅ (not affected) | ## Note The `skillEvolution` LLM calls use a separate code path (`shared/llm-call.ts`) and are **not affected** by this change. If needed, the same helper can be applied there independently.
…from memory (#1298) (#1302) ## Summary Filter out system-injected prompts and sentinel replies that were leaking into long-term memory: - **Sentinel replies** (`NO_REPLY`, `HEARTBEAT_OK`, `HEARTBEAT_CHECK`) are now skipped in `captureMessages()` for all roles - **Boot-check prompts** (e.g. "You are running a boot check", "## Memory system — ACTION REQUIRED") are filtered before storage - **`stripMemoryInjection()`** expanded with additional patterns for boot-check text and standalone sentinel values in user messages ## 修复说明 过滤掉泄露到长期记忆中的系统注入提示和哨兵回复: - 哨兵回复(`NO_REPLY`、`HEARTBEAT_OK`、`HEARTBEAT_CHECK`)在 `captureMessages()` 中对所有角色跳过 - 启动检查提示(如"You are running a boot check"、"## Memory system — ACTION REQUIRED")在存储前被过滤 - `stripMemoryInjection()` 增加了启动检查文本和用户消息中独立哨兵值的额外匹配模式 Closes #1298
## Description The import/migration flow could report overall success (`ok: true`) even when non-fatal sub-steps failed on imported items (summarization/dedup/embedding), because these failures were only logged or silently ignored and never propagated to import state. This PR keeps import behavior compatible (still stores raw memory when possible), but correctly reports step-level failures in migration state and final done payload: - Track per-item step failures (`summarization`, `dedup`, `embedding`) in `item` events. - Aggregate step failures in migration state (`stepFailures`) and compute `success` from both hard errors and step failures. - Final `done`/`stopped` SSE payload now includes full state and `ok: false` when any step fails. - `migrate/status` and replayed `migrate/stream` done events now surface this status consistently. - Added structured logs for successful full-step import vs partial-step failures. Related Issue (Required): Fixes #1303 ## Type of change - [x] Bug fix (non-breaking change which fixes an issue) ## How Has This Been Tested? - [x] Unit Test - [ ] Test Script Or Test Steps (please provide) - [ ] Pipeline Automated API Test (please provide) Added `apps/memos-local-openclaw/tests/migration-status.test.ts` to verify: - import state is marked unsuccessful when step failures exist even without fatal errors; - clean path remains successful; - explicit item errors still produce failure. Executed: - `npm test -- tests/migration-status.test.ts` (from `apps/memos-local-openclaw`) ## Checklist - [x] I have performed a self-review of my own code - [x] I have commented my code in hard-to-understand areas - [x] I have added tests that prove my fix is effective or that my feature works - [ ] I have created related documentation issue/PR in [MemOS-Docs](https://github.com/MemTensor/MemOS-Docs) (if applicable) - [x] I have linked the issue to this PR (if applicable) - [ ] I have mentioned the person who will review this PR
### ## Description Please include a summary of the change, the problem it solves, the implementation approach, and relevant context. List any dependencies required for this change. Related Issue (Required): Fixes #issue_number ## Type of change Please delete options that are not relevant. - [ ] Bug fix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected) - [ ] Refactor (does not change functionality, e.g. code style improvements, linting) - [ ] Documentation update ## How Has This Been Tested? Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration - [ ] Unit Test - [ ] Test Script Or Test Steps (please provide) - [ ] Pipeline Automated API Test (please provide) ## Checklist - [ ] I have performed a self-review of my own code | 我已自行检查了自己的代码 - [ ] I have commented my code in hard-to-understand areas | 我已在难以理解的地方对代码进行了注释 - [ ] I have added tests that prove my fix is effective or that my feature works | 我已添加测试以证明我的修复有效或功能正常 - [ ] I have created related documentation issue/PR in [MemOS-Docs](https://github.com/MemTensor/MemOS-Docs) (if applicable) | 我已在 [MemOS-Docs](https://github.com/MemTensor/MemOS-Docs) 中创建了相关的文档 issue/PR(如果适用) - [ ] I have linked the issue to this PR (if applicable) | 我已将 issue 链接到此 PR(如果适用) - [ ] I have mentioned the person who will review this PR | 我已提及将审查此 PR 的人 ## Reviewer Checklist - [ ] closes #xxxx (Replace xxxx with the GitHub issue number) - [ ] Made sure Checks passed - [ ] Tests have been provided
Collaborator
|
Thanks for the contribution! Verified end-to-end — config parsing, factory wiring, and live search all work as expected. Clean integration that stays consistent with the existing backend pattern. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
INTERNET_SEARCH_BACKENDenv var (defaults tobocha)Files Changed
src/memos/configs/internet_retriever.py— AddedTavilySearchConfigdataclass and registered'tavily'inInternetRetrieverConfigFactory.backend_to_classsrc/memos/memories/textual/tree_text_memory/retrieve/tavilysearch.py— New file:InternetTavilyRetrieverimplementation usingtavily-pythonSDK, mirroringbochasearch.pypatternssrc/memos/memories/textual/tree_text_memory/retrieve/internet_retriever_factory.py— Registered'tavily'inbackend_to_classand added instantiation logic infrom_config()src/memos/api/config.py— Updatedget_internet_config()to support'tavily'backend viaINTERNET_SEARCH_BACKENDenv varpyproject.toml— Addedtavily-pythonas optional dependency under[tavily]extra and in[all]docker/requirements.txt— Addedtavily-python==0.5.0docker/requirements-full.txt— Addedtavily-python==0.5.0docker/.env.example— DocumentedINTERNET_SEARCH_BACKENDandTAVILY_API_KEYapps/memos-local-openclaw/.env.example— DocumentedTAVILY_API_KEYexamples/basic_modules/textual_memory_internet_search_example.py— Added Tavily example blockDependency Changes
tavily-python (>=0.5.0,<1.0.0)topyproject.tomloptional extras ([tavily]and[all])tavily-python==0.5.0todocker/requirements.txttavily-python==0.5.0todocker/requirements-full.txtEnvironment Variable Changes
INTERNET_SEARCH_BACKEND(default:bocha) to select the search backendTAVILY_API_KEY— required whenINTERNET_SEARCH_BACKEND=tavilyTAVILY_SEARCH_DEPTH(optional, default:basic)TAVILY_INCLUDE_ANSWER(optional, default:false)Notes for Reviewers
require_python_packagedecorator ensurestavily-pythonandjiebaare installed at runtimeAutomated Review