Add non-batch memory embedding concurrency control#57733
itilys wants to merge 1 commit into openclaw:main
Conversation
Greptile Summary

This PR adds a new `memorySearch.remote.nonBatchConcurrency` setting to control indexing concurrency on the non-batch embedding path, with an Ollama-specific conservative default of 1.

Notable behavior change: the Ollama fallback now defaults to concurrency 1 instead of the previous `EMBEDDING_INDEX_CONCURRENCY` (4).

Confidence Score: 5/5

Safe to merge — no logic errors; the only concern is a silent performance regression for Ollama users on capable machines, which is a deliberate conservative tradeoff and is fully overridable. All findings are P2. The implementation is correct, well-validated (finite-number check + floor + max(1, …)), consistently propagated through the type and config system, and covered by a new test. The Ollama hardcoded default is a deliberate design choice documented in the PR description; the concern about powerful-machine users is a quality suggestion, not a bug.

File to review: extensions/memory-core/src/memory/manager-embedding-ops.ts — review the Ollama fallback and consider adding a diagnostic log line so users on high-resource machines can discover the new knob.
| Filename | Overview |
|---|---|
| extensions/memory-core/src/memory/manager-embedding-ops.ts | getIndexConcurrency updated to respect nonBatchConcurrency config and add an Ollama-specific conservative default of 1; logic is correct but the Ollama fallback silently reduces concurrency for all Ollama users. |
| src/agents/memory-search.ts | nonBatchConcurrency added to ResolvedMemorySearchConfig type and correctly propagated in mergeConfig using the same override-then-default ?? pattern as other remote fields. |
| src/config/types.tools.ts | nonBatchConcurrency?: number added to the remote block of MemorySearchConfig with a clear JSDoc comment. |
| src/agents/memory-search.test.ts | New test verifies that nonBatchConcurrency from defaults survives an agent-level override of other remote fields; expectMergedRemoteConfig helper extended cleanly with an optional extras parameter. |
| src/config/schema.base.generated.ts | New schema entry for nonBatchConcurrency added with label, help text, and performance tag, consistent with adjacent entries. |
| src/config/schema.help.ts | Help text for the new key added in alphabetical position between headers and batch.enabled. |
| src/config/schema.labels.ts | Label for the new key added consistently with surrounding entries. |
| src/config/schema.help.quality.test.ts | New key added to TARGET_KEYS coverage list to ensure it is covered by the schema help-quality tests. |
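The override-then-default `??` merge described for `src/agents/memory-search.ts` can be sketched as follows. This is a hypothetical standalone reconstruction (the names `RemoteFields` and `mergeRemote` are illustrative, not the PR's actual identifiers):

```typescript
// Illustrative shape of the remote config fields involved in the merge.
interface RemoteFields {
  baseUrl?: string;
  nonBatchConcurrency?: number;
}

// Agent-level override wins per field; otherwise fall back to the shared default.
function mergeRemote(defaults: RemoteFields, override: RemoteFields): RemoteFields {
  return {
    baseUrl: override.baseUrl ?? defaults.baseUrl,
    nonBatchConcurrency: override.nonBatchConcurrency ?? defaults.nonBatchConcurrency,
  };
}

// As in the new test: a default-level nonBatchConcurrency survives an
// agent-level override of other remote fields.
const merged = mergeRemote(
  { baseUrl: "http://127.0.0.1:11434", nonBatchConcurrency: 1 },
  { baseUrl: "http://other-host:11434" },
);
```

Because `??` only falls back on `null`/`undefined`, an explicit override of one field never clobbers a sibling field's default.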
Reviews (1): Last reviewed commit: "add non-batch memory embedding concurren..."
```ts
if (this.provider?.id === "ollama") {
  return 1;
}
```
Ollama default is a silent performance regression for high-resource setups
The hardcoded return 1 for Ollama changes the previous default from EMBEDDING_INDEX_CONCURRENCY (4) to 1 for all Ollama users, not only those on constrained hardware. Anyone running Ollama on a powerful machine (a local workstation with plenty of resources, for example) will see indexing throughput drop to 25% of its previous level with no warning.
Since nonBatchConcurrency is already checked first, the conservative default is effectively the right call for fragile setups and is overridable — but it would be worth a log-level note (debug or info) when this path is taken, so users can discover the setting if they notice slow indexing. Alternatively, a comment in the code documenting the tradeoff would help future maintainers understand why only Ollama gets this treatment.
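The resolution order under discussion, plus the suggested diagnostic log, can be sketched as a standalone function. This is a minimal sketch assuming the validation described in the review summary (finite-number check + floor + max(1, …)); the function and constant names mirror the discussion but are not copied from the PR:

```typescript
// Previous global default mentioned in the review.
const EMBEDDING_INDEX_CONCURRENCY = 4;

interface RemoteConfig {
  nonBatchConcurrency?: number;
}

function getIndexConcurrency(remote: RemoteConfig, providerId?: string): number {
  const configured = remote.nonBatchConcurrency;
  // 1. Explicit config wins, after finite-number + floor + max(1, …) validation.
  if (typeof configured === "number" && Number.isFinite(configured)) {
    return Math.max(1, Math.floor(configured));
  }
  // 2. Conservative default for a slow local provider, overridable via config.
  if (providerId === "ollama") {
    // Per the review suggestion: log so the knob is discoverable.
    console.debug("memory-index: defaulting Ollama non-batch concurrency to 1");
    return 1;
  }
  // 3. Everything else keeps the previous default.
  return EMBEDDING_INDEX_CONCURRENCY;
}
```

With this shape, a high-resource Ollama user who notices slow indexing sees the debug line and can raise `nonBatchConcurrency` explicitly.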
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 82b23db749
```ts
apiKey?: SecretInput;
headers?: Record<string, string>;
/** Max concurrent non-batch embedding tasks during indexing. Useful for slower local providers such as Ollama. */
nonBatchConcurrency?: number;
```
Add nonBatchConcurrency to runtime config schema
This commit introduces `memorySearch.remote.nonBatchConcurrency` in the config types and merge logic, but `MemorySearchSchema.remote` in `src/config/zod-schema.agent-runtime.ts` is still strict and does not include the field, so `OpenClawSchema.safeParse` will reject any config that sets the new option as an unknown key. That makes the documented feature unusable and can fail config validation at startup for users who follow the new setting. The runtime schema needs a matching `nonBatchConcurrency` number field (with the intended constraints) for both defaults and agent overrides.
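The failure mode can be illustrated without zod (the real schema lives in `src/config/zod-schema.agent-runtime.ts`; the names and key set here are illustrative): a strict parser rejects any key it does not know, so the fix is to add the field, with the intended integer ≥ 1 constraint, to the known shape.

```typescript
// Illustrative strict-object check; the actual code uses a strict zod schema.
// Including "nonBatchConcurrency" here models the post-fix state.
const KNOWN_REMOTE_KEYS = new Set(["baseUrl", "apiKey", "headers", "nonBatchConcurrency"]);

function parseRemote(input: Record<string, unknown>): { ok: boolean; error?: string } {
  for (const key of Object.keys(input)) {
    // Strict parsing: any unknown key is a hard validation failure.
    if (!KNOWN_REMOTE_KEYS.has(key)) {
      return { ok: false, error: `unknown key: ${key}` };
    }
  }
  const c = input.nonBatchConcurrency;
  // Intended constraint for the new field: an integer >= 1 when present.
  if (c !== undefined && (typeof c !== "number" || !Number.isInteger(c) || c < 1)) {
    return { ok: false, error: "nonBatchConcurrency must be an integer >= 1" };
  }
  return { ok: true };
}
```

Before the fix, `nonBatchConcurrency` itself would be the unknown key and every config setting it would fail validation at startup.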
Thanks @itilys. I carried this forward on main in 802f13a. What landed:
Verification on the landed commit path:
Closing this PR as superseded-by-main while preserving your credit in the changelog. Thanks again.
Summary
This PR adds `agents.defaults.memorySearch.remote.nonBatchConcurrency` to control indexing concurrency when memory embeddings run through the non-batch path.

This is especially useful for slower or local providers such as Ollama, where aggressive parallel indexing can cause timeout-heavy reindex failures on small machines like Raspberry Pi or low-resource VPS hosts.
What changed

- Added `memorySearch.remote.nonBatchConcurrency` config support
- Defaulted `ollama` to use concurrency `1`

Why
Batch controls currently help only providers that expose batch embedding APIs.
Providers that fall back to non-batch indexing, especially Ollama, can still hit timeout failures because indexing concurrency remains too high for constrained hardware.
This change makes non-batch indexing tunable and improves local-first reliability.
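What a non-batch concurrency limit does can be sketched as a small bounded worker pool: at most `limit` embedding tasks in flight, and `limit = 1` fully serializes them. This is a hypothetical helper for illustration, not the extension's actual indexing loop:

```typescript
// Run fn over items with at most `limit` tasks in flight, preserving order.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  // Each worker repeatedly claims the next unprocessed index until none remain.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i]);
    }
  }
  const workerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

With `nonBatchConcurrency: 1` the pool degenerates to one worker, so a constrained host only ever serves a single embedding request at a time.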
Real-world motivation

In a Raspberry Pi-class environment running local Ollama, setting the concurrency to `1` made indexing complete successfully.

Example config
```json
{
  "agents": {
    "defaults": {
      "memorySearch": {
        "provider": "ollama",
        "remote": {
          "baseUrl": "http://127.0.0.1:11434",
          "apiKey": "ollama",
          "nonBatchConcurrency": 1
        }
      }
    }
  }
}
```