feat(orchestrator): wire memory-tree retrieval tools with periodic prefetch #1027
Conversation
Six tool wrappers expose the existing Phase 4 memory-tree retrieval RPCs to the orchestrator agent so chat can answer questions about ingested email/chat/document memory:

- memory_tree_search_entities — resolve names to canonical entity ids
- memory_tree_query_topic — entity-scoped retrieval across every tree
- memory_tree_query_source — filter by source kind + time window
- memory_tree_query_global — cross-source daily digest
- memory_tree_drill_down — expand a summary node's children
- memory_tree_fetch_leaves — batch-hydrate raw chunks for citations

TreeContextLoader pre-loads the 7-day digest into the orchestrator's turn context on session start AND every REFRESH_INTERVAL (30 min) thereafter, so long-running conversations stay current with new ingest. Each injection rides on the user message (NOT the system prompt) to keep the KV-cache prefix stable. A pure should_prefetch helper keeps the cadence decision deterministically testable.

The orchestrator system prompt grows two sections: tool-selection guidance (when to call which retrieval tool) and a citation-marker convention ([^N] + footnote with node_id and source_ref).

Tests:
- tests/agent_retrieval_e2e.rs (2 tests, including a real ingest-pipeline round trip on alice/phoenix data) covers the full deserialise -> typed retrieval -> serialise wrapper pipeline.
- TreeContextLoader unit tests (5: empty-workspace + 4 covering the should_prefetch state machine).
- Static check asserting agent.toml exposes all six tool names.

End-to-end live verification on the developer's populated .openhuman/workspace: a single agent_chat turn dispatched all six tools (21 total dispatches: 5 query_source, 4 search_entities, 4 drill_down, 3 query_topic, 3 query_global, 2 fetch_leaves) and produced a correct sender-frequency answer ranked by mention count.

The about_app capability catalog gains intelligence.memory_tree_retrieval (Beta, LOCAL_RAW privacy).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
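The cadence rule the commit describes can be sketched as a pure function. The exact signature below is an assumption based on the commit text (`should_prefetch(last, now, interval)`), not the repo's actual code:

```rust
use std::time::{Duration, Instant};

// Assumed refresh cadence taken from the commit text (30 min).
const REFRESH_INTERVAL: Duration = Duration::from_secs(30 * 60);

/// Sketch of the pure cadence decision: prefetch on the first turn
/// (no previous timestamp) or once the interval has elapsed.
fn should_prefetch(last: Option<Instant>, now: Instant, interval: Duration) -> bool {
    match last {
        None => true,
        Some(t) => now.duration_since(t) >= interval,
    }
}

fn main() {
    let t0 = Instant::now();
    // First turn: no prior prefetch, so fetch.
    println!("{}", should_prefetch(None, t0, REFRESH_INTERVAL));
    // One minute later: within the interval, so skip.
    println!("{}", should_prefetch(Some(t0), t0 + Duration::from_secs(60), REFRESH_INTERVAL));
}
```

Keeping `now` a parameter (rather than calling `Instant::now()` inside) is what makes the state machine deterministically testable, as the commit notes.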
📝 Walkthrough

Adds a memory-tree retrieval feature: six new LLM-facing memory-tree tools and exports, orchestrator prompt/config updates to use them, eager tree-digest prefetching during agent turns with session timing, a TreeContextLoader implementation, a capability catalog entry, and an end-to-end retrieval test.

Changes
Sequence Diagram

```mermaid
sequenceDiagram
    participant Agent as Agent::turn
    participant MemLoader as MemoryLoader
    participant TreeLoader as TreeContextLoader
    participant ConfigRPC as ConfigRPC (load_config_with_timeout)
    participant Retrieval as Retrieval Backend
    participant LLM as LLM Context
    Agent->>MemLoader: load_context()
    MemLoader-->>Agent: memory_context
    Agent->>TreeLoader: should_prefetch(last_tree_prefetch_at, now)?
    TreeLoader-->>Agent: boolean
    alt prefetch due
        Agent->>ConfigRPC: load_config_with_timeout()
        ConfigRPC-->>Agent: Config
        Agent->>TreeLoader: load(config)
        TreeLoader->>Retrieval: query_global(window_days=7)
        Retrieval-->>TreeLoader: hits
        TreeLoader-->>Agent: tree_ctx (formatted digest)
        Agent->>Agent: set last_tree_prefetch_at(now)
        alt tree_ctx non-empty
            Agent->>LLM: inject tree_ctx into context
        end
    else prefetch skipped
        Agent->>Agent: use existing context
    end
    Agent-->>LLM: proceed with turn using enriched context
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ Passed checks (5 passed)
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/openhuman/agent/agents/orchestrator/prompt.md (1)
127-132: ⚠️ Potential issue | 🟠 Major

Restrict verbatim quotes to raw leaf chunks, not generic retrieval hits.

Current wording allows quoting from any hit `content`. Topic/source/global hits can be summaries, so this can present paraphrases as verbatim quotes.

✏️ Proposed prompt fix

```diff
-Inline marker `[^N]` and a numbered footnote at the end carrying the node_id and source_ref from the RetrievalHit. Do not invent quotes — only quote text that appears verbatim in a hit's `content` field.
+Inline marker `[^N]` and a numbered footnote at the end carrying the node_id and source_ref from the RetrievalHit. Do not invent quotes. Only use verbatim quotes from `memory_tree_fetch_leaves` chunk `content` (raw leaf text), and cite that chunk.
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/agent/agents/orchestrator/prompt.md` around lines 127 - 132, The prompt currently allows verbatim quotes from any RetrievalHit `content`; update the wording around the inline marker `[^N]` so that verbatim quotes are only allowed when the hit is a raw leaf chunk (e.g., check hit.type == "leaf" or hit.is_leaf / retrievalHit.kind == "leaf_chunk" in your system) and forbid quoting from topic/source/global summary hits; explicitly state that for non-leaf hits you must paraphrase or cite but not use verbatim quotation, and ensure the example and the inline-footnote format `[^N]: node_id · source_ref` remain unchanged.
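To make the fixed footnote shape concrete, here is a minimal illustrative formatter for the `[^N]: node_id · source_ref` convention the comment keeps unchanged. The function name is hypothetical, not the repo's API:

```rust
/// Hypothetical helper illustrating the `[^N]: node_id · source_ref`
/// footnote convention referenced in the review comment above.
fn citation_footnote(n: usize, node_id: &str, source_ref: &str) -> String {
    format!("[^{n}]: {node_id} · {source_ref}")
}

fn main() {
    // A hit cited inline as [^1] gets this footnote at the end of the answer.
    println!("{}", citation_footnote(1, "node-42", "email:msg-7"));
}
```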
🧹 Nitpick comments (1)
src/openhuman/agent/tree_loader.rs (1)
74-83: Align log prefixes with the repo's standard tokens.

These logs currently use `[memory_tree]`; the guideline requires stable grep-friendly prefixes like `[domain]`, `[rpc]`, or `[ui-flow]`. Consider switching to one of those (and keep correlation fields in the message body as needed).

As per coding guidelines — `src/openhuman/**/*.rs`: "Log ... with stable grep-friendly prefixes (`[domain]`, `[rpc]`, `[ui-flow]`) and correlation fields."

Also applies to: 88-89, 107-110
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/agent/tree_loader.rs` around lines 74 - 83, Replace the non-standard log prefix "[memory_tree]" with a repo-approved, grep-friendly prefix (e.g. "[domain]") in the tree loader's logging calls; update the log statements inside the async load path (the match on query_global in tree_loader.load) and the other occurrences noted (the warnings and debug/info around the query_global error handling and subsequent blocks) to use the chosen prefix while preserving the existing correlation fields (window_days, error details, etc.) in the message body so grepability and context remain intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/openhuman/tools/impl/memory/tree/drill_down.rs`:
- Around line 33-35: The schema for the "max_depth" parameter currently allows 0
but the tool semantics require at least 1; update the JSON schema for
"max_depth" (the properties named "max_depth" in this file) to use "minimum": 1
instead of 0, and add a defensive check in the DrillDownTool's execute method to
reject or error when max_depth <= 0 (or specifically == 0) before proceeding;
reference the "max_depth" schema entries and the execute function in this file
to locate both the schema change and the runtime guard to enforce the new bound.
In `@src/openhuman/tools/impl/memory/tree/fetch_leaves.rs`:
- Around line 29-30: The doc says chunk_ids are capped at 20 but the handler
forwards any length; modify the request-handling code in fetch_leaves.rs to
enforce that cap at the tool boundary by checking the chunk_ids vector (the
chunk_ids variable) before forwarding to the downstream call (the
hydrate/hydrate_leaves or similar client method). If chunk_ids.len() > 20,
return a clear error/BadRequest result (or trim to 20 if policy prefers) instead
of forwarding; ensure the check is placed in the public handler function (e.g.,
the function that receives the request and calls the client) so the documented
contract is always honored.
In `@src/openhuman/tools/impl/memory/tree/search_entities.rs`:
- Around line 47-48: The runtime default for the "limit" query param doesn't
match the documented default of 5; find where the code reads the limit (e.g. the
variable/field named limit, likely via params.limit.unwrap_or(0) inside the
search_entities handler/function) and change it to use unwrap_or(5) and clamp it
to 100 (e.g. take min(100) after defaulting) so the effective default and the
schema/description ("Max matches (default 5, clamped to 100).") align.
---
Outside diff comments:
In `@src/openhuman/agent/agents/orchestrator/prompt.md`:
- Around line 127-132: The prompt currently allows verbatim quotes from any
RetrievalHit `content`; update the wording around the inline marker `[^N]` so
that verbatim quotes are only allowed when the hit is a raw leaf chunk (e.g.,
check hit.type == "leaf" or hit.is_leaf / retrievalHit.kind == "leaf_chunk" in
your system) and forbid quoting from topic/source/global summary hits;
explicitly state that for non-leaf hits you must paraphrase or cite but not use
verbatim quotation, and ensure the example and the inline-footnote format `[^N]:
node_id · source_ref` remain unchanged.
---
Nitpick comments:
In `@src/openhuman/agent/tree_loader.rs`:
- Around line 74-83: Replace the non-standard log prefix "[memory_tree]" with a
repo-approved, grep-friendly prefix (e.g. "[domain]") in the tree loader's
logging calls; update the log statements inside the async load path (the match
on query_global in tree_loader.load) and the other occurrences noted (the
warnings and debug/info around the query_global error handling and subsequent
blocks) to use the chosen prefix while preserving the existing correlation
fields (window_days, error details, etc.) in the message body so grepability and
context remain intact.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 10419be4-7222-4a50-9897-99c87bc3a34f
📒 Files selected for processing (18)
- src/openhuman/about_app/catalog.rs
- src/openhuman/agent/agents/orchestrator/agent.toml
- src/openhuman/agent/agents/orchestrator/prompt.md
- src/openhuman/agent/harness/session/builder.rs
- src/openhuman/agent/harness/session/turn.rs
- src/openhuman/agent/harness/session/types.rs
- src/openhuman/agent/mod.rs
- src/openhuman/agent/tree_loader.rs
- src/openhuman/tools/impl/memory/mod.rs
- src/openhuman/tools/impl/memory/tree/drill_down.rs
- src/openhuman/tools/impl/memory/tree/fetch_leaves.rs
- src/openhuman/tools/impl/memory/tree/mod.rs
- src/openhuman/tools/impl/memory/tree/query_global.rs
- src/openhuman/tools/impl/memory/tree/query_source.rs
- src/openhuman/tools/impl/memory/tree/query_topic.rs
- src/openhuman/tools/impl/memory/tree/search_entities.rs
- src/openhuman/tools/ops.rs
- tests/agent_retrieval_e2e.rs
- drill_down: schema `max_depth.minimum` 0 -> 1 to match the description
("one step or more"); reject `Some(0)` at runtime defensively.
- fetch_leaves: truncate `chunk_ids` to first 20 at the tool boundary
to honour the documented "max 20 per call" cap, matching the silent-
truncate behaviour the schema describes.
- search_entities: default `limit` was `unwrap_or(0)` which doesn't
match the documented "default 5, clamped to 100" — fix to
`unwrap_or(5).min(100)`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
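The three boundary fixes in this commit can be sketched together as pure guards. Function names and the `None => 1` default for `max_depth` are illustrative assumptions, not the repo's actual signatures:

```rust
/// drill_down: max_depth must be at least 1; reject Some(0) defensively.
fn validate_max_depth(max_depth: Option<u32>) -> Result<u32, String> {
    match max_depth {
        Some(0) => Err("max_depth must be >= 1".to_string()),
        Some(d) => Ok(d),
        None => Ok(1), // assumed default of a single step
    }
}

/// fetch_leaves: silently truncate to the documented 20-id cap.
fn cap_chunk_ids(chunk_ids: &[String]) -> &[String] {
    let take = chunk_ids.len().min(20);
    &chunk_ids[..take]
}

/// search_entities: default 5, clamped to 100, as the schema documents.
fn effective_limit(limit: Option<u32>) -> u32 {
    limit.unwrap_or(5).min(100)
}

fn main() {
    let ids: Vec<String> = (0..25).map(|i| i.to_string()).collect();
    println!("{}", cap_chunk_ids(&ids).len()); // truncated to the cap
    println!("{}", effective_limit(None)); // documented default
}
```

Pinning each contract down at the tool boundary means the downstream retrieval layer never sees out-of-range arguments, regardless of what the model emits.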
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/openhuman/tools/impl/memory/tree/fetch_leaves.rs (1)
29-39: Encode the 20-id cap directly in JSON schema.

The description says "capped at 20," but the schema doesn't enforce it. Adding `maxItems` helps tool-call validation and reduces oversized calls from the model.

Suggested schema tweak

```diff
 "chunk_ids": {
   "type": "array",
   "items": {"type": "string"},
+  "maxItems": 20,
   "description": "Chunk ids to hydrate. Capped at 20 per call."
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/tools/impl/memory/tree/fetch_leaves.rs` around lines 29 - 39, The JSON schema for the request body (the json!({...}) literal that contains the "chunk_ids" property) doesn't enforce the 20-item cap described; update the "chunk_ids" property to include "maxItems": 20 (you can also add "minItems": 1 if desired) so the schema enforces the limit and prevents oversized calls from the model.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/openhuman/tools/impl/memory/tree/fetch_leaves.rs`:
- Around line 43-60: Replace the ad-hoc log prefixes `[tool][memory_tree]` with
the repo-standard prefix family (e.g., `[rpc][memory_tree]`) and include stable
correlation fields in each log call: when the function is invoked log the
requested_ids count (`req.chunk_ids.len()`), when truncating log `requested_ids`
and `truncated_to` (`take`), and when returning log `hits` (`hits.len()`);
update the three log::debug! calls in this file (the invocation, truncation
branch, and returning hits) to emit these grep-friendly fields so downstream
tooling can correlate requests.
---
Nitpick comments:
In `@src/openhuman/tools/impl/memory/tree/fetch_leaves.rs`:
- Around line 29-39: The JSON schema for the request body (the json!({...})
literal that contains the "chunk_ids" property) doesn't enforce the 20-item cap
described; update the "chunk_ids" property to include "maxItems": 20 (you can
also add "minItems": 1 if desired) so the schema enforces the limit and prevents
oversized calls from the model.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 85a5d1e4-02f2-46a7-8f77-9031517ffdae
📒 Files selected for processing (3)
- src/openhuman/tools/impl/memory/tree/drill_down.rs
- src/openhuman/tools/impl/memory/tree/fetch_leaves.rs
- src/openhuman/tools/impl/memory/tree/search_entities.rs
🚧 Files skipped from review as they are similar to previous changes (2)
- src/openhuman/tools/impl/memory/tree/drill_down.rs
- src/openhuman/tools/impl/memory/tree/search_entities.rs
…eaves logs

CodeRabbit follow-up: align log lines with the CLAUDE.md convention ([domain] / [rpc] / [ui-flow] grep-friendly prefixes plus correlation fields). Surfaces requested_ids / truncated_to / hits counts so a fetch_leaves invocation can be traced end-to-end.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
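A sketch of the grep-friendly log line this commit describes; the exact message format is assumed from the field names in the commit text, not copied from the repo:

```rust
/// Builds the correlation-field log line for a fetch_leaves call so it
/// can be grepped by the stable `[rpc][memory_tree]` prefix.
fn fetch_leaves_log(requested_ids: usize, truncated_to: usize, hits: usize) -> String {
    format!(
        "[rpc][memory_tree] fetch_leaves requested_ids={requested_ids} truncated_to={truncated_to} hits={hits}"
    )
}

fn main() {
    // 25 ids requested, capped to 20, 18 chunks hydrated.
    println!("{}", fetch_leaves_log(25, 20, 18));
}
```

Keeping the prefix constant and the variable parts as `key=value` pairs lets one `grep '\[rpc\]\[memory_tree\]'` trace an invocation end-to-end.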
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/openhuman/tools/impl/memory/tree/fetch_leaves.rs (1)
32-39: Encode the 20-id cap directly in the JSON schema.

Line 35 documents the limit, but the schema doesn't enforce it. Add `maxItems` so tool callers and validators get the constraint upfront.

Suggested diff

```diff
 "properties": {
   "chunk_ids": {
     "type": "array",
     "items": {"type": "string"},
+    "maxItems": MAX_CHUNK_IDS_PER_CALL,
     "description": "Chunk ids to hydrate. Capped at 20 per call."
   }
 },
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/tools/impl/memory/tree/fetch_leaves.rs` around lines 32 - 39, The JSON schema for the "chunk_ids" array in fetch_leaves.rs documents a 20-id cap but doesn't enforce it; update the schema object that defines the "chunk_ids" property to include "maxItems": 20 (i.e., add maxItems to the "chunk_ids" property's JSON schema alongside "type" and "items") so callers and validators will enforce the limit (you can leave the description as-is or adjust it to match the enforced constraint).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/openhuman/tools/impl/memory/tree/fetch_leaves.rs`:
- Around line 60-65: The call to retrieval::fetch_leaves and the
serde_json::to_string call return errors that are currently propagated with "?"
without any RPC-prefixed logs or contextual wrapping; add log::error calls with
a stable prefix like "[rpc][memory_tree]" and correlation fields from the
incoming request (e.g., any request_id/trace_id on req) and chunk info
(chunk_ids.len() and take) when these operations fail, and wrap the errors with
additional context before returning (e.g., use anyhow::Context or map_err to
attach "rpc memory_tree: fetch_leaves failed" and "rpc memory_tree: serialize
hits failed") so callers and logs include grep-friendly prefixes and useful
correlation details for retrieval::fetch_leaves and serde_json::to_string
failures.
---
Nitpick comments:
In `@src/openhuman/tools/impl/memory/tree/fetch_leaves.rs`:
- Around line 32-39: The JSON schema for the "chunk_ids" array in
fetch_leaves.rs documents a 20-id cap but doesn't enforce it; update the schema
object that defines the "chunk_ids" property to include "maxItems": 20 (i.e.,
add maxItems to the "chunk_ids" property's JSON schema alongside "type" and
"items") so callers and validators will enforce the limit (you can leave the
description as-is or adjust it to match the enforced constraint).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d131ebb5-5a41-43d1-be04-bab6ab06377b
📒 Files selected for processing (1)
src/openhuman/tools/impl/memory/tree/fetch_leaves.rs
```rust
let hits = retrieval::fetch_leaves(&cfg, &req.chunk_ids[..take]).await?;
log::debug!(
    "[rpc][memory_tree] fetch_leaves completed hits={}",
    hits.len()
);
let json = serde_json::to_string(&hits)?;
```
Log and wrap downstream failures with RPC-prefixed context.
Lines 60 and 65 propagate errors without grep-friendly error logs or correlation fields. Add contextual logging before returning errors.
Suggested diff

```diff
-    let hits = retrieval::fetch_leaves(&cfg, &req.chunk_ids[..take]).await?;
+    let hits = retrieval::fetch_leaves(&cfg, &req.chunk_ids[..take])
+        .await
+        .map_err(|e| {
+            log::debug!(
+                "[rpc][memory_tree] fetch_leaves failed requested_ids={} err={}",
+                take,
+                e
+            );
+            anyhow::anyhow!("memory_tree_fetch_leaves: retrieval failed: {e}")
+        })?;
     log::debug!(
         "[rpc][memory_tree] fetch_leaves completed hits={}",
         hits.len()
     );
-    let json = serde_json::to_string(&hits)?;
+    let json = serde_json::to_string(&hits).map_err(|e| {
+        log::debug!(
+            "[rpc][memory_tree] fetch_leaves serialize_failed hits={} err={}",
+            hits.len(),
+            e
+        );
+        anyhow::anyhow!("memory_tree_fetch_leaves: serialize failed: {e}")
+    })?;
```

As per coding guidelines: "Log entry/exit, branches, external calls, retries/timeouts, state transitions, errors with stable grep-friendly prefixes ([domain], [rpc], [ui-flow]) and correlation fields."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/openhuman/tools/impl/memory/tree/fetch_leaves.rs` around lines 60 - 65,
The call to retrieval::fetch_leaves and the serde_json::to_string call return
errors that are currently propagated with "?" without any RPC-prefixed logs or
contextual wrapping; add log::error calls with a stable prefix like
"[rpc][memory_tree]" and correlation fields from the incoming request (e.g., any
request_id/trace_id on req) and chunk info (chunk_ids.len() and take) when these
operations fail, and wrap the errors with additional context before returning
(e.g., use anyhow::Context or map_err to attach "rpc memory_tree: fetch_leaves
failed" and "rpc memory_tree: serialize hits failed") so callers and logs
include grep-friendly prefixes and useful correlation details for
retrieval::fetch_leaves and serde_json::to_string failures.
Summary
- Six LLM-facing tools (memory_tree_{search_entities, query_topic, query_source, query_global, drill_down, fetch_leaves}) wrapping the existing Phase 4 retrieval RPCs so the orchestrator agent can answer questions about ingested email / chat / document memory.
- TreeContextLoader injects the 7-day cross-source digest into the orchestrator's turn context on the first turn AND every REFRESH_INTERVAL (30 min) thereafter, so long-running sessions stay current with newly-ingested memory. Each injection rides on the user message (NOT the system prompt) to keep the KV-cache prefix stable.
- The orchestrator prompt gains tool-selection guidance and a citation-marker convention ([^N] + footnote with node_id and source_ref).
- tests/agent_retrieval_e2e.rs exercises the wrapper pipeline against a real ingest fixture, plus a static check that agent.toml exposes all six tool names.
- The about_app capability catalog gains intelligence.memory_tree_retrieval (Beta, LOCAL_RAW privacy).

Problem

Phase 4 (#710) shipped six JSON-RPC retrieval primitives over the hierarchical memory tree, but they're only reachable from external callers. The orchestrator agent — the user-facing chat front — has no way to call them, so it can't answer questions like "what emails did I get from Anthropic this week?" against the user's own ingested memory. The agent dispatches only to the old episodic memory_recall.

Solution
- New tool wrappers live in src/openhuman/tools/impl/memory/tree/. Each wrapper deserialises args into the matching *Request struct from retrieval/rpc.rs, loads config via config_rpc::load_config_with_timeout, calls the typed retrieval::* function (not the _rpc variants — keeps tool output free of the CLI log envelope), and returns ToolResult::success(json).
- Tools are registered in tools/ops.rs::all_tools_with_runtime next to the existing memory tools, and in the orchestrator's agent.toml named=[…].
- Prefetch (TreeContextLoader): a pure should_prefetch(last, now, interval) helper isolates the cadence decision from Instant::now() so it stays deterministically testable. The Session struct gains last_tree_prefetch_at: Option<Instant>, bumped on every successful load (including empty digests, so empty workspaces don't get re-queried every turn). Failures are non-fatal — bare context returned on any error.
- Citation convention in prompt.md: the LLM emits [^N] markers + a footnote block with node_id + source_ref from each RetrievalHit. UI rendering is deferred — markers stay as plaintext until that lands.
- Tool-selection guidance: prefer query_* summaries; only call drill_down/fetch_leaves when the user wants details or a verbatim quote. Live testing observed exactly this pattern.

Verification
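The timestamp-bump rule in the prefetch design can be sketched in isolation. The field name follows the PR text; the load-result type here is a simplified stand-in for TreeContextLoader's real return type:

```rust
use std::time::Instant;

/// Simplified sketch: bump the prefetch timestamp on every successful
/// load — including empty digests, so an empty workspace isn't
/// re-queried every turn — and treat failures as non-fatal.
fn apply_prefetch_result(
    last_tree_prefetch_at: &mut Option<Instant>,
    now: Instant,
    digest: Result<String, String>, // stand-in for TreeContextLoader::load
) -> Option<String> {
    match digest {
        Ok(ctx) => {
            *last_tree_prefetch_at = Some(now); // bump even when empty
            if ctx.is_empty() { None } else { Some(ctx) }
        }
        Err(_) => None, // non-fatal: the turn proceeds with bare context
    }
}

fn main() {
    let now = Instant::now();
    let mut last = None;
    // Empty digest: nothing to inject, but the timestamp still advances.
    let injected = apply_prefetch_result(&mut last, now, Ok(String::new()));
    println!("injected={:?} bumped={}", injected, last.is_some());
}
```

Returning `None` on error without touching the timestamp means a transient RPC failure retries on the next turn rather than waiting out the full interval.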
End-to-end smoke against the developer's populated .openhuman/workspace:

- cargo run --bin openhuman-core -- call --method openhuman.local_ai_agent_chat --params '{"message":"Who is the most frequent email sender across my inbox? Only check via the memory tree retrieval tools."}'
- A single turn dispatched the retrieval tools (with tree_scope, child_ids) and produced an accurate answer ranking notifications@github.com (27) > no-reply@otter.ai (5) by mention count.
- cargo test --test agent_retrieval_e2e: 2/2 pass
- cargo test --lib agent::tree_loader: 5/5 pass (4 new should_prefetch cases + the empty-workspace case)
- cargo test --lib memory::tree::retrieval: 70/70 pass (no regressions)
- cargo fmt --check + cargo clippy --all-targets -- -D warnings: zero warnings introduced in changed files.

Submission Checklist
- cargo test --lib agent::tree_loader (5/5) covers the empty-workspace path and the full should_prefetch state machine. Per-tool wrappers are covered transitively by the integration test.
- tests/agent_retrieval_e2e.rs ingests a real email fixture and exercises the full tool-wrapper → typed retrieval → serialised result pipeline; a static test asserts agent.toml exposes all six tool names.
- //! headers on every new module; // rationale comments on the KV-cache invariant in turn.rs and the typed-vs-_rpc choice in each tool wrapper.

Impact
- One query_global call per refresh (cached for 30 min); each refresh adds at most ~1.5 KB of context. No background work added.
- The new capability ships with LOCAL_RAW privacy.

Related
- Citation markers ([^N]) are absent in current LLM output; needs prompt iteration with concrete anti-hallucination examples.
- agent.toml named=[…] — pre-existing condition unrelated to this PR, but worth investigating whether named is supposed to be authoritative.
- query_global returns 0 against a freshly-ingested workspace because the global_tree::digest::end_of_day_digest daily-digest job hasn't run; the loader handles this gracefully, but the prefetch is a no-op until the digest pipeline fires.
New Features
Behavior
Tests