feat: add IMAP email_search tool for branch read-back#246
Conversation
WalkthroughAdds an IMAP-based email_search tool (implementation, prompts, and docs), wires it into branch ToolServers via an added runtime_config parameter, extends messaging with mailbox search APIs, updates prompt rendering error handling, and adjusts related call sites and tests. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
|
||
| /// Search the configured mailbox directly via IMAP. | ||
| /// | ||
| /// Results are returned newest-first across searched folders. |
There was a problem hiding this comment.
This doc comment reads like a global sort, but the implementation appends per-folder hits (newest-first within each folder). Either sort results across folders or tweak the comment to match.
| /// Results are returned newest-first across searched folders. | |
| /// Results are returned newest-first within each searched folder. |
src/messaging/email.rs
Outdated
| let mut message_uids: Vec<u32> = match session.uid_search(&criterion) { | ||
| Ok(uids) => uids.into_iter().collect(), | ||
| Err(error) => { | ||
| tracing::warn!(folder, criterion, %error, "failed IMAP mailbox search"); |
There was a problem hiding this comment.
Minor privacy footgun: logging the raw criterion can include user-provided search terms (subject/text) which might be sensitive. Consider dropping it from warn! (or logging it at debug!).
| tracing::warn!(folder, criterion, %error, "failed IMAP mailbox search"); | |
| tracing::warn!(folder, %error, "failed IMAP mailbox search"); |
| .tool(MemoryRecallTool::new(memory_search.clone())) | ||
| .tool(MemoryDeleteTool::new(memory_search)) | ||
| .tool(ChannelRecallTool::new(conversation_logger, channel_store)) | ||
| .tool(EmailSearchTool::new(runtime_config)) |
There was a problem hiding this comment.
Worth double-checking the blast radius here: adding email_search to the shared branch ToolServer makes it available from any channel that can spawn a branch (Discord/Telegram/etc). In multi-user deployments that’s effectively “any user can query the bot’s mailbox” unless there’s a separate permission layer. Consider gating this behind an explicit config flag / permission check / channel-type check.
src/tools/email_search.rs
Outdated
| parts.push(format!("subject={subject}")); | ||
| } | ||
| if let Some(text) = &query.text { | ||
| parts.push(format!("text={text}")); |
There was a problem hiding this comment.
Nit: the tool arg is named query, but the criteria string prints text=.... Using the same label makes logs/output less confusing.
| parts.push(format!("text={text}")); | |
| parts.push(format!("query={text}")); |
| break; | ||
| } | ||
|
|
||
| let fetches = match session.uid_fetch(uid.to_string(), "(UID RFC822)") { |
There was a problem hiding this comment.
Potential perf win: uid_fetch(..., "(UID RFC822)") downloads the full raw message for every hit. If the intent is just a bounded snippet + a couple headers, it might be worth switching to a partial/body-peek fetch (or fetching just headers + text) to avoid pulling large attachments over IMAP.
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/messaging/email.rs`:
- Around line 1015-1018: search_mailbox currently short-circuits per-folder when
pushing matches (using the per-folder traversal and early break when `limit` is
reached), which violates the advertised global newest-first ordering; change the
logic to collect matches from all searched folders into a single results
collection (preserving the message date/received timestamp and source folder),
then after all folders are scanned sort that collection by the timestamp
descending (newest-first) and only then apply the `limit` to return the top N
results. Update the per-folder loops and the code paths that currently break on
`limit` (the sequential folder traversal and push/append sites) to always
accumulate instead of breaking, ensuring any filters/score computations still
run per message but defer truncation until after the global sort.
- Around line 1053-1057: The warning currently logs the raw IMAP search
criterion (see the session.uid_search(&criterion) error branch and the
tracing::warn! call), which may leak user-provided query text; change the
tracing::warn! invocation to avoid including the raw `criterion` value—replace
it with a redacted placeholder or a non-sensitive derived value (e.g.,
"<redacted_criterion>" or a fixed-length hash/length like `criterion.len()`),
keep `folder` and `%error` intact, and update the message text accordingly so no
raw query text is emitted in logs.
In `@src/tools/email_search.rs`:
- Around line 169-185: The function load_email_config currently prefers loading
Config::load_from_path when config.toml exists which bypasses the global
precedence rules; replace the file-first branch with the centralized subsystem
resolver call (use the existing per-subsystem resolve() API) so resolution
follows env > DB > default. Concretely, remove the Config::load_from_path /
Config::load_from_env branching in load_email_config and invoke the subsystem
resolver (e.g., EmailConfig::resolve(&runtime_config) or the project's
Config::resolve::<EmailConfig>(&runtime_config) /
subsystem_resolver.resolve("email", &runtime_config)) and map any error into
EmailSearchError with the same error propagation pattern. Ensure the unique
symbols touched are load_email_config, RuntimeConfig, EmailConfig (or project
resolver API), and EmailSearchError so the change integrates with the rest of
the code.
- Around line 123-125: The call to load_email_config() inside async fn call()
blocks the executor because load_email_config() uses Config::load_from_path()
which does synchronous std::fs::read_to_string(); fix by either (A) moving the
config load into a blocking task using tokio::task::spawn_blocking and await
that result inside call(), or (B) load and cache the resolved email config once
in the tool's state (e.g., store the EmailConfig on EmailSearchTool
initialization) and then read from that cached field in call() instead of
invoking load_email_config() each time; update references to load_email_config,
Config::load_from_path, and the EmailSearchTool (or its constructor/new method)
accordingly.
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
docs/content/docs/(features)/tools.mdxdocs/content/docs/(messaging)/email-setup.mdxdocs/content/docs/(messaging)/messaging.mdxprompts/en/tools/email_search_description.md.j2src/agent/channel.rssrc/agent/ingestion.rssrc/messaging/email.rssrc/prompts/engine.rssrc/prompts/text.rssrc/tools.rssrc/tools/email_search.rstests/context_dump.rs
There was a problem hiding this comment.
♻️ Duplicate comments (1)
src/messaging/email.rs (1)
1069-1134:⚠️ Potential issue | 🟠 MajorLimit is applied too late, causing potentially unbounded RFC822 downloads.
search_mailboxfetches full raw messages for every matched UID across folders, then applieslimitat the end. For broad criteria (especially defaultALL), this can become very expensive and slow.#!/bin/bash set -euo pipefail # Verify where search happens, where full-message fetch happens, and where limiting happens. rg -n -C3 'uid_search\(&criterion\)|uid_fetch\(uid\.to_string\(\), "\(UID RFC822\)"\)|sort_and_limit_search_hits\(ranked_results, limit\)' src/messaging/email.rsExpected result: full
RFC822fetch occurs inside the UID loop before finalsort_and_limit_search_hits, confirming heavy I/O precedes truncation.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/messaging/email.rs` around lines 1069 - 1134, search_mailbox currently uid_fetches full "(UID RFC822)" for every matched UID then applies sort_and_limit_search_hits, causing unbounded downloads; instead fetch only headers/metadata first (change uid_fetch calls used in the loop over message_uids to request headers/date/from/subject/Message-ID or use BODY.PEEK[HEADER] or a HEADER.FIELDS fetch), build ranked_results with that metadata only, call sort_and_limit_search_hits to pick the top N, and only then call uid_fetch for the selected UIDs to retrieve full RFC822 bodies and call extract_text_and_attachments; references: uid_fetch, message_uids, ranked_results, sort_and_limit_search_hits, extract_text_and_attachments.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/messaging/email.rs`:
- Around line 1069-1134: search_mailbox currently uid_fetches full "(UID
RFC822)" for every matched UID then applies sort_and_limit_search_hits, causing
unbounded downloads; instead fetch only headers/metadata first (change uid_fetch
calls used in the loop over message_uids to request
headers/date/from/subject/Message-ID or use BODY.PEEK[HEADER] or a HEADER.FIELDS
fetch), build ranked_results with that metadata only, call
sort_and_limit_search_hits to pick the top N, and only then call uid_fetch for
the selected UIDs to retrieve full RFC822 bodies and call
extract_text_and_attachments; references: uid_fetch, message_uids,
ranked_results, sort_and_limit_search_hits, extract_text_and_attachments.
Summary
email_searchtool that queries IMAP directly with sender/subject/text/unread/since filters and returns bounded snippets plus metadata for precise email read-backTesting
cargo fmt --allcargo checkcargo test tools::email_search --libcargo test messaging::email --libcargo clippy --all-targetsNote
This PR adds a new
EmailSearchTool(src/tools/email_search.rs) that allows branches to search IMAP mailboxes directly. Key changes include:search_mailbox()function and supporting types in src/messaging/email.rs with criterion sanitization and folder normalizationWritten by Tembo for commit 2405da2.