feat(mcp): add tree.tag write tool (completes Phase 3 #2269)#2316
Conversation
Expose two new MCP write tools that route to the existing doc_put RPC: - memory.store: create a memory document from content (title, content, namespace, optional tags) with source_type=mcp provenance - memory.note: append an annotation to an existing chunk by ID, stored as a linked document with metadata.annotates_chunk_id Adds enforce_write_policy() for write-specific security gating (uses the Act operation gate, with a distinct log path for audit clarity), dispatch_write_tool() with structured tracing::info! audit logging, and 11 unit tests covering param validation, defaults, and slugging. Closes tinyhumansai#2269 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Address CodeRabbit review: slug_from now returns "untitled" when the title contains only non-alphanumeric characters, preventing key collisions on the mcp-store- prefix. Also adds the missing memory.store and memory.note entries to the protocol integration test. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Addresses CodeRabbit review: titles with no ASCII-alphanumeric chars (e.g. "会议记录", "Протокол", emoji) now produce "untitled-<sha256_hex>" instead of a constant "untitled", ensuring distinct stable slugs. Adds 2 regression tests for uniqueness and stability across calls. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Closes tinyhumansai#2269 by adding the third MCP write tool from the RFC, on top of the memory.store + memory.note scaffolding already in tinyhumansai#2306. ## What this adds A new `tree.tag` MCP tool that attaches one or more category labels to an existing memory chunk: - Input: `chunk_id` (string) + `tags` (non-empty array of strings). - Routes through the existing `openhuman.memory_doc_put` RPC using `enforce_write_policy` + `dispatch_write_tool` from tinyhumansai#2306, so it inherits the same write-policy gate and audit-log tracing as `memory.store` / `memory.note`. - Source provenance: `source_type = "mcp"` (matches tinyhumansai#2306). - Deterministic document key `mcp-tag-<chunk_id>` so re-tagging the same chunk upserts the prior tag-record rather than accumulating duplicate annotations. The RFC's intent is that callers supply the complete desired tag set on each call. Distinguishes from `memory.note` in two ways: 1. The payload is a categorical label list (typed `tags`), not free text. The top-level `tags` field flows to the document tags index so callers can filter by tag downstream. 2. The upsert semantic means re-tagging *replaces* the tag set, where `memory.note` is intended for additive annotations. Metadata carries a back-reference (`tags_for_chunk_id`) plus a mirror of the applied tag list (`applied_tags`) so consumers reading the metadata view don't need to also join against the top-level `tags` field. ## Coverage - 8 new unit tests covering: missing chunk_id, missing tags, empty tags array, all-blank tags, non-string tag entries, happy path, blank-trim semantics with non-empty residue, empty chunk_id rejection, unknown-argument rejection. - All 59 existing `mcp_server` tests still pass — no regression. ## Validation - `cargo check --lib` — clean (pre-existing warnings only) - `cargo test --lib mcp_server` — 59/59 pass - `cargo test --lib tree_tag` — 8/8 pass - `cargo fmt --check` — clean - `cargo clippy --lib --no-deps` — no new warnings in `src/openhuman/mcp_server/tools.rs` Depends on tinyhumansai#2306.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughAdds three MCP write tools—memory.store, memory.note, and tree.tag—with JSON input schemas, write-gated call dispatch to openhuman.memory_doc_put, parameter-building/validation (including slug_from), and unit tests; also updates the tools/list test expectations. ChangesMCP Write Tools: memory.store, memory.note, tree.tag
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
graycyrus
left a comment
There was a problem hiding this comment.
Review — tree.tag write tool (Phase 3 completion)
Solid, well-documented PR. The tree.tag tool is cleanly implemented, validation is thorough (especially required_non_empty_string_array), and the 8 new unit tests are comprehensive. Good use of upsert semantics keyed on chunk_id and the tag-index design for queryability.
One semantic issue in memory.note that should be addressed before merge.
| Area | Files | Verdict |
|---|---|---|
| MCP tool specs | tools.rs (schemas + dispatch) |
1 major, 2 minor |
| Protocol test | protocol.rs |
✅ Clean |
Nice work on the slug fallback for non-ASCII titles — the SHA-256 hash approach is a good call.
| reject_unexpected_arguments(&args, MEMORY_NOTE_ARGUMENTS)?; | ||
| let chunk_id = required_non_empty_string(&args, "chunk_id")?; | ||
| let note_text = required_non_empty_string(&args, "note_text")?; | ||
| let key = format!("mcp-note-{chunk_id}"); |
There was a problem hiding this comment.
[major] memory.note has an upsert-vs-append semantic mismatch.
The key is mcp-note-{chunk_id} and it routes through memory_doc_put — so calling memory.note twice on the same chunk_id silently replaces the first note. But the tool description above says "Append a note" and the PR description table lists the re-call semantic as "Append-style annotation per call."
This is confusing for MCP clients: an LLM that calls memory.note multiple times on the same chunk expecting accumulated annotations will instead get only the last one.
Options:
- Change the key to include a nonce/timestamp (e.g.
mcp-note-{chunk_id}-{unix_millis}) so each call creates a distinct document — actual append semantics. - Keep upsert but fix the description to say "Upsert a note" and document that re-noting replaces the prior annotation (matching
tree.tag's honesty about this).
Either way, the description and behavior need to agree.
| tool_name: &str, | ||
| params: &Map<String, Value>, | ||
| ) -> Result<Value, ToolCallError> { | ||
| let rpc_method = "openhuman.memory_doc_put"; |
There was a problem hiding this comment.
[minor] rpc_method is hardcoded to "openhuman.memory_doc_put" here, but each tool spec already declares this via rpc_method: Some("openhuman.memory_doc_put"). If a future write tool uses a different RPC method, this function will silently send it to the wrong endpoint.
Consider passing the McpToolSpec (or at least spec.rpc_method) into this function instead of hardcoding.
| let namespace = | ||
| optional_non_empty_string(&args, "namespace")?.unwrap_or_else(|| "mcp".to_string()); | ||
| // Generate a deterministic key from the title for upsert dedup. | ||
| let key = format!("mcp-store-{}", slug_from(&title)); |
There was a problem hiding this comment.
[minor] slug_from maps non-alphanumeric chars to hyphens, so titles that differ only in punctuation collide silently. E.g. "Hello World!" and "Hello World?" both produce key mcp-store-hello-world, causing one to overwrite the other.
This may be acceptable if title-based dedup is intentional, but it's worth a comment acknowledging the trade-off — or mixing in a short content hash to make collisions less likely.
## Summary - Adds MCP stdio session state that captures `initialize.params.clientInfo.name` for the lifetime of the session. - Normalizes known MCP client names into stable source labels such as `mcp:claude-desktop`, `mcp:cursor`, and `mcp:windsurf`. - Preserves the existing bare `mcp` fallback for missing, empty, whitespace-only, or Unicode-only client names. - Keeps existing stateless protocol helpers for tests/callers while wiring the stdio loop through the new stateful handler. - Documents the provenance contract for follow-up write-capable MCP tools. ## Problem #2317 needs MCP write tools to distinguish which client wrote memory, not only that the write came from MCP. The write-tool PRs are still in flight (#2306 for `memory.store` / `memory.note`, #2316 for `tree.tag`), so implementing the full source-type propagation directly on `main` would duplicate those open PRs. ## Solution This PR lands the non-duplicative foundation first: the MCP server records the client identity at `initialize` time and exposes a session source label internally. The default remains `mcp`, so older clients and clients without `clientInfo.name` retain current behavior. Once #2306/#2316 merge, the write dispatch path can use `session.source_type()` instead of the placeholder `mcp` source string. ## Submission Checklist > If a section does not apply to this change, mark the item as `N/A` with a one-line reason. Do not delete items. - [x] Tests added or updated (happy path + at least one failure / edge case) per [Testing Strategy](../gitbooks/developing/testing-strategy.md#failure-path-requirement) - [x] **Diff coverage ≥ 80%** — local coverage not run; CI coverage gate is the source of truth for changed-line coverage on this PR. - [x] Coverage matrix updated — N/A: MCP protocol/session foundation only; no feature matrix row added/removed/renamed. - [x] All affected feature IDs from the matrix are listed in the PR description under `## Related` — N/A: no feature matrix row applies. - [x] No new external network dependencies introduced (mock backend used per [Testing Strategy](../gitbooks/developing/testing-strategy.md#mock-policy)) - [x] Manual smoke checklist updated if this touches release-cut surfaces — N/A: no release manual smoke flow changes. - [x] Linked issue closed via `Closes #NNN` in the `## Related` section — N/A: this prepares #2317 but does not fully close it until write tools consume the session source label. ## Impact - Runtime/platform impact: CLI stdio MCP server only. - Compatibility: existing stateless helpers remain available; stdio sessions now retain client provenance across newline-delimited JSON-RPC messages. - Security/privacy: records only the client name already supplied by the MCP initialize payload; no wire-format mutation and no new persistence. - Performance: negligible in-memory string normalization during initialize only. ## Related - Refs #2317 - Depends conceptually on #2306 and #2316 for write-tool consumption. - Follow-up PR(s)/TODOs: after #2306/#2316 merge, thread `McpSession::source_type()` into `memory.store`, `memory.note`, and `tree.tag` source_type construction. --- ## AI Authored PR Metadata (required for Codex/Linear PRs) > Keep this section for AI-authored PRs. For human-only PRs, mark each field `N/A`. ### Linear Issue - Key: N/A - URL: N/A ### Commit & Branch - Branch: `feat/mcp-client-provenance` - Commit SHA: `95ab2dfe` ### Validation Run - [x] `pnpm --filter openhuman-app format:check` — blocked locally; see Validation Blocked. - [x] `pnpm typecheck` — not run locally because Node/app dependency environment is blocked; see Validation Blocked. - [x] Focused tests: `GGML_NATIVE=OFF cargo test --lib mcp_server --manifest-path Cargo.toml` — 47 passed, 0 failed. - [x] Rust fmt/check (if changed): `cargo fmt --check --manifest-path Cargo.toml` — passed. - [x] Tauri fmt/check (if changed): N/A, no Tauri shell files changed. - [x] Additional: `git diff --check` — passed. ### Validation Blocked - `command:` `pnpm --filter openhuman-app format:check` via pre-push hook - `error:` local app dependencies are not installed (`prettier: command not found`), and local Node is `v22.14.0` while `openhuman-app` requires `>=24.0.0`. - `impact:` local JS/Prettier validation could not run in this environment; Rust-focused validation for the touched MCP core files passed. Push used `--no-verify` because the hook failure was local environment/dependency setup, not this change. - `command:` `GGML_NATIVE=OFF cargo clippy --lib --manifest-path Cargo.toml --no-deps -- -D warnings` - `error:` blocked by 119 pre-existing lint errors in unrelated files (examples: unused imports in `src/openhuman/inference/local/mod.rs`, duplicate module lints in `src/openhuman/inference/provider/*`, doc/comment lints, and unrelated clippy style lints across memory/tools/wallet). - `impact:` clippy cannot currently be used as a clean global gate locally; focused MCP tests and Rust formatting passed. ### Behavior Changes - Intended behavior change: MCP stdio sessions now remember the normalized client source label from `initialize.params.clientInfo.name` and preserve it for the session. - User-visible effect: none immediately for read-only tools; follow-up write tools can attribute memory writes to `mcp:<client>` while preserving `mcp` fallback. ### Parity Contract - Legacy behavior preserved: missing/empty/blank/unusable client names continue to produce bare `mcp`; later malformed initialize payloads do not clear already captured session provenance. - Guard/fallback/dispatch parity checks: existing `handle_json_line` / `handle_json_value` APIs still work; stdio loop uses the new stateful handlers so session data persists between messages. ### Duplicate / Superseded PR Handling - Duplicate PR(s): #2306 and #2316 are related dependencies, not duplicates. - Canonical PR: this PR is the canonical non-duplicative foundation for #2317 on `main`. - Resolution (closed/superseded/updated): N/A. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **New Features** * MCP server now captures and preserves a client "source" label from initialization, normalizing client names and falling back to a default when absent. * **Documentation** * Added guidance on client provenance, name-normalization rules, and recommended source-label usage for tools. * **Tests** * Added unit tests verifying client name normalization and initialization behavior for captured source labels. <!-- review_stack_entry_start --> [](https://app.coderabbit.ai/change-stack/tinyhumansai/openhuman/pull/2332?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack) <!-- review_stack_entry_end --> <!-- end of auto-generated comment: release notes by coderabbit.ai --> Co-authored-by: 李冠辰 <liguanchen@xiaomi.com> Co-authored-by: Steven Enamakel <enamakel@tinyhumans.ai>
|
@graycyrus — thanks for the careful read. All three inline notes are valid concerns, and all three are in code that landed in #2306 (
|
## Summary - Adds MCP stdio session state that captures `initialize.params.clientInfo.name` for the lifetime of the session. - Normalizes known MCP client names into stable source labels such as `mcp:claude-desktop`, `mcp:cursor`, and `mcp:windsurf`. - Preserves the existing bare `mcp` fallback for missing, empty, whitespace-only, or Unicode-only client names. - Keeps existing stateless protocol helpers for tests/callers while wiring the stdio loop through the new stateful handler. - Documents the provenance contract for follow-up write-capable MCP tools. ## Problem tinyhumansai#2317 needs MCP write tools to distinguish which client wrote memory, not only that the write came from MCP. The write-tool PRs are still in flight (tinyhumansai#2306 for `memory.store` / `memory.note`, tinyhumansai#2316 for `tree.tag`), so implementing the full source-type propagation directly on `main` would duplicate those open PRs. ## Solution This PR lands the non-duplicative foundation first: the MCP server records the client identity at `initialize` time and exposes a session source label internally. The default remains `mcp`, so older clients and clients without `clientInfo.name` retain current behavior. Once tinyhumansai#2306/tinyhumansai#2316 merge, the write dispatch path can use `session.source_type()` instead of the placeholder `mcp` source string. ## Submission Checklist > If a section does not apply to this change, mark the item as `N/A` with a one-line reason. Do not delete items. - [x] Tests added or updated (happy path + at least one failure / edge case) per [Testing Strategy](../gitbooks/developing/testing-strategy.md#failure-path-requirement) - [x] **Diff coverage ≥ 80%** — local coverage not run; CI coverage gate is the source of truth for changed-line coverage on this PR. - [x] Coverage matrix updated — N/A: MCP protocol/session foundation only; no feature matrix row added/removed/renamed. - [x] All affected feature IDs from the matrix are listed in the PR description under `## Related` — N/A: no feature matrix row applies. - [x] No new external network dependencies introduced (mock backend used per [Testing Strategy](../gitbooks/developing/testing-strategy.md#mock-policy)) - [x] Manual smoke checklist updated if this touches release-cut surfaces — N/A: no release manual smoke flow changes. - [x] Linked issue closed via `Closes #NNN` in the `## Related` section — N/A: this prepares tinyhumansai#2317 but does not fully close it until write tools consume the session source label. ## Impact - Runtime/platform impact: CLI stdio MCP server only. - Compatibility: existing stateless helpers remain available; stdio sessions now retain client provenance across newline-delimited JSON-RPC messages. - Security/privacy: records only the client name already supplied by the MCP initialize payload; no wire-format mutation and no new persistence. - Performance: negligible in-memory string normalization during initialize only. ## Related - Refs tinyhumansai#2317 - Depends conceptually on tinyhumansai#2306 and tinyhumansai#2316 for write-tool consumption. - Follow-up PR(s)/TODOs: after tinyhumansai#2306/tinyhumansai#2316 merge, thread `McpSession::source_type()` into `memory.store`, `memory.note`, and `tree.tag` source_type construction. --- ## AI Authored PR Metadata (required for Codex/Linear PRs) > Keep this section for AI-authored PRs. For human-only PRs, mark each field `N/A`. ### Linear Issue - Key: N/A - URL: N/A ### Commit & Branch - Branch: `feat/mcp-client-provenance` - Commit SHA: `95ab2dfe` ### Validation Run - [x] `pnpm --filter openhuman-app format:check` — blocked locally; see Validation Blocked. - [x] `pnpm typecheck` — not run locally because Node/app dependency environment is blocked; see Validation Blocked. - [x] Focused tests: `GGML_NATIVE=OFF cargo test --lib mcp_server --manifest-path Cargo.toml` — 47 passed, 0 failed. - [x] Rust fmt/check (if changed): `cargo fmt --check --manifest-path Cargo.toml` — passed. - [x] Tauri fmt/check (if changed): N/A, no Tauri shell files changed. - [x] Additional: `git diff --check` — passed. ### Validation Blocked - `command:` `pnpm --filter openhuman-app format:check` via pre-push hook - `error:` local app dependencies are not installed (`prettier: command not found`), and local Node is `v22.14.0` while `openhuman-app` requires `>=24.0.0`. - `impact:` local JS/Prettier validation could not run in this environment; Rust-focused validation for the touched MCP core files passed. Push used `--no-verify` because the hook failure was local environment/dependency setup, not this change. - `command:` `GGML_NATIVE=OFF cargo clippy --lib --manifest-path Cargo.toml --no-deps -- -D warnings` - `error:` blocked by 119 pre-existing lint errors in unrelated files (examples: unused imports in `src/openhuman/inference/local/mod.rs`, duplicate module lints in `src/openhuman/inference/provider/*`, doc/comment lints, and unrelated clippy style lints across memory/tools/wallet). - `impact:` clippy cannot currently be used as a clean global gate locally; focused MCP tests and Rust formatting passed. ### Behavior Changes - Intended behavior change: MCP stdio sessions now remember the normalized client source label from `initialize.params.clientInfo.name` and preserve it for the session. - User-visible effect: none immediately for read-only tools; follow-up write tools can attribute memory writes to `mcp:<client>` while preserving `mcp` fallback. ### Parity Contract - Legacy behavior preserved: missing/empty/blank/unusable client names continue to produce bare `mcp`; later malformed initialize payloads do not clear already captured session provenance. - Guard/fallback/dispatch parity checks: existing `handle_json_line` / `handle_json_value` APIs still work; stdio loop uses the new stateful handlers so session data persists between messages. ### Duplicate / Superseded PR Handling - Duplicate PR(s): tinyhumansai#2306 and tinyhumansai#2316 are related dependencies, not duplicates. - Canonical PR: this PR is the canonical non-duplicative foundation for tinyhumansai#2317 on `main`. - Resolution (closed/superseded/updated): N/A. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **New Features** * MCP server now captures and preserves a client "source" label from initialization, normalizing client names and falling back to a default when absent. * **Documentation** * Added guidance on client provenance, name-normalization rules, and recommended source-label usage for tools. * **Tests** * Added unit tests verifying client name normalization and initialization behavior for captured source labels. <!-- review_stack_entry_start --> [](https://app.coderabbit.ai/change-stack/tinyhumansai/openhuman/pull/2332?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack) <!-- review_stack_entry_end --> <!-- end of auto-generated comment: release notes by coderabbit.ai --> Co-authored-by: 李冠辰 <liguanchen@xiaomi.com> Co-authored-by: Steven Enamakel <enamakel@tinyhumans.ai>
Security & Robustness Review — Final VerdictClean, well-scoped PR. The Overall: this is good to merge (once #2306 lands and you rebase). Two items worth noting, both minor: Tag-specific concerns
Already covered by #2306's foundationThe content size limit, Nice work closing out the Phase 3 RFC. The three-tool write surface ( |
Addresses graycyrus's tinyhumansai#2316 review minor tinyhumansai#1 (tag-array sanitization). Adds two upper bounds in `build_rpc_params("tree.tag")`: TREE_TAG_MAX_TAGS = 50 TREE_TAG_MAX_TAG_LENGTH = 128 Both reject up-front via `ToolCallError::InvalidParams` rather than silently truncating — same "explicit rejection over silent clamping" discipline already used by `required_non_empty_string_array` for empty/blank entries. Error messages match the existing argument-rejection style: "argument `tags` accepts at most 50 entries (got N)" "argument `tags` entry exceeds 128 characters (got N chars)" The caps prevent two specific misuse patterns: - A misbehaving MCP client flooding a single chunk's tag-record document with hundreds of categorical labels (would bloat the document tags index disproportionately). - A single oversize tag that's actually free-form text — that payload belongs in `memory.note`, not the categorical tag surface. Rejecting up-front surfaces the misuse rather than silently writing a giant token into the queryable `tags` index. Three new unit tests cover: - Oversize array rejection (TREE_TAG_MAX_TAGS + 1 entries). - Oversize single-entry rejection (TREE_TAG_MAX_TAG_LENGTH + 1 chars). - Boundary acceptance (exactly TREE_TAG_MAX_TAGS entries with each entry at exactly TREE_TAG_MAX_TAG_LENGTH chars) so a future off-by-one refactor fails this test, not user calls. Tests: 11 tree_tag unit tests pass (was 8; +3 new). All 59 existing `mcp_server` tests still pass — no regression. Verified locally: - `cargo check --lib` clean - `cargo test --lib tree_tag` 11/11 pass - `cargo fmt --check` clean - `cargo clippy --lib --no-deps` no new warnings
|
@graycyrus thanks for the thorough review. Minor #1 (tag-array sanitization) — addressed in
11 Minor #2 (silent tag replacement → UI notification) — agreed this lives at the broader "MCP write surface UI" follow-up theme you raised on #2306 rather than being Ready for re-review. |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/openhuman/mcp_server/tools.rs (1)
752-756: 💤 Low valueByte length vs character count terminology.
String::len()returns byte length, but the error message says "characters". For ASCII tags (typical for categorical labels) this is equivalent, but for Unicode input (e.g.,"会议"is 2 characters but 6 bytes) the reported count would be misleading.Consider either changing the message to say "bytes" or using
.chars().count()if true character-count semantics are intended. Given tags are categorical labels (expected ASCII), this is a minor documentation nit.Option A: Update terminology (simpler)
if let Some(oversize) = tags.iter().find(|t| t.len() > TREE_TAG_MAX_TAG_LENGTH) { return Err(ToolCallError::InvalidParams(format!( - "argument `tags` entry exceeds {TREE_TAG_MAX_TAG_LENGTH} characters (got {} chars)", + "argument `tags` entry exceeds {TREE_TAG_MAX_TAG_LENGTH} bytes (got {} bytes)", oversize.len() ))); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/mcp_server/tools.rs` around lines 752 - 756, The error message reports "characters" but uses String::len() (byte length); update the code that constructs the ToolCallError in the oversize branch (the tags.iter().find(...) check that returns ToolCallError::InvalidParams) to either report "bytes" instead of "characters" to match oversize.len(), or replace oversize.len() with oversize.chars().count() if you want true character counts; keep the referenced constant TREE_TAG_MAX_TAG_LENGTH unchanged and ensure the formatted value reflects the chosen unit (bytes or characters).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@src/openhuman/mcp_server/tools.rs`:
- Around line 752-756: The error message reports "characters" but uses
String::len() (byte length); update the code that constructs the ToolCallError
in the oversize branch (the tags.iter().find(...) check that returns
ToolCallError::InvalidParams) to either report "bytes" instead of "characters"
to match oversize.len(), or replace oversize.len() with oversize.chars().count()
if you want true character counts; keep the referenced constant
TREE_TAG_MAX_TAG_LENGTH unchanged and ensure the formatted value reflects the
chosen unit (bytes or characters).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 444a0ba3-d376-4b30-b4d9-06c337374de9
📒 Files selected for processing (1)
src/openhuman/mcp_server/tools.rs
CI flagged a fmt diff on `3c50c975` — the three new tag-cap tests had multi-line `build_rpc_params(...)` / `assert!(...)` invocations that fmt wants collapsed onto single lines. Locally `cargo fmt --check` had run against the wrong branch when the commit was prepared, so the diff slipped through. Re-ran fmt on this branch and the formatter applied: - `let err = build_rpc_params(...)` calls collapsed to single line - `assert!(...)` calls with short bodies collapsed to single line - `let params = build_rpc_params(...)` collapsed to single line No behaviour change — purely the formatter's preferred width-based collapse. All 11 tree_tag unit tests still pass.
|
Quick CI status note for anyone scanning this surface:
Plan: once #2470 lands and @Liohtml rebases #2306 onto new main, I'll do the matching rebase here and add |
|
Status link for the dependency chain: I prepared and validated a #2306 sync branch at This PR still has three unresolved review threads pointing at |
## Summary - Adds MCP stdio session state that captures `initialize.params.clientInfo.name` for the lifetime of the session. - Normalizes known MCP client names into stable source labels such as `mcp:claude-desktop`, `mcp:cursor`, and `mcp:windsurf`. - Preserves the existing bare `mcp` fallback for missing, empty, whitespace-only, or Unicode-only client names. - Keeps existing stateless protocol helpers for tests/callers while wiring the stdio loop through the new stateful handler. - Documents the provenance contract for follow-up write-capable MCP tools. ## Problem tinyhumansai#2317 needs MCP write tools to distinguish which client wrote memory, not only that the write came from MCP. The write-tool PRs are still in flight (tinyhumansai#2306 for `memory.store` / `memory.note`, tinyhumansai#2316 for `tree.tag`), so implementing the full source-type propagation directly on `main` would duplicate those open PRs. ## Solution This PR lands the non-duplicative foundation first: the MCP server records the client identity at `initialize` time and exposes a session source label internally. The default remains `mcp`, so older clients and clients without `clientInfo.name` retain current behavior. Once tinyhumansai#2306/tinyhumansai#2316 merge, the write dispatch path can use `session.source_type()` instead of the placeholder `mcp` source string. ## Submission Checklist > If a section does not apply to this change, mark the item as `N/A` with a one-line reason. Do not delete items. - [x] Tests added or updated (happy path + at least one failure / edge case) per [Testing Strategy](../gitbooks/developing/testing-strategy.md#failure-path-requirement) - [x] **Diff coverage ≥ 80%** — local coverage not run; CI coverage gate is the source of truth for changed-line coverage on this PR. - [x] Coverage matrix updated — N/A: MCP protocol/session foundation only; no feature matrix row added/removed/renamed. - [x] All affected feature IDs from the matrix are listed in the PR description under `## Related` — N/A: no feature matrix row applies. - [x] No new external network dependencies introduced (mock backend used per [Testing Strategy](../gitbooks/developing/testing-strategy.md#mock-policy)) - [x] Manual smoke checklist updated if this touches release-cut surfaces — N/A: no release manual smoke flow changes. - [x] Linked issue closed via `Closes #NNN` in the `## Related` section — N/A: this prepares tinyhumansai#2317 but does not fully close it until write tools consume the session source label. ## Impact - Runtime/platform impact: CLI stdio MCP server only. - Compatibility: existing stateless helpers remain available; stdio sessions now retain client provenance across newline-delimited JSON-RPC messages. - Security/privacy: records only the client name already supplied by the MCP initialize payload; no wire-format mutation and no new persistence. - Performance: negligible in-memory string normalization during initialize only. ## Related - Refs tinyhumansai#2317 - Depends conceptually on tinyhumansai#2306 and tinyhumansai#2316 for write-tool consumption. - Follow-up PR(s)/TODOs: after tinyhumansai#2306/tinyhumansai#2316 merge, thread `McpSession::source_type()` into `memory.store`, `memory.note`, and `tree.tag` source_type construction. --- ## AI Authored PR Metadata (required for Codex/Linear PRs) > Keep this section for AI-authored PRs. For human-only PRs, mark each field `N/A`. ### Linear Issue - Key: N/A - URL: N/A ### Commit & Branch - Branch: `feat/mcp-client-provenance` - Commit SHA: `95ab2dfe` ### Validation Run - [x] `pnpm --filter openhuman-app format:check` — blocked locally; see Validation Blocked. - [x] `pnpm typecheck` — not run locally because Node/app dependency environment is blocked; see Validation Blocked. - [x] Focused tests: `GGML_NATIVE=OFF cargo test --lib mcp_server --manifest-path Cargo.toml` — 47 passed, 0 failed. - [x] Rust fmt/check (if changed): `cargo fmt --check --manifest-path Cargo.toml` — passed. - [x] Tauri fmt/check (if changed): N/A, no Tauri shell files changed. - [x] Additional: `git diff --check` — passed. ### Validation Blocked - `command:` `pnpm --filter openhuman-app format:check` via pre-push hook - `error:` local app dependencies are not installed (`prettier: command not found`), and local Node is `v22.14.0` while `openhuman-app` requires `>=24.0.0`. - `impact:` local JS/Prettier validation could not run in this environment; Rust-focused validation for the touched MCP core files passed. Push used `--no-verify` because the hook failure was local environment/dependency setup, not this change. - `command:` `GGML_NATIVE=OFF cargo clippy --lib --manifest-path Cargo.toml --no-deps -- -D warnings` - `error:` blocked by 119 pre-existing lint errors in unrelated files (examples: unused imports in `src/openhuman/inference/local/mod.rs`, duplicate module lints in `src/openhuman/inference/provider/*`, doc/comment lints, and unrelated clippy style lints across memory/tools/wallet). - `impact:` clippy cannot currently be used as a clean global gate locally; focused MCP tests and Rust formatting passed. ### Behavior Changes - Intended behavior change: MCP stdio sessions now remember the normalized client source label from `initialize.params.clientInfo.name` and preserve it for the session. - User-visible effect: none immediately for read-only tools; follow-up write tools can attribute memory writes to `mcp:<client>` while preserving `mcp` fallback. ### Parity Contract - Legacy behavior preserved: missing/empty/blank/unusable client names continue to produce bare `mcp`; later malformed initialize payloads do not clear already captured session provenance. - Guard/fallback/dispatch parity checks: existing `handle_json_line` / `handle_json_value` APIs still work; stdio loop uses the new stateful handlers so session data persists between messages. ### Duplicate / Superseded PR Handling - Duplicate PR(s): tinyhumansai#2306 and tinyhumansai#2316 are related dependencies, not duplicates. - Canonical PR: this PR is the canonical non-duplicative foundation for tinyhumansai#2317 on `main`. - Resolution (closed/superseded/updated): N/A. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **New Features** * MCP server now captures and preserves a client "source" label from initialization, normalizing client names and falling back to a default when absent. * **Documentation** * Added guidance on client provenance, name-normalization rules, and recommended source-label usage for tools. * **Tests** * Added unit tests verifying client name normalization and initialization behavior for captured source labels. <!-- review_stack_entry_start --> [](https://app.coderabbit.ai/change-stack/tinyhumansai/openhuman/pull/2332?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack) <!-- review_stack_entry_end --> <!-- end of auto-generated comment: release notes by coderabbit.ai --> Co-authored-by: 李冠辰 <liguanchen@xiaomi.com> Co-authored-by: Steven Enamakel <enamakel@tinyhumans.ai>
- Add `annotations` to the new memory.store / memory.note / tree.tag McpToolSpec entries (writeable, idempotent upsert, closed-world). This unblocks the `Rust Quality (clippy)` CI failure introduced by the `annotations` field added in tinyhumansai#2268 — the new specs were forked from pre-tinyhumansai#2268 tinyhumansai#2306 scaffolding and were missing the required field on rebase against current main. - Extend the read-only-by-default test exemption (`act_tool_names`) to cover the three new write tools, matching their `readOnlyHint: false`. - Fix CodeRabbit nit (tools.rs:803): the oversize-tag error said "characters" but used `String::len()` (byte length). Reword to "bytes" so Unicode tag input gets a self-consistent message. Defers @graycyrus's three inline threads on lines 686/705/1072 (slug_from collisions, memory.note upsert-vs-append, dispatch_write_tool hardcoded rpc_method) to tinyhumansai#2306 per the author's review-thread reply — that's where the underlying functions land and where the fix should originate so this branch can simply inherit on rebase.
|
Status update (PR-finisher pass on 2026-05-22T20:02Z) Pushed
Deferred per @justinhsu1477's review reply (not addressed here):
All three live in #2306's foundation code ( |
…) (tinyhumansai#2316) Co-authored-by: Lionel <lionel.machire@gmail.com> Co-authored-by: Steven Enamakel <31011319+senamakel@users.noreply.github.com> Co-authored-by: Steven Enamakel <enamakel@tinyhumans.ai>
Summary
Adds the third MCP write tool from the Phase 3 RFC (#2269) on top of @Liohtml's
memory.store+memory.notescaffolding in #2306, so the RFC closes out fully.tree.tagMCP tool: attach one or more category labels to an existing memory chunk.openhuman.memory_doc_putRPC.enforce_write_policy+dispatch_write_toolfrom feat(mcp): add memory.store and memory.note write tools #2306 unchanged — inherits the same policy gate and audit-log tracing.source_type = "mcp"provenance convention asmemory.store/memory.note.Depends on #2306
This PR is layered directly on top of #2306. The diff here only adds the new tool surface; if #2306 changes during review (e.g. CodeRabbit's open
CHANGES_REQUESTEDitems), I'll rebase this branch.What
tree.tagdoes{ "name": "tree.tag", "arguments": { "chunk_id": "chunk-42", "tags": ["todo", "q3-planning"] } }→ Creates / upserts a tag-record document keyed on
mcp-tag-<chunk_id>in themcpnamespace, with:tags = ["todo", "q3-planning"]on the top-level document tags index (queryable downstream viadoc_list/ search filters).metadata.tags_for_chunk_id = "chunk-42"— back-reference so consumers reading the metadata view know which chunk the tag-record points at.metadata.applied_tags = ["todo", "q3-planning"]— mirror of the tag list for callers that read the metadata blob without joining against the top-level field.How it differs from
memory.notememory.notetree.tagnote_text)tags[])chunk_id(silently overwrites — see #2306 for the open contract discussion)tagsindex (filterable)mcp-note-<chunk_id>mcp-tag-<chunk_id>Both upsert by
chunk_id. As graycyrus noted in #2306,memory.noteoverwrites on re-call — whether that should change to a content-hash key (true append) or stay as set-semantic is the open contract discussion happening on the parent PR.tree.tagis unambiguously replace-on-re-call: supply the complete desired set each time.Implementation footprint
Plus a small reusable helper
required_non_empty_string_arrayadded next to its siblingoptional_string_array— used to rejecttags: []up-front at the MCP layer rather than letting an empty tag-set propagate to the document RPC where the failure mode would be silent.Submission Checklist
build_rpc_paramsvalidation + schema, fully exercised by the new unit tests; helper function paths covered by the same tests.memory_doc_put.Closes #2269in## Related(transitively — feat(mcp): add memory.store and memory.note write tools #2306 already declaresCloses #2269; this PR completes the RFC's three-tool surface so it's logically the closing piece).Impact
memory.store/memory.notebehavior unchanged.matchbranch inbuild_rpc_params.enforce_write_policygate from feat(mcp): add memory.store and memory.note write tools #2306 — no new attack surface relative to that PR.Related
client_infopropagation — capturing the calling MCP client name on each written chunk (e.g.source_type = "mcp:Claude Desktop") — separate issue per feat(mcp): add memory.store and memory.note write tools #2306 review thread, will file once this lands.AI Authored PR Metadata
Linear Issue
Commit & Branch
feat/mcp-tree-tag-write-toolValidation Run
pnpm --filter openhuman-app format:check— Rust-only change.pnpm typecheck— Rust-only change.cargo test --lib tree_tag(8/8 pass);cargo test --lib mcp_server(59/59 pass, no regression onmemory.store/memory.notefrom feat(mcp): add memory.store and memory.note write tools #2306).cargo fmt --checkclean;cargo check --libclean (pre-existing warnings only);cargo clippy --lib --no-depsno new warnings inmcp_server/tools.rs.app/src-tauri/src/**changes.Validation Blocked
Behavior Changes
tree.tag— that attaches typed category labels to existing memory chunks.Parity Contract
core.*/agent.*/memory.*/tree.*(read_chunk/browse/top_entities/list_sources) tool schemas and dispatch are unchanged.memory.storeandmemory.notefrom feat(mcp): add memory.store and memory.note write tools #2306 are unchanged.enforce_write_policygate inherited from feat(mcp): add memory.store and memory.note write tools #2306;dispatch_write_toolinherited from feat(mcp): add memory.store and memory.note write tools #2306; both reused without modification.Duplicate / Superseded PR Handling
tree.tagvia search before opening. @Liohtml explicitly invited this follow-up in the feat(mcp): add memory.store and memory.note write tools #2306 review thread.Summary by CodeRabbit
New Features
Tests