diff --git a/docs/plans/2026-03-17-fix-utf8-tool-notifier-panic.md b/docs/plans/2026-03-17-fix-utf8-tool-notifier-panic.md new file mode 100644 index 0000000..66e9d08 --- /dev/null +++ b/docs/plans/2026-03-17-fix-utf8-tool-notifier-panic.md @@ -0,0 +1,249 @@ +# Fix UTF-8 Panic in tool_notifier.rs Implementation Plan + +> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task. + +**Goal:** Fix a panic crash in `format_args_preview` caused by byte-indexed string slicing that breaks on multi-byte UTF-8 characters (Chinese/Japanese/etc). + +**Architecture:** The fix replaces two unsafe `&s[..60]` byte-slices with a char-count-based truncation using `.chars().take(60).collect::()`. This is pure Rust stdlib — no new dependencies needed. Both the single-arg fast-path (line 32-33) and the fallback raw-JSON path (line 43-44) in `format_args_preview` are affected. + +**Tech Stack:** Rust 2021 edition, `serde_json`, no new dependencies. + +--- + +## Background + +**The crash:** +``` +thread 'tokio-runtime-worker' panicked at src/platform/tool_notifier.rs:44:36: +byte index 60 is not a char boundary; it is inside '香' (bytes 58..61) of `{"description":"每日上午10點 arXiv AI 論文摘要(香港時間)",...` +``` + +**Why it crashes:** +- `String::len()` returns **byte length**, not character count +- `&s[..60]` requires byte index 60 to sit on a UTF-8 character boundary +- Chinese characters are 3 bytes each in UTF-8, so byte 60 can fall in the middle of one +- Rust panics rather than silently producing garbled output + +**Two affected lines in `src/platform/tool_notifier.rs`:** + +| Line | Code | Path | +|------|------|------| +| 32-33 | `if s.len() > 60 { format!("{}...", &s[..60]) }` | single-arg fast-path | +| 43-44 | `if args_json.len() > 60 { format!("{}...", &args_json[..60]) }` | fallback raw-JSON path | + +**Safe replacement pattern:** +```rust +// BEFORE (panics on multibyte chars): +if s.len() > 60 { + format!("{}...", &s[..60]) +} else { + s +} + +// AFTER (safe): +let truncated: String = s.chars().take(60).collect(); +if s.chars().count() > 60 { + format!("{}...", truncated) +} else { + s // or truncated — same value +} +``` + +Or more efficiently (single-pass): +```rust +let mut char_count = 0usize; +let mut byte_end = 0usize; +for ch in s.chars() { + if char_count == 60 { break; } + char_count += 1; + byte_end += ch.len_utf8(); +} +if char_count == 60 { + format!("{}...", &s[..byte_end]) +} else { + s +} +``` + +The simplest correct approach (collect into String) is fine for preview strings — no hot path concern. + +--- + +### Task 1: Add Failing Tests for Multi-byte Characters + +**Files:** +- Modify: `src/platform/tool_notifier.rs` (the `#[cfg(test)]` block, currently lines 172-232) + +**Step 1: Add two failing tests in the existing test module** + +Open `src/platform/tool_notifier.rs` and add these two tests inside the existing `#[cfg(test)] mod tests` block (after the last existing test, before the closing `}`): + +```rust +#[test] +fn test_format_args_preview_single_arg_chinese() { + // Chinese chars are 3 bytes each; byte index 60 falls mid-char without the fix. + // "每日上午10點 arXiv AI 論文摘要(香港時間)" is >60 bytes but we want char-safe truncation. + let long_chinese = "每日上午10點 arXiv AI 論文摘要(香港時間)很長的標題讓我們繼續寫下去直到超過六十個字"; + let json = format!(r#"{{"query":"{}"}}"#, long_chinese); + // Must not panic — any result is acceptable as long as it doesn't crash. + let preview = format_args_preview(&json); + assert!(preview.contains("\""), "should be quoted single-arg preview"); + // Verify result is valid UTF-8 (it always is if we don't panic) + assert!(std::str::from_utf8(preview.as_bytes()).is_ok()); +} + +#[test] +fn test_format_args_preview_multi_arg_chinese_no_panic() { + // Multi-arg JSON falls through to the raw-JSON fallback path (line 43-44). + // This is the exact case that crashed in production. + let args = r#"{"description":"每日上午10點 arXiv AI 論文摘要(香港時間)","prompt":"使用 arxiv-daily-briefing skill","trigger_type":"recurring","trigger_value":"0 0 2 * * *"}"#; + // Must not panic + let preview = format_args_preview(args); + assert!(!preview.is_empty()); + assert!(std::str::from_utf8(preview.as_bytes()).is_ok()); +} +``` + +**Step 2: Run the tests to confirm they fail (or panic)** + +```bash +cargo test -p rustfox test_format_args_preview_single_arg_chinese test_format_args_preview_multi_arg_chinese_no_panic 2>&1 +``` + +Expected: one or both tests **panic** (not just fail) with `byte index N is not a char boundary`. If they don't panic yet, the strings may not be long enough — adjust the Chinese string to be longer. + +> Note: A panic in a test shows as `FAILED` with the panic message in output. + +**Step 3: Commit the failing tests** + +```bash +git add src/platform/tool_notifier.rs +git commit -m "test(tool_notifier): add failing tests for multibyte UTF-8 truncation" +``` + +--- + +### Task 2: Fix the UTF-8 Slicing in `format_args_preview` + +**Files:** +- Modify: `src/platform/tool_notifier.rs` lines 21-48 + +**Step 1: Read current `format_args_preview` to confirm line numbers** + +The function currently looks like this (lines 21-48): + +```rust +pub fn format_args_preview(args_json: &str) -> String { + // Try to extract a single-value preview for readability + if let Ok(val) = serde_json::from_str::(args_json) { + if let Some(obj) = val.as_object() { + if obj.len() == 1 { + if let Some((_, v)) = obj.iter().next() { + let s = match v { + serde_json::Value::String(s) => s.clone(), + other => other.to_string(), + }; + let truncated = if s.len() > 60 { // BUG: byte len + format!("{}...", &s[..60]) // BUG: byte slice + } else { + s + }; + return format!("\"{}\"", truncated); + } + } + } + } + // Fallback: truncate raw JSON + if args_json.len() > 60 { // BUG: byte len + format!("{}...", &args_json[..60]) // BUG: byte slice + } else { + args_json.to_string() + } +} +``` + +**Step 2: Replace the function body with the safe version** + +Replace the entire `format_args_preview` function with: + +```rust +pub fn format_args_preview(args_json: &str) -> String { + // Try to extract a single-value preview for readability + // e.g. {"query":"Docker setup"} -> "Docker setup" + if let Ok(val) = serde_json::from_str::(args_json) { + if let Some(obj) = val.as_object() { + if obj.len() == 1 { + if let Some((_, v)) = obj.iter().next() { + let s = match v { + serde_json::Value::String(s) => s.clone(), + other => other.to_string(), + }; + let truncated = truncate_chars(&s, 60); + return format!("\"{}\"", truncated); + } + } + } + } + // Fallback: truncate raw JSON + truncate_chars(args_json, 60) +} + +/// Truncate `s` to at most `max_chars` Unicode scalar values. +/// Appends "..." if truncation occurred. +/// Safe for any UTF-8 input including Chinese, Japanese, emoji, etc. +fn truncate_chars(s: &str, max_chars: usize) -> String { + let mut char_count = 0usize; + let mut byte_end = 0usize; + for ch in s.chars() { + if char_count == max_chars { + // We have more chars — truncate here + return format!("{}...", &s[..byte_end]); + } + char_count += 1; + byte_end += ch.len_utf8(); + } + // Fewer than max_chars — return as-is + s.to_string() +} +``` + +**Step 3: Run all tool_notifier tests** + +```bash +cargo test -p rustfox tool_notifier 2>&1 +``` + +Expected: ALL tests pass including the two new ones. If any fail, read the error and fix before proceeding. + +**Step 4: Run the full test suite and lints** + +```bash +cargo test 2>&1 && cargo clippy -- -D warnings 2>&1 && cargo fmt --check 2>&1 +``` + +Expected: all clean. If `cargo fmt --check` fails, run `cargo fmt` to auto-fix formatting. + +**Step 5: Commit the fix** + +```bash +git add src/platform/tool_notifier.rs +git commit -m "fix(tool_notifier): use char-safe truncation to prevent UTF-8 boundary panic + +Byte-indexed slicing (&s[..60]) panics when the byte boundary falls +inside a multi-byte character (e.g. Chinese 香 = 3 bytes). +Replace with a single-pass char-counting loop that finds the safe +byte boundary, used in both the single-arg fast-path and the raw-JSON +fallback inside format_args_preview." +``` + +--- + +## Verification Checklist + +- [ ] `cargo test` — all tests green including the two new multibyte tests +- [ ] `cargo clippy -- -D warnings` — no warnings +- [ ] `cargo fmt --check` — formatted +- [ ] The exact production input no longer panics: + ```rust + format_args_preview(r#"{"description":"每日上午10點 arXiv AI 論文摘要(香港時間)","prompt":"使用 arxiv-daily-briefing skill,生成每天的 AI 論文簡報(香港時間),並以廣東話摘要提供給用戶。","trigger_type":"recurring","trigger_value":"0 0 2 * * *"}"#); + ``` diff --git a/docs/plans/2026-03-18-fix-utf8-rag-query-rewriter.md b/docs/plans/2026-03-18-fix-utf8-rag-query-rewriter.md new file mode 100644 index 0000000..d336c68 --- /dev/null +++ b/docs/plans/2026-03-18-fix-utf8-rag-query-rewriter.md @@ -0,0 +1,337 @@ +# Fix UTF-8 Byte-Slice Panic in rag.rs and query_rewriter.rs + +> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task. + +**Goal:** Eliminate two `&s[..N]` byte-slice panics on multi-byte UTF-8 input (Chinese/Japanese) in `src/memory/rag.rs` and `src/memory/query_rewriter.rs` by extracting `truncate_chars` into a shared `src/utils/str.rs` utility and using it everywhere. + +**Architecture:** A new `src/utils/str.rs` module exposes `pub fn truncate_chars` (already implemented privately in `tool_notifier.rs`). All three callers — `tool_notifier`, `rag`, and `query_rewriter` — use the shared version. This is a DRY refactor + bug fix in one pass. No new dependencies. + +**Tech Stack:** Rust 2021, stdlib only. + +--- + +## Background + +The same byte-indexing panic fixed in `tool_notifier.rs` (`&s[..60]`) exists in two more files: + +| File | Line | Broken code | Trigger | +|------|------|-------------|---------| +| `src/memory/rag.rs` | 45–46 | `&content[..300]` | any RAG result with Chinese text >300 bytes | +| `src/memory/query_rewriter.rs` | 97–98 | `&c[..200]` | any conversation message with Chinese text >200 bytes | + +The existing test `test_format_history_truncates_long_content` in `query_rewriter.rs` only uses ASCII `"x".repeat(500)` — it will never catch this. + +--- + +### Task 1: Create `src/utils/str.rs` Shared Utility + +**Files:** +- Create: `src/utils/str.rs` +- Create: `src/utils/mod.rs` +- Modify: `src/main.rs` (add `mod utils;`) +- Modify: `src/platform/tool_notifier.rs` (replace private copy with shared one) + +**Step 1: Write the test for the new module** + +Create `src/utils/str.rs` with only the tests (no implementation yet): + +```rust +/// Truncates `s` to at most `max_chars` Unicode scalar values. +/// Appends "..." if truncation occurred. +/// Safe for any UTF-8 input including Chinese, Japanese, emoji, etc. +pub fn truncate_chars(s: &str, max_chars: usize) -> String { + todo!() +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn test_truncate_chars_ascii_short() { + assert_eq!(truncate_chars("hello", 10), "hello"); + } + + #[test] + fn test_truncate_chars_ascii_exact() { + assert_eq!(truncate_chars("hello", 5), "hello"); + } + + #[test] + fn test_truncate_chars_ascii_truncated() { + assert_eq!(truncate_chars("hello world", 5), "hello..."); + } + + #[test] + fn test_truncate_chars_empty() { + assert_eq!(truncate_chars("", 10), ""); + } + + #[test] + fn test_truncate_chars_chinese_no_panic() { + // Chinese chars are 3 bytes each — naive &s[..N] panics here + let s = "每日上午10點 arXiv AI 論文摘要(香港時間)這是一段很長的中文文字用來測試截斷功能是否正確運作"; + let result = truncate_chars(s, 10); + // Must not panic, must end with "..." + assert!(result.ends_with("..."), "should truncate: {}", result); + // Must be valid UTF-8 + assert!(std::str::from_utf8(result.as_bytes()).is_ok()); + // Must be at most 10 chars + "..." = 13 chars + let char_count = result.chars().count(); + assert!(char_count <= 13, "too long: {} chars", char_count); + } + + #[test] + fn test_truncate_chars_chinese_short_no_ellipsis() { + // String shorter than max_chars — no "..." appended + let s = "你好世界"; + let result = truncate_chars(s, 10); + assert_eq!(result, "你好世界"); + } + + #[test] + fn test_truncate_chars_300_boundary() { + // Simulate the rag.rs usage: 300-char limit + let chinese = "香港時間每日簡報".repeat(50); // 8 chars * 50 = 400 chars + let result = truncate_chars(&chinese, 300); + assert!(result.ends_with("...")); + assert!(std::str::from_utf8(result.as_bytes()).is_ok()); + } +} +``` + +**Step 2: Run tests to confirm they fail** + +```bash +cargo test utils::str 2>&1 +``` + +Expected: compile error or all tests fail with `not yet implemented` (from `todo!()`). + +**Step 3: Implement `truncate_chars`** + +Replace the `todo!()` in `src/utils/str.rs` with the real implementation: + +```rust +pub fn truncate_chars(s: &str, max_chars: usize) -> String { + let mut byte_end = 0usize; + for (char_count, ch) in s.chars().enumerate() { + if char_count == max_chars { + return format!("{}...", &s[..byte_end]); + } + byte_end += ch.len_utf8(); + } + s.to_string() +} +``` + +**Step 4: Create `src/utils/mod.rs`** + +```rust +pub mod str; +``` + +**Step 5: Declare `utils` in `src/main.rs`** + +Add `mod utils;` to the module list at the top of `src/main.rs` (after the existing `mod tools;` line): + +```rust +mod utils; +``` + +**Step 6: Update `src/platform/tool_notifier.rs` to use the shared version** + +Remove the private `fn truncate_chars` from `tool_notifier.rs` (currently lines ~45–54) and replace all calls to `truncate_chars(...)` in that file with `crate::utils::str::truncate_chars(...)`. + +The two call sites in `format_args_preview`: +```rust +// line ~32 +let truncated = crate::utils::str::truncate_chars(&s, 60); + +// line ~39 +crate::utils::str::truncate_chars(args_json, 60) +``` + +**Step 7: Run all tests** + +```bash +cargo test 2>&1 +``` + +Expected: all tests pass. If `tool_notifier` tests fail, the refactor broke something — check the call sites. + +**Step 8: Run lints** + +```bash +cargo clippy -- -D warnings 2>&1 && cargo fmt --check 2>&1 +``` + +Both must be clean. Run `cargo fmt` if format check fails. + +**Step 9: Commit** + +```bash +git add src/utils/str.rs src/utils/mod.rs src/main.rs src/platform/tool_notifier.rs +git commit -m "refactor(utils): extract truncate_chars to shared src/utils/str.rs + +Move the UTF-8-safe string truncation helper from tool_notifier.rs +into a new src/utils/str module so rag.rs and query_rewriter.rs +can reuse it without duplicating the implementation." +``` + +--- + +### Task 2: Fix `src/memory/rag.rs` + +**Files:** +- Modify: `src/memory/rag.rs` (lines 45–46) + +**Step 1: Write a failing test** + +Add this test at the bottom of `src/memory/rag.rs` (create a `#[cfg(test)] mod tests` block if one doesn't exist): + +```rust +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn test_build_retrieved_block_chinese_content_no_panic() { + // Simulate what auto_retrieve_context does when content has Chinese chars + // longer than 300 bytes. The old &content[..300] would panic here. + // This calls the snippet-building logic indirectly via the format path. + // We test truncate_chars directly since the inner loop isn't easily callable. + let long_chinese = "每日論文摘要香港時間測試".repeat(30); // >300 bytes + let result = crate::utils::str::truncate_chars(&long_chinese, 300); + assert!(result.ends_with("...")); + assert!(std::str::from_utf8(result.as_bytes()).is_ok()); + } +} +``` + +**Step 2: Run test to verify it compiles and passes** (it should already pass since Task 1 introduced `truncate_chars`) + +```bash +cargo test memory::rag 2>&1 +``` + +**Step 3: Fix the byte-slice** + +In `src/memory/rag.rs` lines 45–49, replace: + +```rust +let snippet = if content.len() > 300 { + format!("{}...", &content[..300]) +} else { + content.clone() +}; +``` + +With: + +```rust +let snippet = crate::utils::str::truncate_chars(content, 300); +``` + +**Step 4: Run tests and lints** + +```bash +cargo test 2>&1 && cargo clippy -- -D warnings 2>&1 +``` + +All must pass. + +**Step 5: Commit** + +```bash +git add src/memory/rag.rs +git commit -m "fix(rag): replace byte-slice with truncate_chars to prevent UTF-8 panic + +&content[..300] panics when byte 300 falls inside a multi-byte char. +Chinese/Japanese input in retrieved context triggered this path." +``` + +--- + +### Task 3: Fix `src/memory/query_rewriter.rs` + +**Files:** +- Modify: `src/memory/query_rewriter.rs` (lines 97–98 and the existing test at lines 183–194) + +**Step 1: Write a failing test for the Chinese input case** + +The existing test `test_format_history_truncates_long_content` uses ASCII only and will not catch this bug. Add a new multibyte test alongside it (before the closing `}` of the `mod tests` block): + +```rust +#[test] +fn test_format_history_truncates_long_chinese_no_panic() { + // Old &c[..200] panics when byte 200 falls inside a multibyte char. + // Chinese chars are 3 bytes each — 67 chars already exceed 200 bytes. + let long_chinese = "每日論文摘要(香港時間)人工智能最新研究".repeat(15); // >200 bytes + let msgs = vec![msg("user", &long_chinese)]; + let result = format_history(&msgs); + // Must not panic + assert!(!result.is_empty()); + assert!(std::str::from_utf8(result.as_bytes()).is_ok()); + // Must be truncated + assert!(result.contains("..."), "should truncate long content: {}", &result[..result.len().min(80)]); +} +``` + +**Step 2: Run to confirm test currently panics** + +```bash +cargo test memory::query_rewriter::tests::test_format_history_truncates_long_chinese_no_panic 2>&1 +``` + +Expected: FAILED with `byte index 200 is not a char boundary`. + +**Step 3: Fix the byte-slice** + +In `src/memory/query_rewriter.rs` lines 97–101, replace: + +```rust +let snippet = if c.len() > 200 { + format!("{}...", &c[..200]) +} else { + c.clone() +}; +``` + +With: + +```rust +let snippet = crate::utils::str::truncate_chars(c, 200); +``` + +**Step 4: Run all tests** + +```bash +cargo test 2>&1 && cargo clippy -- -D warnings 2>&1 && cargo fmt --check 2>&1 +``` + +All must pass. The new multibyte test and the old ASCII test should both be green. + +**Step 5: Commit** + +```bash +git add src/memory/query_rewriter.rs +git commit -m "fix(query_rewriter): replace byte-slice with truncate_chars to prevent UTF-8 panic + +&c[..200] panics when byte 200 falls inside a multi-byte character. +Also adds a multibyte regression test — the existing test only used ASCII." +``` + +--- + +## Verification Checklist + +- [ ] `src/utils/str.rs` exists with `pub fn truncate_chars` and 6+ tests +- [ ] `src/utils/mod.rs` exists with `pub mod str;` +- [ ] `mod utils;` declared in `src/main.rs` +- [ ] `tool_notifier.rs` no longer has a private `truncate_chars` — uses shared one +- [ ] `rag.rs:45–46` — no `&content[..300]` +- [ ] `query_rewriter.rs:97–98` — no `&c[..200]` +- [ ] `cargo test` — all pass including multibyte regression tests +- [ ] `cargo clippy -- -D warnings` — clean +- [ ] `cargo fmt --check` — clean diff --git a/docs/superpowers/plans/2026-04-12-mcp-notion-ui-and-oauth-refresh.md b/docs/superpowers/plans/2026-04-12-mcp-notion-ui-and-oauth-refresh.md new file mode 100644 index 0000000..ae24c8c --- /dev/null +++ b/docs/superpowers/plans/2026-04-12-mcp-notion-ui-and-oauth-refresh.md @@ -0,0 +1,180 @@ +# Notion MCP UI + OAuth Refresh Implementation Plan + +> **For agentic workers:** REQUIRED SUB-SKILL: Use `superpowers:subagent-driven-development` (recommended) or `superpowers:executing-plans` to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. + +**Goal:** Streamline the setup wizard Notion MCP UX (OAuth-only token, no redundant guide modal) and implement **persistent refresh-token rotation** with **automatic access-token refresh** so the bot can run indefinitely without manual re-auth, aligned with [Notion MCP: Step 8 — Handle token refresh](https://developers.notion.com/guides/mcp/build-mcp-client). + +**Architecture:** Extend `McpServerConfig` with optional OAuth fields used only for HTTP MCP servers that completed Notion’s DCR + authorization code flow. The setup wizard persists `auth_token`, `oauth_refresh_token`, `oauth_client_id`, `oauth_client_secret` (if any), and `oauth_access_expires_at` (derived from `expires_in` at callback time). Extract shared **discovery + token refresh** HTTP logic into `src/notion_oauth.rs` and include it from both `main` and `setup` binaries via `#[path = "../notion_oauth.rs"] mod notion_oauth;` in `setup.rs` and `mod notion_oauth;` in the primary binary root. `McpManager` resolves a fresh access token before connecting (refresh when expired or within a short skew window), then writes updated secrets back to `config.toml` using **`toml_edit`** so comments and unrelated sections are preserved. Optional: lightweight periodic refresh task if the process stays up past the access-token lifetime. + +**Tech Stack:** Rust (`src/config.rs`, `src/mcp.rs`, `src/bin/setup.rs`, `setup/index.html`), `reqwest`, `toml` / `toml_edit`, Notion OAuth (RFC 9470 + 8414 + PKCE + refresh grant). + +**Design decisions (brainstorming summary):** + +| Topic | Options | Recommendation | +|-------|---------|------------------| +| Token field UX | Readonly vs editable for power users | **Readonly** — user asked OAuth-only; document that advanced users may edit `config.toml` by hand if needed. | +| Remove guide | Modal + guide button vs one-line doc link | Remove **modal** and **“Notion Setup Guide”** button; keep a **single** short line + link to official docs on the card (not a separate wizard). | +| Persist OAuth | Sidecar JSON vs extra TOML keys | **Extra optional keys** on `[[mcp_servers]]` for the `notion` entry — single file, matches “save to config”, easier backup story. | +| Config rewrite | Full `Serialize` of `Config` vs surgical edit | **`toml_edit`** surgical update of the relevant `[[mcp_servers]]` row — avoids deriving `Serialize` for the entire config and preserves user comments/order where possible. | +| Shared OAuth code | Duplicate vs `lib.rs` vs `#[path]` module | **`src/notion_oauth.rs` + `#[path]` in setup binary** — minimal churn; no full crate lib split required. | +| When to refresh | Only at startup vs on 401 vs periodic | **At MCP connect time** if `oauth_refresh_token` present and access token missing/expired/near expiry; **optional** `tokio::interval` (e.g. 15–30 min) to refresh proactively for long uptime. Retry **once** after refresh on `invalid_token` if feasible without deep `rmcp` changes (else document “restart bot” as fallback). | + +--- + +## References + +- [Integrating your own MCP client](https://developers.notion.com/guides/mcp/build-mcp-client) — Steps 6–8 (tokens, Bearer, refresh, rotation). +- Prior notes: `docs/superpowers/plans/2026-04-12-notion-mcp-invalid-token-root-cause-and-fix.md` (Task 4). +- Internal OAuth details: `docs/superpowers/plans/2026-04-12-notion-mcp-oauth.md` (deduplicate overlapping work). + +--- + +### Task 1: Config schema — OAuth fields on `McpServerConfig` + +**Files:** + +- Modify: [`src/config.rs`](../../../src/config.rs) — `McpServerConfig` +- Modify: [`config.example.toml`](../../../config.example.toml) — commented Notion example + +- [ ] **Step 1:** Add optional fields (all `Option` or appropriate types), `#[serde(default)]`: + + - `oauth_refresh_token` — refresh token from token endpoint (rotate when refreshed). + - `oauth_client_id` — from dynamic client registration (required for refresh grant). + - `oauth_client_secret` — optional; include when registration returned a secret. + - `oauth_access_expires_at` — store as **Unix timestamp** (`i64` or `u64`) when access token expires; compute at OAuth callback as `now + expires_in` when `expires_in` present, else `None` (force refresh-on-next-connect or conservative behaviour). + +- [ ] **Step 2:** Add unit test in `config.rs` `#[cfg(test)]` parsing a TOML snippet with `[[mcp_servers]]` `name = "notion"` including the new keys. + +- [ ] **Step 3:** Run `cargo test` and `cargo clippy -- -D warnings`. + +--- + +### Task 2: Shared module `src/notion_oauth.rs` (discovery + refresh) + +**Files:** + +- Create: [`src/notion_oauth.rs`](../../../src/notion_oauth.rs) +- Modify: [`src/main.rs`](../../../src/main.rs) — `mod notion_oauth;` + +- [ ] **Step 1:** Move or reimplement (without behaviour change) from [`src/bin/setup.rs`](../../../src/bin/setup.rs): + + - HTTP discovery: protected resource metadata → authorization server metadata (`authorization_endpoint`, `token_endpoint`, `registration_endpoint`, `scopes_supported`). + - `refresh_access_token(http, token_endpoint, client_id, client_secret, refresh_token) -> Result` where `TokenRefreshResponse` contains at least `access_token`, `refresh_token` (optional — use prior if omitted per RFC), `expires_in` (optional). + +- [ ] **Step 2:** Map Notion Step 8: `grant_type=refresh_token`, form body fields per official doc; on `invalid_grant` return a distinct error so callers can log “re-run Connect Notion”. + +- [ ] **Step 3:** Unit tests with `mockito` or `wiremock` **if** already in tree; else `#[cfg(test)]` with a stub — if no HTTP mock dep, test pure URL parsing/helpers only; **do not** add heavy deps without justification. + +- [ ] **Step 4:** `cargo test` / `clippy`. + +--- + +### Task 3: Wire `notion_oauth` into setup binary + persist full OAuth row + +**Files:** + +- Modify: [`src/bin/setup.rs`](../../../src/bin/setup.rs) — top: `#[path = "../notion_oauth.rs"] mod notion_oauth;` then delete duplicated discovery structs/functions now living in `notion_oauth.rs` (keep setup-specific `PendingNotionOAuth`, Axum handlers, HTML callback). + +- [ ] **Step 1:** Callback handler: after successful token exchange, pass `client_id`, `client_secret`, `access_token`, `refresh_token`, `expires_in` to the HTML/JSON payload for the wizard. Extend `oauth_callback_html_ok` JSON payload to include `client_id`, `client_secret` (if any), `expires_in` (already partially there). + +- [ ] **Step 2:** [`setup/index.html`](../../../setup/index.html) — extend `postMessage` handler to stash `oauth_client_id`, `oauth_client_secret`, `oauth_refresh_token`, and compute/store expiry in `state` (e.g. `state.mcp_selections.notion.oauthExpiresAt` as ms timestamp). + +- [ ] **Step 3:** [`generateToml()`](../../../setup/index.html) — for catalog tool `notion`, emit new TOML keys under `[[mcp_servers]]` when values exist (escape with existing `esc()`). + +- [ ] **Step 4:** [`loadExistingConfig`](../../../setup/index.html) / [`parse_existing_config`](../../../src/bin/setup.rs) — round-trip `ExistingMcpServer` in setup.rs to include new fields; merge into wizard state on load. + +- [ ] **Step 5:** Setup tests in `setup.rs` for parse TOML containing new fields. + +--- + +### Task 4: Notion MCP wizard UI — simplify + readonly token + +**Files:** + +- Modify: [`setup/index.html`](../../../setup/index.html) + +- [ ] **Step 1:** Remove the **Notion MCP Setup Modal** block (`#notion-modal` and related CSS if unused). + +- [ ] **Step 2:** Remove `openNotionModal`, `closeNotionModal`, `__NOTION_GUIDE_BUTTON__` branch, and any `onclick` that opened the modal. + +- [ ] **Step 3:** On the Notion card: one concise line, e.g. “Sign in with Notion via OAuth — [Get started ↗](https://developers.notion.com/guides/mcp/get-started-with-mcp) · [Client integration ↗](https://developers.notion.com/guides/mcp/build-mcp-client)”. No separate guide modal. + +- [ ] **Step 4:** Token ``: set `readonly` attribute for `NOTION_TOKEN`; remove `oninput` that called `setEnv` for manual typing — only `setEnv` from OAuth `postMessage` and `disconnectNotionOAuth` updates state. Style readonly (e.g. muted background) to match UX. + +- [ ] **Step 5:** Keep **Connect Notion** / **Disconnect**; ensure previous fix remains: **do not** use `noopener` on `window.open` for OAuth popup. + +- [ ] **Step 6:** Adjust `looksLikeNotionIntegrationSecret` / validation: still useful when **loading** old configs; readonly prevents new pastes. + +- [ ] **Step 7:** Manual browser check: load wizard, Connect Notion, verify field fills + TOML preview contains OAuth fields. + +--- + +### Task 5: `toml_edit` — update Notion MCP block after refresh + +**Files:** + +- Modify: [`Cargo.toml`](../../../Cargo.toml) — add `toml_edit = "0.22"` (or current compatible). + +- Create or modify: e.g. [`src/config_toml_patch.rs`](../../../src/config_toml_patch.rs) or `src/config.rs` — `pub fn update_mcp_server_oauth_in_file(path: &Path, server_name: &str, patch: &OauthTokenPatch) -> Result<()>`. + +- [ ] **Step 1:** Implement read file → parse as `toml_edit::DocumentMut` → locate `[[mcp_servers]]` where `name == "notion"` (or passed `server_name`) → set `auth_token`, `oauth_refresh_token`, `oauth_client_id`, `oauth_client_secret`, `oauth_access_expires_at`. + +- [ ] **Step 2:** Write atomically: temp file in same dir + `rename` (pattern already used elsewhere if present; else standard `fs::write` to temp + rename). + +- [ ] **Step 3:** Unit test with a string constant TOML fixture in `config.rs` or new test module. + +--- + +### Task 6: `McpManager` — refresh then connect + persist + +**Files:** + +- Modify: [`src/mcp.rs`](../../../src/mcp.rs) +- Modify: [`src/main.rs`](../../../src/main.rs) — pass `config_path` into MCP layer + +- [ ] **Step 1:** Extend `McpManager::new` or `connect_all` to accept `config_path: PathBuf` (or `Option` — `None` skips persist for tests). + +- [ ] **Step 2:** For each `McpServerConfig` with `url` set and `oauth_refresh_token` + `oauth_client_id` present, and access token expired or near expiry (compare `oauth_access_expires_at` with `SystemTime` + 5 min skew), call `notion_oauth::refresh_access_token`, build updated `auth_token`, then patch TOML via Task 5 helper, then connect with new bearer. + +- [ ] **Step 3:** If no refresh fields (legacy config with only `auth_token`), connect as today. + +- [ ] **Step 4:** Log success/failure at `info!`/`error!`; on refresh failure with `invalid_grant`, log clear “run setup Connect Notion again”. + +- [ ] **Step 5:** `cargo test`, `clippy`. + +--- + +### Task 7 (optional but recommended): Proactive refresh while bot runs + +**Files:** + +- Modify: [`src/main.rs`](../../../src/main.rs) or [`src/mcp.rs`](../../../src/mcp.rs) + +- [ ] **Step 1:** If `oauth_refresh_token` present, spawn `tokio::spawn` interval (e.g. every 20–30 minutes): if token expires within next 15 minutes, refresh and patch TOML (same as Task 6). + +- [ ] **Step 2:** Use `tokio::sync::Mutex` or single-flight guard so concurrent refresh never runs twice. + +--- + +## Self-review (writing-plans) + +| Requirement | Task | +|-------------|------| +| Better Notion setup UI, remove guide modal, readonly token | Task 4 | +| Refresh token + persist + auto refresh “for lifetime” | Tasks 1–3, 5–7 | +| Align with Notion build-mcp-client Step 8 | Task 2 (refresh grant), Task 6–7 | +| No placeholder-only steps | Checked — file paths and behaviours explicit | + +--- + +## Execution handoff + +**Plan saved to:** `docs/superpowers/plans/2026-04-12-mcp-notion-ui-and-oauth-refresh.md` + +**Two execution options:** + +1. **Subagent-driven (recommended)** — One subagent per task; review between tasks. **Skill:** `superpowers:subagent-driven-development`. + +2. **Inline** — Same session, checkpoints. **Skill:** `superpowers:executing-plans`. + +**Which approach?** diff --git a/docs/superpowers/plans/2026-04-12-notion-mcp-invalid-token-root-cause-and-fix.md b/docs/superpowers/plans/2026-04-12-notion-mcp-invalid-token-root-cause-and-fix.md new file mode 100644 index 0000000..fad5d71 --- /dev/null +++ b/docs/superpowers/plans/2026-04-12-notion-mcp-invalid-token-root-cause-and-fix.md @@ -0,0 +1,166 @@ +# Notion MCP `invalid_token` — Root Cause and Fix Plan + +> **For agentic workers:** REQUIRED SUB-SKILL: Use `superpowers:subagent-driven-development` (recommended) or `superpowers:executing-plans` to implement task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. + +**Goal:** Ensure RustFox connects to `https://mcp.notion.com/mcp` with a credential Notion accepts: an **OAuth 2.0 access token** obtained via Authorization Code + PKCE, sent as `Authorization: Bearer `, per [Integrating your own MCP client](https://developers.notion.com/guides/mcp/build-mcp-client). + +**Architecture:** The bot already uses `rmcp` streamable HTTP with `StreamableHttpClientTransportConfig::auth_header` ([`src/mcp.rs`](../../../src/mcp.rs)). The setup wizard ([`setup/index.html`](../../../setup/index.html), [`src/bin/setup.rs`](../../../src/bin/setup.rs)) implements discovery (RFC 9470 → RFC 8414), dynamic registration, PKCE, and token exchange. The gap is not “missing OAuth code” in the abstract—it is **operational**: users (or stale config) can still persist the **wrong token type**, which produces exactly the log you see. + +**Tech Stack:** Rust (`src/mcp.rs`, `src/bin/setup.rs`), Axum + `oauth2`, Notion MCP docs. + +--- + +## Systematic debugging — Phase 1: root cause (confirmed) + +### Error (from `run-2026-04-12.log`) + +```text +AuthRequired(AuthRequiredError { www_authenticate_header: "Bearer realm=\"OAuth\", error=\"invalid_token\"" }) +Failed to connect to MCP server 'notion': ... Auth required, when send initialize request +``` + +The server is rejecting the `Authorization` bearer value during MCP `initialize`. + +### Evidence: wrong credential type in `config.toml` + +The workspace `config.toml` (user-local, not committed) contained: + +```toml +[[mcp_servers]] +name = "notion" +url = "https://mcp.notion.com/mcp" +auth_token = "ntn_…" # prefix indicates Notion *internal integration* token style +``` + +**Interpretation:** Tokens prefixed with `ntn_` are **internal integration secrets** (API access for the Integrations API), not OAuth access tokens for the **hosted MCP** resource. Notion’s guide states that connecting to `https://mcp.notion.com/mcp` requires the full OAuth flow (discovery, PKCE, token endpoint) and then passing the **returned `access_token`** on MCP requests ([build-mcp-client](https://developers.notion.com/guides/mcp/build-mcp-client), Step 7). + +Therefore: + +| Symptom | Root cause | +|--------|------------| +| OAuth browser flow “succeeds” but bot still fails | Config was never updated with the OAuth `access_token`, or an old/manual `ntn_` value was saved instead. | +| `invalid_token` on initialize | Bearer value is missing, expired, or **not an OAuth access token** (e.g. integration secret). | + +### What the official guide requires (checklist) + +From [Integrating your own MCP client](https://developers.notion.com/guides/mcp/build-mcp-client): + +1. **Discovery:** Protected Resource Metadata → Authorization Server Metadata (`authorization_endpoint`, `token_endpoint`, optional `registration_endpoint`). +2. **PKCE:** `code_verifier` / `code_challenge` (S256). +3. **Dynamic client registration** when `registration_endpoint` is present. +4. **Token exchange:** `grant_type=authorization_code` with `code_verifier`, `client_id`, `redirect_uri`. +5. **MCP connect:** `Authorization: Bearer ` on streamable HTTP to `https://mcp.notion.com/mcp` (SSE fallback optional). +6. **Refresh:** Access tokens expire (~1 hour per Notion); refresh token rotation must persist the latest refresh token. + +RustFox **implements 1–5 in setup**; the **runtime bot** only sends `auth_token` from config—it does not refresh (see Task 4). + +--- + +## Brainstorming — approaches (3 options) + +| Approach | Pros | Cons | +|----------|------|------| +| **A. User fixes config** | Immediate: re-run setup, **Connect Notion**, verify `auth_token` is **not** `ntn_…`, save, restart bot. | Does not prevent recurrence. | +| **B. Guardrails in UI + startup** | Reject or warn on `ntn_` / `secret_` in the Notion token field; log a clear error at MCP connect. | Small code change; does not replace OAuth. | +| **C. Persist refresh token + refresh in bot** | Matches Notion’s rotation semantics; bot stays up >1h. | Config/schema + token endpoint calls in `mcp` or a small OAuth helper. | + +**Recommendation:** Do **A** immediately to unblock; ship **B** in the same milestone to stop confusion; schedule **C** as follow-up for long-running deployments. + +--- + +## Tasks + +### Task 0: Immediate verification (no code) + +**Files:** User’s `config.toml` (local). + +- [ ] **Step 1:** Open `config.toml` and inspect `[[mcp_servers]]` for `name = "notion"`. +- [ ] **Step 2:** If `auth_token` starts with `ntn_` or looks like an integration secret, **delete that value** and obtain a token via the wizard only: `cargo run --bin setup` → enable Notion → **Connect Notion** → complete browser login → confirm the token field updates → **Save** → restart `cargo run --bin rustfox`. +- [ ] **Step 3:** Expected log: connection succeeds (no `invalid_token`). If it still fails, capture **redacted** first 20 chars of `auth_token` and whether `expires_in` was shown in OAuth callback (for support). + +--- + +### Task 1: Setup wizard — block or warn on integration-token shape + +**Files:** + +- Modify: [`setup/index.html`](../../../setup/index.html) — Notion token input / `setEnv` / `postMessage` handler. + +- [ ] **Step 1:** After setting `NOTION_TOKEN` from OAuth **or** paste, if the value matches `^ntn_` or `^secret_`, show inline error: “This looks like a Notion **integration** secret. Hosted MCP at mcp.notion.com requires an **OAuth access token** — use **Connect Notion**.” + +- [ ] **Step 2:** Optionally disable **Save** on Step 5 when Notion is selected and the token matches those prefixes. + +- [ ] **Step 3:** Manual test: paste fake `ntn_test` → warning appears; OAuth flow still fills a different-shaped token → save → config valid. + +--- + +### Task 2: Bot — clearer MCP connect error for bad Notion token + +**Files:** + +- Modify: [`src/mcp.rs`](../../../src/mcp.rs) — `connect_http` error path (map `AuthRequired` / body if available). + +- [ ] **Step 1:** When HTTP MCP connection fails for server name `notion` and error mentions auth / `invalid_token`, log one line: suggest checking that `auth_token` is the OAuth access token from setup **Connect Notion**, not an integration secret (`ntn_…`). + +- [ ] **Step 2:** `cargo clippy -- -D warnings` && `cargo test`. + +--- + +### Task 3: Confirm load-config round-trip for HTTP MCP + +**Files:** + +- Modify (if not already complete): [`src/bin/setup.rs`](../../../src/bin/setup.rs) — `ExistingMcpServer` / `parse_existing_config`. +- Modify: [`setup/index.html`](../../../setup/index.html) — `loadExistingConfig` merge. + +- [ ] **Step 1:** Verify `GET /api/load-config` returns `url` and `auth_token` for HTTP entries so editing does not drop the bearer token. +- [ ] **Step 2:** Add or extend unit test in `setup.rs` for TOML containing `url` + `auth_token` for notion. + +*(See also the broader checklist in [`2026-04-12-notion-mcp-oauth.md`](./2026-04-12-notion-mcp-oauth.md); deduplicate any overlapping work.)* + +--- + +### Task 4 (follow-up): Refresh token persistence and runtime refresh + +**Files:** + +- Modify: [`src/config.rs`](../../../src/config.rs) — optional fields for Notion OAuth refresh (YAGNI: only what’s needed). +- Modify: [`src/mcp.rs`](../../../src/mcp.rs) or new small module — refresh before connect or on 401. + +Per [build-mcp-client](https://developers.notion.com/guides/mcp/build-mcp-client) Step 8 and Notion’s refresh-token rotation note: + +- [ ] **Step 1:** Persist `refresh_token` from setup callback into config or a sidecar JSON next to config (document permissions). +- [ ] **Step 2:** Store `client_id` (and `client_secret` if any) required for refresh — same as registration result from setup. +- [ ] **Step 3:** On startup or on `invalid_token` when expiry is plausible, call token endpoint with `grant_type=refresh_token`; atomically update stored tokens. + +--- + +## Self-review (writing-plans) + +| Requirement | Task | +|-------------|------| +| Explain `invalid_token` vs wrong token type | Root cause section | +| Align with Notion official OAuth + MCP HTTP | References + Task 4 | +| User-unblocking path | Task 0 | +| Prevent repeat mistakes | Task 1–2 | +| Long-running bot | Task 4 | + +--- + +## Execution handoff + +**Plan saved to:** `docs/superpowers/plans/2026-04-12-notion-mcp-invalid-token-root-cause-and-fix.md` + +**Two execution options:** + +1. **Subagent-driven (recommended)** — One subagent per task; review between tasks. **Skill:** `superpowers:subagent-driven-development`. + +2. **Inline** — Same session, checkpoints. **Skill:** `superpowers:executing-plans`. + +--- + +## References + +- [Integrating your own MCP client](https://developers.notion.com/guides/mcp/build-mcp-client) — discovery, PKCE, Bearer on streamable HTTP, refresh. +- [Get started with Notion MCP](https://developers.notion.com/guides/mcp/get-started-with-mcp) — endpoint URL. +- Related internal plan: [`2026-04-12-notion-mcp-oauth.md`](./2026-04-12-notion-mcp-oauth.md). diff --git a/docs/superpowers/plans/2026-04-12-notion-mcp-oauth.md b/docs/superpowers/plans/2026-04-12-notion-mcp-oauth.md new file mode 100644 index 0000000..e357aa6 --- /dev/null +++ b/docs/superpowers/plans/2026-04-12-notion-mcp-oauth.md @@ -0,0 +1,208 @@ +# Notion MCP `invalid_token` / AuthRequired Fix + +> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. + +**Goal:** Eliminate misleading configuration and implement **OAuth 2.0 Authorization Code + PKCE** so users obtain a valid **Notion MCP access token** during setup and RustFox can connect to `https://mcp.notion.com/mcp` without `AuthRequired` / `invalid_token` failures. + +**Architecture:** Notion’s hosted server requires OAuth (not an internal integration secret) per [Connecting to Notion MCP](https://developers.notion.com/guides/mcp/get-started-with-mcp) and [Integrating your own MCP client](https://developers.notion.com/guides/mcp/build-mcp-client). RustFox already embeds the setup SPA from [`setup/index.html`](../../../setup/index.html) via [`src/bin/setup.rs`](../../../src/bin/setup.rs) (Axum on a local port, default **8719**). The **primary delivery path** is: extend that setup binary with OAuth discovery + PKCE endpoints; update the **Notion modal and Step 5 UI** so the user clicks **Sign in with Notion**, completes OAuth in a browser window, and the callback delivers the **access token** (and optionally **refresh token**) into the form so generated `config.toml` contains `url = "https://mcp.notion.com/mcp"` and `auth_token = ""` (raw string, no `Bearer ` prefix). The main bot’s [`src/mcp.rs`](../../../src/mcp.rs) continues to pass that value to `StreamableHttpClientTransportConfig::auth_header` ([rmcp docs](https://docs.rs/rmcp/latest/rmcp/transport/streamable_http_client/struct.StreamableHttpClientTransportConfig.html)). A **follow-up** task adds **token refresh** at runtime so expired access tokens do not break the bot without re-running setup. + +**Tech Stack:** Rust (`src/bin/setup.rs`, `src/mcp.rs`), Axum, `oauth2` + PKCE (`PkceCodeChallenge`), optional `reqwest` for discovery HTTP GETs, vanilla JS in `setup/index.html`, Notion MCP docs. + +--- + +## Root cause (from logs and docs) + +Log excerpt: + +- `AuthRequired` / `www_authenticate_header: "Bearer realm=\"OAuth\", error=\"invalid_token\""` during `initialize` on `https://mcp.notion.com/mcp`. + +Interpretation: + +1. **Notion MCP** requires user OAuth; custom clients must implement discovery + PKCE + refresh per [build-mcp-client](https://developers.notion.com/guides/mcp/build-mcp-client). +2. A **Notion internal integration secret** (`ntn_` / `secret_`) is the **wrong credential type** for this hosted MCP endpoint unless Notion explicitly documents otherwise for your integration type. +3. **rmcp `auth_header`** expects the **raw bearer token without the `Bearer ` prefix**. + +**Current setup wizard gap:** The Notion modal in `setup/index.html` still describes **internal integration + paste secret** (`#notion-modal`, lines ~489–540), and `MCP_CATALOG` uses `authTokenVar:'NOTION_TOKEN'` with placeholder “Notion integration token”—which matches the old mental model, not OAuth. [`src/bin/setup.rs`](../../../src/bin/setup.rs) parses `url` / `auth_token` from TOML for forward compatibility but **does not expose them** in `ExistingMcpServer` / `GET /api/load-config`, so HTTP MCP entries are dropped when reloading an existing config. + +--- + +### Task 1: Align `config.example.toml` and internal docs with Notion’s auth model + +**Files:** + +- Modify: `config.example.toml` (Notion HTTP MCP example block) +- Optional: `README.md` setup section if it still says “integration token” for hosted Notion MCP + +- [ ] **Step 1: Replace the misleading Notion example** + +- State that `auth_token` must be an **OAuth access token** from the setup wizard’s **Sign in with Notion** flow (once Task 3 ships), or from another OAuth-capable client that implements Notion’s MCP OAuth—**not** an internal integration secret. +- Link: [get-started-with-mcp](https://developers.notion.com/guides/mcp/get-started-with-mcp), [build-mcp-client](https://developers.notion.com/guides/mcp/build-mcp-client). + +- [ ] **Step 2: Add “Alternatives” in the same comment block** + +- Legacy **open-source** [`notion-mcp-server`](https://github.com/makenotion/notion-mcp-server) with an API token (deprecated). +- **stdio bridge:** `npx -y mcp-remote https://mcp.notion.com/mcp` ([Notion troubleshooting](https://developers.notion.com/guides/mcp/get-started-with-mcp))—still requires OAuth for first connection. + +- [ ] **Step 3: Commit** + +```bash +git add config.example.toml +git commit -m "docs(mcp): clarify Notion hosted MCP OAuth vs integration secret" +``` + +--- + +### Task 2: Normalize `auth_token` for HTTP MCP in the bot + +**Files:** + +- Modify: `src/mcp.rs` (`connect_http`) + +- [ ] **Step 1: Strip accidental `Bearer ` prefix** + +```rust +fn normalize_bearer_token(raw: Option) -> String { + let Some(s) = raw else { + return String::new(); + }; + let t = s.trim(); + t.strip_prefix("Bearer ") + .or_else(|| t.strip_prefix("bearer ")) + .map(str::trim) + .unwrap_or(t) + .to_string() +} +``` + +Use `normalize_bearer_token(config.auth_token.clone())` where `auth_header` is set. + +- [ ] **Step 2: Run checks** + +```bash +cargo fmt +cargo clippy -- -D warnings +cargo test +``` + +- [ ] **Step 3: Commit** + +```bash +git add src/mcp.rs +git commit -m "fix(mcp): strip Bearer prefix from HTTP MCP auth_token" +``` + +--- + +### Task 3: OAuth2 PKCE for Notion — setup wizard UI + `setup.rs` backend + +**Primary references:** + +- [Integrating your own MCP client](https://developers.notion.com/guides/mcp/build-mcp-client) — discovery (RFC 9470 → RFC 8414), PKCE, token endpoint, scopes. +- [get-started-with-mcp](https://developers.notion.com/guides/mcp/get-started-with-mcp) — endpoint URL `https://mcp.notion.com/mcp`. + +**Files:** + +- Modify: [`setup/index.html`](../../../setup/index.html) — Notion modal, catalog labels, JS for OAuth popup / `postMessage` handler. +- Modify: [`src/bin/setup.rs`](../../../src/bin/setup.rs) — new routes, PKCE + token exchange, in-memory session state. +- Modify: `Cargo.toml` — dependencies (`oauth2`, and any HTTP client already available via workspace if shared with main crate; `setup` binary may need `reqwest` for discovery GETs). + +#### 3.1 Load-config: preserve HTTP MCP servers + +- [ ] **Extend `ExistingMcpServer`** with optional `url: String` and `auth_token: String` (default empty). +- [ ] **Update `RawMcpServer`** — remove `#[allow(dead_code)]` on `url` / `auth_token`; map them in `parse_existing_config` when building `ExistingMcpServer`. +- [ ] **Update `setup/index.html`** `loadExistingConfig` / merge logic so Step 5 pre-checks Notion and fills the token field from `auth_token` (not from `env` for HTTP entries). + +#### 3.2 OAuth discovery (setup binary) + +- [ ] **Implement `discover_oauth_metadata(mcp_url: &str) -> ...`** per Notion’s TypeScript sketch in [build-mcp-client](https://developers.notion.com/guides/mcp/build-mcp-client): + - GET `/.well-known/oauth-protected-resource` relative to the MCP origin. + - Follow `authorization_servers[0]` to fetch `/.well-known/oauth-authorization-server` (or equivalent) for `authorization_endpoint`, `token_endpoint`, `registration_endpoint` (if dynamic registration is required). + +#### 3.3 PKCE session state (in-memory) + +- [ ] **Struct** `PendingNotionOAuth { pkce_verifier: String, created: Instant }` stored in `Arc>>` keyed by **cryptographically random `state`** (CSRF). +- [ ] **TTL:** Reject exchanges older than e.g. 15 minutes; prune on access. + +#### 3.4 HTTP routes (Axum) + +Register on the same `Router` as `/` and `/api/load-config`: + +- [ ] **`GET /api/notion/oauth/start`** + - Input (query): optional `redirect` same-origin check. + - Run discovery against `https://mcp.notion.com/mcp` (or configurable base). + - Generate PKCE (`PkceCodeChallenge::new(S256)` via `oauth2` crate). + - If Notion requires **dynamic client registration** (see Notion doc § registration), POST to `registration_endpoint` and cache `client_id` (and `client_secret` if issued) in memory for the token exchange. + - Build authorization URL with scopes from Notion metadata (`scopes_supported` intersection with Notion MCP required scopes—read Notion doc for exact list). + - **`redirect_uri`:** must be fixed and documented, e.g. `http://127.0.0.1:{port}/api/notion/oauth/callback` where `port` is the bound setup server port (use `127.0.0.1` consistently in Notion app settings if pre-registered). + - Return JSON: `{ "authorization_url": "..." }` for the SPA to `window.open`. + +- [ ] **`GET /api/notion/oauth/callback?code=...&state=...`** + - Validate `state`, load PKCE verifier, exchange `code` at `token_endpoint`. + - Return small **HTML** document that: + - Calls `window.opener.postMessage({ type: 'rustfox-notion-oauth', ok: true, access_token: '...', refresh_token: '...', expires_in: N }, window.location.origin)` (or `'*'` if origin matching is awkward during dev—prefer origin). + - `window.close()` after short delay; show error HTML if exchange fails. + +- [ ] **Security notes in code comments:** Setup server is local-only; tokens cross from callback page to SPA via `postMessage`—validate `event.origin` in the parent listener. + +#### 3.5 `setup/index.html` UX + +- [ ] **Replace `#notion-modal` body:** Remove “internal integration / paste secret” steps. Replace with: + - Short explanation: hosted MCP requires **Sign in with Notion** (OAuth). + - Button **Connect Notion** → `fetch('/api/notion/oauth/start')` → `window.open(data.authorization_url, 'notion-oauth', 'width=...')`. + - Listener: `window.addEventListener('message', ...)` → on success, set `state.mcp_selections.notion` env map entry for `NOTION_TOKEN` to `access_token` (or introduce dedicated `state.notion_oauth_token` and teach `buildMcpToml` / checkbox sync to use it). + - Show **Connected** state and masked token preview; **Disconnect** clears stored token. + +- [ ] **Update `MCP_CATALOG` notion row:** Change `authTokenVar` label in UI from “integration token” to **OAuth access token**; keep key name `NOTION_TOKEN` only if it is still used as the form field id—or rename to `NOTION_OAUTH_ACCESS_TOKEN` for clarity (requires JS + save path consistency). + +- [ ] **Placeholder text** for the Notion token input: e.g. “Click Connect Notion or paste access token”. + +- [ ] **Link** [get-started-with-mcp](https://developers.notion.com/guides/mcp/get-started-with-mcp) in the modal subtitle. + +#### 3.6 Tests and manual verification + +- [ ] **Unit tests** in `setup.rs`: discovery parsing (mock JSON fixtures), PKCE state round-trip without network if feasible; or integration test with `curl` against local server (optional). + +- [ ] **Manual:** `cargo run --bin setup` → Step 5 → Connect Notion → complete OAuth → save config → inspect `config.toml` for `[[mcp_servers]]` with `url` + `auth_token`. Run `cargo run --bin rustfox` and confirm MCP connects (see prior log line `Connected to MCP server 'notion'`). + +- [ ] **Commit** (split if large): `feat(setup): Notion MCP OAuth PKCE in setup wizard`, `fix(setup): load-config returns url/auth_token for HTTP MCP`. + +--- + +### Task 4 (follow-up): Runtime refresh for Notion access token + +**Files:** + +- Modify: `src/config.rs` — optional `notion_refresh_token` or generic `mcp_server` OAuth fields (design with YAGNI: only what Notion needs). +- Modify: `src/mcp.rs` — before `serve(transport)`, if token expired or API returns 401, refresh using `refresh_token` and update on-disk config or a sidecar file (document security: file permissions). + +- [ ] **Step 1:** Persist `refresh_token` from Task 3 callback into `config.toml` (or `~/.config/rustfox/notion-mcp.json`) when user saves from setup wizard. +- [ ] **Step 2:** Implement refresh using `token_endpoint` + `oauth2` `refresh_token` grant (reuse discovery or cache endpoints in config). +- [ ] **Step 3:** Document token rotation in `config.example.toml`. + +--- + +## Self-review (writing-plans checklist) + +| Spec / requirement | Covered by | +|--------------------|------------| +| OAuth2 for Notion MCP | Task 3 | +| **setup/index.html** retrieves access token via OAuth | Task 3.5 + 3.4 | +| Explain `invalid_token` / wrong token type | Root cause | +| rmcp `auth_header` format | Task 2 + root cause | +| Load existing HTTP MCP + token in wizard | Task 3.1 | +| Refresh / long-running bot | Task 4 | +| Notion official docs | Links throughout | + +**Placeholder scan:** Registration details (exact scopes, whether dynamic registration is mandatory) depend on Notion’s current API—Task 3.2/3.4 explicitly call for reading [build-mcp-client](https://developers.notion.com/guides/mcp/build-mcp-client) at implementation time and fixing constants accordingly. + +--- + +## Execution handoff + +**Plan updated and saved to `docs/superpowers/plans/2026-04-12-notion-mcp-oauth.md`. Two execution options:** + +1. **Subagent-Driven (recommended)** — Dispatch a fresh subagent per task; review between tasks. **REQUIRED SUB-SKILL:** superpowers:subagent-driven-development. + +2. **Inline Execution** — Execute tasks in this session using checkpoints. **REQUIRED SUB-SKILL:** superpowers:executing-plans. + +**Which approach?** diff --git a/docs/superpowers/plans/2026-05-10-fix-skill-invocation.md b/docs/superpowers/plans/2026-05-10-fix-skill-invocation.md new file mode 100644 index 0000000..77f472a --- /dev/null +++ b/docs/superpowers/plans/2026-05-10-fix-skill-invocation.md @@ -0,0 +1,261 @@ +# Fix Skill Invocation: code-interpreter & arxiv-daily-briefing Implementation Plan + +> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. + +**Goal:** Fix "Unknown built-in tool: code-interpreter" and the broken arxiv-daily-briefing news fetching by converting both instruction skills to subagent skills. + +**Architecture:** Both `code-interpreter` and `arxiv-daily-briefing` are instruction skills (no `model:` in frontmatter). The LLM ignores the "Load with: read_skill_file(...)" guidance in the system prompt and calls them directly as tool names, hitting the "Unknown built-in tool" error. Converting them to subagent skills causes the system prompt to advertise them via `invoke_agent(agent="...", prompt="...")` — a much more explicit calling convention that the LLM reliably follows. The existing `run_subagent` infrastructure automatically bootstraps each subagent with `read_skill_file` and enforces a tools whitelist. + +**Tech Stack:** SKILL.md frontmatter (YAML), Rust bot (no code changes needed), arXiv API (curl over HTTP) + +--- + +## Root Cause Deep Dive + +### Bug 1: `code-interpreter` → "Unknown built-in tool: code-interpreter" + +**Call chain:** +1. LLM receives system prompt listing `code-interpreter (instruction)` with "Load with: read_skill_file(...)" +2. LLM ignores this and generates `tool_call { name: "code-interpreter", arguments: {...} }` +3. `agent.rs::execute_tool("code-interpreter", ...)` hits the `_ =>` fallthrough arm +4. Calls `tools::execute_builtin_tool("code-interpreter", ...)` → `anyhow::bail!("Unknown built-in tool: code-interpreter")` +5. Error string "Tool error: Unknown built-in tool: code-interpreter" returned to LLM + +**Evidence in code:** +- `skills/code-interpreter/SKILL.md` — no `model:` field → classified as instruction skill +- `src/skills/mod.rs::build_context()` — instruction skills listed with "Load with: read_skill_file(...)" +- `src/agent.rs::execute_tool()` line ~1895 — `_ =>` arm calls `execute_builtin_tool` +- `src/tools.rs::execute_builtin_tool()` — last arm `_ => anyhow::bail!("Unknown built-in tool: {}", tool_name)` + +### Bug 2: `arxiv-daily-briefing` not working + +**Same root cause:** `arxiv-daily-briefing/SKILL.md` also has no `model:` field. One of: +- (a) LLM tries to call `arxiv-daily-briefing` directly as a tool (same error path) +- (b) LLM reads the skill via `read_skill_file` but the curl command fails silently (network, timeout, XML parsing difficulty) + +**Secondary concern:** The arXiv API URL in SKILL.md uses date range format `submittedDate:[YYYYMMDD0000+TO+YYYYMMDD2359]`. Verify this query string is properly passed through `sh -c` in the sandbox without shell escaping issues. + +--- + +## File Map + +| File | Change | +|------|--------| +| `skills/code-interpreter/SKILL.md` | Add `model: moonshotai/kimi-k2.5` to frontmatter | +| `skills/arxiv-daily-briefing/SKILL.md` | Add `model: moonshotai/kimi-k2.5` to frontmatter + verify curl command | + +No Rust source changes required. The existing `run_subagent` infrastructure in `src/agent.rs` handles everything once the `model:` field is present. + +--- + +## Task 1: Fix `code-interpreter` skill + +**Files:** +- Modify: `skills/code-interpreter/SKILL.md` + +- [ ] **Step 1: Read the current SKILL.md** + + Verify the exact frontmatter to understand what needs changing. + + ```bash + cat skills/code-interpreter/SKILL.md + ``` + +- [ ] **Step 2: Add `model:` to frontmatter** + + Add `model: moonshotai/kimi-k2.5` to the YAML frontmatter block. The frontmatter should look like: + + ```yaml + --- + name: code-interpreter + description: Execute code snippets and scripts in the sandbox. Supports Python 3 and Node.js. Use for calculations, data processing, file generation, and scripting tasks. + tags: [code, execution, scripting] + model: moonshotai/kimi-k2.5 + tools: + - read_file + - write_file + - execute_command + --- + ``` + + The `tools:` whitelist is already correct — the subagent will have `read_skill_file` (always added by `effective_subagent_tools`), `read_file`, `write_file`, and `execute_command`. + +- [ ] **Step 3: Verify system prompt output changes** + + After saving, restart the bot (or `reload_skills`) and check startup logs for: + ``` + Registered skill: code-interpreter — Execute code snippets... + ``` + Confirm the system prompt now shows: + ``` + Invoke via: `invoke_agent(agent="code-interpreter", prompt="")` + ``` + instead of the old instruction-skill hint. + +- [ ] **Step 4: Test end-to-end** + + Send to the bot: "Use code-interpreter to calculate the factorial of 10 in Python." + + **Expected flow:** + 1. LLM calls `invoke_agent(agent="code-interpreter", prompt="calculate factorial of 10 in Python")` + 2. Subagent bootstraps, reads SKILL.md + 3. Subagent calls `write_file` to write `tmp_script.py` + 4. Subagent calls `execute_command("python3 tmp_script.py")` + 5. Result `3628800` returned to main agent, then to user + + **Previously failing:** LLM called `code-interpreter(...)` directly → "Unknown built-in tool" error + +- [ ] **Step 5: Commit** + + ```bash + git add skills/code-interpreter/SKILL.md + git commit -m "fix(skills): convert code-interpreter to subagent skill + + The LLM was calling 'code-interpreter' as a direct tool call, hitting + the 'Unknown built-in tool: code-interpreter' error. Converting it from + an instruction skill to a subagent skill (adding model: field) causes + the system prompt to advertise it via invoke_agent(), which the LLM + reliably follows. The tools whitelist [read_file, write_file, + execute_command] was already correct." + ``` + +--- + +## Task 2: Fix `arxiv-daily-briefing` skill + +**Files:** +- Modify: `skills/arxiv-daily-briefing/SKILL.md` + +- [ ] **Step 1: Test the curl command manually in the sandbox** + + Before changing the skill, verify the arXiv API is reachable and the curl command works. + Check the sandbox directory from config.toml, then run: + + ```bash + YESTERDAY=$(date -d "yesterday" +%Y%m%d 2>/dev/null || date -v-1d +%Y%m%d) + curl -L --max-time 30 \ + "http://export.arxiv.org/api/query?search_query=cat:cs.AI+AND+submittedDate:[${YESTERDAY}0000+TO+${YESTERDAY}2359]&sortBy=submittedDate&sortOrder=descending&max_results=5" \ + -H "User-Agent: Mozilla/5.0" 2>/dev/null | head -100 + ``` + + If this returns XML with `` elements, the network and URL are fine. + If it fails → investigate network access from the sandbox (may need to adjust the curl command or URL). + +- [ ] **Step 2: Check shell escaping of the date range URL** + + The `execute_command` tool runs via `sh -c ""`. The arXiv URL contains `[` and `]` + which are safe in most shells, but verify by checking if `execute_command` with the exact + curl string returns proper XML. If `[]` causes glob expansion, the SKILL.md instruction + body should add single-quotes around the URL. + + Update the curl example in SKILL.md if needed: + ```bash + curl -L --max-time 30 'http://export.arxiv.org/api/query?...' -H "User-Agent: Mozilla/5.0" + ``` + +- [ ] **Step 3: Add `model:` to frontmatter** + + Add `model: moonshotai/kimi-k2.5` to the YAML frontmatter: + + ```yaml + --- + name: arxiv-daily-briefing + description: Fetch yesterday's AI papers from arXiv, summarize them in Cantonese, and produce a concise daily briefing. + model: moonshotai/kimi-k2.5 + tools: + - execute_command + --- + ``` + + The `tools:` list only needs `execute_command` since the subagent: + - Gets `read_skill_file` automatically (always included by `effective_subagent_tools`) + - Reads the SKILL.md to get the curl command and parsing instructions + - Calls `execute_command` with the curl command + - Produces the Cantonese summary from the parsed XML response + +- [ ] **Step 4: Verify activation phrasing** + + After the model is set, the system prompt will show: + ``` + - **arxiv-daily-briefing**: Fetch yesterday's AI papers from arXiv... + Invoke via: `invoke_agent(agent="arxiv-daily-briefing", prompt="")` + ``` + + Update any daily briefing scheduled task or user-facing instructions that reference + triggering this skill, if needed. + +- [ ] **Step 5: Test end-to-end** + + Send to the bot: "Give me today's arXiv AI daily briefing." + + **Expected flow:** + 1. LLM calls `invoke_agent(agent="arxiv-daily-briefing", prompt="Give me today's arXiv AI briefing")` + 2. Subagent reads SKILL.md + 3. Subagent calculates yesterday's date + 4. Subagent calls `execute_command("curl -L ...")` + 5. Subagent parses XML, selects top papers, summarizes in Cantonese + 6. Result returned to main agent, then to user + +- [ ] **Step 6: Commit** + + ```bash + git add skills/arxiv-daily-briefing/SKILL.md + git commit -m "fix(skills): convert arxiv-daily-briefing to subagent skill + + Same pattern as code-interpreter: LLM was calling the skill name as a + direct tool. Converting to subagent (adding model: field) ensures the + LLM uses invoke_agent() instead. Also verified curl command works and + fixed URL quoting in SKILL.md instructions if needed." + ``` + +--- + +## Task 3: Regression Check & Final Verification + +- [ ] **Step 1: Run cargo check** + + No Rust code changes were made, but verify nothing is broken: + ```bash + cargo check + ``` + +- [ ] **Step 2: Check both skills are in startup logs** + + When bot starts, look for: + ``` + Registered skill: code-interpreter — Execute code snippets... + Registered skill: arxiv-daily-briefing — Fetch yesterday's AI papers... + ``` + +- [ ] **Step 3: Verify the skills appear as subagents in system prompt** + + Use the `/tools` command (or equivalent) in Telegram to confirm the system prompt + lists both skills under "Available Subagent Skills" with `invoke_agent(...)` instructions, + NOT as "(instruction)" skills. + +- [ ] **Step 4: Run existing tests** + + ```bash + cargo test + ``` + + All existing tests should pass. No new tests required for SKILL.md frontmatter changes. + +--- + +## Notes for Implementer + +- The `effective_subagent_tools()` helper in `src/agent.rs` always adds `read_skill_file` + and `read_agent_file` to the subagent's tool list, regardless of what's declared in `tools:`. + This is why subagents always bootstrap by reading their SKILL.md first. + +- The `run_subagent()` function uses `chat_with_model()` (not the default `chat()`), + passing the skill's `model:` value. Both skills will use `moonshotai/kimi-k2.5`. + +- If a different (faster/cheaper) model is preferred for code-interpreter, use e.g. + `google/gemini-flash-1.5` or `anthropic/claude-haiku-3`. Just set the `model:` field + accordingly in SKILL.md. + +- The `arxiv-daily-briefing` subagent only needs `execute_command` in its tools list. + It does NOT need `read_file`/`write_file` since it processes the curl output in-memory + (LLM parses the stdout XML response directly). diff --git a/src/config.rs b/src/config.rs index 3784762..ca0d9ce 100644 --- a/src/config.rs +++ b/src/config.rs @@ -264,7 +264,7 @@ fn default_system_prompt() -> String { You have skills. For every user request:\n\ - Check if a relevant skill exists (listed in your system context)\n\ - If yes: load and follow it via read_skill_file before responding\n\ - - If no matching skill: reason directly, or use code-interpreter for computation/scripting tasks\n\ + - If no matching skill: reason directly, or load the code-interpreter skill via read_skill_file for computation/scripting tasks\n\ - For complex multi-step problems: invoke the problem-solver subagent\n\ \n\ ## Sandbox\n\ diff --git a/src/skills/mod.rs b/src/skills/mod.rs index 6d5e464..5686ad6 100644 --- a/src/skills/mod.rs +++ b/src/skills/mod.rs @@ -75,7 +75,12 @@ impl SkillRegistry { let mut subagent_section = String::new(); for skill in self.skills.values() { - if skill.model.is_some() { + // A skill is treated as a subagent if it has a model explicitly set + // OR if it declares a tools whitelist (meaning it needs sandboxed execution). + // In both cases the LLM invokes it via invoke_agent(); the model used is + // the explicitly set one, or the global default from config when absent. + let is_subagent = skill.model.is_some() || !skill.tools.is_empty(); + if is_subagent { // Subagent skill — metadata only subagent_section.push_str(&format!( "- **{}**: {}\n Invoke via: `invoke_agent(agent=\"{}\", prompt=\"\")`\n",