telegram-all MCP reconnect fails with -32000 on Claude Code startup #31

@kiki830621

Description

Problem

Original brief (from the /mcp output; the user's trailing note is translated from Chinese):

❯ /mcp
  ⎿  Failed to reconnect to plugin:che-telegram-mcp:telegram-all: -32000 maybe you should take a close look at /plugin-update

— Source: Claude Code prompt on 2026-05-21 session startup, working directory /Users/che/Developer/che-msg

When the Claude Code session starts, running /mcp shows that plugin:che-telegram-mcp:telegram-all cannot reconnect, with JSON-RPC error code -32000 (a generic server error).

For comparison, the startup hook output from the same session (all normal):

  • che-apple-notes-mcp v0.2.0 (latest)
  • che-ical-mcp v1.10.0 (latest)
  • telegram-all (TDLib) v0.5.0 installed: /Users/che/bin/CheTelegramAllMCP
  • telegram-bot v0.5.0 installed: /Users/che/bin/CheTelegramBotMCP
  • ✓ API ID / API Hash / Bot Token are all in Keychain

In other words, the v0.5.0 binary file is present and the startup hook's health check passed, but Claude Code's MCP transport fails on reconnect.

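User observation (from the brief, translated from Chinese):

Maybe you should take a close look at /plugin-update

Suspicion: the plugin auto-upgrade flow (the sidecar-versioning binary auto-download added in che-telegram-mcp plugin v1.3.0) may have a race, a leftover process, or uncleared state. This is only a hypothesis; the actual root cause is still to be diagnosed.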

Type

bug

Expected

  • /mcp shows telegram-all ✓ connected
  • mcp__plugin_che-telegram-mcp_telegram-all__* tools are available in the tool list (send_message / search / get_chat_history / dump_chat_to_markdown, etc.)

Actual

  • /mcp shows Failed to reconnect to plugin:che-telegram-mcp:telegram-all: -32000
  • The corresponding MCP tools do not appear in this session's tool list
  • The other MCP servers (telegram-bot, che-ical-mcp, che-apple-notes-mcp) all work in the same session, so the problem is isolated to telegram-all
  • The binary itself exists; the startup hook confirms version v0.5.0

Trigger

Seen in /mcp as soon as the Claude Code session starts; this is not a mid-runtime disconnect

Impact

  • The personal-account Telegram MCP is completely unusable: send / search / get_chat_history / dump_chat_to_markdown all fail
  • Every workflow step that pulls sources from Telegram or tags conversations is blocked
  • Priority: P1 (to be resolved this week)

Environment

  • Date: 2026-05-21
  • Platform: darwin 25.4.0
  • Working directory: /Users/che/Developer/che-msg
  • Binary path: /Users/che/bin/CheTelegramAllMCP v0.5.0 (universal, Developer ID signed + notarized per release pipeline)
  • Plugin: che-telegram-mcp from psychquant-claude-plugins marketplace
  • MCP server id within the plugin: telegram-all

Diagnosis hints (to be verified by idd-diagnose)

  1. Stderr log: check the log file for telegram-all under ~/Library/Logs/Claude/mcp-server-* (note that on macOS the log is named after the display_name, not the technical id)
  2. Run the binary manually to see whether it crashes on stdin: echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{}}' | /Users/che/bin/CheTelegramAllMCP
  3. Stale process / pipe: ps aux | grep CheTelegramAllMCP, and look for leftovers
  4. Plugin wrapper order: inspect the che-telegram-mcp plugin's .mcp.json + wrapper script to confirm the path / args used to spawn the binary at session start
  5. Timing relative to /plugin-update: confirm whether plugin-update / auto-upgrade ran between the last successful connection and this failure
  6. Compare with the telegram-bot success path: telegram-bot in the same plugin works in the same session, so the difference narrows to telegram-all-specific code (TDLib bootstrap / auth flow / receive loop)
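
Hints 2 and 3 could be wrapped into a small repeatable check. This is a minimal sketch, assuming the server answers initialize with a single line of JSON on stdout and exits once stdin closes; a real TDLib-backed server may need fuller params or may keep running. The helper name probe_mcp is hypothetical; the request body is the one from hint 2.

```shell
# probe_mcp CMD [ARGS...]: send one initialize request and classify the first reply line.
# Hypothetical helper for manual diagnosis, not part of the plugin.
probe_mcp() {
  request='{"jsonrpc":"2.0","id":1,"method":"initialize","params":{}}'
  # Feed the request on stdin, discard stderr, keep only the first stdout line.
  reply=$(printf '%s\n' "$request" | "$@" 2>/dev/null | head -n 1)
  case "$reply" in
    *'"error"'*)  printf 'error: %s\n' "$reply"; return 1 ;;
    *'"result"'*) printf 'ok\n'; return 0 ;;
    *)            printf 'no-reply\n'; return 2 ;;
  esac
}

# Usage (paths taken from this issue):
#   probe_mcp /Users/che/bin/CheTelegramAllMCP
#   ps aux | grep '[C]heTelegramAllMCP'   # the [C] keeps grep from matching itself
```

A "no-reply" result points at a crash or a silent exit before the handshake, while "error: ..." means the process is alive enough to speak JSON-RPC and the message itself is worth reading.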

Cross-references

  • Memory: feedback_binary_plugin_auto_upgrade.md. Previously the wrapper lacked a version check, which left the v0.4.3 binary stuck; this was fixed in plugin v1.3.0 (sidecar versioning + DESIRED_VERSION + atomic mv)
  • This issue may be a follow-on regression of that mechanism, or a new failure mode

Clarification (added during /idd-diagnose 2026-05-21)

Q: Acceptance criteria: which outcome counts as "done" for this issue?
A: (b) When blocked, surface a human-readable error plus a recovery command → corresponds to Diagnosis Strategy A+D.

Concretely:

  • The wrapper's lock-refused branch emits an MCP-shaped JSON-RPC error to stdout (replacing the current behavior of only echoing to stderr and exiting 1)
  • In /mcp the user sees "Another instance running (PID NNNN). Kill it first or use that Claude Code window." rather than the generic -32000
  • Add a "Multi-session limitation" section to the plugin README explaining the TDLib upstream constraint
  • Not doing: auto-reap of the stale binary (B/C are too dangerous; user explicitly skipped)
  • Not doing: docs only (D-only; the user explicitly chose a UX fix in addition)
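
For illustration, the lock-refused branch could emit its envelope roughly as follows. This is a sketch under assumptions, not the shipped wrapper: the function name and argument order are hypothetical, the message text is the one quoted in this issue, and the PID is assumed to be numeric so no JSON escaping is attempted.

```shell
# emit_lock_refused PID [ID]: print a JSON-RPC 2.0 error response on stdout.
# ID must already be a valid JSON token (a number, a quoted string, or null).
# Hypothetical helper; the real wrapper may be structured differently.
emit_lock_refused() {
  holder_pid=$1
  request_id=${2:-null}
  printf '{"jsonrpc":"2.0","id":%s,"error":{"code":-32000,"message":"Another instance running (PID %s). Kill it first or use that Claude Code window."}}\n' \
    "$request_id" "$holder_pid"
}

# Usage inside a lock-refused branch (sketch):
#   emit_lock_refused "$holder_pid" "$request_id"
#   exit 1
```

Writing the envelope to stdout matters because the MCP stdio transport reads protocol messages from there; stderr only ends up in logs.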

Sub-decisions not yet made but still open:

  • Whether to also add a session-start hook that warns first (option B, can be combined with b)
  • Whether to also add a binary --check-stale subcommand (part of Strategy D)
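
A read-only stale check, in the spirit of the --check-stale idea, might look like the sketch below. It only reports and never kills or removes anything. The lock layout is an assumption: the issue does not say where the wrapper records the holder PID, so the file name pid inside the lock directory is hypothetical, as is the function name.

```shell
# check_stale LOCK_DIR: report whether the lock is free, held, or stale.
# Assumes (hypothetically) that the holder's PID is stored in LOCK_DIR/pid.
check_stale() {
  lock_dir=$1
  if [ ! -d "$lock_dir" ]; then
    printf 'free\n'; return 0
  fi
  holder=$(cat "$lock_dir/pid" 2>/dev/null) || holder=""
  if [ -z "$holder" ]; then
    printf 'stale (no PID recorded)\n'; return 1
  fi
  # kill -0 sends no signal; it only tests whether the process can be signalled.
  # It also fails for processes owned by another user, which this sketch ignores.
  if kill -0 "$holder" 2>/dev/null; then
    printf 'held by PID %s\n' "$holder"; return 2
  fi
  printf 'stale (PID %s is gone)\n' "$holder"; return 1
}

# Usage (lock path as it appears in this issue's recovery command):
#   check_stale ~/.cache/che-telegram-all-mcp.lock
```

Keeping the check report-only matches the decision above to skip auto-reaping: the user still decides whether to kill the holder.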

Linked-Context Siblings Filed (v2.48.0+ #529)

(none — no orphan sibling mentions in linked context)

Current Status

Phase: closed
Last updated: 2026-05-25 by /idd-close (PR #90 merged d4d7c2e)

Key Decisions

  • Empirical re-test concluded 2026-05-22 via expect-driven fresh Claude Code session + --debug mcp log capture
    • Cache wrapper swap → spawn fresh session → /mcp → captured /tmp/cc-mcp-debug-31.log (1978 lines)
    • Debug log line 1295 confirms: Claude Code's MCP transport DID parse our envelope + stored full message:
      Connection failed: MCP error -32000:
      Another instance of CheTelegramAllMCP is already running (lock held by PID 11252).
      Use the existing Claude Code window, or kill the previous wrapper first.
      
    • Verdict: PR-90 + PR-1b satisfies acceptance criterion (b) at the MCP protocol level
  • PR-1b shipped on same branch (idd/31-mcp-error-surface, commit 9387965 + CHANGELOG update 38a5ffd):
    • read_initialize_id() reads stdin with 2s timeout, extracts JSON-RPC id, falls back to null on timeout/parse failure
    • jq preferred, bash regex fallback for portability
    • Both lock-refused branches (flock + mkdir) updated
    • 2 new tests in test-wrapper-mcp-error.sh: id-matching + timeout fallback. All 6 GREEN.
    • Existing test-wrapper-pid.sh 10/10 still GREEN (no regression)
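
The id-matching behavior described above can be sketched as follows. This is not the shipped read_initialize_id, only an illustration of the same idea under assumptions: the request arrives as one line on stdin, bash is the interpreter (read -t and [[ =~ ]] are bash features), and the regex fallback handles only numeric or simple string ids.

```shell
#!/usr/bin/env bash
# read_initialize_id [TIMEOUT]: print the JSON-RPC id of the first stdin line,
# or the literal null on timeout, EOF, or parse failure. Illustrative sketch only.
read_initialize_id() {
  local timeout=${1:-2} line="" id=""
  local re='"id"[[:space:]]*:[[:space:]]*([0-9]+|"[^"]*")'
  # read fails on timeout or EOF; either way we continue with whatever was read.
  IFS= read -r -t "$timeout" line || true
  # Preferred path: let jq parse the line when it is installed.
  if [ -n "$line" ] && command -v jq >/dev/null 2>&1; then
    id=$(printf '%s' "$line" | jq -c '.id // empty' 2>/dev/null) || id=""
  fi
  # Fallback path: pull the id out with a bash regex.
  if [ -z "$id" ] && [[ $line =~ $re ]]; then
    id=${BASH_REMATCH[1]}
  fi
  printf '%s\n' "${id:-null}"
}

# Usage (sketch):
#   request_id=$(read_initialize_id 2)
```

Because the result is already a JSON token (7, "abc", or null), it can be placed directly into the "id" field of the error envelope, so the client can pair the error with its pending initialize request.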
  • Known UX gap (out of plugin scope): Claude Code's /mcp short-list UI may always show -32000 truncated form, not the full message. The full message IS captured internally and surfaces in:
    • --debug mcp log output
    • Any downstream tools/list / tool-call error responses
    • Future Claude Code UI improvements to /mcp may surface it directly

Scope Changes

  • 4th commit 9387965 added during empirical follow-up (PR-1b)
  • 5th commit 38a5ffd documents the empirical findings in CHANGELOG

Blocking

  • (none) — fix is empirically verified at the MCP protocol level
  • Optional: open follow-up issue for Claude Code upstream re: /mcp UI showing full error.message (not in this PR/repo's scope)

Commits (in psychquant-claude-plugins, branch idd/31-mcp-error-surface)

  • 5840b76 — feat: emit MCP JSON-RPC error envelope on lock refusal
  • 3d9f20d — docs: v1.3.2 README multi-session limitation + version bumps
  • a8e396f — fix: address in-scope verify findings (recoveryCommand semicolon + README Version drift)
  • 9387965 — feat: read stdin to match initialize id (PR-1b)
  • 38a5ffd — docs: CHANGELOG v1.3.2 expanded with PR-1b + empirical findings

Verify artifacts

Next

# 1. Merge PR-90 (5 commits total, all on idd/31-mcp-error-surface)
# 2. After merge → /plugin-tools:plugin-update che-telegram-mcp
#    (syncs marketplace + bumps cache to v1.3.2)
# 3. /idd-close #31 — write closing summary (checklist all done)

PR-2 (binary --check-stale in che-msg) remains optional — Plan allows opt-out. Acceptance (b) is met by PR-90 alone.

Recovery for current stale state: the stale processes (PID 11252 + 11266 from Wed 2026-05-20) are still holding the lock. To get telegram-all working in current sessions:

pkill CheTelegramAllMCP 2>/dev/null; rm -rf ~/.cache/che-telegram-all-mcp.lock ~/.cache/che-telegram-all-mcp.lock.flock
# Then Cmd-Q Claude Code + reopen
