feat(telegram): add Telegram bot with multi-user support and progress…#3
Conversation
… streaming Implements a Telegram bot via long-polling (no public IP required) that follows the same architecture as the existing DingTalk and Lark integrations. Key changes: - internal/robot/telegram.go: new bot implementation using Telegram Bot API HTTP calls (no new dependencies). Features: long-polling with exponential reconnect backoff, multi-user sessions (AllowedUserIDs whitelist), group chat @ mention filtering, live progress streaming via throttled message edits, typing action indicator, automatic message splitting at 4096 chars. - internal/robot/conn.go: adds StreamingMessageHandler interface so Telegram can receive agent tool-call progress events. - internal/handler/robot.go: adds HandleMessageStream() satisfying the new interface; command dispatch is instant, agent messages use streaming. - internal/handler/agent.go: adds ProcessMessageForRobotStream() with a notifyFn callback that fires on tool_call/tool_result/progress events. - internal/config/config.go: adds RobotTelegramConfig (BotToken, AllowedUserIDs) and Telegram field to RobotsConfig. - internal/app/app.go: wires telegramCancel into startRobotConnections(), RestartRobotConnections(), and Shutdown(). - web/templates/index.html + web/static/js/settings.js: Telegram section in Bot Settings (enable toggle, bot token, allowed user IDs field). - config.yaml: telegram block with commented defaults. - docs/robot_en.md: full Telegram setup guide (sections 3.3, 9, 12, 13). - ROADMAP.md: marks Telegram as shipped; adds near-term Telegram items and a detailed Telegram roadmap table. https://claude.ai/code/session_01EHroFMw7DJwUuszDzcFkpk
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly expands the bot integration capabilities by adding a robust Telegram bot. This new bot provides users with an interactive and responsive experience, including live progress updates during AI agent tasks, and is designed for easy setup and secure multi-user access. The changes ensure that the system's conversational AI features are accessible and performant across a wider range of messaging platforms. Highlights
Changelog
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a comprehensive Telegram bot integration with multi-user support, progress streaming, and UI configuration. However, two significant security issues were identified: Broken Access Control (IDOR), where bot commands (list, switch, delete) lack conversation ownership enforcement, potentially leading to data leakage and unauthorized modification; and Sensitive Information Leakage, as the Telegram bot token might be logged if HTTP requests fail. Additionally, the review highlights opportunities to improve maintainability by reducing code duplication in the handler layer and enhancing Telegram API error handling robustness through refactoring to centralize logic.
| case strings.HasPrefix(text, robotCmdSwitch+" ") || strings.HasPrefix(text, robotCmdContinue+" ") || strings.HasPrefix(text, "switch ") || strings.HasPrefix(text, "continue "): | ||
| var id string | ||
| switch { | ||
| case strings.HasPrefix(text, robotCmdSwitch+" "): | ||
| id = strings.TrimSpace(text[len(robotCmdSwitch)+1:]) | ||
| case strings.HasPrefix(text, robotCmdContinue+" "): | ||
| id = strings.TrimSpace(text[len(robotCmdContinue)+1:]) | ||
| case strings.HasPrefix(text, "switch "): | ||
| id = strings.TrimSpace(text[7:]) | ||
| default: | ||
| id = strings.TrimSpace(text[9:]) | ||
| } | ||
| return h.cmdSwitch(platform, userID, id) |
There was a problem hiding this comment.
This section of the RobotHandler implementation has a critical security vulnerability: it lacks ownership checks for conversations. This means any authorized user can use commands like list and switch to access or manipulate sensitive data belonging to other users. To fix this, implement user_id or owner_id in the conversations table, store the userID upon conversation creation, and enforce ownership verification in cmdSwitch.
Furthermore, the HandleMessageStream function duplicates the command dispatching switch block found in HandleMessage. Extracting this common logic into a private helper function would significantly reduce redundancy and improve maintainability.
| case strings.HasPrefix(text, robotCmdDelete+" ") || strings.HasPrefix(text, "delete "): | ||
| var convID string | ||
| if strings.HasPrefix(text, robotCmdDelete+" ") { | ||
| convID = strings.TrimSpace(text[len(robotCmdDelete)+1:]) | ||
| } else { | ||
| convID = strings.TrimSpace(text[7:]) | ||
| } | ||
| return h.cmdDelete(platform, userID, convID) |
There was a problem hiding this comment.
Similar to the switch command, the delete command lacks ownership verification. Any authorized bot user can delete any conversation in the system if they know its ID. This could lead to unauthorized data loss.
Remediation: Ensure that the userID of the requester is checked against the conversation owner before performing the deletion.
| apiURL: fmt.Sprintf("%s/bot%s", telegramAPIBase, cfg.BotToken), | ||
| cfg: cfg, | ||
| h: h, | ||
| logger: logger, | ||
| allowedSet: allowedSet, | ||
| } | ||
|
|
||
| logger.Info("Telegram bot connecting...") | ||
| err := bot.runPollLoop(ctx) | ||
|
|
||
| if ctx.Err() != nil { | ||
| logger.Info("Telegram bot stopped per configuration reload") | ||
| return | ||
| } | ||
| if err != nil { | ||
| logger.Warn("Telegram bot polling error, will reconnect", zap.Error(err), zap.Duration("retry_after", backoff)) |
There was a problem hiding this comment.
The Telegram bot token is embedded directly in the API URL. If an HTTP request to the Telegram API fails (e.g., due to a network timeout or connection error), the resulting error object returned by Go's http package typically includes the full URL, including the token. This error is then logged using zap.Warn (line 121), which will write the bot token into the application logs in plaintext.
To remediate this, you should wrap the error or use a custom error handler that scrubs the sensitive token from the URL before logging.
| func (h *AgentHandler) ProcessMessageForRobotStream( | ||
| ctx context.Context, | ||
| conversationID, message, role string, | ||
| notifyFn func(step string), | ||
| ) (response string, convID string, err error) { | ||
| if conversationID == "" { | ||
| title := safeTruncateString(message, 50) | ||
| conv, createErr := h.db.CreateConversation(title) | ||
| if createErr != nil { | ||
| return "", "", fmt.Errorf("failed to create conversation: %w", createErr) | ||
| } | ||
| conversationID = conv.ID | ||
| } else { | ||
| if _, getErr := h.db.GetConversation(conversationID); getErr != nil { | ||
| return "", "", fmt.Errorf("conversation does not exist") | ||
| } | ||
| } | ||
|
|
||
| agentHistoryMessages, err := h.loadHistoryFromReActData(conversationID) | ||
| if err != nil { | ||
| historyMessages, getErr := h.db.GetMessages(conversationID) | ||
| if getErr != nil { | ||
| agentHistoryMessages = []agent.ChatMessage{} | ||
| } else { | ||
| agentHistoryMessages = make([]agent.ChatMessage, 0, len(historyMessages)) | ||
| for _, msg := range historyMessages { | ||
| agentHistoryMessages = append(agentHistoryMessages, agent.ChatMessage{Role: msg.Role, Content: msg.Content}) | ||
| } | ||
| } | ||
| } | ||
|
|
||
| finalMessage := message | ||
| var roleTools, roleSkills []string | ||
| if role != "" && role != "Default" && h.config.Roles != nil { | ||
| if r, exists := h.config.Roles[role]; exists && r.Enabled { | ||
| if r.UserPrompt != "" { | ||
| finalMessage = r.UserPrompt + "\n\n" + message | ||
| } | ||
| roleTools = r.Tools | ||
| roleSkills = r.Skills | ||
| } | ||
| } | ||
|
|
||
| if _, err = h.db.AddMessage(conversationID, "user", message, nil); err != nil { | ||
| return "", "", fmt.Errorf("failed to save user message: %w", err) | ||
| } | ||
|
|
||
| assistantMsg, err := h.db.AddMessage(conversationID, "assistant", "Processing...", nil) | ||
| if err != nil { | ||
| h.logger.Warn("Robot stream: failed to create assistant message placeholder", zap.Error(err)) | ||
| } | ||
| var assistantMessageID string | ||
| if assistantMsg != nil { | ||
| assistantMessageID = assistantMsg.ID | ||
| } | ||
|
|
||
| // Build a sendEventFunc that forwards relevant events to notifyFn | ||
| var sendEventFunc func(eventType, message string, data interface{}) | ||
| if notifyFn != nil { | ||
| sendEventFunc = func(eventType, evtMessage string, data interface{}) { | ||
| switch eventType { | ||
| case "tool_call": | ||
| if dataMap, ok := data.(map[string]interface{}); ok { | ||
| if toolName, ok := dataMap["toolName"].(string); ok && toolName != "" { | ||
| notifyFn("calling tool: " + toolName) | ||
| return | ||
| } | ||
| } | ||
| if evtMessage != "" { | ||
| notifyFn(evtMessage) | ||
| } | ||
| case "tool_result": | ||
| if dataMap, ok := data.(map[string]interface{}); ok { | ||
| if toolName, ok := dataMap["toolName"].(string); ok && toolName != "" { | ||
| notifyFn("tool result: " + toolName) | ||
| return | ||
| } | ||
| } | ||
| case "progress": | ||
| if evtMessage != "" { | ||
| notifyFn(evtMessage) | ||
| } | ||
| } | ||
| } | ||
| } | ||
|
|
||
| progressCallback := h.createProgressCallback(conversationID, assistantMessageID, sendEventFunc) | ||
|
|
||
| result, err := h.agent.AgentLoopWithProgress(ctx, finalMessage, agentHistoryMessages, conversationID, progressCallback, roleTools, roleSkills) | ||
| if err != nil { | ||
| errMsg := "Execution failed: " + err.Error() | ||
| if assistantMessageID != "" { | ||
| _, _ = h.db.Exec("UPDATE messages SET content = ? WHERE id = ?", errMsg, assistantMessageID) | ||
| _ = h.db.AddProcessDetail(assistantMessageID, conversationID, "error", errMsg, nil) | ||
| } | ||
| return "", conversationID, err | ||
| } | ||
|
|
||
| if assistantMessageID != "" { | ||
| mcpIDsJSON := "" | ||
| if len(result.MCPExecutionIDs) > 0 { | ||
| jsonData, _ := json.Marshal(result.MCPExecutionIDs) | ||
| mcpIDsJSON = string(jsonData) | ||
| } | ||
| _, err = h.db.Exec( | ||
| "UPDATE messages SET content = ?, mcp_execution_ids = ? WHERE id = ?", | ||
| result.Response, mcpIDsJSON, assistantMessageID, | ||
| ) | ||
| if err != nil { | ||
| h.logger.Warn("Robot stream: failed to update assistant message", zap.Error(err)) | ||
| } | ||
| } else { | ||
| if _, err = h.db.AddMessage(conversationID, "assistant", result.Response, result.MCPExecutionIDs); err != nil { | ||
| h.logger.Warn("Robot stream: failed to save assistant message", zap.Error(err)) | ||
| } | ||
| } | ||
| if result.LastReActInput != "" || result.LastReActOutput != "" { | ||
| _ = h.db.SaveReActData(conversationID, result.LastReActInput, result.LastReActOutput) | ||
| } | ||
| return result.Response, conversationID, nil | ||
| } |
There was a problem hiding this comment.
This new function ProcessMessageForRobotStream duplicates a significant amount of logic from the existing ProcessMessageForRobot function. Specifically, the setup code for conversation handling, history loading, and role application, as well as the teardown code for saving results, appears to be nearly identical. To improve maintainability and reduce redundancy, consider refactoring the common logic into shared private helper functions. For example, you could have a helper for preparing the agent call and another for finalizing the results, which both ProcessMessageForRobot and ProcessMessageForRobotStream could use. This would make the code cleaner and easier to maintain in the future.
| if err := b.apiPost(ctx, "sendMessage", params, &msg); err != nil { | ||
| // Fall back to plain text if Markdown parse fails | ||
| params["parse_mode"] = "" | ||
| if err2 := b.apiPost(ctx, "sendMessage", params, &msg); err2 != nil { | ||
| b.logger.Warn("Telegram sendMessage failed", zap.Error(err2), zap.Int("part", i)) | ||
| continue | ||
| } | ||
| } |
There was a problem hiding this comment.
The fallback logic here in sendMessage (and similarly in editMessageText at line 237) retries the API call without Markdown for any error. This could lead to unnecessary retries for non-parsing errors (e.g., network issues, other API errors). To make this more robust, consider making the retry conditional on a specific Markdown parsing error. The Telegram API typically returns an error message containing can't parse entities for such failures. Checking for this substring in the error would make the fallback more targeted and efficient.
… streaming
Implements a Telegram bot via long-polling (no public IP required) that follows the same architecture as the existing DingTalk and Lark integrations.
Key changes: