fix: cannot receive image, file in dingtalk#5920
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses critical issues preventing the proper reception and processing of image, file, and audio messages from DingTalk. It introduces new parsing logic and robust download mechanisms to ensure that these message types are correctly handled and made available within the system. This significantly improves the integration with DingTalk by enabling full media message support and enhancing the reliability of file downloads. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request fixes issues with receiving images and files from DingTalk by adding handlers for picture, audio, voice, and file message types, and improves the robustness of file downloading by handling different API response structures. However, it introduces a high-severity path traversal vulnerability in the handling of file extensions for audio and file messages. User-supplied extensions are used directly in file path construction without adequate sanitization, allowing for arbitrary file writes on the server. This should be addressed by ensuring that file extensions are properly sanitized (e.g., by taking only the filename part of the extension string). Additionally, I've provided a few suggestions to reorder validation checks to be consistent across different message type handlers, which improves code readability and maintainability.
| voice_ext = cast(str, raw_content.get("fileExtension") or "") | ||
| if not voice_ext: | ||
| voice_ext = "amr" | ||
| voice_ext = voice_ext.lstrip(".") |
There was a problem hiding this comment.
The voice_ext variable is populated from untrusted user input (raw_content.get("fileExtension")) without proper sanitization. This variable is subsequently used to construct a file path in download_ding_file, which leads to a path traversal vulnerability. An attacker can provide a malicious extension like ../../../../tmp/malicious.sh to write files to arbitrary locations on the server. Since the application uses open(path, "wb") to save the downloaded content, this allows for arbitrary file overwrite in any directory the application has write access to.
| voice_ext = cast(str, raw_content.get("fileExtension") or "") | |
| if not voice_ext: | |
| voice_ext = "amr" | |
| voice_ext = voice_ext.lstrip(".") | |
| voice_ext = cast(str, raw_content.get("fileExtension") or "") | |
| if not voice_ext: | |
| voice_ext = "amr" | |
| voice_ext = Path(voice_ext).name.lstrip(".") |
| if not file_ext: | ||
| file_ext = cast(str, raw_content.get("fileExtension") or "") |
There was a problem hiding this comment.
Similar to the voice_ext issue, the file_ext variable is populated from untrusted user input (raw_content.get("fileExtension")) when file_name is missing. This leads to a path traversal vulnerability when constructing the local file path. An attacker can supply a malicious extension to write files to arbitrary locations on the filesystem.
| if not file_ext: | |
| file_ext = cast(str, raw_content.get("fileExtension") or "") | |
| if not file_ext: | |
| file_ext = Path(cast(str, raw_content.get("fileExtension") or "")).name.lstrip(".") |
| if not download_code: | ||
| logger.warning( | ||
| "钉钉富文本图片消息缺少 downloadCode,已跳过" | ||
| ) | ||
| continue | ||
| if not robot_code: | ||
| logger.error( | ||
| "钉钉富文本图片消息解析失败: 回调中缺少 robotCode" | ||
| ) | ||
| continue |
There was a problem hiding this comment.
为了保持一致性,建议将 robot_code 的检查放在 download_code 之前。robot_code 是所有下载操作所必需的,先检查它可以更早地发现并报告这个关键性依赖的缺失。这与 picture 消息类型的处理逻辑也保持了一致。
| if not download_code: | |
| logger.warning( | |
| "钉钉富文本图片消息缺少 downloadCode,已跳过" | |
| ) | |
| continue | |
| if not robot_code: | |
| logger.error( | |
| "钉钉富文本图片消息解析失败: 回调中缺少 robotCode" | |
| ) | |
| continue | |
| if not robot_code: | |
| logger.error( | |
| "钉钉富文本图片消息解析失败: 回调中缺少 robotCode" | |
| ) | |
| continue | |
| if not download_code: | |
| logger.warning( | |
| "钉钉富文本图片消息缺少 downloadCode,已跳过" | |
| ) | |
| continue |
| if not download_code: | ||
| logger.warning("钉钉语音消息缺少 downloadCode,已跳过") | ||
| elif not robot_code: | ||
| logger.error("钉钉语音消息解析失败: 回调中缺少 robotCode") |
There was a problem hiding this comment.
为了保持一致性,建议将 robot_code 的检查放在 download_code 之前。robot_code 是所有下载操作所必需的,先检查它可以更早地发现并报告这个关键性依赖的缺失。这与 picture 消息类型的处理逻辑也保持了一致。
| if not download_code: | |
| logger.warning("钉钉语音消息缺少 downloadCode,已跳过") | |
| elif not robot_code: | |
| logger.error("钉钉语音消息解析失败: 回调中缺少 robotCode") | |
| if not robot_code: | |
| logger.error("钉钉语音消息解析失败: 回调中缺少 robotCode") | |
| elif not download_code: | |
| logger.warning("钉钉语音消息缺少 downloadCode,已跳过") |
| if not download_code: | ||
| logger.warning("钉钉文件消息缺少 downloadCode,已跳过") | ||
| elif not robot_code: | ||
| logger.error("钉钉文件消息解析失败: 回调中缺少 robotCode") |
There was a problem hiding this comment.
为了保持一致性,建议将 robot_code 的检查放在 download_code 之前。robot_code 是所有下载操作所必需的,先检查它可以更早地发现并报告这个关键性依赖的缺失。这与 picture 消息类型的处理逻辑也保持了一致。
| if not download_code: | |
| logger.warning("钉钉文件消息缺少 downloadCode,已跳过") | |
| elif not robot_code: | |
| logger.error("钉钉文件消息解析失败: 回调中缺少 robotCode") | |
| if not robot_code: | |
| logger.error("钉钉文件消息解析失败: 回调中缺少 robotCode") | |
| elif not download_code: | |
| logger.warning("钉钉文件消息缺少 downloadCode,已跳过") |
There was a problem hiding this comment.
Hey - 我发现了两个问题,并给出了一些整体层面的反馈:
- 在
picture、richText、audio|voice和file分支中,提取并校验download_code/robot_code的重复模式可以抽取成一个小的辅助函数,以减少重复,让各个match分支更多地聚焦在各消息类型自身的逻辑上。 - 在
download_ding_file中,你现在同时支持顶层和嵌套的downloadUrl,但代码假定resp.json()一定返回dict;建议在访问键之前增加一个防御性检查,比如isinstance(resp_data, dict)(或类似方式),以避免对格式异常响应导致的运行时错误。
供 AI 代理使用的提示词
Please address the comments from this code review:
## Overall Comments
- The repeated pattern of extracting and validating `download_code`/`robot_code` across `picture`, `richText`, `audio|voice`, and `file` branches could be refactored into a small helper to reduce duplication and keep the `match` cases more focused on message-type-specific logic.
- In `download_ding_file`, you now support both top-level and nested `downloadUrl`, but the code assumes `resp.json()` returns a `dict`; consider adding a defensive `isinstance(resp_data, dict)` check (or similar) before accessing keys to avoid unexpected runtime errors on malformed responses.
## Individual Comments
### Comment 1
<location path="astrbot/core/platform/sources/dingtalk/dingtalk_adapter.py" line_range="209-210" />
<code_context>
+ robot_code,
+ "jpg",
+ )
+ if f_path:
+ abm.message.append(Image.fromFileSystem(f_path))
+ else:
+ logger.warning("钉钉图片消息下载失败,无法解析为图片")
</code_context>
<issue_to_address>
**suggestion:** Consider logging or handling the case where rich-text image download fails.
For `picture` messages we already log image download failures, but rich-text images are silently skipped when `f_path` is falsy. Aligning this path by at least emitting a warning would improve observability and keep behavior consistent across message types.
Suggested implementation:
```python
elif "type" in content and content["type"] == "picture":
download_code = cast(str, content.get("downloadCode") or "")
if not download_code:
logger.warning(
"钉钉富文本图片消息缺少 downloadCode,已跳过"
)
else:
f_path = await self.download_ding_file(
download_code,
robot_code,
"jpg",
)
if f_path:
abm.message.append(Image.fromFileSystem(f_path))
else:
logger.warning("钉钉富文本图片消息下载失败,无法解析为图片")
```
1. Make sure the closing parenthesis for `logger.warning(` in the original code is present if it was omitted in the snippet.
2. Confirm this code is inside an `async` context (it appears to be, since `await` is already used earlier in the same method); if not, adjust accordingly (e.g., by making the method async or changing how `download_ding_file` is invoked).
3. Ensure `robot_code`, `abm`, `Image`, and `self.download_ding_file` are in scope, as they are for the non-rich-text picture handling path.
</issue_to_address>
### Comment 2
<location path="astrbot/core/platform/sources/dingtalk/dingtalk_adapter.py" line_range="185" />
<code_context>
+ raw_content = cast(dict, message.extensions.get("content") or {})
+ if not isinstance(raw_content, dict):
+ raw_content = {}
match message_type:
case "text":
abm.message_str = message.text.content.strip()
</code_context>
<issue_to_address>
**issue (complexity):** Consider extracting the repeated media download-and-append logic into small helper functions so that convert_msg mainly dispatches and stays simpler to read.
The new branches for `picture`, `richText` images, `audio`/`voice`, and `file` do introduce noticeable duplication and branching. You can reduce complexity without changing behavior by:
1. Extracting the repeated “download-and-append” pattern into small helpers.
2. Using those helpers in the `match` branches to keep `convert_msg` mostly as a dispatcher.
3. Flattening nested conditionals via early returns/continues in the helpers.
### 1) Centralize download-and-append logic
All media types share this pattern:
- Validate `download_code` and `robot_code`
- Normalize extension
- Call `download_ding_file`
- If path exists, append appropriate message component
You can encapsulate that in small helpers:
```python
async def _append_image_from_download(
self,
*,
download_code: str,
robot_code: str,
ext: str,
abm: AstrBotMessage,
missing_code_msg: str,
missing_robot_msg: str,
download_failed_msg: str,
) -> None:
if not download_code:
logger.warning(missing_code_msg)
return
if not robot_code:
logger.error(missing_robot_msg)
return
ext = ext.lstrip(".") or "jpg"
f_path = await self.download_ding_file(download_code, robot_code, ext)
if not f_path:
logger.warning(download_failed_msg)
return
abm.message.append(Image.fromFileSystem(f_path))
```
```python
async def _append_voice_from_download(
self,
*,
download_code: str,
robot_code: str,
ext: str | None,
abm: AstrBotMessage,
) -> None:
if not download_code:
logger.warning("钉钉语音消息缺少 downloadCode,已跳过")
return
if not robot_code:
logger.error("钉钉语音消息解析失败: 回调中缺少 robotCode")
return
voice_ext = (ext or "amr").lstrip(".") or "amr"
f_path = await self.download_ding_file(download_code, robot_code, voice_ext)
if not f_path:
logger.warning("钉钉语音消息下载失败,无法解析为语音")
return
abm.message.append(Record.fromFileSystem(f_path))
```
```python
async def _append_file_from_download(
self,
*,
download_code: str,
robot_code: str,
file_name: str | None,
file_ext: str | None,
abm: AstrBotMessage,
) -> None:
if not download_code:
logger.warning("钉钉文件消息缺少 downloadCode,已跳过")
return
if not robot_code:
logger.error("钉钉文件消息解析失败: 回调中缺少 robotCode")
return
ext = (file_ext or Path(file_name or "").suffix.lstrip(".") or "file").lstrip(".")
f_path = await self.download_ding_file(download_code, robot_code, ext)
if not f_path:
logger.warning("钉钉文件消息下载失败,无法解析为文件")
return
final_name = file_name or Path(f_path).name
abm.message.append(File(name=final_name, file=f_path))
```
These helpers contain the logging, validation, and extension normalization, and they are easier to test in isolation.
### 2) Simplify `convert_msg` branches by delegating
With the helpers, your `match` in `convert_msg` can be trimmed down to just parameter extraction and delegation:
```python
match message_type:
case "text":
abm.message_str = message.text.content.strip()
abm.message.append(Plain(abm.message_str))
case "picture":
image_content = cast(
dingtalk_stream.ImageContent | None,
message.image_content,
)
download_code = cast(str, (image_content.download_code if image_content else "") or "")
await self._append_image_from_download(
download_code=download_code,
robot_code=robot_code,
ext="jpg",
abm=abm,
missing_code_msg="钉钉图片消息缺少 downloadCode,已跳过",
missing_robot_msg="钉钉图片消息解析失败: 回调中缺少 robotCode",
download_failed_msg="钉钉图片消息下载失败,无法解析为图片",
)
case "richText":
rtc: dingtalk_stream.RichTextContent = cast(
dingtalk_stream.RichTextContent, message.rich_text_content
)
contents: list[dict] = cast(list[dict], rtc.rich_text_list)
for content in contents:
if "text" in content:
abm.message.append(Plain(content["text"]))
elif content.get("type") == "picture":
download_code = cast(str, content.get("downloadCode") or "")
await self._append_image_from_download(
download_code=download_code,
robot_code=robot_code,
ext="jpg",
abm=abm,
missing_code_msg="钉钉富文本图片消息缺少 downloadCode,已跳过",
missing_robot_msg="钉钉富文本图片消息解析失败: 回调中缺少 robotCode",
download_failed_msg="钉钉富文本图片消息下载失败,无法解析为图片",
)
case "audio" | "voice":
download_code = cast(str, raw_content.get("downloadCode") or "")
voice_ext = cast(str | None, raw_content.get("fileExtension"))
await self._append_voice_from_download(
download_code=download_code,
robot_code=robot_code,
ext=voice_ext,
abm=abm,
)
case "file":
download_code = cast(str, raw_content.get("downloadCode") or "")
file_name = cast(str | None, raw_content.get("fileName"))
file_ext = cast(str | None, raw_content.get("fileExtension"))
await self._append_file_from_download(
download_code=download_code,
robot_code=robot_code,
file_name=file_name,
file_ext=file_ext,
abm=abm,
)
```
This keeps `convert_msg` focused on:
- figuring out `message_type`
- extracting `robot_code` / `raw_content` / specific content fields
- delegating to media-specific helpers
while the helpers own validation, logging, and download details.
### 3) Optional: tiny helper for `downloadUrl` extraction
You already simplified `download_ding_file`; to further declutter it, you could push the `downloadUrl` fallback logic into a dedicated helper:
```python
def _extract_download_url(self, resp_data: dict) -> str:
return cast(
str,
(
resp_data.get("downloadUrl")
or resp_data.get("data", {}).get("downloadUrl")
or ""
),
)
```
and use it as:
```python
resp_data = await resp.json()
download_url = self._extract_download_url(resp_data)
if not download_url:
logger.error(f"下载钉钉文件失败: 未找到 downloadUrl, 响应: {resp_data}")
return ""
```
This keeps `download_ding_file` more declarative and avoids inlining the fallback logic.
</issue_to_address>帮我变得更有用!请在每条评论上点 👍 或 👎,我会根据反馈改进后续的 Review。
Original comment in English
Hey - I've found 2 issues, and left some high level feedback:
- The repeated pattern of extracting and validating
download_code/robot_codeacrosspicture,richText,audio|voice, andfilebranches could be refactored into a small helper to reduce duplication and keep thematchcases more focused on message-type-specific logic. - In
download_ding_file, you now support both top-level and nesteddownloadUrl, but the code assumesresp.json()returns adict; consider adding a defensiveisinstance(resp_data, dict)check (or similar) before accessing keys to avoid unexpected runtime errors on malformed responses.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The repeated pattern of extracting and validating `download_code`/`robot_code` across `picture`, `richText`, `audio|voice`, and `file` branches could be refactored into a small helper to reduce duplication and keep the `match` cases more focused on message-type-specific logic.
- In `download_ding_file`, you now support both top-level and nested `downloadUrl`, but the code assumes `resp.json()` returns a `dict`; consider adding a defensive `isinstance(resp_data, dict)` check (or similar) before accessing keys to avoid unexpected runtime errors on malformed responses.
## Individual Comments
### Comment 1
<location path="astrbot/core/platform/sources/dingtalk/dingtalk_adapter.py" line_range="209-210" />
<code_context>
+ robot_code,
+ "jpg",
+ )
+ if f_path:
+ abm.message.append(Image.fromFileSystem(f_path))
+ else:
+ logger.warning("钉钉图片消息下载失败,无法解析为图片")
</code_context>
<issue_to_address>
**suggestion:** Consider logging or handling the case where rich-text image download fails.
For `picture` messages we already log image download failures, but rich-text images are silently skipped when `f_path` is falsy. Aligning this path by at least emitting a warning would improve observability and keep behavior consistent across message types.
Suggested implementation:
```python
elif "type" in content and content["type"] == "picture":
download_code = cast(str, content.get("downloadCode") or "")
if not download_code:
logger.warning(
"钉钉富文本图片消息缺少 downloadCode,已跳过"
)
else:
f_path = await self.download_ding_file(
download_code,
robot_code,
"jpg",
)
if f_path:
abm.message.append(Image.fromFileSystem(f_path))
else:
logger.warning("钉钉富文本图片消息下载失败,无法解析为图片")
```
1. Make sure the closing parenthesis for `logger.warning(` in the original code is present if it was omitted in the snippet.
2. Confirm this code is inside an `async` context (it appears to be, since `await` is already used earlier in the same method); if not, adjust accordingly (e.g., by making the method async or changing how `download_ding_file` is invoked).
3. Ensure `robot_code`, `abm`, `Image`, and `self.download_ding_file` are in scope, as they are for the non-rich-text picture handling path.
</issue_to_address>
### Comment 2
<location path="astrbot/core/platform/sources/dingtalk/dingtalk_adapter.py" line_range="185" />
<code_context>
+ raw_content = cast(dict, message.extensions.get("content") or {})
+ if not isinstance(raw_content, dict):
+ raw_content = {}
match message_type:
case "text":
abm.message_str = message.text.content.strip()
</code_context>
<issue_to_address>
**issue (complexity):** Consider extracting the repeated media download-and-append logic into small helper functions so that convert_msg mainly dispatches and stays simpler to read.
The new branches for `picture`, `richText` images, `audio`/`voice`, and `file` do introduce noticeable duplication and branching. You can reduce complexity without changing behavior by:
1. Extracting the repeated “download-and-append” pattern into small helpers.
2. Using those helpers in the `match` branches to keep `convert_msg` mostly as a dispatcher.
3. Flattening nested conditionals via early returns/continues in the helpers.
### 1) Centralize download-and-append logic
All media types share this pattern:
- Validate `download_code` and `robot_code`
- Normalize extension
- Call `download_ding_file`
- If path exists, append appropriate message component
You can encapsulate that in small helpers:
```python
async def _append_image_from_download(
self,
*,
download_code: str,
robot_code: str,
ext: str,
abm: AstrBotMessage,
missing_code_msg: str,
missing_robot_msg: str,
download_failed_msg: str,
) -> None:
if not download_code:
logger.warning(missing_code_msg)
return
if not robot_code:
logger.error(missing_robot_msg)
return
ext = ext.lstrip(".") or "jpg"
f_path = await self.download_ding_file(download_code, robot_code, ext)
if not f_path:
logger.warning(download_failed_msg)
return
abm.message.append(Image.fromFileSystem(f_path))
```
```python
async def _append_voice_from_download(
self,
*,
download_code: str,
robot_code: str,
ext: str | None,
abm: AstrBotMessage,
) -> None:
if not download_code:
logger.warning("钉钉语音消息缺少 downloadCode,已跳过")
return
if not robot_code:
logger.error("钉钉语音消息解析失败: 回调中缺少 robotCode")
return
voice_ext = (ext or "amr").lstrip(".") or "amr"
f_path = await self.download_ding_file(download_code, robot_code, voice_ext)
if not f_path:
logger.warning("钉钉语音消息下载失败,无法解析为语音")
return
abm.message.append(Record.fromFileSystem(f_path))
```
```python
async def _append_file_from_download(
self,
*,
download_code: str,
robot_code: str,
file_name: str | None,
file_ext: str | None,
abm: AstrBotMessage,
) -> None:
if not download_code:
logger.warning("钉钉文件消息缺少 downloadCode,已跳过")
return
if not robot_code:
logger.error("钉钉文件消息解析失败: 回调中缺少 robotCode")
return
ext = (file_ext or Path(file_name or "").suffix.lstrip(".") or "file").lstrip(".")
f_path = await self.download_ding_file(download_code, robot_code, ext)
if not f_path:
logger.warning("钉钉文件消息下载失败,无法解析为文件")
return
final_name = file_name or Path(f_path).name
abm.message.append(File(name=final_name, file=f_path))
```
These helpers contain the logging, validation, and extension normalization, and they are easier to test in isolation.
### 2) Simplify `convert_msg` branches by delegating
With the helpers, your `match` in `convert_msg` can be trimmed down to just parameter extraction and delegation:
```python
match message_type:
case "text":
abm.message_str = message.text.content.strip()
abm.message.append(Plain(abm.message_str))
case "picture":
image_content = cast(
dingtalk_stream.ImageContent | None,
message.image_content,
)
download_code = cast(str, (image_content.download_code if image_content else "") or "")
await self._append_image_from_download(
download_code=download_code,
robot_code=robot_code,
ext="jpg",
abm=abm,
missing_code_msg="钉钉图片消息缺少 downloadCode,已跳过",
missing_robot_msg="钉钉图片消息解析失败: 回调中缺少 robotCode",
download_failed_msg="钉钉图片消息下载失败,无法解析为图片",
)
case "richText":
rtc: dingtalk_stream.RichTextContent = cast(
dingtalk_stream.RichTextContent, message.rich_text_content
)
contents: list[dict] = cast(list[dict], rtc.rich_text_list)
for content in contents:
if "text" in content:
abm.message.append(Plain(content["text"]))
elif content.get("type") == "picture":
download_code = cast(str, content.get("downloadCode") or "")
await self._append_image_from_download(
download_code=download_code,
robot_code=robot_code,
ext="jpg",
abm=abm,
missing_code_msg="钉钉富文本图片消息缺少 downloadCode,已跳过",
missing_robot_msg="钉钉富文本图片消息解析失败: 回调中缺少 robotCode",
download_failed_msg="钉钉富文本图片消息下载失败,无法解析为图片",
)
case "audio" | "voice":
download_code = cast(str, raw_content.get("downloadCode") or "")
voice_ext = cast(str | None, raw_content.get("fileExtension"))
await self._append_voice_from_download(
download_code=download_code,
robot_code=robot_code,
ext=voice_ext,
abm=abm,
)
case "file":
download_code = cast(str, raw_content.get("downloadCode") or "")
file_name = cast(str | None, raw_content.get("fileName"))
file_ext = cast(str | None, raw_content.get("fileExtension"))
await self._append_file_from_download(
download_code=download_code,
robot_code=robot_code,
file_name=file_name,
file_ext=file_ext,
abm=abm,
)
```
This keeps `convert_msg` focused on:
- figuring out `message_type`
- extracting `robot_code` / `raw_content` / specific content fields
- delegating to media-specific helpers
while the helpers own validation, logging, and download details.
### 3) Optional: tiny helper for `downloadUrl` extraction
You already simplified `download_ding_file`; to further declutter it, you could push the `downloadUrl` fallback logic into a dedicated helper:
```python
def _extract_download_url(self, resp_data: dict) -> str:
return cast(
str,
(
resp_data.get("downloadUrl")
or resp_data.get("data", {}).get("downloadUrl")
or ""
),
)
```
and use it as:
```python
resp_data = await resp.json()
download_url = self._extract_download_url(resp_data)
if not download_url:
logger.error(f"下载钉钉文件失败: 未找到 downloadUrl, 响应: {resp_data}")
return ""
```
This keeps `download_ding_file` more declarative and avoids inlining the fallback logic.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| if f_path: | ||
| abm.message.append(Image.fromFileSystem(f_path)) |
There was a problem hiding this comment.
suggestion: 建议在富文本图片下载失败时进行日志记录或处理。
对于 picture 类型的消息,我们已经在图片下载失败时记录了日志,但对于富文本图片,当 f_path 为假值时会被静默跳过。通过至少增加一条 warning 日志来对齐这条分支的行为,可以提升可观测性,并保持各消息类型间行为的一致性。
建议实现如下:
elif "type" in content and content["type"] == "picture":
download_code = cast(str, content.get("downloadCode") or "")
if not download_code:
logger.warning(
"钉钉富文本图片消息缺少 downloadCode,已跳过"
)
else:
f_path = await self.download_ding_file(
download_code,
robot_code,
"jpg",
)
if f_path:
abm.message.append(Image.fromFileSystem(f_path))
else:
logger.warning("钉钉富文本图片消息下载失败,无法解析为图片")- 确认原始代码中
logger.warning(的右括号没有遗漏(如果只是片段里没展示出来,可以忽略)。 - 确认这段代码当前位于
async上下文中(看起来是的,因为同一个方法中前面已经使用了await);如果不是,请相应调整(比如将方法改为 async,或调整调用download_ding_file的方式)。 - 确保
robot_code、abm、Image和self.download_ding_file在这里都在作用域内,就像非富文本图片处理分支中那样。
Original comment in English
suggestion: Consider logging or handling the case where rich-text image download fails.
For picture messages we already log image download failures, but rich-text images are silently skipped when f_path is falsy. Aligning this path by at least emitting a warning would improve observability and keep behavior consistent across message types.
Suggested implementation:
elif "type" in content and content["type"] == "picture":
download_code = cast(str, content.get("downloadCode") or "")
if not download_code:
logger.warning(
"钉钉富文本图片消息缺少 downloadCode,已跳过"
)
else:
f_path = await self.download_ding_file(
download_code,
robot_code,
"jpg",
)
if f_path:
abm.message.append(Image.fromFileSystem(f_path))
else:
logger.warning("钉钉富文本图片消息下载失败,无法解析为图片")- Make sure the closing parenthesis for
logger.warning(in the original code is present if it was omitted in the snippet. - Confirm this code is inside an
asynccontext (it appears to be, sinceawaitis already used earlier in the same method); if not, adjust accordingly (e.g., by making the method async or changing howdownload_ding_fileis invoked). - Ensure
robot_code,abm,Image, andself.download_ding_fileare in scope, as they are for the non-rich-text picture handling path.
| raw_content = cast(dict, message.extensions.get("content") or {}) | ||
| if not isinstance(raw_content, dict): | ||
| raw_content = {} | ||
| match message_type: |
There was a problem hiding this comment.
issue (complexity): 建议将重复的“媒体下载并追加”逻辑抽取为小的辅助函数,这样 convert_msg 主要负责分发,整体可读性更好。
针对 picture、richText 图片、audio/voice 和 file 的新分支确实引入了明显的重复和分支。你可以在不改变行为的前提下通过以下方式降低复杂度:
- 将重复的“download 并 append”模式抽取成几个小的 helper。
- 在
match分支中使用这些 helper,让convert_msg更多地作为一个 dispatcher。 - 在 helper 中通过提前返回(early return)/continue 来压平嵌套条件。
1) 统一“下载并追加”的逻辑
所有媒体类型都遵循相同的模式:
- 校验
download_code和robot_code - 规范化扩展名
- 调用
download_ding_file - 如果路径存在,则追加合适的消息组件
可以将这些逻辑封装到小 helper 中:
async def _append_image_from_download(
self,
*,
download_code: str,
robot_code: str,
ext: str,
abm: AstrBotMessage,
missing_code_msg: str,
missing_robot_msg: str,
download_failed_msg: str,
) -> None:
if not download_code:
logger.warning(missing_code_msg)
return
if not robot_code:
logger.error(missing_robot_msg)
return
ext = ext.lstrip(".") or "jpg"
f_path = await self.download_ding_file(download_code, robot_code, ext)
if not f_path:
logger.warning(download_failed_msg)
return
abm.message.append(Image.fromFileSystem(f_path))async def _append_voice_from_download(
self,
*,
download_code: str,
robot_code: str,
ext: str | None,
abm: AstrBotMessage,
) -> None:
if not download_code:
logger.warning("钉钉语音消息缺少 downloadCode,已跳过")
return
if not robot_code:
logger.error("钉钉语音消息解析失败: 回调中缺少 robotCode")
return
voice_ext = (ext or "amr").lstrip(".") or "amr"
f_path = await self.download_ding_file(download_code, robot_code, voice_ext)
if not f_path:
logger.warning("钉钉语音消息下载失败,无法解析为语音")
return
abm.message.append(Record.fromFileSystem(f_path))async def _append_file_from_download(
self,
*,
download_code: str,
robot_code: str,
file_name: str | None,
file_ext: str | None,
abm: AstrBotMessage,
) -> None:
if not download_code:
logger.warning("钉钉文件消息缺少 downloadCode,已跳过")
return
if not robot_code:
logger.error("钉钉文件消息解析失败: 回调中缺少 robotCode")
return
ext = (file_ext or Path(file_name or "").suffix.lstrip(".") or "file").lstrip(".")
f_path = await self.download_ding_file(download_code, robot_code, ext)
if not f_path:
logger.warning("钉钉文件消息下载失败,无法解析为文件")
return
final_name = file_name or Path(f_path).name
abm.message.append(File(name=final_name, file=f_path))这些 helper 集中处理了日志、校验以及扩展名规范化,而且更容易独立测试。
2) 通过委托简化 convert_msg 中的分支
在引入这些 helper 后,你在 convert_msg 中的 match 分支就可以精简为仅做参数提取和委托调用:
match message_type:
case "text":
abm.message_str = message.text.content.strip()
abm.message.append(Plain(abm.message_str))
case "picture":
image_content = cast(
dingtalk_stream.ImageContent | None,
message.image_content,
)
download_code = cast(str, (image_content.download_code if image_content else "") or "")
await self._append_image_from_download(
download_code=download_code,
robot_code=robot_code,
ext="jpg",
abm=abm,
missing_code_msg="钉钉图片消息缺少 downloadCode,已跳过",
missing_robot_msg="钉钉图片消息解析失败: 回调中缺少 robotCode",
download_failed_msg="钉钉图片消息下载失败,无法解析为图片",
)
case "richText":
rtc: dingtalk_stream.RichTextContent = cast(
dingtalk_stream.RichTextContent, message.rich_text_content
)
contents: list[dict] = cast(list[dict], rtc.rich_text_list)
for content in contents:
if "text" in content:
abm.message.append(Plain(content["text"]))
elif content.get("type") == "picture":
download_code = cast(str, content.get("downloadCode") or "")
await self._append_image_from_download(
download_code=download_code,
robot_code=robot_code,
ext="jpg",
abm=abm,
missing_code_msg="钉钉富文本图片消息缺少 downloadCode,已跳过",
missing_robot_msg="钉钉富文本图片消息解析失败: 回调中缺少 robotCode",
download_failed_msg="钉钉富文本图片消息下载失败,无法解析为图片",
)
case "audio" | "voice":
download_code = cast(str, raw_content.get("downloadCode") or "")
voice_ext = cast(str | None, raw_content.get("fileExtension"))
await self._append_voice_from_download(
download_code=download_code,
robot_code=robot_code,
ext=voice_ext,
abm=abm,
)
case "file":
download_code = cast(str, raw_content.get("downloadCode") or "")
file_name = cast(str | None, raw_content.get("fileName"))
file_ext = cast(str | None, raw_content.get("fileExtension"))
await self._append_file_from_download(
download_code=download_code,
robot_code=robot_code,
file_name=file_name,
file_ext=file_ext,
abm=abm,
)这样可以让 convert_msg 主要聚焦在:
- 确定
message_type - 提取
robot_code/raw_content/ 各类内容字段 - 将处理委托给媒体类型相关的 helper
而具体的校验、日志记录和下载细节则由 helper 自己负责。
3) 可选:为 downloadUrl 抽一个小的提取 helper
你已经简化了 download_ding_file;要进一步减轻它的负担,可以把 downloadUrl 的兜底逻辑放进一个专门的 helper 中:
def _extract_download_url(self, resp_data: dict) -> str:
return cast(
str,
(
resp_data.get("downloadUrl")
or resp_data.get("data", {}).get("downloadUrl")
or ""
),
)使用方式如下:
resp_data = await resp.json()
download_url = self._extract_download_url(resp_data)
if not download_url:
logger.error(f"下载钉钉文件失败: 未找到 downloadUrl, 响应: {resp_data}")
return ""这样可以让 download_ding_file 更加声明式,并避免在函数内部内联这些兜底逻辑。
Original comment in English
issue (complexity): Consider extracting the repeated media download-and-append logic into small helper functions so that convert_msg mainly dispatches and stays simpler to read.
The new branches for picture, richText images, audio/voice, and file do introduce noticeable duplication and branching. You can reduce complexity without changing behavior by:
- Extracting the repeated “download-and-append” pattern into small helpers.
- Using those helpers in the
matchbranches to keepconvert_msgmostly as a dispatcher. - Flattening nested conditionals via early returns/continues in the helpers.
1) Centralize download-and-append logic
All media types share this pattern:
- Validate
download_codeandrobot_code - Normalize extension
- Call
download_ding_file - If path exists, append appropriate message component
You can encapsulate that in small helpers:
async def _append_image_from_download(
self,
*,
download_code: str,
robot_code: str,
ext: str,
abm: AstrBotMessage,
missing_code_msg: str,
missing_robot_msg: str,
download_failed_msg: str,
) -> None:
if not download_code:
logger.warning(missing_code_msg)
return
if not robot_code:
logger.error(missing_robot_msg)
return
ext = ext.lstrip(".") or "jpg"
f_path = await self.download_ding_file(download_code, robot_code, ext)
if not f_path:
logger.warning(download_failed_msg)
return
abm.message.append(Image.fromFileSystem(f_path))async def _append_voice_from_download(
self,
*,
download_code: str,
robot_code: str,
ext: str | None,
abm: AstrBotMessage,
) -> None:
if not download_code:
logger.warning("钉钉语音消息缺少 downloadCode,已跳过")
return
if not robot_code:
logger.error("钉钉语音消息解析失败: 回调中缺少 robotCode")
return
voice_ext = (ext or "amr").lstrip(".") or "amr"
f_path = await self.download_ding_file(download_code, robot_code, voice_ext)
if not f_path:
logger.warning("钉钉语音消息下载失败,无法解析为语音")
return
abm.message.append(Record.fromFileSystem(f_path))async def _append_file_from_download(
self,
*,
download_code: str,
robot_code: str,
file_name: str | None,
file_ext: str | None,
abm: AstrBotMessage,
) -> None:
if not download_code:
logger.warning("钉钉文件消息缺少 downloadCode,已跳过")
return
if not robot_code:
logger.error("钉钉文件消息解析失败: 回调中缺少 robotCode")
return
ext = (file_ext or Path(file_name or "").suffix.lstrip(".") or "file").lstrip(".")
f_path = await self.download_ding_file(download_code, robot_code, ext)
if not f_path:
logger.warning("钉钉文件消息下载失败,无法解析为文件")
return
final_name = file_name or Path(f_path).name
abm.message.append(File(name=final_name, file=f_path))These helpers contain the logging, validation, and extension normalization, and they are easier to test in isolation.
2) Simplify convert_msg branches by delegating
With the helpers, your match in convert_msg can be trimmed down to just parameter extraction and delegation:
match message_type:
case "text":
abm.message_str = message.text.content.strip()
abm.message.append(Plain(abm.message_str))
case "picture":
image_content = cast(
dingtalk_stream.ImageContent | None,
message.image_content,
)
download_code = cast(str, (image_content.download_code if image_content else "") or "")
await self._append_image_from_download(
download_code=download_code,
robot_code=robot_code,
ext="jpg",
abm=abm,
missing_code_msg="钉钉图片消息缺少 downloadCode,已跳过",
missing_robot_msg="钉钉图片消息解析失败: 回调中缺少 robotCode",
download_failed_msg="钉钉图片消息下载失败,无法解析为图片",
)
case "richText":
rtc: dingtalk_stream.RichTextContent = cast(
dingtalk_stream.RichTextContent, message.rich_text_content
)
contents: list[dict] = cast(list[dict], rtc.rich_text_list)
for content in contents:
if "text" in content:
abm.message.append(Plain(content["text"]))
elif content.get("type") == "picture":
download_code = cast(str, content.get("downloadCode") or "")
await self._append_image_from_download(
download_code=download_code,
robot_code=robot_code,
ext="jpg",
abm=abm,
missing_code_msg="钉钉富文本图片消息缺少 downloadCode,已跳过",
missing_robot_msg="钉钉富文本图片消息解析失败: 回调中缺少 robotCode",
download_failed_msg="钉钉富文本图片消息下载失败,无法解析为图片",
)
case "audio" | "voice":
download_code = cast(str, raw_content.get("downloadCode") or "")
voice_ext = cast(str | None, raw_content.get("fileExtension"))
await self._append_voice_from_download(
download_code=download_code,
robot_code=robot_code,
ext=voice_ext,
abm=abm,
)
case "file":
download_code = cast(str, raw_content.get("downloadCode") or "")
file_name = cast(str | None, raw_content.get("fileName"))
file_ext = cast(str | None, raw_content.get("fileExtension"))
await self._append_file_from_download(
download_code=download_code,
robot_code=robot_code,
file_name=file_name,
file_ext=file_ext,
abm=abm,
)This keeps convert_msg focused on:
- figuring out
message_type - extracting
robot_code/raw_content/ specific content fields - delegating to media-specific helpers
while the helpers own validation, logging, and download details.
3) Optional: tiny helper for downloadUrl extraction
You already simplified download_ding_file; to further declutter it, you could push the downloadUrl fallback logic into a dedicated helper:
def _extract_download_url(self, resp_data: dict) -> str:
return cast(
str,
(
resp_data.get("downloadUrl")
or resp_data.get("data", {}).get("downloadUrl")
or ""
),
)and use it as:
resp_data = await resp.json()
download_url = self._extract_download_url(resp_data)
if not download_url:
logger.error(f"下载钉钉文件失败: 未找到 downloadUrl, 响应: {resp_data}")
return ""This keeps download_ding_file more declarative and avoids inlining the fallback logic.
fixes: #5916 #5786
Modifications / 改动点
Screenshots or Test Results / 运行截图或测试结果
Checklist / 检查清单
requirements.txt和pyproject.toml文件相应位置。/ I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations inrequirements.txtandpyproject.toml.Summary by Sourcery
在 DingTalk 适配器中处理更多钉钉消息类型,并提升文件下载的健壮性。
Bug 修复:
robotCode或downloadCode的处理而导致无法解析和下载图片、音频/语音以及文件消息的问题。downloadUrl,并在缺失时记录详细错误日志。增强功能:
Original summary in English
Summary by Sourcery
Handle additional DingTalk message types and improve file download robustness in the DingTalk adapter.
Bug Fixes:
Enhancements: