feat: add MP3 audio support when wechatpadpro triggers TTS #1830

Open: zhx8702 wants to merge 1 commit into base: master

zhx8702 commented Jun 17, 2025

Resolves #1616

Motivation

Several TTS providers return audio in MP3 as well as WAV. This PR adds MP3 handling to voice sending in wechatpadpro.

Modifications

If the file is not WAV, it goes through the MP3-to-WAV conversion path by default.
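How the PR decides that a file "is not WAV" is not shown in this conversation. As a minimal sketch of one way to make that decision, assuming a content-based check rather than a file-extension check, the hypothetical helper `looks_like_wav` below inspects the RIFF/WAVE header:

```python
def looks_like_wav(path: str) -> bool:
    """Return True if the file starts with a RIFF/WAVE container header.

    Hypothetical helper for illustration; the PR may use a different test
    (for example, the file extension).
    """
    with open(path, "rb") as f:
        header = f.read(12)
    # A WAV file begins with b"RIFF", a 4-byte chunk size, then b"WAVE".
    return len(header) == 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE"
```

A header check of this kind does not depend on the file name, so an MP3 saved with a `.wav` suffix would still be routed to conversion.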

Check

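  • 😊 My commit message follows good conventions
  • 👀 My changes are well tested
  • 🤓 I have made sure no new dependencies were introduced, or, if any were, that they were added to the appropriate place in requirements.txt and pyproject.toml.
  • 😮 My changes do not introduce malicious code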

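Summary by Sourcery

Add on-the-fly audio format detection and conversion to support MP3 (and other formats) for TTS voice messages on WeChatPadPro, and update the message event to use the generalized converter.

New Features:

  • Enable support for MP3 and other non-WAV audio formats by converting them to PCM WAV before Tencent Silk encoding for voice messages.

Enhancements:

  • Introduce a fallback conversion utility that uses FFmpeg or pyffmpeg for broader audio format compatibility.
  • Refactor the helper by replacing wav_to_tencent_silk_base64 with audio_to_tencent_silk_base64 and handling temporary file cleanup.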
Summary by Sourcery

Enable MP3 and other audio format support in WeChatPadPro TTS by converting inputs to PCM WAV before Tencent Silk encoding.

New Features:

  • Support MP3 and other non-WAV audio inputs by converting them to 24000Hz PCM WAV before Tencent Silk encoding.

Enhancements:

  • Introduce convert_to_pcm_wav utility with pyffmpeg and ffmpeg CLI fallback.
  • Refactor audio conversion by replacing wav_to_tencent_silk_base64 with audio_to_tencent_silk_base64 and handling temporary file cleanup.
  • Update WeChatPadPro message event to use the new audio_to_tencent_silk_base64 for voice messages.

sourcery-ai bot commented Jun 17, 2025

Reviewer's Guide

Adds support for MP3 and other non-WAV audio formats in WeChatPadPro TTS by introducing on-the-fly conversion to PCM WAV (via pyffmpeg or ffmpeg), refactoring the Silk encoding helper, and updating the message event to use the generalized converter.

Sequence diagram for sending TTS voice message with MP3/WAV support in WeChatPadPro

```mermaid
sequenceDiagram
    participant WeChatPadProMessageEvent as WeChatPadProMessageEvent
    participant Record as Record
    participant TencentRecordHelper as tencent_record_helper
    participant FFmpeg as FFmpeg/pyffmpeg
    participant SilkEncoder as pysilk

    WeChatPadProMessageEvent->>Record: convert_to_file_path()
    WeChatPadProMessageEvent->>TencentRecordHelper: audio_to_tencent_silk_base64(record_path)
    alt input is not WAV
        TencentRecordHelper->>FFmpeg: convert_to_pcm_wav(input_path, temp_wav)
        FFmpeg-->>TencentRecordHelper: temp_wav
    end
    TencentRecordHelper->>SilkEncoder: Encode WAV to Silk
    SilkEncoder-->>TencentRecordHelper: silk_b64, duration
    TencentRecordHelper-->>WeChatPadProMessageEvent: silk_b64, duration
    WeChatPadProMessageEvent->>WeChatServer: Send voice message (silk_b64)
```

Class diagram for updated Tencent record helper audio conversion

```mermaid
classDiagram
    class tencent_record_helper {
        +async convert_to_pcm_wav(input_path: str, output_path: str) str
        +async audio_to_tencent_silk_base64(audio_path: str) tuple[str, float]
        -async wav_to_tencent_silk_base64(wav_path: str) str (removed)
    }
    tencent_record_helper : Uses pyffmpeg or ffmpeg for conversion
    tencent_record_helper : Handles temp file cleanup
```
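The class diagram gives audio_to_tencent_silk_base64 a tuple[str, float] return type, that is, the Base64 payload plus a duration. How the PR measures that duration is not shown in this conversation. One possible approach, assuming the intermediate file is a plain PCM WAV, is to read it from the WAV header with Python's standard `wave` module; the helper name `wav_duration_seconds` is hypothetical:

```python
import wave


def wav_duration_seconds(wav_path: str) -> float:
    """Return the playback length of a PCM WAV file in seconds.

    Illustrative only: the PR's own duration calculation may differ.
    """
    with wave.open(wav_path, "rb") as wf:
        # Duration is the number of sample frames divided by frames per second.
        return wf.getnframes() / float(wf.getframerate())
```

Reading the header avoids decoding the audio, so the cost is independent of the clip length.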

File-Level Changes

Change: Introduce convert_to_pcm_wav for versatile audio format conversion
Details:
  • Added convert_to_pcm_wav(input_path, output_path) to generate 16-bit PCM WAV at 24 kHz
  • Attempt conversion with pyffmpeg and fall back to the ffmpeg CLI via an asyncio subprocess
  • Validate output existence/size and raise an error on failure
Files: astrbot/core/utils/tencent_record_helper.py

Change: Refactor Silk helper into audio_to_tencent_silk_base64 with temp file management
Details:
  • Replaced wav_to_tencent_silk_base64 with audio_to_tencent_silk_base64 returning (base64, duration)
  • Detect non-WAV input, convert via convert_to_pcm_wav, and remove originals
  • Update error message for missing pysilk and handle cleanup of temp WAV/Silk files
Files: astrbot/core/utils/tencent_record_helper.py

Change: Switch WeChatPadPro voice sending to use the new converter
Details:
  • Imported audio_to_tencent_silk_base64 instead of wav_to_tencent_silk_base64
  • Updated _send_voice to call the new converter and unpack the returned base64 and duration
Files: astrbot/core/platform/sources/wechatpadpro/wechatpadpro_message_event.py
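The ffmpeg CLI fallback described above can be sketched roughly as follows. This is not the PR's code: the function names `build_ffmpeg_args` and `convert_to_pcm_wav_sketch`, the `ffmpeg_bin` parameter, and the mono (`-ac 1`) setting are assumptions made for illustration, and the pyffmpeg path is omitted. Only the 24 kHz, 16-bit PCM target and the output validation come from the guide.

```python
import asyncio
import os
import shutil


def build_ffmpeg_args(input_path: str, output_path: str, ffmpeg_bin: str = "ffmpeg") -> list[str]:
    """Build an ffmpeg command line producing 24 kHz 16-bit PCM WAV."""
    return [
        ffmpeg_bin,
        "-y",                    # overwrite the output file if it exists
        "-i", input_path,
        "-acodec", "pcm_s16le",  # 16-bit little-endian PCM
        "-ar", "24000",          # 24 kHz sample rate, as stated in the guide
        "-ac", "1",              # mono: an assumption, not confirmed by the PR
        output_path,
    ]


async def convert_to_pcm_wav_sketch(input_path: str, output_path: str, ffmpeg_bin: str = "ffmpeg") -> str:
    """Convert any ffmpeg-readable audio file to PCM WAV, failing loudly."""
    if shutil.which(ffmpeg_bin) is None:
        raise RuntimeError(f"{ffmpeg_bin} not found on PATH; cannot convert {input_path}")
    proc = await asyncio.create_subprocess_exec(
        *build_ffmpeg_args(input_path, output_path, ffmpeg_bin),
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    _, stderr = await proc.communicate()
    if proc.returncode != 0:
        detail = stderr.decode(errors="replace").strip()
        raise RuntimeError(f"ffmpeg exited with code {proc.returncode}: {detail}")
    # Validate the result, mirroring the existence/size check in the guide.
    if not os.path.exists(output_path) or os.path.getsize(output_path) == 0:
        raise RuntimeError("converted WAV file is missing or empty")
    return output_path
```

Keeping the argument list in its own function makes the conversion settings testable without ffmpeg installed.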

@sourcery-ai sourcery-ai bot left a comment

Hey @zhx8702 - I've reviewed your changes - here's some feedback:

  • Avoid deleting the original audio_path in audio_to_tencent_silk_base64—only remove files you create (temp WAV or SILK) to prevent unexpected data loss.
  • Add explicit error handling or a clear exception when both the pyffmpeg import and the ffmpeg subprocess conversion fail, so callers get a meaningful failure message.
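The first point, removing only the files the helper itself creates, could look something like the sketch below. The function `process_with_temp_wav` and its `convert` callback are hypothetical stand-ins, not names from the PR; the callback takes the place of the real conversion step so the cleanup pattern can be shown on its own.

```python
import os
import tempfile
from typing import Callable


def process_with_temp_wav(audio_path: str, convert: Callable[[str, str], object]) -> int:
    """Convert audio_path into a temporary WAV and return the WAV's size in bytes.

    Only the temporary file is deleted; audio_path belongs to the caller.
    """
    fd, temp_wav = tempfile.mkstemp(suffix=".wav")
    os.close(fd)  # mkstemp returns an open descriptor; the converter reopens by path
    try:
        convert(audio_path, temp_wav)
        return os.path.getsize(temp_wav)
    finally:
        # Clean up what this function created, even if conversion raised.
        if os.path.exists(temp_wav):
            os.remove(temp_wav)
```

Because the cleanup sits in a `finally` block, the temporary file is removed on both success and failure, while the caller's input file is never touched.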
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Avoid deleting the original `audio_path` in `audio_to_tencent_silk_base64`—only remove files you create (temp WAV or SILK) to prevent unexpected data loss.
- Add explicit error handling or a clear exception when both the pyffmpeg import and the ffmpeg subprocess conversion fail, so callers get a meaningful failure message.

## Individual Comments

### Comment 1
<location> `astrbot/core/utils/tencent_record_helper.py:98` </location>
<code_context>
+        stdout, stderr = await p.communicate()
+        logger.info(f"[FFmpeg] stdout: {stdout.decode().strip()}")
+        logger.debug(f"[FFmpeg] stderr: {stderr.decode().strip()}")
+        logger.info(f"[FFmpeg] return code: {p.returncode}")
+
+    if os.path.exists(output_path) and os.path.getsize(output_path) > 0:
</code_context>

<issue_to_address>
Raise on non-zero ffmpeg exit code

Check `p.returncode` after ffmpeg execution and raise an error if it is non-zero to catch failed conversions promptly.
</issue_to_address>

<suggested_fix>
<<<<<<< SEARCH
        logger.info(f"[FFmpeg] stdout: {stdout.decode().strip()}")
        logger.debug(f"[FFmpeg] stderr: {stderr.decode().strip()}")
        logger.info(f"[FFmpeg] return code: {p.returncode}")

    if os.path.exists(output_path) and os.path.getsize(output_path) > 0:
        return output_path
    else:
        raise RuntimeError("生成的WAV文件不存在或为空")
=======
        logger.info(f"[FFmpeg] stdout: {stdout.decode().strip()}")
        logger.debug(f"[FFmpeg] stderr: {stderr.decode().strip()}")
        logger.info(f"[FFmpeg] return code: {p.returncode}")

        if p.returncode != 0:
            raise RuntimeError(
                f"FFmpeg 进程失败,返回码: {p.returncode}\n"
                f"stderr: {stderr.decode().strip()}"
            )

    if os.path.exists(output_path) and os.path.getsize(output_path) > 0:
        return output_path
    else:
        raise RuntimeError("生成的WAV文件不存在或为空")
>>>>>>> REPLACE

</suggested_fix>
