[fix] Change the Minimax TTS default output format to wav to resolve the RIFF error #7797
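## Problem

In the QQ official platform plugin, processing voice audio from Minimax TTS raises the error `处理语音时出错: file does not start with RIFF id` ("Error while processing voice: file does not start with RIFF id").

## Cause

The Minimax TTS provider (`minimax_tts_api_source.py`) defaults to `mp3` as its audio output format, whereas the `wav_to_tencent_silk` function in `qqofficial_message_event.py` requires WAV input (a file with a RIFF header).

## Solution

In `minimax_tts_api_source.py`, change the value of the `format` key in the `audio_setting` dictionary of the `ProviderMiniMaxTTSAPI` class from `"mp3"` to `"wav"`.

## Result

After the change, Minimax TTS produces WAV files directly, so the downstream function recognizes and processes them correctly, which fixes the error above.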
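The error text matches the message that Python's standard `wave` module raises when a file does not begin with a RIFF chunk. The sketch below reproduces that behaviour in isolation. It assumes the downstream conversion reads its input through `wave`, which this PR does not show; the helper names and the fake MP3 bytes are illustrative only.

```python
import io
import wave


def looks_like_wav(data: bytes) -> bool:
    """Return True if the bytes begin with a RIFF/WAVE container header."""
    return len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WAVE"


def make_silent_wav(sample_rate: int = 32000, frames: int = 320) -> bytes:
    """Build a tiny mono 16-bit PCM WAV in memory for demonstration."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(sample_rate)
        w.writeframes(b"\x00\x00" * frames)
    return buf.getvalue()


def try_open_as_wav(data: bytes) -> str:
    """Open the bytes with the wave module and report the outcome."""
    try:
        with wave.open(io.BytesIO(data), "rb") as w:
            return f"ok: {w.getframerate()} Hz"
    except wave.Error as exc:
        return f"error: {exc}"


wav_bytes = make_silent_wav()
# Stand-in for MP3 data: an ID3 tag header followed by padding (not real audio).
fake_mp3_bytes = b"ID3\x04\x00\x00\x00\x00\x00\x00" + b"\x00" * 32

print(try_open_as_wav(wav_bytes))
print(try_open_as_wav(fake_mp3_bytes))
```

A cheap header check such as `looks_like_wav` could also serve as a guard before conversion, so a format mismatch is reported with a clearer message.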
Hey - I've found 1 issue, and left some high level feedback:
- Consider making the Minimax TTS output format configurable (e.g., via settings or constructor arguments) instead of hardcoding `wav`, so other integrations that still expect `mp3` can opt in or out without code changes.
- Now that the audio format is `wav`, double-check that any places relying on file extensions or MIME types (e.g., upload APIs or response headers) use this value from a single source of truth rather than assuming `mp3` elsewhere.
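One way to read the two points above is a small value object that owns the format and derives both the file extension and the MIME type from it. This is only a sketch: `TTSAudioFormat`, `format_from_config`, the `_MIME_TYPES` table, and the config key `"minimax-audio-format"` are all hypothetical names, not part of the project.

```python
from dataclasses import dataclass

# Hypothetical lookup table; extend it only with formats the provider supports.
_MIME_TYPES = {"mp3": "audio/mpeg", "wav": "audio/wav"}


@dataclass(frozen=True)
class TTSAudioFormat:
    """Single source of truth for the container format (hypothetical helper)."""

    name: str = "wav"

    def __post_init__(self) -> None:
        # Reject unknown formats early instead of producing mislabelled files.
        if self.name not in _MIME_TYPES:
            raise ValueError(f"unsupported audio format: {self.name!r}")

    @property
    def extension(self) -> str:
        return f".{self.name}"

    @property
    def mime_type(self) -> str:
        return _MIME_TYPES[self.name]


def format_from_config(provider_config: dict) -> TTSAudioFormat:
    """Read the format from a config dict, defaulting to wav.

    The key name "minimax-audio-format" is an assumption for illustration.
    """
    return TTSAudioFormat(provider_config.get("minimax-audio-format", "wav"))


default_format = format_from_config({})
mp3_format = format_from_config({"minimax-audio-format": "mp3"})
print(default_format.extension, mp3_format.mime_type)
```

With this shape, integrations that still need `mp3` would change a config value rather than the source, and callers would ask the object for the extension or MIME type instead of hardcoding either.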
## Individual Comments
### Comment 1
<location path="astrbot/core/provider/sources/minimax_tts_api_source.py" line_range="148-150" />
<code_context>
temp_dir = get_astrbot_temp_path()
os.makedirs(temp_dir, exist_ok=True)
- path = os.path.join(temp_dir, f"minimax_tts_api_{uuid.uuid4()}.mp3")
+ path = os.path.join(temp_dir, f"minimax_tts_api_{uuid.uuid4()}.wav")
try:
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Derive the file extension from `audio_setting["format"]` to avoid future mismatches.
The container format is currently defined both in `audio_setting["format"]` and in the hardcoded `.wav` suffix. If they ever diverge, files will be mislabelled. Building the filename from `self.audio_setting["format"]` (e.g. `f"minimax_tts_api_{uuid.uuid4()}.{self.audio_setting['format']}"`) keeps them consistent and avoids this risk.
```suggestion
temp_dir = get_astrbot_temp_path()
os.makedirs(temp_dir, exist_ok=True)
path = os.path.join(
    temp_dir,
    f"minimax_tts_api_{uuid.uuid4()}.{self.audio_setting['format']}",
)
```
</issue_to_address>
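The suggestion above can be tried outside the provider class with a standalone sketch. The function name `build_tts_output_path` is hypothetical, and `tempfile.gettempdir()` stands in for the project's `get_astrbot_temp_path()`, which is not reproduced here.

```python
import os
import tempfile
import uuid


def build_tts_output_path(temp_dir: str, audio_setting: dict) -> str:
    """Return a unique output path whose suffix follows audio_setting["format"]."""
    os.makedirs(temp_dir, exist_ok=True)
    return os.path.join(
        temp_dir,
        f"minimax_tts_api_{uuid.uuid4()}.{audio_setting['format']}",
    )


# Stand-in directory; the real code would use get_astrbot_temp_path().
demo_dir = os.path.join(tempfile.gettempdir(), "tts_path_demo")
wav_path = build_tts_output_path(demo_dir, {"format": "wav"})
mp3_path = build_tts_output_path(demo_dir, {"format": "mp3"})
print(wav_path)
print(mp3_path)
```

Because the suffix is read from the same dictionary that is sent to the API, changing the format in one place keeps the file name and the file contents in agreement.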
Code Review
This pull request updates the audio output format from mp3 to wav in the MiniMax TTS API source. The reviewer suggests removing the bitrate parameter from the audio settings because it is only applicable to the mp3 format according to the API documentation.
self.audio_setting: dict = {
    "sample_rate": 32000,
    "bitrate": 128000,
-   "format": "mp3",
+   "format": "wav",
}
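If the reviewer's reading of the API documentation is right and `bitrate` applies only to `mp3`, the settings dictionary could be built conditionally. The sketch below is an illustration under that assumption; `build_audio_setting` is a hypothetical helper, and the default values are copied from the diff above.

```python
def build_audio_setting(
    fmt: str, sample_rate: int = 32000, bitrate: int = 128000
) -> dict:
    """Build the audio settings, including bitrate only for mp3 output."""
    setting = {"sample_rate": sample_rate, "format": fmt}
    if fmt == "mp3":
        # Assumption taken from the review: bitrate is meaningful only for mp3.
        setting["bitrate"] = bitrate
    return setting


wav_setting = build_audio_setting("wav")
mp3_setting = build_audio_setting("mp3")
print(wav_setting)
print(mp3_setting)
```

Whether the API rejects or silently ignores `bitrate` for `wav` is not stated in this thread, so omitting the key is the conservative choice rather than a confirmed requirement.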
Related Issues
Fixes issue #7144 and issue #7795.
Modifications
In `minimax_tts_api_source.py`, `"format": "mp3"` was changed to `"format": "wav"`.
Screenshots or Test Results
[2026-04-25 12:29:05.097] [Core] [INFO] [result_decorate.stage:301]: TTS 结果: /home/astrbot/AstrBot/data/temp/minimax_tts_api_31e97f25-bf49-45cb-a9ae-4fbcac7e93d1.wav

[2026-04-25 12:29:05.098] [Core] [INFO] [respond.stage:183]: Prepare to send - /DD74A1EA717091E5085E2CF5606E1FFC: [ComponentType.Record]
Checklist
- 😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
- 👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
- 🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in `requirements.txt` and `pyproject.toml`.
- 😮 My changes do not introduce malicious code.
Summary by Sourcery
Bug Fixes: