feat(tts): add support for the Alibaba Cloud TTS providers CosyVoice TTS (API) and Qwen TTS Realtime (API), and add filtering of TTS text content #7651

Open
yuxwd wants to merge 7 commits into AstrBotDevs:master from yuxwd:master

yuxwd commented Apr 18, 2026

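Modifications

Added TTS providers

The project's built-in Alibaba Cloud TTS provider covers only part of the available services. This change adds support for CosyVoice TTS (API) and Qwen TTS Realtime (API).

Filtering TTS text content

Improves the TTS messages the bot sends by adding a filter for TTS text content. The TTS can skip text inside parentheses (), and regular-expression filter rules are supported.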
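
A minimal sketch of what such a filter could look like, assuming the default rules drop text inside ASCII and full-width parentheses and that custom rules are plain regular expressions. The names `DEFAULT_RULES` and `filter_tts_text` are hypothetical and are not taken from the PR.

```python
import re
from typing import Sequence

# Hypothetical default rules: drop text inside ASCII or full-width parentheses.
DEFAULT_RULES = [r"\([^()]*\)", r"（[^（）]*）"]


def filter_tts_text(text: str, custom_rules: Sequence[str] = ()) -> str:
    """Remove every match of the default and custom patterns from text."""
    for pattern in DEFAULT_RULES + list(custom_rules):
        text = re.sub(pattern, "", text)
    # Collapse the runs of whitespace left behind by removed spans.
    return re.sub(r"\s{2,}", " ", text).strip()


if __name__ == "__main__":
    print(filter_tts_text("Hello (waves) world"))
    # A custom rule that also drops *asterisk-wrapped* actions.
    print(filter_tts_text("Hi *smiles* there", [r"\*[^*]*\*"]))
```

Applying the filter before the text reaches the provider keeps the displayed message unchanged while the spoken version skips the bracketed parts.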

Code testing

Tested on macOS and Linux (Alibaba Cloud Linux 3.2104 LTS 64-bit) with no problems found.
[Screenshots of the test runs]

  • This is NOT a breaking change.

Screenshots or Test Results


Checklist

  • [✅] 😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.

  • [✅] 👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.

  • [✅] 🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.

  • [✅] 😮 My changes do not introduce malicious code.

yuxwd and others added 6 commits April 18, 2026 22:08
Add two new TTS providers using Alibaba Cloud DashScope SDK:
- Qwen TTS Realtime: WebSocket streaming TTS with low latency, supports qwen3-tts-flash-realtime and qwen3-tts-instruct-flash-realtime models
- CosyVoice TTS: Non-streaming TTS with multiple voice options, supports cosyvoice-v3.5/v3/v2 models

Includes config templates, provider manager integration, and i18n translations (zh-CN, en-US, ru-RU).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@auto-assign auto-assign bot requested review from Raven95676 and Soulter April 18, 2026 15:04

@sourcery-ai sourcery-ai bot left a comment

Sorry, we are unable to review this pull request

The GitHub API does not allow us to fetch diffs exceeding 300 files, and this pull request has 600

dosubot bot added the labels size:XL (This PR changes 500-999 lines, ignoring generated files.) and area:provider (The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner.) on Apr 18, 2026

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a TTS text filtering mechanism to strip markers like brackets and asterisks from text before synthesis, and adds support for Qwen TTS Realtime and CosyVoice TTS providers. The review feedback points out several critical issues: a logic error in the Qwen streaming implementation that causes audio duplication, a blocking call in an asynchronous function that could impact responsiveness, and incorrect usage of the DashScope SDK in the CosyVoice provider. Additionally, the FilteredQueue implementation requires a call to the base class constructor to ensure all inherited methods function correctly.

Comment on lines +327 to +332
    if accumulated_text:
        await loop.run_in_executor(
            None,
            qwen_tts.append_text,
            accumulated_text,
        )

high

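In get_audio_stream, each text_part has already been sent to the API in real time inside the loop via qwen_tts.append_text(text_part). When text_part is None, sending accumulated_text again resends the entire text to the TTS engine, so the generated audio contains duplicated content. Consider removing this logic.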
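
The intended behavior can be sketched roughly as follows, assuming text chunks arrive on an asyncio.Queue terminated by None. `FakeTTSClient` and `forward_text` are hypothetical stand-ins for illustration, not the DashScope SDK or the PR's code.

```python
import asyncio


class FakeTTSClient:
    """Stand-in for the real client; it only records what it is sent."""

    def __init__(self) -> None:
        self.sent = []

    def append_text(self, text: str) -> None:
        self.sent.append(text)


async def forward_text(queue: asyncio.Queue, client: FakeTTSClient) -> None:
    """Send each chunk exactly once and stop at the None sentinel."""
    loop = asyncio.get_running_loop()
    while True:
        text_part = await queue.get()
        if text_part is None:
            # End of stream: every chunk was already sent above,
            # so nothing is re-sent here.
            break
        await loop.run_in_executor(None, client.append_text, text_part)


async def run_forward_demo() -> list:
    queue: asyncio.Queue = asyncio.Queue()
    for item in ("Hello, ", "world.", None):
        queue.put_nowait(item)
    client = FakeTTSClient()
    await forward_text(queue, client)
    return client.sent


if __name__ == "__main__":
    print(asyncio.run(run_forward_demo()))
```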

callback.complete_event.set()

await loop.run_in_executor(None, _connect_and_send)
finished = callback.wait_for_finished(timeout=self.timeout)

high

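callback.wait_for_finished(timeout=self.timeout) is a blocking synchronous call (it uses threading.Event.wait internally). Calling it directly inside an async function blocks the asyncio event loop and slows the bot's responses. Consider running it in the thread pool with loop.run_in_executor.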

Suggested change
- finished = callback.wait_for_finished(timeout=self.timeout)
+ finished = await loop.run_in_executor(None, callback.wait_for_finished, self.timeout)
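
A self-contained sketch of the same pattern, using a plain threading.Event in place of the SDK callback. The helper names `wait_for_event` and `run_wait_demo` are hypothetical. The ticker coroutine only exists to show that the event loop keeps running while the wait is in progress.

```python
import asyncio
import threading


async def wait_for_event(event: threading.Event, timeout: float) -> bool:
    """Wait for a threading.Event without blocking the event loop."""
    loop = asyncio.get_running_loop()
    # Event.wait blocks its thread, so it runs in the default thread pool.
    return await loop.run_in_executor(None, event.wait, timeout)


async def run_wait_demo():
    event = threading.Event()
    ticks = 0

    async def ticker() -> None:
        # Counts loop iterations that happen while the wait is pending.
        nonlocal ticks
        while not event.is_set():
            ticks += 1
            await asyncio.sleep(0.01)

    # Set the event from another thread after 0.1 s, as an SDK callback might.
    threading.Timer(0.1, event.set).start()
    ticker_task = asyncio.create_task(ticker())
    finished = await wait_for_event(event, timeout=2.0)
    await ticker_task
    return finished, ticks


if __name__ == "__main__":
    print(asyncio.run(run_wait_demo()))
```

If the wait were called directly in the coroutine, the ticker would not advance until the event was set or the timeout expired.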

Comment on lines +87 to +92
    audio_bytes = await loop.run_in_executor(
        None,
        synthesizer.call,
        text,
        self.timeout_ms,
    )

high

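dashscope.audio.tts_v2.SpeechSynthesizer.call returns a SpeechSynthesisResult object; writing it to a file directly as audio bytes causes a TypeError. In addition, the signature of call usually does not accept timeout as a positional argument. Consider fixing the call and using result.get_audio_data() to obtain the audio bytes.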

        result = await loop.run_in_executor(
            None,
            synthesizer.call,
            text,
        )
        audio_bytes = result.get_audio_data()

Comment on lines +49 to +55
        self,
        real_queue: asyncio.Queue[T | None],
        custom_rules: list[str] | None = None,
    ) -> None:
        self._real_queue = real_queue
        self._custom_rules = custom_rules

medium

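FilteredQueue inherits from asyncio.Queue but does not call super().__init__(). Although the class currently overrides the main methods using a proxy pattern, the uninitialized base class means that methods that are not overridden, such as get_nowait and put_nowait, fail when called because internal state (such as _getters and _putters) is missing. Consider calling super().__init__() explicitly so that the object's state is complete.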

    def __init__(
        self,
        real_queue: asyncio.Queue[T | None],
        custom_rules: list[str] | None = None,
    ) -> None:
        super().__init__(maxsize=real_queue.maxsize)
        self._real_queue = real_queue
        self._custom_rules = custom_rules
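
The failure mode can be reproduced in isolation with two stripped-down proxy classes. `BrokenProxy` and `InitializedProxy` are hypothetical names, not the PR's FilteredQueue, and the sketch relies on CPython's asyncio.Queue keeping its state in instance attributes set by `__init__`.

```python
import asyncio


class BrokenProxy(asyncio.Queue):
    def __init__(self, real_queue: asyncio.Queue) -> None:
        # The base class is never initialized here.
        self._real_queue = real_queue


class InitializedProxy(asyncio.Queue):
    def __init__(self, real_queue: asyncio.Queue) -> None:
        super().__init__(maxsize=real_queue.maxsize)
        self._real_queue = real_queue


def inherited_get_nowait_error(queue: asyncio.Queue) -> str:
    """Return the name of the exception raised by the inherited get_nowait."""
    try:
        queue.get_nowait()
    except Exception as exc:
        return type(exc).__name__
    return "no error"


if __name__ == "__main__":
    real = asyncio.Queue()
    print(inherited_get_nowait_error(BrokenProxy(real)))
    print(inherited_get_nowait_error(InitializedProxy(real)))
```

Note that initializing the base class only makes the inherited methods well-defined. Unless they are overridden too, they still operate on the proxy's own internal storage, not on the wrapped queue.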

@yuxwd yuxwd closed this Apr 18, 2026
@yuxwd yuxwd reopened this Apr 18, 2026

yuxwd commented Apr 18, 2026

Server test results

[Three screenshots of the server test runs]

@Soulter Soulter force-pushed the master branch 2 times, most recently from faf411f to 0068960 Compare April 19, 2026 09:50