TTS Extension for SillyTavern

This is a text-to-speech (TTS) plugin system built for SillyTavern (酒馆).

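Main Features

Core Features

Multiple TTS providers: supports OpenAI, ElevenLabs, Edge TTS, and other TTS services
Smart message reading: automatically reads the mes_text content of chat messages for speech synthesis
Floating button control: a quick-access floating button that reads the latest message aloud with one click
Multi-voice support: assign different voices to different characters
Automatic HTML tag filtering: strips HTML tags from messages to keep speech synthesis quality high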

Usage

Basic Setup

Select your preferred TTS provider in the TTS settings
Configure the API endpoint and key (if required)
Assign voices to characters

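Using the Floating Button

Location: the purple round button with a volume icon in the bottom-right corner of the page
Behavior:
- Click the button: reads the latest chat message aloud
- Click again during playback: stops the current playback
- During playback the button turns pink and shows a pulse animation
Text handling:
- Extracts text from the .mes_text element
- Strips all HTML tags (<p>, <br>, etc.)
- Reads only the clean text content
Status feedback:
- Operation status is shown through toastr notifications
- Button color and animation reflect the playback state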

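OpenAI Compatible Provider

Endpoint: set your TTS API endpoint
Model: choose the TTS model (e.g. tts-1)
Available voices: configure the list of available voices (comma-separated)
Speed: adjust the playback speed (0.25-4.0)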
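
The voice list and speed settings above could be normalized along these lines. This is only a sketch: parseVoiceList and clampSpeed are hypothetical helper names that do not come from the extension, and falling back to a speed of 1.0 for non-numeric input is an assumption.

```javascript
// Hypothetical helpers; the names are illustrative, not from the extension source.

/** Splits a comma-separated voice list into trimmed, non-empty names. */
function parseVoiceList(raw) {
    return String(raw ?? '')
        .split(',')
        .map((voice) => voice.trim())
        .filter((voice) => voice.length > 0);
}

/** Clamps a speed value to the documented 0.25-4.0 range (1.0 if not a number). */
function clampSpeed(value, min = 0.25, max = 4.0) {
    const speed = Number(value);
    if (!Number.isFinite(speed)) return 1.0; // assumed default
    return Math.min(max, Math.max(min, speed));
}
```

For example, parseVoiceList('alloy, echo,,nova ') yields three names, and clampSpeed(10) is limited to 4.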

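Technical Notes

Message Text Extraction Logic

The system extracts text content from div.mes_text:

Find the message blocks on the page (.mes_block)
Locate the .mes_text element inside
Extract the text content and remove all HTML tags (including <p>, <br>, etc.)
Send the cleaned text to the TTS engine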
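
The extraction steps above can be sketched roughly as follows. This is not the extension's actual code: the root parameter (added so the function can run without a browser) and the whitespace collapsing are assumptions made for illustration.

```javascript
/**
 * Returns the plain text of the last .mes_text inside a .mes_block,
 * or an empty string if none is found.
 * `root` is any object with querySelectorAll (defaults to the page document).
 */
function extractLatestMesText(root = document) {
    const nodes = root.querySelectorAll('.mes_block .mes_text');
    if (nodes.length === 0) return '';
    const latest = nodes[nodes.length - 1];
    // textContent returns text only, so tags such as <p> and <br> are dropped.
    // Collapsing runs of whitespace is an assumption, not documented behavior.
    return (latest.textContent ?? '').replace(/\s+/g, ' ').trim();
}
```

In the browser the function would be called with no arguments, so that it searches the whole page.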

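HTML Tag Filtering

The textContent property is used to drop all HTML tags automatically, ensuring:

No <p> paragraph tags
No <br> line-break tags
No other HTML formatting tags
Plain text content and natural spacing are preserved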

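Implementation Details

1. OpenAI Compatible Provider Changes (`openai-compatible.js`)

Added an extractTextFromMesBlock() method:

// Extracts the mes_text content of the latest message from the DOM
// Uses querySelector to locate the element
// Uses textContent to strip HTML tags automatically

Modified the fetchTtsGeneration() method:

Before sending the API request, it first extracts text from the DOM
If extraction succeeds, the extracted text is used; otherwise it falls back to the original inputText parameter
Preserves backward compatibility and does not break existing functionality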
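
The fallback described above might look like the sketch below. resolveTtsText and buildSpeechRequestBody are hypothetical names, and the settings field names are assumptions; the request body follows the shape of OpenAI's speech endpoint (model, input, voice, speed), which an OpenAI-compatible server is assumed to accept.

```javascript
/**
 * Picks the text to synthesize: the DOM text if extraction succeeded,
 * otherwise the original inputText.
 */
function resolveTtsText(inputText, extractText) {
    let extracted = '';
    try {
        extracted = String(extractText() ?? '');
    } catch (error) {
        // Extraction is best-effort; any failure falls back to inputText.
        extracted = '';
    }
    return extracted.trim().length > 0 ? extracted : inputText;
}

/** Builds a JSON body in the shape of an OpenAI-style speech request. */
function buildSpeechRequestBody(text, settings) {
    return {
        model: settings.model,
        input: text,
        voice: settings.voice,
        speed: settings.speed,
    };
}
```

Because the extractor is passed in as a function, a caller that has no DOM access still gets the original inputText unchanged.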

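2. Floating Button (`index.js`)

New functions:

extractLatestMesText() - general-purpose function that extracts the latest message text
onFloatingButtonClick() - handles clicks on the floating button
addFloatingButton() - creates the floating button and adds it to the page

Interaction logic:

On click, check whether TTS is enabled
Extract the plain text of the latest message
If playback is in progress, stop it; otherwise start a new playback
Add visual feedback (button color change and animation)
Remove the playing state automatically when playback ends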
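
The interaction logic above amounts to a small toggle. The sketch below is one possible reading, not the code in index.js: createFloatingButtonHandler and every injected dependency (isTtsEnabled, extractText, play, stop, setPlayingStyle, notify) are hypothetical, and play() is assumed to accept a callback that fires when the audio ends.

```javascript
/**
 * Creates a click handler that toggles between "start reading" and "stop".
 * All dependencies are injected, so the sketch does not touch the DOM.
 */
function createFloatingButtonHandler({ isTtsEnabled, extractText, play, stop, setPlayingStyle, notify }) {
    let playing = false;

    // Clears the playing state and its visual feedback.
    const finish = () => {
        playing = false;
        setPlayingStyle(false);
    };

    return function onFloatingButtonClick() {
        if (!isTtsEnabled()) {
            notify('TTS is disabled');
            return;
        }
        if (playing) {
            stop();
            finish();
            return;
        }
        const text = extractText();
        if (!text) {
            notify('No message text found');
            return;
        }
        playing = true;
        setPlayingStyle(true);
        // play() is assumed to call finish when the audio ends.
        play(text, finish);
    };
}
```

In this sketch the playing check runs before text extraction, so a second click can always stop playback even if no message text is available at that moment.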

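3. Styling (`style.css`)

Floating button style highlights:

Fixed positioning: position: fixed in the bottom-right corner
Gradient background: purple gradient, switching to a pink gradient during playback
Interaction feedback:
- Enlarges on hover (scale(1.1))
- Shrinks on click (scale(0.95))
- Pulse animation during playback
High stacking order: z-index: 9999 keeps it always visible
Responsive design: round button, 60x60px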

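Code Quality Assurance

✅ All code includes Chinese comments for easier understanding
✅ Follows the minimal-change principle and does not break existing functionality
✅ Detailed JSDoc comments added
✅ Includes error handling and edge-case checks
✅ Passes the linter with no syntax errors
✅ Backward compatible; the original API surface is unchanged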

Provider Requirements

Because I don't know how (or whether you even can), and/or maybe I am just too lazy to implement interfaces in JS, here are the requirements a provider must meet for the extension to operate.

class YourTtsProvider

Required

Exported for use in extension index.js, and added to providers list in index.js

generateTts(text, voiceId)
fetchTtsVoiceObjects()
getVoice(voiceName)
onRefreshClick()
checkReady()
loadSettings(settingsObject)
settings field
settingsHtml field

Optional

previewTtsVoice()
separator field
processText(text)
dispose()

Requirement Descriptions

generateTts(text, voiceId)

Must take the text to be rendered and the voiceId that identifies the voice to use. Must return audio data whose audioData.type is one of ['audio/mpeg', 'audio/wav', 'audio/x-wav', 'audio/wave', 'audio/webm'].
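
A provider can check its own output against that list before handing it back. The helper below is a hypothetical sketch (isSupportedAudioType is not part of the extension) and assumes audioData is a Blob-like object with a type property.

```javascript
// MIME types listed in the requirement above.
const SUPPORTED_AUDIO_TYPES = ['audio/mpeg', 'audio/wav', 'audio/x-wav', 'audio/wave', 'audio/webm'];

/** Returns true if audioData (e.g. a Blob) reports one of the listed MIME types. */
function isSupportedAudioType(audioData) {
    return SUPPORTED_AUDIO_TYPES.includes(audioData?.type);
}
```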

fetchTtsVoiceObjects()

Required. Used by the TTS extension to get a list of voice objects from the provider. Must return a list of voice objects representing the available voices.

name: a friendly user facing name to assign to characters. Shows in dropdown list next to user.
voice_id: the provider specific id of the voice used in fetchTtsGeneration() call
preview_url: a URL to a local audio file that will be used to sample voices
lang: OPTIONAL language string

getVoice(voiceName)

Required. Must return a single voice object matching the provided voiceName. The voice object must have at least the following:

name: a friendly user facing name to assign to characters. Shows in dropdown list next to user.
voice_id: the provider specific id of the voice used in fetchTtsGeneration() call
preview_url: a URL to a local audio file that will be used to sample voices
lang: OPTIONAL language indicator
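
As a rough illustration of the lookup, getVoice() can be a search over the same voice objects that fetchTtsVoiceObjects() returns. findVoiceByName is a hypothetical helper, the example values are made up, and throwing when nothing matches is an assumption about error handling.

```javascript
/**
 * Finds a voice object by its user-facing name.
 * Throws if no voice matches (this error handling is an assumption).
 */
function findVoiceByName(voices, voiceName) {
    const match = voices.find((voice) => voice.name === voiceName);
    if (!match) {
        throw new Error(`TTS voice not found: ${voiceName}`);
    }
    return match;
}

// Example voice objects with the fields described above (values are made up).
const exampleVoices = [
    { name: 'Narrator', voice_id: 'voice-1', preview_url: 'previews/narrator.mp3', lang: 'en-US' },
    { name: 'Guide', voice_id: 'voice-2', preview_url: 'previews/guide.mp3' },
];
```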

onRefreshClick()

Required. Responds to the user clicking the refresh button, which is intended to reconnect or re-initialize the selected provider into a working state, for example by retrying connections or checking that everything is loaded.

checkReady()

Required. Return without error to let the TTS extension know that the provider is ready. Return an error to block the main TTS extension from initializing the provider and UI. The error will be shown directly in the TTS extension UI.

loadSettings(settingsObject)

Required. Handle the input settings from the TTS extension on provider load. Put code in here to load your provider settings.

settings field

Required, used for storing any provider state that needs to be saved. Anything stored in this field is automatically persisted under extension_settings[providerName] by the main extension in saveTtsProviderSettings(), as well as loaded when the provider is selected in loadTtsProvider(provider). TTS extension doesn't expect any specific contents.

settingsHtml field

Required, injected into the TTS extension UI. Besides adding it, not relied on by TTS extension directly.

previewTtsVoice()

Optional. Function to handle playing previews of voice samples if no direct preview_url is available in fetchTtsVoiceObjects() response

separator field

Optional. Used when narrate quoted text is enabled. Defines the string of characters used to introduce separation between the groups of extracted quoted text sent to the provider. The provider uses this to introduce pauses; the default is ...
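
To make the role of the separator concrete, the sketch below joins quoted passages with it. This is an illustration only: joinQuotedText is a hypothetical name, the quote-matching regular expression is an assumption about how quoted text is found, and padding the default with spaces is also an assumption.

```javascript
// Assumed default; the description above only gives "..." as the default.
const DEFAULT_SEPARATOR = ' ... ';

/** Extracts double-quoted passages and joins them with the separator. */
function joinQuotedText(text, separator = DEFAULT_SEPARATOR) {
    const quoted = [...text.matchAll(/"([^"]+)"/g)].map((match) => match[1].trim());
    return quoted.join(separator);
}
```

A provider whose engine pauses on a different marker could set its separator field to that marker instead.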

processText(text)

Optional. A function applied to the input text before passing it to the TTS generator. Can be async.

dispose()

Optional. Function to handle cleanup of provider resources when the provider is switched.
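
Putting the requirements together, a provider skeleton might look like the sketch below. Treat it as a starting point under stated assumptions rather than a reference implementation: the endpoint URL, the settings fields, the HTML snippet, and the request format are placeholders, and generateTts is assumed to return the fetch Response so the extension can read the audio from it.

```javascript
class YourTtsProvider {
    // Persisted by the extension; the contents are provider-specific.
    settings = {};
    // Placeholder defaults (assumptions for this sketch).
    defaultSettings = { endpoint: 'http://localhost:8000/tts', voices: ['default'] };
    // Optional: string inserted between groups of quoted text.
    separator = ' ... ';

    get settingsHtml() {
        return '<label for="your_tts_endpoint">Endpoint</label><input id="your_tts_endpoint" type="text" />';
    }

    async loadSettings(settingsObject) {
        this.settings = { ...this.defaultSettings, ...settingsObject };
        await this.checkReady();
    }

    // Resolves when ready; throws so the extension can show the error.
    async checkReady() {
        if (!this.settings.endpoint) {
            throw new Error('No TTS endpoint configured');
        }
    }

    async onRefreshClick() {
        await this.checkReady();
    }

    async fetchTtsVoiceObjects() {
        // preview_url is null here; previewTtsVoice() would then handle previews.
        return this.settings.voices.map((name) => ({ name, voice_id: name, preview_url: null, lang: 'en-US' }));
    }

    async getVoice(voiceName) {
        const voices = await this.fetchTtsVoiceObjects();
        const match = voices.find((voice) => voice.name === voiceName);
        if (!match) throw new Error(`TTS voice not found: ${voiceName}`);
        return match;
    }

    async generateTts(text, voiceId) {
        // Hypothetical request format; adapt it to the actual TTS service.
        const response = await fetch(this.settings.endpoint, {
            method: 'POST',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify({ text, voice: voiceId }),
        });
        if (!response.ok) throw new Error(`TTS request failed: HTTP ${response.status}`);
        return response;
    }

    // Optional: release resources when the provider is switched.
    dispose() {}
}
```

The class would still need to be exported and added to the providers list in index.js, as described under Required.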

