Add support for voice styles to Text-to-Speech #1182
-
Context
Text-to-Speech models can often generate a voice in different styles: happy, friendly, angry, sad, etc. Home Assistant is currently only able to expose a single style for each voice. The Text-to-Speech entities currently allow listing the supported languages and, per language, getting the supported voices (docs). The selected voice is passed as the

Decision
We extend the

diff --git a/homeassistant/components/tts/models.py b/homeassistant/components/tts/models.py
index 2d693571a0f..0193f955646 100644
--- a/homeassistant/components/tts/models.py
+++ b/homeassistant/components/tts/models.py
@@ -9,3 +9,4 @@ class Voice:
     voice_id: str
     name: str
+    variants: list[str]

Variants can only be picked if
The variant is passed in the options dict passed to

Consequences
The number of available voices/styles that a user can choose from for Text-to-Speech providers will greatly increase. Example integrations that will benefit:

Alternatives
As an alternative, we could list every style of a voice as its own voice. For example, we would list AmyNeural:friendly, AmyNeural:sad, etc. The downside is that this would result in very long lists that are difficult to browse.

Updates
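To illustrate the Decision section above, here is a minimal sketch of a voice model carrying a `variants` list, and of a variant travelling in an options dict. It is not taken from the Home Assistant code base: the `frozen=True` dataclass decorator, the helper name `build_tts_options`, and the option keys `"voice"` and `"variant"` are assumptions made for this example only.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)  # decorator arguments are an assumption for this sketch
class Voice:
    """Stand-in for the Voice model shown in the diff."""

    voice_id: str
    name: str
    variants: list[str]  # the field added by the proposal


def build_tts_options(voice: Voice, variant: str | None = None) -> dict[str, str]:
    """Hypothetical helper: build an options dict for one voice.

    The keys "voice" and "variant" are illustrative, not confirmed names.
    """
    options = {"voice": voice.voice_id}
    if variant is not None:
        # Only accept variants that the voice itself advertises.
        if variant not in voice.variants:
            raise ValueError(f"{variant!r} is not a variant of {voice.voice_id}")
        options["variant"] = variant
    return options


amy = Voice(voice_id="AmyNeural", name="Amy", variants=["friendly", "sad"])
print(build_tts_options(amy, "friendly"))
```

With this shape, a voice picker lists one entry per voice and offers the variants as a second choice, instead of one flat entry per voice and style combination as described under Alternatives.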
Replies: 4 comments 3 replies
-
For clarification: is the provided style set in the configuration of the TTS in the voice assistant settings, or can it also be set as part of the action to augment a message? It would be nice to include a style as a parameter.
-
What can this be used for? Why do we need it?
Those seem to be 18 different voices, not styles. I mean OK, technically they may be styles, but from an application standpoint, they can't really be used as such.
That's a test env on your tenant which can't be accessed by anyone who doesn't have the right.
-
We discussed this one in the architectural core meeting last week. The idea/concept is OK to add. However, we think it should be part of the voice objects we already return. This existing dataclass could, in our opinion, be extended with a property that holds these styles. Also: maybe use "variants" or "moods" instead of styles? 🤷
-
In addition to the voice styles, it would also be useful to be able to set the voice rate/speed.
This is exactly as suggested and pre-approved. 👍
So, with that, this is a go 🚀
../Frenck