What happens when SSML alphabet not specified #1706
Lentrance supports SSML and is a member of the Japanese DAISY Consortium. I will ask.
I'm assuming that if a reading system is voicing the HTML, it's sending the text content to the engine, so if there isn't an alphabet, or it knows the engine doesn't support the grammar, the reading system should just send the actual text instead of the value of the ssml:ph attribute. But it would definitely be good to know what someone who has actually tried to implement this has made of our instructions. Another oddity is that we don't even say to use the phonemes in place of the text content. That wasn't my experience in the past generating TTS from HTML, as we had to inject the phonemes into the text content of the files prior to voicing. The question I keep having when I look at these is: are we trying to work with known TTS engines, or are we trying to model a new type of TTS engine? If it's the latter, we need a lot more detail.
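For readers following along, the attributes under discussion look like this in an XHTML content document. A minimal illustrative fragment (the pronunciation strings are examples, not normative values):

```xml
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:ssml="http://www.w3.org/2001/10/synthesis">
  <body>
    <!-- alphabet and phonemes supplied together: unambiguous -->
    <p><span ssml:alphabet="ipa" ssml:ph="təˈmeɪtoʊ">tomato</span></p>
    <!-- ssml:ph with no in-scope ssml:alphabet: the case under
         discussion, whose handling is currently undefined -->
    <p><span ssml:ph="təˈmɑːtoʊ">tomato</span></p>
  </body>
</html>
```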
Interactions between TTS engines and browsers/reading systems are an implementation-dependent gray area. Different implementations appear to do different things. The JDC is studying this topic (especially for ruby) and should continue to do so. Having said that, I do not think the first edition of the planned note can entirely solve this hard problem. Hopefully, we can at least say what implementations do about ssml:alphabet.
Right, this is what makes our requirements confusing.
Agreed. I'd just like to see the requirements accommodate different approaches. Here, probably all we need to say is that the reading system should use the text content when an alphabet isn't specified, but allow it to supply a default. Similarly, it should use the supplied phonemes in place of the text content when an alphabet is specified. We don't need to force a single solution or get bogged down in minutiae. An intro that clarifies that there are different models wouldn't hurt, either.
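The fallback rule proposed above can be sketched as follows. This is a hypothetical helper, not a normative algorithm; the parameter names and the assumption that the engine advertises its supported alphabets are mine:

```python
def text_to_speak(element_text, ph=None, alphabet=None,
                  engine_alphabets=frozenset({"ipa"}),
                  default_alphabet=None):
    """Sketch of the proposed fallback rule (illustrative only).

    - No ssml:ph: speak the element's text content.
    - ssml:ph without an in-scope ssml:alphabet: fall back to a
      reading-system-supplied default alphabet, if there is one.
    - Alphabet missing or unsupported by the engine: ignore the
      phonemes and speak the text content instead.
    """
    if ph is None:
        return element_text
    effective = alphabet if alphabet is not None else default_alphabet
    if effective in engine_alphabets:
        return ph  # pass the phoneme string to the TTS engine
    return element_text  # unsupported or unspecified alphabet
```

For example, `text_to_speak("tomato", ph="təˈmeɪtoʊ", alphabet="ipa")` would hand the phonemes to the engine, while the same call without an alphabet (and no default) would fall back to the text content.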
@okayama247 Do you know what will happen when the alphabet is not specified? I would guess that implementations in Japan always assume x-JEITA.
I was looking at the SSML definition today, and it defines processing behaviours, including leaving it to processors to decide what to do when the alphabet is not specified:
https://www.w3.org/TR/speech-synthesis11/#g9 To avoid inconsistencies with SSML, we should probably also adopt all of its processing behaviours for the two attributes, not just inherit their semantics as we currently do.
Another of the holes in our TTS definitions is that we don't define what to make of an ssml:ph attribute that doesn't have an ssml:alphabet defined for it. Do we default to IPA? To whatever the TTS engine supports by default?
It's not even required that there be an in-scope alphabet definition.
Also, what happens if the alphabet is defined but not supported? Default to IPA again?
We need to figure out which of these are really reading system requirements and which are just information passed through to a TTS engine.