What happens when SSML alphabet not specified #1706

mattgarrish · 2021-06-17T15:58:47Z

Another of the holes in our TTS definitions is that we don't define what to make of an ssml:ph attribute that doesn't have an ssml:alphabet defined for it. Do we default to... IPA? To whatever the actual TTS engine supports by default?

It's not even required that there be an in-scope alphabet definition.

Also, what happens if the alphabet is defined but not supported? Default to IPA again?

We need to figure out which of these are really reading system requirements and which are just information passed to a TTS engine.

murata2makoto · 2021-06-17T20:47:36Z

Lentrance supports SSML and is a member of the Japanese DAISY Consortium. I will ask.

mattgarrish · 2021-06-17T21:46:49Z

I'm assuming that if a reading system is voicing the HTML, it's sending the text content to the engine. So if there isn't an alphabet, or it knows the engine doesn't support the alphabet, the reading system should just send the actual text instead of the value of the ssml:ph attribute.

But it would definitely be good to know what someone who has actually tried to implement this has made of our instructions.

Another oddity is that we don't even say to use the ssml:ph value in place of the text when having it voiced. There seems to be an assumption that the text and markup are sent to the tts engine and it makes sense of what to do with these.

That wasn't my experience in the past generating TTS from HTML, as we had to inject the phonemes into the text content of the files prior to voicing. The question I keep having when I look at these is: are we trying to work with known TTS engines, or is this trying to model a new type of TTS engine? If we're trying to create the latter, we need a lot more detail.
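For illustration only, the "inject the phonemes prior to voicing" model could be sketched roughly as below. This is a minimal sketch, not what any existing reading system does: the function name `to_ssml` and the text-only fallback when no alphabet is in scope are assumptions made for the example. It flattens an XHTML fragment into SSML body text and emits a `<phoneme>` element only where an ssml:ph value has an in-scope ssml:alphabet.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# Namespace used for the ssml:ph and ssml:alphabet attributes.
SSML_NS = "http://www.w3.org/2001/10/synthesis"
PH = f"{{{SSML_NS}}}ph"
ALPHABET = f"{{{SSML_NS}}}alphabet"


def to_ssml(node, alphabet=None):
    """Hypothetical helper: flatten an XHTML element into SSML body text."""
    # The nearest ssml:alphabet declaration on this element or an ancestor wins.
    alphabet = node.get(ALPHABET, alphabet)
    ph = node.get(PH)
    if ph and alphabet:
        # Phonemes plus a known alphabet: wrap the element's text in <phoneme>.
        spoken = escape("".join(node.itertext()))
        return (
            f"<phoneme alphabet={quoteattr(alphabet)} "
            f"ph={quoteattr(ph)}>{spoken}</phoneme>"
        )
    # Assumption: with no in-scope alphabet, fall back to the plain text.
    out = escape(node.text or "")
    for child in node:
        out += to_ssml(child, alphabet) + escape(child.tail or "")
    return out


doc = ET.fromstring(
    '<p xmlns="http://www.w3.org/1999/xhtml" '
    'xmlns:ssml="http://www.w3.org/2001/10/synthesis" ssml:alphabet="ipa">'
    'I say <span ssml:ph="təˈmɑːtoʊ">tomato</span>.</p>'
)
print(to_ssml(doc))
# I say <phoneme alphabet="ipa" ph="təˈmɑːtoʊ">tomato</phoneme>.
```

If the ssml:alphabet attribute is removed from the `<p>` in this sketch, the output is just the text content, which is the ambiguity this issue is about.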

murata2makoto · 2021-06-17T23:16:21Z

Interactions between TTS engines and browsers/RSes are an implementation-dependent gray area. Different guys appear to do different things. JDC is studying this topic (especially for ruby) and should continue to do so. Having said that, I do not think the first edition of the planned note can entirely solve this hard problem. Hopefully, we can say what implementations do about ssml:alphabet.

mattgarrish · 2021-06-18T00:25:59Z

> Different guys appear to do different things.

Right, this is what makes our requirements confusing.

> I do not think the first edition of the planned note can entirely solve this hard problem.

Agreed. I'd just like to see the requirements allow adaptation to different approaches.

Here, probably all we need to say is that the reading system should use the text content when an alphabet isn't specified, but allow it to supply a default. Similarly, it should use the supplied phonemes in place of the text content when an alphabet is specified. We don't need to force a single solution or get bogged down in minutiae.
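As a rough sketch of the rule proposed here (not spec text), the selection logic might look like the following. The function name `choose_speech_input`, its parameters, and the `supported` set are all hypothetical; the fallback to text for an unsupported alphabet follows the suggestion made earlier in this thread rather than anything currently defined.

```python
from typing import Optional, Set, Tuple


def choose_speech_input(
    text: str,
    ph: Optional[str],
    alphabet: Optional[str],
    supported: Set[str],
    default_alphabet: Optional[str] = None,
) -> Tuple[str, str, Optional[str]]:
    """Return (kind, value, alphabet), where kind is "phonemes" or "text"."""
    if not ph:
        # No ssml:ph value at all: voice the text content.
        return ("text", text, None)
    # A reading system may supply its own default when none is in scope.
    effective = alphabet or default_alphabet
    if effective is not None and effective in supported:
        # Alphabet known and usable: the phonemes replace the text content.
        return ("phonemes", ph, effective)
    # Assumption: no alphabet, or one the engine can't handle, means plain text.
    return ("text", text, None)


# An author-specified alphabet the engine supports.
print(choose_speech_input("tomato", "təˈmɑːtoʊ", "ipa", {"ipa"}))
# No alphabet in scope and no reading system default.
print(choose_speech_input("tomato", "təˈmɑːtoʊ", None, {"ipa"}))
```

Keeping `default_alphabet` as an optional argument leaves room for the different models discussed above, for example an implementation that always assumes one particular alphabet.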

An intro that clarifies that there are different models wouldn't hurt, either.

murata2makoto · 2021-06-22T13:10:18Z

@okayama247 Do you know what happens when the alphabet is not specified? I guess that implementations in Japan always assume x-JEITA.

mattgarrish · 2021-06-22T14:59:10Z

I was looking at the SSML definition today, and it defines processing behaviours, including leaving the behaviour to the processor when the alphabet is not specified:

> It is an error if a value for alphabet is specified that is not known or cannot be applied by a synthesis processor. The default behavior when the alphabet attribute is left unspecified is processor-specific.

https://www.w3.org/TR/speech-synthesis11/#g9

To avoid inconsistencies with SSML, we should probably also adopt all processing behaviours for the two elements, not just inherit their semantics as we currently do.

mattgarrish added the Spec-TTS label (The issue affects the EPUB 3 Text-to-Speech Enhancements 1.0 WG Note) Jun 17, 2021

mattgarrish mentioned this issue Jun 25, 2021

SSML and PLS clarifications #1723

Merged

iherman closed this as completed in #1723 Jun 29, 2021
