Issue 0195 audio descriptions #349

skynavga · 2017-05-30T22:20:25Z

This is a preliminary PR for enhancements to audio functionality, in order to specify audio-related behavior more fully and to add missing functionality needed to support common audio usage scenarios, such as text-to-speech and audio descriptions. Additional work in this branch is required before merging.


nigelmegitt

Great work in progress, @skynavga, thank you for this. I've added some constructive comments. In my view it is very nearly complete already.

If you would like, I can draft a section that maps this to a speech and Web Audio graph in order to explain the "presentation" model. It would most likely be an annex.

nigelmegitt · 2017-05-31T15:16:39Z

spec/ttml2.xml

+<div3 id="audio-style-attribute-gain">
+<head>tta:gain</head>
+<p>The <att>tta:gain</att> attribute is used to specify an audio style property that
+determines a <emph>gain</emph> multiplier to be applied to the the sum of all active audio content during


I think this should not be all active audio content, but the active audio content in the context of the element to which it applies. Indeed this is confirmed by the example and note below, which are exactly what I would expect.

nigelmegitt · 2017-05-31T15:17:36Z

spec/ttml2.xml

+<tr>
+<td><emph>Values:</emph></td>
+<td>
+<code><loc href="#style-value-percentage">&lt;number&gt;</loc></code>


In the Web Audio gain node this is a float. I don't see why we should not also make it a float here. Limiting to [-1,1] seems unnecessarily limiting.

I would like to add that we base the semantics of this on the GainNode as linked above.

Having checked how we handle floats, we just need to remove the [-1,1] interval restriction because <number> is already effectively a float.

nigelmegitt · 2017-05-31T16:05:06Z

spec/ttml2.xml

+</tr>
+</tbody>
+</table>
+<p>For the purpose of determining applicability of this audio style property,


This is actually weird here, isn't it?

nigelmegitt · 2017-06-01T07:55:47Z

spec/ttml2.xml

+then the computed value of the property associated with this attribute is clamped to this bounded interval.</p>
+<p>If the computed value of the property associated with this attribute is negative, then gain is set to
+the absolute value of the computed value and a phase inversion is applied.</p>
+<p>If gain is 0, then active audio content is fully muted. If gain is 1, then the amplitude of active


s/active/the applicable
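The gain rules in the quoted diff can be sketched as follows. This is a minimal illustration, not spec text: the function name `apply_gain` and the list-of-floats sample representation are assumptions made for the example, and the interval clamp is deliberately left out because whether to keep it is what this review is discussing.

```python
def apply_gain(samples, gain):
    """Scale the applicable audio samples by a gain value.

    A negative gain is treated as its absolute value plus a phase
    inversion, which is the same as multiplying by the signed value.
    """
    magnitude = abs(gain)
    # Phase inversion flips the sign of every sample.
    sign = -1.0 if gain < 0 else 1.0
    return [sign * magnitude * s for s in samples]


# Hypothetical sample values, chosen only for illustration.
samples = [0.5, -0.25]
muted = apply_gain(samples, 0)         # gain 0: fully muted
unchanged = apply_gain(samples, 1)     # gain 1: amplitude not modified
inverted = apply_gain(samples, -0.5)   # negative gain: halved and phase-inverted
```

With this reading, a gain greater than 1 simply amplifies, which is the behavior the review asks to permit.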

nigelmegitt · 2017-06-01T07:56:34Z

spec/ttml2.xml

+  and this child <code>p</code> combines a second active audio track to form the output to its children.
+  Furthermore, distinct gains are specified on each source audio as well as on the output of <code>p</code>,
+  such that the final output is <code>0.3[0.5(track1) + 0.8(track2)]</code>.</p>
+</note>


Good example and description - this is exactly what I was expecting/intending. It shows why gain needs to be applicable to p also.
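As a worked illustration of the expression `0.3[0.5(track1) + 0.8(track2)]` from the note, the sketch below mixes two hypothetical tracks. The helper names `scale` and `mix`, the sample values, and the assignment of 0.5 to the track from `div` and 0.8 to the track added at `p` are assumptions read off the expression, not taken from the spec markup.

```python
def scale(samples, gain):
    """Apply a gain multiplier to every sample."""
    return [gain * s for s in samples]


def mix(*tracks):
    """Sum tracks sample by sample (all tracks must have equal length)."""
    return [sum(frame) for frame in zip(*tracks)]


# Hypothetical sample data for the two source tracks.
track1 = [1.0, 0.0, -1.0]
track2 = [0.0, 1.0, 1.0]

# The div passes track1, with its own gain, down to the child p.
div_out = scale(track1, 0.5)

# p adds track2 with its own gain, then applies a gain to the combined output.
p_out = scale(mix(div_out, scale(track2, 0.8)), 0.3)
```

Expanding the expression shows why gain on `p` matters: the effective gains are 0.15 on `track1` and 0.24 on `track2`, and neither source gain alone could scale both tracks together.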

nigelmegitt · 2017-06-01T09:49:55Z

spec/ttml2.xml

+</table>
+<p>For the purpose of determining applicability of this audio style property,
+each character child of a <el>p</el> element is considered to be enclosed in an anonymous
+span.</p>


Again not sure what value this adds here.

nigelmegitt · 2017-06-01T09:57:38Z

spec/ttml2.xml

+<p>For the purpose of determining applicability of this audio style property,
+each character child of a <el>p</el> element is considered to be enclosed in an anonymous
+span.</p>
+<p>If the specified value of this attribute is not contained in the interval <code>[-1,1]</code>,


I'd like to state that we base the semantics for this on StereoPanner. This deals with how many input channels are processed and how many output channels there are, i.e. two in each case, with a requirement to up- or down-mix the input to 2 channels if necessary.

This has been done now.
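For readers unfamiliar with StereoPanner, the sketch below shows the equal-power panning it uses, as I understand the Web Audio StereoPannerNode algorithm. The function names and the per-sample interface are assumptions made for this example; the Web Audio specification is the authority on the exact processing.

```python
import math


def pan_mono(sample, pan):
    """Pan a mono sample to a (left, right) pair using equal-power gains."""
    pan = max(-1.0, min(1.0, pan))   # pan is limited to [-1, 1]
    x = (pan + 1.0) / 2.0            # map [-1, 1] onto [0, 1]
    return (sample * math.cos(x * math.pi / 2),
            sample * math.sin(x * math.pi / 2))


def pan_stereo(left, right, pan):
    """Pan a stereo sample pair, shifting energy toward one side."""
    pan = max(-1.0, min(1.0, pan))
    x = pan + 1.0 if pan <= 0 else pan
    gain_l = math.cos(x * math.pi / 2)
    gain_r = math.sin(x * math.pi / 2)
    if pan <= 0:
        # Panning left: the right channel is progressively moved into the left.
        return left + right * gain_l, right * gain_r
    # Panning right: the left channel is progressively moved into the right.
    return left * gain_l, right + left * gain_r
```

In both cases the output has exactly two channels, which is why inputs with other channel counts have to be up- or down-mixed first.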

nigelmegitt · 2017-06-01T10:02:30Z

spec/ttml2.xml

+  The <code>div</code> element provides one active audio track as an output to its child <code>p</code>,
+  and this child <code>p</code> combines a second active audio track to form the output to its children.
+  Furthermore, distinct pans are specified on each source audio as well as on the output of <code>p</code>,
+  such that the final output pan is <code>0.3[0.5(track1) + 0.8(track2)]</code>.</p>


I like the example and the derivation note, however the last part is really unclear - what does the mathematical expression mean here? I think we need to understand the equivalent resulting positions of track1 and track2.

nigelmegitt · 2017-06-01T10:14:13Z

spec/ttml2.xml

+</div3>
+<div3 id="audio-style-attribute-pitch">
+<head>tta:pitch</head>
+<p>The <att>tta:pitch</att> attribute is used to specify an audio style property that


I see that we reference SSML for this - specifically the reference is to §3.2.4, the SSML prosody element's pitch attribute.

nigelmegitt · 2017-06-01T10:51:03Z

spec/ttml2.xml

+</tbody>
+</table>
+<note role="derivation">
+<p>The semantics of the style property represented by this attribute are based upon 


Specifically, the presence of a tta:speak attribute produces the semantics of the SSML §3.1.1 speak element, whose p/s contents are the span's character content, with an SSML §3.2.4 prosody element whose rate attribute is set to the tta:speak attribute's value.


* Add a term for “audio generating element”
* Clarify the semantic derivation of gain and pan as they relate to audio generating elements vs non audio generating elements
* Add reference to the WD of WebAudio with an Editorial Note to indicate the basis in doing so is the expectation that WebAudio will be a Recommendation prior to TTML2, otherwise we will need to refactor.


nigelmegitt · 2017-06-14T16:15:20Z

I've generated pull request #393 into this branch which, if merged, will address my review comments. @skynavga I've requested your review - if it's okay please go ahead and merge it into here.

Issue 0195 add ad xsds

nigelmegitt · 2017-06-20T16:41:55Z

spec/ttml2.xml

 <p>If the computed value of the property associated with this attribute is negative, then gain is set to
 the absolute value of the computed value and a phase inversion is applied.</p>
-<p>If gain is 0, then the applicable audio content is fully muted. If gain is 1, then the amplitude of 
-the applicable audio content is not modified.</p>
+<p>If gain is 0, then active audio content is fully muted. If gain is 1, then the amplitude of active


@skynavga why restrict the gain to the range [-1,1] when gains greater than 1 are reasonable things to apply, and are supported e.g. by GainNode?

skynavga added 5 commits March 2, 2017 02:07

Add initial equipment for AD, namely, clip{Begin,End}, tta:gain, tta:…

a423f29

…span, tta:speak (#195).

Regenerate ED.

3811a7e

Progress work on AD features (#195).

621d86b

Regenerate ED.

346b135

Merge branch 'gh-pages' into issue-0195-audio-descriptions

6eaa13a

nigelmegitt reviewed Jun 1, 2017


Tom Rosier and others added 12 commits June 13, 2017 17:23

added attributes to xsd

b3a4adc

added audio attributes to span

7ae5c4b

fixed validation issues with nigels guidence

76c4300

added in audio element work

cb43b6e

Revert over-enthusiastic optimisation

cec8755

Import audio attributes

1f4d0ff

added audio attributes to p tag

96bce79

Add audio attributes to RelaxNG schema

a449c6e

Allow all audio attributes on p and span

d9599c3

while still excluding pitch and speak from embedded audio elements.

Address some review comments

cf3a7ce

* Make `tta:gain` an unconstrained `<number>`
* `tta:gain` and `tta:pan` apply additionally to `p` and `div` elements.
* Change “[all] active audio” to “applicable audio” in advance of adding a normative section on audio processing semantics.

Regenerate schema zip

43d4573

nigelmegitt mentioned this pull request Jun 14, 2017

Issue 0195 add ad xsds #393

Merged

Remove tta:gain constraints

721932c

In line with spec modification, remove the constraint restricting `tta:gain` values - they are permitted to be any number from -infinity to +infinity.

nigelmegitt and others added 5 commits June 14, 2017 17:52

Add "none" to tta:speak enumeration

7cc2623

Merge pull request #393 from w3c/issue-0195-add-ad-xsds

7b90178

Issue 0195 add ad xsds

Revert or revise audio properties as needed; fix schema issues (#393).

32fc17c

Revert ED and schema archives.

ee1e115

Merge branch 'gh-pages' into issue-0195-audio-descriptions

587187d

skynavga merged commit f8f06de into gh-pages Jun 18, 2017

nigelmegitt reviewed Jun 20, 2017


skynavga deleted the issue-0195-audio-descriptions branch August 21, 2017 16:26
