Clarified TAB character presentation semantics #258

Merged (8 commits) on Dec 12, 2017
5 changes: 5 additions & 0 deletions spec/ttml1.xml
@@ -3320,6 +3320,11 @@ above if possible.</p>
<p>The semantics of the above four cited XSL-FO properties are defined by
by <bibref ref="xsl11"/>, &sect; 7.17.3, 7.16.7, 7.16.12, and 7.16.8, respectively.</p>
</note>
<note role="clarification">
Contributor @skynavga, Nov 30, 2017:

Change as follows:

Add normative text:

"If an unnormalized U+0009 Horizontal Tab (HT) character appears in a character information item of a Reduced XML Infoset, then, when presented visually, that HT character may be treated as if it were a single U+0020 (SPACE) character."

Then change the note to read:

"Because the presentation semantics of U+0009 Horizontal Tab (HT) are not fixed, the use of HT in <code>#PCDATA</code> content is not recommended."

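To illustrate the behavior that the proposed normative text would permit, here is a minimal Python sketch. The function name `treat_ht_as_space` is hypothetical and does not come from TTML or from any implementation; it only shows the optional substitution of each HT character with a single SPACE before any further white-space processing.

```python
def treat_ht_as_space(text: str) -> str:
    """Replace every U+0009 (HT) with a single U+0020 (SPACE).

    Illustrative only: the proposed text says a processor *may* do this,
    so a conforming processor could also leave HT untouched.
    """
    return text.replace("\t", " ")


print(repr(treat_ht_as_space("Hello\tworld")))  # 'Hello world'
```

Because the substitution is optional, authors cannot rely on either outcome, which is the motivation for discouraging HT in `#PCDATA` content.
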
Contributor Author:

The proposed note does not include the observation that the TAB (U+0009) character is not collapsed when xml:space="default", which is what led to this issue in the first place.

Contributor @skynavga, Nov 30, 2017:

And the reason it does not include that observation is that it is incorrect. In fact, xml:space='default' maps to the XSL-FO semantics white-space-collapse='true', under which, for mixed elements, non-line-initial HT characters are required to be mapped to SPACE. Only in the edge case of a line-initial HT in a mixed element is XSL-FO unclear on whether it is mapped to SPACE, and there is some (indirect) evidence in XSL-FO that it is.

Therefore, there is no question that HT characters in non-line-initial positions in mixed content are mapped to SPACE (and are therefore subject to the normal SPACE collapse rules). Only in the case of a line-initial HT in mixed content does a possible ambiguity arise. Furthermore, it would be very odd indeed for an implementation that collapses non-line-initial HTs (which is required) not to collapse line-initial HTs as well. In fact, I know of no such implementation.

Even if an implementation did not collapse line-initial HT to SPACE, there are other presentation semantics in XSL-FO that essentially imply that XML white space at a line break is not presented, i.e., does not consume IPD space, except when white-space-collapse='false'.

In TTML2 we clear up the ambiguity by requiring that line-initial XML white space be collapsed when white-space-collapse='true' applies. In other words, we explicitly make the semantics equivalent to the CSS2.1 clarified semantics. We can do this with some confidence that it will not cause an interoperability problem because we know with reasonable certainty that no TTML1 implementation fails to collapse line-initial XML white space.
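
The collapse behavior described in this comment can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the XSL-FO or TTML2 algorithm: the function name `collapse_when_default` is hypothetical, the start of the string stands in for the line-initial position, and line breaking, element boundaries, and the other XSL-FO white-space properties are ignored.

```python
import re

# XML white space: SPACE, HT, CR, LF.
_XML_WS_RUN = re.compile(r"[ \t\r\n]+")


def collapse_when_default(text: str) -> str:
    """Approximate xml:space='default' handling for a single text run.

    1. Map every run of XML white space (including HT) to one SPACE.
    2. Drop the white space left at the start of the run, modeling the
       line-initial collapse that the comment says TTML2 requires.
    """
    collapsed = _XML_WS_RUN.sub(" ", text)
    return collapsed.lstrip(" ")


print(repr(collapse_when_default("\tHello\t\tworld")))  # 'Hello world'
```

In this model a line-initial HT and a non-line-initial HT end up treated the same way, which matches the claim that implementations collapsing one are expected to collapse the other.
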

Contributor:

Can we make the same change to clear up the ambiguity in TTML1 Third Edition then?

Contributor:

That was my original proposal (above), but @palemieux preferred not to mandate collapse of HT as specified in TTML2 and instead to (effectively) make it optional. Personally, I have no problem making it mandatory in TTML1 3Ed, which, IMO, is very unlikely to require a change in any implementation, since they are already doing the "right thing".

Contributor Author:

What about: No presentation semantics are specified for the TAB (U+0009) character. Furthermore, the TAB (U+0009) character can generate a glyph area. As a result, the use of the TAB (U+0009) character in #PCDATA content within p and span elements is not recommended.

<p>No presentation semantics are specified for the TAB (U+0009) character. Furthermore, the TAB (U+0009) character
can generate a glyph area. As a result, the use of the TAB (U+0009) character
in <code>#PCDATA</code> content within <code>p</code> and <code>span</code> elements is not recommended.</p>
</note>
</div3>
</div2>
</div1>