incomplete ruby implementation [I18N-ISSUE-431] #264

Closed
r12a opened this issue Nov 16, 2015 · 55 comments
Labels
i18n-needs-resolution Issue the Internationalization Group has raised and looks for a response on.

Comments

@r12a

r12a commented Nov 16, 2015

[moving to github from https://www.w3.org/Bugs/Public/show_bug.cgi?id=28265]

i was just reviewing this with a mind to close our i18n tracker issue for now (and reopen as support for the other aspects of the html5 markup model spreads), when it occurred to me that webvtt doesn't support the rb element.

This is widely supported in browsers (see the test results at http://www.w3.org/International/tests/repo/results/ruby-html#position), and i think it should therefore be supported by webvtt. Adding it allows for additional styling options, as well as reducing the potential for confusion in authors, who are used to using it in HTML.

Note that HTML5 supports all of the following:

<ruby><rb>...</rb><rt>...</rt>...
<ruby><rb>...<rt>...
<ruby>...<rt>...</rt>...
@zcorpan
Member

zcorpan commented Nov 16, 2015

In WHATWG HTML, ruby, rt and rp are conforming, the others are not. rb doesn't have any special styling by default in WebKit and Blink. In particular see http://software.hixie.ch/utilities/js/live-dom-viewer/saved/3749

To style the ruby base in VTT, you can use the <c> element (or I suppose in most cases you can just style the <ruby>). Similarly in HTML, you can use a <span>.
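A minimal sketch of that workaround, for authors who already have cue text written with <rb>: it rewrites each <rb>...</rb> pair into a WebVTT class span. The helper name and the class name "rb" are assumptions made for illustration; WebVTT does not define either. A page stylesheet could then target the base with a selector along the lines of ::cue(c.rb).

```python
import re

# Hypothetical helper: the class name "rb" is an illustrative choice,
# not something defined by the WebVTT specification.
_RB_OPEN = re.compile(r"<rb>")
_RB_CLOSE = re.compile(r"</rb>")


def rb_to_class_span(cue_text: str, class_name: str = "rb") -> str:
    """Rewrite <rb>...</rb> as <c.class_name>...</c> in WebVTT cue text."""
    with_open = _RB_OPEN.sub(f"<c.{class_name}>", cue_text)
    return _RB_CLOSE.sub("</c>", with_open)


cue = "<ruby><rb>選</rb><rt>えら</rt></ruby>びます"
converted = rb_to_class_span(cue)
print(converted)
```

The sketch only handles explicit <rb> start and end tags; markup that relies on omitted end tags would need a real parser.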

I think adding rb to VTT should wait until all browsers have adopted the necessary ruby styling for HTML.

I think we could add rp now though. Since it appears that ruby is not supported in some VTT implementations, it might be desirable to have fallback parens there. The only issue would be that legacy ruby-supporting clients that don't support rp in VTT would also render the parens.

@zcorpan
Member

zcorpan commented Nov 17, 2015

> The only issue would be that legacy ruby-supporting clients that don't support rp in VTT would also render the parens.

We could use markup like

<ruby>base<(><rt>text<)></ruby>

but it would be different from HTML and it wouldn't render the parens in existing ruby-unsupporting VTT implementations. It would also only allow simple parens.
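To illustrate what fallback parens are meant to achieve, here is a rough sketch of how a client without ruby rendering could flatten a cue to plain text with the annotation in parentheses. It does not implement the <(> / <)> syntax floated above; the function name and the regex-based approach are assumptions for illustration only.

```python
import re

# Matches <rt>text</rt>, or <rt>text when the end tag is omitted before </ruby>.
_RT = re.compile(r"<rt>(.*?)(?:</rt>|(?=</ruby>))", re.DOTALL)
# Matches any remaining tag so it can be stripped from the output.
_TAG = re.compile(r"</?[^>]*>")


def ruby_to_paren_fallback(cue_text: str, open_paren: str = "(", close_paren: str = ")") -> str:
    """Flatten ruby markup to plain text, wrapping each annotation in parens."""
    with_parens = _RT.sub(
        lambda m: f"{open_paren}{m.group(1)}{close_paren}", cue_text
    )
    return _TAG.sub("", with_parens)


print(ruby_to_paren_fallback("<ruby>base<rt>text</rt></ruby>"))
print(ruby_to_paren_fallback("<ruby>東<rt>とう</rt>京<rt>きょう</rt></ruby>"))
```

Because the parens are parameters here, the sketch is not limited to simple parentheses, which is one of the limitations noted for the tag-based proposal.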

@silviapfeiffer
Member

I wouldn't complicate things with special parens tags - I'd expect browsers will implement uniform ruby support instead. Also, I agree with waiting on <rb> - it seems it's only used for styling and there's no support in most browsers for that yet.

@dwsinger

We should re-assess the status of Ruby implementations in all browsers and consider whether we need to adapt VTT in the light of current practice. Re-opening.

@silviapfeiffer
Member

Looking at Richard's test page at https://www.w3.org/International/tests/repo/results/ruby-html and http://caniuse.com/#search=ruby makes me think we should now support <ruby>, <rb>, <rt>. What happened to <rp>?

@zcorpan
Member

zcorpan commented Oct 19, 2016

As far as I can tell, the situation has not changed since November 2015. Again, see http://software.hixie.ch/utilities/js/live-dom-viewer/saved/3749

r12a added the i18n-needs-resolution label Oct 21, 2016
@dwsinger

dwsinger commented Nov 16, 2016

After asking experts, I believe we have the following state:

  • <ruby> and <rt>: are in the specification now.
  • <rb>: is in by implication (it's a normal part of <ruby>, if I understand it correctly), but we should be explicit.
  • <rtc> and <rbc>: might be needed for some cases; should we include them? I am not aware of a need myself. I suggest we leave them out of the specification for now, to implementer choice and the future.
  • <rp>: is only in HTML for backwards compatibility and is not needed.

Can we mention <rb> as being part of <ruby> and leave <rtc> and <rbc> for implementer discretion and the future?

@r12a
Author

r12a commented Nov 17, 2016

@TPS I made those edits for @dwsinger.

For a discussion of ruby markup and how to use it in HTML, with live code examples and explanations of when rb and rtc become useful, see https://www.w3.org/International/articles/ruby/markup

The rb element is explicitly included in https://www.w3.org/TR/html5/text-level-semantics.html#the-rb-element

@dwsinger

I am lost; have we addressed this issue and can now close it?

@silviapfeiffer
Member

@r12a so what's your recommendation?

@zcorpan
Member

zcorpan commented Apr 18, 2017

Since rb is still not properly supported in browsers other than Firefox (per #264 (comment)), and it's not conforming in WHATWG HTML, I think it should not be included in WebVTT. We can add a note saying WebVTT only supports simple/limited ruby.

@silviapfeiffer
Member

Hmm, this was in relation to feedback by the i18n W3C group, so we have two choices:

1/ delay to next version of WebVTT, hoping rb will then be implemented (as per discussion at FOMS also, see https://www.w3.org/Bugs/Public/show_bug.cgi?id=28265#c21 )

2/ add rb now to follow the recommendation of the i18n group, but put it as a feature at risk for this version because it's not interoperably implemented

Given this, I would suggest we go with 1/

@r12a
Author

r12a commented Apr 18, 2017

The test pointed to by #264 (comment) only shows part of the story. It uses a tabular content model, which is mostly needed for double-sided ruby.

This test http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=5030 shows that rb tags work fine in ruby that isn't double-sided. It uses an interleaved content model.

The advantage of allowing the rb element is that it enables certain types of styling that can't be done without an element (such as background styling, or some accessibility related styling). One could use a span element, but it's more semantically meaningful and intuitive to use an rb element. Besides, much of the existing ruby markup in HTML pages in the wild uses rb elements, so this will also be more familiar for content authors.

Furthermore, looking to the future, once CSS Ruby is ready to ship, there will be additional reasons to support rb, since there will be certain styling features besides double-sided ruby that will rely on the tabular markup model (for which rb is needed).

My recommendation would be that if WebVTT is looking for a staged approach to supporting ruby, it should initially provide support for single-sided ruby (sometimes referred to as 'simple ruby'), that allows use of the rb element. Later it can provide support for double-sided ruby and the tabular content model.
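The difference between the two content models mentioned in this comment can be sketched by generating both forms of HTML markup from the same list of base/annotation pairs. This is HTML, not WebVTT (WebVTT cue text does not accept rb), and the function names are hypothetical.

```python
def interleaved_ruby(pairs):
    """Interleaved model: each base is followed by its annotation (rb rt rb rt)."""
    body = "".join(f"<rb>{base}</rb><rt>{text}</rt>" for base, text in pairs)
    return f"<ruby>{body}</ruby>"


def tabular_ruby(pairs):
    """Tabular model: all bases first, then all annotations in order (rb rb rt rt)."""
    bases = "".join(f"<rb>{base}</rb>" for base, _ in pairs)
    texts = "".join(f"<rt>{text}</rt>" for _, text in pairs)
    return f"<ruby>{bases}{texts}</ruby>"


pairs = [("東", "とう"), ("京", "きょう")]
print(interleaved_ruby(pairs))
print(tabular_ruby(pairs))
```

In the tabular form the pairing of base and annotation is positional, so individual bases can only be delimited with an element such as rb.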

@zcorpan
Member

zcorpan commented Apr 18, 2017

Indeed in webvtt you can use <c> instead of rb for styling the base in single-sided ruby.

@dwsinger

So we have a very confused situation. It's deprecated in whatwg html, there in w3c html 5.1, supported in some but not all use cases in browsers. Is the WhatWG workaround enough ("Providing the ruby base directly inside the ruby element or using nested ruby elements is sufficient.")? If so, why do we keep it in W3C HTML? Richard, help?

@zcorpan
Member

zcorpan commented Apr 18, 2017

I agree it is a confused situation. But I think it is not for webvtt to sort out. We just need to wait for ruby in HTML to reach a stable equilibrium and then re-evaluate complex ruby in webvtt.

@r12a
Author

r12a commented Apr 18, 2017

I don't think it's confused, although I have to admit that i'm a little confused about why the WhatWG spec hasn't yet been updated to align with the HTML spec wrt support for the rb tag.

As i said above, providing the ruby base directly inside the ruby element does not allow for all the kinds of styling you might need to implement, such as background/border styling of the individual base texts, or simple removal of kanji characters for accessibility. You really need an element sometimes. (And in fact, if rts close rbs and vice versa, it's actually simpler markup if you have <rb>東<rt>とう rather than 東<rt>とう</rt>.)

As i also said above, last time i checked, well over 50% of the ruby out there in the wild uses rb tags, so this is something authors feel comfortable with. These tags also provide better semantics for the markup: <rb>東<rt>とう is more meaningful than <c>東<rt>とう.

The rb tags currently work fine for simple ruby in all major browsers.

What doesn't work (yet) is just that a second annotation in complex ruby isn't correctly placed (even in Firefox!). This is because you really need CSS to do double-sided ruby, and the CSS Ruby spec is still under development. My understanding is that the browsers are waiting on stabilisation of the CSS spec before implementing the correct rendering of double-sided ruby. (Note that i said 'implementing double-sided ruby', not 'implementing the rb element'.)

I don't know who wrote that nested ruby is sufficient, but if you've ever actually tried doing double-sided ruby in a serious way using nested ruby, it's not at all easy to create or read, compared to the tabular content model. And if you have double-sided ruby that doesn't overlap the same base characters, it's a real headache.

As i also said above, some of the things you will want to do with CSS in the future, such as inlining mono-ruby after a compound word, eg. 東京(とうきょ) rather than 東(とう)京(きょ), can't be done without an rb tag used in the tabular model.

So, in conclusion, i believe that if you are reliant on current HTML/CSS support in browsers it's probably fine that WebVTT supports only simple ruby for now. For that, i believe the rb tag is useful, is what users will expect, and is supported by all major browsers.

Later, when you want to add complex ruby to WebVTT, you'll also need the properties in the CSS spec, and i think you'll find that you'll need the rb tag then anyway as part of the tabular content model. So why not start with the rb tag now for the simple cases, rather than make people change the way they use the markup in the future?

@fantasai, @aphillips, @kojiishi, any additional comments?

@dwsinger

@zcorpan "But I think it is not for webvtt to sort out." I completely agree, and I wonder whether we can simplify by saying that the ruby element, and its contents, are permitted in VTT cues just as they are in HTML and to the same extent and with the same meaning. People reading WhatWG HTML will then avoid rb if they can, people reading either HTML will be aware that some things don't work too well. In other words, simply defer to HTML rather than trying to duplicate, fix, (spindle, fold, or mutilate) etc.?

@zcorpan
Member

zcorpan commented Apr 18, 2017

That doesn't work though. If we are to allow rb in VTT then we need to change at least the VTT cue text parser. VTT is not HTML and doesn't have the same set of elements as HTML.

@dwsinger

Well, I get what you say, but let's keep thinking (OK, maybe I am far outside the box). We have exactly one section that defines parsing a ruby element, where we parse a "WebVTT cue ruby span", and that in turn exports the concept of "WebVTT cue internal text". The HTML spec exports a concept of the "base text", which I think is the same thing. Are we sure we can't say "parse between <ruby> and </ruby> according to HTML, setting WebVTT cue internal text to the base text as defined by HTML" (and probably a few more concepts)?

@zcorpan
Member

zcorpan commented Apr 19, 2017

It seems you are reading the Syntax section, which defines how to write WebVTT, not how to parse WebVTT. http://w3c.github.io/webvtt/#cue-text-parsing-rules defines the latter. There is no object for ruby base. An <rb> tag will be dropped on the floor, and so can't be styled at all in current implementations.

See this test

A few things to note here:

  • rb element is dropped
  • The rt is displayed inline in Safari TP and Chrome canary/Opera. In Firefox it is displayed as proper ruby (but the ruby text has a transparent background, so it is invisible; see "Ruby text has no background by default", #234).
  • Firefox does not support styling for cues.
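The first point can be sketched with a heavily simplified model of the cue text parser: only tag names with a corresponding WebVTT object produce a node, so <rb> leaves no trace in the resulting tree and there is nothing for a selector to match. This is an illustration under stated assumptions, not the algorithm from the spec; it ignores escapes, timestamps, class and annotation handling, and the Node type and function names are hypothetical.

```python
import re
from dataclasses import dataclass, field

# Tag names for which the WebVTT cue text parser creates objects.
KNOWN_TAGS = {"c", "i", "b", "u", "ruby", "rt", "v", "lang"}

# Either a tag (optional slash, tag name, anything up to '>') or a run of text.
_TOKEN = re.compile(r"<(/?)([A-Za-z]*)[^>]*>|([^<]+)")


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)


def parse_cue_text(cue_text: str) -> Node:
    """Build a tree from cue text, dropping tags that have no WebVTT object."""
    root = Node("root")
    stack = [root]
    for match in _TOKEN.finditer(cue_text):
        closing, name, text = match.group(1), match.group(2), match.group(3)
        if text is not None:
            stack[-1].children.append(text)
        elif name not in KNOWN_TAGS:
            continue  # unknown tag such as <rb>: dropped, its text is kept
        elif closing:
            if name == "ruby" and stack[-1].name == "rt":
                stack.pop()  # </ruby> also closes an open rt
            if len(stack) > 1 and stack[-1].name == name:
                stack.pop()
        elif name == "rt" and stack[-1].name != "ruby":
            continue  # rt only creates a node directly inside ruby
        else:
            node = Node(name)
            stack[-1].children.append(node)
            stack.append(node)
    return root


def tag_names(node: Node) -> list:
    """List the names of all element nodes below `node`, in document order."""
    names = []
    for child in node.children:
        if isinstance(child, Node):
            names.append(child.name)
            names.extend(tag_names(child))
    return names


tree = parse_cue_text("<ruby><rb>選</rb><rt>えら</rt></ruby>びます")
print(tag_names(tree))
```

The base text 選 ends up as a plain text child of the ruby node, which is why a <c> span is currently the only way to give it its own styling hook.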

@dwsinger

I searched the entire spec. http://w3c.github.io/webvtt/ for "ruby".

4.2.2 defines "WebVTT cue ruby span" and "WebVTT cue ruby text span"
5.4 defines "WebVTT Ruby Objects" == spans of WebVTT cue ruby span and "WebVTT Ruby text objects" == spans of WebVTT cue ruby text span
5.4 also has a definition that "WebVTT Ruby Text Object " is set by the text in

Everywhere else depends on these four definitions, which in turn rest mostly on the two definitions and the parsing rules in 4.2.2, and on the definition in 5.4

so if we can abstract the parsing rules in 4.2.2 to simply refer to "parse the contents of the ruby according to HTML" and define those two concepts, and similarly define WebVTT Ruby Text Object by reference to HTML, in 5.4, I don't see anything else in the spec. that relies on the actual tags.

@zcorpan
Member

zcorpan commented Apr 19, 2017

"parse the contents of the ruby according to HTML"

This makes no sense to me. I also don't understand what you're trying to solve by deferring to HTML.

If rb should work in VTT then we need to change the VTT parser and introduce a new object for it (and a mapping to Selectors and the DOM conversion rules). And we need new tests. And then all browsers need to implement it. Is there interest from implementors?

@dwsinger

dwsinger commented Apr 19, 2017 via email

@zcorpan
Member

zcorpan commented Apr 20, 2017

> I’m trying to find a way we can make it not our problem to sort out, as you said “I think it is not for webvtt to sort out”. I am wondering whether we can defer to HTML parsing and so on, for everything between <ruby> and </ruby>, rather than defining in VTT exactly how to parse, and having to choose (more tightly than HTML did) which ruby features/elements work and should be supported.

OK. I think it's not really possible to do that. Or at least extremely impractical.

> Somehow, I am feeling that if that’s not a question for HTML, we should be able to make it not a question for VTT. Maybe I am wrong; it certainly might call into question “VTT is implementable independently”.

A relevant difference between HTML and VTT is that HTML will create elements from unknown tags, whereas VTT will not. So VTT needs to decide which elements are part of the language.

For as long as VTT only supports simple ruby, not supporting rb does not leave out any use cases, because c can be used in its place.

@zcorpan
Member

zcorpan commented Apr 20, 2017

@r12a

> As i also said above, some of the things you will want to do with CSS in the future, such as inlining mono-ruby after a compound word, eg. 東京(とうきょ) rather than 東(とう)京(きょ), can't be done without an rb tag used in the tabular model.

This still couldn't be done in VTT because it restricts which CSS properties apply; 'display' is not in the list. Also rp is not an element in VTT.

@r12a
Author

r12a commented Apr 20, 2017

@zcorpan there are still some questions about how this would be done, but i don't think it would rely on rp being available, rather it would be done via styling.

@r12a
Author

r12a commented Apr 20, 2017

@dwsinger WebVTT currently supports only a very simple ruby content model. I don't think we need to worry too much at this stage about what comes beyond that. I'm just proposing that we allow users to add an optional rb tag around the base text. That will make it easier to extend ruby in WebVTT when the time is right to add complex ruby support, and make it more consistent with the current HTML model.

@silviapfeiffer
Member

Can I ask for comments by browsers on whether there is interest in implementing "RB" support, i.e. parsing "RB" as a valid element in WebVTT cues? @eric-carlson @kentuckyfriedtakahe may be able to comment

@zcorpan on a side note: all non browsers that implement WebVTT would already support basic "rb" because browsers support it in HTML. So it already passes the implementation test.

@zcorpan
Member

zcorpan commented May 8, 2017

I don't follow. Which test?

@dwsinger

dwsinger commented May 8, 2017 via email

@zcorpan
Member

zcorpan commented May 9, 2017

> “is <rb> implemented?” test;

Can you link to a test case and tell me which implementation passes the test?

> I think she’s saying that for any VTT implementation that transforms VTT to HTML for rendering, the answer is yes, as the browsers do support rb?

That would be a non-conforming implementation, since the VTT parser is required to drop unknown elements. If you mean that non-browser VTT implementations do not implement the VTT parser algorithm, that seems like a problem?

@dwsinger

dwsinger commented May 9, 2017 via email

@zcorpan
Member

zcorpan commented May 9, 2017

Maybe @silviapfeiffer can clarify this with different words

> @zcorpan on a side note: all non browsers that implement WebVTT would already support basic "rb" because browsers support it in HTML. So it already passes the implementation test.

This is what I am trying to understand. Being specific and precise (like "Safari TP passes the test at this URL") would be appreciated. 😊

@zcorpan
Member

zcorpan commented May 9, 2017

Thanks. That shows what DOM is constructed from parsing HTML.

I still don't see how to get from there to rb support in non-browsers that implement WebVTT. It seems to me that would need a VTT test case to run in a non-browser that implements WebVTT (which one?).

@silviapfeiffer
Member

@zcorpan what I was referring to are the JavaScript libraries that support WebVTT. Pretty sure that e.g. videojs and jwplayer would support it, since they merely render HTML.

@dwsinger

dwsinger commented May 11, 2017

I asked a colleague who has access to our non-browser implementation. He modified a .vtt file with Japanese <ruby></ruby> and <rt></rt> elements and added <rb></rb> elements around the base text portion of the <ruby> element. The ruby annotation displays OK despite the (unsupported) <rb></rb> elements.

Here are two cues with additional <rb> elements:

00:00:26.000 --> 00:00:29.000 position:60%
pos60%もし何かすごい<ruby><rb>能力</rb><rt>のうりょく</rt></ruby>が持てるとしたら、何を<ruby><rb>選</rb><rt>えら</rt></ruby>びますか?

00:00:32.000 --> 00:00:33.500 line:25% position:25%
line25%pos25%見えなくなることを<ruby><rb>選</rb><rt>えら</rt></ruby>ぶと思います。

If he misapplied this <rb> element, let me know.

So, the result seems to be "parsed and ignored", as hoped for.

@zcorpan
Member

zcorpan commented May 11, 2017

(I updated your comment to fix formatting.)

@zcorpan
Member

zcorpan commented May 11, 2017

@dwsinger excellent, thanks for testing. That would be compliant with the current VTT parser spec -- unknown tags get dropped on the floor.

@dwsinger

Cool, so we could note that <rb> falls into that category, as it's not needed for the simpler cases currently supported in VTT, i.e. like any unknown tag, it may be present and can/should be ignored?

@silviapfeiffer
Member

silviapfeiffer commented May 11, 2017

@dwsinger are you suggesting we don't apply the patch and reply to the I18N group (Richard's advice) that these tags can currently be used in WebVTT, but are ignored because they are unsupported, which is sufficient? I don't think that was the intention of Richard's advice...

Also, I don't agree with the analysis that we can assume that your non-browser implementation ignores the <rb> tag - do you have a parsed version of the WebVTT cues? A shadow DOM or whatever is rendered rather than just the original file and a visual inspection?

[sorry, had to update my comment]

@zcorpan
Member

zcorpan commented May 19, 2017

I still think we should WONTFIX this, and revisit when ruby rendering in VTT is supported in more user agents than just Gecko, and at least one of the following has taken place:

  • STYLE blocks are implemented, so we can tell the difference in non-browser implementations.
  • CSS ruby is interoperably implemented in browsers. (This will likely result in rb being made conforming in whatwg/html.)

@dwsinger

dwsinger commented May 19, 2017 via email

@silviapfeiffer
Member

FWIW I think Simon's position makes sense.

Adding my patch would require everyone to implement parsing support for <rb> without getting any rendering advantage from it. That typically means that users will not use it. The point about STYLE blocks is a further good argument against it.

@r12a
Author

r12a commented May 24, 2017

Sorry, been a little distracted lately.

> Richard, can you opine as the original commenter?

I can't think of anything more to add to my case. Am i right to understand that we're not ruling out the eventual use of the rb tag for WebVTT, just deferring its introduction? If so, i guess i'm mostly concerned for the case that Silvia mentioned: "re-training users to do it differently in WebVTT with a "c" element seems counter-intuitive". Perhaps it would be better not to push too strongly on the use of c, which i guess is just for application of styling, but mention that a future version of WebVTT is likely to introduce the rb element?

@zcorpan i had a couple of clarification questions on the following - just trying to understand the conversation above a bit better. I'm not as well versed in the WebVTT technology as you guys, sorry to lag a bit:

> I still think we should WONTFIX this, and revisit when ruby rendering in VTT is supported in more user agents than just Gecko, and at least one of the following has taken place:
>
>   • STYLE blocks are implemented, so we can tell the difference in non-browser implementations.
>   • CSS ruby is interoperably implemented in browsers. (This will likely result in rb being made conforming in whatwg/html.)

I followed the link to the test page, and looked at it in Firefox, Chrome & Safari (on a Mac). I wasn't completely sure what the intended conclusion was. What i saw was that the orange colour doesn't stick to the base text in Chrome/Safari when rb is used, but does when c is used. Firefox (Gecko) doesn't colour either (which seemed the opposite of what you said?). And none of the rendered results show ruby text being positioned above the base text. Is that consistent with what you are seeing?

When you say 'CSS ruby is interoperably implemented in browsers', i assume that you mean that tabular use of the markup (rb rb rt rt) is supported(?) (Since HTML already positions the rt above the rb in all browsers when using rb rt elements.)

@zcorpan
Member

zcorpan commented May 24, 2017

> Perhaps it would be better not to push too strongly on the use of c, which i guess is just for application of styling, but mention that a future version of WebVTT is likely to introduce the rb element?

I think that is very reasonable.

> I followed the link to the test page, and looked at it in Firefox, Chrome & Safari (on a Mac). I wasn't completely sure what the intended conclusion was. What i saw was that the orange colour doesn't stick to the base text in Chrome/Safari when rb is used, but does when c is used. Firefox (Gecko) doesn't colour either (which seemed the opposite of what you said?). And none of the rendered results show ruby text being positioned above the base text. Is that consistent with what you are seeing?

It's consistent with what I see. There are several things going on here, so I understand that it's confusing.

  • Since <rb> is dropped during parsing, it cannot be styled, so the background is not applied.
  • The page applies styling with a <style> block in HTML, rather than with STYLE in VTT. STYLE in VTT is not supported anywhere yet AFAIK. Styling VTT from HTML is supported in Chrome/Safari, but not Firefox, which is why Firefox doesn't style the c element either.
  • Chrome/Safari render the rt inline, while Firefox renders rt above the base text (so Firefox supports ruby rendering). The ruby text is invisible in Firefox because the text is white and the video background is also white, and backgrounds don't inherit. (You can see the ruby text in Firefox in http://software.hixie.ch/utilities/js/live-dom-viewer/saved/5202 .) I've fixed this in the spec in #239 ("Fix #234: Use a default background for <rt>"), but it's not yet implemented.

> When you say 'CSS ruby is interoperably implemented in browsers', i assume that you mean that tabular use of the markup (rb rb rt rt) is supported(?) (Since HTML already positions the rt above the rb in all browsers when using rb rt elements.)

Yes, or, I guess that's part of it. As I understand it, the styling of ruby in non-Gecko engines is a bit hard-coded to the HTML element, rather than being implemented in terms of CSS properties that are applied. If CSS ruby is implemented as specified, it should work to style arbitrary elements as ruby and have it work. Test case: http://software.hixie.ch/utilities/js/live-dom-viewer/saved/5203 (This is relevant for VTT because it doesn't use HTML elements for rendering, so an HTML-only implementation may not be applicable to VTT, c.f. ruby rendering in VTT not working in Chrome/Safari today despite them supporting ruby rendering in HTML.)

@dwsinger

did we reach a conclusion here? Richard? Simon? Silvia? I don't mind how, but we should close this somehow!

@silviapfeiffer
Member

Yes, we're converging. I'll make a new PR at the weekend with just a note about the future introduction of rb.

zcorpan added a commit that referenced this issue Jun 15, 2017
@zcorpan
Member

zcorpan commented Jun 15, 2017

PR at #348

zcorpan added a commit that referenced this issue Jun 16, 2017
@r12a
Author

r12a commented Jun 16, 2017

Thanks all.

@silviapfeiffer
Member

Thanks for your patience, Richard!

@dwsinger

dwsinger commented Jun 16, 2017 via email
