Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[IR] entities test is failing #464

Closed
gkatsev opened this issue Jul 3, 2019 · 7 comments
Closed

[IR] entities test is failing #464

gkatsev opened this issue Jul 3, 2019 · 7 comments
Labels
IR issues that arose during the implementation report phase

Comments

@gkatsev
Copy link
Collaborator

gkatsev commented Jul 3, 2019

In Firefox and Safari, the html entities parsing tests are only partially passing 14 of the 26 tests are passing. A previous version of the spec only called out several entities but the current spec calls out being able to parse all html entities.
Chrome fully supports all HTML entities.

@gkatsev gkatsev added the IR issues that arose during the implementation report phase label Sep 11, 2019
@css-meeting-bot
Copy link
Member

The Timed Text Working Group just discussed [IR] entities test is failing webvtt#464, and agreed to the following:

  • RESOLUTION: No spec change, when exiting CR, for this particular test, lack of 2 passing tests does not signify non-implementation of any feature.
The full IRC log of that discussion <nigel> Topic: [IR] entities test is failing webvtt#464
<nigel> github: https://github.com//issues/464
<nigel> Gary: This is similar to regions in that Safari has the older spec implementation.
<nigel> .. Safari and Firefox have it. Only some HTML character entities are whitelisted.
<nigel> .. More recently all HTML character entities are allowed.
<nigel> .. I guess the question here is, even given that not all character entities are supported, but the most popular ones are,
<nigel> .. would we be okay to move forward without waiting for implementation updates in this area?
<nigel> Nigel: What was the motivation for the spec change to allow all entities? That's important for this discussion.
<nigel> Gary: To align better with HTML and to allow people to use those character entities, especially since WebVTT has HTML
<nigel> .. syntax so if you are relying on that for special characters then you want to make sure your text is rendered appropriately.
<nigel> Nigel: Are there workarounds for authors?
<nigel> Gary: You'd be able to put the UTF-8 value in.
<nigel> David: We changed to align with HTML on character entities. Is that a problem?
<nigel> Gary: The implementations do not currently do that.
<nigel> Eric: We haven't implemented the spec change.
<nigel> Glenn: Named or character entities?
<nigel> David: Named. We've always required support for numbered ones.
<nigel> .. There were I think 6 named ones before the change.
<nigel> Eric: Yes, 6 exactly.
<nigel> [discussion about numerical character references]
<nigel> Glenn: That's strange, not to support all numeric character references.
<nigel> Gary: Yes it was strange.
<nigel> Glenn: There should be a way to get every numerical character reference into the document.
<nigel> Gary: WebVTT is defined as UTF-8 so you can always include the actual character.
<nigel> Eric: It's a bug in the implementation, it should be fixed.
<nigel> Gary: Should this block CR exit?
<nigel> Glenn: In TTML we only require only the XML spec-defined named character references, so having all the named ones
<nigel> .. from HTML seems redundant in my opinion, so I don't have any issue with limiting it to a small number.
<nigel> .. But there is an HTML compatibility issue if people think they should be able to use all the same entities.
<nigel> .. There may be a surprise factor if HTML named entities don't work.
<nigel> Nigel: +1
<nigel> Andreas: Your proposal is to wait for implementations, Gary?
<nigel> Gary: My question was actually should we not wait for implementations but just move forward with the spec as is.
<nigel> Andreas: So there will be spec text where you don't expect implementation?
<nigel> .. Why would you not revert the spec to match what is implemented?
<nigel> .. If you leave it in the spec then implementers will expect to use it.
<nigel> Gary: Chrome does do the new spec behaviour.
<nigel> Eric: Either leaving as is or reverting will require a change in the browser.
<nigel> .. It's more useful to allow them.
<nigel> .. The question is if it is worth blocking CR exit based on incomplete implementation?
<nigel> Andreas: I understand, the change is reasonable, it's just not fully implemented yet.
<nigel> Nigel: What is the behaviour for an unrecognised named character entity?
<nigel> Eric: It gets output as is, e.g. &foo; comes out just like that.
<nigel> Andreas: I don't think it is worth blocking for this.
<nigel> Eric: I agree it is not worth blocking for this.
<nigel> Gary: There are bugs against Firefox and Safari for this.
<gkatsev> -> https://bugs.webkit.org/show_bug.cgi?id=176225 webkit entities bug
<gkatsev> -> https://bugzilla.mozilla.org/show_bug.cgi?id=1395924 firefox entities bug
<nigel> PROPOSAL: No spec change, when exiting CR, for this particular test, lack of 2 passing tests does not signify non-implementation of any feature.
<nigel> Gary: Any objections?
<nigel> .. No objections
<nigel> RESOLUTION: No spec change, when exiting CR, for this particular test, lack of 2 passing tests does not signify non-implementation of any feature.
<nigel> Andreas: Question: if you use Unicode code points as the reference for correct rendering, sometimes if the font is not available at authoring time, the safest way is to type the character reference, otherwise you cannot specify what you want.
<nigel> .. Is there a test for numerical code point support?
<nigel> Nigel: That's a question about the test suite.
<nigel> Eric: Yes, and about the spec too.
<nigel> Gary: Does the spec need to specify conversion of code points directly?
<nigel> Nigel: The spec uses "HTML Character Reference" which includes numerical ones. Are you saying there needs to be a test for that?
<nigel> Andreas: The Rec should match implementations, but if implementations don't support numerical character references
<nigel> .. then should we add tests?
<nigel> Nigel: There's already a test for it
<nigel> -> https://github.com/web-platform-tests/wpt/blob/master/webvtt/parsing/cue-text-parsing/tests/entities.html Entity parsing tests
<nigel> .. It currently fails except on Chrome
<nigel> .. 7 entities tests pass on stable builds right now.

@silviapfeiffer
Copy link
Member

Are we not counting vlc or vtt.js for compatible implementations? Surely vtt.js and Chrome as supporting all entities should be sufficient.

@gkatsev
Copy link
Collaborator Author

gkatsev commented Sep 21, 2019

Vttjs doesn't support this currently. Last time I checked vlc it didn't support it but it is definitely possible it does. I think the agreement that this shouldn't hold up exit from cr us good. Hopefully, vlc already has support so we don't even need to worry about it.

@silviapfeiffer
Copy link
Member

So is vttjs filtering the html entities it supports? Surely that's a simple fix...?

@gkatsev
Copy link
Collaborator Author

gkatsev commented Sep 21, 2019

It's parsing out the old allowed list. The issue is that it's putting the text as text content. Putting it into innerHTML has some security implications.

@gkatsev
Copy link
Collaborator Author

gkatsev commented Jan 7, 2020

Video.js's version of vtt.js now has support for all HTML Character Entities and it's available in Video.js 7.7.4. Thus, with Chrome and Video.js/vtt.js we have 2 distinct implementations.

@gkatsev gkatsev closed this as completed Jan 7, 2020
@silviapfeiffer
Copy link
Member

👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
IR issues that arose during the implementation report phase
Projects
None yet
Development

No branches or pull requests

3 participants