Wide Review Comment 2017: serialisation and parsing #367

nigelmegitt · 2017-09-27T11:00:24Z

Copy/paste from https://lists.w3.org/Archives/Public/public-tt/2017Sep/0080.html - raising as an issue for tracking/disposition purposes.

The WebVTT syntax is similar to (but incompatible with) SRT but otherwise distinct from all other syntaxes, and includes a subsection that is effectively CSS syntax. I consider the serialisation and parsing of a document format to be an architectural layer in its own right, ideally with tests, tools and support for the format. Because WebVTT has a unique format, the benefits of referencing an independent serialisation and parsing layer are absent. For internal business-to-business transactions this creates some hurdles: it is costlier to develop a syntax checker, for example to validate that received files are well formed or to quality-check the content; and writing custom parser code becomes a security risk, since issues like buffer overflow are more commonly, though not uniquely, found in less mature code. The tool support for e.g. JSON, HTML or XML serialisation is much more mature and less likely to suffer from these problems.

It is unclear what action could resolve this with WebVTT in its current form, without taking seemingly extreme steps. For example if WebVTT were a semantic model plus an API, and alternative representations were defined, and at least one of those alternative representations were a more commonly used one, that would help, though at the expense of adding an initial step for every WebVTT import or export, which is to work out which representation to use.

From this perspective, the syntax of WebVTT seems better suited to direct writing and editing in text editors by humans than by software, though obviously it is ultimately feasible to use either. For an organisation like the BBC authoring and distributing subtitle documents at scale it would be better to optimise for machine reading and writing instead of human reading and writing, since we expect subtitle authors and editors to use specialist software rather than tweaking files directly.

dwsinger · 2017-09-28T20:29:04Z

I'm not sure that I agree that the model means it's easier to write by hand. But though this is an interesting comment, it doesn't seem actionable ("It is unclear what action could resolve this with WebVTT").

nigelmegitt · 2017-09-29T09:30:18Z

The possible line of action I propose is to split the WebVTT semantic model from the representation. I (probably) expect a "won't fix" response to that, but it's at least a route to consider.

dwsinger · 2017-09-29T17:52:54Z

This is an architectural question that was resolved by the WG long ago, and such comments should have been made in the WG prior to asking for a first, let alone second, external wide review.

nigelmegitt · 2017-09-29T19:13:35Z

Perhaps that was before my time. I had the impression it was an architectural decision made by WHATWG in their fork of HTML before WebVTT was split out into a separate spec. I certainly don't recall it being discussed in the WG.

silviapfeiffer · 2017-10-15T06:12:11Z

The comment is that the line-based format of WebVTT should be abandoned and replaced by an XML- or JSON-based format. While it is appreciated that JSON parsing is already built into the browser and an alternative version of WebVTT could be specified in JSON, this is not a practical suggestion in the context of the current status of WebVTT, which has a well-established usage and community of use. The WG may wish to discuss further, but I see no other resolution than "works for me".

Also note that both XML and JSON require "end tags" for creating valid files. When first discussing the ability to specify time-aligned text and an encapsulation format for it (about 8 years ago), a file format with end tags was deemed unusable because of the need for progressive delivery, particularly in live streaming and when muxed into media files where you cannot wait for the end of the caption file before rendering captions. Thus, not having end tags is actually an advantage of WebVTT, since it avoids the need for file flattening which hierarchical file formats require.
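The progressive-delivery property described above can be illustrated with a minimal Python sketch. It is a deliberately simplified reader, not the parsing algorithm from the WebVTT specification: the `TIMING` pattern, the `Cue` type and the `iter_cues` function are hypothetical names introduced here, cue settings are ignored, and anything before a timing line (header, cue identifiers, NOTE blocks) is simply skipped. The point is only that a blank line ends a cue, so each cue can be handed on before the rest of the file exists.

```python
import re
from typing import Iterable, Iterator, NamedTuple

# Simplified timing line: optional hours, then mm:ss.ttt on both sides of "-->".
# Cue settings after the end time are ignored in this sketch.
TIMING = re.compile(
    r"^((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})\s+-->\s+((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})"
)


class Cue(NamedTuple):
    start: str
    end: str
    text: str


def iter_cues(lines: Iterable[str]) -> Iterator[Cue]:
    """Yield each cue as soon as the blank line that ends it arrives."""
    start = end = None
    payload = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if start is None:
            match = TIMING.match(line)
            if match:
                start, end = match.groups()
            # Any other line outside a cue (header, identifier, NOTE) is skipped.
        elif line == "":
            yield Cue(start, end, "\n".join(payload))
            start = end = None
            payload = []
        else:
            payload.append(line)
    # A cue still open when the input ends is emitted as-is.
    if start is not None:
        yield Cue(start, end, "\n".join(payload))
```

Because `iter_cues` is a generator over an iterable of lines, it can be fed from a socket or a growing file, and the first cue is available without the reader ever seeing the end of the stream.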

nigelmegitt · 2017-10-30T17:24:27Z

> a file format with end tags was deemed unusable because of the need for progressive delivery

Doesn't it depend on the unit of information that requires tags to be closed? I mean, you could define an "update message" format with close tags that modifies a previous entity. My main point was that it is helpful to separate the serialisation and parsing layer from the format, and to make that layer reusable.

In any case the general idea that live use cases cannot be achieved if end tags are required has been shown to be false by counter-example, in that there exist now live subtitled streams whose data format contains close tags.
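The "update message" idea can be sketched as follows, under stated assumptions. The message shape here (one complete JSON object per line, with hypothetical `id`, `op` and `fields` keys) is invented for illustration and is not taken from any existing subtitle format. Each message closes all of its own delimiters, so a generic JSON parser handles the syntax, while a later message can still modify a cue introduced by an earlier one.

```python
import json
from typing import Dict, Iterable, Iterator


def apply_updates(messages: Iterable[str]) -> Iterator[Dict[str, dict]]:
    """Apply self-contained update messages; yield the cue state after each."""
    cues: Dict[str, dict] = {}
    for raw in messages:
        if not raw.strip():
            continue
        # Each message is a complete document, so the stock parser suffices.
        msg = json.loads(raw)
        cue_id = msg["id"]
        if msg.get("op") == "remove":
            cues.pop(cue_id, None)
        else:
            # Any other op creates the cue or overwrites the listed fields
            # of a cue that an earlier message introduced.
            cues.setdefault(cue_id, {}).update(msg["fields"])
        # Yield a copy so callers cannot mutate the internal state.
        yield {key: dict(value) for key, value in cues.items()}
```

The unit that must be closed is the message, not the whole document, which is what allows a receiver to render after every line rather than waiting for the end of the stream.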

I'm not sure why this is labelled "WR-commenter-rejected" since I haven't received a disposition to reject or accept yet, so I'm going to remove that label.

silviapfeiffer · 2017-11-10T23:00:43Z

I admit I don't fully understand what the labels mean ;-)

silviapfeiffer · 2017-11-10T23:05:07Z

@nigelmegitt You can most certainly use a format with end tags for live use cases, but that requires repackaging the format as you describe in the need for "serialisation" of other formats. That's not required for WebVTT because it is inherently serialised. So, the decision on creation of WebVTT (which happened jointly in the HTML and WHATWG groups) was that WebVTT should be serialised from the start. It's a bit moot now anyway.

silviapfeiffer · 2017-11-11T04:40:00Z

I'd like to close this as "works for me"

dwsinger · 2017-11-13T23:13:54Z

WFM too.

silviapfeiffer · 2017-12-21T13:21:07Z

@nigelmegitt now that the bug is closed, could you add your disposition?

nigelmegitt · 2018-02-14T14:18:24Z

I'm still not clear what the WG disposition is that I'm being asked to respond to. On the basis that it is "comment rejected, we will take no action", all I can do is recognise that is the case and move on. It hasn't changed my view or original comment.

dwsinger · 2018-02-20T00:35:49Z

It's hard to say what the disposition of this is, as it suggests that some other syntax might be better, but then says "It is unclear what action could resolve this with WebVTT in its current form, without taking seemingly extreme steps." And we all agree that extreme steps are...mildly undesirable. I hope I got the tags right.

nigelmegitt · 2018-02-27T09:32:39Z

Yes, they look about right, thanks.

silviapfeiffer added this to the WD-wide-review milestone Oct 1, 2017

dwsinger added WR-open WR-CG-open labels Oct 2, 2017

silviapfeiffer added WF-commenter-rejected works for me WR-commenter-rejected WR-CG-rejected and removed WF-commenter-rejected labels Oct 15, 2017

nigelmegitt removed the WR-commenter-rejected label Oct 30, 2017

silviapfeiffer added WR-pending and removed WR-CG-open WR-open labels Nov 11, 2017

dwsinger added the WR label Dec 11, 2017

silviapfeiffer closed this as completed Dec 21, 2017

dwsinger added WR-rejected WR-resolved WR-commenter-rejected and removed WR-pending labels Feb 20, 2018
