Interpreting the pixel measure #27

plehegar opened this issue Nov 4, 2015 · 1 comment


commented Nov 4, 2015

We have defined the pixel measure with reference to XSL, namely:

The actual distance covered by the largest integer number of device dots (the size of a device dot is measured as the distance between dot centers) that spans a distance less-than-or-equal-to the distance specified by the arc-span rule in or superceding errata.


a fixed conversion factor, treating 'px' as an absolute unit of measurement (such as 1/92" or 1/72").

This is leading to problems in practice as we use TTML with actual delivered video. I recall that during the development of the specification we discussed the concept that px would equate to video pixels, so that authors could align elements precisely with elements in the video; however, this seems to have been lost.

I suggest that we need to clarify the pixel behaviour. In particular, we should explain that where extent is used on the tt element, it effectively defines the size of a pixel, in the sense that the root area extent is still mapped to a video overlay and divided into 'logical pixels'. If such an extent is not defined on the tt element, then the px measure should probably not be used (or should even be expressly forbidden).

For example, let's say that the author has a video nominally[1] 1024x768 in extent. However, this is being displayed fullscreen on a 1920x2000 monitor (with 280px of black above and below the video). If we use device dots as XSL suggests, the captions will not be aligned correctly with the video.

If the px unit is used on the extent attribute on the tt element (extent="1024px 768px"), I believe the expectation was that the root element is scaled to 1920x1440 along with the video and placed in correspondence with it, so that 1px actually means 1.875 device dots.
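The arithmetic in this example can be sketched as follows. This is only an illustration of the proposed 'logical pixel' interpretation, not behaviour defined by TTML or XSL: the function name `map_root_extent` is hypothetical, and the sketch assumes the root area is scaled uniformly to fit the display and then centred, as a fullscreen video would be.

```python
from fractions import Fraction


def map_root_extent(extent, display):
    """Scale a root area extent to fit a display, preserving aspect ratio.

    Hypothetical helper: assumes uniform scaling and centring, which is
    an assumption of this sketch rather than anything the spec states.
    Returns (device dots per logical px, scaled size, centring offset).
    """
    extent_w, extent_h = extent
    display_w, display_h = display

    # The limiting dimension determines the uniform scale factor.
    scale = min(Fraction(display_w, extent_w), Fraction(display_h, extent_h))

    scaled_w = extent_w * scale
    scaled_h = extent_h * scale

    # Remaining display space is split evenly on either side (black bars).
    offset_x = (display_w - scaled_w) / 2
    offset_y = (display_h - scaled_h) / 2

    return (
        float(scale),
        (float(scaled_w), float(scaled_h)),
        (float(offset_x), float(offset_y)),
    )


# Values from the example: extent="1024px 768px" on a 1920x2000 monitor.
scale, scaled_size, offset = map_root_extent((1024, 768), (1920, 2000))
print(scale, scaled_size, offset)
```

With the figures above this gives a scale of 1.875 device dots per px, a scaled root area of 1920x1440, and 280 device dots of black above and below.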

[1] Note also that the pixel extent is not the actual delivered pixel density of the video either, since in an adaptive streaming model the actual frame size may vary depending on bandwidth. It needs to be an authoring concept based on the original coding size of the video.

(raised by Pierre-Anthony Lemieux on 2012-08-24)
From tracker issue

@skynavga skynavga modified the milestone: TTML2WR Feb 23, 2017
@skynavga skynavga self-assigned this Apr 20, 2017
@skynavga skynavga removed their assignment May 11, 2017

commented May 11, 2017

Incorporating resolution of these comments into #30 and closing this issue.

@skynavga skynavga closed this May 11, 2017