Appendix N assumption that root temporal extent corresponds with the beginning of a related media object #76

plehegar · 2015-11-04T18:47:01Z

From Appendix N.2:

"The above formalisms assumes that the Root Temporal Extent corresponds with the beginning of a related media object. If this assumption doesn't hold, then an additional offset that accounts for the difference may be introduced when computing media time M."

In a streaming environment it is very likely, especially for live-created subtitles, that a set of TTML samples will be produced each with a different Root Temporal Extent that starts at a different point relative to a single Related Media Object (which may itself be packaged up into an independent set of samples). However I would not in general account for this with an offset and assume (or hope) that every sample's begin and end times are zero-based; rather I'd expect the begin of the TT to be coincident with the point within the Related Media Object at which that sample begins. As a side-effect this should reduce the complexity of the transformations needed to accumulate the [short] samples together to create a longer one.

For this reason I propose that we reverse this assumption.
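
To illustrate the side-effect mentioned above, here is a minimal Python sketch comparing the two conventions. The `Cue` tuple and both function names are hypothetical, introduced only for illustration; nothing here comes from the TTML specification.

```python
from typing import List, Tuple

Cue = Tuple[float, float, str]  # (begin, end, text), times in seconds


def accumulate_media_relative(samples: List[List[Cue]]) -> List[Cue]:
    """Cue times are already on the media timeline, so samples are simply merged."""
    return sorted(cue for sample in samples for cue in sample)


def accumulate_zero_based(samples: List[Tuple[float, List[Cue]]]) -> List[Cue]:
    """Each sample is zero-based, so its start offset must be added back to every cue."""
    return sorted(
        (start + begin, start + end, text)
        for start, cues in samples
        for begin, end, text in cues
    )


# The same two cues expressed under each convention (hypothetical values).
media_relative = [[(10.0, 12.0, "a")], [(20.0, 23.0, "b")]]
zero_based = [(10.0, [(0.0, 2.0, "a")]), (20.0, [(0.0, 3.0, "b")])]
```

Both calls yield the same merged cue list, but only the zero-based variant needs the per-sample offset to be carried alongside each sample.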

(raised by Nigel Megitt on 2013-08-12)
From tracker issue http://www.w3.org/AudioVideo/TT/tracker/issues/270

nigelmegitt · 2016-08-15T16:24:14Z

(admin point: at the current time, Appendix N in TTML1 maps to Appendix H in TTML2)

Having looked at this more, and after discussions, a couple of points need to be added - some but not all of this is restating other discussions elsewhere, for completeness here:

Firstly, the definition of Root Temporal Extent affects the treatment of the statement. If it is taken that the default for begin if unstated is effectively zero, then one might surmise that the Root Temporal Extent for e.g. the tt element always begins at zero. There are (at least) two other options however:

1. An undefined begin can be resolved when the document instance playback begins, which if it is on a media timeline might involve using some externally provided value, e.g. from an ISOBMFF box. In other words in this case the document processing context provides a syncbase which defines the beginning of the Root Temporal Extent, which may not be at the origin of the temporal coordinates. See below for a use case.
2. Treat the earliest computed begin time in the document as the beginning of the Root Temporal Extent.

A use case for 1 above is when the document is one of a sequence of "live" TTML documents, where the sender cannot provide a reliable time coordinate that can be successfully compared to a clock at the receiver, and where the author wants to signal "display this as soon as you can, but not for longer than 5s after beginning". To support this use case (which for reference is present in EBU Tech3370) the active begin time needs to be resolved downstream, where the resolution for that begin provides a syncbase value. An example document with no timing on regions might have:

<body dur="5s">

Then the effective begin of the simple duration, which everyone can agree is 5 seconds long, is the document's resolved begin time.

Secondly, the document processing context might provide a syncbase and also a playback entry time. For example, a segmentation algorithm might result in a document like:

<tt ttp:timeBase="media" ...>
<body>
<div>
<p begin="12345:00:00" ...>
...

where the processing context defines that the segment begin time is actually the equivalent of "12345:00:02". The syncbase is unchanged but the entry time is later (or earlier) than the earliest begin time in the document. This is not a scenario where signalling is required in the document instance, but it could affect the understanding of Root Temporal Extent, which may be considered to begin at the playback entry time. In fact, signalling processing context in the document would almost certainly be a very bad thing to do.
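
As a rough illustration of that separation, the Python sketch below keeps the document times untouched and lets the processing context pick the window that is actually presented. The function names, the end time of the `p` element, and the exit time are assumptions made up for the example; only the begin time and the entry time come from the scenario above.

```python
from typing import Optional, Tuple


def clock_to_seconds(value: str) -> int:
    """Convert an hh:mm:ss clock-time string into whole seconds."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def presented_interval(
    begin: int, end: int, entry: int, exit_: int
) -> Optional[Tuple[int, int]]:
    """Intersect a document interval with the window chosen by the processing context."""
    start, stop = max(begin, entry), min(end, exit_)
    return (start, stop) if start < stop else None


# Document timing: begin is from the example, the end value is hypothetical.
p_begin = clock_to_seconds("12345:00:00")
p_end = clock_to_seconds("12345:00:05")

# Processing context: entry is from the example, the exit value is hypothetical.
entry = clock_to_seconds("12345:00:02")
exit_ = clock_to_seconds("12345:00:10")

window = presented_interval(p_begin, p_end, entry, exit_)
```

The document's own begin value is never rewritten; the later entry time only shortens what is shown.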

skynavga · 2016-08-17T00:58:38Z

On Mon, Aug 15, 2016 at 10:24 AM, Nigel Megitt notifications@github.com
wrote:

(admin point: at the current time, Appendix N in TTML1 maps to Appendix H
in TTML2)

Having looked at this more, and after discussions, a couple of points need
to be added - some but not all of this is restating other discussions
elsewhere, for completeness here:

Firstly, the definition of Root Temporal Extent affects the treatment
of the statement. If it is taken that the default for begin if unstated
is effectively zero, then one might surmise that the Root Temporal Extent
for e.g. the tt element always begins at zero. There are (at least) two
other options however:

1. An undefined begin can be resolved when the document instance
playback begins, which if it is on a media timeline might involve using
some externally provided value, e.g. from an ISOBMFF box. In other words in
this case the document processing context provides a syncbase which defines
the beginning of the Root Temporal Extent, which may not be at the origin
of the temporal coordinates. See below for a use case.

I'm not sure what is meant by "an undefined begin", since there is always a
value for begin. It is either the specified value or zero, where the latter
is deduced from SMIL3 semantics.

Since one can't explicitly specify begin on tt, it must always be
zero. That only leaves the question: what is the syncbase of the tt
element? i.e., what establishes the origin against which this value zero is
resolved? Or, to put it another way, what is the time container parent of
tt?

2. Treat the earliest computed begin time in the document as the
beginning of the Root Temporal Extent.

This seems to make sense if and only if the document is using smpte time
base in discontinuous mode. It certainly does not make sense for media time
base. The question in my mind is whether this can make sense for smpte time
base in continuous mode or not.

A use case for 1 above is when the document is one of a sequence of "live"
TTML documents, where the sender cannot provide a reliable time coordinate
that can be successfully compared to a clock at the receiver, and where the
author wants to signal "display this as soon as you can, but not for longer
than 5s after beginning".

This sounds like either an event or interactive based timing model using
SMIL terminology. I think TTML presently does not support this model.

To support this use case (which for reference is present in EBU Tech3370)
the active begin time needs to be resolved downstream, where the
resolution for that begin provides a syncbase value. An example document
with no timing on regions might have:

<body dur="5s">

Then the effective begin value for the simple duration that everyone can
agree has a 5 second duration is the document resolved begin time.

Secondly, the document processing context might provide a syncbase and
also a playback entry time. For example, a segmentation algorithm might
result in a document like:

<tt ttp:timeBase="media" ...>

...
where the processing context defines that the segment begin time is
actually the equivalent of "12345:00:02".

Since this uses media time base, the only way this can work (IMO) is by
specifying ttp:mediaOffset='-44442002s'.
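
For reference, the arithmetic behind that figure can be checked with a few lines of Python. The helper name `clock_to_seconds` is hypothetical; it only handles the plain hh:mm:ss form used in this example.

```python
def clock_to_seconds(value: str) -> int:
    """Convert an hh:mm:ss clock-time string into whole seconds."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# 12345 h * 3600 s/h + 2 s = 44442002 s, negated to form the offset.
offset_seconds = -clock_to_seconds("12345:00:02")
offset_attribute = f"{offset_seconds}s"  # "-44442002s"
```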

The syncbase is unchanged but the entry time is later (or earlier) than
the earliest begin time in the document. This is not a scenario where
signalling is required in the document instance, but it could affect the
understanding of Root Temporal Extent, which may be considered to begin at
the playback entry time. In fact, signalling processing context in the
document would almost certainly be a very bad thing to do.

nigelmegitt · 2016-08-17T15:47:50Z

I'm not sure what is meant by "an undefined begin", since there is always a
value for begin. It is either the specified value or zero, where the latter
is deduced from SMIL3 semantics.

Thanks for picking me up on this - I need to be so careful with language. dur specifies the simple duration. Computing the active duration for an element according to the SMIL3 section "Computing the active duration" requires that the begin time be resolved. So I should have said "an unresolved begin".

There seems to be scope in SMIL for making dur relative to the resolved begin rather than the specified begin, and from previous readings I had convinced myself that it is permitted to use the treatment I set out, but, looking again now, I'm not sure that we actually can take advantage of those capabilities given the constraints of TTML, or why I thought we could (I need to hunt down some old notes).

This sounds like either an event or interactive based timing model using
SMIL terminology. I think TTML presently does not support this model.

I'm beginning to agree, to my frustration!

I still think it is unclear whether the tt element should be considered a time container that defines a syncbase, or whether doing so actually has any practical impact. One could take the view that this interpretation simply adds zero to whatever syncbase is provided by the processing context, which makes no difference.

We should probably be careful about the processing context providing syncbase vs epoch for clock times by the way, since they may not be coincident.

where the processing context defines that the segment begin time is actually the equivalent of "12345:00:02".
Since this uses media time base, the only way this can work (IMO) is by specifying ttp:mediaOffset='-44442002s'.

I think the problem I set out is orthogonal to any media offset, which can only move the content on the timeline. Effectively ttp:mediaOffset is equivalent to a special kind of begin attribute that can take a negative value. I don't think that's either necessary or helpful. Here's why: in general the document processing context can and will define the flow of time. Given a document instance that includes content on a defined timeline, a processor is at liberty to begin and end presentation at any two points on that timeline. The processing context constitutes a system that the document instance cannot impact. Similarly the document cannot define the change in rate of play during that presentation. Attempting to move this kind of information into the TTML document will just generate a new timeline against which the same consideration will apply; so the exercise will fail recursively.

In practice other processing systems already include the semantics that define presentation begin and end. Attempting to duplicate that within the TTML documents makes those document instances less reusable and also may lead to a scenario where the two 'layers' disagree with each other, to nobody's benefit. We should not (IMO) introduce this possibility.

skynavga · 2016-08-17T21:14:49Z

On Wed, Aug 17, 2016 at 9:47 AM, Nigel Megitt notifications@github.com
wrote:

I'm not sure what is meant by "an undefined begin", since there is always a
value for begin. It is either the specified value or zero, where the latter
is deduced from SMIL3 semantics.

Thanks for picking me up on this - I need to be so careful with language.
dur specifies the simple duration. Computing the active duration for
an element according to Computing the active duration
https://www.w3.org/TR/2008/REC-SMIL3-20081201/smil-timing.html#Timing-ComputingActiveDur
requires that the begin time be resolved. So I should have said "an
unresolved begin".

There seems to be scope in SMIL for making dur relative to the resolved
begin rather than the specified begin, and from previous readings I had
convinced myself that it is permitted to use the treatment I set out, but,
looking again now, I'm not sure that we actually can take advantage of
those capabilities given the constraints of TTML, or why I thought we could
(I need to hunt down some old notes).

I'm not sure what you mean by dur in "making dur relative to ...". There
are a number of interpretations here:

- explicit duration, i.e., the value of the @dur attribute
- implicit duration
- active duration

I'm guessing you probably mean active duration here as well? Also, I
don't know what it means to make a duration "relative to the resolved
begin", since in order to compute the active duration one must first
determine the active begin (which you probably mean by resolved begin).

It might be useful for you to review the code in TTX [1], to see a full
implementation of the SMIL timing model as required by TTML1.

[1]
https://github.com/skynav/ttt/blob/master/ttt-ttx/src/main/java/com/skynav/ttx/transformer/isd/TimingState.java

Here, a TimingState (TS) object is created (and cached) for each timed
element by this process:

1. perform pre-order traversal of the source tree from the root (tt) element,
   performing the following:
   - resolve explicit timing state
     - durExplicit
     - beginExplicit
     - endExplicit
2. perform post-order traversal of the source tree from the root (tt) element,
   performing the following:
   - resolve implicit timing state
     - durImplicit
3. perform pre-order traversal of the source tree from the root (tt) element,
   performing the following:
   - resolve active timing state
     - beginActive
     - endActive
Derived timing state is available from the above as follows:

- this.{beginExplicit,endExplicit,durExplicit}
  - before step (1) completes is UNSPECIFIED
  - after step (1) completes is one of:
    - UNSPECIFIED
    - DEFINITE(value)
- getSimpleDuration()
  - before step (2) completes is UNRESOLVED
  - after step (2) completes is one of:
    - UNRESOLVED
    - INDEFINITE
    - DEFINITE(value)
- getActiveDuration() - available after step (3) completes
  - before step (3) completes is getSimpleDuration()
  - after step (3) completes is one of:
    - UNRESOLVED
    - INDEFINITE
    - DEFINITE(value)
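
The three traversals can be sketched in Python roughly as follows. This is a simplified illustration and not the TTX Java code: the `Timed` class and all function names are hypothetical, only parallel (par) containment is modelled, a leaf with no dur is assumed to have an indefinite implicit duration, and only the "<number>s" time form is parsed. `None` stands in for UNSPECIFIED/UNRESOLVED and `float("inf")` for INDEFINITE.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

INDEFINITE = float("inf")  # stand-in for an indefinite duration


@dataclass
class Timed:
    name: str
    attrs: Dict[str, str] = field(default_factory=dict)  # raw begin/end/dur strings
    children: List["Timed"] = field(default_factory=list)
    begin_explicit: Optional[float] = None  # None means UNSPECIFIED
    end_explicit: Optional[float] = None
    dur_explicit: Optional[float] = None
    dur_implicit: Optional[float] = None  # None means UNRESOLVED
    begin_active: Optional[float] = None
    end_active: Optional[float] = None


def parse_seconds(value: Optional[str]) -> Optional[float]:
    # Only the "<number>s" offset-time form is handled in this sketch.
    return None if value is None else float(value.rstrip("s"))


def resolve_explicit(e: Timed) -> None:
    """Step 1, pre-order: record what the attributes say."""
    e.begin_explicit = parse_seconds(e.attrs.get("begin"))
    e.end_explicit = parse_seconds(e.attrs.get("end"))
    e.dur_explicit = parse_seconds(e.attrs.get("dur"))
    for c in e.children:
        resolve_explicit(c)


def end_in_parent(c: Timed) -> float:
    """Offset from the parent's begin at which element c ends."""
    begin = c.begin_explicit or 0.0
    dur = c.dur_explicit if c.dur_explicit is not None else c.dur_implicit
    ends = [begin + dur]
    if c.end_explicit is not None:
        ends.append(c.end_explicit)
    return min(ends)


def resolve_implicit(e: Timed) -> None:
    """Step 2, post-order: a container lasts until its last child ends."""
    for c in e.children:
        resolve_implicit(c)
    e.dur_implicit = max((end_in_parent(c) for c in e.children), default=INDEFINITE)


def resolve_active(e: Timed, parent_begin: float, parent_end: float) -> None:
    """Step 3, pre-order: place each element on the timeline, clipped by its parent."""
    e.begin_active = parent_begin + (e.begin_explicit or 0.0)
    e.end_active = min(parent_end, parent_begin + end_in_parent(e))
    for c in e.children:
        resolve_active(c, e.begin_active, e.end_active)


# Hypothetical document: tt > body > p, with p beginning at 5s and lasting 10s.
p = Timed("p", {"begin": "5s", "dur": "10s"})
tt = Timed("tt", children=[Timed("body", children=[p])])
resolve_explicit(tt)
resolve_implicit(tt)
# The processing context supplies the syncbase for tt (100s here, an arbitrary value).
resolve_active(tt, 100.0, INDEFINITE)
```

In this sketch the implicit duration of tt comes out as 15s, and the only effect of the externally supplied syncbase is to shift every active interval by the same amount.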

This sounds like either an event or interactive based timing model using
SMIL terminology. I think TTML presently does not support this model.

I'm beginning to agree, to my frustration!

I still think it is unclear whether the tt element should be considered a
time container that defines a syncbase, or if doing so actually has any
practical impact. One could take the view that this interpretation simply
adds zero to whatever syncbase is provided by the processing context, which
makes no difference.

Consider:

Here, the implicit duration of is 15s, not 10s, and this implicit duration (of tt) is used in resolving the active duration of , which in turn bounds the active duration of its children.

N.B. Timing is resolved on the original tt source tree, and not the transformed ISD tree, i.e., body is not scoped by region during timing resolution. The reason for this is obvious: in order to construct the ISD tree it is necessary to know the set of active temporal intervals.

skynavga · 2018-01-06T00:32:54Z

@nigelmegitt what do we need to do here, given the removal of @ttp:mediaOffset (PR #536)? I have marked this for agenda for upcoming F2F

nigelmegitt · 2018-01-09T06:16:01Z

@skynavga I think we need to make any further fixes required to #486 and then merge it, and that will probably resolve this issue too.

nigelmegitt · 2018-01-10T17:23:47Z

I've also raised #549 for removal of ttp:mediaDuration which is also needed.

css-meeting-bot · 2018-01-10T17:24:50Z

The Working Group just discussed Appendix N assumption that root temporal extent corresponds with the beginning of a related media object ttml2#76, and agreed to the following resolutions:

SUMMARY: Resolve pull #486 and remove mediaOffset and mediaDuration.

The full IRC log of that discussion

<nigel> Topic: Appendix N assumption that root temporal extent corresponds with the beginning of a related media object ttml2#76
<nigel> github: https://github.com//issues/76
<nigel> Glenn: We already decided not to include mediaOffset.
<nigel> Nigel: Yes, also mediaDuration.
<nigel> Glenn: pull #536 does that.
<nigel> Nigel: I recall from TPAC that we agreed to remove mediaDuration alongside mediaOffset
<nigel> .. as a pair.
<nigel> Glenn: I don't recall that and would like to consider it in a separate issue.
<nigel> .. Media Duration is needed to resolve indefinite end times.
<nigel> Nigel: At TPAC we agreed that the clipEnd semantics could be used for that place instead.
<nigel> Cyril: The pull request doesn't mention the issue.
<nigel> Glenn: I restored the TTML1 text saying that an offset may be needed if the assumption
<nigel> .. does not hold that the origin of media time corresponds to the begin time of a related media offset.
<nigel> Nigel: This text has a problem in that it is only a subset of the full set of assumptions.
<nigel> .. In fact the assumption more broadly is that the origin of the media time coordinates is
<nigel> .. the same as the origin coordinates of the document instance.
<nigel> .. I think I need to file an issue for this point.
<nigel> Pierre: If we have disagreement over the informative text then the pragmatic approach is
<nigel> .. to remove the text.
<nigel> Glenn: We don't use the word origin or formally define epoch.
<nigel> Nigel: I've raised #549 for removing mediaDuration.
<nigel> .. And #550 about the media offset note.
<nigel> SUMMARY: Resolve pull #486 and remove mediaOffset and mediaDuration.

Remove ttp:mediaDuration (#549). Since we have a WG resolution to proceed on this, and since nobody is proposing to keep ttp:mediaDuration, I am merging this early to facilitate closing #76. If someone wishes to object to the WG resolution https://github.com/w3c/ttml2/issues/549#issuecomment-356695431, feel free to reopen #549 with specific comments.

skynavga · 2018-01-21T00:35:59Z

Since #486 has been closed (deferred for future consideration), and since ttp:mediaOffset and ttp:mediaDuration have been removed (#536, #555), the WG resolution #76 (comment) is satisfied; therefore, I am closing this issue as no further action is required.

skynavga modified the milestone: TTML2WR Feb 23, 2017

skynavga self-assigned this Apr 20, 2017

skynavga removed their assignment May 11, 2017

skynavga self-assigned this May 29, 2017

skynavga removed this from the Editor's WR Work List milestone Aug 21, 2017

tmichel07 added the WR-pending label Oct 2, 2017

skynavga added the agenda label Jan 6, 2018

nigelmegitt added discussed and agreed and removed agenda labels Jan 10, 2018

skynavga removed the WR-pending label Jan 11, 2018

skynavga added the no spec change label Jan 21, 2018

skynavga added this to the Editor's CR Work List milestone Jan 21, 2018

skynavga removed their assignment Jan 21, 2018

skynavga mentioned this issue Jan 21, 2018

Remove ttp:mediaOffset (#323). #536

Merged

skynavga closed this as completed Jan 21, 2018
