Appendix N assumption that root temporal extent corresponds with the beginning of a related media object #76

Closed
plehegar opened this issue Nov 4, 2015 · 9 comments

Comments

@plehegar
Member

plehegar commented Nov 4, 2015

From Appendix N.2:

"The above formalisms assumes that the Root Temporal Extent corresponds with the beginning of a related media object. If this assumption doesn't hold, then an additional offset that accounts for the difference may be introduced when computing media time M."

In a streaming environment it is very likely, especially for live-created subtitles, that a set of TTML samples will be produced each with a different Root Temporal Extent that starts at a different point relative to a single Related Media Object (which may itself be packaged up into an independent set of samples). However I would not in general account for this with an offset and assume (or hope) that every sample's begin and end times are zero-based; rather I'd expect the begin of the TT to be coincident with the point within the Related Media Object at which that sample begins. As a side-effect this should reduce the complexity of the transformations needed to accumulate the [short] samples together to create a longer one.
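To illustrate the difference, here is a minimal Python sketch. The `Cue` and `Sample` types and both functions are hypothetical, not taken from TTML or any library. It accumulates short samples into one longer timeline under the two conventions: zero-based sample times, which need a per-sample offset, and sample times already coincident with the Related Media Object timeline, which need none.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cue:
    begin: float  # seconds
    end: float    # seconds
    text: str


@dataclass
class Sample:
    media_start: float  # where the sample begins on the related media timeline
    cues: List[Cue]


def accumulate_zero_based(samples: List[Sample]) -> List[Cue]:
    """Cue times are relative to each sample's own start, so an offset is applied."""
    out = []
    for sample in samples:
        for cue in sample.cues:
            out.append(Cue(cue.begin + sample.media_start,
                           cue.end + sample.media_start,
                           cue.text))
    return sorted(out, key=lambda c: c.begin)


def accumulate_media_based(samples: List[Sample]) -> List[Cue]:
    """Cue times are already on the media timeline, so samples are simply merged."""
    return sorted((c for s in samples for c in s.cues), key=lambda c: c.begin)


# The same two subtitles expressed under each convention (made-up data).
zero_based = [Sample(0.0, [Cue(0.0, 2.0, "a")]), Sample(10.0, [Cue(1.0, 3.0, "b")])]
media_based = [Sample(0.0, [Cue(0.0, 2.0, "a")]), Sample(10.0, [Cue(11.0, 13.0, "b")])]
assert accumulate_zero_based(zero_based) == accumulate_media_based(media_based)
```

Both functions yield the same accumulated timeline; the second just has no transformation step to get wrong.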

For this reason I propose that we reverse this assumption.

(raised by Nigel Megitt on 2013-08-12)
From tracker issue http://www.w3.org/AudioVideo/TT/tracker/issues/270

@nigelmegitt
Contributor

(admin point: at the current time, Appendix N in TTML1 maps to Appendix H in TTML2)

Having looked at this more, and after discussions, a couple of points need to be added - some but not all of this is restating other discussions elsewhere, for completeness here:

Firstly, the definition of Root Temporal Extent affects the treatment of the statement. If it is taken that the default for begin if unstated is effectively zero, then one might surmise that the Root Temporal Extent for e.g. the tt element always begins at zero. There are (at least) two other options however:

  1. An undefined begin can be resolved when the document instance playback begins, which if it is on a media timeline might involve using some externally provided value, e.g. from an ISOBMFF box. In other words in this case the document processing context provides a syncbase which defines the beginning of the Root Temporal Extent, which may not be at the origin of the temporal coordinates. See below for a use case.
  2. Treat the earliest computed begin time in the document as the beginning of the Root Temporal Extent.
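A rough sketch of both options, under loud assumptions: every element is treated as a par container, only the simple offset form such as "10s" is parsed, and the function names are hypothetical. The `parent_begin` argument stands in for a syncbase supplied by the processing context (option 1); the return value is the earliest computed begin in the document (option 2).

```python
import xml.etree.ElementTree as ET
from typing import Optional


def parse_offset_seconds(value: str) -> float:
    """Parse only '<number>s'; other TTML time expressions are out of scope here."""
    if not value.endswith("s"):
        raise ValueError(f"unsupported time expression: {value!r}")
    return float(value[:-1])


def earliest_computed_begin(elem: ET.Element,
                            parent_begin: float = 0.0) -> Optional[float]:
    """Earliest begin among elements that specify begin, or None if none do.

    Assumes par semantics: a child's begin is relative to its parent's begin.
    """
    own = elem.get("begin")
    begin = parent_begin + (parse_offset_seconds(own) if own is not None else 0.0)
    candidates = [begin] if own is not None else []
    for child in elem:
        found = earliest_computed_begin(child, begin)
        if found is not None:
            candidates.append(found)
    return min(candidates) if candidates else None


# Hypothetical document, namespaces omitted for brevity.
doc = ET.fromstring(
    '<tt><body>'
    '<div begin="10s"><p begin="2s"/><p begin="5s"/></div>'
    '<div begin="8s"/>'
    '</body></tt>'
)
print(earliest_computed_begin(doc))                      # syncbase at zero
print(earliest_computed_begin(doc, parent_begin=100.0))  # externally provided syncbase
```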

A use case for 1 above is when the document is one of a sequence of "live" TTML documents, where the sender cannot provide a reliable time coordinate that can be successfully compared to a clock at the receiver, and where the author wants to signal "display this as soon as you can, but not for longer than 5s after beginning". To support this use case (which for reference is present in EBU Tech3370) the active begin time needs to be resolved downstream, where the resolution for that begin provides a syncbase value. An example document with no timing on regions might have:

<body dur="5s">

Then the effective begin value for the simple duration that everyone can agree has a 5 second duration is the document resolved begin time.
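As a sketch of that reading (not established TTML semantics; `ActiveInterval` and `resolve_on_arrival` are hypothetical names), the receiver would resolve the begin at the moment it starts presenting the document and derive the end from the 5 second simple duration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActiveInterval:
    begin: float  # seconds on the receiver's timeline
    end: float    # seconds on the receiver's timeline

    def is_active(self, t: float) -> bool:
        """Half-open interval test: begin <= t < end."""
        return self.begin <= t < self.end


def resolve_on_arrival(arrival_time: float, dur_seconds: float) -> ActiveInterval:
    """Use the downstream-resolved begin as the syncbase for <body dur="...">."""
    return ActiveInterval(arrival_time, arrival_time + dur_seconds)


interval = resolve_on_arrival(arrival_time=100.0, dur_seconds=5.0)
assert interval.is_active(100.0)
assert interval.is_active(104.9)
assert not interval.is_active(105.0)
```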

Secondly, the document processing context might provide a syncbase and also a playback entry time. For example, a segmentation algorithm might result in a document like:

<tt ttp:timeBase="media" ...>
<body>
<div>
<p begin="12345:00:00" ...>
...

where the processing context defines that the segment begin time is actually the equivalent of "12345:00:02". The syncbase is unchanged but the entry time is later (or earlier) than the earliest begin time in the document. This is not a scenario where signalling is required in the document instance, but it could affect the understanding of Root Temporal Extent, which may be considered to begin at the playback entry time. In fact, signalling processing context in the document would almost certainly be a very bad thing to do.
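A small sketch of what the processing context would do in that scenario, with no change to the document: clip each active interval to the playback window. The function name is hypothetical, and the end of the p element and the segment length are made-up values, since the example elides them.

```python
from typing import Optional, Tuple

HOUR = 3600.0


def clip_to_window(begin: float, end: float,
                   entry: float, exit_: float) -> Optional[Tuple[float, float]]:
    """Return the part of [begin, end) that lies inside [entry, exit_), or None."""
    clipped_begin = max(begin, entry)
    clipped_end = min(end, exit_)
    return (clipped_begin, clipped_end) if clipped_begin < clipped_end else None


p_begin = 12345 * HOUR        # begin="12345:00:00" from the document
p_end = p_begin + 5.0         # hypothetical end, elided in the example
entry = 12345 * HOUR + 2.0    # segment begin time from the processing context
exit_ = entry + 10.0          # hypothetical segment length

visible = clip_to_window(p_begin, p_end, entry, exit_)
assert visible == (entry, p_end)  # already active on entry, ends 3 s later
```

The syncbase is untouched; only the portion of the timeline that gets presented changes.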

@skynavga
Collaborator

skynavga commented Aug 17, 2016

On Mon, Aug 15, 2016 at 10:24 AM, Nigel Megitt notifications@github.com
wrote:

(admin point: at the current time, Appendix N in TTML1 maps to Appendix H
in TTML2)

Having looked at this more, and after discussions, a couple of points need
to be added - some but not all of this is restating other discussions
elsewhere, for completeness here:

Firstly, the definition of Root Temporal Extent affects the treatment
of the statement. If it is taken that the default for begin if unstated
is effectively zero, then one might surmise that the Root Temporal Extent
for e.g. the tt element always begins at zero. There are (at least) two
other options however:

  1. An undefined begin can be resolved when the document instance
    playback begins, which if it is on a media timeline might involve using
    some externally provided value, e.g. from an ISOBMFF box. In other words in
    this case the document processing context provides a syncbase which defines
    the beginning of the Root Temporal Extent, which may not be at the origin
    of the temporal coordinates. See below for a use case.

I'm not sure what is meant by "an undefined begin", since there is always a
value for begin. It is either the specified value or zero, where the latter
is deduced from SMIL3 semantics.

Since one can't explicitly specify begin on tt, it must always be
zero. That only leaves the question: what is the syncbase of the tt
element? I.e., what establishes the origin against which this value zero is
resolved? Or, to put it another way, what is the time container parent of
tt?

  2. Treat the earliest computed begin time in the document as the
    beginning of the Root Temporal Extent.

This seems to make sense if and only if the document is using smpte time
base in discontinuous mode. It certainly does not make sense for media time
base. The question in my mind is whether this can make sense for smpte time
base in continuous mode or not.

A use case for 1 above is when the document is one of a sequence of "live"
TTML documents, where the sender cannot provide a reliable time coordinate
that can be successfully compared to a clock at the receiver, and where the
author wants to signal "display this as soon as you can, but not for longer
than 5s after beginning".

This sounds like either an event or interactive based timing model using
SMIL terminology. I think TTML presently does not support this model.

To support this use case (which for reference is present in EBU Tech3370)
the active begin time needs to be resolved downstream, where the
resolution for that begin provides a syncbase value. An example document
with no timing on regions might have:

<body dur="5s">

Then the effective begin value for the simple duration that everyone can
agree has a 5 second duration is the document resolved begin time.

Secondly, the document processing context might provide a syncbase and
also a playback entry time. For example, a segmentation algorithm might
result in a document like:

<tt ttp:timeBase="media" ...>
<body>
<div>
<p begin="12345:00:00" ...>
...

where the processing context defines that the segment begin time is
actually the equivalent of "12345:00:02".

Since this uses media time base, the only way this can work (IMO) is by
specifying ttp:mediaOffset='-44442002s'.
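
As a check on the arithmetic behind that value, a small sketch (the helper
name is hypothetical, and it handles only the plain hh:mm:ss form used in
this example, with no frames or fractions):

```python
def clock_time_to_seconds(value: str) -> int:
    """Convert 'hh:mm:ss' (whole seconds only) to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


entry = clock_time_to_seconds("12345:00:02")
assert entry == 44442002
print(f"ttp:mediaOffset='{-entry}s'")  # prints ttp:mediaOffset='-44442002s'
```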

The syncbase is unchanged but the entry time is later (or earlier) than
the earliest begin time in the document. This is not a scenario where
signalling is required in the document instance, but it could affect the
understanding of Root Temporal Extent, which may be considered to begin at
the playback entry time. In fact, signalling processing context in the
document would almost certainly be a very bad thing to do.

@nigelmegitt
Contributor

I'm not sure what is meant by "an undefined begin", since there is always a
value for begin. It is either the specified value or zero, where the latter
is deduced from SMIL3 semantics.

Thanks for picking me up on this - I need to be so careful with language. dur specifies the simple duration. Computing the active duration for an element according to "Computing the active duration" in SMIL3 (https://www.w3.org/TR/2008/REC-SMIL3-20081201/smil-timing.html#Timing-ComputingActiveDur) requires that the begin time be resolved. So I should have said "an unresolved begin".

There seems to be scope in SMIL for making dur relative to the resolved begin rather than the specified begin, and from previous readings I had convinced myself that it is permitted to use the treatment I set out, but, looking again now, I'm not sure that we actually can take advantage of those capabilities given the constraints of TTML, or why I thought we could (I need to hunt down some old notes).

This sounds like either an event or interactive based timing model using
SMIL terminology. I think TTML presently does not support this model.

I'm beginning to agree, to my frustration!

I still think it is unclear that the tt element should be considered a time container that defines a syncbase, or if doing so actually has any practical impact. One could take the view that this interpretation simply adds zero to whatever syncbase is provided by the processing context, which makes no difference.

We should probably be careful about the processing context providing syncbase vs epoch for clock times by the way, since they may not be coincident.

where the processing context defines that the segment begin time is actually the equivalent of "12345:00:02".
Since this uses media time base, the only way this can work (IMO) is by specifying ttp:mediaOffset='-44442002s'.

I think the problem I set out is orthogonal to any media offset, which can only move the content on the timeline. Effectively ttp:mediaOffset is equivalent to a special kind of begin attribute that can take a negative value. I don't think that's either necessary or helpful. Here's why: in general the document processing context can and will define the flow of time. Given a document instance that includes content on a defined timeline, a processor is at liberty to begin and end presentation at any two points on that timeline. The processing context constitutes a system that the document instance cannot impact. Similarly the document cannot define the change in rate of play during that presentation. Attempting to move this kind of information into the TTML document will just generate a new timeline against which the same consideration will apply; so the exercise will fail recursively.

In practice other processing systems already include the semantics that define presentation begin and end. Attempting to duplicate that within the TTML documents makes those document instances less reusable and also may lead to a scenario where the two 'layers' disagree with each other, to nobody's benefit. We should not (IMO) introduce this possibility.

@skynavga
Collaborator

On Wed, Aug 17, 2016 at 9:47 AM, Nigel Megitt notifications@github.com
wrote:

I'm not sure what is meant by "an undefined begin", since there is always a
value for begin. It is either the specified value or zero, where the latter
is deduced from SMIL3 semantics.

Thanks for picking me up on this - I need to be so careful with language.
dur specifies the simple duration. Computing the active duration for
an element according to "Computing the active duration"
https://www.w3.org/TR/2008/REC-SMIL3-20081201/smil-timing.html#Timing-ComputingActiveDur
requires that the begin time be resolved. So I should have said "an
unresolved begin".

There seems to be scope in SMIL for making dur relative to the resolved
begin rather than the specified begin, and from previous readings I had
convinced myself that it is permitted to use the treatment I set out, but,
looking again now, I'm not sure that we actually can take advantage of
those capabilities given the constraints of TTML, or why I thought we could
(I need to hunt down some old notes).

I'm not sure what you mean by dur in "making dur relative to ...". There
are a number of interpretations here:

  • explicit duration, i.e., the value of the @dur attribute
  • implicit duration
  • active duration

I'm guessing you probably mean active duration here as well? Also, I
don't know what it means to make a duration "relative to the resolved
begin", since in order to compute the active duration one must first
determine the active begin (which you probably mean by resolved begin).

It might be useful for you to review the code in TTX [1], to see a full
implementation of the SMIL timing model as required by TTML1.

[1]
https://github.com/skynav/ttt/blob/master/ttt-ttx/src/main/java/com/skynav/ttx/transformer/isd/TimingState.java

Here, a TimingState (TS) object is created (and cached) for each timed
element by this process:

  1. perform pre-order traversal of the source tree from the root (tt) element,
    performing the following:
    • resolve explicit timing state
      • durExplicit
      • beginExplicit
      • endExplicit
  2. perform post-order traversal of the source tree from the root (tt) element,
    performing the following:
    • resolve implicit timing state
      • durImplicit
  3. perform pre-order traversal of the source tree from the root (tt) element,
    performing the following:
    • resolve active timing state
      • beginActive
      • endActive
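
The three passes above can be sketched roughly as follows. This is a
simplified Python illustration, not the TTX code: it assumes par containers
only, parses only offsets such as "10s", treats None as "not definite", and
all names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    name: str
    attrs: Dict[str, str] = field(default_factory=dict)
    children: List["Node"] = field(default_factory=list)
    begin: Optional[float] = None         # explicit values, set by pass 1
    end: Optional[float] = None
    dur: Optional[float] = None
    dur_implicit: Optional[float] = None  # set by pass 2
    begin_active: Optional[float] = None  # set by pass 3
    end_active: Optional[float] = None    # set by pass 3; None means indefinite


def seconds(value: Optional[str]) -> Optional[float]:
    """Parse '<number>s' only; None (unspecified) stays None."""
    return None if value is None else float(value.rstrip("s"))


def extent(node: Node) -> Optional[float]:
    """End of node relative to its parent's begin, or None if not definite."""
    begin = node.begin or 0.0
    candidates = []
    if node.end is not None:
        candidates.append(node.end)
    if node.dur is not None:
        candidates.append(begin + node.dur)
    if not candidates and node.dur_implicit is not None:
        candidates.append(begin + node.dur_implicit)
    return min(candidates) if candidates else None


def resolve_explicit(node: Node) -> None:
    """Pass 1, pre-order: read begin, end and dur from the attributes."""
    node.begin = seconds(node.attrs.get("begin"))
    node.end = seconds(node.attrs.get("end"))
    node.dur = seconds(node.attrs.get("dur"))
    for child in node.children:
        resolve_explicit(child)


def resolve_implicit(node: Node) -> None:
    """Pass 2, post-order: a container's implicit duration spans its children."""
    for child in node.children:
        resolve_implicit(child)
    extents = [extent(child) for child in node.children]
    if extents and all(e is not None for e in extents):
        node.dur_implicit = max(extents)


def resolve_active(node: Node, parent_begin: float = 0.0,
                   parent_end: Optional[float] = None) -> None:
    """Pass 3, pre-order: active interval, bounded by the parent's active end."""
    node.begin_active = parent_begin + (node.begin or 0.0)
    own = extent(node)
    ends = [parent_begin + own] if own is not None else []
    if parent_end is not None:
        ends.append(parent_end)
    node.end_active = min(ends) if ends else None
    for child in node.children:
        resolve_active(child, node.begin_active, node.end_active)


# Hypothetical document: two divs, the second ending at 15s.
div1 = Node("div", {"begin": "0s", "end": "10s"})
div2 = Node("div", {"begin": "5s", "dur": "10s"})
tt = Node("tt", children=[Node("body", children=[div1, div2])])
resolve_explicit(tt)
resolve_implicit(tt)
resolve_active(tt)
assert tt.dur_implicit == 15.0
assert (div2.begin_active, div2.end_active) == (5.0, 15.0)
```

The ordering matters: implicit durations depend on descendants, so they need
a post-order pass, while active intervals depend on ancestors, so they need
a pre-order pass after it.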

Derived timing state is available from the above as follows:

  • this.{beginExplicit,endExplicit,durExplicit}
    • before step (1) completes is UNSPECIFIED
    • after step (1) completes is one of:
      • UNSPECIFIED
      • DEFINITE(value)
  • getSimpleDuration()
    • before step (2) completes is UNRESOLVED
    • after step (2) completes is one of:
      • UNRESOLVED
      • INDEFINITE
      • DEFINITE(value)
  • getActiveDuration() - available after step (3) completes
    • before step (3) completes is getSimpleDuration()
    • after step (3) completes is one of:
      • UNRESOLVED
      • INDEFINITE
      • DEFINITE(value)
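
One way to picture these states is the following sketch. It is not the TTX
TimingState class; the types and names are hypothetical and only mirror the
list above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Kind(Enum):
    UNSPECIFIED = "unspecified"
    UNRESOLVED = "unresolved"
    INDEFINITE = "indefinite"
    DEFINITE = "definite"


@dataclass(frozen=True)
class TimeValue:
    kind: Kind
    value: Optional[float] = None

    def __post_init__(self):
        # Only DEFINITE carries a numeric value.
        if (self.kind is Kind.DEFINITE) != (self.value is not None):
            raise ValueError("only DEFINITE carries a value")


UNRESOLVED = TimeValue(Kind.UNRESOLVED)
INDEFINITE = TimeValue(Kind.INDEFINITE)


def definite(value: float) -> TimeValue:
    return TimeValue(Kind.DEFINITE, value)


class TimingStateSketch:
    """Tracks which resolution steps have completed for one timed element."""

    def __init__(self) -> None:
        self.steps_completed = 0
        self.simple = UNRESOLVED
        self.active = UNRESOLVED

    def get_simple_duration(self) -> TimeValue:
        return self.simple if self.steps_completed >= 2 else UNRESOLVED

    def get_active_duration(self) -> TimeValue:
        # Before step (3) completes, fall back to the simple duration.
        if self.steps_completed >= 3:
            return self.active
        return self.get_simple_duration()


state = TimingStateSketch()
assert state.get_active_duration() == UNRESOLVED
state.simple, state.steps_completed = definite(5.0), 2
assert state.get_active_duration() == definite(5.0)
state.active, state.steps_completed = definite(3.0), 3
assert state.get_active_duration() == definite(3.0)
```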

This sounds like either an event or interactive based timing model using
SMIL terminology. I think TTML presently does not support this model.

I'm beginning to agree, to my frustration!

I still think it is unclear that the tt element should be considered a
time container that defines a syncbase, or if doing so actually has any
practical impact. One could take the view that this interpretation simply
adds zero to whatever syncbase is provided by the processing context, which
makes no difference.

Consider:

Here, the implicit duration of is 15s, not 10s, and this implicit
duration (of tt) is used in resolving the active duration of , which in
turn bounds the active duration of its children.

N.B. Timing is resolved on the original tt source tree, and not the
transformed ISD tree, i.e., body is not scoped by region during timing
resolution. The reason for this is obvious: in order to construct the ISD
tree it is necessary to know the set of active temporal intervals.

We should probably be careful about the processing context providing
syncbase vs epoch for clock times by the way, since they may not be
coincident.

where the processing context defines that the segment begin time is
actually the equivalent of "12345:00:02".
Since this uses media time base, the only way this can work (IMO) is by
specifying ttp:mediaOffset='-44442002s'.

I think the problem I set out is orthogonal to any media offset, which can
only move the content on the timeline. Effectively ttp:mediaOffset is
equivalent to a special kind of begin attribute that can take a negative
value. I don't think that's either necessary or helpful. Here's why: in
general the document processing context can and will define the flow of
time. Given a document instance that includes content on a defined
timeline, a processor is at liberty to begin and end presentation at any
two points on that timeline. The processing context constitutes a system
that the document instance cannot impact. Similarly the document cannot
define the change in rate of play during that presentation. Attempting to
move this kind of information into the TTML document will just generate a
new timeline against which the same consideration will apply; so the
exercise will fail recursively.

In practice other processing systems already include the semantics that
define presentation begin and end. Attempting to duplicate that within the
TTML documents makes those document instances less reusable and also may
lead to a scenario where the two 'layers' disagree with each other, to
nobody's benefit. We should not (IMO) introduce this possibility.



@skynavga skynavga modified the milestone: TTML2WR Feb 23, 2017
@skynavga skynavga self-assigned this Apr 20, 2017
@skynavga skynavga removed their assignment May 11, 2017
@skynavga skynavga self-assigned this May 29, 2017
@skynavga skynavga removed this from the Editor's WR Work List milestone Aug 21, 2017
@skynavga skynavga added the agenda label Jan 6, 2018
@skynavga
Collaborator

skynavga commented Jan 6, 2018

@nigelmegitt what do we need to do here, given the removal of @ttp:mediaOffset (PR #536)? I have marked this for agenda for upcoming F2F

@nigelmegitt
Contributor

@skynavga I think we need to make any further fixes required to #486 and then merge it, and that will probably resolve this issue too.

@nigelmegitt
Contributor

I've also raised #549 for removal of ttp:mediaDuration which is also needed.

@css-meeting-bot
Member

The Working Group just discussed Appendix N assumption that root temporal extent corresponds with the beginning of a related media object ttml2#76, and agreed to the following resolutions:

  • SUMMARY: Resolve pull #486 and remove mediaOffset and mediaDuration.
The full IRC log of that discussion:
<nigel> Topic: Appendix N assumption that root temporal extent corresponds with the beginning of a related media object ttml2#76
<nigel> github: https://github.com//issues/76
<nigel> Glenn: We already decided not to include mediaOffset.
<nigel> Nigel: Yes, also mediaDuration.
<nigel> Glenn: pull #536 does that.
<nigel> Nigel: I recall from TPAC that we agreed to remove mediaDuration alongside mediaOffset
<nigel> .. as a pair.
<nigel> Glenn: I don't recall that and would like to consider it in a separate issue.
<nigel> .. Media Duration is needed to resolve indefinite end times.
<nigel> Nigel: At TPAC we agreed that the clipEnd semantics could be used for that place instead.
<nigel> Cyril: The pull request doesn't mention the issue.
<nigel> Glenn: I restored the TTML1 text saying that an offset may be needed if the assumption
<nigel> .. does not hold that the origin of media time corresponds to the begin time of a related media offset.
<nigel> Nigel: This text has a problem in that it is only a subset of the full set of assumptions.
<nigel> .. In fact the assumption more broadly is that the origin of the media time coordinates is
<nigel> .. the same as the origin coordinates of the document instance.
<nigel> .. I think I need to file an issue for this point.
<nigel> Pierre: If we have disagreement over the informative text then the pragmatic approach is
<nigel> .. to remove the text.
<nigel> Glenn: We don't use the word origin or formally define epoch.
<nigel> Nigel: I've raised #549 for removing mediaDuration.
<nigel> .. And #550 about the media offset note.
<nigel> SUMMARY: Resolve pull #486 and remove mediaOffset and mediaDuration.

skynavga added a commit that referenced this issue Jan 21, 2018
Remove ttp:mediaDuration (#549). Since we have a WG resolution to proceed on this, and since nobody is proposing to keep ttp:mediaDuration, I am merging this early to facilitate closing #76. If someone wishes to object to the WG resolution https://github.com/w3c/ttml2/issues/549#issuecomment-356695431, feel free to reopen #549 with specific comments.
@skynavga
Collaborator

Since #486 has been closed (deferred for future consideration), and since ttp:mediaOffset and ttp:mediaDuration have been removed (#536, #555), the WG resolution #76 (comment) is satisfied; therefore, I am closing this issue as no further action is required.

@skynavga skynavga added this to the Editor's CR Work List milestone Jan 21, 2018
@skynavga skynavga removed their assignment Jan 21, 2018