Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adjust sample timestamps in MergingMediaPeriod for periods with different timestamp ranges #6103

Closed
brAzzi64 opened this issue Jun 27, 2019 · 12 comments
Assignees

Comments

@brAzzi64
Copy link

Hi,

I have 3 video segments A, B, C with audio. I'd like to play them in sequence, while a different audio-only track plays seamlessly on top of the overall video (I want to discard the original audio of the video segments).

Is there a way to achieve this using ExoPlayer 2?

The documentation for MergingMediaSource states "The Timelines of the sources being merged must have the same number of periods", and I believe this poses a problem with taking the following approach:

a = ExtractorMediaSource(A-video-and-audio-source)
b = ExtractorMediaSource(B-video-and-audio-source)
c = ExtractorMediaSource(C-video-and-audio-source)
x = ExtractorMediaSource(X-audio-only-source)
overallSource = MergingMediaSource(ConcatenatingMediaSource(a, b ,c), x)

Is there a different approach that you think could work?

Do you think replacing the audio source by something like:

x = ConcatenatingMediaSource(ClippingMediaSource(X[0,1]), ClippingMediaSource(X[1,2]), ClippingMediaSource(X[2,3])))

where each ClippingMediaSource grabs a sequential segment of the original X-audio-only-source, should produce accurate results without any noticeable audio clipping?

Appreciate any guidance!

@tonihei
Copy link
Collaborator

tonihei commented Jul 1, 2019

The approach using ClippingMediaSource sounds feasible. To avoid noticeable clipping, you should ensure two things:

  1. The duration of the clipping should exactly coincide with the end of an audio sample. The specified duration is in microseconds, so that should be possible. It should also be as close as possible to the corresponding video duration, so you'd need to know that in advance too.
  2. The ClippingMediaSource has a constructor option to enableInitialDiscontinuity that needs to be set to false. This is safe for audio-only streams because every sample is a keyframe and no discontinuity is needed.

@brAzzi64
Copy link
Author

brAzzi64 commented Jul 1, 2019

Sounds good, thanks much for the pointers. I'll give this a try!

@tonihei
Copy link
Collaborator

tonihei commented Nov 22, 2019

Reopened as requested in #6649.

@tonihei tonihei reopened this Nov 22, 2019
@google google unlocked this conversation Nov 26, 2019
@brAzzi64
Copy link
Author

brAzzi64 commented Dec 2, 2019

Hi @tonihei, thanks for reopening. I finally got around to give this idea a try, but it seems like the approach of "re-assembling the audio track from its pieces" to then merge with the actual videos does not seem to be working.

I tried reducing this to a more basic, single-segment:

Merge(
  audioTrack=Clipping(Audio1, startTimeA, endTimeA)
  videoTrack=Clipping(Video1, startTimeB, endTimeB)
)  // making sure the clippings produce the same media length

but the clipping (of the same length for both tracks) does not seem to be working as expected. I'm thinking that my assumptions of how the combination of these components should behave might be wrong?

You can see my attempt on top of ExoPlayer's sample code, as a patch here, along with the results I'm observing (as comments in the code): https://gist.github.com/brAzzi64/d9aa2caedff9dbcc7b04d182add83897

Once I get this working, what I'd ultimately want to achieve is what I mentioned on my original post that you mentioned should work:

Merge(
  audioTrack=Concat(Clipping(Audio1, startTime0, endTime0), Clipping(Audio1, startTime1, endTime1), ...),
  videoTrack=Concat(Clipping(Video1), Clipping(Video2), ...)
)

@tonihei
Copy link
Collaborator

tonihei commented Dec 3, 2019

Thanks for the great reproduction steps!

I can see the same issues that seem to be caused by combining multiple clipping media sources together with different time offsets. Each clipping media source publishes media samples in the specified time range (e.g. audio from 3 to 6 seconds and video from 15 to 18 seconds). The player only "sees" the media structure of the first item in the list. So if the audio is first, it plays the audio, but doesn't show the video frames because they are way too early to be shown. If the video is first in the list, it shows the video, but continues playing the audio until it caught up with the apparent player position.

To make that work, we need to amend the sample timestamps to be in the same range. This is a generic problem that should be solved in MergingMediaPeriod to adapt time range offsets in the secondary sources. Marking as an enhancement to do that. Sorry it didn't work out of the box as described above!

@tonihei tonihei changed the title Is it possible to concatenate a sequence of videos A, B, C and play a single audio track X on top? Adjust sample timestamps in MergingMediaPeriod for periods with different timestamp ranges Dec 3, 2019
@brAzzi64
Copy link
Author

brAzzi64 commented Dec 9, 2019

Thanks for looking into it @tonihei!

It's a bit unfortunate though, because it would prevent us from using ExoPlayer altogether for our current feature :( is the fix you mention something that we could address without forking ExoPlayer? Maybe by forking MergingMediaPeriod alone, on the app-side?

Or even if forking ExoPlayer, would you be able to share some more detailed hints on how to achieve the change that would fix this?

@tonihei
Copy link
Collaborator

tonihei commented Dec 10, 2019

Haven't tried to implement it yet, but it should be fixable by forking MergingMediaSource and MergingMediaPeriod only.

Each Timeline.window will have an offset in the respective period and we need to ensure this offset is the same for all cases. The first MediaSource in the list is the master source, so the offsets of all other sources need to be adjusted. To adjust these offsets, the MergingMediaPeriod needs to manipulate all position timestamps in all of its methods before forwarding to the child MediaPeriod implementations. And it also needs to wrap the SampleStreams returned in selectTracks with a thin wrapper that applies the same offset to all samples read in readData as well.

This isn't particularly easy to implement, but feel free to give it a go. We will probably fix it as well in the near future (i.e. most likely beginning of next year). If you manage to implement this, it would be great if you can upload it as a pull request :)

@brAzzi64
Copy link
Author

After digging a bit deeper, it's going to make more sense for us to use a workaround for now instead, so we can hit our feature deadline.

We're reducing our problem to a case where we only have a single (audio, video) pair (vs a series of pairs, using ConcatenatingMediaSource). We'll use MergingMediaSource, pre-trim the audio file to the right segment, and use ClippingMediaSource to clip the video; this simplified scenario seems to work well.

Since it sounds like you guys will be fixing this soon-ish, we'll wait for your fix to land and then update and get rid of our workaround, to use the components in the way described originally.

@tonihei do you think that after fixing this, we should expect the ConcatenatingMediaSource approach you suggested at the top to work as described too? Or do you think the bad behavior there would be due to a different issue? Let me know if you think I should file this as a different ticket.

Thanks again!

@tonihei
Copy link
Collaborator

tonihei commented Dec 17, 2019

do you think that after fixing this, we should expect the ConcatenatingMediaSource approach you suggested at the top to work as described too?

I think so, yes. Please note that you need to set the clipping points carefully and disable the discontinuities if you want perfectly smooth audio playback (see #6103 (comment)), but otherwise that shouldn't be a problem.

@brAzzi64
Copy link
Author

brAzzi64 commented Feb 2, 2020

Hi @tonihei, I just wanted to check back on this and see if you guys got any chance to start looking into the fix for this issue. Any update would be really appreciated!

@tonihei
Copy link
Collaborator

tonihei commented Feb 3, 2020

Sorry haven't started working on this yet. We'll update this issue as soon as we've got something.

ojw28 pushed a commit that referenced this issue Mar 10, 2020
Without this option it's impossible to merge periods covering
different timestamps (at least not without playback issues).

Issue:issue:#6103
PiperOrigin-RevId: 299817540
@tonihei
Copy link
Collaborator

tonihei commented Mar 12, 2020

Closing because the feature is supported now. MergingMediaSource has a new optional flag to adjust the timestamp ranges. Please give it a try!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants