Add option to enable exact (but inefficient) seeking into variable bitrate MP3s #6787

Open
natidykstein opened this issue Dec 19, 2019 · 11 comments

@natidykstein natidykstein commented Dec 19, 2019

Issue description

When seeking to 1603000ms in the MP3 file, playback starts approximately 3 seconds earlier than requested.
Seeking in the same MP3 file with Audacity (or VLC) lands on the expected position (you can hear the expected audio).
I'm not sure whether it's relevant, but the MP3 was created by extracting the AAC stream of an MP4 video file using ffmpeg. (ExoPlayer seeks correctly in the original MP4, but not in the MP3.)

Reproduction steps

  1. Using the ExoPlayer demo app, seek to 1603000ms. (I changed the code manually to do this.)
  2. The expected audio should say "I'll a..." but instead it says "are you asking me...".
    This happens consistently across devices.

Link to test content

A link to the mp3 file was emailed to dev.exoplayer@gmail.com.

A full bug report captured from the device

A full bug report was emailed to dev.exoplayer@gmail.com.

Version of ExoPlayer being used

2.11.0

Device(s) and version(s) of Android being used

Reproduced on:
OnePlus 6 running Android 10
Google Pixel 3 running Android 10
Virtual device Google Pixel 2 running Android 9
Virtual device Nexus 5X running Android 7.1.1

@ojw28 ojw28 self-assigned this Dec 19, 2019
Contributor

@ojw28 ojw28 commented Dec 19, 2019

I get "You are not authorized to download this file." when I try and download the file. Please could you make it available?

Author

@natidykstein natidykstein commented Dec 19, 2019

I've sent a new link

Contributor

@ojw28 ojw28 commented Dec 20, 2019

Thanks! Unless you're going to use a constant bitrate, MP3 is fundamentally not well suited to use cases that require exact seeking. There are two reasons for this:

  1. For exact seeking, a container format will ideally provide a precise time-to-byte mapping in a header. This mapping allows a player to map a requested seek time to the corresponding byte offset, and start requesting/parsing/playing media from that offset. The headers available for specifying this mapping in MP3 are, unfortunately, often imprecise. The sample you've provided uses a XING header, which specifies the mapping for 100 points that have a byte granularity equal to 1/256th of the length of the file in bytes. For your sample, this means a time-to-byte mapping is specified for points approximately 18 seconds apart, and each of these mappings may be off by ~20KB. So the mapping is both quite sparse and limited in accuracy.
  2. For container formats that don't provide a precise time-to-byte mapping (or any time-to-byte mapping at all), it's still possible to perform an exact seek if the container includes absolute sample timestamps in the stream. In this case a player can map the seek time to a best guess of the corresponding byte offset, start requesting media from that offset, parse the first absolute sample timestamp, and effectively perform a guided binary search into the media until it finds the right sample. Unfortunately MP3 does not include absolute sample timestamps in the stream, so this approach is not possible.
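
To make the first point concrete, here is a minimal Python sketch of a XING-style table-of-contents lookup. It is an illustration of the commonly described scheme (100 entries, each a value from 0 to 255 giving a position in 1/256ths of the file), not ExoPlayer's actual code. The function name, the 1800-second duration, the 10,800,000-byte size, and the evenly spaced table are all hypothetical values chosen to resemble the numbers mentioned above.

```python
def xing_seek_offset(time_s, duration_s, toc, data_size):
    """Estimate the byte offset for time_s from a 100-entry XING table of contents."""
    if len(toc) != 100:
        raise ValueError("a XING table of contents has exactly 100 entries")
    percent = 100.0 * time_s / duration_s
    if percent <= 0:
        scaled = 0.0
    elif percent >= 100:
        scaled = 256.0
    else:
        i = int(percent)                      # table entry at or before the target
        prev = toc[i]
        nxt = toc[i + 1] if i < 99 else 256   # the end of the file is implicit
        scaled = prev + (percent - i) * (nxt - prev)  # linear interpolation
    return round(scaled / 256 * data_size)


# Hypothetical file: 1800 s of audio at a steady 6000 bytes/s (48 kbit/s).
DURATION_S = 1800
DATA_SIZE = DURATION_S * 6000
# Assumed table for such a file: each entry is the true position rounded
# down to a whole 1/256th of the file. Real encoders may round differently.
cbr_toc = [i * 256 // 100 for i in range(100)]

exact = 1603 * 6000
estimate = xing_seek_offset(1603, DURATION_S, cbr_toc, DATA_SIZE)
print("exact offset:", exact)
print("estimate from table:", estimate)
print("error in seconds:", (exact - estimate) / 6000)
```

Under these assumptions one table step is `DATA_SIZE / 256`, roughly 42 KB, so even a perfectly steady stream can land several seconds early once its positions are squeezed into a single byte each.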

Ultimately, this means that the only way to perform an exact seek into this type of MP3 is to scan the entire file and manually build up a time-to-byte mapping in the player. This obviously doesn't scale well to large MP3 files, particularly if the user tries to seek to near the end of the stream shortly after starting playback, which would require the player to wait until it's downloaded and indexed the entire stream before performing the seek. For ExoPlayer we decided to optimize for seeking speed over accuracy in this case.
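
As a rough sketch of what "scan the entire file and build up a time-to-byte mapping" involves, the Python below walks MPEG Layer III frame headers and records each frame's start time and byte offset. It is an assumption-laden illustration, not ExoPlayer code: the function names are hypothetical, it resynchronises naively byte by byte (so it can be fooled by sync-like bytes inside tags), it ignores free-format streams, and it counts a leading XING/Info frame as audio.

```python
import bisect

# Layer III tables, indexed by the 4-bit bitrate field (kbit/s)
# and the 2-bit sample-rate field of the frame header.
_BITRATES_V1 = [0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320]
_BITRATES_V2 = [0, 8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160]
_SAMPLE_RATES = {
    3: [44100, 48000, 32000],  # MPEG-1
    2: [22050, 24000, 16000],  # MPEG-2
    0: [11025, 12000, 8000],   # MPEG-2.5
}


def parse_frame_header(data, pos):
    """Return (frame_bytes, frame_seconds) for a Layer III header at pos, else None."""
    if pos + 4 > len(data) or data[pos] != 0xFF or (data[pos + 1] & 0xE0) != 0xE0:
        return None
    version = (data[pos + 1] >> 3) & 0x3
    layer = (data[pos + 1] >> 1) & 0x3
    bitrate_idx = data[pos + 2] >> 4
    rate_idx = (data[pos + 2] >> 2) & 0x3
    padding = (data[pos + 2] >> 1) & 0x1
    if version == 1 or layer != 1 or bitrate_idx in (0, 15) or rate_idx == 3:
        return None  # reserved value, not Layer III, or free-format bitrate
    mpeg1 = version == 3
    kbps = (_BITRATES_V1 if mpeg1 else _BITRATES_V2)[bitrate_idx]
    sample_rate = _SAMPLE_RATES[version][rate_idx]
    samples = 1152 if mpeg1 else 576
    frame_bytes = (samples // 8) * kbps * 1000 // sample_rate + padding
    return frame_bytes, samples / sample_rate


def build_index(data):
    """Scan every frame and return parallel lists (start_times, byte_offsets)."""
    times, offsets = [], []
    pos, t = 0, 0.0
    while pos + 4 <= len(data):
        parsed = parse_frame_header(data, pos)
        if parsed is None:
            pos += 1  # not a frame header here: resynchronise byte by byte
            continue
        frame_bytes, frame_seconds = parsed
        times.append(t)
        offsets.append(pos)
        t += frame_seconds
        pos += frame_bytes
    return times, offsets


def seek_offset(times, offsets, target_s):
    """Byte offset of the frame that contains target_s."""
    if not times:
        raise ValueError("index is empty")
    i = bisect.bisect_right(times, target_s) - 1
    return offsets[max(i, 0)]
```

The cost is visible in `build_index`: every byte up to the seek target has to be available before the lookup can answer, which is why this approach is slow for large files fetched over a network.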

We do have plans to support exact seeking by building up an index. However, we'll most likely disable this option by default (if we do this, it'll be possible to enable it with a flag). I will keep this issue open to track this enhancement. If you control the media you're playing, I would suggest that you use a more suitable container format (i.e. MP4).

@ojw28 ojw28 added the enhancement label and removed the bug and needs triage labels Dec 20, 2019
@ojw28 ojw28 changed the title from "MP3: SeekTo does not seek to exact location" to "Add option to enable exact (but inefficient) seeking into variable bitrate MP3s" Dec 20, 2019
@ojw28 ojw28 assigned kim-vde and unassigned ojw28 Dec 20, 2019
Author

@natidykstein natidykstein commented Dec 20, 2019

Thanks a lot for your detailed explanation.

Since we are the ones who extract the audio stream from the original MP4 video, we control the media.
If the requirement is for seeking in the streamed audio to be both fast and exact, is CBR MP3 better than MP4/M4A here? Are there any considerations regarding the codec?

As a side note - as one who's been working with ExoPlayer in the last 5 years (even had the opportunity to make a small contribution to the project :)) - I think you're doing an incredible job in making our life much easier - keep up the good work!

Contributor

@ojw28 ojw28 commented Dec 20, 2019

MP4/M4A is always a better choice. IMO there aren't really any valid use cases for MP3 any more, unless you need to use/support it for legacy reasons.

p.s. Thanks! Happy to help :).

Author

@natidykstein natidykstein commented Dec 20, 2019

I've inspected the part of our code that extracts the audio using ffmpeg and noticed that we do use a CBR of 48kbps. To make sure, I analyzed the MP3 we're talking about and saw that this is indeed the case (see attached screenshot).
What's actually happening here? Is the XING header disrupting the seek calculations even though the MP3 itself is CBR?
[Attached screenshot: Annotation 2019-12-20 232159]

Contributor

@ojw28 ojw28 commented Dec 20, 2019

> What's actually happening here? Is the XING header disrupting the seek calculations even though the MP3 itself is CBR?

That sounds quite likely.

Author

@natidykstein natidykstein commented Dec 21, 2019

  1. Should I strip the XING header? (Is an MP3 without a XING header still valid?)
  2. Can the player detect if the mp3 is indeed CBR and ignore the XING header when doing seek calculations? (This would be my preferred option)
Contributor

@ojw28 ojw28 commented Dec 21, 2019

Why isn't your preferred option to use a container format that's appropriate for your use case? Even the people who made MP3 don't think you should use it any more.

> mp3 is still very popular amongst consumers. However, most state-of-the-art media services such as streaming or TV and radio broadcasting use modern ISO-MPEG codecs such as the AAC family or in the future MPEG-H. Those technologies, that have been developed with major contributions from Fraunhofer IIS, can deliver more features and a higher audio quality at much lower bitrates compared to mp3

My understanding of XING headers is that they're only for VBR content, so if your file is CBR I'm not sure why it's ended up with a XING header in the first place (if you research XING headers, most references on the internet suggest that they're only used for VBR content). So yes, if you can generate the CBR MP3 without the XING header, I would expect that to work. We don't support your second suggestion.
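
For comparison, the reason a header-free CBR file is expected to seek well is that the time-to-byte mapping reduces to simple arithmetic. The Python below is a minimal sketch of that arithmetic, not ExoPlayer's implementation; the function name and the `first_frame_offset` parameter (the byte position where audio frames start, for example after an ID3 tag) are hypothetical.

```python
def cbr_seek_offset(time_ms, bitrate_bps, first_frame_offset=0):
    """Estimate the byte offset for time_ms in a constant-bitrate stream.

    bitrate_bps / 8 bytes are consumed per second, so the offset grows
    linearly with time. A player would then resynchronise to the next
    frame header at or after this offset.
    """
    if time_ms < 0:
        raise ValueError("time_ms must be non-negative")
    return first_frame_offset + time_ms * bitrate_bps // 8000


# The seek from this issue, assuming a steady 48 kbit/s and no leading tag.
print(cbr_seek_offset(1_603_000, 48_000))
```

With this mapping the only inaccuracy is the distance to the nearest frame boundary, which is a small fraction of a second rather than the several seconds a coarse lookup table can introduce.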

Author

@natidykstein natidykstein commented Dec 21, 2019

You're right - using a different container is probably the right approach.
My second suggestion is just an optimization/workaround that would fit my case exactly, but I completely understand if it doesn't seem justified as a general approach.

I guess we'll probably need to go over all of our already generated MP3 files (there are a lot) and adjust them in some way, either by removing the XING header or by transcoding to a different container, and also change the way we generate new MP3 files.

Thanks for the tip in the right direction.

Author

@natidykstein natidykstein commented Dec 23, 2019

Just a note on the competitive front: iPhone's AVPlayer seeks precisely in the same MP3, so it probably ignores the XING header in this case.
