
Reducing the Minimum Amount of Audio Data passed into ExoPlayer for it to begin playing #6325

Closed
peurpdapeurp opened this issue Aug 20, 2019 · 8 comments

@peurpdapeurp

[REQUIRED] Searched documentation and issues

  1. I've looked at the DefaultLoadControl javadoc reference.
  2. I've looked at the Customization page of the ExoPlayer developer guide (https://exoplayer.dev/customization.html).

[REQUIRED] Question

I'm currently trying to play streamed AAC ADTS audio frames through the ExoPlayer, and running into the issue that I cannot play a short audio stream using the ExoPlayer.

I am feeding AAC ADTS audio frames into the ExoPlayer through a ProgressiveMediaSource which contains an InputStreamDataSource (a custom DataSource implementation which takes in AAC ADTS frames using an InputStream). The ExoPlayer never leaves the Player.STATE_BUFFERING state when I give it 64 AAC ADTS frames (all around 90 bytes in size), but it does enter the Player.STATE_READY state and start playing back audio when I give it far more frames (by repeatedly writing the 64 AAC ADTS frames into the InputStream that the ExoPlayer is reading audio data from). From a rough test, I estimate that ExoPlayer starts playing audio when it's given around 10000 - 20000 bytes of audio data.

In order to get around this, I tried injecting a customized LoadControl using the DefaultLoadControl.Builder API as detailed in the DefaultLoadControl javadoc page, and setting the back buffer, buffer durations, and target buffer bytes all to very low values, like so:

    ExoPlayer player = ExoPlayerFactory.newSimpleInstance(
        ctx_,
        new DefaultTrackSelector(),
        new DefaultLoadControl.Builder()
            .setBackBuffer(/* backBufferDurationMs= */ 600, /* retainBackBufferFromKeyframe= */ false)
            .setBufferDurationsMs(
                /* minBufferMs= */ 600,
                /* maxBufferMs= */ 600,
                /* bufferForPlaybackMs= */ 600,
                /* bufferForPlaybackAfterRebufferMs= */ 600)
            .setTargetBufferBytes(600)
            .createDefaultLoadControl());

However, this still did not allow me to reliably play back an audio stream of 64 ADTS AAC frames (strangely, it played the short stream once, but never again, and I am not sure why that single attempt succeeded).

My questions are:

  1. Am I configuring the ExoPlayer properly to do what I want, which is to reduce the amount of audio data it receives before it starts playing back the audio data it has?
  2. I would ideally like for the ExoPlayer to begin playing back audio after receiving only a few ADTS AAC frames (~ 100 - 300 bytes); is this possible with the ExoPlayer?

My code for playback is here: https://github.com/peurpdapeurp/basic_java_audio_stream_consumer/blob/exoplayer/app/src/main/java/com/example/audio_consumer/StreamPlayerTester.java

Thank you for any help on this question.

Link to test content

I am trying to play the AAC ADTS frames in the MUSIC_ADTS_FRAMES_BUFFER here: https://github.com/peurpdapeurp/basic_java_audio_stream_consumer/blob/exoplayer/app/src/main/java/com/example/audio_consumer/TestFrames.java

@google-oss-bot
Collaborator

This issue does not seem to follow the issue template. Make sure you provide all the required information.

@peurpdapeurp
Author

I was able to answer this question for myself by adding some breakpoints into the ExoPlayer source code.

The main thing causing issues for me was the selectExtractors function in ProgressiveMediaPeriod.java. It tries to decide which extractor to use on the input data, and when it checked the ADTS data with Mp3Extractor, that stalled playback of an audio stream that was too short (presumably because Mp3Extractor needs a lot of data before its sniff(ExtractorInput input) method can return true or false).

The solution was to pass a custom ExtractorsFactory into the ProgressiveMediaSource I was giving to the ExoPlayer, containing just a single Extractor (AdtsExtractor). This was possible in my use case because I know the only audio format being used is ADTS AAC.
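A minimal sketch of that approach, using simplified stand-in interfaces rather than the real ExoPlayer API (the real ExtractorsFactory.createExtractors() does return an Extractor[], but the real sniff takes an ExtractorInput, not a byte array):

```java
// Simplified stand-ins for ExoPlayer's Extractor and ExtractorsFactory
// interfaces; the real sniff takes an ExtractorInput rather than a byte[].
interface Extractor {
    boolean sniff(byte[] input);
}

@FunctionalInterface
interface ExtractorsFactory {
    Extractor[] createExtractors();
}

public class AdtsOnlyFactory {

    // Stand-in for AdtsExtractor: recognizes the 12-bit ADTS syncword 0xFFF.
    static class AdtsExtractor implements Extractor {
        @Override
        public boolean sniff(byte[] input) {
            return input.length >= 2
                && (input[0] & 0xFF) == 0xFF
                && (input[1] & 0xF0) == 0xF0;
        }
    }

    // A factory that offers only the ADTS extractor, so extractor selection
    // never has to wait on any other extractor's sniff method.
    static final ExtractorsFactory ADTS_ONLY =
        () -> new Extractor[] { new AdtsExtractor() };

    public static void main(String[] args) {
        Extractor[] extractors = ADTS_ONLY.createExtractors();
        System.out.println(extractors.length);                                         // 1
        System.out.println(extractors[0].sniff(new byte[] {(byte) 0xFF, (byte) 0xF1})); // true
    }
}
```

In the real API the factory would then be handed to ProgressiveMediaSource.Factory alongside the DataSource.Factory.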

The working code is here, for anyone who would like to reference it: https://github.com/peurpdapeurp/basic_java_audio_stream_consumer/tree/exoplayer

tonihei self-assigned this Aug 21, 2019
@tonihei
Collaborator

tonihei commented Aug 21, 2019

Glad you solved your question :)

It's an interesting effect though, because I don't think our DefaultExtractorsFactory follows any particular order to ensure formats get detected quickly. But I'm also not sure why this matters, given that you need to read the rest of the stream in any case. Can you explain what exactly you are doing, and why extractor selection needs to happen after only 100-300 bytes?

@yoursunny

I have been working with @peurpdapeurp on his app. He's making a real-time streaming application, and one of the goals is to minimize latency. Think "digital police radio" kind of use case.
If DefaultExtractorsFactory needs a lot of data to select an extractor, it increases the start-up delay perceived by the user. Given that AdtsExtractor::sniff only needs four ADTS frames, ideally DefaultExtractorsFactory should select ADTS as soon as four complete frames have arrived (and even that already causes ~400 ms of user-perceived latency with his current codec settings).
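For a rough sanity check of that ~400 ms figure: an AAC frame carries 1024 PCM samples, so its duration depends only on the sample rate. The sample rate below is a hypothetical value chosen to roughly reproduce the quoted latency; the actual rate in his codec settings is not stated in the thread.

```java
public class SniffLatency {
    public static void main(String[] args) {
        int samplesPerFrame = 1024;  // samples carried by one AAC frame
        int sampleRateHz = 10_000;   // hypothetical, to match the ~400 ms figure
        double frameDurationMs = 1000.0 * samplesPerFrame / sampleRateHz;
        double sniffLatencyMs = 4 * frameDurationMs; // AdtsExtractor sniffs 4 frames
        System.out.printf("%.1f ms per frame, %.1f ms to sniff four frames%n",
                          frameDurationMs, sniffLatencyMs);
        // e.g. "102.4 ms per frame, 409.6 ms to sniff four frames"
    }
}
```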

@andrewlewis
Collaborator

In ProgressiveMediaPeriod we try to sniff with each provided extractor in order, returning as soon as sniff returns true, so I think the intention was to put extractors whose sniff method is less likely to return a false positive higher up the list.
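A rough sketch of that selection loop, with simplified stand-ins (the real code in ProgressiveMediaPeriod operates on an ExtractorInput and also resets the read position between sniff attempts):

```java
import java.util.Arrays;
import java.util.List;

public class ExtractorSelection {

    // Simplified stand-in for ExoPlayer's Extractor interface.
    interface Extractor {
        String name();
        boolean sniff(byte[] input);
    }

    // Try each extractor in order and return the first whose sniff succeeds,
    // mirroring the selection logic described for ProgressiveMediaPeriod.
    static Extractor selectExtractor(List<Extractor> extractors, byte[] input) {
        for (Extractor extractor : extractors) {
            if (extractor.sniff(input)) {
                return extractor;
            }
        }
        throw new IllegalStateException("None of the extractors recognized the input");
    }

    public static void main(String[] args) {
        Extractor mp3 = new Extractor() {
            public String name() { return "mp3"; }
            public boolean sniff(byte[] input) { return false; } // rejects ADTS data
        };
        Extractor adts = new Extractor() {
            public String name() { return "adts"; }
            public boolean sniff(byte[] input) {
                return input.length >= 2 && (input[0] & 0xFF) == 0xFF
                    && (input[1] & 0xF0) == 0xF0;
            }
        };
        byte[] adtsHeader = { (byte) 0xFF, (byte) 0xF1 };
        System.out.println(selectExtractor(Arrays.asList(mp3, adts), adtsHeader).name()); // adts
    }
}
```

The ordering matters because every sniff attempt ahead of the matching extractor may consume buffered input before selection completes.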

It's a bit of a hack, but I think one way to avoid reading the data during sniffing for your use case would be to make a custom extractor that delegates all methods to an internal AdtsExtractor except for sniff which can just return true. Then pass a custom extractors factory that returns just your custom extractor.

Caveats: I didn't check whether AdtsExtractor will end up reading more input before it starts producing output, and you may find that downstream components introduce latency too. For low latency you probably want to configure LoadControl to start playback when very little data is buffered, and you may find that bundling your own decoder (e.g., via the FFmpeg extension) reduces latency compared to using MediaCodec.
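A sketch of that delegation hack with the same kind of simplified stand-ins (the real Extractor interface also has init, read, seek and release methods, which a real wrapper would delegate the same way):

```java
public class SniffBypass {

    // Simplified stand-in for ExoPlayer's Extractor interface.
    interface Extractor {
        boolean sniff(byte[] input);
        String extract(byte[] input);
    }

    static class AdtsExtractor implements Extractor {
        public boolean sniff(byte[] input) {
            return input.length >= 2 && (input[0] & 0xFF) == 0xFF
                && (input[1] & 0xF0) == 0xF0;
        }
        public String extract(byte[] input) {
            return "adts:" + input.length + " bytes";
        }
    }

    // The hack suggested above: delegate everything to an internal
    // AdtsExtractor, except sniff, which succeeds immediately so no input
    // is consumed during format detection.
    static class SniffBypassingAdtsExtractor implements Extractor {
        private final Extractor delegate = new AdtsExtractor();
        public boolean sniff(byte[] input) { return true; }
        public String extract(byte[] input) { return delegate.extract(input); }
    }

    public static void main(String[] args) {
        Extractor extractor = new SniffBypassingAdtsExtractor();
        System.out.println(extractor.sniff(new byte[0]));    // true
        System.out.println(extractor.extract(new byte[90])); // adts:90 bytes
    }
}
```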

@yoursunny

Yes, we made a custom ExtractorsFactory and it's sufficient for now.

I'd suggest that ExoPlayer provide a SingleExtractorFactory:

  • Constructor accepts the type of an extractor.
  • Wrap the extractor so that sniff returns true immediately.
  • createExtractors returns instances of the wrapped extractor.

This would be beneficial to applications where the exact media type is already known.
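Such a SingleExtractorFactory might look roughly like this; the class, its constructor, and the stand-in interfaces are all hypothetical, sketched from the three bullets above:

```java
import java.util.function.Supplier;

public class SingleExtractorFactoryDemo {

    // Simplified stand-ins for the ExoPlayer interfaces.
    interface Extractor {
        boolean sniff(byte[] input);
        String extract(byte[] input);
    }
    interface ExtractorsFactory {
        Extractor[] createExtractors();
    }

    // Hypothetical SingleExtractorFactory, following the three bullets above:
    // the constructor accepts how to build the one extractor, the factory
    // wraps it so that sniff returns true immediately, and createExtractors
    // returns instances of the wrapped extractor.
    static class SingleExtractorFactory implements ExtractorsFactory {
        private final Supplier<Extractor> extractorSupplier;

        SingleExtractorFactory(Supplier<Extractor> extractorSupplier) {
            this.extractorSupplier = extractorSupplier;
        }

        @Override
        public Extractor[] createExtractors() {
            Extractor inner = extractorSupplier.get();
            Extractor wrapped = new Extractor() {
                public boolean sniff(byte[] input) { return true; } // skip sniffing
                public String extract(byte[] input) { return inner.extract(input); }
            };
            return new Extractor[] { wrapped };
        }
    }

    public static void main(String[] args) {
        ExtractorsFactory factory = new SingleExtractorFactory(() -> new Extractor() {
            public boolean sniff(byte[] input) { return false; } // would normally need data
            public String extract(byte[] input) { return "adts"; }
        });
        Extractor only = factory.createExtractors()[0];
        System.out.println(only.sniff(new byte[0]));   // true, despite inner returning false
        System.out.println(only.extract(new byte[0])); // adts
    }
}
```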

@ojw28
Contributor

ojw28 commented Aug 29, 2019

We could also consider having ProgressiveMediaPeriod bypass sniffing in the case that the factory only provides one extractor. If we did that, then I think implementing the required ExtractorsFactory would just be a lambda, so there wouldn't be a need for the library to do it.

The downside of this approach is that the failure mode will be more obscure if the app provides a single extractor and then tries to play media in another format. That said, by the time apps are providing a single extractor, it's probably reasonable for us to trust that they're doing the right thing.

At some point we should also revisit the sniffing order. In particular, I don't think we've ever looked at how much data each extractor sniffs, on average, before establishing the media is not in its own format. If there are extractors that have to sniff a lot of data to do this, we should consider whether it's possible to push them down in the order without increasing the chances of false positives.

andrewlewis assigned kim-vde and unassigned tonihei Aug 30, 2019
tonihei pushed a commit that referenced this issue Sep 5, 2019
Sniffing is performed in ProgressiveMediaPeriod even if a single
extractor is provided. Skip it in that case to improve performance.

Issue:#6325
PiperOrigin-RevId: 266766373
@tonihei
Collaborator

tonihei commented Sep 5, 2019

Fixed by commit above.

tonihei closed this as completed Sep 5, 2019
ojw28 pushed a commit that referenced this issue Sep 17, 2019
Sniffing is performed in ProgressiveMediaPeriod even if a single
extractor is provided. Skip it in that case to improve performance.

Issue:#6325
PiperOrigin-RevId: 266766373
google locked and limited conversation to collaborators Nov 5, 2019