Add decoder 7.1.4 output channel support #10322

Merged
merged 4 commits into from
Jun 14, 2022

Conversation

ybai001
Contributor

@ybai001 ybai001 commented Jun 8, 2022

Android added spatialization support in Android 12L (API level 32). To give a more immersive experience, some decoders, such as the Dolby DD+JOC decoder, now output 7.1.4ch PCM. Without this change, DD+JOC content playback fails if the device spatializer is enabled.

icbaker and others added 4 commits March 15, 2022 16:39
It was renamed to ServerSideAdInsertionMediaSource in
google@cbceb2a
Also force people to use one of the templates

Documentation:
https://docs.github.com/en/communities/using-templates-to-encourage-useful-issues-and-pull-requests/syntax-for-issue-forms
PiperOrigin-RevId: 437210882
(cherry picked from commit e11d105)
Since Android 12L, Android adds spatialization support. Decoder will output 7.1.4ch PCM to get better immersive experience, e.g. Dolby DD+JOC decoder.
@ojw28
Contributor

ojw28 commented Jun 8, 2022

@christosts - Please take a look; thanks!

@@ -1644,6 +1644,9 @@ public static int getAudioTrackChannelConfig(int channelCount) {
          // 8 ch output is not supported before Android L.
          return AudioFormat.CHANNEL_INVALID;
        }
      case 12:
        return Util.SDK_INT >= 32 ? AudioFormat.CHANNEL_OUT_7POINT1POINT4
            : AudioFormat.CHANNEL_INVALID;
Contributor

Just curious (I don't have an Android 12L device with such decoding capabilities at my disposal): have you seen what Spatializer.canBeSpatialized() returns, if we pass an AudioFormat with the channel mask set to AudioFormat.CHANNEL_OUT_7POINT1POINT4?

Contributor Author

@ybai001 ybai001 Jun 10, 2022

It depends on the spatializer capability of the device. For example, any device that integrates the new Dolby DD+JOC decoder (with multichannel output support) also integrates a Dolby 7.1.4 spatializer. On such a device, Spatializer.canBeSpatialized() returns true if you pass an AudioFormat with the channel mask set to AudioFormat.CHANNEL_OUT_7POINT1POINT4. The related code is in AudioPolicyManager::canBeSpatialized().

The logic of our new Dolby DD+JOC decoder is:

  • If the application sets MediaFormat.KEY_MAX_OUTPUT_CHANNEL_COUNT >= 12, the decoder outputs 7.1.4ch PCM.
  • If the application sets this key to a value in [8, 11], the decoder outputs 7.1ch PCM.
  • If the application sets this key to a value in [6, 7], the decoder outputs 5.1ch PCM.
  • If the application sets this key to a value in [2, 5] or doesn't set this key, the decoder outputs 2.0ch PCM.

ExoPlayer sets this key to 99 in the current release, so the DD+JOC decoder outputs 7.1.4ch PCM for DD+JOC content.
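
The mapping described in this comment can be sketched as a small stand-alone Java function. This is only an illustration of the comment, not the decoder's actual code: the class and method names are hypothetical, a null argument stands for "key not set", and treating values below 2 as stereo is an assumption, since the comment does not cover them.

```java
/** Hypothetical model of the decoder behaviour described in this comment. */
final class JocOutputSketch {
  /**
   * Returns the PCM channel count the decoder is described as producing.
   *
   * @param maxOutputChannelCount value of KEY_MAX_OUTPUT_CHANNEL_COUNT, or null if unset.
   */
  static int outputChannelCount(Integer maxOutputChannelCount) {
    if (maxOutputChannelCount == null) {
      return 2; // Key not set: 2.0ch.
    }
    int max = maxOutputChannelCount;
    if (max >= 12) {
      return 12; // 7.1.4ch
    } else if (max >= 8) {
      return 8; // 7.1ch
    } else if (max >= 6) {
      return 6; // 5.1ch
    }
    return 2; // [2, 5] gives 2.0ch; values below 2 are an assumption.
  }
}
```

With the value 99 that ExoPlayer is said to set, `JocOutputSketch.outputChannelCount(99)` falls into the first branch and yields 12 channels.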

Contributor

@christosts christosts Jun 10, 2022

Thank you for the answer. How many channels will the compressed DD+JOC stream have before we decode it? Will it be 12 channels or something else?

For context: ExoPlayer will call the Spatializer.canBeSpatialized() and use a channel mask based on the channel count of the compressed audio. This will happen before decoding, while the player decides which audio track to select.

For example, some DD+JOC content we use is 6 channels and ExoPlayer will call Spatializer.canBeSpatialized() with channel mask AudioFormat.CHANNEL_OUT_5POINT1. I wonder if it is possible to

  • call Spatializer.canBeSpatialized() with a channel mask (e.g. AudioFormat.CHANNEL_OUT_5POINT1) and decide to select the audio track
  • then the decoder outputs a different number of channels, which the spatializer cannot spatialize, so the user does not hear spatial audio.

Do you think this scenario is possible?

Contributor Author

Excellent question!

The original DD+JOC stream can be 11.1ch (11 objects) or 15.1ch (15 objects). The decoded DD+JOC content must go through the Object Audio Renderer (OAR, a component inside the decoder), which converts object-based audio to channel-based audio (currently from 11.1/15.1 to 7.1.4). Depending on its configuration, the OAR can output 2.0/5.1/7.1/5.1.2/7.1.4ch PCM. Currently, for DD+JOC content, we always configure it to output 7.1.4ch PCM.

As for your example, "some DD+JOC content we use is 6 channels", I don't think that is DD+JOC content. It should be legacy DD+ content with a 5.1 channel mask. In that case, ExoPlayer can call Spatializer.canBeSpatialized() with channel mask AudioFormat.CHANNEL_OUT_5POINT1, and it returns true. ExoPlayer then sets MediaFormat.KEY_MAX_OUTPUT_CHANNEL_COUNT to 99 (or 6; either works). Our decoder outputs 6-channel PCM because this content is legacy 5.1 DD+ rather than DD+JOC. The 5.1-channel PCM goes to the spatializer thread. According to the CURRENT Android spatializer thread logic, this 5.1-channel PCM is upmixed to 7.1.4 by filling the left rear, right rear, and all four top channels with zeros. Finally, the spatializer spatializes this 12-channel signal (6 channels of which are zeros).

As for DD+JOC content, your method will succeed on 11.1 DD+JOC content but fail on 15.1 DD+JOC.

  • For 11.1 DD+JOC content, you get a channel count of 12 from the .mpd or .m3u8 file. You then call canBeSpatialized() with AudioFormat.CHANNEL_OUT_7POINT1POINT4, and it returns true.
  • For 15.1 DD+JOC content, you get a channel count of 16 from the .mpd or .m3u8 file. You then call canBeSpatialized() with AudioFormat.CHANNEL_OUT_9POINT1POINT6, and it returns false, so the decoder outputs two-channel PCM in this case.

Contributor Author

After internal discussion, our team suggests always calling Spatializer.canBeSpatialized() with AudioFormat.CHANNEL_OUT_7POINT1POINT4 for DD+JOC content if Util.SDK_INT >= 32.

Two more things:

  • For most DD+JOC content on the market, the manifest file gives a channel count of 16.
  • The original track selector logic should not be affected if Util.SDK_INT < 32. For example, suppose the target device runs Android S (API level 31) and the content has two tracks: (1) 2.0 AAC; (2) DD+JOC. In this case, both the AAC decoder and the DD+JOC decoder output two channels, but the DD+JOC track should have higher priority because of its higher bitrate and higher content channel count. (Our DD+JOC decoder also does immersive processing internally, so it gives a better experience than AAC.)

Contributor

@christosts christosts Jun 10, 2022

As for DD+JOC content, your method will succeed on 11.1 DD+JOC content but fail on 15.1 DD+JOC.

I would like to clarify what might happen with 15.1 DD+JOC:

  • We generally select the channel mask with the method you have updated. For 16 channels, Util.getAudioTrackChannelConfig() will return AudioFormat.CHANNEL_INVALID.
  • During track selection, I expect Spatializer.canBeSpatialized() to return false with channel mask AudioFormat.CHANNEL_INVALID. Therefore, on an Android 12L device that supports Spatialization, we will try to select an alternative audio track, e.g. if the content offers a stereo track.
  • Still, the player might pick the 15.1 DD+JOC track. That can happen either because the content does not have other audio tracks available, or because the device does not support Spatialization in which case we generally select the audio track with the highest channel count.
    • "device does not support Spatialization" means: device is before Android 12L, or device is Android 12L+ and Spatializer.getImmersiveAudioLevel() returns SPATIALIZER_IMMERSIVE_AUDIO_LEVEL_NONE.
  • ExoPlayer configures MediaCodec with MediaFormat.KEY_MAX_OUTPUT_CHANNEL_COUNT to 99. Will the decoder output 16 channels in this case?
  • If the decoder outputs 16 channels, we will call Util.getAudioTrackChannelConfig(16) to get a channel mask to configure AudioTrack, which will be AudioFormat.CHANNEL_INVALID and not AudioFormat.CHANNEL_OUT_9POINT1POINT6. I believe playback will fail.
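
The playback-path concern in the last bullet can be modeled off-device with a simplified stand-in for the channel config lookup. This is a sketch under stated assumptions, not ExoPlayer's code: the enum and class names are hypothetical, the SDK level is passed in as a parameter so the code runs without Android, and only the 12- and 16-channel cases discussed in this thread are modeled.

```java
/** Hypothetical stand-in for the two channel masks relevant to this thread. */
enum ChannelMask { OUT_7POINT1POINT4, INVALID }

/** Simplified model of the lookup discussed in this review thread. */
final class ChannelConfigModel {
  /**
   * Models the lookup for a decoder output channel count.
   *
   * @param channelCount number of PCM channels the decoder outputs.
   * @param sdkInt Android API level, passed explicitly for this sketch.
   */
  static ChannelMask audioTrackChannelConfig(int channelCount, int sdkInt) {
    switch (channelCount) {
      case 12:
        // The case added by this pull request.
        return sdkInt >= 32 ? ChannelMask.OUT_7POINT1POINT4 : ChannelMask.INVALID;
      case 16:
        // No 16-channel case exists, so the lookup reports an invalid mask.
        return ChannelMask.INVALID;
      default:
        // Other counts are handled by the real method but omitted here.
        throw new IllegalArgumentException("Not modeled: " + channelCount);
    }
  }
}
```

Under this model, a decoder that emitted 16 channels would always end up with an invalid mask, whereas 12 channels map to a usable mask from API level 32 onwards.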

I'm just bringing this case to your attention in case you want 15.1 DD+JOC to work too.

Contributor Author

@ybai001 ybai001 Jun 13, 2022

Thanks for your detailed clarification. I think the key point is that DD+JOC is unique: its stream channel count (16) differs from its decoder output channel count (12). The reason is that DD+JOC is an object-based stream rather than a legacy channel-based stream.

As for your questions:

ExoPlayer configures MediaCodec with MediaFormat.KEY_MAX_OUTPUT_CHANNEL_COUNT to 99. Will the decoder output 16 channels in this case?

The decoder outputs 7.1.4 (12 channels) if the device runs Android 12L or later, and 2 channels if the device runs Android 12 or earlier.

I'm just bringing this case to your attention in case you want 15.1 DD+JOC to work too.

YES, we need to make 15.1 DD+JOC work too, because most DD+JOC content on the market is in this format.

So, is it possible to implement this proposal on your side?

After internal discussion, our team suggests always calling Spatializer.canBeSpatialized() with AudioFormat.CHANNEL_OUT_7POINT1POINT4 for DD+JOC content if Util.SDK_INT >= 32.

That is to say, the first two steps would be

  • We generally select the channel mask with the method you have updated. For 16 channels, Util.getAudioTrackChannelConfig() will return AudioFormat.CHANNEL_INVALID.
  • During track selection, I know this is DD+JOC content, so I call Spatializer.canBeSpatialized() with channel mask AudioFormat.CHANNEL_OUT_7POINT1POINT4 rather than AudioFormat.CHANNEL_INVALID. On an Android 12L device that integrates the Dolby spatializer, I expect it to return true. Therefore, the DD+JOC track is selected and the user gets a better immersive effect.

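The corresponding pseudocode is:

// For 15.1 DD+JOC the manifest channel count is 16, which maps to CHANNEL_INVALID.
int channelConfig = Util.getAudioTrackChannelConfig(16);
// Compare MIME types with equals(); == only compares String references.
if (Util.SDK_INT >= 32 && MimeTypes.AUDIO_E_AC3_JOC.equals(format.sampleMimeType)) {
  channelConfig = AudioFormat.CHANNEL_OUT_7POINT1POINT4;
}
// Call Spatializer.canBeSpatialized() with channelConfig.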
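
A self-contained version of this proposal, runnable without the Android SDK, might look like the sketch below. It is an illustration only: the class and method names are hypothetical, channel masks are represented as plain strings rather than AudioFormat constants, the SDK level is a parameter, and the MIME type constant mirrors the value of ExoPlayer's MimeTypes.AUDIO_E_AC3_JOC.

```java
/** Hypothetical model of the proposed mask override for DD+JOC tracks. */
final class SpatializerMaskSketch {
  // Mirrors the value of ExoPlayer's MimeTypes.AUDIO_E_AC3_JOC.
  static final String AUDIO_E_AC3_JOC = "audio/eac3-joc";
  // String stand-in for AudioFormat.CHANNEL_OUT_7POINT1POINT4.
  static final String MASK_7POINT1POINT4 = "CHANNEL_OUT_7POINT1POINT4";

  /**
   * Returns the mask to pass to the spatializer check during track selection.
   *
   * @param sampleMimeType MIME type of the candidate track, may be null.
   * @param baselineMask mask derived from the manifest channel count.
   * @param sdkInt Android API level, passed explicitly for this sketch.
   */
  static String maskForSpatializerCheck(
      String sampleMimeType, String baselineMask, int sdkInt) {
    // On API 32+, DD+JOC is always checked as 7.1.4, whatever the manifest says.
    if (sdkInt >= 32 && AUDIO_E_AC3_JOC.equals(sampleMimeType)) {
      return MASK_7POINT1POINT4;
    }
    // Every other case keeps the existing behaviour.
    return baselineMask;
  }
}
```

Placing the constant on the left of equals() keeps the check safe when a track has no sample MIME type, and returning the baseline mask below API level 32 leaves the existing track selection logic untouched, as requested earlier in the thread.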

@christosts
Contributor

Thank you for the contribution. We'll try to include this CL in the upcoming 2.18 release

@marcbaechinger marcbaechinger merged commit 1c373d5 into google:dev-v2 Jun 14, 2022
@ybai001 ybai001 deleted the dev-v2-multichannel branch June 15, 2022 06:16
marcbaechinger added a commit that referenced this pull request Jun 15, 2022
PiperOrigin-RevId: 454641746
(cherry picked from commit 1c373d5)
rohitjoins pushed a commit that referenced this pull request Jul 7, 2022
PiperOrigin-RevId: 454641746
(cherry picked from commit 970eb44)
rohitjoins pushed a commit that referenced this pull request Jul 13, 2022
@google google locked and limited conversation to collaborators Aug 14, 2022
5 participants