This repository has been archived by the owner. It is now read-only.

Muxing 8-channel raw AAC-LC is not working #2107

Closed
Chipcraft opened this Issue Sep 29, 2017 · 13 comments

Chipcraft commented Sep 29, 2017

Muxing 8-channel (7.1) AAC-LC directly results in an MKV file that doesn't load properly in MPC-HC. Such an audio track also prevents all tracks placed after it (i.e. with a greater track number), e.g. PGS subtitles, from loading.

For example:

Track 0:0 = Video, Track 0:1 = Opus Audio, Track 0:2 = AAC 7.1 Audio, Track 0:3 = PGS Subtitle, Track 0:4 = PGS Subtitle.

With this layout, MPC-HC will only see tracks 0:0 and 0:1.

MediaInfo will detect the number of channels for the AAC stream as 0.

A workaround: mux the AAC 7.1 track to MKV with FFmpeg (e.g. 3.3.4) and use the resulting MKV as the source for the AAC 7.1 track in MKVToolNix. This way everything works as expected.
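The two-step workaround can be sketched as a pair of command lines. This is only an illustration: the thread does not give the exact options that were used, the file names are hypothetical, and `build_workaround_commands` is a made-up helper. It assumes `ffmpeg` and `mkvmerge` are on the PATH and uses only stream copy (`-c:a copy`) and mkvmerge's `-o` output option.

```python
import subprocess


def build_workaround_commands(aac_path, other_sources, output_path,
                              intermediate_path="aac71_intermediate.mkv"):
    """Return the two command lines of the workaround (names are illustrative)."""
    # Step 1: let FFmpeg wrap the raw AAC track in Matroska without re-encoding.
    remux = ["ffmpeg", "-i", aac_path, "-c:a", "copy", intermediate_path]
    # Step 2: feed the intermediate MKV to mkvmerge instead of the raw .aac file.
    merge = ["mkvmerge", "-o", output_path, *other_sources, intermediate_path]
    return [remux, merge]


def run_workaround(aac_path, other_sources, output_path):
    """Run both steps, stopping at the first failure."""
    for command in build_workaround_commands(aac_path, other_sources, output_path):
        subprocess.run(command, check=True)
```

Calling `run_workaround("audio71.aac", ["video.mkv"], "final.mkv")` would run both tools in order. Track order in the final file follows the order of the sources passed to mkvmerge.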

Owner

mbunkus commented Sep 29, 2017

Thanks for the report. Please upload a sample file (the AAC before muxing it) to my FTP server or somewhere else. Thanks.

Chipcraft commented Sep 29, 2017

Sample: https://1drv.ms/u/s!Ag6oE4SOsCmDhQnLblQH2yOrduBd

5.1 AAC-LC works normally.

Owner

mbunkus commented Sep 29, 2017

Thanks. I'll look into it, but that'll take a while due to current time constraints.

Owner

mbunkus commented Sep 29, 2017

The problem with that file is the ADTS AAC container format itself. The ADTS container cannot contain all the information modern AAC requires. Therefore the LOAS/LATM container was invented.

For example: in ADTS there are only two bits for the profile which doesn't allow for signalling HE-AACv2. Another problem is the number of channels for which only three bits are available.

Your file has all those channel bits set to 0. This is invalid according to the ADTS standard. ffmpeg circumvents this by fully decoding an AAC frame and thereby determining how many channels there actually are. mkvmerge cannot do that as mkvmerge does not contain a full AAC decoder (and it never will). So this boils down to it being a design limitation of mkvmerge vs. a design limitation of the ADTS AAC container format.

Note that MP4Box has the same issue, for the same reasons.

This bug (or rather: this design limitation) won't be fixed. You should continue using the workaround you've described.

I will adjust mkvmerge to emit an error for such files, though, so users won't have the impression that everything went well when it didn't.
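To make the header limits described above concrete, here is a minimal Python sketch that extracts the 2-bit profile and the 3-bit channel configuration from a 7-byte ADTS header. It is not mkvmerge's code; `parse_adts_header` is a hypothetical name, and the sample bytes were reconstructed by hand from the field values in the MediaInfo trace posted later in this thread, not copied from the sample file.

```python
def parse_adts_header(data: bytes) -> dict:
    """Decode the fixed-position fields of an ADTS header (no CRC handling)."""
    if len(data) < 7:
        raise ValueError("an ADTS header needs at least 7 bytes")
    if data[0] != 0xFF or (data[1] & 0xF0) != 0xF0:
        raise ValueError("missing 0xFFF syncword")
    return {
        # ID bit: 0 = MPEG-4, 1 = MPEG-2
        "mpeg_version": 2 if data[1] & 0x08 else 4,
        "protection_absent": bool(data[1] & 0x01),
        # Only 2 bits are available for the profile.
        "profile": (data[2] >> 6) & 0x03,
        "sampling_frequency_index": (data[2] >> 2) & 0x0F,
        # Only 3 bits for the channel configuration, split across two bytes.
        "channel_configuration": ((data[2] & 0x01) << 2) | (data[3] >> 6),
        # 13-bit frame length, spread over three bytes.
        "frame_length": ((data[3] & 0x03) << 11) | (data[4] << 3) | (data[5] >> 5),
    }


# Header rebuilt from the trace values: AAC LC, index 3 (48 kHz),
# channel_configuration 0, frame length 49.
header = parse_adts_header(bytes([0xFF, 0xF1, 0x4C, 0x00, 0x06, 0x3F, 0xFC]))
print(header["channel_configuration"])  # 0
```

A channel configuration of 0 is exactly the case discussed here: the header alone does not say how many channels the stream has.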

@mbunkus mbunkus closed this Sep 29, 2017

mbunkus added a commit that referenced this issue Sep 29, 2017

AAC reader: show error if sampling frequency or channels is 0
The problem is with the ADTS AAC container format itself. The ADTS
container cannot contain all the information modern AAC
requires. Therefore the LOAS/LATM container was invented.

For example: in ADTS there are only two bits for the profile which
doesn't allow for signalling HE-AACv2. Another problem is the number
of channels for which only three bits are available.

For files that have all those channel bits set to 0, the output file
contained a channel count of 0 — obviously invalid.

See #2107.
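
The check named in the commit title can be illustrated with a short Python sketch. This is not the actual mkvmerge change (mkvmerge is written in C++); `sampling_frequency` and `check_audio_parameters` are hypothetical names, and only the standard MPEG-4 sampling frequency index table is assumed.

```python
# Standard MPEG-4 audio sampling frequency table, indexed by sampling_frequency_index.
SAMPLING_FREQUENCIES = [
    96000, 88200, 64000, 48000, 44100, 32000, 24000,
    22050, 16000, 12000, 11025, 8000, 7350,
]


def sampling_frequency(index: int) -> int:
    """Map a sampling_frequency_index to Hz; unknown indexes yield 0."""
    if 0 <= index < len(SAMPLING_FREQUENCIES):
        return SAMPLING_FREQUENCIES[index]
    return 0


def check_audio_parameters(sampling_frequency_hz: int, channels: int) -> None:
    """Refuse to continue instead of silently writing 0 into the output file."""
    problems = []
    if sampling_frequency_hz == 0:
        problems.append("sampling frequency is 0")
    if channels == 0:
        problems.append("number of channels is 0")
    if problems:
        raise ValueError("unsupported AAC header: " + ", ".join(problems))
```

With the values from this report (index 3 and a channel count of 0), `check_audio_parameters(sampling_frequency(3), 0)` raises instead of producing a track with 0 channels.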

JeromeMartinez commented Sep 30, 2017

> For example: in ADTS there are only two bits for the profile which doesn't allow for signalling HE-AACv2.

A bit off topic, but I kindly disagree: ADTS cannot transport NBC (Non Backward Compatible, with ObjectType 5 or 29) HE-AACv2, nor explicit SBR/PS signaling, but there is no problem transporting HE-AACv2 with implicit signaling (you just don't get the SBR/PS info in the header; you see LC). You can transmux HE-AACv2 with implicit signaling without problems, from ADTS or from MP4. MP4 with implicit SBR/PS signaling has the same issue: you store the channels/frequency metadata of the compatibility layer, and you get this legacy metadata whether the source is MP4 with implicit signaling or ADTS.

> Therefore the LOAS/LATM container was invented (...) mkvmerge cannot do that as mkvmerge does not contain a full AAC decoder (and it never will).

Can you mux from LOAS/LATM or MP4? If yes, this file uses the same mechanism for signaling 7.1, and you don't need a full AAC decoder: just pass the beginning of the frame content to the program_config_element parser (for LOAS/LATM) or GASpecificConfig parser (for MP4) you already have. The channel config is at the very beginning of the frame and uses the same bitstream syntax as the GASpecificConfig in the MP4 AAC descriptor. Before parsing, probe that the first 3 bits are set to 5.

000000 adts_frame - 0 (0x0) (49 bytes)
000000  adts_fixed_header (3 bytes)
000000   syncword:                             4095 (0xFFF)
000001   id:                                   No - MPEG-4
000001   layer:                                0 (0x0)
000001   protection_absent:                    Yes
000002   profile_ObjectType:                   1 (0x1) - (2 bits) - AAC LC
000002   sampling_frequency_index:             3 (0x3) - (4 bits) - 48000 (0xBB80) Hz
000002   private:                              No
000002   channel_configuration:                0 (0x0) - (3 bits)
000003   original:                             No
000003   home:                                 No
000003  adts_variable_header (4 bytes)
000003   copyright_id:                         No
000003   copyright_id_start:                   No
000003   aac_frame_length:                     49 (0x0031) - (13 bits)
000005   adts_buffer_fullness:                 2047 (0x7FF) - (11 bits) - VBR
000006   num_raw_data_blocks:                  0 (0x0) - (2 bits)
000007  raw_data_block (49 bytes)
000000   PCE - program_config_element (16 bytes)
000007    id_syn_ele:                          5 (0x5) - (3 bits) - PCE - program_config_element
000007    program_config_element (9 bytes)
000007     element_instance_tag:               0 (0x0) - (4 bits)
000007     object_type:                        1 (0x1) - (2 bits) - AAC LC
000008     sampling_frequency_index:           3 (0x3) - (4 bits) - 48000 (0xBB80)
000008     num_front_channel_elements:         2 (0x2) - (4 bits)
000009     num_side_channel_elements:          0 (0x0) - (4 bits)
000009     num_back_channel_elements:          2 (0x2) - (4 bits)
00000A     num_lfe_channel_elements:           1 (0x1) - (2 bits)
00000A     num_assoc_data_elements:            0 (0x0) - (3 bits)
00000A     num_valid_cc_elements:              0 (0x0) - (4 bits)
00000B     mono_mixdown_present:               No
00000B     stereo_mixdown_present:             No
00000B     matrix_mixdown_idx_present:         No
00000B     front_element (1 bytes)
00000B      front_element_is_cpe:              No
00000C      front_element_tag_select:          0 (0x0) - (4 bits)
00000C     front_element (0 bytes)
00000C      front_element_is_cpe:              Yes
00000C      front_element_tag_select:          0 (0x0) - (4 bits)
00000C     side_element (1 bytes)
00000C      side_element_is_cpe:               Yes
00000D      side_element_tag_select:           1 (0x1) - (4 bits)
00000D     back_element (1 bytes)
00000D      back_element_is_cpe:               Yes
00000E      back_element_tag_select:           2 (0x2) - (4 bits)
00000E     lfe_element (0 bytes)
00000E      lfe_element_tag_select:            0 (0x0) - (4 bits)
00000F     comment_field_bytes:                0 (0x00)
000010   SCE - single_channel_element (4 bytes)
000010    id_syn_ele:                          0 (0x0) - (3 bits) - SCE - single_channel_element
(...)

(I'm not saying that you should do it; this is just a hint in case you already have most of the code and are interested in supporting such files.)

@Chipcraft it looks like you are using an outdated MediaInfo version; the current one correctly displays 7.1 for this file.
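
The approach suggested above (probe the first 3 bits of the raw data block for the value 5, then read the program_config_element counts) could look roughly like the following Python sketch. It is an illustration under assumptions, not MediaInfo's or mkvmerge's parser: `BitReader`, `channels_from_leading_pce` and `bits_to_bytes` are hypothetical names, the field widths follow the trace above, and the example bits are hand-built rather than taken from the sample file.

```python
class BitReader:
    """Reads big-endian bit fields from a bytes object."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # position in bits

    def read(self, count: int) -> int:
        value = 0
        for _ in range(count):
            byte_index, bit_index = divmod(self.pos, 8)
            if byte_index >= len(self.data):
                raise ValueError("ran out of data")
            value = (value << 1) | ((self.data[byte_index] >> (7 - bit_index)) & 1)
            self.pos += 1
        return value


ID_PCE = 5  # id_syn_ele value of a program_config_element


def channels_from_leading_pce(raw_data_block: bytes):
    """Return the channel count from a leading PCE, or None if there is none."""
    reader = BitReader(raw_data_block)
    if reader.read(3) != ID_PCE:
        return None
    reader.read(4)  # element_instance_tag
    reader.read(2)  # object_type
    reader.read(4)  # sampling_frequency_index
    num_front = reader.read(4)
    num_side = reader.read(4)
    num_back = reader.read(4)
    num_lfe = reader.read(2)
    reader.read(3)  # num_assoc_data_elements
    reader.read(4)  # num_valid_cc_elements
    if reader.read(1):  # mono_mixdown_present
        reader.read(4)
    if reader.read(1):  # stereo_mixdown_present
        reader.read(4)
    if reader.read(1):  # matrix_mixdown_idx_present
        reader.read(3)  # matrix_mixdown_idx + pseudo_surround_enable
    channels = 0
    for _ in range(num_front + num_side + num_back):
        is_cpe = reader.read(1)  # channel pair element = 2 channels
        reader.read(4)           # tag_select
        channels += 2 if is_cpe else 1
    return channels + num_lfe    # each LFE element is one channel


def bits_to_bytes(bits: str) -> bytes:
    """Helper for the example: turn a '0'/'1' string into zero-padded bytes."""
    bits = bits.replace(" ", "")
    bits += "0" * (-len(bits) % 8)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


example = bits_to_bytes(
    "101 0000 01 0011"    # id_syn_ele=5, tag, object_type, sampling_frequency_index
    " 0010 0001 0001 01"  # 2 front, 1 side, 1 back, 1 LFE element
    " 000 0000 0 0 0"     # no assoc data, no CC elements, no mixdown info
    " 0 0000 1 0000"      # front: one SCE, one CPE
    " 1 0001 1 0010"      # side: CPE, back: CPE
    " 0000"               # LFE tag
)
print(channels_from_leading_pce(example))  # 8
```

The function deliberately stops after the element list: for counting channels, the LFE tags, associated data, coupling channels and comment field do not need to be read.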

Chipcraft commented Sep 30, 2017

@JeromeMartinez

MediaInfo displays the correct channel information as long as the file isn't muxed with mkvmerge.
Once it has been, MediaInfo displays 0 channels.

Owner

mbunkus commented Sep 30, 2017

Yes, mkvmerge can parse LOAS/LATM in M2TS as well as GASpecificConfig in MP4 just fine.

> Can you mux from LOAS/LATM or MP4? If yes, this file has the same mechanism for signaling 7.1, you don't need a full AAC decoder, just link the begin of the frame content to the program_config_element for LOAS/LATM or GASpecificConfig (for MP4) parser you already get, channel config is at the very beginning of the frame and use the same bitstream as GASpecificConfig used in MP4 AAC descriptor, after having probed that the first 3 bits are set to 5.

I see. I wasn't aware that the PCE is right at the beginning. That simplifies things. I'll look into it. Thanks for the information.

Owner

mbunkus commented Oct 1, 2017

@JeromeMartinez As far as I see ISO/IEC 14496-3 doesn't say anything about the order of the raw_data_blocks. Additionally program_config_element seems to be optional. So in order to reliably determine the number of channels from the bitstream a parser would still have to be able to decode all syntax elements, not just the PCE. Therefore I stand by my earlier assertion.

For this specific file my patch fixes the issue, true, but it's not a panacea for bad header data.

JeromeMartinez commented Oct 1, 2017

True in theory. In practice I have never seen the PCE missing, or placed anywhere other than as the first element of an AAC bitstream, when the channel config is not available from elsewhere (e.g. from the old-fashioned ADTS header), and I see no reason to put the PCE elsewhere. (Here the PCE is present because ADTS cannot handle a 7.1 config natively; this is a legitimate hack if you have no choice and must use ADTS for transporting the stream.) So I would say that your fix is OK for 99% of 7.1 files in ADTS, and for the missing 1%... that's their problem if they want to make things complicated.
Note that MediaInfo has a relatively lightweight AAC parser dedicated to finding the real channel count / frequency (including SBR and PS when they are implicitly transported, i.e. without any info in the descriptor, as is sometimes the case), but I think it is not worth implementing that for such hypothetical files or for implicitly transported SBR/PS unless the feature is strongly requested.

Owner

mbunkus commented Oct 1, 2017

> True in theory, in practice I never saw PCE missing or elsewhere than first element in an AAC bitstream when channel config is not available from elsewhere

Yeah, I assumed as much, as having the number of channels before any of the samples are decoded is paramount to the decoding process. ISO/IEC 14496-3 section 1.A.4.3 "Audio Data Transport Stream (ADTS)" has this to say:

> channel_configuration: Indicates the channel configuration used. In the case of (channel_configuration > 0), the channel configuration is given in Table 1.17. In the case of (channel_configuration == 0), the channel configuration is not specified in the header, but as follows:
> MPEG-2/4 ADTS: A single program_config_element() following as first syntactic element in the first raw_data_block() after the header specifies the channel configuration. Note that the program_config_element() might not be present in each frame. An MPEG-4 ADTS decoder should not generate any output until it received a program_config_element(), while an MPEG-2 ADTS decoder may assume an implicit channel configuration.

This further strengthens the assumption that this fix is the right one.

> here, PCE is there because ADTS can not handle 7.1 config natively

Uhm, why not? According to ISO/IEC 14496-3 table 1.17 channel_configuration == 7 means 7.1 = 8 channels.
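
The rule quoted from the spec amounts to a two-way decision, which a short Python sketch can make explicit. `resolve_channel_count` is a hypothetical helper, not mkvmerge's API; the mapping reflects the channel counts for channel_configuration 1 to 7 as I understand Table 1.17 (with 7 meaning 7.1, i.e. 8 channels, as stated above), and the PCE-derived count is assumed to come from a separate parser.

```python
# Channel counts for channel_configuration values 1..7 (7 = 7.1 = 8 channels).
CHANNELS_BY_CONFIGURATION = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 8}


def resolve_channel_count(channel_configuration: int, pce_channel_count=None) -> int:
    """Pick the channel count from the header, else from a leading PCE."""
    if not 0 <= channel_configuration <= 7:
        raise ValueError("channel_configuration is a 3-bit field in ADTS")
    if channel_configuration > 0:
        # The header is authoritative; no need to look into the raw data block.
        return CHANNELS_BY_CONFIGURATION[channel_configuration]
    if pce_channel_count:
        # channel_configuration == 0: the first syntactic element must be a PCE.
        return pce_channel_count
    # No usable information yet: a frame without PCE may simply precede one.
    raise ValueError("channel configuration not specified in header or PCE")
```

For example, `resolve_channel_count(7)` yields 8 directly from the header, while `resolve_channel_count(0, 8)` relies on the count taken from the program_config_element.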

JeromeMartinez commented Oct 1, 2017

> Uhm, why not? According to ISO/IEC 14496-3 table 1.17 channel_configuration == 7 means 7.1 = 8 channels.

Argh, my version is from 2009, and there values 6-7 are reserved. I would say that the file comes from an old encoder conforming to the 2009 version of the spec.

Note: reading the spec quickly, I think you (and I) were wrong about the location of the PCE.
"ADTS: A single program_config_element() following as first syntactic element in the first raw_data_block() after the header specifies the channel configuration."

Edit: you edited your comment at the same time to say the same thing ;-)

Owner

mbunkus commented Oct 1, 2017

> Note: reading quickly the spec, I think you (and me) were wrong about the location of PCE.
>
> "ADTS: A single program_config_element() following as first syntactic element in the first raw_data_block() after the header specifies the channel configuration."

No, we're right 😄 The first syntactic element is the 3 bit long id_syn_ele.

Collaborator

remuxer32 commented Oct 1, 2017

New pre-builds for Windows that contain the fix have been uploaded here:
