DecodeESDecriptor: read size 19 differs from calculated size 17 #348

Closed
3052 opened this issue Apr 27, 2024 · 12 comments

Comments

3052 commented Apr 27, 2024

using this file:

https://u2.video.9c9media.com/video/v1/264265/dash/widevine/zbest-01000010/01-75rm72rm68qmjwzllzjpqsigwgzllgv76qzxqqpppplpppppvspppz7ak726fzpzqzrxtwtstuyzsppppzvppvzqppp/aac-ffa6v1-english-primary-128000/init.mp4

I get this result with current master:

> mp4ff-info init.mp4
2024/04/27 16:35:32 decode box "moov": decode box trak: decode box mdia: decode
box minf: decode box stbl: decode box stsd: decode box enca: decode box esds:
DecodeESDecriptor: read size 19 differs from calculated size 17
Author

3052 commented Apr 27, 2024

works with Bento4:

Bento4-SDK-1-6-0-641.x86_64-microsoft-win32> bin\mp4dump init.mp4
[ftyp] size=8+20
  major_brand = isom
  minor_version = 0
  compatible_brand = isom
  compatible_brand = iso5
  compatible_brand = iso6
[moov] size=8+715
  [mvhd] size=12+96
    timescale = 90000
    duration = 0
    duration(ms) = 0
  [trak] size=8+476
    [tkhd] size=12+80, flags=7
      enabled = 1
      id = 1
      duration = 0
      width = 0.000000
      height = 0.000000
    [mdia] size=8+376
      [mdhd] size=12+20
        timescale = 90000
        duration = 0
        duration(ms) = 0
        language = und
      [hdlr] size=12+25
        handler_type = soun
        handler_name =
      [minf] size=8+299
        [smhd] size=12+4
          balance = 0
        [dinf] size=8+28
          [dref] size=12+16
            [url ] size=12+0, flags=1
              location = [local to file]
        [stbl] size=8+239
          [stsd] size=12+159
            entry_count = 1
            [enca] size=8+147
              data_reference_index = 1
              channel_count = 2
              sample_size = 16
              sample_rate = 48000
              [esds] size=12+27
                [ESDescriptor] size=2+25
                  es_id = 1
                  stream_priority = 0
                  [DecoderConfig] size=2+19
                    stream_type = 5
                    object_type = 64
                    up_stream = 0
                    buffer_size = 0
                    max_bitrate = 0
                    avg_bitrate = 128000
                    DecoderSpecificInfo = 11 90
                    [Descriptor:06] size=2+1
              [sinf] size=8+72
                [frma] size=8+4
                  original_format = mp4a
                [schm] size=12+8
                  scheme_type = cenc
                  scheme_version = 65536
                [schi] size=8+32
                  [tenc] size=12+20
                    default_isProtected = 1
                    default_Per_Sample_IV_Size = 8
                    default_KID = [cb 09 57 1e eb cb 3f 72 87 20 26 57 f6 b9 f7 a6]
          [stts] size=12+4
            entry_count = 0
          [stsc] size=12+4
            entry_count = 0
          [stsz] size=12+8
            sample_size = 0
            sample_count = 0
          [stco] size=12+4
            entry_count = 0
  [mvex] size=8+32
    [trex] size=12+20
      track id = 1
      default sample description index = 1
      default sample duration = 0
      default sample size = 0
      default sample flags = 0
  [pssh] size=12+71
    system_id = [ed ef 8b a9 79 d6 4a ce a3 c8 27 dc d5 1d 21 ed]
    data_size = 51

Collaborator

tobbee commented May 2, 2024

Interesting.

I checked the parsing of the DecoderConfigDescriptor in your file. It does declare a payload size of 19 bytes, as Bento4 says, but I can only find 17 bytes of parseable payload. There may be some special syntax due to the object type, which has the uncommon value of 64; I couldn't find that value in the standard when I looked at it quickly. The normal values are 2, 5, or 29. I may need to look in a newer amendment.

The MPEG-4 descriptors are a big mess, though, with their own definitions in part 1 and part 3 that do not follow the box structure of the MP4 file format. I will look a bit more later to see if I can find a proper interpretation of the last two bytes.

@3052 Do you happen to know what variant of MPEG-4 audio it is?

Author

3052 commented May 2, 2024

> @3052 Do you happen to know what variant of MPEG-4 audio it is?

how would I get that information?

Collaborator

tobbee commented May 3, 2024

Sorry, a closer reading of the standard reveals that the field listed as object_type by Bento4 and stored as ObjectType in mp4ff is not the ObjectType but an ObjectTypeIndication, where 64 means Audio. This is thus a normal value. It would still be good to get a media segment in addition to the init segment, so that I can run it through ffprobe.

Bento4 seems to be less strict in following the hierarchy of descriptors in the standard and reads descriptors more freely, so it is a bit hard to compare the code. In any case, I think it is worth looking through the descriptor parsing of mp4ff again.
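To illustrate the difference between the ObjectTypeIndication (64) and the actual audio object type: the latter sits in the first five bits of the DecoderSpecificInfo, which Bento4 prints as `11 90`. The following is a minimal Python sketch (not mp4ff code) that decodes those two bytes as an AudioSpecificConfig. The function name is made up for illustration, and the sketch assumes the plain layout without escape values or extensions.

```python
# Sampling frequency table indexed by the 4-bit samplingFrequencyIndex.
SAMPLE_RATES = [96000, 88200, 64000, 48000, 44100, 32000,
                24000, 22050, 16000, 12000, 11025, 8000, 7350]


def parse_audio_specific_config(asc: bytes):
    """Return (audio_object_type, sample_rate, channel_config).

    Hypothetical helper: only handles the basic 2-byte layout
    (5 bits object type, 4 bits frequency index, 4 bits channel config).
    """
    if len(asc) < 2:
        raise ValueError("need at least 2 bytes")
    bits = int.from_bytes(asc[:2], "big")
    audio_object_type = bits >> 11        # top 5 bits
    freq_index = (bits >> 7) & 0x0F       # next 4 bits
    channel_config = (bits >> 3) & 0x0F   # next 4 bits
    if audio_object_type == 31 or freq_index >= len(SAMPLE_RATES):
        raise ValueError("escape or reserved values are not handled in this sketch")
    return audio_object_type, SAMPLE_RATES[freq_index], channel_config


# DecoderSpecificInfo bytes as printed by Bento4 in this thread.
info = parse_audio_specific_config(bytes.fromhex("1190"))
print(info)
```

For `11 90` this gives audio object type 2 (AAC-LC), 48000 Hz and channel configuration 2, which agrees with the sample_rate and channel_count in the Bento4 dump.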

Collaborator

tobbee commented May 3, 2024

@3052, thanks for the media segment. I've checked this more carefully, and this should be a common AAC-LC segment, but the size field of the DecoderConfig is erroneous. It should be 17, not 19, as the error message says (the message could be rephrased to something better, though).

Some more details:

After the DecoderConfig descriptor, there should be an SLConfigDescriptor (1 byte of data preceded by a one-byte tag=06 and a one-byte size field). However, due to the erroneous size, it starts already inside the DecoderConfig descriptor (its last two bytes). Bento4 indeed shows the [Descriptor:06] indented inside the [DecoderConfig], but only two of its 3 bytes fit in the DecoderConfig, so the last byte is actually a trailing byte outside the DecoderConfig but inside the top-level ESDescriptor. The total size in the ESDescriptor ends up right.
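The layout described above can be sketched in Python. The bytes below are reconstructed from the Bento4 dump in this thread, not copied from the real file, and the final SLConfig value byte `02` is an assumption. The helper names are invented for this example, and the walk assumes an ES descriptor without optional fields.

```python
# esds payload reconstructed from the Bento4 dump (assumption, not the real file).
ESDS_PAYLOAD = bytes.fromhex(
    "03 19 00 01 00 "           # ESDescriptor: tag, size=25, ES_ID=1, flags=0
    "04 13 40 15 00 00 00 "     # DecoderConfig: tag, declared size=19, OTI=64, streamType byte, bufferSize
    "00 00 00 00 00 01 f4 00 "  # maxBitrate=0, avgBitrate=128000
    "05 02 11 90 "              # DecoderSpecificInfo: tag, size=2, data
    "06 01 02"                  # SLConfigDescriptor: tag, size=1, value (value assumed)
)


def read_header(buf, pos):
    """Return (tag, size, payload_start) for the descriptor starting at pos."""
    tag = buf[pos]
    pos += 1
    size = 0
    for _ in range(4):  # size uses 1 to 4 bytes, 7 bits each
        b = buf[pos]
        pos += 1
        size = (size << 7) | (b & 0x7F)
        if not b & 0x80:
            break
    return tag, size, pos


def check_decoder_config(buf):
    """Return (declared size, calculated size, bytes left in the ESDescriptor)."""
    tag, es_size, pos = read_header(buf, 0)
    assert tag == 0x03
    es_end = pos + es_size
    pos += 3  # ES_ID (2 bytes) + flags byte; assumes no optional fields
    tag, declared, pos = read_header(buf, pos)
    assert tag == 0x04
    start = pos
    pos += 13  # fixed DecoderConfig fields before DecoderSpecificInfo
    tag, dsi_size, pos = read_header(buf, pos)
    assert tag == 0x05
    pos += dsi_size
    return declared, pos - start, es_end - pos


print(check_decoder_config(ESDS_PAYLOAD))
```

With these bytes the declared size is 19, the parseable payload is 17 bytes, and 3 bytes remain inside the ESDescriptor, which is exactly the size of the SLConfigDescriptor.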

The code structure of mp4ff is to deserialise boxes and descriptors and then serialise them again when writing out the file. It should be possible to read any file and write it again. This will not work in this case without adding quite some code to preserve this inconsistent hierarchy of descriptors.
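A toy Python model of why the round trip breaks. This is not how mp4ff is implemented; the class and function names are hypothetical. It only shows that a writer which derives the size from the parsed fields cannot reproduce the declared size of 19. The input bytes are reconstructed from the Bento4 dump above and are an assumption.

```python
from dataclasses import dataclass


@dataclass
class DecoderConfig:
    """Hypothetical in-memory form: only fields, no stored size."""
    object_type_indication: int
    stream_type_byte: int
    buffer_size: int
    max_bitrate: int
    avg_bitrate: int
    decoder_specific_info: bytes

    def serialize(self) -> bytes:
        dsi = bytes([0x05, len(self.decoder_specific_info)]) + self.decoder_specific_info
        body = (
            bytes([self.object_type_indication, self.stream_type_byte])
            + self.buffer_size.to_bytes(3, "big")
            + self.max_bitrate.to_bytes(4, "big")
            + self.avg_bitrate.to_bytes(4, "big")
            + dsi
        )
        # The size is computed from the content, as a strict writer would do.
        return bytes([0x04, len(body)]) + body


def parse(raw: bytes) -> DecoderConfig:
    """Lenient parse that ignores the declared size byte raw[1]."""
    assert raw[0] == 0x04
    body = raw[2:]
    assert body[13] == 0x05
    dsi_len = body[14]
    return DecoderConfig(
        body[0], body[1],
        int.from_bytes(body[2:5], "big"),
        int.from_bytes(body[5:9], "big"),
        int.from_bytes(body[9:13], "big"),
        bytes(body[15:15 + dsi_len]),
    )


# DecoderConfig with declared size 19, so it swallows "06 01" of the SLConfig.
ORIGINAL = bytes.fromhex("04 13 40 15 00 00 00 00 00 00 00 00 01 f4 00 05 02 11 90 06 01")
rewritten = parse(ORIGINAL).serialize()
print(ORIGINAL[1], rewritten[1], len(ORIGINAL), len(rewritten))
```

The rewritten descriptor carries size 17 and is two bytes shorter than what was read, so writing back the exact input would require storing the bad size and the stray bytes separately.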

This is not the first time I have seen issues with descriptors. I made some errors myself before I realised that some files use 4-byte sizes although one byte would be enough.
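For reference, a small Python sketch of the descriptor size encoding that makes both forms possible: each size byte carries 7 bits, and the top bit signals that another byte follows. The function name is made up for illustration.

```python
def decode_size(buf: bytes, pos: int = 0):
    """Decode a descriptor size field; return (size, number_of_bytes_used)."""
    size = 0
    for used in range(1, 5):  # at most 4 size bytes
        b = buf[pos]
        pos += 1
        size = (size << 7) | (b & 0x7F)
        if b & 0x80 == 0:  # no continuation bit: this was the last byte
            return size, used
    raise ValueError("size field longer than 4 bytes")


# The same size, 17, in its 1-byte and its padded 4-byte form.
print(decode_size(bytes([0x11])))
print(decode_size(bytes([0x80, 0x80, 0x80, 0x11])))
```

Both calls return a size of 17; only the number of bytes consumed differs, which is what a reader has to remember if it wants to write the file back unchanged.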

I'm not willing to change mp4ff to handle this kind of erroneous input, since it is a relatively big task. If you have a lot of such content, you can do some preprocessing and correct the size byte from 19 to 17. I just made that change in a hex editor, and then mp4ff can parse the file without problems.
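The hex-editor fix could be automated along these lines. This is only a sketch under several assumptions: it looks for the first `esds` box type in the data, handles 1-byte descriptor sizes only, and expects an ES descriptor with no optional fields. The function name and the sample box (rebuilt from the Bento4 dump, with an assumed SLConfig value byte) are hypothetical.

```python
def fix_decoder_config_size(data: bytes) -> bytes:
    """Rewrite the DecoderConfig size byte to match its parseable payload."""
    i = data.find(b"esds")
    if i < 0:
        raise ValueError("no esds box found")
    buf = bytearray(data)
    pos = i + 4 + 4  # skip box type and version/flags
    if buf[pos] != 0x03 or buf[pos + 1] & 0x80 or buf[pos + 4] != 0:
        raise ValueError("unsupported ESDescriptor layout")
    pos += 2 + 3  # ES tag + size byte, then ES_ID (2 bytes) + flags
    if buf[pos] != 0x04 or buf[pos + 1] & 0x80:
        raise ValueError("unsupported DecoderConfig layout")
    size_pos = pos + 1
    pos += 2 + 13  # DecoderConfig header + fixed fields
    if buf[pos] != 0x05 or buf[pos + 1] & 0x80:
        raise ValueError("unsupported DecoderSpecificInfo layout")
    # 13 fixed bytes + DecoderSpecificInfo header (2) + its payload
    buf[size_pos] = 13 + 2 + buf[pos + 1]
    return bytes(buf)


# Sample esds box rebuilt from the dump in this thread (assumption, not the real file).
payload = bytes.fromhex(
    "03 19 00 01 00 04 13 40 15 00 00 00 00 00 00 00 "
    "00 01 f4 00 05 02 11 90 06 01 02"
)
sample_box = (12 + len(payload)).to_bytes(4, "big") + b"esds" + bytes(4) + payload
fixed = fix_decoder_config_size(sample_box)
print(sample_box[18], fixed[18])
```

Only one byte changes and the file length stays the same, so no box sizes need to be updated. A search for `esds` can in principle match unrelated data, so a real preprocessing step should walk the box tree instead.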

good_esds.mp4

Author

3052 commented May 3, 2024

note other tools work fine as well:

> mp4tool dump init.mp4
[ftyp] Size=28 MajorBrand="isom" MinorVersion=0 CompatibleBrands=[{CompatibleBrand="isom"}, {CompatibleBrand="iso5"}, {CompatibleBrand="iso6"}]
[moov] Size=723
  [mvhd] Size=108 ... (use "-full mvhd" to show all)
  [trak] Size=484
    [tkhd] Size=92 ... (use "-full tkhd" to show all)
    [mdia] Size=384
      [mdhd] Size=32 Version=0 Flags=0x000000 CreationTimeV0=0 ModificationTimeV0=0 Timescale=90000 DurationV0=0 Language="und" PreDefined=0
      [hdlr] Size=37 Version=0 Flags=0x000000 PreDefined=0 HandlerType="soun" Name=""
      [minf] Size=307
        [smhd] Size=16 Version=0 Flags=0x000000 Balance=0
        [dinf] Size=36
          [dref] Size=28 Version=0 Flags=0x000000 EntryCount=1
            [url ] Size=12 Version=0 Flags=0x000001
        [stbl] Size=247
          [stsd] Size=171 Version=0 Flags=0x000000 EntryCount=1
            [enca] Size=155 DataReferenceIndex=1 EntryVersion=0 ChannelCount=2 SampleSize=16 PreDefined=0 SampleRate=48000
              [esds] Size=39 ... (use "-full esds" to show all)
              [sinf] Size=80
                [frma] Size=12 DataFormat="mp4a"
                [schm] Size=20 Version=0 Flags=0x000000 SchemeType="cenc" SchemeVersion=0x10000
                [schi] Size=40
                  [tenc] Size=32 ... (use "-full tenc" to show all)
          [stts] Size=16 Version=0 Flags=0x000000 EntryCount=0 Entries=[]
          [stsc] Size=16 Version=0 Flags=0x000000 EntryCount=0 Entries=[]
          [stsz] Size=20 Version=0 Flags=0x000000 SampleSize=0 SampleCount=0 EntrySize=[]
          [stco] Size=16 Version=0 Flags=0x000000 EntryCount=0 ChunkOffset=[]
  [mvex] Size=40
    [trex] Size=32 Version=0 Flags=0x000000 TrackID=1 DefaultSampleDescriptionIndex=1 DefaultSampleDuration=0 DefaultSampleSize=0 DefaultSampleFlags=0x0
  [pssh] Size=83 ... (use "-full pssh" to show all)

https://github.com/abema/go-mp4

Collaborator

tobbee commented May 3, 2024

Sure, with a non-strict parser one can make it work. But the mp4ff code is both parser and writer by design, so this content is not compatible with the internal representation.

Author

3052 commented May 3, 2024

OK you say that, but you have not provided a source for your claims. can you link to the spec in question here?

Author

3052 commented May 3, 2024

also, the other module does writing as well:

https://github.com/abema/go-mp4#writing

Collaborator

tobbee commented May 10, 2024

The documents are ISO/IEC 14496-1 (MPEG-4 systems) and ISO/IEC 14496-3 (MPEG-4 audio).
The documents are unfortunately not public; they must be bought from ISO. I have access as a member of the Swedish national standardization group for MPEG, but I cannot share them.

In any case, I've updated the code in PR #350 to handle bad input and write some information about the issues. The descriptors are in general not very essential for media decoding, which is one reason that one can tolerate bad input.

Author

3052 commented May 10, 2024

thank you!

3052 closed this as completed May 10, 2024