Unable to decode mhm1 #56

Closed
sclsj opened this issue Jun 10, 2023 · 13 comments

@sclsj

sclsj commented Jun 10, 2023

mediainfo /Users/jin/Downloads/01\ -\ YOASOBI\ -\ Guai\ Wu.mhm
General
Complete name                            : /Users/jin/Downloads/01 - YOASOBI - Guai Wu.mhm
Format                                   : MPEG-4
Format profile                           : Base Media / Version 1
Codec ID                                 : mp41 (iso8/isom/mp41/dash/cmfc)
File size                                : 39.4 MiB
Duration                                 : 3 min 25 s
Overall bit rate                         : 1 608 kb/s
Album                                    : 怪物
Album/Performer                          : YOASOBI
Part/Position                            : 1
Track name                               : 怪物
Track name/Position                      : 1
Performer                                : YOASOBI
Genre                                    : J-POP/General
Recorded date                            : 2021
Encoded date                             : UTC 2023-05-30 13:03:54
Tagged date                              : UTC 2023-05-30 13:03:54
Cover                                    : Yes
FileExtension_Invalid                    : braw mov mp4 m4v m4a m4b m4p m4r 3ga 3gpa 3gpp 3gp 3gpp2 3g2 k3g jpm jpx mqv ismv isma ismt f4a f4b f4v

Audio
ID                                       : 1
Format                                   : MPEG-H 3D Audio
Format profile                           : LC@L4, BL@L3
Codec ID                                 : mhm1
Duration                                 : 3 min 25 s
Bit rate                                 : 1 602 kb/s
Channel(s)                               : 24 channels (22.2)
Channel layout                           : Lw Rw C LFE Lb Rb L R Cb LFE2 Lss Rss Tfl Tfr Tfc Tc Tbl Tbr Tsl Tsr Tbc Bfc Bfl Bfr
Sampling rate                            : 48.0 kHz
Frame rate                               : 46.875 FPS (1024 SPF)
Stream size                              : 39.3 MiB (100%)
Encoded date                             : UTC 2023-05-30 13:03:54
Tagged date                              : UTC 2023-05-30 13:03:54
Program loudness                         : -2.25 LKFS
Signal group #1                          : 13 objects
 Type                                    : Object
 Number of objects                       : 13 objects
Codec configuration box                  : mhaC
/Users/jin/Downloads/libmpegh-main/bin/ia_mpeghd_testbench -ifile:/Users/jin/Downloads/01\ -\ YOASOBI\ -\ Guai\ Wu.mhm -cicp:1 -ext_ren:1 -ofile:guaiwu.wav
-ifile:/Users/jin/Downloads/01 - YOASOBI - Guai Wu.mhm -ofile:guaiwu.wav 
                       ITTIAM SYSTEMS PVT LTD, BANGALORE
                             http:\\www.ittiam.com
                     IA_MPEG_H_3D_AUD_DEC_MSVC $Rev: 1.2 $
                                        

non fatal error: Ittiam mpegh_dec  core coder module :Initialization: : Insufficient input bytes
@SakethSathuvalli
Member

Hi @sclsj,

This issue may be related to #60. We will get back to you as soon as possible.

Can you please mention the source of these files?

Thanks!

@SonMaxime

Since it looks like a music file, I think it's an Amazon Music rip, as they are the only platform that encodes at this level.

@sclsj
Author

sclsj commented Jul 10, 2023

Yes. One track includes only 13 objects where the standard is 24 (360 RA Level 3), and I'm troubleshooting whether this is platform-related, so I asked someone to download it from Amazon. It also seems that all objects are static (i.e. a channel-only file), which is confusing because I believe that goes against Sony's guidelines.

@SonMaxime

In this song the objects are not static.

@sclsj
Author

sclsj commented Jul 11, 2023

Thank you for the clarification!

Unfortunately I don't have the programming ability to use the API or parse the metadata bitstream from ext_ren, so I can only make an educated guess. In one support document, Sony implies that each dynamic object usually carries only one instrument or stem, such as guitar, bass, vocal, or piano. I can confirm this with other tracks that use the full 24 objects. Because instruments fade in and out in all 13 objects here, I'm led to believe that they might come from the 13ch offline render or another purely channel-based layout. Either way, this album (The Book and The Book 2) confuses me, as it does not seem to conform completely to the Sony guides I have. Since it was produced by a Sony engineer, maybe they have access to additional features and modified guidelines.

Is there a tool or utility you can recommend which can show the metadata of the objects in a way I can easily read and understand?

@SonMaxime

SonMaxime commented Jul 11, 2023

Unfortunately not. The highest level that can be decoded is this one:

General
Complete name                            : C:\Users\Shadow\Desktop\SurroundHiRes-DL\DOWNLOADS\YOASOBI - THE BOOK 2\06 - YOASOBI - Guai Wu.mp4
Format                                   : MPEG-4
Format profile                           : Base Media / Version 1
Codec ID                                 : mp41 (iso8/isom/mp41/dash/cmfc)
File size                                : 26.3 MiB
Duration                                 : 3 min 25 s
Overall bit rate                         : 1 074 kb/s
Album                                    : THE BOOK 2
Album/Performer                          : YOASOBI
Part/Position                            : 1
Track name                               : 怪物
Track name/Position                      : 6
Performer                                : YOASOBI
Genre                                    : J-POP/General
Recorded date                            : 2021
Encoded date                             : 2023-07-11 20:43:16 UTC
Tagged date                              : 2023-07-11 20:43:16 UTC
Cover                                    : Yes

Audio
ID                                       : 1
Format                                   : MPEG-H 3D Audio
Format profile                           : LC@L3, BL@L3
Codec ID                                 : mhm1
Duration                                 : 3 min 25 s
Bit rate                                 : 1 069 kb/s
Channel(s)                               : 12 channels (7.1.4)
Channel layout                           : L R C LFE Lb Rb Lss Rss Tfl Tfr Tbl Tbr
Sampling rate                            : 48.0 kHz
Frame rate                               : 46.875 FPS (1024 SPF)
Stream size                              : 26.2 MiB (100%)
Encoded date                             : 2023-07-11 20:43:16 UTC
Tagged date                              : 2023-07-11 20:43:16 UTC
Program loudness                         : -2.25 LKFS
Signal group #1                          : 13 objects
 Type                                    : Object
 Number of objects                       : 13 objects
Codec configuration box                  : mhaC

@sclsj
Author

sclsj commented Jul 12, 2023

Thank you! Could you please explain how you determined that the objects in this track are not static?

Vamsi100858 pushed a commit that referenced this issue Jul 24, 2023
Significance:

[x] Fix for the decode failure issue reported
    with fragmented MP4 files with mhm1 boxes.
[x] Code cleanup in testbench.

Testing:

[x] Conformance tested.
Vamsi100858 pushed a commit that referenced this issue Jul 24, 2023
Significance:

[x] Fix for the decode failure issue reported
    with fragmented MP4 files with mhm1 boxes.
[x] Code cleanup in testbench.

Testing:

[x] Conformance tested.
[x] Tested with files shared in #60
@Vamsi100858
Collaborator

Hi @sclsj

Can you please try the fix available on the fix-fmp4-mhm1-dec-fail branch: https://github.com/ittiam-systems/libmpegh/tree/fix-fmp4-mhm1-dec-fail
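
The fix targets fragmented MP4 files, and the `dash`/`cmfc` brands in the mediainfo output above suggest the file is one. A minimal sketch for checking this yourself, assuming only the generic ISO base media box layout (a 4-byte big-endian size followed by a 4-byte type): it walks the top-level boxes and reports whether any `moof` (movie fragment) box is present. The function names are illustrative and are not part of libmpegh.

```python
import struct
import sys


def iter_top_level_boxes(path):
    """Yield (box_type, offset, size) for each top-level box in the file."""
    with open(path, "rb") as f:
        f.seek(0, 2)                 # seek to the end to learn the file size
        file_size = f.tell()
        offset = 0
        while offset + 8 <= file_size:
            f.seek(offset)
            size, box_type = struct.unpack(">I4s", f.read(8))
            header_len = 8
            if size == 1:
                # A 64-bit "largesize" field follows the type field.
                ext = f.read(8)
                if len(ext) < 8:
                    break
                (size,) = struct.unpack(">Q", ext)
                header_len = 16
            elif size == 0:
                # The box extends to the end of the file.
                size = file_size - offset
            if size < header_len:
                break                # malformed box; stop instead of looping
            yield box_type.decode("latin-1"), offset, size
            offset += size


def is_fragmented(path):
    """Return True if the file contains at least one top-level 'moof' box."""
    return any(box_type == "moof" for box_type, _, _ in iter_top_level_boxes(path))


if __name__ == "__main__" and len(sys.argv) > 1:
    for box_type, offset, size in iter_top_level_boxes(sys.argv[1]):
        print(f"{box_type} offset={offset} size={size}")
    print("fragmented:", is_fragmented(sys.argv[1]))
```

Run it with the path to the `.mhm` or `.mp4` file as the only argument. A file that lists `moof` boxes between `moov` and the `mdat` boxes is fragmented.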

@sclsj
Author

sclsj commented Jul 24, 2023

Hi,

The program now decodes successfully. Thank you!

@Vamsi100858
Collaborator

Thanks for confirming! Can you please close the issue?

@sclsj
Author

sclsj commented Jul 24, 2023

Hi, could you please look at the attached metadata bitstream and tell me whether it's static or dynamic (i.e. whether the objects move)? Or is there a tool you can recommend?
Guai Wu _ext_ren_oam_md.bs.zip

@SakethSathuvalli
Member

The bitstream follows the syntax that the MPEG-H standard defines for external rendering interfaces. You can parse it by following the syntax given in section 17.10.3 of the MPEG-H specification document.
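
As a starting point for anyone attempting that parse, here is a minimal sketch of an MSB-first bit reader, the usual building block for reading MPEG syntax tables. It does not implement section 17.10.3: the field names and widths in the usage note are hypothetical placeholders, and the real ones must be taken from the specification.

```python
class BitReader:
    """Read unsigned integers MSB-first from a bytes object."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position, counted in bits

    def bits_left(self) -> int:
        return len(self.data) * 8 - self.pos

    def read(self, n: int) -> int:
        """Return the next n bits as an unsigned integer."""
        if n > self.bits_left():
            raise EOFError("not enough bits left in the buffer")
        value = 0
        for _ in range(n):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (7 - (self.pos & 7))) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value
```

Usage would look like `reader = BitReader(open(path, "rb").read())` followed by one `reader.read(width)` call per field, in the order the syntax table lists them. For example, `flag = reader.read(1)` followed by `azimuth = reader.read(8)`, where both names and widths are invented for illustration. Once the per-object position fields are decoded for successive frames, positions that never change would indicate static objects.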

Vamsi100858 pushed a commit that referenced this issue Jul 25, 2023
Significance:

[x] Fix for the decode failure issue reported
    with fragmented MP4 files with mhm1 boxes.
[x] Code cleanup in testbench.

Testing:

[x] Conformance tested.
[x] Tested with files shared in #60
sclsj closed this as completed Jul 25, 2023
@sclsj
Author

sclsj commented Jul 25, 2023

Hi,

Thank you. I've already looked at this section. Unfortunately I don't have the needed programming experience. Is there a tool that's already written?
