Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes chdman extract cd (Broken since splitbin option) #12219

Merged
merged 6 commits into from Apr 8, 2024

Conversation

MetalSlug
Copy link
Contributor

@MetalSlug MetalSlug commented Apr 4, 2024

trackinfo.splitframes was not initialized.

…es for audio tracks fixed.

trackinfo.splitframes was not initialized.
Also fixed cdrom::parse_cue.
Audio tracks have wrong PGTYPE (MODE1) when pregap is set. PGTYPE should be VAUDIO.
Comment on lines 2813 to 2819
else
{
cdrom_file::toc* trackinfo = (cdrom_file::toc*)&toc;

for (int tracknum = 0; tracknum < toc.numtrks; tracknum++)
trackinfo->tracks[tracknum].splitframes = 0;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be better to initialize this in cdrom_file::parse_metadata so that it'll properly be initialized in all cases instead of just this specific case. And maybe also in the parse_cue/parse_toc/etc funcs for loading from the loose .cue/.gdi/etc files.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Scratch the part about the parse_cue/etc functions. I just checked and it looks like there's a memset(&outtoc, 0, sizeof(outtoc)); for all of the parser functions, so loading the loose files should be ok. It looks like parse_metadata is the only one that isn't initializing the entire toc to 0 before using it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I scratched the part about the parse_cue and added memset(&toc, 0, sizeof(toc)); to parse_metadata. Works also as expected. Hope my first pull request is correct now.

Comment on lines 2395 to 2400
else if ((outtoc.tracks[trknum].pregap > 0) && (outinfo.track[trknum].idx0offs == -1))
{
outinfo.track[trknum].idx0offs = frames;
outtoc.tracks[trknum].pgtype = outtoc.tracks[trknum].trktype;
outtoc.tracks[trknum].pgdatasize = outtoc.tracks[trknum].datasize;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What case is this trying to fix exactly? I tried to think of what conditions pregap would be non-zero but idx0offs would be -1 when setting INDEX 01 and I can't think of anything that makes sense. Using PREGAP command would set the pregap to a non-zero value but that also means that there isn't pregap data baked into the bins, but setting pgtype and pgdatasize here means that MAME/chdman will read the beginning of the track as if the pregap data was baked into the bins which would break things.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked against older versions of chdman (back to 0.261) before my recent refactoring and I confirmed that the audio track will show MODE1, but that happens with all tracks that don't explicitly have pregap data baked in (when pgtype and pgdatasize would be set). The reason it shows "MODE1" is because track type 0, which is the default value pgtype becomes when uninitialized, corresponds to CD_TRACK_MODE1.

Changing this would mean the metadata section of all CHDs that have tracks with non-zero pregaps but without pregap data baked in would change, causing the SHA-1 hash for those CHDs to also change. Not advisable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If recompressing older Sega CD Redump CHDs you need to replace the cue file, because metadata output will not match with the redump cues. The two different cues should produce the same metadata in my opinion.
Example:

Old CHD exports this CUE which results in different metadata without the change:

FILE "Adventures of Batman and Robin, The (USA) (Track 1).bin" BINARY
TRACK 01 MODE1/2352
INDEX 01 00:00:00
FILE "Adventures of Batman and Robin, The (USA) (Track 2).bin" BINARY
TRACK 02 AUDIO
PREGAP 00:01:74
INDEX 01 00:00:00
FILE "Adventures of Batman and Robin, The (USA) (Track 3).bin" BINARY
TRACK 03 AUDIO
PREGAP 00:02:00
INDEX 01 00:00:00
FILE "Adventures of Batman and Robin, The (USA) (Track 4).bin" BINARY
TRACK 04 AUDIO
PREGAP 00:02:00
INDEX 01 00:00:00
FILE "Adventures of Batman and Robin, The (USA) (Track 5).bin" BINARY
TRACK 05 AUDIO
PREGAP 00:02:00
INDEX 01 00:00:00
FILE "Adventures of Batman and Robin, The (USA) (Track 6).bin" BINARY
TRACK 06 AUDIO
PREGAP 00:02:00
INDEX 01 00:00:00
FILE "Adventures of Batman and Robin, The (USA) (Track 7).bin" BINARY
TRACK 07 AUDIO
PREGAP 00:02:00
INDEX 01 00:00:00

Redump CUE File:

FILE "Adventures of Batman and Robin, The (USA) (Track 1).bin" BINARY
TRACK 01 MODE1/2352
INDEX 01 00:00:00
FILE "Adventures of Batman and Robin, The (USA) (Track 2).bin" BINARY
TRACK 02 AUDIO
INDEX 00 00:00:00
INDEX 01 00:01:74
FILE "Adventures of Batman and Robin, The (USA) (Track 3).bin" BINARY
TRACK 03 AUDIO
INDEX 00 00:00:00
INDEX 01 00:02:00
FILE "Adventures of Batman and Robin, The (USA) (Track 4).bin" BINARY
TRACK 04 AUDIO
INDEX 00 00:00:00
INDEX 01 00:02:00
FILE "Adventures of Batman and Robin, The (USA) (Track 5).bin" BINARY
TRACK 05 AUDIO
INDEX 00 00:00:00
INDEX 01 00:02:00
FILE "Adventures of Batman and Robin, The (USA) (Track 6).bin" BINARY
TRACK 06 AUDIO
INDEX 00 00:00:00
INDEX 01 00:02:00
FILE "Adventures of Batman and Robin, The (USA) (Track 7).bin" BINARY
TRACK 07 AUDIO
INDEX 00 00:00:00
INDEX 01 00:02:00

CHD Output:

chdman - MAME Compressed Hunks of Data (CHD) manager 0.264 (mame0264)
Input file: Adventures of Batman and Robin, The (USA).chd
File Version: 5
Logical size: 325,701,504 bytes
Hunk Size: 19,584 bytes
Total Hunks: 16,631
Unit Size: 2,448 bytes
Total Units: 133,048
Compression: cdlz (CD LZMA), cdzl (CD Deflate), cdfl (CD FLAC)
CHD size: 202,180,195 bytes
Ratio: 62.1%
SHA1: 1fd748ba4a759ca4bd6d4b5962e17ce3ee06934f
Data SHA1: ff4463e371493fa0ffea2504dd2a172df7cf3bca
Metadata: Tag='CHT2' Index=0 Length=92 bytes
TRACK:1 TYPE:MODE1_RAW SUBTYPE:NONE FRAMES:68812 PREGAP:0 PGTYPE:MODE1 PGSUB:NONE POSTGAP:0.
Metadata: Tag='CHT2' Index=1 Length=91 bytes
TRACK:2 TYPE:AUDIO SUBTYPE:NONE FRAMES:13030 PREGAP:149 PGTYPE:VAUDIO PGSUB:NONE POSTGAP:0.
Metadata: Tag='CHT2' Index=2 Length=91 bytes
TRACK:3 TYPE:AUDIO SUBTYPE:NONE FRAMES:11426 PREGAP:150 PGTYPE:VAUDIO PGSUB:NONE POSTGAP:0.
Metadata: Tag='CHT2' Index=3 Length=91 bytes
TRACK:4 TYPE:AUDIO SUBTYPE:NONE FRAMES:11480 PREGAP:150 PGTYPE:VAUDIO PGSUB:NONE POSTGAP:0.
Metadata: Tag='CHT2' Index=4 Length=91 bytes
TRACK:5 TYPE:AUDIO SUBTYPE:NONE FRAMES:12041 PREGAP:150 PGTYPE:VAUDIO PGSUB:NONE POSTGAP:0.
Metadata: Tag='CHT2' Index=5 Length=91 bytes
TRACK:6 TYPE:AUDIO SUBTYPE:NONE FRAMES:11768 PREGAP:150 PGTYPE:VAUDIO PGSUB:NONE POSTGAP:0.
Metadata: Tag='CHT2' Index=6 Length=90 bytes
TRACK:7 TYPE:AUDIO SUBTYPE:NONE FRAMES:4482 PREGAP:150 PGTYPE:VAUDIO PGSUB:NONE POSTGAP:0.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I legit don't think Mega CD being a relevant example: that one those ones (hint) needs to be redone from scratch.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those two cues say different things about the data, and rewriting one of them to match the other without modifying the bins accordingly will result in the one using the rewritten cue to not work as you probably expect. They're both correct and not extracting to match the Redump cue or match the CHD made from Redump data is not a issue.

Redump cue:

INDEX 00 00:00:00
INDEX 01 00:01:74

Since it includes both INDEX 00 and INDEX 01, it means the .bins from Redump's dump has 1s74f worth of pregap data inside of the bin files. That's physical data that's included in the dumps.

The cue from the extracted CHD, which are based on TOSEC dumps (based on what the XML says):

PREGAP 00:01:74
INDEX 01 00:00:00

This means that there's an implied 1s74f pregap that is not included in the data (which is also valid for cues). That data presumably didn't exist in the dump when making the CHD, just that the cue said there is a 1s74s pregap and to calculate the offset accordingly. The data doesn't exist in the CHD at all, and if MAME tries to read those sectors then it'll just return 0s instead of any real data.

They're both correct.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I was mixing up the Mega CD and Sega CD advbatr entries. I see that it says it's a Redump dump in the Sega CD XML now.

If it's really made from the Redump cue/bins then it's just a bad CHD and should be replaced in the software list. I created a CHD using the Redump cue/bins and extracted it again and it results in the same data (had to change LF to CRLF in the cue to match the SHA-1 hash, but the data inside is the same and both use INDEX 00/INDEX 01).

> sha1sum -b orig/*
694213b0b0bb1f6c333ec4a06f8a90504b76d7bf *orig/Adventures of Batman and Robin, The (USA) (Track 1).bin
66173b37839536e13d43d0981dc8ea3bf2e35d9c *orig/Adventures of Batman and Robin, The (USA) (Track 2).bin
124c25d994b8515232e13277686991de239e4c8e *orig/Adventures of Batman and Robin, The (USA) (Track 3).bin
46c522d3ebf2f76a28ed0b33f7934d5ab2cd98e3 *orig/Adventures of Batman and Robin, The (USA) (Track 4).bin
7634a0dffac19fb06e9d9180caccff7534626c25 *orig/Adventures of Batman and Robin, The (USA) (Track 5).bin
effdd133fb881feb026f9e67890bd7f0fa591821 *orig/Adventures of Batman and Robin, The (USA) (Track 6).bin
5f2b63e4cb28155849b67d4e0b63317fba76df60 *orig/Adventures of Batman and Robin, The (USA) (Track 7).bin
e696e2c1b16728e2201576e28c4c532decc63025 *orig/Adventures of Batman and Robin, The (USA).chd
87630d5b52393179a5eeab3c231f4d837b56f2d1 *orig/Adventures of Batman and Robin, The (USA).cue

>sha1sum -b test/*
694213b0b0bb1f6c333ec4a06f8a90504b76d7bf *test/Adventures of Batman and Robin, The (USA) (Track 1).bin
66173b37839536e13d43d0981dc8ea3bf2e35d9c *test/Adventures of Batman and Robin, The (USA) (Track 2).bin
124c25d994b8515232e13277686991de239e4c8e *test/Adventures of Batman and Robin, The (USA) (Track 3).bin
46c522d3ebf2f76a28ed0b33f7934d5ab2cd98e3 *test/Adventures of Batman and Robin, The (USA) (Track 4).bin
7634a0dffac19fb06e9d9180caccff7534626c25 *test/Adventures of Batman and Robin, The (USA) (Track 5).bin
effdd133fb881feb026f9e67890bd7f0fa591821 *test/Adventures of Batman and Robin, The (USA) (Track 6).bin
5f2b63e4cb28155849b67d4e0b63317fba76df60 *test/Adventures of Batman and Robin, The (USA) (Track 7).bin
87630d5b52393179a5eeab3c231f4d837b56f2d1 *test/Adventures of Batman and Robin, The (USA).cue

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looked into the history of these files and it looks like they originally were added 12 years ago (960ae0c). Since it says it's merged from MESS the actual data was probably made even before that.

11 years ago physical pregap data support in cue/bin was added: 22d69c1#diff-3a6d3b53a7d3abe386012527b8d95017b85f2d615b66b4f8bf693e770d20b57bR847

Before that pregaps seem to have just been improperly handled but based on the data SHA-1 the data actually seems to be present in the file still which is why rewriting the cue apparently would work for you in this case instead of skipping over data.

So the CHD itself is bad and should be remade for sure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the change so it wouldn't block the more important change.
Maybe I didn't fully understand it, but i thought the two cues describes the same data.

INDEX 00 00:00:00
INDEX 01 00:01:74

PREGAP 00:01:74
INDEX 01 00:00:00

The CHDs are completly identical but the metadata has another pgtype.
MODE1 is correct for audio tracks?

Metadata:     Tag='CHT2'  Index=1  Length=91 bytes
TRACK:2 TYPE:AUDIO SUBTYPE:NONE FRAMES:13030 PREGAP:149 PGTYPE:VAUDIO PGSUB:NONE POSTGAP:0.

Metadata:     Tag='CHT2'  Index=1  Length=90 bytes
TRACK:2 TYPE:AUDIO SUBTYPE:NONE FRAMES:13030 PREGAP:149 PGTYPE:MODE1 PGSUB:NONE POSTGAP:0.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the two cues don't describe the same data. This might be repeating a little bit of my previous post but I tried to include the math and documentation links for further info.

The cue with the INDEX 00 and INDEX 01 will have 149 frames (sector size of 2352 bytes because it's CD audio) or 149 * 2352 = 350,448 bytes of data for the pregap within the .bin file. Track 1 is 13030 frames total (that includes the 149 pregap frames) so the entirety of track 1 is 13030 * 2352 = 30,646,560 bytes.

INDEX 00 00:00:00
INDEX 01 00:01:74

Now the other cue mentioned uses a PREGAP command. The PREGAP command specifically does not use any data from the .bins, but it does increment the time value so that INDEX 01 of both cues starts at the same time logically. Since the PREGAP command will not use any data from the .bin, the entire contents of the .bin will become the data returned when reading index 1 of that track through the CD drive. The bad effect of this is that there will be 1s74f of silence (or whatever is in the pregap) when that track is played because it's using 1f74s of virtual pregap, followed by the 1s74f of physical pregap from the original .bin which gets played as normal audio, and then it finally gets to the real audio data.

MAME checks the PGTYPE to determine if it uses physical pregap data or virtual pregap data. If the first value of PGTYPE starts with a V then it reads the rest of the PGTYPE value to determine the pregap data type. If the PGTYPE value does not start with a V then it's ignored entirely. So audio tracks aren't "MODE1", it's just that anything that doesn't start with V is ignored.

The version of chdman that was used to create the CHD in question was from before physical pregaps were properly supported, so the PGTYPE was not set to what chdman currently expects and aren't being handled properly. So the extracted cue with the PREGAP command is technically correct to how the CHD format works. It's just not correct for what the data really should be, so the CHD needs to be remade using the original source data using a newer version of chdman that does support physical pregaps.

To give a more visible and real world example. You can load the old CHD into a CD player within MAME and compare it to a newly made CHD from the original source data and you'll see that the old CHD is appears to the CD player as being about 12 seconds longer (2 seconds per track after track 1) because it didn't handle the pregaps correctly.

Old CHD:
chd-orig

New CHD:
chd-new

If you load the .cue files instead of .chds then you'll get the same result. As you can see, the two cues are not describing the same things.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@987123879113
Thank you for taking the time for the detailed explanations.
Now it is clearer to me what is included in the data and what is virtual. My old CHDs will need a rebuild.
The redump cues doesn't use virtual pregaps at all. Tosec uses these in many cues.

I also hope that the old hashes in the main branch will be updated soon.

@MetalSlug MetalSlug changed the title Fixes chdman extract cd (Broken since splitbin option). Parse cue… Fixes chdman extract cd (Broken since splitbin option) Apr 8, 2024
@cuavas cuavas merged commit 067afa1 into mamedev:master Apr 8, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants