[Question] Output of file(1) is not always consistent #26

Open
orbea opened this Issue Nov 24, 2018 · 6 comments

Contributor

orbea commented Nov 24, 2018

I am not sure if this is a maxcso issue, or even an issue at all, but after converting my old PS2 iso files to cso I noticed that file(1) does not identify the generated cso files consistently.

$ file Atelier\ Iris\ *.cso
Atelier Iris - Eternal Mana [NTSC-U].cso:      Compressed ISO CD image
Atelier Iris 2: Azoth of Destiny [NTSC-U].cso: data
Atelier Iris 3: Grand Phantasm [NTSC-U].cso:   Compressed ISO CD image
$ file Atelier\ Iris\ *.iso
Atelier Iris - Eternal Mana [NTSC-U].iso:      UDF filesystem data (version 1.5) ''
Atelier Iris 2: Azoth of Destiny [NTSC-U].iso: UDF filesystem data (version 1.5) ''
Atelier Iris 3: Grand Phantasm [NTSC-U].iso:   UDF filesystem data (version 1.5) 'ATELIER_IRIS_GF'

However, the files work regardless in PCSX2, and when decompressing the cso files the md5sum matches the original source iso files.

Any ideas why this is happening? Maybe this could be improved in file(1)? For obvious reasons I can't provide example files...

Owner

unknownbrackets commented Nov 24, 2018

Well, this is an (incorrect) assumption of file that was typically correct with other CSO tools. Unfortunately, the header format was not always consistent across tools, so I understand why they're resorting to checks like this...

# Other fields are used to determine what type of CISO this is:
# - 0x04 == 0x00200000: GameCube/Wii CISO (block_size)
# - 0x10 == 0x00000800: PSP CISO (ISO-9660 sector size)
# - None of the above: Compact ISO.
>4	lelong	!0
>>4	lelong	!0x200000
>>>0x10	lelong	!0x800		Compressed ISO CD image

https://github.com/unknownbrackets/maxcso/blob/master/README_CSO.md#format-version-1

It assumes that the block_size at offset 0x10 is ALWAYS 0x0800 for non-Wii CSO. PSP CFW only supports 0x800 block sizes, so this is a decent assumption for PSP games. However, PPSSPP and PS2 emulators support larger block sizes. maxcso has the following behavior:

  • If --block=# is passed, that size is used. It must be a power of 2.
  • Otherwise, if the ISO is smaller than 2 GB, 0x800 is used (for compatibility with old PSP software.)
  • Otherwise, 0x4000 is used (delivers better compression.)
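
The selection rules above can be sketched roughly as follows. This is only an illustration, not maxcso's actual code: the function name `choose_block_size` is hypothetical, and reading "2 GB" as 2 GiB (2^31 bytes) is an assumption.

```python
from typing import Optional

# Assumption: "2 GB" is interpreted as 2 GiB; the real threshold may differ.
TWO_GIB = 2 * 1024**3


def choose_block_size(iso_size: int, block: Optional[int] = None) -> int:
    """Pick a CSO block size following the three rules listed above."""
    if block is not None:
        # An explicit --block=# value must be a power of 2.
        if block <= 0 or block & (block - 1) != 0:
            raise ValueError("block size must be a power of 2")
        return block
    if iso_size < TWO_GIB:
        # Small ISO: 0x800 for compatibility with old PSP software.
        return 0x800
    # Large ISO: 0x4000 for better compression.
    return 0x4000


print(hex(choose_block_size(1_300_000_000)))    # 0x800
print(hex(choose_block_size(3 * 1024**3)))      # 0x4000
print(hex(choose_block_size(100, block=2048)))  # 0x800
```

Under these assumptions, only an ISO below the threshold, or an explicit `--block=2048`, ends up with 0x800 at offset 0x10.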

That said, Atelier Iris 1 is larger than 2 GB, so for me it uses the larger default size. That means either maxcso wasn't used, or an older version from before that default existed was used.

If you want file to work with its current assumptions, simply use maxcso --block=2048 when compressing. You can even recompress, but you'll have to use -o to give it a new output file, since it cannot recompress in place. The files will typically be 2-3% larger.

-[Unknown]

Contributor

orbea commented Nov 24, 2018

Thanks for the detailed explanation. I think this is worth reporting to file upstream so they can improve their assumptions. I'll look into where to do that and get back to you.

Contributor

orbea commented Nov 24, 2018

I made an issue for file(1). I'll leave it up to you whether to close this issue now or wait until the upstream issue is resolved.

https://bugs.astron.com/view.php?id=53

Contributor

orbea commented Nov 26, 2018

I can also reproduce this with PSP games.

$ file 7th\ Dragon\ 2020-II\ \(*
7th Dragon 2020-II (English v091).cso: data
7th Dragon 2020-II (English v091).iso: ISO 9660 CD-ROM filesystem data 'NO LABEL'
7th Dragon 2020-II (Japan).cso:        data
7th Dragon 2020-II (Japan).iso:        ISO 9660 CD-ROM filesystem data ''

I also found this interesting.

$ du -h 7th\ Dragon\ 2020-II\ \(*
839M	7th Dragon 2020-II (English v091).cso
1.3G	7th Dragon 2020-II (English v091).iso
1.2G	7th Dragon 2020-II (Japan).cso
1.3G	7th Dragon 2020-II (Japan).iso

The English fan translation was patched using the above Japanese iso.

Owner

unknownbrackets commented Nov 26, 2018

Were those created with maxcso? Maybe there's somewhere else it's validating bytes in the file incorrectly - I may not be reading the magic bytes config entirely properly.

You can view the first few bytes like this:

xxd -l 24 -e "7th Dragon 2020-II (English v091).cso"

(WARNING: don't pass two filenames.)

For example, you might see:

00000000: 4f534943 00000018 4bb20000 00000000  CISO.......K....
00000010: 00004000 00000001                    .@......

This indicates:

  • 4f534943: CISO
  • 00000018: 0x18 header size (this value is not reliable across all cso generating tools)
  • 4bb20000 00000000: the original ISO here was about 1.2 GB. This value will always be a multiple of 2048, meaning byte 9 will always be 00 and byte 10 should be ?0 or ?8, where ? is any hex digit.
  • 00004000: this is the block size: in my example, the file was explicitly compressed using --block=16384.
  • 00000001: version (01) but some tools use 0, and then index shift (00) and two zero bytes.
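
The same fields can also be decoded programmatically. Here is a minimal sketch using Python's `struct` module, assuming the 24-byte little-endian layout described in the list above; the names `CsoHeader` and `parse_cso_header` are made up for illustration and are not part of maxcso.

```python
import struct
from typing import NamedTuple


class CsoHeader(NamedTuple):
    magic: bytes
    header_size: int
    uncompressed_size: int
    block_size: int
    version: int
    index_shift: int


def parse_cso_header(data: bytes) -> CsoHeader:
    """Decode the first 24 bytes of a CSO file (little-endian fields)."""
    if len(data) < 24:
        raise ValueError("need at least 24 bytes")
    # 4s magic, I header size, Q original size, I block size,
    # B version, B index shift, 2x two unused bytes.
    header = CsoHeader(*struct.unpack("<4sIQIBB2x", data[:24]))
    if header.magic != b"CISO":
        raise ValueError("not a CISO file")
    return header


# The example dump above, written out in on-disk byte order.
sample = bytes.fromhex(
    "4349534f" "18000000" "0000b24b00000000" "00400000" "01000000"
)
header = parse_cso_header(sample)
print(header.uncompressed_size)         # 1269956608
print(header.uncompressed_size % 2048)  # 0
print(hex(header.block_size))           # 0x4000
print(header.version, header.index_shift)  # 1 0
```

To inspect a real file, pass the first 24 bytes read from it in binary mode, e.g. `parse_cso_header(open(path, "rb").read(24))`.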

Actually, this file is recognized by file on my system, so there may be more to it. I'm not sure how to parse the magic.mgc I'm using or see the precise source for it.

I was really looking at this, basically (and some other versions of it for other platforms):
https://reviews.freebsd.org/D12400?large=true
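
One way to probe this is to evaluate the three test lines quoted earlier by hand. The sketch below is a literal reading of those lines only, where `!` means "not equal" and each deeper `>` level is tried only if its parent matched. It assumes an enclosing level-0 test for the `CISO` signature that is not part of the quoted snippet, and it is not a reimplementation of file. The function name `quoted_rule_matches` and the 0x800 sample header are hypothetical.

```python
import struct


def quoted_rule_matches(header: bytes) -> bool:
    """Literal reading of the quoted magic lines (not file's real logic)."""
    # Assumed parent test: the file starts with the "CISO" signature.
    if len(header) < 0x14 or header[:4] != b"CISO":
        return False
    (at_04,) = struct.unpack_from("<I", header, 0x04)
    (at_10,) = struct.unpack_from("<I", header, 0x10)
    # >4 lelong !0   >>4 lelong !0x200000   >>>0x10 lelong !0x800
    return at_04 != 0 and at_04 != 0x200000 and at_10 != 0x800


# Header matching the example dump above (block size 0x4000).
example_16k = struct.pack("<4sIQIBB2x", b"CISO", 0x18, 0x4BB20000, 0x4000, 1, 0)
# Hypothetical header that differs only in using block size 0x800.
example_2k = struct.pack("<4sIQIBB2x", b"CISO", 0x18, 0x4BB20000, 0x800, 1, 0)

print(quoted_rule_matches(example_16k))  # True
print(quoted_rule_matches(example_2k))   # False
```

If file really evaluates the lines this way, the description would be printed for the 0x4000 header and not for the 0x800 one, which may be worth comparing against what file reports for real files.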

-[Unknown]

Contributor

orbea commented Nov 26, 2018

Yes, I created the .cso files with maxcso-1.10.0 last night from the .iso files. They seem to work.

$ xxd -l 24 -e "7th Dragon 2020-II (English v091).cso"
00000000: 4f534943 00000018 4eecc800 00000000  CISO.......N....
00000010: 00000800 00000001                    ........