chdman createdvd writes corrupt CHD files on CIFS storage #12605
To verify whether the issue was related to multi-threading, I tried creating a CHD file three times with the default number of processors and three more times with a single thread:
for a in 1 2 3 ; do chdman createdvd -i AnyLargePS2DVD.iso -o multi-$a.chd ; done
for a in 1 2 3 ; do chdman createdvd -i AnyLargePS2DVD.iso -np 1 -o single-$a.chd ; done
Both the source and the destination were in the same folder, a CIFS-mounted shared folder from my NAS. The result wasn't good:
$ md5sum multi-* single-*
ea3bc4e3ad10ca387e7ed734501d476a multi-1.chd
5ab5db891572d1301aa41b3b2624684e multi-2.chd
ea3bc4e3ad10ca387e7ed734501d476a multi-3.chd
c610c5723decf413b295b132d51e5ae3 single-1.chd
b2b54cdcd115663df0dbd27ac9afd3c0 single-2.chd
945f851408e0c87df6ad9fe816cd87f9 single-3.chd Then I decided the same approach, but with the source .ISO file on the CIFS mount and the destination .CHD file on $ md5sum /tmp/multi-* /tmp/single-*
ea3bc4e3ad10ca387e7ed734501d476a /tmp/multi-1.chd
ea3bc4e3ad10ca387e7ed734501d476a /tmp/multi-2.chd
ea3bc4e3ad10ca387e7ed734501d476a /tmp/multi-3.chd
ea3bc4e3ad10ca387e7ed734501d476a /tmp/single-1.chd
ea3bc4e3ad10ca387e7ed734501d476a /tmp/single-2.chd
ea3bc4e3ad10ca387e7ed734501d476a /tmp/single-3.chd
I also tried the two other combinations of CIFS and RAM as either sources or destinations. Here are the results:
This makes me believe the process of writing the CHD file relies on some assumption that doesn't hold here. Maybe it writes, reads and seeks too quickly, or without flushing the changes. I don't know; I haven't looked at the code.
Given the recent findings, I'm updating this issue title from "chdman can't verify CHD files created by createdvd" to "chdman createdvd writes corrupt CHD files on CIFS storage".
One more note: I know I have created many .CHD files from CUE/BIN CD-based images, and they all verified correctly. This makes me believe the issue is only with createdvd.
Have you ensured this isn't a bad RAM / thermal issue? I can believe there could be issues, but typically on machines where I've seen chdman outright create files that don't validate, it's been because of bad RAM or a CPU overheating and creating corrupt compressed data. The process is intensive and will swamp your CPU, pushing all the cores to their limit, which for larger files like DVDs could be too much in some cases.
It is unlikely to be a thermal or RAM issue, because:
I cannot 100% rule out the possibility of faulty hardware on my side (after all, hardware fails), but in this case I find it very unlikely. If you have any specific scenario you want me to try (or variations of command-line parameters), please tell me and I can try running them later.
Sure. Reproduce the issue on a different machine, preferably AMD or an earlier generation of Intel.
On the same machine, I tried converting a 600MB Sega CD CUE+BIN image to CHD. I used these commands:
for a in 1 2 3 4 5 6 7 8 9 ; do chdman createcd -i Foobar.cue -o multi-$a.chd ; done
for a in 1 2 3 4 5 6 7 8 9 ; do chdman createcd -i Foobar.cue -np 1 -o single-$a.chd ; done
Essentially the same commands as before, but with createcd and a CD image instead of createdvd and a DVD ISO:
$ md5sum *.chd
cb279c1d5ad9d870d1d7afe23e370063 multi-1.chd
cb279c1d5ad9d870d1d7afe23e370063 multi-2.chd
cb279c1d5ad9d870d1d7afe23e370063 multi-3.chd
cb279c1d5ad9d870d1d7afe23e370063 multi-4.chd
cb279c1d5ad9d870d1d7afe23e370063 multi-5.chd
cb279c1d5ad9d870d1d7afe23e370063 multi-6.chd
cb279c1d5ad9d870d1d7afe23e370063 multi-7.chd
cb279c1d5ad9d870d1d7afe23e370063 multi-8.chd
cb279c1d5ad9d870d1d7afe23e370063 multi-9.chd
7f50d49392430d60a08805e61de64ef5 single-1.chd
cb279c1d5ad9d870d1d7afe23e370063 single-2.chd
4e37f3a44ceb55a58705b27a78c21eba single-3.chd
cb279c1d5ad9d870d1d7afe23e370063 single-4.chd
2dcc5d096dff53aba89c2768546afafb single-5.chd
8afd362061362566be291a49479f10cf single-6.chd
cb279c1d5ad9d870d1d7afe23e370063 single-7.chd
cb279c1d5ad9d870d1d7afe23e370063 single-8.chd
57ca0894cddde1102a238da5aafa8f38 single-9.chd
To make sure overheating wasn't the issue, I repeated the test while setting the CPU power profile to "Power Save" instead of "Balanced", which limits the clock speed to 900MHz (instead of 2GHz~3GHz). The CPU stayed at around 45°C this time (instead of over 80°C). This took significantly more time.
$ md5sum *.chd
1c2107afcbdf56bca8a63db6c10bbd21 multi-1.chd
7e3c12a8c919556d6fc164e430b47862 multi-2.chd
cb279c1d5ad9d870d1d7afe23e370063 multi-3.chd
8477cca6863ceba46f59e7d6cc711c45 multi-4.chd
5cf98a8b3cc23cafdca317bf7b6fe27b multi-5.chd
cb279c1d5ad9d870d1d7afe23e370063 multi-6.chd
7c5a1639db4302dcf2375ed7b177df38 multi-7.chd
c27de612808b53c4ca438d275925cf71 multi-8.chd
cb279c1d5ad9d870d1d7afe23e370063 single-1.chd
cb279c1d5ad9d870d1d7afe23e370063 single-2.chd
f9cca43a3518d718abf5e859bb2e4fc2 single-3.chd
34ddf2875fed771079ad8d1f3b77bbc9 single-4.chd
9e1d8bf395261cba01258623f541149d single-5.chd
fe481d7eda3cace16f8e2aa2884eaa63 single-6.chd
e756a67d96853a3f5d4ab713edad5f04 single-7.chd
Then I tried the test again, but this time reading from CIFS and writing to /tmp:
$ md5sum /tmp/*.chd
cb279c1d5ad9d870d1d7afe23e370063 multi-1.chd
cb279c1d5ad9d870d1d7afe23e370063 multi-2.chd
cb279c1d5ad9d870d1d7afe23e370063 multi-3.chd
cb279c1d5ad9d870d1d7afe23e370063 multi-4.chd
cb279c1d5ad9d870d1d7afe23e370063 multi-5.chd
cb279c1d5ad9d870d1d7afe23e370063 multi-6.chd
cb279c1d5ad9d870d1d7afe23e370063 single-1.chd
cb279c1d5ad9d870d1d7afe23e370063 single-2.chd
cb279c1d5ad9d870d1d7afe23e370063 single-3.chd
cb279c1d5ad9d870d1d7afe23e370063 single-4.chd
cb279c1d5ad9d870d1d7afe23e370063 single-5.chd
cb279c1d5ad9d870d1d7afe23e370063 single-6.chd
Then I repeated the same test on the Steam Deck: same LAN, same CIFS share from the NAS.
$ md5sum *.chd
cb279c1d5ad9d870d1d7afe23e370063 deckmulti-1.chd
cb279c1d5ad9d870d1d7afe23e370063 deckmulti-2.chd
cb279c1d5ad9d870d1d7afe23e370063 deckmulti-3.chd
cb279c1d5ad9d870d1d7afe23e370063 deckmulti-4.chd
cb279c1d5ad9d870d1d7afe23e370063 deckmulti-5.chd
cb279c1d5ad9d870d1d7afe23e370063 deckmulti-6.chd
cb279c1d5ad9d870d1d7afe23e370063 deckmulti-7.chd
cb279c1d5ad9d870d1d7afe23e370063 deckmulti-8.chd
cb279c1d5ad9d870d1d7afe23e370063 decksingle-1.chd
cb279c1d5ad9d870d1d7afe23e370063 decksingle-2.chd
cb279c1d5ad9d870d1d7afe23e370063 decksingle-3.chd
cb279c1d5ad9d870d1d7afe23e370063 decksingle-4.chd
cb279c1d5ad9d870d1d7afe23e370063 decksingle-5.chd
cb279c1d5ad9d870d1d7afe23e370063 decksingle-6.chd
cb279c1d5ad9d870d1d7afe23e370063 decksingle-7.chd
cb279c1d5ad9d870d1d7afe23e370063 decksingle-8.chd
Both machines mounted the CIFS folder using the same entry in /etc/fstab.
Slightly different versions:
I'll do even more tests and post the results as I collect them.
This time I repeated the tests on the same Intel-based machine, but using the same flatpak version of MAME as on the Steam Deck:
$ md5sum *.chd
cb279c1d5ad9d870d1d7afe23e370063 multi-1.chd
cb279c1d5ad9d870d1d7afe23e370063 multi-2.chd
cb279c1d5ad9d870d1d7afe23e370063 multi-3.chd
3c41d4b17560b17a3bcfb2b2a8653390 multi-4.chd
1b0ab0f3c07b401b0e12afc2d093ca3f multi-5.chd
0d8e393115cd09c46b0e8d94495b09ff multi-6.chd
a80ef3705fdcd2169a3cbc4639c4f6e7 multi-7.chd
cb279c1d5ad9d870d1d7afe23e370063 multi-8.chd
cb279c1d5ad9d870d1d7afe23e370063 single-1.chd
48507f59b9dd73fd1463e66af7983b4e single-2.chd
79ee00e37c43cf6879530e54d4a477cc single-3.chd
cb279c1d5ad9d870d1d7afe23e370063 single-4.chd
cb279c1d5ad9d870d1d7afe23e370063 single-5.chd
d9e540c1223c287b120df046f8b28e02 single-6.chd
7edb1555d284ae7f56478e43b78b6aef single-7.chd
dcec42f4edaf30d3c2eeda723b5b8157 single-8.chd
I also used the Steam Deck to run the createdvd test with the PS2 ISO:
$ md5sum *.chd
ea3bc4e3ad10ca387e7ed734501d476a deckmulti-1.chd
ea3bc4e3ad10ca387e7ed734501d476a deckmulti-2.chd
ea3bc4e3ad10ca387e7ed734501d476a deckmulti-3.chd
ea3bc4e3ad10ca387e7ed734501d476a deckmulti-4.chd
ea3bc4e3ad10ca387e7ed734501d476a deckmulti-5.chd
ea3bc4e3ad10ca387e7ed734501d476a deckmulti-6.chd
ea3bc4e3ad10ca387e7ed734501d476a decksingle-1.chd
ea3bc4e3ad10ca387e7ed734501d476a decksingle-2.chd
ea3bc4e3ad10ca387e7ed734501d476a decksingle-3.chd
ea3bc4e3ad10ca387e7ed734501d476a decksingle-4.chd
ea3bc4e3ad10ca387e7ed734501d476a decksingle-5.chd
ea3bc4e3ad10ca387e7ed734501d476a decksingle-6.chd
Could it be that writing to the CIFS folder is buggy? I then tried copying one single CHD file from
Summary so far:
Have you tried other (non-CHDMAN) cases with the CIFS share and large files, under load, with lots of appending of additional data, etc.? Maybe the OS handling of the CIFS mount itself is what is buggy? CHDMAN itself isn't doing anything special in those cases, and the CIFS mount seems to be the common factor.
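For reference, a minimal non-chdman stress check along those lines might look like the sketch below. The /mnt/nas mount point is an assumption; adjust it to the actual CIFS path. It appends random chunks to a file on the CIFS share and to a local reference copy, then compares checksums; mismatching hashes would point at the CIFS layer rather than chdman.
# Append 64 random 16MB chunks to a file on the CIFS share and to a local copy,
# then compare their checksums.
rm -f /mnt/nas/appendtest.bin /tmp/appendtest.bin
for i in $(seq 1 64) ; do
  dd if=/dev/urandom of=/tmp/chunk.bin bs=1M count=16 status=none
  cat /tmp/chunk.bin >> /mnt/nas/appendtest.bin
  cat /tmp/chunk.bin >> /tmp/appendtest.bin
done
md5sum /mnt/nas/appendtest.bin /tmp/appendtest.bin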
I am not an expert on CIFS mounts, but other things to try may be not using nolbr and/or mounting with cache=none on the CIFS mount. It might be that the combination of how chdman persists data and the nolbr option is causing the problems you are experiencing.
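As an illustration only (not the reporter's actual configuration), remounting a CIFS share with cache=none might look roughly like this; the share path, mount point, and credentials file are placeholders:
sudo umount /mnt/nas
sudo mount -t cifs //nas.local/share /mnt/nas -o credentials=/etc/cifs-credentials,uid=$(id -u),cache=none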
I tried remounting the CIFS share with cache=none and repeating the createcd test:
$ md5sum *.chd
cb279c1d5ad9d870d1d7afe23e370063 multi-1.chd
cb279c1d5ad9d870d1d7afe23e370063 multi-2.chd
cb279c1d5ad9d870d1d7afe23e370063 multi-3.chd
cb279c1d5ad9d870d1d7afe23e370063 multi-4.chd
cb279c1d5ad9d870d1d7afe23e370063 multi-5.chd
cb279c1d5ad9d870d1d7afe23e370063 multi-6.chd
cb279c1d5ad9d870d1d7afe23e370063 multi-7.chd
cb279c1d5ad9d870d1d7afe23e370063 multi-8.chd
cb279c1d5ad9d870d1d7afe23e370063 multi-9.chd
So, indeed, caching sounds like it could be the cause.
MAME version
0.267 (mame0267-dirty)
System information
Manjaro Linux, x86_64
Intel i7-6700 CPU (4 cores / 8 threads)
Storage on a NAS device mounted over CIFS.
INI configuration details
No response
Emulated system/software
No response
Incorrect behaviour
Obtain a Playstation 2 ISO file from a game that was released as DVD.
Convert the DVD ISO file to a CHD file:
Try to verify it:
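A minimal sketch of those two steps, with a hypothetical Game.iso as the input (the createdvd and verify subcommands are the ones used elsewhere in this thread):
chdman createdvd -i Game.iso -o Game.chd
chdman verify -i Game.chd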
What happens:
Decompression error
Compression error
But the behavior is always consistent for the same .CHD file.
Expected behaviour
It should be able to verify any CHD created via createdvd. It shouldn't crash either.
Steps to reproduce
No response
Additional details
If I were to guess, I would think createdvd isn't deterministic. There might be some uninitialized data that sometimes gets corrupted and sometimes gets written incorrectly to the CHD file. Or maybe the file seek/write/sync/flush isn't done correctly. Or perhaps it's related to running the compression in parallel on multiple CPU cores. Or, I would guess, the verify code is buggy, as it shouldn't ever segfault.
Given the random nature of this, it may well be that my hardware is somehow defective. That would be odd, though, given that it correctly verified many CHD files and only had trouble with a few of the ones I had created using createdvd. The CHD files generated from CD bin/cue files worked just fine.
I know debugging something random like this is a nightmare. I'm sorry about it.
As a guess after some more thought, it could be related to race conditions in the createdvd code when running in parallel. Since the storage is a spinning disk in a network device (a Synology NAS with limited CPU power), the bottleneck is probably the I/O. Some of the CPU threads might not run with the expected timing due to delays in the I/O. If this is the case, moving the ISO to /tmp (a RAM drive) and creating the CHD over there shouldn't have I/O issues. Well, it's still a mystery to me, and testing this takes quite some time.