Corrupted file not detected by cvmfs_server check -a #3553

Open
DrDaveD opened this issue Mar 25, 2024 · 5 comments

DrDaveD (Contributor) commented Mar 25, 2024

I found another case of doubled data on the Nebraska primary stratum 1, in the sft.cern.ch repository. The file was from 2021, so the corruption happened prior to the fix in #2991. The weirdest thing is that cvmfs_server check -a did not detect it. The last time the check was completed on sft.cern.ch was March 10, 2024. Oh, I see that there's a check option -i that says "check data integrity" and I have not been using that. Does it not even check the file sizes without -i? What does it check by default? It takes a terribly long time to check even without -i; I wonder how long it will take with it.

The corruption was also slightly different from before, but probably not different enough to be useful. This time there were two doubled 4096-byte blocks: bytes 4*4096 (16384) through 6*4096 (24576) were copied into bytes 6*4096 (24576) through 8*4096 (32768).
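For anyone poking at the attached files, here is a minimal sketch (not part of cvmfs; the 4096-byte block size and the "adjacent copy" pattern are taken from the description above) of scanning a file for runs of blocks that are an exact copy of the blocks immediately before them. Legitimate files with genuinely repeated content (e.g., runs of zeros) will also match, so this is only a heuristic.

```python
import sys

BLOCK = 4096

def find_doubled_blocks(path, run_length=2):
    """Return byte offsets where `run_length` consecutive 4096-byte blocks
    are immediately repeated, the corruption pattern described above."""
    with open(path, "rb") as f:
        data = f.read()
    blocks = [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]
    hits = []
    for i in range(len(blocks) - 2 * run_length + 1):
        if blocks[i:i + run_length] == blocks[i + run_length:i + 2 * run_length]:
            hits.append(i * BLOCK)
    return hits

if __name__ == "__main__":
    for offset in find_doubled_blocks(sys.argv[1]):
        print(f"possible doubled region starting at byte offset {offset}")
```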

Just in case it's helpful for testing a change to check, here is a zip file containing the good and corrupted forms of the file. The files were in the /srv/cvmfs/sft.cern.ch/data/1d subdirectory on the backup and primary stratum 1s, respectively.

jblomer (Member) commented Apr 4, 2024

Without -i, the check only looks at the catalogs and checks the meta-data for consistency (e.g., valid file-system structure, correct accounting in the summary data, etc.). I think we could have another option to check the file sizes. This will be significantly more expensive because we need to stat (or HEAD) all referenced objects, but it will still be short of a full hash verification.

jblomer self-assigned this Apr 4, 2024

DrDaveD (Contributor, Author) commented Apr 4, 2024

By default, without the -c option, it already checks for the existence of data "chunks" using HEAD requests. I found that out with strace. Doesn't HEAD return the file size? The sizes might as well be checked at the same time.

I don't have exact timings on the -i integrity check, but it appears to add far less time than the regular check. The -i option translates to running cvmfs_swissknife scrub before cvmfs_swissknife check. On the machine where I have enabled integrity checks, a cvmfs_swissknife check on lhcb.cern.ch is currently running; it started at 21:03 two days ago (it is now 11:35), and according to the log the integrity check started at 01:41 that morning. So the integrity check took less than 19.5 hours, while the regular check has so far been running for over 36.5 hours. The logs only go back a month, so I can't check exactly how long the previous run took. However, if I sort all the "last_check" times in .cvmfs_status.json on the identical sister machine, it appears that the last check for lhcb.cern.ch there ran from Feb 27 11:42 to Mar 6 05:03, so more than 7.5 days. This was with cvmfs-server-2.10.1.
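A minimal sketch of that kind of listing (not the exact method used above; it assumes each repository's .cvmfs_status.json lives under /srv/cvmfs/<repo>/ and is a flat JSON object with a "last_check" field, and the sort is only chronological if the timestamps are in a lexicographically sortable format):

```python
import glob
import json
import os

def last_checks(base="/srv/cvmfs"):
    """Collect (last_check, repo) pairs from each repository's status file."""
    results = []
    for path in glob.glob(os.path.join(base, "*", ".cvmfs_status.json")):
        repo = os.path.basename(os.path.dirname(path))
        with open(path) as f:
            status = json.load(f)
        if "last_check" in status:
            results.append((status["last_check"], repo))
    return sorted(results)

if __name__ == "__main__":
    for when, repo in last_checks():
        print(when, repo)
```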

DrDaveD (Contributor, Author) commented Apr 4, 2024

On the other hand, if -i is used, is there any need to also check the "chunks"? Maybe the -i option should imply the -c option.

DrDaveD (Contributor, Author) commented Apr 5, 2024

And it's the lack of -c that was taking most of the time! I switched to cvmfs_server check -aic, and the lhcb.cern.ch integrity and regular check took just slightly over 24 hours instead of 7.5 days.

DrDaveD (Contributor, Author) commented Apr 16, 2024

Jakob and I discussed this, and he said that there is still value in doing the HEAD requests (i.e., without -c), but at the same time they should check the Content-Length header to verify the size. The scrub will not detect missing files, so although it's good to have -i, -c should not be used if you want to find all classes of errors.
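A minimal sketch of that idea, not the cvmfs implementation: issue a HEAD request per referenced object and compare Content-Length with the size recorded in the catalog. The data/<two hex chars>/<rest of hash> URL layout is assumed here based on the data/1d subdirectory mentioned earlier, and the hash and size in the usage comment are made up. This keeps the cost at one HEAD per object rather than downloading and hashing everything as the scrub does.

```python
from urllib.request import Request, urlopen

def head_size_ok(base_url, object_hash, expected_size):
    """Check that an object exists and its Content-Length matches the
    size recorded in the catalog."""
    # Object path layout is an assumption for illustration.
    url = f"{base_url}/data/{object_hash[:2]}/{object_hash[2:]}"
    req = Request(url, method="HEAD")
    with urlopen(req) as resp:
        length = resp.headers.get("Content-Length")
    return length is not None and int(length) == expected_size

# Hypothetical usage (server name, hash, and size are illustrative):
# head_size_ok("http://stratum1.example.org/cvmfs/sft.cern.ch",
#              "1d3fe0c1deadbeef...", 24576)
```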
