Reduce time stamp precision #12

yemartin · 2019-06-24T22:43:10Z

The really nice thing about cshatag, compared to other tags file solutions like chkbit, is that the tag follows the file along when the file is moved or copied, as long as the destination filesystem supports extended attributes.

But this unfortunately breaks when the timestamp resolution of the target filesystem is lower than that of the original filesystem. This prevents detecting bit corruption that happened during move or copy operations.

For example, using the Go rewrite, and with:

/tmp on my root filesystem (APFS)
/Volumes/Organizer from my NAS, mounted through SMB (SMB_3.02)

$ rm /Volumes/Organizer/test.bin \
; touch /tmp/test.bin \
&& cshatag /tmp/test.bin \
&& mv /tmp/test.bin /Volumes/Organizer/ \
&& cshatag /Volumes/Organizer/test.bin

remove /Volumes/Organizer/test.bin? y
<outdated> /tmp/test.bin
 stored: 0000000000000000000000000000000000000000000000000000000000000000 0000000000.000000000
 actual: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 1561415148.563117837
<outdated> /Volumes/Organizer/test.bin
 stored: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 1561415148.563117837
 actual: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 1561415148.000000000

The second cshatag call, on the SMB share, considers the tag outdated. If corruption had happened during the move operation, cshatag would have missed it.

Suggestion: if I remember correctly, FAT is probably the lowest common denominator, with a 2-second timestamp resolution. So to ensure maximum compatibility, cshatag should consider the file unchanged if the file timestamp is within +/- 2 seconds of the tag timestamp.
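A minimal Go sketch of that tolerance check, using the two timestamps from the output above. The names `fatTolerance`, `withinTolerance` and `unixTS` are made up for illustration and are not part of cshatag.

```go
package main

import (
	"fmt"
	"time"
)

// fatTolerance is an assumed tolerance derived from the 2-second mtime
// resolution of FAT suggested above. It is not a cshatag constant.
const fatTolerance = 2 * time.Second

// withinTolerance reports whether two timestamps differ by at most tol,
// in either direction.
func withinTolerance(stored, actual time.Time, tol time.Duration) bool {
	d := actual.Sub(stored)
	if d < 0 {
		d = -d
	}
	return d <= tol
}

// unixTS builds a time.Time from the "seconds.nanoseconds" pair that
// cshatag prints.
func unixTS(sec, nsec int64) time.Time {
	return time.Unix(sec, nsec)
}

func main() {
	stored := unixTS(1561415148, 563117837) // timestamp stored in the tag on APFS
	actual := unixTS(1561415148, 0)         // mtime reported by the SMB share
	fmt.Println(withinTolerance(stored, actual, fatTolerance)) // prints "true"
}
```

With this check the hash comparison would still run for the file on the SMB share, since the tag is no longer treated as outdated.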

So, to sum it up bug-report style:

Current behavior

<outdated> /Volumes/Organizer/test.bin
 stored: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 1561415148.563117837
 actual: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 1561415148.000000000

Expected behavior

<ok> /Volumes/Organizer/test.bin

Do you think this makes sense, and would it be possible to add this to the Go rewrite?


es80 · 2019-11-06T17:05:47Z

Hello.
I've run into a similar issue copying files between ext4 and NTFS partitions.

On NTFS file systems the mtime resolution is 100ns. If I copy a file from ext4 to an NTFS file system using NTFS-3G, the last two digits of the file modification timestamp become 0.

Subsequent use of cshatag against the same file on NTFS will indicate the file is outdated even if the file has not been touched.

For example:

$ cshatag foo.txt
<outdated> foo.txt
 stored: 0000000000000000000000000000000000000000000000000000000000000000 0000000000.000000000
 actual: 181210f8f9c779c26da1d9b2075bde0127302ee0e3fca38c9a83f5b1dd8e5d3b 1572947268.840550551

$ cshatag foo.txt
<ok> foo.txt

$ cp -a foo.txt /home/$USER/ntfs_mount/

$ cshatag /home/$USER/ntfs_mount/foo.txt
<outdated> /home/$USER/ntfs_mount/foo.txt
 stored: 181210f8f9c779c26da1d9b2075bde0127302ee0e3fca38c9a83f5b1dd8e5d3b 1572947268.840550551
 actual: 181210f8f9c779c26da1d9b2075bde0127302ee0e3fca38c9a83f5b1dd8e5d3b 1572947268.840550500

$ cshatag /home/$USER/ntfs_mount/foo.txt
<ok> /home/$USER/ntfs_mount/foo.txt

This doesn't seem to happen between other filesystems with different mtime granularity (I tested ext2, ext3 and ext4; cshatag wouldn't work for me for files on FAT or exFAT).

For my own use I wrote some code so that cshatag could be used with an -ntfs flag which then ignores any discrepancy under 100ns.
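The code behind that -ntfs flag is not shown in this thread. One possible reading, sketched below in Go, is to round both timestamps down to a multiple of 100ns before comparing them, which mirrors what happened to the mtime in the example above. The names `ntfsGranularityNs`, `truncateNs` and `equalAtGranularity` are hypothetical.

```go
package main

import "fmt"

// ntfsGranularityNs is the 100 ns mtime resolution of NTFS, in nanoseconds.
const ntfsGranularityNs = 100

// truncateNs rounds a non-negative nanosecond count down to a multiple
// of granularity.
func truncateNs(nsec, granularity int64) int64 {
	return nsec - nsec%granularity
}

// equalAtGranularity compares two (seconds, nanoseconds) timestamps after
// truncating the nanosecond parts to the given granularity.
func equalAtGranularity(secA, nsecA, secB, nsecB, granularity int64) bool {
	return secA == secB &&
		truncateNs(nsecA, granularity) == truncateNs(nsecB, granularity)
}

func main() {
	// Timestamps from the foo.txt example: the ext4 value and the NTFS value.
	fmt.Println(truncateNs(840550551, ntfsGranularityNs)) // prints "840550500"
	fmt.Println(equalAtGranularity(
		1572947268, 840550551,
		1572947268, 840550500,
		ntfsGranularityNs)) // prints "true"
}
```

Truncating is stricter than accepting any difference under 100ns: two timestamps 50ns apart that straddle a 100ns boundary would still compare as different.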

I think it is worth keeping the full width of the time stamp as a default. The simplest solution would be to have a different type of <ok> output to indicate that, although the timestamp is different, the file is identical. For example:

$ cshatag foo.txt
<unchanged> foo.txt

This only needs one extra 'if' condition in the code. A more complex solution would be command-line flags specifying an acceptable time discrepancy such as 2s, 1s, 1ms, 100ns, 1ns (default). For example:

$ cshatag -time=100ns foo.txt
<ok> foo.txt
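As a rough illustration of how such a flag could be parsed, Go's standard `flag` package already understands duration strings like 2s, 1ms and 100ns. This is only a sketch: the -time flag is the proposal from this comment, not an existing cshatag option, and `parseTolerance` is a hypothetical helper.

```go
package main

import (
	"flag"
	"fmt"
	"time"
)

// parseTolerance parses a hypothetical -time flag from args and returns the
// tolerance together with the remaining (file) arguments. The default of
// 1ns follows the proposal above.
func parseTolerance(args []string) (time.Duration, []string, error) {
	fs := flag.NewFlagSet("cshatag-sketch", flag.ContinueOnError)
	tol := fs.Duration("time", time.Nanosecond,
		"acceptable mtime discrepancy, e.g. 2s, 1s, 1ms, 100ns")
	if err := fs.Parse(args); err != nil {
		return 0, nil, err
	}
	return *tol, fs.Args(), nil
}

func main() {
	tol, files, err := parseTolerance([]string{"-time=100ns", "foo.txt"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(tol, files) // prints "100ns [foo.txt]"
}
```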

I would be happy to submit pull requests for either if @rfjakob is interested.

yemartin · 2019-11-14T03:52:42Z

@es80 I was also thinking of adding an option at first. I like your -time= idea the best, by the way.

But then I realized: what happens when you do have some bit corruption during your cross-filesystem transfer, and forget to use that -time option? A false negative: bit corruption happened, but it will not be detected, which is the first job of cshatag.

Take the opposite case: we modify cshatag to ignore time differences of up to 2s. Now what happens when there is bit corruption during a cross-filesystem copy? We catch it, yeah! Now what if there is a legitimate file change within 2s of the original cshatagging? We get a false positive. The user may get a scare, but no harm done.

So for me, while the option was a good idea, it needs to be implemented as the default behavior. What do you think, @es80 and @rfjakob?
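The trade-off described here can be sketched as a small decision function in Go. It assumes a tolerance applied by default; `tolerance` and `classify` are hypothetical names, and the status strings only echo the ones cshatag prints.

```go
package main

import (
	"fmt"
	"time"
)

// tolerance is the assumed default from the proposal: 2 seconds.
const tolerance = 2 * time.Second

// classify sketches "tolerance by default". mtimeDelta is the difference
// between the file's mtime and the timestamp stored in the tag.
func classify(hashEqual bool, mtimeDelta, tol time.Duration) string {
	if mtimeDelta < 0 {
		mtimeDelta = -mtimeDelta
	}
	if mtimeDelta > tol {
		// The file was modified well after tagging: the tag is stale.
		return "outdated"
	}
	if hashEqual {
		return "ok"
	}
	// Either real corruption, or a legitimate change made within tol of
	// tagging (the false positive discussed above).
	return "corrupt"
}

func main() {
	delta := 563117837 * time.Nanosecond // APFS to SMB truncation from the first comment
	fmt.Println(classify(true, delta, tolerance))          // prints "ok"
	fmt.Println(classify(false, delta, tolerance))         // prints "corrupt"
	fmt.Println(classify(false, 3*time.Second, tolerance)) // prints "outdated"
}
```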

es80 · 2019-11-15T16:51:04Z

Yes, that's a good point. I wrote some code for a -time option which sought to leave the default behaviour intact. But it does start to get a bit complicated since there are quite a few options for how the program might behave in the different cases.

My main preference would be to be able to differentiate those cases with different status outputs for example, or have options for how they are processed.

But as for the default behaviour, I really don't mind. I suppose the false positives you describe would be very unlikely.

(I actually tried to cook up a false positive and found even with nanosecond precision you can do

$ ./cshatag log.txt > log.txt

or occasionally

$ ./cshatag log.txt; echo "changed" >> log.txt;

then on the next check log.txt is marked corrupt. The level of precision in timestamps is somewhat illusory - the granularity on my system is around 0.005s.)

rfjakob · 2019-11-17T13:24:58Z

I like the idea of reporting it with a distinct event type. This case is now reported as <timechange>.
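For readers following along, the four outcomes can be sketched in Go as below. This is an illustration of the idea, not the code from the commit that closed this issue; `classify` is a hypothetical name, and the timestamp comparison is assumed to be exact.

```go
package main

import "fmt"

// classify maps the two comparisons (hash and mtime) to a status string.
// A matching hash with a differing mtime gets its own status instead of
// being reported as outdated.
func classify(hashEqual, mtimeEqual bool) string {
	switch {
	case hashEqual && mtimeEqual:
		return "ok"
	case hashEqual:
		return "timechange" // content identical, only the timestamp differs
	case mtimeEqual:
		return "corrupt" // content changed but the mtime did not
	default:
		return "outdated" // content and mtime both changed
	}
}

func main() {
	// The SMB and NTFS cases from this thread: same hash, different mtime.
	fmt.Println(classify(true, false)) // prints "timechange"
}
```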

yemartin mentioned this issue Jun 24, 2019

WIP Feature/macos compatibility #11

Closed

rfjakob closed this as completed in 0b1c47f Nov 17, 2019

yemartin mentioned this issue Jan 3, 2022

Reduce time stamp precision, take two #21

Closed
