ignoreChecksum: true by default #128
This is because checksum verification by default was deemed too vigorous and unnecessary. This discussion stems from issue #117.
Well, what I'd prefer is something more like rsync: not only changing the default, but also changing the logic for when to do the checksum. I will explain the details later (sorry, I'm busy now).
No worries. This is a work in progress, so you could take that file and submit a PR with the logic implemented, and I can close this one, just to cut out the middleman. Otherwise I'll wait for your explanation or jump onto it in the morning. Catch you in a few hours, I'm off to bed.
I agree with @kcwu: staying as close as possible to what rsync does would be my preference as well. Here's how I understand rsync uses checksums (please correct me if I'm wrong here):
I'm pretty sure the first point there is implementable in drive (and is hopefully already taken care of by this PR). I'm not so sure the second point can be exactly recreated here, since we don't have control over both ends of the pipe. The way rsync does it allows for verification of chunked file transfers. I think any checksumming for file transfer verification is out of scope for this PR.
Regarding the rolling checksum mentioned by @l3iggs: it is not possible. What I am concerned about are the following two cases.
The old behavior: (-ignore-checksum=false)
The last case needs explaining. Please look at https://github.com/odeke-em/drive/blob/master/src/types.go#L267 fileDifferences() and https://github.com/odeke-em/drive/blob/master/src/types.go#L296 op(). fileDifferences() computes the differ flags first, so if the sizes are equal and ignoreChecksum=false, it always does the checksum. However, after that, even if the checksums are equal, the last check of op()
The new behavior after this patch: (-ignore-checksum=true)
The following is what I expect with -ignore-checksum=false. I prefer this one as the default.
With -ignore-checksum=true
Maybe somebody would like a new flag, -always-checksum; I don't know.
I'll look at this later; however, this might be two releases from now.
FYI, I found that the behavior of "-always-checksum" I described above is actually what rakyll's original code does.
@kcwu how is that different from https://github.com/odeke-em/drive/blob/master/src/types.go#L267-282, in combination with https://github.com/odeke-em/drive/blob/master/src/types.go#L315-323?
L267-282 runs before L315-323. In fileDifferences(), the checksum calculation is done even if it is not used later.
Checksum calculation is done only if you aren't ignoring it and if the size is the same.
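To make the rule in the previous comment concrete, here is a minimal Go sketch, not the actual code in types.go. The type `fileInfo`, the function `differences`, and the `differ*` constants are hypothetical names chosen for illustration; the only behavior taken from the thread is that the checksum is computed only when checksums are not being ignored and the sizes match.

```go
package main

import (
	"crypto/md5"
	"fmt"
)

// fileInfo is a hypothetical stand-in for drive's file type.
type fileInfo struct {
	size    int64
	modTime int64 // Unix seconds
	data    []byte
}

// Hypothetical difference flags, combined as a bit mask.
const (
	differNone     = 0
	differSize     = 1 << 0
	differModTime  = 1 << 1
	differChecksum = 1 << 2
)

// differences reports how src and dst differ. The checksum is computed
// only when ignoreChecksum is false and the sizes are equal.
func differences(src, dst fileInfo, ignoreChecksum bool) int {
	diff := differNone
	if src.size != dst.size {
		diff |= differSize
	}
	if src.modTime != dst.modTime {
		diff |= differModTime
	}
	if !ignoreChecksum && src.size == dst.size {
		if md5.Sum(src.data) != md5.Sum(dst.data) {
			diff |= differChecksum
		}
	}
	return diff
}

func main() {
	a := fileInfo{size: 3, modTime: 100, data: []byte("abc")}
	b := fileInfo{size: 3, modTime: 100, data: []byte("abd")}
	fmt.Println(differences(a, b, true))  // checksum skipped: prints 0
	fmt.Println(differences(a, b, false)) // checksum compared: prints 4
}
```

With the checksum ignored, two files of the same size and mtime but different content are reported as identical, which is the trade-off this PR makes the default.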
And note that if we are going to ignore the checksum by default, then toggling the DifferChecksum bit should produce the expected result.
Hmm, I didn't answer it accurately. Let me try again. The problem is that in L315-323, if the mtime differs, you decide to sync whether or not the checksum is identical. Given that, calculating the checksum wastes CPU and time. In other words, according to L315-323, if the mtime differs, we don't actually care about the DifferChecksum bit. The problem appears because the original logic has been split into two parts.
See, the thing is that changing mtime is a separate operation from calculating the checksum, so I don't see why those two should be coupled. There are actually lots of times when you just want to change the mtime and not the content, e.g. after a touch locally or even on the server.
So, as you can see from that, it is not wasting CPU and time.
Yes, you are right. Sorry, I misunderstood how it works. So, the current behavior should be
The new behavior after this patch: (-ignore-checksum=true)
Is this right? If so, what do you think about the following behavior? (assume no conflict)
Sorry for the late reply, I was just completing these essays ;) #studentlife
Will drive ever get delta upload functionality?
Remember you once opened this issue, then evaluated and closed it? Also, I don't think delta uploads are relevant to this discussion, as that is a beast of its own; I'd rather keep this small and solvable. Delta uploads can take their own issue and discussion :)
I'm not suggesting we talk in depth about it here. My question was about drive's roadmap. If delta uploads are not on the horizon then I have opinion A on this issue. If they are on the horizon, then I have opinion B. (where A and B are secrets right now :-))
Alright. Since you asked, I've been looking at it offline, but it is quite haxorish. It is a risky option because it will require quite a bit of work and indexing, so I would prefer to hold off until it is well understood. Also, Google Drive doesn't yet expose checksums for chunks. However, it allows for checking versions and etags (from this you can see how the pieces come together). So to answer your question: on the record, no plan yet; off the record, yes.
OK, then my opinion on how this should work is this: when checking whether a transfer needs to be done (for push or pull), if the user passes a --checksum flag, compare all file metadata AND the checksum (which will be calculated over every single file in the transfer at this point); if any differences are found, then transfer the file.
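A rough Go sketch of the proposal above, for illustration only. The names `entry` and `needsTransfer` are hypothetical, and the branch taken when --checksum is absent (compare metadata only) is an assumption, since the comment above only spells out the --checksum case.

```go
package main

import (
	"crypto/md5"
	"fmt"
)

// entry is a hypothetical stand-in for one side of a push or pull.
type entry struct {
	size    int64
	modTime int64 // Unix seconds
	data    []byte
}

// needsTransfer decides whether src should be transferred over dst.
// With checksum set, the checksum is computed for every file and
// compared alongside the metadata, as described in the proposal.
func needsTransfer(src, dst entry, checksum bool) bool {
	metaDiffers := src.size != dst.size || src.modTime != dst.modTime
	if !checksum {
		// Assumption: without --checksum, only metadata is compared.
		return metaDiffers
	}
	sumDiffers := md5.Sum(src.data) != md5.Sum(dst.data)
	return metaDiffers || sumDiffers
}

func main() {
	local := entry{size: 3, modTime: 100, data: []byte("abc")}
	remote := entry{size: 3, modTime: 100, data: []byte("abd")}
	fmt.Println(needsTransfer(local, remote, false)) // metadata only
	fmt.Println(needsTransfer(local, remote, true))  // metadata and checksum
}
```

In this sketch, same-size, same-mtime files with different content are transferred only when the flag is passed.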