Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Enhancement] Check file hash before starting network transfer #6153

Closed
stefanochiappini opened this issue Nov 10, 2017 · 5 comments
Closed
Assignees
Labels
Design & UX Performance ReadyToTest QA, please validate the fix/enhancement
Milestone

Comments

@stefanochiappini
Copy link

Following this discussion with @dragotin, it would be very useful to implement file hash check in the client, in order to avoid unnecessary data transfer, saving time and bandwitdh when syncing big folder with most of the files #equal.

Expected behaviour

  1. Whenever the client is started or there's a change in the synchronized folder, for every local file, the client should check if a file with the same name and size exists on the server;
  2. if it already exists, calculate the hash of the file locally and compare that to the one on the server;
  3. if both are equal, the client knows that both files are the same, and it can fill the entry in the local sync journal from the values from the server, while not downloading the file.

Actual behaviour

Equal files are unnecessarly downloaded, wasting a lot of time and bandwidth.

Steps to reproduce

  1. Install client on PC-1 a fill the synchronized "owncloud" folder with files.
  2. Data on PC-1 are copied in the server.
  3. Install client on PC-2 a fill the synchronized "owncloud" folder with the same files in PC-1.
  4. PC-2 downloads data from the server even if the files are exactly the same

Server configuration

Operating system: CentOS 7.2
Web server: Apache 2.4.6
Database: MariaDB 5.5.56
PHP version: 7.1
ownCloud version: 10.0.3

Client configuration

Client version: 2.3.3 (build 8250)
Operating system: Windows 7
OS language: Italian

Logs

no error log

@ckamm
Copy link
Contributor

ckamm commented Nov 10, 2017

This is a new feature in 2.4, see #5838. If the size, mtime and content hash match, no data will be downloaded. We would be happy if it would see additional tests (like with the 2.4 alpha).

@ckamm ckamm closed this as completed Nov 10, 2017
@guruz guruz added this to the 2.4.0 milestone Nov 15, 2017
@guruz
Copy link
Contributor

guruz commented Nov 15, 2017

I'm re-opening this. When having a good checksum (md5 or sha1) then there is no need for mtime check. The tuple (size,checksum/hash) is good enough. Please remove this mtime check if possible :-)

@ckamm Not sure if this is too late for 2.4.0. Maybe move to 2.4.1 or 2.5.0

@ckamm
Copy link
Contributor

ckamm commented Nov 16, 2017

If we don't discover significant issues during review I think this PR will be fine for 2.4

@stefanochiappini
Copy link
Author

stefanochiappini commented Nov 16, 2017

I agree with guruz, the mtime check might be the reason why it didn't work for me during the first sync (a big folder was unnecessarily downloaded), maybe while moving files across different OS (Win, Linux, OSx) mtime can be touched somewhere...
Ops, sorry, I didn't realize it will be implemented in a future release.

ckamm added a commit that referenced this issue Nov 16, 2017
Previously we required matching mtimes but that's actually
unnecessary when the question is about whether to skip the
download. We will still update the file's metadata.

Also, adjust behavior when the checksum is weak (Adler32):
in these cases we still depend on equal mtimes.
ckamm added a commit that referenced this issue Nov 16, 2017
Previously we required matching mtimes but that's actually
unnecessary when the question is about whether to skip the
download. We will still update the file's metadata.

Also, adjust behavior when the checksum is weak (Adler32):
in these cases we still depend on equal mtimes.
@ckamm ckamm self-assigned this Nov 16, 2017
@ckamm ckamm added the ReadyToTest QA, please validate the fix/enhancement label Nov 16, 2017
@SamuAlfageme
Copy link
Contributor

Finally got to check this. It's actually quite dope. Might also partially fix #5388

Closing!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Design & UX Performance ReadyToTest QA, please validate the fix/enhancement
Projects
None yet
Development

No branches or pull requests

4 participants