sync: check file content with md5 hash #1105

Closed
davies opened this issue Dec 8, 2021 · 1 comment · Fixed by #1208
@davies

davies commented Dec 8, 2021

During data migration, it's important to verify that the content in the source and the destination is identical. We should provide an option to check the md5sum of both.

We could have two modes:

--check-all: check all the files in the source and destination

--check-new: check only the newly copied files
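
A minimal sketch of how the two modes might differ, assuming both sides are local directory trees. The names `file_md5`, `check_files`, `check_all`, `check_new`, and the `copied` list are hypothetical and only for illustration; this is not the actual sync implementation, and it walks only the source tree, so files that exist solely in the destination are not reported.

```python
import hashlib
import os


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def list_files(root):
    """Yield the paths of all regular files under root, relative to root."""
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            yield os.path.relpath(os.path.join(dirpath, name), root)


def check_files(src_root, dst_root, rel_paths):
    """Return the relative paths that are missing in dst or differ in content."""
    mismatched = []
    for rel in rel_paths:
        src = os.path.join(src_root, rel)
        dst = os.path.join(dst_root, rel)
        if not os.path.isfile(dst) or file_md5(src) != file_md5(dst):
            mismatched.append(rel)
    return mismatched


def check_all(src_root, dst_root):
    """--check-all (hypothetical): verify every file found in the source."""
    return check_files(src_root, dst_root, sorted(list_files(src_root)))


def check_new(src_root, dst_root, copied):
    """--check-new (hypothetical): verify only the files copied in this run."""
    return check_files(src_root, dst_root, copied)
```

The trade-off is cost: checking everything reads both trees in full, while checking only new files reads just what the current run copied.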

@davies davies added the kind/feature New feature or request label Dec 8, 2021
@davies davies added this to the Release 1.0 milestone Dec 16, 2021
@SandyXSD SandyXSD self-assigned this Dec 20, 2021
@SandyXSD

From the rsync man page:
In short, rsync always verifies a whole-file checksum (MD5 for protocol 30 and later) after a file is transferred; the option below additionally compares checksums before transferring, to determine whether a file has changed.

-c, --checksum
              This changes the way rsync checks if the files have been changed and are in need of a transfer.  Without this  option,  rsync
              uses  a "quick check" that (by default) checks if each file’s size and time of last modification match between the sender and
              receiver.  This option changes this to compare a 128-bit checksum for each file that has a  matching  size.   Generating  the
              checksums  means that both sides will expend a lot of disk I/O reading all the data in the files in the transfer (and this is
              prior to any reading that will be done to transfer changed files), so this can slow things down significantly.

              The sending side generates its checksums while it is doing the file-system scan that builds the list of the available  files.
              The receiver generates its checksums when it is scanning for changed files, and will checksum any file that has the same size
              as the corresponding sender’s file:  files with either a changed size or a changed checksum are selected for transfer.

              Note that rsync always verifies that each transferred file was correctly reconstructed on the receiving side  by  checking  a
              whole-file  checksum  that  is  generated  as the file is transferred, but that automatic after-the-transfer verification has
              nothing to do with this option’s before-the-transfer "Does this file need to be updated?" check.

              For protocol 30 and beyond (first supported in 3.0.0), the checksum used is MD5.  For older protocols, the checksum  used  is
              MD4.
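
The two checks described in the excerpt can be sketched as follows. This is an illustration only, assuming local files: `needs_transfer` and `copy_with_md5` are hypothetical names, the quick check compares modification times at whole-second granularity, and none of this reflects rsync's actual protocol or code.

```python
import hashlib
import os


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_transfer(src, dst, checksum=False):
    """Before-the-transfer check: does dst need to be updated from src?"""
    if not os.path.exists(dst):
        return True
    s, d = os.stat(src), os.stat(dst)
    if s.st_size != d.st_size:
        return True  # a size difference selects the file in both modes
    if checksum:
        # Checksum mode: only files of matching size are hashed.
        return file_md5(src) != file_md5(dst)
    # Quick check: same size, so compare the modification time.
    return int(s.st_mtime) != int(d.st_mtime)


def copy_with_md5(src, dst, chunk_size=1 << 20):
    """Copy src to dst, hashing the data while it is copied, then verify dst."""
    digest = hashlib.md5()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        for chunk in iter(lambda: fin.read(chunk_size), b""):
            digest.update(chunk)
            fout.write(chunk)
    expected = digest.hexdigest()
    # After-the-transfer check: re-read the destination and compare digests.
    if file_md5(dst) != expected:
        raise IOError("checksum mismatch after copying %s" % src)
    return expected
```

Hashing during the copy avoids a second read of the source; only the destination is read again for verification.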
