Option to validate files by file size in Download Client. #5008

Closed
panta-123 opened this issue Nov 13, 2021 · 5 comments · Fixed by #5049

@panta-123
Contributor

panta-123 commented Nov 13, 2021

Motivation

At Belle2, when users download files, we offer the option to validate the downloaded files by either checksum or file size. Checksum validation is the default in Rucio. Would it be possible to add an option to validate by file size?

New option:

$ rucio download --check-by-size

Modification

@bari12
Member

bari12 commented Nov 15, 2021

It's doable, though I wonder why you would ever want to do this. Do you have files without checksums in the catalog? @cserf ?

@panta-123
Contributor Author

@bari12, at Belle2 checksum verification is slower than file size verification. So when the file's path already exists, we let the user choose between checksum and file size verification.
When recovering from a failed download, file size verification seems the better choice, since checksum verification was already done at the first download.
Would checksum and file size verification have the same time complexity in Rucio?
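To make the cost difference concrete, here is a minimal sketch, not Rucio's actual code. It assumes an Adler-32 checksum computed with Python's standard `zlib` module, and the helper names `adler32_hex` and `size_matches` are hypothetical. The checksum has to read every byte of the file, so its cost grows with the file size, while the size check is a single metadata lookup.

```python
import os
import zlib


def adler32_hex(path, chunk_size=1024 * 1024):
    """Checksum a file in chunks; every byte is read, so cost is O(file size)."""
    value = 1  # Adler-32 is defined to start at 1
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            value = zlib.adler32(chunk, value)
    return format(value & 0xFFFFFFFF, "08x")


def size_matches(path, expected_bytes):
    """Compare the on-disk size with the expected size; one stat call, no file read."""
    return os.path.getsize(path) == expected_bytes
```

A size match only shows that the file has the expected length, not that its content is intact, which is why it fits best as a quick check on files that already passed a checksum once.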

@iueda

iueda commented Nov 16, 2021

At Belle2, when users download files, we offer the option to validate the downloaded files by either checksum or file size. Checksum validation is the default in Rucio. Would it be possible to add an option to validate by file size?

Just to clarify, this is only for verifying the "local" files downloaded in previous attempts, which are checked in order to skip unnecessary downloads.
Files downloaded in the current process should still be verified with the checksum.

@bari12
Member

bari12 commented Nov 23, 2021

Understood, this should be more or less easy; we can schedule this via the core team.
@joeldierkes, can you please have a look?

if os.path.isfile(dest_file_path):
should be where we can adapt this. (Maybe a --check-filesize-only option on rucio download)
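
As a rough illustration of how that branch could be adapted (a sketch under assumptions, not the code that was merged): only `dest_file_path` and the `os.path.isfile` check come from the line referenced above. The function name, the `check_filesize_only` parameter, and the `checksum_func` callback are hypothetical.

```python
import os


def local_copy_is_valid(dest_file_path, expected_bytes, expected_checksum,
                        checksum_func, check_filesize_only=False):
    """Decide whether an already existing local file can be reused.

    Hypothetical helper: returns True when the download can be skipped.
    """
    if not os.path.isfile(dest_file_path):
        # Nothing on disk yet, so the file has to be downloaded.
        return False
    if check_filesize_only:
        # Cheap path: compare sizes only, without reading the file content.
        return os.path.getsize(dest_file_path) == expected_bytes
    # Default path: full checksum comparison of the local copy.
    return checksum_func(dest_file_path) == expected_checksum
```

Files fetched in the current run would not go through this helper and would still be checksummed after download.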

@joeldierkes
Contributor

joeldierkes commented Nov 25, 2021

--check-filesize-only could be misleading if the file needs to be downloaded. The user could assume that the checksum is not checked, even though it is.

I would suggest a --check-local-with-filesize-only flag...
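
For illustration, such a flag could be declared roughly like this with Python's standard `argparse`. This is a standalone sketch, not the actual `rucio download` parser; the program name and help text are assumptions.

```python
import argparse


def build_parser():
    """Build a stand-in parser carrying only the proposed flag."""
    parser = argparse.ArgumentParser(prog="download-sketch")
    parser.add_argument(
        "--check-local-with-filesize-only",
        action="store_true",
        default=False,
        help="Verify files that already exist locally by file size instead of "
             "checksum; freshly downloaded files are still checksummed.",
    )
    return parser
```

The name spells out that only pre-existing local files are affected, which avoids the misunderstanding described above.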

joeldierkes pushed a commit to joeldierkes/rucio that referenced this issue Nov 26, 2021
…5008

The checksum verification is slow compared to the filesize verification. This
commit adds the option to verify already downloaded files by filesize only.
joeldierkes pushed a commit to joeldierkes/rucio that referenced this issue Dec 2, 2021
…5008

The checksum verification is slow compared to the filesize verification. This
commit adds the option to verify already downloaded files by filesize only.
bari12 added a commit that referenced this issue Dec 9, 2021
…date_files_by_file_size_in_Download_Client_

Clients: Validate already downloaded file by filesize only Fix #5008
bari12 pushed a commit that referenced this issue Dec 9, 2021
The checksum verification is slow compared to the filesize verification. This
commit adds the option to verify already downloaded files by filesize only.
@bari12 bari12 modified the milestones: 1.27.2-clients, 1.27.2 Dec 9, 2021