Check file exists before download #63

Merged
merged 2 commits into from
Apr 25, 2022


wveit
Contributor

@wveit wveit commented Apr 25, 2022

Addresses #17, where a failed download during a subscriber run would cause all the successfully downloaded files to be re-downloaded during the next run.

Changes:

  • Before downloading a file, the subscriber first checks whether the file already exists in the output_directory. If it does, the subscriber calculates a checksum for the file and compares it against the checksum from the granule search results. If the checksums match, the download is skipped for that file.
  • This new behavior can be overridden with a new --force/-f option, which causes every file that shows up in the CMR search results to be downloaded.

Effectively, this change does something like this:

if (the_file_exists() and
        the_file_checksum_is_current_with_cmr() and
        force_option_not_used()):
    skip_the_download()
else:
    do_the_download()
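
For illustration only, here is a slightly more concrete sketch of that decision in Python. It is not the code in this PR: the names `file_checksum` and `should_download`, the `md5` default, and the idea that the expected checksum arrives as a plain hex string are all assumptions made for the example. Mapping the algorithm name reported by CMR onto a `hashlib` name is left out.

```python
import hashlib
from pathlib import Path


def file_checksum(path: Path, algorithm: str = "md5", chunk_size: int = 1 << 20) -> str:
    """Return the hex digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as f:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def should_download(path: Path, expected_checksum: str,
                    algorithm: str = "md5", force: bool = False) -> bool:
    """Hypothetical helper: decide whether a file needs to be (re-)downloaded."""
    if force:
        # --force/-f: always download, regardless of what is on disk.
        return True
    if not path.is_file():
        # Nothing on disk yet, so there is nothing to verify.
        return True
    # Download again only if the local file no longer matches the expected checksum.
    return file_checksum(path, algorithm).lower() != expected_checksum.lower()
```

A caller would then skip the download whenever `should_download(...)` returns `False`. Checking `force` first avoids hashing large files when the result would be ignored anyway.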

I should mention that this PR doesn't do anything special to make the subscriber "resume" or "retry" if a failure occurs. It just prevents re-downloading files on the next run.

Wilbert Veit added 2 commits April 25, 2022 02:37
Prevents re-downloading files (e.g. in case previous run
failed because of other file failures).

If the subscriber sees a file already exists, it will also calculate
the file checksum and see if it matches the checksum in
CMR. If the checksum doesn't match, it will re-download.

There is now a --force/-f option that will cause subscriber
to re-download even if the file exists and is up to date.

Issue #17
Member

@frankinspace frankinspace left a comment


Looks good to me. Kudos on the documentation and tests as well, looks great.

@mike-gangl mike-gangl merged commit 2b0276a into develop Apr 25, 2022
@wveit wveit deleted the check-file-exists-before-download branch April 26, 2022 22:36