Add method to check and skip duplicate content uploads to S3 #1032
Rebase of #1015 onto 24.1-release so that HBCD can benefit from these changes in the next bugfix release of LORIS-MRI.
Description (from PR #1015)
These changes check whether the content of a file that would be uploaded to S3 has already been uploaded. Before uploading, the hash of the file's content is compared against the object already stored at the target S3 key; if the content is identical, the upload is skipped.
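As a rough illustration of the idea (not the PR's actual code), here is a minimal sketch using boto3. The function name `upload_if_changed` is hypothetical, and the comparison assumes the existing object was uploaded in a single part so that its ETag is the MD5 hex digest of its content (this does not hold for multipart or some server-side-encrypted uploads):

```python
import hashlib

import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")


def upload_if_changed(local_path: str, bucket: str, key: str) -> bool:
    """Upload local_path to s3://bucket/key unless identical content is
    already stored there. Returns True if an upload was performed."""
    # Hash the local file's content.
    md5 = hashlib.md5()
    with open(local_path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            md5.update(chunk)
    local_hash = md5.hexdigest()

    # Check whether an object already exists at the target key and whether
    # its ETag matches the local content hash (valid only for single-part,
    # non-KMS uploads -- an assumption of this sketch).
    try:
        head = s3.head_object(Bucket=bucket, Key=key)
        remote_etag = head["ETag"].strip('"')
        if remote_etag == local_hash:
            return False  # identical content already at this key; skip
    except ClientError as err:
        # A 404 means nothing exists at this key yet; any other error
        # is real and should propagate.
        if err.response["Error"]["Code"] not in ("404", "NoSuchKey"):
            raise

    s3.upload_file(local_path, bucket, key)
    return True
```

In a versioning-enabled bucket, skipping the `upload_file` call when the content matches is what prevents a redundant new object version from being created.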
This resolves an issue where the same content could be uploaded to an S3 bucket repeatedly, even when an identical file already existed at the target key. Normally this is harmless, but in versioning-enabled buckets it creates duplicate versions of the files when no changes are needed.
This does not appear to cause any breaking changes.