Implement automatic multipart copy functionality in S3 CRT async client #3403
Please ignore the test coverage and log statements; I will fix them as I finalize the PR
partCount, optimalPartSize));

// The list of completed parts must be sorted
AtomicReferenceArray<CompletedPart> completedParts = new AtomicReferenceArray<>(partCount);
Implementation note: `AtomicReferenceArray` is used instead of `ConcurrentLinkedQueue` because the completed parts must be ordered, and `AtomicReferenceArray` should be more performant in this case.
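To illustrate the note above, here is a minimal sketch of collecting results by index so that the final list is ordered regardless of which part finishes first. `PartResult` and `OrderedPartCollector` are hypothetical stand-ins invented for this example; they are not the SDK's `CompletedPart` or any class from this PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical stand-in for a completed part; not the SDK's CompletedPart.
final class PartResult {
    final int partNumber;
    final String eTag;

    PartResult(int partNumber, String eTag) {
        this.partNumber = partNumber;
        this.eTag = eTag;
    }
}

final class OrderedPartCollector {
    // Simulates parts completing on other threads in arbitrary order.
    static List<PartResult> collect(int partCount) {
        AtomicReferenceArray<PartResult> completed = new AtomicReferenceArray<>(partCount);
        List<CompletableFuture<Void>> futures = new ArrayList<>();

        // Submit in reverse order to show that submission/completion order is irrelevant:
        // each part writes to its own slot (partNumber - 1).
        for (int i = partCount - 1; i >= 0; i--) {
            final int index = i;
            futures.add(CompletableFuture.runAsync(
                () -> completed.set(index, new PartResult(index + 1, "etag-" + (index + 1)))));
        }
        CompletableFuture.allOf(futures.toArray(new CompletableFuture<?>[0])).join();

        // Reading the slots front to back yields the parts sorted by part number.
        List<PartResult> ordered = new ArrayList<>(partCount);
        for (int i = 0; i < partCount; i++) {
            ordered.add(completed.get(i));
        }
        return ordered;
    }
}
```

With a queue, the same result would need an explicit sort after all parts finish; indexing by part number avoids that step.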
a4f1a8f to 3b896d2
@@ -110,50 +99,14 @@ public URI endpointOverride() {
return endpointOverride;
}

public Executor futureCompletionExecutor() {
Removed this `futureCompletionExecutor` block because it is no longer needed. (Should've created a separate PR for this change 😓 I can remove this change if it makes the review difficult.)
SonarCloud Quality Gate failed.
 * Iterable class to generate {@link UploadPartCopyRequest}s
 */
@SdkInternalApi
public final class UploadPartCopyRequestIterable implements SdkIterable<UploadPartCopyRequest> {
Nice
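For readers unfamiliar with the pattern, the idea of lazily generating one request per part can be sketched as follows. This is not the PR's `UploadPartCopyRequestIterable`: `PartRangeIterable` is a hypothetical class that yields only the byte-range string for each part (in the `bytes=first-last` form used for copy source ranges), assuming a fixed part size and a known content length.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical sketch: yields one "bytes=first-last" range per part, lazily.
final class PartRangeIterable implements Iterable<String> {
    private final long contentLength;
    private final long partSize;

    PartRangeIterable(long contentLength, long partSize) {
        if (contentLength < 0 || partSize <= 0) {
            throw new IllegalArgumentException("contentLength must be >= 0 and partSize > 0");
        }
        this.contentLength = contentLength;
        this.partSize = partSize;
    }

    @Override
    public Iterator<String> iterator() {
        return new Iterator<String>() {
            private long offset = 0; // first byte of the next part

            @Override
            public boolean hasNext() {
                return offset < contentLength;
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                // The last part may be shorter than partSize.
                long last = Math.min(offset + partSize, contentLength) - 1;
                String range = "bytes=" + offset + "-" + last;
                offset = last + 1;
                return range;
            }
        };
    }
}
```

For example, iterating `new PartRangeIterable(25, 10)` yields `bytes=0-9`, `bytes=10-19`, and `bytes=20-24`.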
* Replacing S3TransferManager interfaces that allowed builder methods of S3ClientConfiguration with builder methods of S3AsyncClient (#3247)
* Added customization in codegen to generate additional builder methods (#3252)
* S3Object based DownloadFilter and removing DownloadFileContext as destination based filter is removed (#3258)
* Moved tm POJO classes to model package and tm config classes to config package. Added integration tests for s3 select using S3CrtAsyncClient (#3289)
* Fix broken integ test (#3301)
* S3 Transfer manager renamings based on feedback: (#3297)
  1. Rename destinationDirectory to destination.
  2. Move DownloadDirectoryRequest.prefix and delimiter to just rely on modifying the list requests.
  3. Remove upload directory recursive option in favor of using maxDepth(1).
  4. Rename UploadDirectoryRequest's prefix and delimiter to s3Prefix and s3Delimiter.
  5. Rename ResumableFileDownload's to* and writeTo* methods to serializeTo*. Remove charsets from write/read methods, and just use UTF-8.
  6. Do not base64 encode when writing ResumableFileDownload to disk.
* Allow pausing a resumed download even when the download hasn't already started. (#3300)
* Add POJO classes for upload pause/resume (#3337)
* Refactoring of Transfer manager APIs (#3374)
* Refactoring of Transfer manager APIs
* Merging the integ test failure Pr 2119 from staging branch
* Add flexible checksum support and update perf tests (#3376)
* Fix flexiblechecksum implementation (#3391)
* [TM upload pause/resume Part 2] Implement pause and resume for uploadFile (#3357)
* Implement pause and resume for uploadFile
* Update Javadocs
* address feedback
* Implement automatic multipart copy functionality in S3 CRT async client (#3403)
* Implement automatic multipart copy functionality in S3 CRT async client
* Add more tests
* fix cancellation logic
* Refactor CopyRequestProvider, fix request conversion and add more tests
* Fix checkstyle
* Transfer Manager tests refactoring (#3420)
* Remove use of Junit4, clean up and consolidate tests in tm module
* Ignoring the test if unicode can't be used as directory name
* Add serialization and deserialization support for ResumableFileUpload (#3432)
* Support serialization and deserialization of ResumableFileUpload
* Address feedback
* Empty json should be unmarshalled to empty map
* Errors should not be wrapped - S3 Transfer Manager (#3433)
* Errors should not be wrapped
* update handleException()
* Changelog entry
* Resolve comments Update changelog description, refactor handleException(), add test
* Add failed message to SdkException
* Refactor handleException() and format changelog (#3461)
* Fixed an issue where SSEC params were not correctly passed in copy operation (#3464)
* Replace inline snippets with external compilable snippets (#3465)
* Replace inline snippets with external compilable snippets
* Fix build and address feedback
* Fix build
* Only enable CRT checksum for getObject and putObject (#3477)
* Only use CRT flexible checksum for getObject and putObject
* Fix build
* Fix integ tests set up and tear down steps (#3485)
* Enable backpressure in TM (#3533)
* integrate with crt s3 flow control
* Update benchmark code
* Add backpressure config
* Change window size
* Update initial window size
* Change initial window size
* Use heap max memory for initial window size
* Give some buffer
* change window size
* Make read buffer size configurable
* Log result to a file
* Various updates
* Various updates
* Add CRT benchmark
* Various updates
* Fix checkstyle errors and tests
* Fix flaky test
* Fix checkstyle errors
* Add validation
* Add tests
* For copy operation, always forward multipart copy exception from one … (#3549)
* For copy operation, always forward multipart copy exception from one request to other multipart copy requests
* Minor refactoring in CopyObjectHelper (#3552)
* Add benchmarks for copy, uploadDirectory and downloadDirectory (#3551)
* Add benchmarks for copy, uploadDirectory and downloadDirectory
* Update sample code and fix snippet path (#3567)
* Update sample code and fix snippet path
* Fix link
* Integrate with CRT checksum fix (#3566)
* Integrate with CRT checksum fix
* Rename sourceDirectory to source and add S3AsyncClient#crtCreate (#3572)
* Rename sourceDirectory to source and add S3AsyncClient#crtCreate
* Use ByteBufferStoringSubscriber (#3581)
* Use ByteBufferStoringSubscriber
* Add a comment
* Create constant for bytes buffered
* Increase chunk size for file upload (#3583)
* Rename S3TransferManager.build().maxDepth to uploadDirectoryMaxDepth, rename S3TransferManager.builder().s3AsyncClient to .s3Client (#3584)
* Fixed an issue where sdkResponse is not present in the ProgressSnapshot for upload and copy (#3585)
* Throw UnsupportedOperationException if a user tries to pause a upload… (#3586)
* Throw UnsupportedOperationException if a user tries to pause a upload with non CRT-based S3 client
* Use SimplePublisher (#3594)
* Update documentation for Transfer Manager (#3592)
* Update javadoc
* Integrate with latest CRT pause/resume fix (#3588)
* Integrate with latest CRT pause/resume fix
* Bump CRT version
* Fixed an issue that could result in uncompletable future when headObject request threw exception in copy (#3609)
* Make crt dependency optional in transfer manager module (#3613)
* Make aws-crt an optional dependency in s3-transfer-manager module.
* Update README
* Fix category for changelog entries

Co-authored-by: John Viegas <70235430+joviegas@users.noreply.github.com>
Co-authored-by: Matthew Miller <millem@amazon.com>
Co-authored-by: David Ho <70000000+davidh44@users.noreply.github.com>
Motivation and Context
Implement automatic multipart copy functionality in S3CrtAsyncClient instead of relying on the CRT implementation.
Modifications
Implement automatic multipart copy functionality in S3CrtAsyncClient.
Below is the workflow:
i. createMultipartUpload
ii. uploadPartCopy for all parts
iii. completeMultipartUpload
If any error happens during 3.ii or 3.iii, the SDK will attempt to call abortMultipartUpload to clean up the parts.
Testing
Added unit tests and integ tests
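The workflow described under Modifications can be sketched roughly as below. This is an illustration under stated assumptions, not the PR's implementation: `MultipartOps`, `RecordingOps`, and `MultipartCopySketch` are hypothetical names, the four interface methods merely mirror the S3 operations of the same names, and the fake completes every call synchronously so the control flow is easy to follow.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical abstraction over the S3 calls; not the SDK's client interface.
interface MultipartOps {
    CompletableFuture<String> createMultipartUpload();                          // yields an upload ID
    CompletableFuture<String> uploadPartCopy(String uploadId, int partNumber);  // yields an ETag
    CompletableFuture<Void> completeMultipartUpload(String uploadId, List<String> eTags);
    CompletableFuture<Void> abortMultipartUpload(String uploadId);
}

// In-memory fake that records calls and can fail one part; for demonstration only.
final class RecordingOps implements MultipartOps {
    final List<String> calls = new ArrayList<>();
    private final int failingPart; // part number to fail, or -1 for none

    RecordingOps(int failingPart) {
        this.failingPart = failingPart;
    }

    public CompletableFuture<String> createMultipartUpload() {
        calls.add("create");
        return CompletableFuture.completedFuture("upload-1");
    }

    public CompletableFuture<String> uploadPartCopy(String uploadId, int partNumber) {
        calls.add("part" + partNumber);
        CompletableFuture<String> f = new CompletableFuture<>();
        if (partNumber == failingPart) {
            f.completeExceptionally(new IllegalStateException("part " + partNumber + " failed"));
        } else {
            f.complete("etag-" + partNumber);
        }
        return f;
    }

    public CompletableFuture<Void> completeMultipartUpload(String uploadId, List<String> eTags) {
        calls.add("complete");
        return CompletableFuture.completedFuture(null);
    }

    public CompletableFuture<Void> abortMultipartUpload(String uploadId) {
        calls.add("abort");
        return CompletableFuture.completedFuture(null);
    }
}

final class MultipartCopySketch {
    static CompletableFuture<Void> copy(MultipartOps ops, int partCount) {
        return ops.createMultipartUpload().thenCompose(uploadId -> {
            // Start a copy for every part.
            List<CompletableFuture<String>> parts = new ArrayList<>();
            for (int p = 1; p <= partCount; p++) {
                parts.add(ops.uploadPartCopy(uploadId, p));
            }
            // Once all parts succeed, complete the upload with ETags in part order.
            CompletableFuture<Void> done = CompletableFuture
                .allOf(parts.toArray(new CompletableFuture<?>[0]))
                .thenCompose(ignored -> {
                    List<String> eTags = new ArrayList<>();
                    for (CompletableFuture<String> part : parts) {
                        eTags.add(part.join()); // already complete at this point
                    }
                    return ops.completeMultipartUpload(uploadId, eTags);
                });
            // Best-effort cleanup if copying or completing failed.
            done.whenComplete((result, error) -> {
                if (error != null) {
                    ops.abortMultipartUpload(uploadId);
                }
            });
            return done;
        });
    }
}
```

In this sketch the abort is fire-and-forget: the returned future still carries the original failure, and a failed abort is not reported to the caller.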