Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replacing S3TransferManager interfaces that allowed builder methods of S3ClientConfiguration with builder methods of S3AsyncClient #3247

Conversation

joviegas
Copy link
Contributor

@joviegas joviegas commented Jun 13, 2022

Replacing S3TransferManager interfaces that allowed builder methods of S3ClientConfiguration with builder methods of S3AsyncClient

Motivation and Context

  • Replacing S3TransferManager interfaces that allowed builder methods of S3ClientConfiguration with builder methods of S3AsyncClient

Modifications

  • Removed S3ClientConfiguration
  • Directly passing S3AsyncClient in the builder APIs of S3TransferManger.

Testing

  • updated junits

Screenshots (if appropriate)

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)

Checklist

  • I have read the CONTRIBUTING document
  • Local run of mvn install succeeds
  • My code follows the code style of this project
  • My change requires a change to the Javadoc documentation
  • I have updated the Javadoc documentation accordingly
  • I have added tests to cover my changes
  • All new and existing tests passed
  • I have added a changelog entry. Adding a new entry must be accomplished by running the scripts/new-change script and following the instructions. Commit the new file created by the script in .changes/next-release with your changes.
  • My change is to implement 1.11 parity feature and I have updated LaunchChangelog

License

  • [ x] I confirm that this pull request can be released under the Apache 2 license

@joviegas joviegas requested a review from a team as a code owner June 13, 2022 19:30
@joviegas joviegas force-pushed the transferManager_client_config_joviegas branch from 4736a2b to d412332 Compare June 13, 2022 21:50
@joviegas joviegas changed the title Replacing S3TransferManager interfaces that allowed builder methods o… Replacing S3TransferManager interfaces that allowed builder methods of S3ClientConfiguration with builder methods of S3AsyncClient Jun 13, 2022
Comment on lines 509 to 514
* Low level S3 client that implements {@link S3AsyncClient}.The {@link S3TransferManager} already provides sensible
* default client. As of now only {@link S3CrtAsyncClient} supports concurrent execution of Transfer manager operations.
* @param s3AsyncClient Implementation of {@link S3AsyncClient}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we clarify that the provided S3AsyncClient will not be closed when the transfer manager is closed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


if (!(s3AsyncClient instanceof S3CrtAsyncClient)) {
log.warn(() -> "s3AsyncClient must be of S3CrtAsyncClient in order to get maximum throughput. The current"
+ " s3AsyncClient is of " +s3AsyncClient.getClass().getSimpleName() +" type in which individual file"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe something like the following?

The provided S3AsyncClient is not an instance of S3CrtAsyncClient, and thus multipart 
upload/download feature is not enabled. To benefit from maximum throughput, 
consider using S3AsyncClient.crtBuilder().build() instead.

On a side note, getSimpleName has poor performance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

@@ -356,7 +350,7 @@ public Copy copy(CopyRequest copyRequest) {

@Override
public void close() {
s3CrtAsyncClient.close();
s3AsyncClient.close();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not close the client if it's provided by the user

@@ -402,16 +396,16 @@ private static boolean isMrapArn(Arn arn) {
}

private static final class DefaultBuilder implements S3TransferManager.Builder {
private S3ClientConfiguration s3ClientConfiguration = S3ClientConfiguration.builder().build();
private S3AsyncClient s3AsyncClient = S3CrtAsyncClient.builder().build();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be lazily initialized. Should we move this up in the TM ctor and only create one if no client is provided?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@joviegas joviegas force-pushed the transferManager_client_config_joviegas branch from d412332 to d4918ac Compare June 14, 2022 17:58
Comment on lines 415 to 451
* If S3AsyncClient is not supplied then SDK wil create default S3AsyncClient with following properties
*
* <p>
* If region is not specified, the SDK will attempt to identify the endpoint automatically using the following logic:
* <ol>
* <li>Check the 'aws.region' system property for the region.</li>
* <li>Check the 'AWS_REGION' environment variable for the region.</li>
* <li>Check the {user.home}/.aws/credentials and {user.home}/.aws/config files for the region.</li>
* <li>If running in EC2, check the EC2 metadata service for the region.</li>
* </ol>
*
* <p>The default provider will attempt to identify the credentials automatically using the following checks:
* <ol>
* <li>Java System Properties - <code>aws.accessKeyId</code> and <code>aws.secretKey</code></li>
* <li>Environment Variables - <code>AWS_ACCESS_KEY_ID</code> and <code>AWS_SECRET_ACCESS_KEY</code></li>
* <li>Credential profiles file at the default location (~/.aws/credentials) shared by all AWS SDKs and the AWS CLI</li>
* <li>Credentials delivered through the Amazon EC2 container service if AWS_CONTAINER_CREDENTIALS_RELATIVE_URI
* environment variable is set and security manager has permission to access the variable.</li>
* <li>Instance profile credentials delivered through the Amazon EC2 metadata service</li>
* </ol>
*
* <p>If the credentials are not found in any of the locations above, an exception will be thrown at {@link #build()}
* time.
* </p>
* <p>
* If Minimum part size is not specified then by default, it is 8MB
* </p>
*
* <p>
* If target throughput for transfer requests provided, the TransferManager
* will calculate the optional number of connections based on {@link #targetThroughputInGbps}.
* If the value is too low, the S3TransferManager might not achieve the specified target throughput.
* </p>
*
* <p>
* If maximum number of S3 connections
* </p>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for being unclear, I meant adding the Javadocs to the corresponding methods in the S3CrtAsyncClientBuilder class

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added now

@@ -332,7 +338,7 @@ void downloadDirectory_throwException_shouldCompleteFutureExceptionally() {
void close_shouldCloseUnderlyingResources() {
S3TransferManager transferManager = new DefaultS3TransferManager(mockS3Crt, uploadDirectoryHelper, configuration, downloadDirectoryHelper);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we change to use builder pattern and pass the mockS3Crt client so that we can verify that the provided client is not closed when we close the transfer manager?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added a new test case with Builder

@joviegas joviegas force-pushed the transferManager_client_config_joviegas branch from d4918ac to 87a8c31 Compare June 15, 2022 19:39
@@ -332,8 +339,15 @@ void downloadDirectory_throwException_shouldCompleteFutureExceptionally() {
void close_shouldCloseUnderlyingResources() {
S3TransferManager transferManager = new DefaultS3TransferManager(mockS3Crt, uploadDirectoryHelper, configuration, downloadDirectoryHelper);
transferManager.close();
verify(mockS3Crt).close();
verify(configuration).close();
Copy link
Contributor

@zoewangg zoewangg Jun 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason line 336 is removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed by mistake.. added it back

@joviegas joviegas force-pushed the transferManager_client_config_joviegas branch from 87a8c31 to 6320a00 Compare June 15, 2022 22:14
…f S3ClientConfiguration with builder methods of S3AsyncClient
@joviegas joviegas force-pushed the transferManager_client_config_joviegas branch from 6320a00 to 778e6e3 Compare June 15, 2022 23:06
@sonarcloud
Copy link

sonarcloud bot commented Jun 16, 2022

SonarCloud Quality Gate failed.    Quality Gate failed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 1 Code Smell

78.9% 78.9% Coverage
0.0% 0.0% Duplication

@joviegas joviegas merged commit 8f0fb24 into feature/master/transfermanager-GA Jun 16, 2022
@joviegas joviegas deleted the transferManager_client_config_joviegas branch June 17, 2022 16:58
zoewangg pushed a commit that referenced this pull request Jun 28, 2022
…f S3ClientConfiguration with builder methods of S3AsyncClient (#3247)
zoewangg added a commit that referenced this pull request Dec 14, 2022
* Replacing S3TransferManager interfaces that allowed builder methods of S3ClientConfiguration with builder methods of S3AsyncClient (#3247)

* Added customization in codegen to generate additional builder methods (#3252)

* S3Object based DownloadFilter and removing DownloadFileContext as destination based filter is removed (#3258)

* Moved tm POJO classes to model pckage and tm config classes to config package. Added integration tests for s3 select using S3CrtAsyncClient (#3289)

* Fix broken integ test (#3301)

* S3 Transfer manager renamings based on feedback: (#3297)

1. Rename destinationDirectory to destination.
2. Move DownloadDirectoryRequest.prefix and delimiter to just rely on modifying the list requests.
3. Remove upload directory recursive option in favor of using maxDepth(1).
4. Rename UploadDirectoryRequest's prefix and delimiter to s3Prefix and s3Delimiter.
5. Rename ResumableFileDownload's to* and writeTo* methods to serializeTo*. Remove charsets from write/read methods, and just use UTF-8.
6. Do not base64 encode when writing ResumableFileDownload to disk.

* Allow pausing a resumed download even when the download hasn't already started. (#3300)

* Add POJO classes for upload pause/resume (#3337)

* Refactoring of Transfer manager APIs (#3374)

* Refactoring of Transfer manager APIs

* Merging the integ test failure Pr 2119 from stagging branch

* Add flexible checksum support and update perf tests (#3376)

* Fix flexiblechecksum implementation (#3391)

* [TM upload pause/resume Part 2] Implement pause and resume for uploadFile (#3357)

* Implement pause and resume for uploadFile

* Update Javadocs

* address feedback

* Implement automatic multipart copy functionality in S3 CRT async client (#3403)

* Implement automatic multipart copy functionality in S3 CRT async client

* Add more tests

* fix cancellation logic

* Refactor CopyRequestProvider, fix request conversion and add more tests

* Fix checkstyle

* Transfer Manager tests refactoring (#3420)

* Remove use of Junit4, clean up and consolidate tests in tm module

* Ignoring the test if unicode can't be used as directory name

* Add serialization and deserialization support for ResumableFileUpload (#3432)

* Support serialization and deserialization of ResumableFileUpload

* Address feedback

* Empty json should be unmarshalled to empty map

* Errors should not be wrapped - S3 Transfer Manager (#3433)

* Errors should not be wrapped

* update handleException()

* Changelog entry

* Resolve comments
Update changelog description, refactor handleException(), add test

* Add failed message to SdkException

* Refactor handleException() and format changelog (#3461)

* Fixed an issue where SSEC params were not correctly passed in copy operation (#3464)

* Replace inline snippets with external compilable snippets (#3465)

* Replace inline snippets with external compilable snippets

* Fix build and address feedback

* Fix build

* Only enable CRT checksum for getObject and putObject (#3477)

* Only use CRT flexible checksum for getObject and putObject

* Fix build

* Fix integ tests set up and tear down steps (#3485)

* Enable backpressure in TM (#3533)

* integrate with crt s3 flow control

* Update benchmark code

* Add backpressure config

* Change window size

* Update initial window size

* Change intial window size

* Use heap max memory for initial window size

* Give some buffer

* change window size

* Make read buffer size configurable

* Log result to a file

* Various updates

* Various updates

* Add CRT benchmark

* Various updates

* Fix checkstyle errors and tests

* Fix flaky test

* Fix checkstyle errors

* Add validation

* Add tests

* For copy operation, always forward multipart copy exception from one … (#3549)

* For copy operation, always forward multipart copy exception from one request to other multipart copy requests

* Minor refactoring in CopyObjectHelper (#3552)

* Add benchmarks for copy, uploadDirectory and downloadDirectory (#3551)

* Add benchmarks for copy, uploadDirectory and downloadDirectory

* Update sample code and fix snippet path (#3567)

* Update sample code and fix snippet path

* Fix link

* Integrate with CRT checksum fix (#3566)

* Integrate with CRT checksum fix

* Rename sourceDirectory to source and add S3AsycncClient#crtCreate (#3572)

* Rname sourceDirectory to source and add S3AsycncClient#crtCreate

* Use ByteBufferStoringSubscriber (#3581)

* Use ByteBufferStoringSubscriber

* Add a comment

* Create constant for bytes bufferred

* Increase chunk size for file upload (#3583)

* Rename S3TransferManager.build().maxDepth to uploadDirectoryMaxDepth, rename S3TransferManager.builder().s3AsyncClient to .s3Client (#3584)

* Fixed an issue where sdkRepsonse is not present in the ProgressSnapshot for upload and copy (#3585)

* Throw UnsupportedOperationException if a user tries to pause a upload… (#3586)

* Throw UnsupportedOperationException if a user tries to pause a upload with non CRT-based S3 client

* Use SimplePublisher (#3594)

* Update documentation for Transfer Manager (#3592)

* Update javadoc

* Integrate with latest CRT pause/resume fix (#3588)

* Integrate with latest CRT pause/resume fix
* Bump CRT version

* Fixed an issue that could result in uncompletable future when headObject request threw exception in copy (#3609)

* Make crt dependency optional in transfer manager module (#3613)

* Make aws-crt an optional dependency in s3-transfer-manager module.

* Update README

* Fix category for changelog entries

Co-authored-by: John Viegas <70235430+joviegas@users.noreply.github.com>
Co-authored-by: Matthew Miller <millem@amazon.com>
Co-authored-by: David Ho <70000000+davidh44@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants