HADOOP-18073. Upgrade AWS SDK to v2 in S3A #5980

steveloughran

Description of PR

Aggregate PR of the commits needed to move Hadoop trunk to the AWS SDK v2.

These will be merged as a chain of commits; they are being pushed through Yetus as a single large PR to see how it reacts.

How was this patch tested?

Tested against S3 London with `-Dprefetch -Dscale`.

Found one regression, HADOOP-18853, which affects one test and can be fixed later.

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

xinglin and others added 30 commits July 20, 2023 10:46
…chan Yoon.

Reviewed-by: Tao Li <tomscut@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
…ache#5822). Contributed by farmmamba.

Reviewed-by: zhangshuyan <zqingchai@gmail.com>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
…. Contributed by Shuyan Zhang.

Reviewed-by: hfutatzhanghb <1036798979@qq.com>
Reviewed-by: Tao Li <tomscut@apache.org>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
…ontributed by Zhaohui Wang.

Reviewed-by: Simbarashe Dzinamarira <sdzinamarira@linkedin.com>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
…y Ayush Saxena.

Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org>
…bserver reads enabled. (apache#5860). Contributed by Simbarashe Dzinamarira.
…. Contributed by Hualong Zhang.

Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
Contributed By: Ahmar Suhail <ahmarsu@amazon.co.uk>
…BlockManager processMisReplicatesAsync. (apache#5877). Contributed by Haiyang Hu.

Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
…RouterSafemodeService (apache#5876). Contributed by Haiyang Hu.

Reviewed-by: hfutatzhanghb <1036798979@qq.com>
Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
…ks (apache#5904)

Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org>
Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Tao Li <tomscut@apache.org>
…cyDefault Class (apache#5907)

Co-authored-by: huangzhaobo <huangzhaobo99@126.com>
Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Tao Li <tomscut@apache.org>
Co-authored-by: Benjamin Teke <bteke@cloudera.com>
…pache#5900). Contributed by Shuyan Zhang.

Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
…t may cause transitive dependency issue with 2.12.7 (apache#5884)
hchaverri and others added 29 commits August 8, 2023 07:45
…che so tokens are updated frequently. (apache#5897) Contributed by Hector Sandoval Chaverri.

Reviewed-by: Simbarashe Dzinamarira <sdzinamarira@linkedin.com>
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
…a block logic when set decrease replication. (apache#5913). Contributed by Haiyang Hu.

Reviewed-by: Tao Li <tomscut@apache.org>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
…rPolicyConfiguration Of Queues. (apache#5862) Contributed by Shilun Fan.

Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
…ed by Liangjun He.

Reviewed-by: Shilun Fan <slfan1989@apache.org>
Reviewed-by: Xing Lin <linxingnku@gmail.com>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
…ommitted-allowed. (apache#5933)

Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Tao Li <tomscut@apache.org>
…Contributed by Shuyan Zhang.

Reviewed-by: hfutatzhanghb <1036798979@qq.com>
Reviewed-by: Haiyang Hu <haiyang.hu@shopee.com>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
…ache#5941). Contributed by Shuyan Zhang.

Reviewed-by: Haiyang Hu <haiyang.hu@shopee.com>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org>
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
Signed-off-by: Tao Li <tomscut@apache.org>
…istent reads. (apache#5951)

Reviewed-by: Simbarashe Dzinamarira <sdzinamarira@linkedin.com>
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
…Shilun Fan.

Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
apache#5938). Contributed by Shuyan Zhang.

Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
…5.2 as it may cause transitive dependency issue with 2.12.7" (apache#5969)

This reverts commit 35af8b9.
See aws_sdk_v2_changelog.md for details.

Co-authored-by: Ahmar Suhail <ahmarsu@amazon.co.uk>
Co-authored-by: Alessandro Passaro <alexpax@amazon.co.uk>

HADOOP-18073. Address review comments. (apache#31)

addresses review comments + yetus errors

Co-authored-by: Ahmar Suhail <ahmarsu@amazon.co.uk>

Move MultiObjectDeleteException to impl

Reinstate old constants

Move TransferManager initialization to ClientFactory

Add unit tests for BlockingEnumeration

Add unit tests for SelectEventStreamPublisher

updates new providers in TestS3AAWSCredentialsProvider to V2

update GET range referrer header logic to V2

adds in unit check for bytes

HADOOP-18565. Complete outstanding items for the AWS SDK V2 upgrade. (apache#5421)

Changes include
* use bundled transfer manager
* adds transfer listener to upload
* adds support for custom signers
* don't set default endpoint
* removes v1 sdk bundle, only use core package
* implements region caching
+ many more
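
The commit message does not say how region caching is implemented. As a rough illustration of the general idea only, the sketch below memoizes a bucket-to-region lookup so that the expensive call is made at most once per bucket. The class name `BucketRegionCache` and the injected `lookup` function are hypothetical; this is not the S3A connector's code.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/**
 * Hypothetical sketch of a per-bucket region cache.
 * The lookup function stands in for whatever remote call
 * resolves a bucket's region; it is supplied by the caller.
 */
final class BucketRegionCache {

  /** Cached results: bucket name -> region name. */
  private final ConcurrentMap<String, String> regions =
      new ConcurrentHashMap<>();

  /** Caller-supplied (assumed expensive) region resolver. */
  private final Function<String, String> lookup;

  BucketRegionCache(Function<String, String> lookup) {
    this.lookup = lookup;
  }

  /**
   * Return the region for a bucket, invoking the resolver only
   * when no value has been cached for that bucket yet.
   * A null result from the resolver is not cached.
   */
  String regionOf(String bucket) {
    return regions.computeIfAbsent(bucket, lookup);
  }
}
```

Using `ConcurrentHashMap.computeIfAbsent` keeps the cache safe for concurrent callers without explicit locking, at the cost of briefly blocking other updates to the same map bin while the resolver runs.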

Note: spotbugs is warning about inconsistent
synchronization in accessing a new s3a FS field.
This will be fixed in a follow-up patch.

Contributed by Ahmar Suhail
This removes the AWS V1 SDK as a hadoop-aws runtime dependency.

It is still used at compile time to build a wrapper class,
V1ToV2AwsCredentialProviderAdapter, which allows v1 credential providers
to be used for authentication.
All well-known credential providers have their classnames remapped from
v1 to v2 classes prior to instantiation; this wrapper is not needed
for them.
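
A minimal sketch of the remapping idea, assuming a simple lookup table keyed by v1 classname. The class `CredentialProviderRemapper`, its methods, and the two table entries are illustrative assumptions, not the list shipped in hadoop-aws; the SDK classnames themselves are real.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/**
 * Illustrative sketch only: remap well-known v1 credential provider
 * classnames to v2 equivalents before instantiation.
 * The table is an assumption, not the S3A connector's actual list.
 */
final class CredentialProviderRemapper {

  /** Hypothetical remapping table: v1 classname -> v2 classname. */
  private static final Map<String, String> V1_TO_V2;

  static {
    Map<String, String> m = new HashMap<>();
    m.put("com.amazonaws.auth.EnvironmentVariableCredentialsProvider",
        "software.amazon.awssdk.auth.credentials"
            + ".EnvironmentVariableCredentialsProvider");
    m.put("com.amazonaws.auth.profile.ProfileCredentialsProvider",
        "software.amazon.awssdk.auth.credentials"
            + ".ProfileCredentialsProvider");
    V1_TO_V2 = Collections.unmodifiableMap(m);
  }

  private CredentialProviderRemapper() {
  }

  /** Return the v2 name for a listed v1 provider, else the name unchanged. */
  static String remap(String className) {
    return V1_TO_V2.getOrDefault(className, className);
  }

  /**
   * True if the class is still in the v1 SDK package after remapping,
   * which in this sketch means it would need wrapping in an adapter.
   */
  static boolean needsV1Adapter(String className) {
    return remap(className).startsWith("com.amazonaws.");
  }
}
```

In this sketch, a configured name such as `com.amazonaws.auth.EnvironmentVariableCredentialsProvider` resolves straight to its v2 class, while an unlisted class under `com.amazonaws.` is flagged as needing the adapter.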

There is no support for migrating other SDK plugin points
(signing, handlers).

Access to the v2 S3Client class used by an S3A FileSystem
instance is now via a new interface org.apache.hadoop.fs.s3a.S3AInternals;
other low-level operations (getObjectMetadata(Path)) have moved.

Contributed by Steve Loughran
Upgrades the AWS SDK v2 version to 2.20.28.

This
* adds multipart COPY/rename in the java async client
* removes the aws-crt JAR dependency

Contributed by Ahmar Suhail