
HADOOP-19066. S3A: AWS SDK V2 - Enabling FIPS should be allowed with central endpoint #6539

Merged
1 commit merged into apache:trunk on Mar 12, 2024

Conversation

virajjasani
Contributor

Jira: HADOOP-19066

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 17m 41s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 46m 25s trunk passed
+1 💚 compile 0m 41s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 31s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 0m 29s trunk passed
+1 💚 mvnsite 0m 40s trunk passed
+1 💚 javadoc 0m 24s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 31s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 9s trunk passed
+1 💚 shadedclient 37m 30s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 29s the patch passed
+1 💚 compile 0m 33s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 33s the patch passed
+1 💚 compile 0m 25s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 javac 0m 25s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 20s the patch passed
+1 💚 mvnsite 0m 30s the patch passed
+1 💚 javadoc 0m 15s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 24s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 6s the patch passed
+1 💚 shadedclient 37m 18s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 53s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 35s The patch does not generate ASF License warnings.
154m 41s
Subsystem Report/Notes
Docker ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6539/1/artifact/out/Dockerfile
GITHUB PR #6539
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint
uname Linux e0ea602ff831 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / fa91891
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6539/1/testReport/
Max. process+thread count 580 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6539/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@virajjasani
Contributor Author

Tested against us-west-2 bucket with endpoints: s3.amazonaws.com and s3-us-west-2.amazonaws.com:

mvn clean verify -Dparallel-tests -DtestsThreadCount=8 -Dscale -Dprefetch

@virajjasani
Contributor Author

@ahmarsuhail @mukund-thakur could you please review this PR?

@steveloughran (Contributor) left a comment

+1; let's merge and see if any regressions surface

@steveloughran steveloughran merged commit 44c14ed into apache:trunk Mar 12, 2024
4 checks passed
@steveloughran
Contributor

(testing cherrypick; if all is good will merge to 3.4.x)

@steveloughran
Contributor

Not good on branch-3.4; we need a followup, I'm afraid. Leaving it in trunk rather than reverting for now, as the other tests all seem happy.

@virajjasani
Contributor Author

I will re-run the test suite and follow up.

@steveloughran
Contributor

steveloughran commented Mar 12, 2024

Looking at my current settings, I've set the endpoint to London but left the region unset; this checks that the classic binding mechanism still works.


  <property>
    <name>fs.s3a.bucket.stevel-london.endpoint</name>
    <value>${london.endpoint}</value>
  </property>

  <property>
    <name>X.fs.s3a.bucket.stevel-london.endpoint.region</name>
    <value>${london.region}</value>
  </property>

@virajjasani
Contributor Author

virajjasani commented Mar 12, 2024

Rebasing and rebuilding both trunk and branch-3.4 before re-running the tests.

@virajjasani
Copy link
Contributor Author

virajjasani commented Mar 12, 2024

Something seems odd. This test overrides endpoint/region configs so setting any endpoint/region should have made no difference:

  @Test
  public void testCentralEndpointAndNullRegionFipsWithCRUD() throws Throwable {
    describe("Access the test bucket using central endpoint and"
        + " null region and fips enabled, perform file system CRUD operations");
    final Configuration conf = getConfiguration();

    final Configuration newConf = new Configuration(conf);

    removeBaseAndBucketOverrides(
        newConf,
        ENDPOINT,
        AWS_REGION,
        FIPS_ENDPOINT);

    newConf.set(ENDPOINT, CENTRAL_ENDPOINT);
    newConf.setBoolean(FIPS_ENDPOINT, true);

    newFS = new S3AFileSystem();
    newFS.initialize(getFileSystem().getUri(), newConf);

    assertOpsUsingNewFs();
  }

I tested using these settings and there is no difference in behaviour because the test overrides base and bucket configs for endpoint/region.
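The base-vs-bucket override behaviour can be sketched as follows. This is a simplified model of the s3a per-bucket configuration convention (`fs.s3a.bucket.BUCKET.suffix` winning over `fs.s3a.suffix`), not the actual Hadoop code; the class and method names here are illustrative only.

```java
import java.util.HashMap;
import java.util.Map;

class BucketOverrideSketch {
    // Simplified model: a per-bucket key fs.s3a.bucket.BUCKET.SUFFIX,
    // when present, wins over the base key fs.s3a.SUFFIX.
    static String resolve(Map<String, String> conf, String bucket, String suffix) {
        String perBucket = conf.get("fs.s3a.bucket." + bucket + "." + suffix);
        return perBucket != null ? perBucket : conf.get("fs.s3a." + suffix);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("fs.s3a.endpoint", "s3.amazonaws.com");
        conf.put("fs.s3a.bucket.stevel-london.endpoint", "s3.eu-west-2.amazonaws.com");
        // The per-bucket value wins over the base value:
        System.out.println(resolve(conf, "stevel-london", "endpoint"));
        // A removeBaseAndBucketOverrides-style cleanup drops both keys,
        // so whatever the test then sets is the only value visible:
        conf.remove("fs.s3a.endpoint");
        conf.remove("fs.s3a.bucket.stevel-london.endpoint");
        System.out.println(resolve(conf, "stevel-london", "endpoint"));
    }
}
```

This is why setting any endpoint or region in the local config should make no difference to this test: both the base and bucket-level keys are removed before the test's own values are applied.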

I tried:

  1. endpoint: us-west-2, region: unset
  2. endpoint: central, region: unset
  3. endpoint: unset, region: unset
  4. endpoint: us-west-2, region: us-west-2

From the stacktrace from Jira:

[ERROR] Tests run: 18, Failures: 0, Errors: 1, Skipped: 1, Time elapsed: 56.26 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.ITestS3AEndpointRegion
[ERROR] testCentralEndpointAndNullRegionFipsWithCRUD(org.apache.hadoop.fs.s3a.ITestS3AEndpointRegion)  Time elapsed: 4.821 s  <<< ERROR!
java.net.UnknownHostException: getFileStatus on s3a://stevel-london/test/testCentralEndpointAndNullRegionFipsWithCRUD/srcdir: software.amazon.awssdk.core.exception.SdkClientException: Received an UnknownHostException when attempting to interact with a service. See cause for the exact endpoint that is failing to resolve. If this is happening on an endpoint that previously worked, there may be a network connectivity issue or your DNS cache could be storing endpoints for too long.:    software.amazon.awssdk.core.exception.SdkClientException: Received an UnknownHostException when attempting to interact with a service. See cause for the exact endpoint that is failing to resolve. If this is happening on an endpoint that previously worked, there may be a network connectivity issue or your DNS cache could be storing endpoints for too long.: stevel-london.s3-fips.eu-west-2.amazonaws.com
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.hadoop.fs.s3a.impl.ErrorTranslation.wrapWithInnerIOE(ErrorTranslation.java:182)
	at org.apache.hadoop.fs.s3a.impl.ErrorTranslation.maybeExtractIOException(ErrorTranslation.java:152)
	at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:207)
	at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:155)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:4066)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3922)
	at org.apache.hadoop.fs.s3a.S3AFileSystem$MkdirOperationCallbacksImpl.probePathStatus(S3AFileSystem.java:3794)
	at org.apache.hadoop.fs.s3a.impl.MkdirOperation.probePathStatusOrNull(MkdirOperation.java:173)
	at org.apache.hadoop.fs.s3a.impl.MkdirOperation.getPathStatusExpectingDir(MkdirOperation.java:194)
	at org.apache.hadoop.fs.s3a.impl.MkdirOperation.execute(MkdirOperation.java:108)
	at org.apache.hadoop.fs.s3a.impl.MkdirOperation.execute(MkdirOperation.java:57)
	at org.apache.hadoop.fs.s3a.impl.ExecutingStoreOperation.apply(ExecutingStoreOperation.java:76)
	at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
	at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
	at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2707)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2726)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.mkdirs(S3AFileSystem.java:3766)
	at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:2494)
	at org.apache.hadoop.fs.s3a.ITestS3AEndpointRegion.assertOpsUsingNewFs(ITestS3AEndpointRegion.java:461)
	at org.apache.hadoop.fs.s3a.ITestS3AEndpointRegion.testCentralEndpointAndNullRegionFipsWithCRUD(ITestS3AEndpointRegion.java:454)

Here, we set:

    removeBaseAndBucketOverrides(
        newConf,
        ENDPOINT,
        AWS_REGION,
        FIPS_ENDPOINT);

    newConf.set(ENDPOINT, CENTRAL_ENDPOINT);
    newConf.setBoolean(FIPS_ENDPOINT, true);

    newFS = new S3AFileSystem();
    newFS.initialize(getFileSystem().getUri(), newConf);

How could the stacktrace show a region different from us-east-2 when the test overrides the endpoint to central and removes the region?

@virajjasani
Contributor Author

virajjasani commented Mar 12, 2024

Just created a bucket in London and now I can reproduce the failure; checking.
It fails for the London bucket, even if the central endpoint is used.

@virajjasani
Contributor Author

virajjasani commented Mar 12, 2024

The issue seems to be with the FIPS cases.

FIPS enabled and:

  1. bucket created in Oregon, S3 client configured with the us-east-2 region, cross-region access enabled, and no endpoint override: things look good
  2. bucket created in London, same client configuration: fails with
     Caused by: software.amazon.awssdk.core.exception.SdkClientException: Received an UnknownHostException when attempting to interact with a service. See cause for the exact endpoint that is failing to resolve. If this is happening on an endpoint that previously worked, there may be a network connectivity issue or your DNS cache could be storing endpoints for too long.
  3. bucket created in Paris, same client configuration: fails with the same SdkClientException

All of the above cases pass if FIPS is disabled.

Will create an SDK issue soon.

@virajjasani
Contributor Author

Oh wait, FIPS is only for US and Canada endpoints. The above error is legit.

Let me provide an addendum to ignore the test if non-US or Canada endpoints are used.
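Such a guard might look like the following sketch. This is a hypothetical helper, not the actual addendum code in #6624; the region-prefix check assumes FIPS endpoints exist only for US and Canada regions, per the comment above.

```java
class FipsRegionGuard {
    // Assumption: FIPS endpoints are provided only for US and Canada
    // AWS regions (us-*, including us-gov-*, and ca-*).
    static boolean isFipsSupported(String region) {
        return region.startsWith("us-") || region.startsWith("ca-");
    }

    public static void main(String[] args) {
        System.out.println(isFipsSupported("us-west-2"));  // Oregon: FIPS available
        System.out.println(isFipsSupported("eu-west-2"));  // London: skip the test
    }
}
```

In the JUnit test this check would feed an assume-style skip rather than a failure, so buckets in unsupported regions are ignored instead of reported as errors.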

@virajjasani
Contributor Author

Addendum PR: #6624

@steveloughran
Contributor

Thanks. Always good to have a broad set of test configs amongst other devs, especially now there are things like S3 Express. Milan, Jakarta, and any other post-2019 region are also trouble, as central DNS doesn't resolve bucket names such as stevel-milan.s3.amazonaws.com.
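The DNS distinction being made here can be illustrated by the two virtual-hosted-style hostnames in play. A minimal sketch, assuming the standard S3 hostname patterns (legacy central form vs. regional form); for post-2019 regions only the regional name resolves:

```java
class S3HostnameSketch {
    // Legacy/central virtual-hosted-style hostname.
    static String centralHost(String bucket) {
        return bucket + ".s3.amazonaws.com";
    }

    // Regional virtual-hosted-style hostname.
    static String regionalHost(String bucket, String region) {
        return bucket + ".s3." + region + ".amazonaws.com";
    }

    public static void main(String[] args) {
        // For a bucket in a post-2019 region such as Milan (eu-south-1),
        // the central name does not resolve in DNS; the regional one does.
        System.out.println(centralHost("stevel-milan"));
        System.out.println(regionalHost("stevel-milan", "eu-south-1"));
    }
}
```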

asfgit pushed a commit that referenced this pull request Mar 13, 2024
…central endpoint (#6539)

Includes HADOOP-19066. Run FIPS test for valid bucket locations (ADDENDUM) (#6624)

FIPS is only supported in North America AWS regions; relevant tests in
ITestS3AEndpointRegion are skipped for buckets with different endpoints/regions.

Contributed by Viraj Jasani
@virajjasani
Contributor Author

Post 2019 region are also trouble as central DNS doesn't resolve bucket names

indeed, that is also problematic.

@steveloughran
Contributor

FWIW, a real problem is that the v2 SDK retries on UnknownHostException until timeout, and that inner exception is lost.

I see there may be ways in the AWS SDK to restrict which exceptions are retried; that probably merits investigation.
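The problem can be sketched with a generic retry loop; this is illustrative only and not the SDK's actual retry policy. Retrying every exception until the attempt budget is exhausted buries the root cause, whereas treating UnknownHostException as non-retryable surfaces it immediately:

```java
import java.net.UnknownHostException;
import java.util.concurrent.Callable;

class RetrySketch {
    // Retry up to maxAttempts, but fail fast on exceptions considered
    // non-retryable so the root cause is not lost in retries.
    static <T> T withRetries(Callable<T> op, int maxAttempts) throws Exception {
        Exception last = null;
        for (int i = 0; i < maxAttempts; i++) {
            try {
                return op.call();
            } catch (UnknownHostException e) {
                throw e;       // fail fast: a bad hostname will not fix itself
            } catch (Exception e) {
                last = e;      // assumed transient: retry
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        try {
            withRetries(() -> {
                throw new UnknownHostException(
                    "stevel-london.s3-fips.eu-west-2.amazonaws.com");
            }, 5);
        } catch (Exception e) {
            // The original failure surfaces on the first attempt.
            System.out.println(e.getClass().getSimpleName() + ": " + e.getMessage());
        }
    }
}
```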
