
[FLINK-13044][s3][fs] Fix handling of relocated amazon classes #9318

Merged · 1 commit · Aug 6, 2019

Conversation

zentol (Contributor) commented Aug 1, 2019

To provide credentials to S3, users may configure a credentials provider. For providers from Amazon (which are relocated) we allow users to configure the original class name and relocate it manually in the S3 filesystem factories.

The factories, however,

  • used the wrong shading pattern
  • could not correctly detect providers with a "com.amazon*" prefix, as the String constant containing this prefix was also being relocated.

This commit obfuscates the constant to prevent relocations, fixes the shading patterns, and sets up e2e tests to cover this case.
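
For illustration, here is a minimal sketch (not the actual Flink code; class, constant, and prefix names are assumptions) of the two ideas above: mapping a user-configured AWS class name to its relocated name inside the factory, and assembling the "com.amazonaws" prefix at runtime so the shade plugin's constant relocation cannot rewrite it.

public final class ProviderRelocationSketch {

    // Assembled at runtime (String.join is not a compile-time constant), so the
    // maven-shade-plugin never sees the literal "com.amazonaws" in the constant pool.
    private static final String AWS_PREFIX = String.join(".", "com", "amazonaws") + ".";

    // Hypothetical shaded prefix; the real one is defined by the relocation pattern in the pom.
    private static final String SHADED_AWS_PREFIX =
            "org.apache.flink.fs.s3base.shaded.com.amazonaws.";

    /** Maps a user-configured AWS credentials provider class name to its relocated name. */
    static String relocateAwsClass(String configuredClassName) {
        return configuredClassName.startsWith(AWS_PREFIX)
                ? SHADED_AWS_PREFIX + configuredClassName.substring(AWS_PREFIX.length())
                : configuredClassName;
    }
}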

I've triggered a cron job:

flinkbot (Collaborator) commented Aug 1, 2019

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Automated Checks

Last check on commit d244ab3 (Tue Aug 06 19:44:16 UTC 2019)

Warnings:

  • No documentation files were touched! Remember to keep the Flink docs up to date!

Mention the bot in a comment to re-run the automated checks.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.


The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.

Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

flinkbot (Collaborator) commented Aug 1, 2019

CI report:

@@ -72,5 +72,8 @@ run_test "SQL Client end-to-end test for Kafka 0.10" "$END_TO_END_DIR/test-scrip
run_test "SQL Client end-to-end test for Kafka 0.11" "$END_TO_END_DIR/test-scripts/test_sql_client_kafka011.sh"
run_test "SQL Client end-to-end test for modern Kafka" "$END_TO_END_DIR/test-scripts/test_sql_client_kafka.sh"

run_test "Shaded Hadoop S3A with credentials provider end-to-end test" "$END_TO_END_DIR/test-scripts/test_batch_wordcount.sh hadoop_with_provider"
run_test "Shaded Presto S3 with credentials provider end-to-end test" "$END_TO_END_DIR/test-scripts/test_batch_wordcount.sh presto_with_provider"
Contributor

Afaik, the run-nightly-tests.sh is not used in CI runs, but it's mentioned in the README.md.
I think it would be nice to add those new tests to run-nightly-tests.sh too, to keep the file in sync with the test splits.

Contributor Author

The run-nightly-tests.sh should actually be removed entirely. It's already out of sync with the tests that actually run, and acts as a trap for committers who aren't that familiar with the Travis setup. Multiple tests defined in there are missing from the splits, and this is likely not intentional.

Contributor Author

If anything it should be reworked to call into all splits.

1u0 (Contributor) left a comment

I have run the new hadoop_with_provider e2e test locally. For the Presto e2e test, I had issues running it locally before.

+1, after the tests pass in CI.

zentol (Contributor Author) commented Aug 4, 2019

According to the CI results there's a shading issue in the Presto filesystem. When I set up the E2E test to use a credentials provider I run into a NoSuchMethodException:

Caused by: java.lang.RuntimeException: Error creating an instance of org.apache.flink.fs.s3base.shaded.com.amazonaws.auth.EnvironmentVariableCredentialsProvider for URI s3://[secure]/static/words
	at org.apache.flink.fs.s3presto.shaded.com.facebook.presto.hive.s3.PrestoS3FileSystem.getCustomAWSCredentialsProvider(PrestoS3FileSystem.java:724)
	at org.apache.flink.fs.s3presto.shaded.com.facebook.presto.hive.s3.PrestoS3FileSystem.getAwsCredentialsProvider(PrestoS3FileSystem.java:708)
	at org.apache.flink.fs.s3presto.shaded.com.facebook.presto.hive.s3.PrestoS3FileSystem.createAmazonS3Client(PrestoS3FileSystem.java:632)
	at org.apache.flink.fs.s3presto.shaded.com.facebook.presto.hive.s3.PrestoS3FileSystem.initialize(PrestoS3FileSystem.java:216)
	at org.apache.flink.fs.s3.common.AbstractS3FileSystemFactory.create(AbstractS3FileSystemFactory.java:126)
	... 28 more
Caused by: java.lang.NoSuchMethodException: org.apache.flink.fs.s3base.shaded.com.amazonaws.auth.EnvironmentVariableCredentialsProvider.<init>(java.net.URI, org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.conf.Configuration)
	at java.lang.Class.getConstructor0(Class.java:3082)
	at java.lang.Class.getConstructor(Class.java:1825)
	at org.apache.flink.fs.s3presto.shaded.com.facebook.presto.hive.s3.PrestoS3FileSystem.getCustomAWSCredentialsProvider(PrestoS3FileSystem.java:720)
	... 32 more

Branch: https://github.com/apache/flink/tree/s3_test2

1u0 (Contributor) commented Aug 5, 2019

It looks like it's another issue with the Presto fs:
com.amazonaws.auth.EnvironmentVariableCredentialsProvider is not a suitable AWSCredentialsProvider for com.facebook.presto.hive.s3.PrestoS3FileSystem.

From Prestodb documentation:

You can configure a custom S3 credentials provider by setting the Hadoop configuration property presto.s3.credentials-provider to be the fully qualified class name of a custom AWS credentials provider implementation. This class must implement the AWSCredentialsProvider interface and provide a two-argument constructor that takes a java.net.URI and a Hadoop org.apache.hadoop.conf.Configuration as arguments.

The versions of EnvironmentVariableCredentialsProvider we are using have only a trivial (no-argument) constructor.
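
To make the mismatch concrete, here is a minimal sketch (using unshaded class names; a real provider inside the shaded filesystem jar would have to reference the relocated AWS and Hadoop classes instead) of the constructor shape PrestoS3FileSystem looks up via reflection, which EnvironmentVariableCredentialsProvider does not offer:

import java.net.URI;

import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.AWSCredentialsProvider;
import org.apache.hadoop.conf.Configuration;

// Sketch only: demonstrates the (URI, Configuration) constructor that Presto requires.
public class TwoArgCredentialsProvider implements AWSCredentialsProvider {

    public TwoArgCredentialsProvider(URI uri, Configuration conf) {
        // PrestoS3FileSystem.getCustomAWSCredentialsProvider() resolves exactly this
        // two-argument constructor via reflection -- hence the NoSuchMethodException above.
    }

    @Override
    public AWSCredentials getCredentials() {
        throw new UnsupportedOperationException("sketch only");
    }

    @Override
    public void refresh() {}
}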

zentol (Contributor Author) commented Aug 6, 2019

@1u0 Do you know by chance any credentials provider that we could use?

A custom implementation cannot work since it wouldn't use the relocated amazon dependencies; if there is no compatible provider in the presto/amazon dependencies then this feature is simply broken.

zentol (Contributor Author) commented Aug 6, 2019

The only ones I could find so far are org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider and org.apache.hadoop.fs.s3a.auth.AssumedRoleCredentialProvider, but they both load hadoop properties with an fs.s3a prefix, which we don't mirror from the Flink configuration.
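
For illustration only, a hypothetical sketch of the missing mirroring step: copying Flink options into the Hadoop Configuration under the fs.s3a prefix that these providers read. The "s3." source prefix, the key names, and the surrounding wiring are assumptions, not Flink's current behavior.

import java.util.Map;

import org.apache.hadoop.conf.Configuration;

// Hypothetical: mirror Flink options with an "s3." prefix into Hadoop "fs.s3a." keys
// so that providers like SimpleAWSCredentialsProvider could pick them up.
final class S3aConfigMirroringSketch {

    static Configuration mirrorIntoHadoopConf(Map<String, String> flinkOptions) {
        Configuration hadoopConf = new Configuration();
        for (Map.Entry<String, String> option : flinkOptions.entrySet()) {
            if (option.getKey().startsWith("s3.")) {
                // e.g. "s3.access-key" -> "fs.s3a.access-key" (illustrative key names)
                hadoopConf.set("fs.s3a." + option.getKey().substring(3), option.getValue());
            }
        }
        return hadoopConf;
    }
}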

zentol (Contributor Author) commented Aug 6, 2019

Given that this isn't new, and that the removal of the shading logic in 1.10 would provide simple workarounds, I would be fine with excluding Presto from this PR and just fixing it for s3-hadoop.

"Fixing" this for presto would effectively be adding a new feature.

1u0 (Contributor) commented Aug 6, 2019

Personally I'm +1 for removing the end2end test for Presto in this PR.

If I understand correctly, there is no need to revert the production code changes for Presto, as the (modified) settings have no effect (before and after the change).

Also +1 to address Presto "auto-prefix" as a separate effort.

@@ -76,5 +76,7 @@ run_test "SQL Client end-to-end test for modern Kafka" "$END_TO_END_DIR/test-scr

run_test "Dependency shading of table modules test" "$END_TO_END_DIR/test-scripts/test_table_shaded_dependencies.sh"

run_test "Shaded Hadoop S3A with credentials provider end-to-end test" "$END_TO_END_DIR/test-scripts/test_batch_wordcount.sh hadoop_with_provider"
1u0 (Contributor) commented Aug 6, 2019

Unrelated to the changes in this PR, but could you also move:

run_test "Shaded Hadoop S3A end-to-end test" "$END_TO_END_DIR/test-scripts/test_batch_wordcount.sh hadoop"
run_test "Shaded Presto S3 end-to-end test" "$END_TO_END_DIR/test-scripts/test_batch_wordcount.sh presto"

from run-pre-commit-tests.sh to here, so they are aggregated in one place?

Apparently, the above tests are not actually run during the pre-commit test run:

==============================================================================
Running 'Shaded Hadoop S3A end-to-end test'
==============================================================================
...
Did not find AWS environment variables, NOT running the e2e test.
...
[PASS] 'Shaded Hadoop S3A end-to-end test' passed after 0 minutes and 0 seconds! Test exited with exit code 0.

==============================================================================
Running 'Shaded Presto S3 end-to-end test'
==============================================================================
...
Did not find AWS environment variables, NOT running the e2e test.
...
[PASS] 'Shaded Presto S3 end-to-end test' passed after 0 minutes and 0 seconds! Test exited with exit code 0.

Contributor Author

The s3 e2e tests rely on environment variables that are only available on actual branches, not pull requests. On a branch (like master) these tests will be run on each commit.
