HADOOP-18610. ABFS OAuth2 Token Provider support for Azure Workload Identity #5953

Closed
creste wants to merge 14 commits

@creste commented Aug 16, 2023

Description of PR

Add support for Azure Active Directory (Azure AD) workload identities, which integrate with Kubernetes's native capabilities to federate with any external identity provider.

This PR is based on Haifeng Chen's patch attached to HADOOP-18610. I fixed a few typos and linter errors but did not modify the core functionality.

How was this patch tested?

A new ABFS OAuth test configuration was added for WorkloadIdentityTokenProvider. The complete test suite was run against Azure Blob Storage in the Central US region.


:::: AGGREGATED TEST RESULT ::::

HNS-OAuth

[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] TestAccountConfiguration.testConfigPropNotFound:386->testMissingConfigKey:399 Expected a org.apache.hadoop.fs.azurebfs.contracts.exceptions.TokenAccessProviderException to be thrown, but got the result: : "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
[INFO]
[ERROR] Tests run: 141, Failures: 1, Errors: 0, Skipped: 4
[INFO] Results:
[INFO]
[WARNING] Tests run: 587, Failures: 0, Errors: 0, Skipped: 99
[INFO] Results:
[INFO]
[ERROR] Errors:
[ERROR] ITestAbfsTerasort.test_110_teragen:244->executeStage:206 » TestTimedOut test t...
[INFO]
[ERROR] Tests run: 339, Failures: 0, Errors: 1, Skipped: 56

HNS-SharedKey

[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] TestAccountConfiguration.testConfigPropNotFound:386->testMissingConfigKey:399 Expected a org.apache.hadoop.fs.azurebfs.contracts.exceptions.TokenAccessProviderException to be thrown, but got the result: : "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
[ERROR] Errors:
[ERROR] TestExponentialRetryPolicy.testThrottlingIntercept:106 » KeyProvider Failure t...
[INFO]
[ERROR] Tests run: 141, Failures: 1, Errors: 1, Skipped: 5
[INFO] Results:
[INFO]
[WARNING] Tests run: 587, Failures: 0, Errors: 0, Skipped: 68
[INFO] Results:
[INFO]
[ERROR] Errors:
[ERROR] ITestAbfsTerasort.test_110_teragen:244->executeStage:206 » TestTimedOut test t...
[INFO]
[ERROR] Tests run: 339, Failures: 0, Errors: 1, Skipped: 43

NonHNS-SharedKey

[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] TestAccountConfiguration.testConfigPropNotFound:386->testMissingConfigKey:399 Expected a org.apache.hadoop.fs.azurebfs.contracts.exceptions.TokenAccessProviderException to be thrown, but got the result: : "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
[ERROR] Errors:
[ERROR] TestExponentialRetryPolicy.testThrottlingIntercept:106 » KeyProvider Failure t...
[INFO]
[ERROR] Tests run: 141, Failures: 1, Errors: 1, Skipped: 11
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] ITestAzureBlobFileSystemCheckAccess.testCheckAccessForAccountWithoutNS:181 Expecting org.apache.hadoop.security.AccessControlException with text "This request is not authorized to perform this operation using this permission.", 403 but got : "void"
[INFO]
[ERROR] Tests run: 587, Failures: 1, Errors: 0, Skipped: 277
[INFO] Results:
[INFO]
[ERROR] Errors:
[ERROR] ITestAbfsTerasort.test_110_teragen:244->executeStage:206 » TestTimedOut test t...
[INFO]
[ERROR] Tests run: 339, Failures: 0, Errors: 1, Skipped: 46

AppendBlob-HNS-OAuth

[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] TestAccountConfiguration.testConfigPropNotFound:386->testMissingConfigKey:399 Expected a org.apache.hadoop.fs.azurebfs.contracts.exceptions.TokenAccessProviderException to be thrown, but got the result: : "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
[INFO]
[ERROR] Tests run: 141, Failures: 1, Errors: 0, Skipped: 4
[INFO] Results:
[INFO]
[ERROR] Errors:
[ERROR] ITestAzureBlobFileSystemLease.testTwoWritersCreateAppendNoInfiniteLease:177->twoWriters:165 » AbfsRestOperation
[INFO]
[ERROR] Tests run: 587, Failures: 0, Errors: 1, Skipped: 99
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] ITestAbfsStreamStatistics.testAbfsStreamOps:140->Assert.assertTrue:42->Assert.fail:89 The actual value of 99 was not equal to the expected value
[ERROR] Errors:
[ERROR] ITestAbfsTerasort.test_110_teragen:244->executeStage:206 » TestTimedOut test t...
[INFO]
[ERROR] Tests run: 339, Failures: 1, Errors: 1, Skipped: 80

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue ID?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 1m 0s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+0 🆗 markdownlint 0m 1s markdownlint was not available.
+0 🆗 yamllint 0m 1s yamllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 45m 21s trunk passed
+1 💚 compile 0m 42s trunk passed with JDK Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04
+1 💚 compile 0m 38s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 checkstyle 0m 35s trunk passed
+1 💚 mvnsite 0m 44s trunk passed
+1 💚 javadoc 0m 44s trunk passed with JDK Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 0m 37s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 1m 10s trunk passed
+1 💚 shadedclient 34m 9s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 31s the patch passed
+1 💚 compile 0m 31s the patch passed with JDK Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04
+1 💚 javac 0m 31s the patch passed
+1 💚 compile 0m 28s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 javac 0m 28s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 22s /results-checkstyle-hadoop-tools_hadoop-azure.txt hadoop-tools/hadoop-azure: The patch generated 1 new + 2 unchanged - 0 fixed = 3 total (was 2)
+1 💚 mvnsite 0m 32s the patch passed
+1 💚 javadoc 0m 27s the patch passed with JDK Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 0m 26s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 1m 4s the patch passed
+1 💚 shadedclient 34m 10s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 10s hadoop-azure in the patch passed.
+1 💚 asflicense 0m 40s The patch does not generate ASF License warnings.
130m 46s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5953/1/artifact/out/Dockerfile
GITHUB PR #5953
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint yamllint
uname Linux 44c8462208e4 4.15.0-212-generic #223-Ubuntu SMP Tue May 23 13:09:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 2632811
Default Java Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.20+8-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5953/1/testReport/
Max. process+thread count 699 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5953/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@anmolanmol1234
Contributor

Add tests around the code piece added.

@creste
Author

creste commented Sep 1, 2023

@anmolanmol1234 - Thank you for the review. I added string constants.

Regarding tests, would you mind providing some guidance on what you are expecting? I added this documentation showing how the code changes can be tested using integration tests. I ran the integration tests and copied the output in the merge request description above.

If you're expecting unit tests instead of integration tests, then I would appreciate specific guidance on how that can be accomplished. This pull request includes changes to these existing classes:

  • AzureADAuthenticator. No unit tests exist for this class and I don't see a simple way to mock the HTTP calls.
  • AbfsConfiguration. No unit tests directly exercise AbfsConfiguration.getTokenProvider(). I see that method called a few times indirectly as part of other tests, though.

This pull request adds a new WorkloadIdentityTokenProvider class that extends AccessTokenProvider. Several other classes extend AccessTokenProvider, including:

  • ClientCredsTokenProvider - Does not have unit tests.
  • CustomTokenProviderAdapter - Has a single unit test with a comment implying E2E integration tests are preferred, but not possible in this case.
  • MsiTokenProvider - Has an integration test class. I'm not sure how this is called when following the instructions for testing Azure.
  • RefreshTokenBasedTokenProvider - Does not have unit tests.
  • UserPasswordTokenProvider - Does not have unit tests.

Given the lack of unit tests in existing code, is it reasonable to conclude the new integration test documentation and test runs are sufficient in this case?

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 57s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+0 🆗 markdownlint 0m 1s markdownlint was not available.
+0 🆗 yamllint 0m 1s yamllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 45m 38s trunk passed
+1 💚 compile 0m 42s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 38s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 checkstyle 0m 35s trunk passed
+1 💚 mvnsite 0m 45s trunk passed
+1 💚 javadoc 0m 43s trunk passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 38s trunk passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 1m 10s trunk passed
+1 💚 shadedclient 33m 49s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 31s the patch passed
+1 💚 compile 0m 32s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 32s the patch passed
+1 💚 compile 0m 28s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 javac 0m 28s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 22s /results-checkstyle-hadoop-tools_hadoop-azure.txt hadoop-tools/hadoop-azure: The patch generated 1 new + 2 unchanged - 0 fixed = 3 total (was 2)
+1 💚 mvnsite 0m 30s the patch passed
+1 💚 javadoc 0m 28s the patch passed with JDK Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 26s the patch passed with JDK Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
+1 💚 spotbugs 1m 3s the patch passed
+1 💚 shadedclient 33m 52s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 11s hadoop-azure in the patch passed.
+1 💚 asflicense 0m 39s The patch does not generate ASF License warnings.
130m 37s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5953/2/artifact/out/Dockerfile
GITHUB PR #5953
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint yamllint
uname Linux 26f85c8852a9 4.15.0-212-generic #223-Ubuntu SMP Tue May 23 13:09:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 088b5c3
Default Java Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_382-8u382-ga-1~20.04.1-b05
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5953/2/testReport/
Max. process+thread count 672 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5953/2/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@fei819

fei819 commented Dec 29, 2023

May I ask why this PR is still not merged? Just curious.

@steven13cooper

@creste Is there any update on this PR? It is something I need too, and I am having similar pain points.

@creste
Author

creste commented Jan 8, 2024

@steven13cooper - I'm currently waiting on @anmolanmol1234 or another reviewer to approve the PR or at least provide guidance on how to add more tests.

@tomscut
Contributor

tomscut commented Jan 11, 2024

Hi @steveloughran and @saxenapranav, could you please help review this PR? Thanks!

Contributor

@saxenapranav left a comment

Let's add some tests around the change.

return expiring;
}

private static String getClientAssertion(String tokenFile)

Contributor

any reason for having it static?

Author

This method doesn't use any member variables so I made it static. I can remove static if you prefer.

Contributor

Let's not make it static, since it is called only from a single non-static method. We can remove the tokenFile argument and let it depend on the object field.

Contributor

Also, the file will be read on every refresh call; let's call it from the constructor instead. That is better because a failed file read will surface at FileSystem creation, and it also saves I/O calls.

Author

> Let's not make it static, since it is called only from a single non-static method. We can remove the tokenFile argument and let it depend on the object field.

Will do.

> Also, the file will be read on every refresh call; let's call it from the constructor instead. That is better because a failed file read will surface at FileSystem creation, and it also saves I/O calls.

Doing that would introduce a bug: the ABFS driver would authenticate successfully the first time, using the assertion read in the constructor, but later refreshes would fail because Azure rotates the token in the file out-of-band. Subsequent invocations of refreshToken would reuse the assertion cached at construction time instead of reading the new token that Azure wrote to tokenFile.

For your reference, this WorkloadIdentityTokenProvider class has logic that is similar to the WorkloadIdentityCredential class in Microsoft's Azure Identity SDK for Java. That class stores the path to the token file (federatedTokenFilePathInput) in the constructor and reads from the token file every time a new token is requested.
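
To make the point above concrete, here is a minimal sketch of the "store the path, read on every request" approach. It is not the code from this PR: `FileBackedAssertionSource`, `currentAssertion()`, and `TokenFileRotationDemo` are hypothetical names chosen for illustration, and the rotation is simulated by overwriting a temporary file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical stand-in for a token provider; not the class from this PR. */
class FileBackedAssertionSource {
  private final Path tokenFile;

  FileBackedAssertionSource(Path tokenFile) {
    // Store only the path. Reading here would cache a value that later goes stale.
    this.tokenFile = tokenFile;
  }

  /** Reads the file on every call so an out-of-band rotation is picked up. */
  String currentAssertion() throws IOException {
    String content =
        new String(Files.readAllBytes(tokenFile), StandardCharsets.UTF_8).trim();
    if (content.isEmpty()) {
      throw new IOException("Token file is empty: " + tokenFile);
    }
    return content;
  }
}

class TokenFileRotationDemo {
  /** Writes a token, reads it, overwrites the file, and reads again. */
  static String[] readBeforeAndAfterRotation() {
    try {
      Path file = Files.createTempFile("federated-token", ".txt");
      try {
        FileBackedAssertionSource source = new FileBackedAssertionSource(file);
        Files.write(file, "assertion-v1\n".getBytes(StandardCharsets.UTF_8));
        String first = source.currentAssertion();
        // Simulate the platform rotating the token behind the provider's back.
        Files.write(file, "assertion-v2\n".getBytes(StandardCharsets.UTF_8));
        String second = source.currentAssertion();
        return new String[] {first, second};
      } finally {
        Files.deleteIfExists(file);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    String[] seen = readBeforeAndAfterRotation();
    System.out.println(seen[0] + " -> " + seen[1]);
  }
}
```

Had the constructor read the file once, the second call in this sketch would still return the first value, which is the failure mode described above.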

Contributor

If I understand correctly, some process on the Azure VM refreshes this file? It would be great if you could cite a reference and add it to the javadoc for the method getClientAssertion.
Since it is important that the tokenFile is not cached, let's add a test around this as well.

Author

> Some process on the Azure VM refreshes this file?

Yes.

> It would be great if you could cite a reference

I added the citation to the javadoc for getClientAssertion.

> Since it is important that the tokenFile is not cached, let's add a test around this as well.

The only way I see to verify the token file is not cached is to somehow mock the call to AzureADAuthenticator.getTokenUsingJWTAssertion() in WorkloadIdentityTokenProvider.refreshToken(), which is problematic. Please see my comments here regarding my thoughts on mocking it. If we can identify a way to mock that code then I can add a unit test to verify the token file is not cached. Otherwise, I don't see how to implement that unit test.

@creste
Author

creste commented Jan 11, 2024

> Let's add some tests around the change.

@saxenapranav - Please see my thoughts on tests here. It is not clear to me how to add tests to the code. Any guidance you could provide would be greatly appreciated.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 53s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+0 🆗 yamllint 0m 0s yamllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 43m 30s trunk passed
+1 💚 compile 0m 36s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 34s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 0m 30s trunk passed
+1 💚 mvnsite 0m 40s trunk passed
+1 💚 javadoc 0m 37s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 33s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 2s trunk passed
+1 💚 shadedclient 32m 21s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 27s the patch passed
+1 💚 compile 0m 29s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 29s the patch passed
+1 💚 compile 0m 25s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 javac 0m 25s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 18s the patch passed
+1 💚 mvnsite 0m 29s the patch passed
+1 💚 javadoc 0m 24s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 24s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 2s the patch passed
+1 💚 shadedclient 32m 16s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 5s hadoop-azure in the patch passed.
+1 💚 asflicense 0m 34s The patch does not generate ASF License warnings.
123m 52s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5953/3/artifact/out/Dockerfile
GITHUB PR #5953
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint yamllint
uname Linux b9a825722ed4 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / d47fd76
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5953/3/testReport/
Max. process+thread count 556 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5953/3/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

/**
* Provides tokens based on Azure AD Workload Identity.
*/
public class WorkloadIdentityTokenProvider extends AccessTokenProvider {

Contributor

We can add tests around this class. The following is something we could test to help prevent regressions in the future:

  1. How refreshing the token works.
    • We can have a protected method getTokenTtl() which returns ONE_HOUR in production code but can be mocked in tests as required.
    • We can mock the external call, the super call.
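
A rough sketch of the overridable-TTL idea, under stated assumptions: `RefreshTimer`, `recordFetch`, and `RefreshTimerDemo` are hypothetical names, the `-1` sentinel and the backwards-clock check are assumptions of this sketch, and the time is passed in explicitly so the example stays deterministic. Only `getTokenTtl()`, `ONE_HOUR`, and `hasEnoughTimeElapsedSinceLastRefresh` are names mentioned in this thread.

```java
/** Hypothetical refresh-timing helper; not the class from this PR. */
class RefreshTimer {
  static final long ONE_HOUR_MILLIS = 60L * 60L * 1000L;

  // -1 means no token has been fetched yet (an assumption of this sketch).
  private long tokenFetchTimeMillis = -1L;

  /** Seam for tests: one hour in production, shortened by a test subclass. */
  protected long getTokenTtl() {
    return ONE_HOUR_MILLIS;
  }

  void recordFetch(long nowMillis) {
    tokenFetchTimeMillis = nowMillis;
  }

  /** True if nothing was fetched yet, the TTL has passed, or the clock went backwards. */
  boolean hasEnoughTimeElapsedSinceLastRefresh(long nowMillis) {
    if (tokenFetchTimeMillis == -1L) {
      return true;
    }
    long elapsed = nowMillis - tokenFetchTimeMillis;
    return elapsed >= getTokenTtl() || elapsed < 0;
  }
}

class RefreshTimerDemo {
  /** Builds a timer whose TTL is overridden, as a test would do. */
  static RefreshTimer withTtl(final long ttlMillis) {
    return new RefreshTimer() {
      @Override
      protected long getTokenTtl() {
        return ttlMillis;
      }
    };
  }

  public static void main(String[] args) {
    RefreshTimer timer = withTtl(1_000L);
    System.out.println(timer.hasEnoughTimeElapsedSinceLastRefresh(0L));     // true: nothing fetched yet
    timer.recordFetch(5_000L);
    System.out.println(timer.hasEnoughTimeElapsedSinceLastRefresh(5_500L)); // false: 500 ms < 1000 ms
    System.out.println(timer.hasEnoughTimeElapsedSinceLastRefresh(6_000L)); // true: TTL reached
  }
}
```

Overriding the TTL in a subclass lets a test cover the "expired" branch without waiting an hour; a Mockito spy could serve the same purpose.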

Author

Thank you for the guidance. I implemented several unit tests.

Contributor

I see we have added tests around hasEnoughTimeElapsedSinceLastRefresh. I was hoping token refreshing could also be tested. We could also test refreshing at an upper layer (still a unit test):

  1. We have an AccessTokenProvider object (an instance of WorkloadIdentityTokenProvider), and we could call AccessTokenProvider's getToken() method and then test different scenarios.

Author

I understand the request, but I don't see a reasonable way to mock the code for a unit test of AccessTokenProvider. Note that AccessTokenProvider.getToken() calls WorkloadIdentityTokenProvider.refreshToken(), which calls AzureADAuthenticator.getTokenUsingJWTAssertion(), which is a static method that eventually makes HTTP requests. That is a problem because:

  • All unit tests currently use Mockito version 2.28.2, which does not support mocking static methods.
  • The TestAzureADAuthenticator unit tests do not show how to mock the HTTP requests made by AzureADAuthenticator.

Without a way to mock the calls made by WorkloadIdentityTokenProvider.refreshToken() or AzureADAuthenticator.getTokenUsingJWTAssertion(), all unit tests will try to make real HTTP requests, which will always fail.

Do you have any ideas on how to work around the limitations of the code to implement the desired unit test?

Contributor

Yes, we don't need to mock the static method. What can be done is:
we add an instance method getTokenUsingJWTAssertion() which calls AzureADAuthenticator.getTokenUsingJWTAssertion(authEndpoint, clientId, clientAssertion). This new method is mockable, and in the test we can give it the required behavior.
Also, for methods that are made mockable for tests, we add the VisibleForTesting annotation.
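
A minimal sketch of the wrapper-method idea described here, assuming nothing about the real classes: `StaticTokenEndpoint`, `SeamedTokenProvider`, and `SeamedTokenProviderDemo` are hypothetical stand-ins, and the test double is a plain anonymous subclass rather than a Mockito spy so the example has no dependencies. In the real code the overridable method would presumably also carry the VisibleForTesting annotation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical stand-in for a static helper that would make an HTTP request. */
final class StaticTokenEndpoint {
  private StaticTokenEndpoint() { }

  static String getTokenUsingJWTAssertion(
      String authEndpoint, String clientId, String clientAssertion) {
    throw new IllegalStateException("no network access in this sketch");
  }
}

/** Hypothetical provider; only the wrapper-method idea comes from the thread. */
class SeamedTokenProvider {
  private final String authEndpoint;
  private final String clientId;
  private final Supplier<String> clientAssertionSource;

  SeamedTokenProvider(
      String authEndpoint, String clientId, Supplier<String> clientAssertionSource) {
    this.authEndpoint = authEndpoint;
    this.clientId = clientId;
    this.clientAssertionSource = clientAssertionSource;
  }

  /** Instance-level wrapper around the static call; tests override this method. */
  protected String getTokenUsingJWTAssertion(String clientAssertion) {
    return StaticTokenEndpoint.getTokenUsingJWTAssertion(
        authEndpoint, clientId, clientAssertion);
  }

  /** Fetches a fresh assertion on every call, then exchanges it for a token. */
  String refreshToken() {
    return getTokenUsingJWTAssertion(clientAssertionSource.get());
  }
}

class SeamedTokenProviderDemo {
  /** Refreshes twice with a rotation in between and records what the seam received. */
  static List<String> assertionsSeenAcrossTwoRefreshes() {
    final List<String> seen = new ArrayList<>();
    final String[] currentAssertion = {"assertion-v1"};
    SeamedTokenProvider provider = new SeamedTokenProvider(
        "https://login.example.invalid/token",
        "hypothetical-client-id",
        () -> currentAssertion[0]) {
      @Override
      protected String getTokenUsingJWTAssertion(String clientAssertion) {
        seen.add(clientAssertion); // record instead of calling the network
        return "token-for-" + clientAssertion;
      }
    };
    provider.refreshToken();
    currentAssertion[0] = "assertion-v2"; // simulate out-of-band rotation
    provider.refreshToken();
    return seen;
  }

  public static void main(String[] args) {
    System.out.println(assertionsSeenAcrossTwoRefreshes());
  }
}
```

Because the override records each assertion it receives, the same seam can also verify that the assertion is not cached between refreshes.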

Author

Thank you for the help. I added more unit tests per your instructions.

@saxenapranav
Contributor

> Let's add some tests around the change.
>
> @saxenapranav - Please see my thoughts on tests here. It is not clear to me how to add tests to the code. Any guidance you could provide would be greatly appreciated.

Thanks for taking the comments. I have added my thought process for the tests.

@creste
Author

creste commented Jan 12, 2024

@saxenapranav - Thank you for the guidance. I added unit tests and addressed all other feedback.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 53s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+0 🆗 yamllint 0m 0s yamllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 41m 27s trunk passed
+1 💚 compile 0m 37s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 34s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 0m 29s trunk passed
+1 💚 mvnsite 0m 39s trunk passed
+1 💚 javadoc 0m 36s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 33s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 4s trunk passed
+1 💚 shadedclient 32m 18s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 27s the patch passed
+1 💚 compile 0m 29s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 29s the patch passed
+1 💚 compile 0m 25s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 javac 0m 25s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 18s /results-checkstyle-hadoop-tools_hadoop-azure.txt hadoop-tools/hadoop-azure: The patch generated 2 new + 2 unchanged - 0 fixed = 4 total (was 2)
+1 💚 mvnsite 0m 28s the patch passed
+1 💚 javadoc 0m 25s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 25s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 2s the patch passed
+1 💚 shadedclient 32m 34s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 7s hadoop-azure in the patch passed.
+1 💚 asflicense 0m 34s The patch does not generate ASF License warnings.
122m 4s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5953/6/artifact/out/Dockerfile
GITHUB PR #5953
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint yamllint
uname Linux 9de056802853 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 08da2ea
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5953/6/testReport/
Max. process+thread count 633 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5953/6/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 55s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+0 🆗 yamllint 0m 0s yamllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 41m 46s trunk passed
+1 💚 compile 0m 36s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 34s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 0m 31s trunk passed
+1 💚 mvnsite 0m 39s trunk passed
+1 💚 javadoc 0m 36s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 33s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 2s trunk passed
+1 💚 shadedclient 32m 19s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
-1 ❌ mvninstall 0m 21s /patch-mvninstall-hadoop-tools_hadoop-azure.txt hadoop-azure in the patch failed.
-1 ❌ compile 0m 28s /patch-compile-hadoop-tools_hadoop-azure-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.txt hadoop-azure in the patch failed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.
-1 ❌ javac 0m 28s /patch-compile-hadoop-tools_hadoop-azure-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.txt hadoop-azure in the patch failed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.
-1 ❌ compile 0m 22s /patch-compile-hadoop-tools_hadoop-azure-jdkPrivateBuild-1.8.0_392-8u392-ga-1~20.04-b08.txt hadoop-azure in the patch failed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08.
-1 ❌ javac 0m 22s /patch-compile-hadoop-tools_hadoop-azure-jdkPrivateBuild-1.8.0_392-8u392-ga-1~20.04-b08.txt hadoop-azure in the patch failed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08.
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 18s the patch passed
-1 ❌ mvnsite 0m 23s /patch-mvnsite-hadoop-tools_hadoop-azure.txt hadoop-azure in the patch failed.
+1 💚 javadoc 0m 25s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 23s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
-1 ❌ spotbugs 0m 23s /patch-spotbugs-hadoop-tools_hadoop-azure.txt hadoop-azure in the patch failed.
+1 💚 shadedclient 34m 25s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 0m 26s /patch-unit-hadoop-tools_hadoop-azure.txt hadoop-azure in the patch failed.
+1 💚 asflicense 0m 35s The patch does not generate ASF License warnings.
119m 17s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5953/7/artifact/out/Dockerfile
GITHUB PR #5953
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint yamllint
uname Linux 55b90fe0f2f6 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / fe75cd3
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5953/7/testReport/
Max. process+thread count 561 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5953/7/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 31s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+0 🆗 yamllint 0m 0s yamllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 41m 56s trunk passed
+1 💚 compile 0m 36s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 32s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 0m 30s trunk passed
+1 💚 mvnsite 0m 37s trunk passed
+1 💚 javadoc 0m 37s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 33s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 4s trunk passed
+1 💚 shadedclient 32m 36s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
-1 ❌ mvninstall 0m 22s /patch-mvninstall-hadoop-tools_hadoop-azure.txt hadoop-azure in the patch failed.
-1 ❌ compile 0m 28s /patch-compile-hadoop-tools_hadoop-azure-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.txt hadoop-azure in the patch failed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.
-1 ❌ javac 0m 28s /patch-compile-hadoop-tools_hadoop-azure-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.txt hadoop-azure in the patch failed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.
-1 ❌ compile 0m 22s /patch-compile-hadoop-tools_hadoop-azure-jdkPrivateBuild-1.8.0_392-8u392-ga-1~20.04-b08.txt hadoop-azure in the patch failed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08.
-1 ❌ javac 0m 22s /patch-compile-hadoop-tools_hadoop-azure-jdkPrivateBuild-1.8.0_392-8u392-ga-1~20.04-b08.txt hadoop-azure in the patch failed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08.
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 18s the patch passed
-1 ❌ mvnsite 0m 24s /patch-mvnsite-hadoop-tools_hadoop-azure.txt hadoop-azure in the patch failed.
+1 💚 javadoc 0m 24s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 24s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
-1 ❌ spotbugs 0m 23s /patch-spotbugs-hadoop-tools_hadoop-azure.txt hadoop-azure in the patch failed.
+1 💚 shadedclient 34m 39s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 0m 27s /patch-unit-hadoop-tools_hadoop-azure.txt hadoop-azure in the patch failed.
+1 💚 asflicense 0m 34s The patch does not generate ASF License warnings.
119m 32s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5953/8/artifact/out/Dockerfile
GITHUB PR #5953
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint yamllint
uname Linux 2db524880db4 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / fe75cd3
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5953/8/testReport/
Max. process+thread count 561 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5953/8/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 53s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+0 🆗 markdownlint 0m 1s markdownlint was not available.
+0 🆗 yamllint 0m 1s yamllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 43m 7s trunk passed
+1 💚 compile 0m 37s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 34s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 0m 30s trunk passed
+1 💚 mvnsite 0m 38s trunk passed
+1 💚 javadoc 0m 37s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 33s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 3s trunk passed
+1 💚 shadedclient 32m 27s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 28s the patch passed
+1 💚 compile 0m 29s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 29s the patch passed
+1 💚 compile 0m 26s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 javac 0m 26s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 19s the patch passed
+1 💚 mvnsite 0m 28s the patch passed
+1 💚 javadoc 0m 26s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 25s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 3s the patch passed
+1 💚 shadedclient 32m 21s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 4s hadoop-azure in the patch passed.
+1 💚 asflicense 0m 35s The patch does not generate ASF License warnings.
123m 45s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5953/9/artifact/out/Dockerfile
GITHUB PR #5953
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint yamllint
uname Linux 14669132ef6f 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 81c1b62
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5953/9/testReport/
Max. process+thread count 560 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5953/9/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

Contributor

@saxenapranav saxenapranav left a comment

@creste, thanks for taking the feedback. Have added some more comments. Thanks.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 19m 56s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+0 🆗 yamllint 0m 0s yamllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 50m 40s trunk passed
+1 💚 compile 0m 41s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 36s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 0m 30s trunk passed
+1 💚 mvnsite 0m 40s trunk passed
+1 💚 javadoc 0m 38s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 31s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 10s trunk passed
-1 ❌ shadedclient 5m 28s branch has errors when building and testing our client artifacts.
_ Patch Compile Tests _
-1 ❌ mvninstall 0m 22s /patch-mvninstall-hadoop-tools_hadoop-azure.txt hadoop-azure in the patch failed.
-1 ❌ compile 0m 23s /patch-compile-hadoop-tools_hadoop-azure-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.txt hadoop-azure in the patch failed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.
-1 ❌ javac 0m 23s /patch-compile-hadoop-tools_hadoop-azure-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.txt hadoop-azure in the patch failed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.
-1 ❌ compile 0m 23s /patch-compile-hadoop-tools_hadoop-azure-jdkPrivateBuild-1.8.0_392-8u392-ga-1~20.04-b08.txt hadoop-azure in the patch failed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08.
-1 ❌ javac 0m 23s /patch-compile-hadoop-tools_hadoop-azure-jdkPrivateBuild-1.8.0_392-8u392-ga-1~20.04-b08.txt hadoop-azure in the patch failed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08.
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 21s /buildtool-patch-checkstyle-hadoop-tools_hadoop-azure.txt The patch fails to run checkstyle in hadoop-azure
-1 ❌ mvnsite 0m 22s /patch-mvnsite-hadoop-tools_hadoop-azure.txt hadoop-azure in the patch failed.
-1 ❌ javadoc 0m 23s /patch-javadoc-hadoop-tools_hadoop-azure-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.txt hadoop-azure in the patch failed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.
-1 ❌ javadoc 0m 23s /patch-javadoc-hadoop-tools_hadoop-azure-jdkPrivateBuild-1.8.0_392-8u392-ga-1~20.04-b08.txt hadoop-azure in the patch failed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08.
-1 ❌ spotbugs 0m 23s /patch-spotbugs-hadoop-tools_hadoop-azure.txt hadoop-azure in the patch failed.
+1 💚 shadedclient 4m 32s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 0m 23s /patch-unit-hadoop-tools_hadoop-azure.txt hadoop-azure in the patch failed.
+0 🆗 asflicense 0m 23s ASF License check generated no output?
90m 54s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5953/10/artifact/out/Dockerfile
GITHUB PR #5953
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint yamllint
uname Linux bfccdeee4eba 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 3aeacfa
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5953/10/testReport/
Max. process+thread count 88 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5953/10/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

Contributor

@saxenapranav saxenapranav left a comment

@creste, I have shared my thoughts on how we can mock it.

/**
* Provides tokens based on Azure AD Workload Identity.
*/
public class WorkloadIdentityTokenProvider extends AccessTokenProvider {
Contributor

Yes, we don't need to mock the static method. What can be done is:
add an instance method, getTokenUsingJWTAssertion(), which calls AzureADAuthenticator.getTokenUsingJWTAssertion(authEndpoint, clientId, clientAssertion). This new method is mockable, so the test can give it the required behavior.
Also, the methods that are mocked from the test should carry the VisibleForTesting annotation.
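
The suggestion above can be sketched in plain Java as follows. This is a minimal, self-contained illustration and not the patch's code: `StaticAuthenticatorSketch`, `TokenProviderSketch`, and the string-valued token are hypothetical stand-ins for `AzureADAuthenticator`, `WorkloadIdentityTokenProvider`, and the real token type, and the test double is a hand-written subclass where the actual test would presumably use a Mockito spy.

```java
// Hypothetical stand-in for AzureADAuthenticator: a static method that would hit the network.
final class StaticAuthenticatorSketch {
  private StaticAuthenticatorSketch() {
  }

  static String getTokenUsingJWTAssertion(String authEndpoint, String clientId,
      String clientAssertion) {
    throw new IllegalStateException("network call; not available in a unit test");
  }
}

// Hypothetical provider; tokens are plain strings here to keep the sketch self-contained.
class TokenProviderSketch {
  private final String authEndpoint;
  private final String clientId;

  TokenProviderSketch(String authEndpoint, String clientId) {
    this.authEndpoint = authEndpoint;
    this.clientId = clientId;
  }

  String getToken(String clientAssertion) {
    // All callers go through the instance method, never the static one directly.
    return getTokenUsingJWTAssertion(clientAssertion);
  }

  // Overridable seam around the static call; real code would annotate it @VisibleForTesting.
  String getTokenUsingJWTAssertion(String clientAssertion) {
    return StaticAuthenticatorSketch.getTokenUsingJWTAssertion(
        authEndpoint, clientId, clientAssertion);
  }
}

// Hand-written test double: overrides the seam instead of mocking the static method.
final class StubbedTokenProviderSketch extends TokenProviderSketch {
  StubbedTokenProviderSketch() {
    super("https://login.example.invalid/", "hypothetical-client-id");
  }

  @Override
  String getTokenUsingJWTAssertion(String clientAssertion) {
    return "stub-token-for:" + clientAssertion;
  }
}
```

The design point is that the test replaces one instance method and leaves `getToken()` itself under test, so no static-mocking support is needed.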

@sugibuchi

This behaviour of the AAD workload identity is not well documented, but the AAD workload identity webhook injects the following env variables into pods.

https://azure.github.io/azure-workload-identity/docs/quick-start.html#7-deploy-workload

Environment variable Description
AZURE_AUTHORITY_HOST The Azure Active Directory (AAD) endpoint.
AZURE_CLIENT_ID The client ID of the AAD application or user-assigned managed identity.
AZURE_TENANT_ID The tenant ID of the registered AAD application or user-assigned managed identity.
AZURE_FEDERATED_TOKEN_FILE The path of the projected service account token file.

WorkloadIdentityCredential provided by Azure SDK also reads these env variables by default.

I think it is better to make the following config properties optional and use values provided by the env variables above by default.

  • fs.azure.account.oauth2.msi.authority
  • fs.azure.account.oauth2.client.id
  • fs.azure.account.oauth2.msi.tenant
  • fs.azure.account.oauth2.token.file

Supplying these values through Hadoop config is generally redundant and can make them inconsistent with the actual properties of the federated identity.
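
A minimal sketch of the proposed fallback, assuming the four properties pair with the four environment variables in the order their descriptions suggest (authority, client ID, tenant, token file). The class `EnvFallbackSketch` and the pairing table are hypothetical; configuration and environment are passed in as maps so the logic can be exercised without a pod.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: prefer an explicit config value, else the webhook-injected env var.
final class EnvFallbackSketch {
  // Assumed pairing of config keys to env vars; not taken from the patch.
  static final Map<String, String> ENV_FOR_KEY = new LinkedHashMap<>();

  static {
    ENV_FOR_KEY.put("fs.azure.account.oauth2.msi.authority", "AZURE_AUTHORITY_HOST");
    ENV_FOR_KEY.put("fs.azure.account.oauth2.client.id", "AZURE_CLIENT_ID");
    ENV_FOR_KEY.put("fs.azure.account.oauth2.msi.tenant", "AZURE_TENANT_ID");
    ENV_FOR_KEY.put("fs.azure.account.oauth2.token.file", "AZURE_FEDERATED_TOKEN_FILE");
  }

  private EnvFallbackSketch() {
  }

  static String resolve(String key, Map<String, String> conf, Map<String, String> env) {
    String explicit = conf.get(key);
    if (explicit != null && !explicit.trim().isEmpty()) {
      return explicit.trim();
    }
    String envName = ENV_FOR_KEY.get(key);
    // Returns null when neither source provides a value.
    return envName == null ? null : env.get(envName);
  }
}
```

In a running pod the third argument would be `System.getenv()`.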

@steveloughran
Contributor

You can use env var resolution within a Hadoop core-site file, which lets you get at the values, with defaults when unset. Where config loading is locked down (Oozie etc.), only the default is valid.

${env.LOCAL_DIRS:-some.default}/

So there is no need to add explicit resolution; just document it or set it as the default. For example, s3a uses temp dirs in YARN containers automatically.

<property>
  <name>fs.s3a.buffer.dir</name>
  <value>${env.LOCAL_DIRS:-${hadoop.tmp.dir}}/s3a</value>
  <description>Comma separated list of directories that will be used to buffer file
    uploads to.
    Yarn container path will be used as default value on yarn applications,
    otherwise fall back to hadoop.tmp.dir
  </description>
</property>
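
To make the `${env.NAME:-default}` form concrete, here is a deliberately simplified Java illustration of that one substitution rule. It is not Hadoop's implementation: the class `EnvExpansionSketch` is hypothetical, nested references such as `${hadoop.tmp.dir}` inside the default are not handled, and the real resolution is done by Hadoop's configuration loader.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical, simplified expander for the ${env.NAME:-default} form only.
final class EnvExpansionSketch {
  private static final Pattern ENV_WITH_DEFAULT =
      Pattern.compile("\\$\\{env\\.([A-Za-z_][A-Za-z0-9_]*):-([^}]*)\\}");

  private EnvExpansionSketch() {
  }

  static String expand(String value, Map<String, String> env) {
    Matcher m = ENV_WITH_DEFAULT.matcher(value);
    StringBuffer out = new StringBuffer();
    while (m.find()) {
      String fromEnv = env.get(m.group(1));
      // Use the default when the variable is unset or empty.
      String replacement = (fromEnv == null || fromEnv.isEmpty()) ? m.group(2) : fromEnv;
      m.appendReplacement(out, Matcher.quoteReplacement(replacement));
    }
    m.appendTail(out);
    return out.toString();
  }
}
```

Applied to this PR, a property value such as `${env.AZURE_CLIENT_ID:-}` would pick up the webhook-injected variable when present; that specific value is an assumption for illustration, not text from the patch.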

@creste
Author

creste commented Jan 22, 2024

@sugibuchi @steveloughran - I added the Azure environment variables to the hadoop config documented in the readme. Please take a look when you get a chance.

@sugibuchi

@steveloughran
I agree. The env var resolution looks like the best solution for this.

@creste
Thank you very much for this prompt update. About the descriptions of the four properties, I think we can simply copy-paste the descriptions provided by the AAD Workload Identity documentation.

  • fs.azure.account.oauth2.msi.tenant: The tenant ID of the registered AAD application or user-assigned managed identity.
  • fs.azure.account.oauth2.client.id: The client ID of the AAD application or user-assigned managed identity.
  • fs.azure.account.oauth2.token.file: The path of the projected service account token file.

About the description of the auth method:

OAuth 2.0 tokens are written to a file that is only accessible from the executing pod (/var/run/secrets/azure/tokens/azure-identity-token). The issued credentials can be used to authenticate.

This is not precise. The token files injected by the AAD workload identity webhook are files of "projected service account tokens" issued by Kubernetes clusters. They are not OAuth2 access tokens for accessing Azure resources.

https://azure.github.io/azure-workload-identity/docs/introduction.html#how-it-works

I propose to update the description of this auth method like:

With a projected service account token injected by the Azure Workload Identity webhook, make a request to the Azure Active Directory endpoint to retrieve access tokens.
The required properties for this authentication method are automatically injected into the executing pod as environment variables by the AAD Workload Identity webhook.
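
The distinction drawn above (the file holds a Kubernetes-issued assertion, which is then exchanged for an Azure access token) can be sketched as follows. This is an illustration under stated assumptions, not the patch's request code: the class `WorkloadIdentityRequestSketch` is hypothetical, and the `/oauth2/v2.0/token` path and the `scope` parameter follow the Microsoft identity platform's client-credentials flow with a client assertion, which the patch may or may not use in this exact form. No HTTP call is made here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: build the token request from the projected service account token.
final class WorkloadIdentityRequestSketch {
  static final String JWT_BEARER = "urn:ietf:params:oauth:client-assertion-type:jwt-bearer";

  private WorkloadIdentityRequestSketch() {
  }

  // The file content is the client assertion, not an Azure access token.
  static String readAssertion(Path tokenFile) {
    try {
      return new String(Files.readAllBytes(tokenFile), StandardCharsets.UTF_8).trim();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Assumed endpoint layout: <authority>/<tenant>/oauth2/v2.0/token.
  static String tokenEndpoint(String authority, String tenantId) {
    String base = authority.endsWith("/") ? authority : authority + "/";
    return base + tenantId + "/oauth2/v2.0/token";
  }

  static String formBody(String clientId, String assertion, String scope) {
    return "grant_type=client_credentials"
        + "&client_id=" + encode(clientId)
        + "&client_assertion_type=" + encode(JWT_BEARER)
        + "&client_assertion=" + encode(assertion)
        + "&scope=" + encode(scope);
  }

  private static String encode(String s) {
    try {
      return URLEncoder.encode(s, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

The body would be sent as an `application/x-www-form-urlencoded` POST to the endpoint; the access token in the response is what is then used against storage.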

@creste
Author

creste commented Jan 22, 2024

@sugibuchi - Thank you for the additional comments.

About the descriptions of the four properties, I think we can simply copy-paste the descriptions provided by the AAD Workload Identity documentation.

  • fs.azure.account.oauth2.msi.tenant: The tenant ID of the registered AAD application or user-assigned managed identity.
  • fs.azure.account.oauth2.client.id: The client ID of the AAD application or user-assigned managed identity.
  • fs.azure.account.oauth2.token.file: The path of the projected service account token file.

The current descriptions of the properties were copied from other parts of the README. For example, see the property descriptions for MSITokenProvider. @steveloughran or @anmolanmol1234 - what descriptions should the README use for those properties?

About the description of the auth method:

OAuth 2.0 tokens are written to a file that is only accessible from the executing pod (/var/run/secrets/azure/tokens/azure-identity-token). The issued credentials can be used to authenticate.

This is not precise. The token files injected by the AAD workload identity webhook are files of "projected service account tokens" issued by Kubernetes clusters. They are not OAuth2 access tokens for accessing Azure resources.

I propose to update the description of this auth method like:

With a projected service account token injected by the Azure Workload Identity webhook, make a request to the Azure Active Directory endpoint to retrieve access tokens.
The required properties for this authentication method are automatically injected into the executing pod as environment variables by the AAD Workload Identity webhook.

I have no preference, but since this text was also based on other descriptions in the README I would appreciate input from a maintainer before making the change. @steveloughran or @anmolanmol1234 - any thoughts on this?

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 56s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+0 🆗 yamllint 0m 0s yamllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 41m 37s trunk passed
+1 💚 compile 0m 35s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 34s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 0m 31s trunk passed
+1 💚 mvnsite 0m 38s trunk passed
+1 💚 javadoc 0m 36s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 33s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 3s trunk passed
+1 💚 shadedclient 32m 19s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 29s the patch passed
+1 💚 compile 0m 28s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 28s the patch passed
+1 💚 compile 0m 26s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 javac 0m 26s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 20s /results-checkstyle-hadoop-tools_hadoop-azure.txt hadoop-tools/hadoop-azure: The patch generated 1 new + 2 unchanged - 0 fixed = 3 total (was 2)
+1 💚 mvnsite 0m 28s the patch passed
+1 💚 javadoc 0m 25s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 24s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 2s the patch passed
+1 💚 shadedclient 32m 46s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 5s hadoop-azure in the patch passed.
+1 💚 asflicense 0m 35s The patch does not generate ASF License warnings.
122m 34s
Subsystem Report/Notes
Docker ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5953/11/artifact/out/Dockerfile
GITHUB PR #5953
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint yamllint
uname Linux 21bc237e9b41 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 329e3c9
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5953/11/testReport/
Max. process+thread count 725 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5953/11/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 58s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+0 🆗 yamllint 0m 0s yamllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 42m 12s trunk passed
+1 💚 compile 0m 36s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 33s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 0m 30s trunk passed
+1 💚 mvnsite 0m 39s trunk passed
+1 💚 javadoc 0m 35s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 32s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 1s trunk passed
+1 💚 shadedclient 32m 23s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 28s the patch passed
+1 💚 compile 0m 29s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 29s the patch passed
+1 💚 compile 0m 25s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 javac 0m 25s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 19s the patch passed
+1 💚 mvnsite 0m 30s the patch passed
+1 💚 javadoc 0m 25s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 25s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 3s the patch passed
+1 💚 shadedclient 32m 15s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 5s hadoop-azure in the patch passed.
+1 💚 asflicense 0m 36s The patch does not generate ASF License warnings.
122m 43s
Subsystem Report/Notes
Docker ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5953/12/artifact/out/Dockerfile
GITHUB PR #5953
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint yamllint
uname Linux 88c4d7719585 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / f8be1ca
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5953/12/testReport/
Max. process+thread count 552 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5953/12/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

Contributor

@steveloughran steveloughran left a comment

Commented; the use of env vars is good.

I am not knowledgeable enough about the docs to review them.

import org.apache.hadoop.fs.azurebfs.AbstractAbfsTestWithTimeout;
import org.junit.Test;
import org.mockito.Mockito;

import java.io.File;
Contributor

nit: this block of imports should go up first

new WorkloadIdentityTokenProvider(AUTHORITY, TENANT_ID, CLIENT_ID, tokenFile.getPath()));
Mockito.doReturn(azureAdToken)
.when(tokenProvider).getTokenUsingJWTAssertion(TOKEN);
assertEquals(azureAdToken, tokenProvider.getToken());
Contributor

Use AssertJ assertions.

Mockito.doReturn(azureAdToken)
.when(tokenProvider).getTokenUsingJWTAssertion(TOKEN);
assertEquals(azureAdToken, tokenProvider.getToken());
assertTrue("token fetch time was not set correctly", tokenProvider.getTokenFetchTime() > startTime);
Contributor

Use AssertJ, especially here.
Make the test >= so that if the start time and the load fall in the same millisecond because of clock granularity, the test does not fail.
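
The granularity point can be shown with a small plain-Java sketch. `FetchTimeSketch` is a hypothetical stand-in for the provider; in the real test the comparison would presumably be an AssertJ call such as `assertThat(fetchTime).isGreaterThanOrEqualTo(startTime)`, which is left out here so the block needs no test library.

```java
// Hypothetical stand-in that records when a token was fetched.
final class FetchTimeSketch {
  private long tokenFetchTime = -1;

  String getToken() {
    tokenFetchTime = System.currentTimeMillis();
    return "hypothetical-token";
  }

  long getTokenFetchTime() {
    return tokenFetchTime;
  }

  // Two reads of a millisecond clock can be equal, so a strict '>' would be flaky.
  static boolean fetchTimeIsPlausible(long startTime, long fetchTime) {
    return fetchTime >= startTime;
  }
}
```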

new WorkloadIdentityTokenProvider(AUTHORITY, TENANT_ID, CLIENT_ID, tokenFile.getPath()));
Mockito.doReturn(azureAdToken)
.when(tokenProvider).getTokenUsingJWTAssertion(TOKEN);
boolean exception = false;
Contributor

Use LambdaTestUtils.intercept() here, ideally looking for the error string as well as exception class.
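
As a rough picture of what this buys over a `boolean exception = false` flag, the sketch below is a plain-Java stand-in for an intercept-style helper: it runs a callable, fails if nothing is thrown, and checks both the exception class and a fragment of its text. `InterceptSketch` is hypothetical and only mimics the idea; the real test would call Hadoop's own `LambdaTestUtils.intercept()`.

```java
import java.util.concurrent.Callable;

// Hypothetical stand-in for an intercept-style test helper.
final class InterceptSketch {
  private InterceptSketch() {
  }

  static <T, E extends Throwable> E intercept(Class<E> clazz, String contained,
      Callable<T> eval) {
    T result;
    try {
      result = eval.call();
    } catch (Throwable t) {
      if (!clazz.isInstance(t)) {
        throw new AssertionError("unexpected exception type: " + t, t);
      }
      if (contained != null && !t.toString().contains(contained)) {
        throw new AssertionError("exception text lacks \"" + contained + "\": " + t, t);
      }
      // Hand the exception back so the caller can make further checks on it.
      return clazz.cast(t);
    }
    throw new AssertionError(
        "expected " + clazz.getName() + " but the call returned: " + result);
  }
}
```

Returning the caught exception keeps the test to a single expression and makes a wrong exception, or no exception at all, fail with a descriptive message.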

*/
@InterfaceAudience.Private
@InterfaceStability.Evolving
package org.apache.hadoop.fs.azurebfs.oauth2;
Contributor

no need to worry about package files in test modules...is yetus complaining about it?

FS_AZURE_ACCOUNT_OAUTH_MSI_AUTHORITY,
AuthConfigurations.DEFAULT_FS_AZURE_ACCOUNT_OAUTH_MSI_AUTHORITY);
authority = appendSlashIfNeeded(authority);
String tenantId = getPasswordString(FS_AZURE_ACCOUNT_OAUTH_MSI_TENANT);
Contributor

It is always good to trim this, so that if someone splits a value across lines in the config file, the surrounding whitespace is removed.
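
A small illustration of why the trim matters, assuming a value that picked up a newline and indentation from being wrapped inside an XML `<value>` element. `TrimSketch` is a hypothetical helper, not code from the patch.

```java
// Hypothetical helper: normalize a config value read from a wrapped XML element.
final class TrimSketch {
  private TrimSketch() {
  }

  static String trimmedOrNull(String raw) {
    // Leading and trailing whitespace, including newlines, is dropped; null stays null.
    return raw == null ? null : raw.trim();
  }
}
```

Without this, a tenant ID read as `"\n    my-tenant-id\n  "` would end up verbatim in the token endpoint URL.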

@cthtrifork

Any progress on this?

@snvijaya
Contributor

Hi @creste, this change is needed in trunk. Can you please share your plan for completing this PR?

Please let me know if I can help in any way to address the review comments and get this change in. Thanks.

@creste
Author

creste commented Apr 30, 2024

@snvijaya - My team is no longer using Hadoop and has moved on to another project so I am unable to commit to completing this PR. Feel free to address the feedback and make any changes needed to get this merged.

@snvijaya
Contributor

snvijaya commented May 1, 2024

Thanks @creste. @anujmodi2021 will pick this up.
As the PR was raised from a forked repo, we cannot push changes to the dev branch it was raised from. We will re-raise the PR and address the comments there.

@anujmodi2021
Contributor

Hi @steveloughran, @cthtrifork
I have created a new PR for this patch: #6787

All the comments pending here have been addressed there.
Please take any further reviews to the new PR.

Thanks a lot.

@creste creste closed this May 1, 2024