Description
Describe the bug
Kpow for Apache Kafka is an enterprise toolkit for Apache Kafka that depends on several AWS libraries, including LicenseManager, MSK IAM Auth, and AWS Glue.
Including the AWS Glue dependency silently breaks IRSA, causing the pod to run under the NodeInstanceRole rather than the properly configured IRSA role.
This is likely to impact projects intending to use IRSA with MSK, as the MSK IAM library and AWS Glue libraries are very likely to be included in those projects. See: aws/aws-msk-iam-auth#55
Isolating this error required a full deploy into an IRSA-enabled EKS environment with debug logging in place.
Expected Behavior
Before the AWS Glue dependency was added, Kpow ran under the IRSA role as configured.
We expect to add the AWS Glue dependency to Kpow without breaking IRSA.
Current Behavior
After adding the AWS Glue dependency, Kpow reverted to operating under the EKS Node Instance role:
01:30:45.217 ERROR [main] instruct.system – [:instruct.system/init :kafka/primary-cluster] instruction failed
software.amazon.awssdk.services.licensemanager.model.AuthorizationException: User: arn:aws:sts::489728315157:assumed-role/eksctl-awsmp-kpow-example-nodegro-NodeInstanceRole-RF0DW6JPCQ07/i-0dd68413a10f85f5c is not authorized to perform: license-manager:CheckoutLicense because no identity-based policy allows the license-manager:CheckoutLicense action (Service: LicenseManager, Status Code: 400, Request ID: c2546bfe-6a8e-4d0f-a635-36d07ddacad2)
Turning on debug logging shows that WebIdentityTokenCredentialsProvider fails to load credentials because of an error resolving the HTTP implementation:
01:30:44.137 DEBUG [main] s.a.a.a.c.AwsCredentialsProviderChain – Unable to load credentials from WebIdentityTokenCredentialsProvider(): Multiple HTTP implementations were found on the classpath. To avoid non-deterministic loading implementations, please explicitly provide an HTTP client via the client builders, set the software.amazon.awssdk.http.service.impl system property with the FQCN of the HTTP service to use as the default, or remove all but one HTTP implementation from the classpath
software.amazon.awssdk.core.exception.SdkClientException: Multiple HTTP implementations were found on the classpath. To avoid non-deterministic loading implementations, please explicitly provide an HTTP client via the client builders, set the software.amazon.awssdk.http.service.impl system property with the FQCN of the HTTP service to use as the default, or remove all but one HTTP implementation from the classpath
at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:102)
at software.amazon.awssdk.core.internal.http.loader.ClasspathSdkHttpServiceProvider.loadService(ClasspathSdkHttpServiceProvider.java:62)
The error is caused by conflicting HTTP implementations brought in by:
LicenseManager or MSK (IAM Auth): ApacheHttpClient
AWS Glue: URLConnection
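A conflict like this can be checked for without deploying to EKS. The sketch below is not part of the reproducer; it assumes the SDK discovers HTTP implementations through the standard Java service-provider file for `software.amazon.awssdk.http.SdkHttpService`, and simply lists every provider registered on the classpath. The class and method names (`HttpImplScan`, `findHttpImplementations`) are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper: lists the HTTP service providers registered on the classpath.
class HttpImplScan {
    // Assumption: the SDK locates HTTP implementations via this service-provider file.
    static final String SERVICE_FILE =
            "META-INF/services/software.amazon.awssdk.http.SdkHttpService";

    static List<String> findHttpImplementations(ClassLoader loader) {
        List<String> found = new ArrayList<>();
        try {
            // Each jar that ships an HTTP implementation contributes one copy of the file.
            Enumeration<URL> urls = loader.getResources(SERVICE_FILE);
            while (urls.hasMoreElements()) {
                URL url = urls.nextElement();
                try (BufferedReader reader = new BufferedReader(
                        new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        // Service files may contain '#' comments and blank lines.
                        int hash = line.indexOf('#');
                        if (hash >= 0) {
                            line = line.substring(0, hash);
                        }
                        line = line.trim();
                        if (!line.isEmpty()) {
                            found.add(line);
                        }
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }

    public static void main(String[] args) {
        List<String> impls = findHttpImplementations(ClassLoader.getSystemClassLoader());
        impls.forEach(System.out::println);
        if (impls.size() > 1) {
            System.out.println("More than one HTTP implementation is on the classpath.");
        }
    }
}
```

Running this with the application's classpath should print one line per registered implementation; more than one line would correspond to the "Multiple HTTP implementations" condition in the debug log above.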
Reproduction Steps
See minimum viable reproducer here: https://github.com/factorhouse/aws-irsa-deps-reproducer
Possible Solution
Manually set the HTTP client implementation via the system property named in the error message:
(System/setProperty "software.amazon.awssdk.http.service.impl" "software.amazon.awssdk.http.apache.ApacheSdkHttpService")
We have not verified how this setting affects each library, but Glue, LicenseManager, and MSK IAM Auth all appear to work with it, and the pod runs under the IRSA role again.
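For projects written in Java rather than Clojure, the same workaround might look like the sketch below. It uses only `System.setProperty` with the property name and class name quoted above; the class and method names (`PinHttpImpl`, `pinApacheHttpService`) are hypothetical, and the assumption is that the call runs before any AWS client or credentials provider is created.

```java
// Hypothetical helper: pins the SDK's default HTTP implementation via a system property.
class PinHttpImpl {
    static final String PROPERTY = "software.amazon.awssdk.http.service.impl";
    static final String APACHE_SERVICE =
            "software.amazon.awssdk.http.apache.ApacheSdkHttpService";

    // Sets the property only if nothing (e.g. a -D flag) has set it already,
    // and returns the value that is in effect afterwards.
    static String pinApacheHttpService() {
        String existing = System.getProperty(PROPERTY);
        if (existing == null || existing.isEmpty()) {
            System.setProperty(PROPERTY, APACHE_SERVICE);
        }
        return System.getProperty(PROPERTY);
    }

    public static void main(String[] args) {
        // Assumption: call this first, before any AWS SDK client is built.
        System.out.println(PROPERTY + "=" + pinApacheHttpService());
    }
}
```

The same value can also be passed on the command line as `-Dsoftware.amazon.awssdk.http.service.impl=software.amazon.awssdk.http.apache.ApacheSdkHttpService`, which avoids depending on initialization order in application code.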
Additional Information/Context
No response
AWS Java SDK version used
2.18.20
JDK version used
java --version openjdk 11.0.16 2022-07-19
Operating System and version
macOS