cloudwatch taking worker role in AWS and not using OIDC #303

Closed
joshi55 opened this issue Oct 1, 2020 · 9 comments · Fixed by prometheus-community/helm-charts#502


joshi55 commented Oct 1, 2020

Hi,
We are running prom/cloudwatch-exporter:cloudwatch_exporter-0.9.0 on AWS EKS. The pod's service account is set to the one that is bound to an IAM role through OIDC.

Inside the pod, aws sts get-caller-identity reports the OIDC role. However, the CloudWatch exporter keeps using the default worker node role and fails.

AWS has confirmed that our configuration and settings are not the problem, and says that the image does not support OIDC.

@joshi55 joshi55 changed the title clouwatch taking worker role in AWS cloudwatch taking worker role in AWS and not using OIDC Oct 1, 2020
@brian-brazil
Contributor

I'm afraid I have no experience with this way of doing AWS auth, so if you think there's a bug you'll need to point out where it is in the code.


mozai commented Oct 29, 2020

Running into this myself. It may be related to this bug report on the Amazon Java SDK 2: "Java SDK does not support EKS IAM for service accounts". It says this was fixed in SDK v2.10.11, even though the sdk-for-java/v2 documentation doesn't mention this new link in the credentials provider chain. The bug report also cautions "make sure you have aws-java-sdk-sts dependency packaged for you application as well." and

include sts with the latest version from maven repo, today it is 2.11.14

  <dependency>
  	<groupId>software.amazon.awssdk</groupId>
  	<artifactId>sts</artifactId>
  </dependency>

Looks like cloudwatch_exporter is using com.amazonaws aws-java-sdk-sts v1.11.839 but I admit I don't know how different groupId dependencies map version numbers.

EDIT: found the corresponding bug report for the AWS Java SDK (without a 2): "Support for EKS 'IAM for service accounts' not default". It suggests this should be fixed in com.amazonaws aws-java-sdk-sts v1.11.722, and this time the link in the chain is documented. However, cloudwatch_exporter 0.9.0 is on aws-java-sdk-sts 1.11.839, and I still see it using the instance profile instead of the web identity token.

@brian-brazil
Contributor

Sounds like that's not it, though if someone wants to try migrating to v2 I'm open to that.


mozai commented Nov 6, 2020

I figured out my problem: the Java SDK was silently failing to assume the role in role_arn and fell back to the instance role, since that is the last entry in the DefaultAWSCredentialsProviderChain. I was using an IAM role issued to me by the network security team, but that role did not list the EKS cluster's OIDC identity provider in its trust relationship. As a result sts:AssumeRoleWithWebIdentity failed, and the Java SDK used the next IAM role it could acquire, which was a role without access to cloudwatch:ListMetrics.
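For anyone who wants to check the same thing on their own role, here is a minimal sketch in Python (standard library only). It assumes you have already saved the role's trust policy as JSON; the function name `trusts_oidc_provider` and the ARN in the example are hypothetical, and the check is deliberately simple (it does not expand wildcard actions such as `sts:*` or evaluate conditions).

```python
import json


def trusts_oidc_provider(trust_policy: dict, provider_arn: str) -> bool:
    """Return True if an Allow statement lets provider_arn call
    sts:AssumeRoleWithWebIdentity (exact string match only)."""
    statements = trust_policy.get("Statement", [])
    if isinstance(statements, dict):  # IAM accepts a single statement object
        statements = [statements]
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        principal = stmt.get("Principal", {})
        federated = principal.get("Federated", []) if isinstance(principal, dict) else []
        if isinstance(federated, str):
            federated = [federated]
        if "sts:AssumeRoleWithWebIdentity" in actions and provider_arn in federated:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical provider ARN and policy, for illustration only.
    arn = "arn:aws:iam::111122223333:oidc-provider/oidc.eks.eu-west-1.amazonaws.com/id/EXAMPLE"
    policy = json.loads("""
    {
      "Version": "2012-10-17",
      "Statement": [{
        "Effect": "Allow",
        "Principal": {"Federated": "%s"},
        "Action": "sts:AssumeRoleWithWebIdentity"
      }]
    }
    """ % arn)
    print(trusts_oidc_provider(policy, arn))
```

If this returns False for your cluster's provider ARN, the SDK will likely skip the web identity credentials and continue down the chain, which matches the fallback described above.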

I can't say if what I encountered is what @joshi55 encountered.

@tomfankhaenel

I have the same issue. I set up the serviceAccount properly, and the role and token are exported and mounted in the pod. As soon as the exporter starts it fails, because the role attached to the EKS nodegroup is used instead of the role attached to the serviceaccount.
The only difference on my end is that "aws sts get-caller-identity" also reports the EKS nodegroup role instead of the serviceaccount role. Any advice on that?
@joshi55 How did you install and use the awscli?
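A quick way to confirm that the role and token really are exported and mounted is to check the two environment variables used later in this thread, AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE. The sketch below is only an illustration: the function name `missing_irsa_settings` is made up here, and it assumes Python is available in (or can be copied into) the container.

```python
import os

# Environment variables referenced elsewhere in this thread.
IRSA_VARS = ("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE")


def missing_irsa_settings(env=None) -> list:
    """Return a list of human-readable problems; an empty list means
    both variables are set and the token file exists."""
    env = os.environ if env is None else env
    problems = []
    for name in IRSA_VARS:
        if not env.get(name):
            problems.append(f"{name} is not set")
    token_file = env.get("AWS_WEB_IDENTITY_TOKEN_FILE")
    if token_file and not os.path.isfile(token_file):
        problems.append(f"token file {token_file} does not exist")
    return problems


if __name__ == "__main__":
    for problem in missing_irsa_settings():
        print(problem)
```

An empty result only shows that the settings are present. It says nothing about whether the role's trust policy accepts the token.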

@brian-brazil
Contributor

If this is fixed in v2, then #307 could fix it. Could you try that?


tomfankhaenel commented Nov 9, 2020

@brian-brazil Unfortunately that did not change the behavior :/ I built a docker image from anas-aso's fork and used it in the helm chart, but the error looks the same.

I think I might have a different issue. As far as I understand the service account documentation, the following should not produce an error.

root@prometheus-cloudwatch-exporter-5b489f6d65-fmhns:/# aws sts assume-role-with-web-identity  --role-arn $AWS_ROLE_ARN  --role-session-name cwexport  --web-identity-token file://$AWS_WEB_IDENTITY_TOKEN_FILE  --duration-seconds 1000 > /tmp/irp-cred.txt

An error occurred (AccessDenied) when calling the AssumeRoleWithWebIdentity operation: Not authorized to perform sts:AssumeRoleWithWebIdentity

SOLVED (for me)
My issue was caused by a naming mismatch between the serviceaccount configured in AWS and the serviceaccount created in EKS. Now they match and the issue is solved. I also updated to the latest helm chart version, "0.10.1".
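A naming mismatch like this can be spotted by decoding the projected token and looking at its `sub` claim, which Kubernetes sets to `system:serviceaccount:<namespace>:<name>`. The sketch below is a rough illustration in Python: it decodes the JWT payload without verifying the signature, and the helper names `token_claims`, `expected_subject`, and `subject_matches` are hypothetical.

```python
import base64
import json


def token_claims(jwt: str) -> dict:
    """Decode the payload of a JWT without verifying its signature."""
    parts = jwt.strip().split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated parts")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def expected_subject(namespace: str, name: str) -> str:
    """Subject string Kubernetes uses for a service account token."""
    return f"system:serviceaccount:{namespace}:{name}"


def subject_matches(jwt: str, namespace: str, name: str) -> bool:
    """True if the token was issued for the given service account."""
    return token_claims(jwt).get("sub") == expected_subject(namespace, name)
```

In a pod, the token can be read from the file named by AWS_WEB_IDENTITY_TOKEN_FILE and passed to `subject_matches` together with the namespace and service account name that the role's trust policy expects.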


jp commented Dec 16, 2020

Hello,

I ran into the same issue.

What was missing for me was the fsGroup setting, which is needed to read the AWS IAM token because the container does not run as root by default (see: https://github.com/prometheus-community/helm-charts/blob/main/charts/prometheus-cloudwatch-exporter/values.yaml#L210).

So in the values.yaml I'm adding the following configuration:

  securityContext:
    fsGroup: 65534
    runAsUser: 65534
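To see whether file permissions are the problem before changing the chart, something like the following sketch can be run as the container's user. It is only an assumption-laden illustration: the function name `describe_token_file` is made up, and it simply reports the token file's mode and ownership next to the IDs of the current process.

```python
import os
import stat


def describe_token_file(path: str) -> dict:
    """Report ownership, mode, and readability of the token file
    for the user running this process."""
    st = os.stat(path)  # follows symlinks, as projected volumes use them
    return {
        "mode": stat.filemode(st.st_mode),
        "owner_uid": st.st_uid,
        "group_gid": st.st_gid,
        "process_uid": os.getuid(),
        "process_groups": sorted(set(os.getgroups()) | {os.getgid()}),
        "readable": os.access(path, os.R_OK),
    }


if __name__ == "__main__":
    token_path = os.environ.get("AWS_WEB_IDENTITY_TOKEN_FILE")
    if token_path:
        for key, value in describe_token_file(token_path).items():
            print(f"{key}: {value}")
```

If `readable` is False and `group_gid` is not among `process_groups`, that is consistent with the missing fsGroup described above.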

jp added a commit to jp/helm-charts that referenced this issue Dec 16, 2020
Configuring the fsGroup in the securityContext prevents an authentication issue when running the cloudwatch exporter in EKS : prometheus/cloudwatch_exporter#303
jp added a commit to jp/helm-charts that referenced this issue Dec 20, 2020
Configuring the fsGroup in the securityContext prevents an authentication issue when running the cloudwatch exporter in EKS : prometheus/cloudwatch_exporter#303

Signed-off-by: julien <pelletj@gmail.com>
torstenwalter pushed a commit to prometheus-community/helm-charts that referenced this issue Dec 20, 2020
…y context (#502)

* Add the fsGroup:65534 to the security context

Configuring the fsGroup in the securityContext prevents an authentication issue when running the cloudwatch exporter in EKS : prometheus/cloudwatch_exporter#303

Signed-off-by: julien <pelletj@gmail.com>

* Bump chart version

Signed-off-by: julien <pelletj@gmail.com>
zanhsieh pushed a commit to prometheus-community/helm-charts that referenced this issue Jan 4, 2021
chris-vest pushed a commit to chris-vest/helm-charts that referenced this issue Jan 14, 2021
Signed-off-by: chris-vest <intellivision@pm.me>
@matthiasr
Contributor

This fix is now upstream in the helm chart so I'm going to close this issue.
