AWS IAM: Support for AWS EKS ServiceAccount roles #20473

Closed
tatusl opened this issue Nov 19, 2019 · 34 comments · Fixed by #21594

Comments

@tatusl

tatusl commented Nov 19, 2019

What happened:

When running Grafana in AWS EKS with an IAM role attached to the pod's service account, Grafana still uses the IAM role of the underlying EC2 instance. When adding a CloudWatch data source, the following error shows that Grafana is using the instance role:

t=2019-11-19T10:31:39+0000 lvl=eror msg="Metric request error" logger=context userId=1 orgId=1 uname=admin error="Failed to call cloudwatch:ListMetrics, AccessDenied: User: arn:aws:sts::$ACCOUNT_ID:assumed-role/$INSTANCE_ROLE_NAME/$INSTANCE_ID is not authorized to perform: cloudwatch:ListMetrics\n\tstatus code: 403, request id: a4aac23f-2194-422a-83de-2926833ce40f"

$ACCOUNT_ID, $INSTANCE_ROLE_NAME and $INSTANCE_ID are sanitized from the output.

What you expected to happen:

Grafana would use IAM role attached to a service account of a pod.

How to reproduce it (as minimally and precisely as possible):

  • Set up an OIDC provider for EKS
  • Set up a Kubernetes service account and create an IAM role that is attached to the service account
  • Create a Kubernetes deployment and specify the above-mentioned service account for the pod

Walkthrough for the setup steps: https://aws.amazon.com/blogs/opensource/introducing-fine-grained-iam-roles-service-accounts/

Anything else we need to know?:

We also ran a Python container with the same service account and installed aws-cli, and it used the correct IAM role (not the underlying instance role):

# aws sts get-caller-identity
{
    "UserId": "$USER_ID:botocore-session-1574167643",
    "Account": "$ACCOUNT_ID",
    "Arn": "arn:aws:sts::$ACCOUNT_ID:assumed-role/grafana_test_eks_log_role/botocore-session-1574167643"
}

$USER_ID and $ACCOUNT_ID are sanitized from output.

awscli version: aws-cli/1.16.284 Python/3.6.9 Linux/4.14.146-119.123.amzn2.x86_64 botocore/1.13.20

In addition, the above-mentioned Grafana container has the following environment variables injected by the admission controller:

/usr/share/grafana $ cat /proc/1/environ | tr '\0' '\n' |grep AWS
AWS_ROLE_ARN=arn:aws:iam::$ACCOUNT_ID:role/grafana_test_eks_log_role
AWS_WEB_IDENTITY_TOKEN_FILE=/var/run/secrets/eks.amazonaws.com/serviceaccount/token

To my knowledge, this indicates that the IAM role was attached to the service account and the pod successfully.
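The same check can be scripted from inside the container. The snippet below is only a minimal sketch, not part of Grafana: the helper name check_irsa_environment is hypothetical, and it verifies nothing beyond the two variables shown above being set and the token file being readable by the current user.

```python
import os
from typing import List, Mapping

# The two variables injected by the EKS admission controller (see the output above).
REQUIRED_VARS = ("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE")


def check_irsa_environment(env: Mapping[str, str]) -> List[str]:
    """Return a list of problems; an empty list means nothing obvious is wrong.

    Hypothetical helper for illustration only.
    """
    problems = []
    for name in REQUIRED_VARS:
        if not env.get(name):
            problems.append(f"{name} is not set")

    token_file = env.get("AWS_WEB_IDENTITY_TOKEN_FILE")
    if token_file:
        if not os.path.isfile(token_file):
            problems.append(f"token file {token_file} does not exist")
        elif not os.access(token_file, os.R_OK):
            # Readability depends on the uid/gid the process runs as.
            problems.append(f"token file {token_file} is not readable")
    return problems


if __name__ == "__main__":
    for line in check_irsa_environment(os.environ) or ["no problems found"]:
        print(line)
```

Running it as the same user as the Grafana process matters, because a file that root can read is not necessarily readable by an unprivileged uid.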

To test this issue, we ran Grafana with minimal configuration.

PR #19138 updated aws-sdk-go to a version that supports IAM roles for service accounts.

Environment:

  • Grafana version: v6.5.0-beta1 (grafana/grafana:6.5.0-beta1 Docker image)
  • Data source type & version: Cloudwatch
  • OS Grafana is installed on: Alpine Linux 3.10.3 (Docker container)
  • User OS & Browser: Chrome Version 78.0.3904.97 (Official Build) (64-bit)
  • Grafana plugins: -
  • Others:
    • EKS Kubernetes version: 1.14
    • EKS Platform version: eks.1
    • EKS worker node Kubernetes version: v1.14.7-eks-1861c5
    • Worker node AMI: Official EKS worker node AMI: amazon-eks-node-1.14-v20190927 (ami-059c6874350e63ca9) - Description: EKS Kubernetes Worker AMI with AmazonLinux2 image, (k8s: 1.14.7, docker:18.06)
@hugohaggmark added the "needs investigation" label Nov 20, 2019
@sunker
Contributor

sunker commented Nov 20, 2019

Thanks for reporting this @tatusl.
Did you also have this problem on 6.4.x, or only on 6.5.0-beta?

@tatusl
Author

tatusl commented Nov 20, 2019

Yes, we first noticed the problem with 6.4.1 when migrating from ECS to EKS, which meant we needed to use an IAM role for the service account instead of an ECS task role.

We assumed this was due to the old version of aws-sdk-go not supporting the new AssumeRoleWithWebIdentity call, as Grafana 6.4.1 ships v1.18.5 of aws-sdk-go. The minimum required version of aws-sdk-go is 1.23.13 (https://aws.amazon.com/blogs/opensource/introducing-fine-grained-iam-roles-service-accounts/). For reference, 6.5.0-beta1 has v1.25.6 of aws-sdk-go.
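As a side note, version checks like this are easy to get wrong with plain string comparison (as strings, "1.9.0" sorts after "1.23.13"). A minimal sketch of a numeric comparison follows; the helper names are hypothetical and the minimum version is the one quoted above.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn 'v1.23.13' or '1.23.13' into (1, 23, 13)."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


# Minimum aws-sdk-go version for web identity support, as quoted above.
MIN_SDK_FOR_WEB_IDENTITY = parse_version("1.23.13")


def sdk_supports_web_identity(version: str) -> bool:
    """Compare numerically, component by component."""
    return parse_version(version) >= MIN_SDK_FOR_WEB_IDENTITY
```

With the versions mentioned in this comment, v1.18.5 is below the minimum and v1.25.6 is above it, which is why the SDK version alone does not explain the behaviour in 6.5.0-beta1.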

Just in case, I retested with 6.4.4 in EKS and the problem persists.

I'm happy to provide more information if needed.

@iptizer

iptizer commented Nov 21, 2019

Same here. If I can help with testing or even tweaking code, please give me a hint.

Context:

  • AWS EKS 1.14
  • EKS with OIDC. Verified in different container with similar annotations -> Works.
  • Grafana Image tag=6.5.0-beta1
  • Shell into Grafana Container shows JWT token is injected correctly.

Error in Container:

grafana t=2019-11-21T11:13:51+0000 lvl=eror msg="Metric request error" logger=context userId=2 orgId=1 uname="xxxxxxnamexxxxx" error="Failed to call cloudwatch:ListMetrics, AccessDenied: User: arn:aws:sts::11111111111111:assumed-role/eks-cluster-role/i-1234567 is not authorized to perform: cloudwatch:ListMetrics\n\tstatus code: 403, request id: 1111111-22222-3333-4444-5555555555"

The cluster role is used. An AssumeRole call is used instead of an AssumeRoleWithWebIdentity call.
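One way to tell the two cases apart from a log line is to look at the session part of the assumed-role ARN. The sketch below is a heuristic only: it assumes that an instance-profile session is named after the EC2 instance ID (starting with "i-"), as in the errors in this thread, and the function names are hypothetical.

```python
from typing import NamedTuple


class AssumedRole(NamedTuple):
    account_id: str
    role_name: str
    session_name: str


def parse_assumed_role_arn(arn: str) -> AssumedRole:
    """Parse arn:aws:sts::<account>:assumed-role/<role>/<session>."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[2] != "sts":
        raise ValueError(f"not an STS ARN: {arn!r}")
    resource = parts[5].split("/")
    if len(resource) != 3 or resource[0] != "assumed-role":
        raise ValueError(f"not an assumed-role ARN: {arn!r}")
    return AssumedRole(parts[4], resource[1], resource[2])


def looks_like_instance_role(arn: str) -> bool:
    """Heuristic: the session is named after an EC2 instance ID."""
    return parse_assumed_role_arn(arn).session_name.startswith("i-")
```

Applied to the ARN in the error above it returns True, while the botocore session ARN from the original report returns False.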

@patstrom
Contributor

patstrom commented Nov 27, 2019

Hello! Since #19138 didn't solve the issue of using EKS ServiceAccount IAM roles I am taking a closer look as to why.

I think the issue is the custom credentials chain specified in credentials.go. Specifically, I suspect that because user-configured credentials have been provided, this resolves to false, whereas it would have been resolveCredentials that eventually finds the web identity token. There seems to be no straightforward way to add this to the manually specified credentials chain without essentially duplicating the SDK, because the NewWebIdentityCredentials function call appears to sit one level of abstraction above what tsdb/cloudwatch/credentials.go works at.

Digging a little deeper I think the idea with the original implementation was to support the values specified in the grafana config file while doing some fallback to some of the "defaults" of the aws credentials chain. Perhaps it would be more intuitive to switch it around and if anything is specified it will only use those credentials for authenticating, otherwise use the default credentials chain.
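To make the "switch it around" idea concrete, here is a rough sketch of that decision order. It is an illustration under assumptions, not Grafana or aws-sdk-go code: the function name and the returned labels are hypothetical, and the SDK default chain is heavily simplified (the real one also consults shared config files and container credentials).

```python
from typing import Mapping, Optional


def pick_credential_source(
    env: Mapping[str, str],
    access_key: Optional[str] = None,
    secret_key: Optional[str] = None,
    profile: Optional[str] = None,
) -> str:
    """Return a label for where credentials would come from."""
    # 1. Anything configured explicitly on the data source wins outright.
    if access_key and secret_key:
        return "static-keys"
    if profile:
        return "shared-credentials-profile"

    # 2. Otherwise defer to a (simplified) SDK default chain.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        return "environment-keys"
    if env.get("AWS_ROLE_ARN") and env.get("AWS_WEB_IDENTITY_TOKEN_FILE"):
        return "web-identity"
    # Last resort: the role of the node the pod happens to run on.
    return "instance-role"
```

The point of the ordering is that the web identity token is considered before the instance role, so a pod with an annotated service account never silently falls through to the node's permissions.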

Right now the frontend allows you to specify three authentication types for the CloudWatch data source: Access & secret key, Credentials file, and ARN. Only ARN is treated differently, as it requires an AssumeRole call to be done beforehand. I think it would be more intuitive if instead there were only SDK/CLI Default and Assume Role (which would use the SDK/CLI Default when assuming the role).

I am also curious as to why there is a manual credentials cache implemented when the aws-sdk-go already caches credentials by default (https://docs.aws.amazon.com/sdk-for-go/api/aws/credentials/).

I plan to dig a little deeper into this to see if the entire workflow here can be simplified to rely on aws-sdk-go even more. For example, it seems the CloudwatchExecutor keeps two AWS interfaces as fields (ec2Svc and rftaSvc) and requires users to call ensureClientSession every time before using them, whereas the "normal" CloudWatch client is created on demand with getClient. This could at least be unified. From a brief look at the code, the CloudwatchExecutor should either keep all three interfaces as fields, or keep a session.Session as a field and provide methods for getting the corresponding interfaces.

@mohsen0

mohsen0 commented Dec 30, 2019

> Right now the frontend allows you specify 3 authentication types for the cloudwatch datasource, Access & secret key, Credentials file and ARN. Only ARN is treated differently as it requires an AssumeRole call to be done beforehand. I think it would be more intuitive if instead there were only SDK/CLI Default and Assume Role (which would use to SDK/CLI Default when assuming the role).

@hugohaggmark / @marefr do you agree with the proposed view on how to change this?

@mohsen0

mohsen0 commented Jan 15, 2020

@hugohaggmark / @marefr Shouldn't Grafana follow the default AWS credential chain in the SDK, and only override it if the user provides credentials through its interface?

patstrom added a commit to patstrom/grafana that referenced this issue Jan 18, 2020
@aknuds1 aknuds1 self-assigned this Jan 29, 2020
samcrang added a commit to alphagov/gsp that referenced this issue Feb 13, 2020
We are planning on removing Kiam. It is being replaced by IAM roles for
service accounts on Amazon EKS clusters[1].

Grafana is the only thing in GSP that still uses Kiam. We are waiting
for an upstream fix[2] in order to use IAM roles for service accounts.
It doesn't appear that this will be resolved any time soon. In order to
allow to Grafana to work once we've removed Kiam we've decided to allow
_all_ nodes (and therefore _all_ containers) to be able to read
CloudWatch metrics.

[1] https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html
[2] grafana/grafana#20473 (comment)
samcrang added a commit to alphagov/gsp that referenced this issue Feb 13, 2020
We are planning on removing Kiam. It is being replaced by IAM roles for
service accounts on Amazon EKS clusters[1].

Grafana is the only thing in GSP that still uses Kiam. We are waiting
for an upstream fix[2] in order to use IAM roles for service accounts.
It doesn't appear that this will be resolved any time soon. In order to
allow Grafana to read CloudWatch metrics once Kiam has been removed
we've decided to allow _all_ nodes (and therefore _all_ containers) to
be able to read CloudWatch metrics.

[1] https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html
[2] grafana/grafana#20473 (comment)
@sc250024

Having the same issue. I agree with @patstrom's approach, but it is a bit confusing when initially setting up, since the UI doesn't really show how it's obtaining the credentials. I lean more towards @mohsen0's approach of making things more explicit in the UI. Until then, I'm continuing to use Kube2Iam / Kiam to obtain AWS credentials.

@123BLiN

123BLiN commented Mar 15, 2020

Hi, this may be unrelated and working as designed, so please forgive me if this is the wrong thread.
I was having a hard time getting the Grafana 6.6.1 CloudWatch data source working with Kiam in a pretty simple and standard setup. In the logs I saw:

 error="AccessDenied: User: arn:aws:sts::XXXXXXXXXXXXXX:assumed-role/CloudWatchAccess/kiam-kiam is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::XXXXXXXXXXXXXX:role/CloudWatchAccess\n\tstatus code: 403

However, everything works as expected after changing the data source auth type from ARN to Credentials file and leaving the profile path empty (default). The file is not actually present, so it falls back to the default chain.
Not sure if this is expected behaviour.

@sc250024

@123BLiN It is related to this, and that is indeed the current way to get things working if you're using Kube2IAM / KIAM.

chancez pushed a commit to chancez/grafana that referenced this issue Apr 6, 2020
@aknuds1 aknuds1 added this to Done in Backend Platform Squad via automation Apr 8, 2020
aknuds1 pushed a commit that referenced this issue Apr 8, 2020
* Replicate SDK behaviour for WebIdentityRole

Fix #20473

* Use WebIdentityRole in s3 uploader as well

* Use consistent casing

* use WebIdentityRole to assume another role

Co-authored-by: eV <ev@7pr.xyz>
@marefr added the "type/feature-request" label and removed the "needs investigation" label Apr 9, 2020
@marefr marefr added this to the 7.0 milestone Apr 9, 2020
peterholmberg pushed a commit that referenced this issue Apr 9, 2020
@waelghaith123

Having the same issue. Was a solution to this problem found?

@kieranbrown

This is solved in 7.0.0-beta1

@waelghaith123

Thanks, I tested it and it's working fine. Awesome release!

@ingshtrom

ingshtrom commented Jun 2, 2020

I know this is being necro'ed a bit, but I'm still running into this and I would appreciate any help possible. I am running in EKS on 1.15 of Kubernetes. I am running version 7.0.1 of Grafana (Docker image grafana/grafana:7.0.1) and the environment variables AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE are present on the pods. I can definitely see the error that Grafana cannot access data from the Cloudwatch data source and it is assuming the EC2 Instance's IAM Role from the EC2 Instance Profile.

What else can I provide to help debug this? If you want a new ticket, please let me know.

@waelghaith123

@ingshtrom Which IAM role did you use to access CloudWatch?
Which service account did you use to integrate Grafana with the IAM role?

@ingshtrom

@waelghaith123 I'm not sure what you are asking. It is a custom role we created, but the error in Grafana says it's using the EC2 instance profile IAM role, so I'm not sure what value linking the IAM role here would add. I can provide more information, though.

Here is the service account we have set up:

apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::<redacted>:role/<custom_grafana_iam_role_name>
  name: grafana
  namespace: infra-monitoring

And then I can see it being used here:

> k describe deploy grafana | grep Service
  Service Account:  grafana

When I exec'ed into a test container that I added to the pod, running the latest AWS CLI v2, aws sts get-caller-identity shows that it picks up the service account IAM role. But when I save and test the CloudWatch data source, Grafana reports that it cannot list metrics, and the error shows it is using the EC2 instance profile. Note that I tested this using kubectl port-forward..., which is why the referer/remote_addr is set to localhost in the logs:

[grafana-79bb4cc9bf-n6lln] t=2020-06-03T13:37:24+0000 lvl=eror msg="Metric request error" logger=context userId=0 orgId=1 uname= error="Failed to call cloudwatch:ListMetrics, AccessDenied: User: arn:aws:sts::<redacted>:assumed-role/stage-NGDefault-20200228204751747600000002/i-009d2f9e2420a2ba0 is not authorized to perform: cloudwatch:ListMetrics\n\tstatus code: 403, request id: <redacted>" remote_addr=127.0.0.1
[grafana-79bb4cc9bf-n6lln] t=2020-06-03T13:37:24+0000 lvl=eror msg="Request Completed" logger=context userId=0 orgId=1 uname= method=POST path=/api/tsdb/query status=500 remote_addr=127.0.0.1 time_ms=3199 size=34 referer=http://localhost:3000/datasources/edit/4/

Does this help?

@waelghaith123

waelghaith123 commented Jun 3, 2020

@ingshtrom
I think you have an access issue.
I don't have much experience with this, but please make sure you have the right IAM role.
This is a correct CloudFormation IAM role with the policy from the Grafana docs: https://grafana.com/docs/grafana/latest/features/datasources/cloudwatch/#iam-policies

  IAMRole: 
    Type: "AWS::IAM::Role"
    Properties: 
      AssumeRolePolicyDocument: 
        Version: "2012-10-17"
        Statement: 
          - Effect: "Allow"
            Principal: 
              Federated: 
                - "arn:aws:iam::<ACCOUNT_ID>:oidc-provider/oidc.eks.<REGION>.amazonaws.com/id/<YOUR_EKS_OIDC>"
            Action: 
              - "sts:AssumeRoleWithWebIdentity"
            Condition:
              StringEquals:
                "oidc.eks.<REGION>.amazonaws.com/id/<YOUR_EKS_OIDC>:sub": "system:serviceaccount:infra-monitoring:grafana"
      Path: "/"
  IAMPolicy: 
    Type: "AWS::IAM::Policy"
    Properties: 
      PolicyName: "grafana-policy"
      PolicyDocument: 
        Version: '2012-10-17'
        Statement:
        - Sid: AllowReadingMetricsFromCloudWatch
          Effect: Allow
          Action:
              - cloudwatch:DescribeAlarmsForMetric
              - cloudwatch:DescribeAlarmHistory
              - cloudwatch:DescribeAlarms
              - cloudwatch:ListMetrics
              - cloudwatch:GetMetricStatistics
              - cloudwatch:GetMetricData
          Resource: "*"
        - Sid: AllowReadingLogsFromCloudWatch
          Effect: Allow
          Action:
              - logs:DescribeLogGroups
              - logs:GetLogGroupFields
              - logs:StartQuery
              - logs:StopQuery
              - logs:GetQueryResults
              - logs:GetLogEvents
          Resource: "*"
        - Sid: AllowReadingTagsInstancesRegionsFromEC2
          Effect: Allow
          Action:
              - ec2:DescribeTags
              - ec2:DescribeInstances
              - ec2:DescribeRegions
          Resource: "*"
        - Sid: AllowReadingResourcesForTags
          Effect: Allow
          Action: tag:GetResources
          Resource: "*"
      Roles: 
        - !Ref IAMRole

  InstanceProfile: 
    Type: "AWS::IAM::InstanceProfile"
    Properties: 
      Path: "/"
      Roles: 
        - !Ref IAMRole
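For anyone generating the trust relationship programmatically rather than through CloudFormation, the same AssumeRolePolicyDocument can be built as a plain dictionary. This is a sketch mirroring the template above; the function name and the example argument values are hypothetical.

```python
import json
from typing import Any, Dict


def build_irsa_trust_policy(
    account_id: str, oidc_provider: str, namespace: str, service_account: str
) -> Dict[str, Any]:
    """Build the trust policy for one Kubernetes service account.

    oidc_provider is the issuer without the https:// prefix,
    e.g. oidc.eks.<REGION>.amazonaws.com/id/<YOUR_EKS_OIDC>.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Must match the namespace and name of the ServiceAccount exactly.
                        f"{oidc_provider}:sub": f"system:serviceaccount:{namespace}:{service_account}"
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    policy = build_irsa_trust_policy(
        "123456789012",  # placeholder account ID
        "oidc.eks.eu-west-1.amazonaws.com/id/EXAMPLE",  # placeholder issuer
        "infra-monitoring",
        "grafana",
    )
    print(json.dumps(policy, indent=2))
```

A mismatch in the sub condition (wrong namespace or service account name) is a common reason for AssumeRoleWithWebIdentity being denied, so it is worth comparing the generated value against the actual ServiceAccount.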

@ingshtrom

oh, right. I have double checked my settings and they match what I'm using for other IAM Role bound service accounts. I should also reiterate that I attached a test container and was able to assume the IAM Role that is bound to the service account, so I don't think any of that is incorrect.

So I know that the attachment of the IAM Role to the containers in the pod via service account binding is working, but the Grafana container doesn't seem to be picking up this IAM Role.

@FaHeymann

Have you set a security context on your Pod? I assume you were root in your test container, Grafana runs as uid 472 though. For the process to be able to access the projected credentials, you will need to set

securityContext:
  fsGroup: 472
  runAsGroup: 472
  runAsUser: 472
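Why this matters can be illustrated with a simplified model of the POSIX read check. The sketch below makes assumptions that are not confirmed in this thread: that without fsGroup the projected token is owned by root with mode 0600, and that with fsGroup: 472 it becomes group-readable by gid 472 (modelled here as 0640). The function name is hypothetical, and ACLs and capabilities are ignored.

```python
from typing import Iterable


def can_read(mode: int, owner_uid: int, owner_gid: int, uid: int, gids: Iterable[int]) -> bool:
    """Simplified POSIX read-permission check (no ACLs, no capabilities)."""
    if uid == 0:
        return True  # root bypasses the permission bits
    if uid == owner_uid:
        return bool(mode & 0o400)  # owner read bit
    if owner_gid in set(gids):
        return bool(mode & 0o040)  # group read bit
    return bool(mode & 0o004)  # other read bit


# Assumed token permissions, for illustration only.
print(can_read(0o600, 0, 0, uid=0, gids=[0]))        # root in a test container
print(can_read(0o600, 0, 0, uid=472, gids=[472]))    # Grafana without fsGroup
print(can_read(0o640, 0, 472, uid=472, gids=[472]))  # Grafana with fsGroup: 472
```

Under these assumptions a root test container can read the token while the Grafana process cannot, which would explain why the AWS CLI test succeeded and Grafana fell back to the node role.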

@ingshtrom

@FaHeymann you are the best, that works! Is this documented somewhere that maybe I missed?

patstrom added a commit to patstrom/grafana that referenced this issue Jun 5, 2020
EKS IAM Roles for Service Accounts requires a security context in order
for Grafana to find the credentials. This commit adds a section that
describes how (and why) to configure your pod appropriately.

Credits to @FaHeymann who provided the correct answer when helping out
@ingshtrom in grafana#20473 (comment)
@patstrom
Contributor

patstrom commented Jun 5, 2020

I don't think it's part of the documentation. I added a section about it in the PR that is already reworking the CloudWatch authentication!

b095a74

@ingshtrom
Copy link

@patstrom Thank you for all of your hard work getting IRSA working for Grafana and these changes. It's made our migration to EKS much more seamless!

@okdas

okdas commented Aug 21, 2020

Huge thanks to the developers who worked on this feature. Unfortunately, I am still seeing this issue. I'm using the latest stable Helm chart, where the securityContext is already populated with the values recommended above. The service account has the correct annotation, and EKS populates the AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE environment variables successfully. The token is also available. Still, Grafana tries to assume the role assigned to the node, where these API calls are not allowed, hence the AccessDenied error. What else can help me troubleshoot this issue? I've used the Go SDK with IRSA before with no issues, so this problem with Grafana is a mystery to me. :(

@larsrnielsen

Hi @okdas, did you manage to fix your issue? I'm experiencing the same problem. The Helm chart adds the correct security context, the SA points to the correct IAM role, and the IAM role trusts the correct SA.

securityContext:
  fsGroup: 472
  runAsGroup: 472
  runAsUser: 472

@Dudssource

@larsrnielsen can you share your deployment manifest?

@catalinmer

catalinmer commented Sep 24, 2020

I have the same issue as @okdas and @larsrnielsen. From the Grafana pod I get:
bash-5.0$ id
uid=472(grafana) gid=472(grafana) groups=1(bin),472(grafana)

bash-5.0$ env | grep AWS
AWS_ROLE_ARN=arn:aws:iam::<account_id>:role/grafana_serviceaccount_role
AWS_WEB_IDENTITY_TOKEN_FILE=/var/run/secrets/eks.amazonaws.com/serviceaccount/token

So the role is assigned correctly to the pod, but the pod still tries the node role:
msg="Metric request error" logger=context userId=1 orgId=1 uname=admin error="failed to call cloudwatch:ListMetrics: AccessDenied: User: arn:aws:sts::<account_id>:assumed-role/eks-worker-role/i-xxx is not authorized to perform: cloudwatch:ListMetrics\n\tstatus

I use Grafana 7.1.3 and also tried 7.2.0.
Any idea how to use the service account role_arn and not fall back to the node role?
Thank you

UPDATE:
Sorted: I was missing the certificate fingerprints in the thumbprint_list of aws_iam_openid_connect_provider. This is poorly documented in Terraform.

@jshcmpbll

@catalinmer Were you able to solve this? I am seeing exactly the same results.

@aknuds1
Contributor

aknuds1 commented Oct 15, 2020

@ingshtrom @okdas @larsrnielsen @catalinmer @jshcmpbll We have merged a fundamental re-design of the AWS CloudWatch authentication scheme, which aligns it with the AWS SDK defaults. You might want to try one of the recent nightly downloads, or 7.3.0-beta1, which should be due today/tomorrow, to see if your problems are resolved.

@jshcmpbll

Awesome! Yeah I saw that PR and figured it was related but wasn’t sure. I’ll give it a go tomorrow. Thanks!

@catalinmer

@jshcmpbll Yes, I managed to sort it out. It was not due to Grafana: my aws_iam_openid_connect_provider was missing a valid thumbprint_list. After adding valid certificate fingerprints to the thumbprint_list I could authenticate successfully.
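For context, an IAM OIDC provider thumbprint is the hex-encoded SHA-1 hash of a certificate in DER form, 40 characters long. The sketch below only shows the hashing and a format check; the function names are hypothetical, and which certificate in the issuer's chain to hash is described in the AWS documentation and not covered here.

```python
import hashlib
import ssl
import string


def thumbprint_from_der(der_bytes: bytes) -> str:
    """Hex-encoded SHA-1 of a DER-encoded certificate."""
    return hashlib.sha1(der_bytes).hexdigest()


def thumbprint_from_pem(pem_text: str) -> str:
    """Same, starting from a PEM-encoded certificate."""
    return thumbprint_from_der(ssl.PEM_cert_to_DER_cert(pem_text))


def is_valid_thumbprint(value: str) -> bool:
    """Format check only: 40 hexadecimal characters."""
    return len(value) == 40 and all(c in string.hexdigits for c in value)
```

A quick is_valid_thumbprint check on each entry of thumbprint_list catches empty or truncated values before they reach Terraform.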

@jshcmpbll

jshcmpbll commented Oct 16, 2020

@aknuds1 Wanted to come back and update on this. With 7.3.0-beta1 it's working and I got authenticated. I hit a small hiccup where the role appears not to have permission to assume itself. I'm not entirely sure what's going on there, or whether it's something in my IAM policy or a bug, but I wanted to update you on my findings:

error="failed to call cloudwatch:ListMetrics: AccessDenied: User: arn:aws:sts::************:assumed-role/dev_cloudwatch_grafana_eks_helm/1602799472958375602 is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::************:role/dev_cloudwatch_grafana_eks_helm\n\tstatus

This is how the roles were originally defined in terraform

data "aws_iam_policy_document" "cloudwatch_grafana_eks_helm" {
  statement {
    effect = "Allow"

    actions = [
      "cloudwatch:DescribeAlarmsForMetric",
      "cloudwatch:DescribeAlarmHistory",
      "cloudwatch:DescribeAlarms",
      "cloudwatch:ListMetrics",
      "cloudwatch:GetMetricStatistics",
      "cloudwatch:GetMetricData",
      "logs:DescribeLogGroups",
      "logs:GetLogGroupFields",
      "logs:StartQuery",
      "logs:StopQuery",
      "logs:GetQueryResults",
      "logs:GetLogEvents",
      "ec2:DescribeTags",
      "ec2:DescribeInstances",
      "ec2:DescribeRegions"
    ]

    resources = ["*"]
  }
}

To fix it I just added "sts:AssumeRole". I'll have to fine-tune these permissions.

@patstrom
Contributor

@jshcmpbll In your case I don't think you want to specify an Assume Role ARN at all (I don't see why you'd want the role to assume itself). If you leave that field blank it should be fine.

@aknuds1
Contributor

aknuds1 commented Oct 19, 2020

Thanks @patstrom, I was also wondering if it was a case of double-assuming the same role.

@jshcmpbll Are you specifying a role to assume in the Grafana data source? Try removing that parameter, as it sounds like it might overlap with e.g. an environment variable to do the same?
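A check for this "double-assume" situation could look roughly like the following. It is a sketch with hypothetical helper names, comparing the identity reported in the error message with the ARN entered in the data source.

```python
def _split_arn(arn: str):
    """Return (account_id, resource) for an ARN, or raise ValueError."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {arn!r}")
    return parts[4], parts[5]


def role_name_from_arn(arn: str) -> str:
    """Role name from an IAM role ARN or an STS assumed-role ARN."""
    _, resource = _split_arn(arn)
    kind, _, rest = resource.partition("/")
    if kind == "role":
        return rest.rsplit("/", 1)[-1]  # drop an optional role path
    if kind == "assumed-role":
        return rest.split("/", 1)[0]  # drop the session name
    raise ValueError(f"not a role ARN: {arn!r}")


def is_self_assume(current_identity_arn: str, assume_role_arn: str) -> bool:
    """True when the role to assume is the role already in use."""
    same_account = _split_arn(current_identity_arn)[0] == _split_arn(assume_role_arn)[0]
    same_role = role_name_from_arn(current_identity_arn) == role_name_from_arn(assume_role_arn)
    return same_account and same_role
```

When this returns True, the Assume Role ARN field is redundant and can simply be left empty.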

@jshcmpbll

jshcmpbll commented Oct 19, 2020

Thanks all! Honestly, I didn't realize that the ARN was optional and assumed it was required. It's making a lot of sense now, haha. Anyway, 7.3.0-beta1 is working great 😄

@nicolasps

What really helped me was updating Grafana to a version higher than 7.2.9 and no longer using the ARN option, just DEFAULT.

And of course, the ServiceAccount for Grafana must be annotated with eks.amazonaws.com/role-arn: arn:aws:iam::<redacted>:role/<custom_grafana_iam_role_name>
