AWS IAM: Support for AWS EKS ServiceAccount roles #20473
Comments
Thanks for reporting this @tatusl. |
Yes, we first noticed the problem with We assumed that this was due to old version of aws-sdk-go not supporting new Just in case, I retested with I'm happy to provide more information if needed. |
Same here. If I can help with testing or even tweak code, please give me a hint. Context:
Error in container grafana: t=2019-11-21T11:13:51+0000 lvl=eror msg="Metric request error" logger=context userId=2 orgId=1 uname="xxxxxxnamexxxxx" error="Failed to call cloudwatch:ListMetrics, AccessDenied: User: arn:aws:sts::11111111111111:assumed-role/eks-cluster-role/i-1234567 is not authorized to perform: cloudwatch:ListMetrics\n\tstatus code: 403, request id: 1111111-22222-3333-4444-5555555555" The cluster role is used. An AssumeRole call is made instead of an AssumeRoleWithWebIdentity call. |
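To tell quickly which role a failing request actually ran under, the ARN in an AccessDenied message like the one above can be parsed. This is a minimal sketch, not part of Grafana; `describe_caller` is a hypothetical helper, and the check for an instance role is a heuristic that assumes EC2 instance-profile sessions are named after the instance ID.

```python
import re

# STS assumed-role ARN: arn:aws:sts::<account>:assumed-role/<role>/<session>
ASSUMED_ROLE_ARN = re.compile(
    r"arn:aws:sts::(?P<account>\d+):assumed-role/"
    r"(?P<role>[^/\s]+)/(?P<session>[^/\s]+)"
)


def describe_caller(error_message):
    """Return account, role and session found in an error message, or None."""
    match = ASSUMED_ROLE_ARN.search(error_message)
    if match is None:
        return None
    session = match.group("session")
    return {
        "account": match.group("account"),
        "role": match.group("role"),
        "session": session,
        # Heuristic (assumption): a session named like an instance ID
        # suggests the node's instance profile was used.
        "looks_like_instance_role": re.fullmatch(r"i-[0-9a-f]+", session) is not None,
    }
```

If `looks_like_instance_role` is true, the request most likely fell back to the node's credentials rather than the role bound to the service account.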
Hello! Since #19138 didn't solve the issue of using EKS ServiceAccount IAM roles, I am taking a closer look as to why. I think the issue is the custom credentials chain specified in credentials.go. Specifically, I suspect that because user-configured credentials have been provided this is resolved to false, whereas it would've been Digging a little deeper, I think the idea with the original implementation was to support the values specified in the Grafana config file while falling back to some of the "defaults" of the AWS credentials chain. Perhaps it would be more intuitive to switch it around: if anything is specified, use only those credentials for authenticating; otherwise use the default credentials chain. Right now the frontend allows you to specify 3 authentication types for the CloudWatch datasource, I am also curious as to why there is a manual credentials cache implemented when the I plan to dig a little deeper into this to see if the entire workflow here can be simplified to rely on the |
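The ordering proposed in this comment (explicit settings win, otherwise defer to the SDK default chain) can be sketched as follows. This is an illustration only, not Grafana's code: `DataSourceAuth`, its fields, and the returned labels are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataSourceAuth:
    """Hypothetical stand-in for credentials configured on a data source."""
    access_key: Optional[str] = None
    secret_key: Optional[str] = None
    profile: Optional[str] = None


def choose_credential_source(auth: DataSourceAuth) -> str:
    """Use explicit settings if present, else defer to the SDK default chain."""
    if auth.access_key and auth.secret_key:
        return "static-keys"
    if auth.profile:
        return "shared-profile"
    # Nothing configured: let the SDK resolve credentials on its own
    # (environment, web identity token, instance profile, and so on).
    return "sdk-default-chain"
```

The point of this shape is that an unconfigured data source never builds a custom chain, so whatever the SDK supports by default keeps working without changes.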
@hugohaggmark / @marefr do you agree with the proposed view on how to change this? |
@hugohaggmark / @marefr Shouldn't grafana follow the default AWS credential chain in the SDK and then override if user wants to provide credentials through its interface? |
We are planning on removing Kiam. It is being replaced by IAM roles for service accounts on Amazon EKS clusters[1]. Grafana is the only thing in GSP that still uses Kiam. We are waiting for an upstream fix[2] in order to use IAM roles for service accounts. It doesn't appear that this will be resolved any time soon. In order to allow Grafana to read CloudWatch metrics once Kiam has been removed we've decided to allow _all_ nodes (and therefore _all_ containers) to be able to read CloudWatch metrics. [1] https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html [2] grafana/grafana#20473 (comment)
Having the same issue. I agree with @patstrom's approach, but it is a bit confusing when initially setting up since the UI doesn't really show how it's obtaining the credentials. I lean more towards @mohsen0's approach to make things more explicit in the UI. Until then, I'm continuing to use Kube2Iam / Kiam to obtain AWS credentials. |
Hi, this may be unrelated and working as designed, so please forgive me if this is the wrong thread.
However everything works as expected after changing datasource type from |
@123BLiN It is related to this, and that is indeed the current way to get things working if you're using Kube2IAM / KIAM. |
* Replicate SDK behaviour for WebIdentityRole Fix #20473 * Use WebIdentityRole in s3 uploader as well * Use consistent casing * use WebIdentityRole to assume another role Co-authored-by: eV <ev@7pr.xyz>
Having the same issue. Was a solution to this problem found? |
This is solved in 7.0.0-beta1 |
Thanks, I tested it and it's working fine |
I know this is being necro'ed a bit, but I'm still running into this and I would appreciate any help possible. I am running in EKS on Kubernetes 1.15. I am running version 7.0.1 of Grafana (Docker image What else can I provide to help debug this? If you want a new ticket, please let me know. |
Which IAM role did you use to access CloudWatch? |
@waelghaith I'm not sure what you are asking--it is a custom role we created, but in the error in Grafana it says it's using the EC2 Instance Profile IAM role, so I'm not sure what value linking the IAM role here would do, but I can provide more information. Here is the service account we have set up:
And then I can see it being used here:
And when I exec'ed into a test container that I added to the pod, running the latest AWS CLI v2 it shows that it is picking up the service account IAM Role when I run
Does this help? |
@ingshtrom
|
oh, right. I have double checked my settings and they match what I'm using for other IAM Role bound service accounts. I should also reiterate that I attached a test container and was able to assume the IAM Role that is bound to the service account, so I don't think any of that is incorrect. So I know that the attachment of the IAM Role to the containers in the pod via service account binding is working, but the Grafana container doesn't seem to be picking up this IAM Role. |
Have you set a security context on your Pod? I assume you were root in your test container, Grafana runs as uid 472 though. For the process to be able to access the projected credentials, you will need to set
|
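Why a root test container can read the projected token while Grafana (uid 472) cannot comes down to ordinary file permission bits. The sketch below is a simplified POSIX read check; the modes and ownership in the usage note are assumed values for illustration, and the idea that the missing setting grants the process a matching group (for example Kubernetes' `fsGroup`) is an assumption, since the original snippet is not shown here.

```python
import stat


def can_read(mode, owner_uid, owner_gid, uid, gids):
    """Simplified POSIX read check: owner bits, then group bits, then other."""
    if uid == 0:
        return True  # root bypasses permission bits
    if uid == owner_uid:
        return bool(mode & stat.S_IRUSR)
    if owner_gid in gids:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)
```

With a root-owned token file at mode `0o600`, `can_read(0o600, 0, 0, 472, {472})` is false, whereas a file at `0o640` owned by group 472 is readable by a process running with uid 472 and gid 472.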
@FaHeymann you are the best, that works! Is this documented somewhere that maybe I missed? |
EKS IAM Roles for Service Accounts requires a security context in order for Grafana to find the credentials. This commit adds a section that describes how (and why) to configure your pod appropriately. Credits to @FaHeymann who provided the correct answer when helping out @ingshtrom in grafana#20473 (comment)
I don't think it's part of the documentation. I added a section about it in the PR that is already reworking the CloudWatch authentication! |
@patstrom Thank you for all of your hard work getting IRSA working for Grafana and these changes. It's made our migration to EKS much more seamless! |
Huge thanks to the developers who worked on this feature. I am still seeing this issue unfortunately. I'm using the latest stable helm chart, where the |
Hi @okdas - did you manage to fix your issue? I am experiencing the same issue. The Helm chart adds the correct security context, the SA points to the correct IAM role and the IAM role trusts the correct SA.
|
@larsrnielsen can you share your deployment manifest? |
I have the same issue as @okdas or @larsrnielsen. From grafana pod I have
So the role is assigned correctly to the pod, but the pod tries the node role. I use Grafana 7.1.3 and also tried 7.2.0. UPDATE: |
@catalinmer Were you able to solve this? I am seeing the exact same results. |
@ingshtrom @okdas @larsrnielsen @catalinmer @jshcmpbll We have merged a fundamental re-design of the AWS CloudWatch authentication scheme, which aligns it with the AWS SDK defaults. You might want to try one of the recent nightly downloads, or 7.3.0-beta1, which should be due today/tomorrow, to see if your problems are resolved. |
Awesome! Yeah I saw that PR and figured it was related but wasn’t sure. I’ll give it a go tomorrow. Thanks! |
@jshcmpbll Yes, I managed to sort it out, and it was not due to Grafana: my aws_iam_openid_connect_provider was missing a valid thumbprint_list. After adding valid certificate fingerprints to the thumbprint_list I could authenticate successfully. |
@aknuds1 Wanted to come back and update on this. With 7.3.0-beta1 it's working and I got authenticated. I hit a little bit of a hiccup where the role appears to not have permission to assume itself..? Not entirely sure what's going on there, or whether that's something in my IAM policy or a bug, but I wanted to update you on my findings.
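For reference, an OIDC provider thumbprint is the hex-encoded SHA-1 digest of a DER-encoded certificate. The following is a minimal sketch using only the standard library; `thumbprint_from_pem` is a hypothetical helper, and which certificate in the chain to fingerprint is left to the AWS documentation.

```python
import hashlib
import ssl


def thumbprint_from_pem(pem_certificate):
    """Return the lowercase hex SHA-1 digest of a PEM-encoded certificate."""
    # Strips the PEM armor and base64-decodes the body into DER bytes.
    der = ssl.PEM_cert_to_DER_cert(pem_certificate)
    return hashlib.sha1(der).hexdigest()
```

The result is a 40-character hex string, which is the shape a `thumbprint_list` entry is expected to have.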
This is how the roles were originally defined in Terraform:
data "aws_iam_policy_document" "cloudwatch_grafana_eks_helm" {
  statement {
    effect = "Allow"
    actions = [
      "cloudwatch:DescribeAlarmsForMetric",
      "cloudwatch:DescribeAlarmHistory",
      "cloudwatch:DescribeAlarms",
      "cloudwatch:ListMetrics",
      "cloudwatch:GetMetricStatistics",
      "cloudwatch:GetMetricData",
      "logs:DescribeLogGroups",
      "logs:GetLogGroupFields",
      "logs:StartQuery",
      "logs:StopQuery",
      "logs:GetQueryResults",
      "logs:GetLogEvents",
      "ec2:DescribeTags",
      "ec2:DescribeInstances",
      "ec2:DescribeRegions"
    ]
    resources = ["*"]
  }
}
To fix it, I just added |
In your case I don't think you want to specify an Assume Role ARN at all (I don't see why you'd want it to assume itself). If you leave that field blank it should be fine. |
Thanks @patstrom, I was also wondering if it was a case of double-assuming the same role. @jshcmpbll Are you specifying a role to assume in the Grafana data source? Try removing that parameter, as it sounds like it might overlap with e.g. an environment variable to do the same? |
Thanks all! Honestly, I didn't realize that the ARN was optional and assumed it was required. It's making a lot of sense now haha. Anyways, 7.3.0-beta1 is working great 😄 |
What really helped me was updating Grafana to a version higher than 7.2.9 and no longer using the ARN option, just DEFAULT. And of course, the ServiceAccount for Grafana must be annotated with |
What happened:
When running Grafana in AWS EKS with an IAM role attached to the service account of a pod, Grafana still uses the IAM role of the underlying EC2 instance. When adding a CloudWatch datasource, the following error shows that Grafana uses the instance role:
$ACCOUNT_ID, $INSTANCE_ROLE_NAME and $INSTANCE_ID are sanitized from the output.
What you expected to happen:
Grafana would use IAM role attached to a service account of a pod.
How to reproduce it (as minimally and precisely as possible):
Walkthrough for setup steps https://aws.amazon.com/blogs/opensource/introducing-fine-grained-iam-roles-service-accounts/
Anything else we need to know?:
We also ran a Python container with the same service account and installed aws-cli, and it used the correct IAM role (not the underlying instance role):
$USER_ID and $ACCOUNT_ID are sanitized from output.
awscli version:
aws-cli/1.16.284 Python/3.6.9 Linux/4.14.146-119.123.amzn2.x86_64 botocore/1.13.20
In addition, the above-mentioned Grafana container has the following environment variables injected by admission controller:
To my knowledge, this indicates that IAM attachment to service account and pod is done successfully.
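A check like this can be repeated from inside any container in the pod. The sketch below is hypothetical and uses the variable names documented by AWS for IAM roles for service accounts (`AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE`); treat those names as an assumption here, since the actual output is not reproduced in this report.

```python
import os

# Names per the AWS documentation for IAM roles for service accounts
# (assumption: the original listing is not shown in this report).
REQUIRED_VARS = ("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE")


def irsa_env_report(env=None):
    """Report the injected variables and whether the token file is readable."""
    env = os.environ if env is None else env
    report = {name: env.get(name) for name in REQUIRED_VARS}
    token_path = report["AWS_WEB_IDENTITY_TOKEN_FILE"]
    # os.access checks readability for the current process's uid and gid.
    report["token_readable"] = bool(token_path) and os.access(token_path, os.R_OK)
    return report
```

Running `irsa_env_report()` as the same user as the Grafana process is the useful case: the variables can be present while the token file is still unreadable for that user.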
To test this issue, we ran Grafana with minimal configuration.
PR #19138 updated aws-sdk-go to a version that supports IAM roles for service accounts.
Environment:
grafana/grafana:6.5.0-beta1 (Docker image)