EKS IAM Roles for Pods #23

Open
pauncejones opened this Issue Dec 5, 2018 · 48 comments

pauncejones commented Dec 5, 2018

Update 1/9/19:

After talking about this internally, we've been working on a proposed solution for this. Below is a writeup on what we're thinking, and we've included some example scripts so you can get a feel for how we expect this to work.

Our plan for IAM and Kubernetes integration

A recent Kubernetes feature, TokenRequestProjection, allows users of Kubernetes to mount custom projected service account tokens in their pods. A “projected service account token” is a bearer token that is intended for use outside of the cluster. Conveniently, these projected service account tokens are also valid OpenID Connect (OIDC) tokens. AWS IAM has supported OIDC as a federated identity provider since 2014, which has allowed customers to use an external identity to assume an IAM role.

By combining these two features, an application running in a pod can pass the projected service account token along with a role ARN to the STS API AssumeRoleWithWebIdentity, and get back temporary role credentials! In order for this to work properly, there is some setup required to create an OIDC provider, and update an IAM role's trust policy so that the Kubernetes service account for a particular cluster is permitted to assume the role.
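
For illustration, here's a minimal sketch of that call using boto3; the token path, role ARN, and region below are placeholders and will depend on your cluster and projected volume setup:

import boto3

TOKEN_PATH = "/var/run/secrets/something/serviceaccount/token"   # placeholder mount path
ROLE_ARN = "arn:aws:iam::111122223333:role/my-pod-role"          # placeholder role ARN

with open(TOKEN_PATH) as f:
    web_identity_token = f.read()

# AssumeRoleWithWebIdentity is an unsigned STS call: the OIDC token is the only credential.
sts = boto3.client("sts", region_name="us-west-2")               # placeholder region
resp = sts.assume_role_with_web_identity(
    RoleArn=ROLE_ARN,
    RoleSessionName="my-pod",
    WebIdentityToken=web_identity_token,
)
creds = resp["Credentials"]  # AccessKeyId, SecretAccessKey, SessionToken, Expiration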

Some of the advantages to this approach are that any pod (including host pods) can assume a role, there is not a reliance on Kubernetes annotations for security, there are not any extra processes that need to be run on nodes, and you will be able to have nodes without any IAM permissions of their own.

In the coming months we will be building out functionality in EKS to create and manage OIDC providers for EKS clusters, as well as configuring IAM roles that can be used in an EKS cluster. We will also be adding support for this authentication mechanism in the AWS SDKs.

Totally open for comments, questions or suggestions on this -- let us know in the comments!

Micah Hausler (@micahhausler), System Development Engineer on EKS

@pauncejones pauncejones created this issue from a note in containers-roadmap (We're Working On It) Dec 5, 2018

@pauncejones pauncejones added the EKS label Dec 5, 2018

christopherhein commented Dec 13, 2018

Exciting to see this get so much attention. Here is an implementation that was brought up in sig-aws back in July of this year; for those of you who are interested, providing feedback will help guide the implementation. kubernetes/community#2329

We'll publish more about our approach soon.

👍

gtaylor commented Dec 13, 2018

Ahh, I was looking for that.

Will that KEP eventually be moved to https://github.com/kubernetes/enhancements ? It looks like kubernetes/community#2329 was closed due to KEPs being moved out to k/enhancements, which seems to have halted discussion and consideration.

christopherhein commented Dec 13, 2018

@gtaylor that was actually incorrect. Sorry about that. That was another implementation from the community. We'll have more details about our implementation coming out soon. Sorry for the confusion.

cpaika commented Dec 20, 2018

Big fan of this - our organization can't adopt EKS until this is resolved.

sbkg0002 commented Dec 23, 2018

Same here, glad this is shared upfront.

007 commented Dec 23, 2018

gtaylor commented Dec 23, 2018

@007 kube2iam can not handle rapid pod churn and lacks some controls for selectively limiting metadata server exposure. It is not a complete, final solution to this problem.

Source: have used kube2iam in production at a large scale.

Vlaaaaaaad commented Dec 23, 2018

@gtaylor : did you try kiam too? Did you find a workaround for the rapid pod churn issues?

I'm in the process of implementing some very spiky workloads and I'm trying to prepare the best I can.

gtaylor commented Dec 23, 2018

I think we are going to stick it out for the "final" solution (the one this issue is tracking).

We had looked at kiam but aren't hurting badly enough to the point of having to make such a large change (for us). That might change, though. Kiam is probably where we'll go if we end up in a spot where kube2iam becomes untenable.

realAndyLuo commented Jan 9, 2019

my EKS friends, any rough ETA on this one?

micahhausler commented Jan 9, 2019

realAndyLuo commented Jan 9, 2019

thanks @micahhausler . Does "Working On it" come with any target date? or too much a spoiler to ask for

skyzyx commented Jan 9, 2019

@realAndyLuo: Never. As a former Amazonian, I can tell you that it'll be ready when it's ready. "Working on it" is as close as you'll ever get to a time commitment.

Cheers. 👍

mikkeloscar commented Jan 9, 2019

I have been working on a replacement for kube2iam/kiam in the form of https://github.com/mikkeloscar/kube-aws-iam-controller. Currently it has only focused on robustness and doesn't have features to restrict what roles you can request within a cluster (there are open issues for that). It also only works with some of the AWS SDKs, but it eliminates all the race conditions which are inherent in the design of kube2iam and kiam.

Maybe it's interesting for some of you.

christopherhein commented Jan 10, 2019

cullenmcdermott commented Jan 11, 2019

The new proposal looks interesting. Quick question though, how would I get/distribute the tokens? Would each token map to one role in IAM?

mikkeloscar commented Jan 11, 2019

By combining these two features, an application running in a pod can pass the projected service account token along with a role ARN to the STS API AssumeRoleWithWebIdentity, and get back temporary role credentials! In order for this to work properly, there is some setup required to create an OIDC provider, and update an IAM role's trust policy so that the Kubernetes service account for a particular cluster is permitted to assume the role.

Does this mean that applications have to actively implement this, or would the AWS SDK automatically do it? What I wanted to avoid with https://github.com/mikkeloscar/kube-aws-iam-controller is that applications need to implement a custom SDK setup for running on Kubernetes. It should just work out of the box whether you run the application on bare EC2, on Kubernetes, or in any other AWS-like environment IMO. If this is not the case, then there will be a long tail of open source applications which need to be updated to support this.

micahhausler commented Jan 11, 2019

@cullenmcdermott

The new proposal looks interesting. Quick question though, how would I get/distribute the tokens? Would each token map to one role in IAM?

Projected service account tokens are issued via the API server, and mounted via the kubelet. You can add a projected token today on newer versions of Kubernetes by using the projected volume type.

kind: Pod
apiVersion: v1
metadata: 
  name: pod-name
  namespace: default
spec:
  serviceAccountName: default
  containers: 
  - name: container-name
    image: container-image:version
    volumeMounts:
    - mountPath: "/var/run/secrets/something/serviceaccount/"
      name: projected-token
  volumes:
  - name: projected-token
    projected:
      sources:
      - serviceAccountToken:
          audience: "client-id"
          expirationSeconds: 86400
          path: token 

The thinking right now is you would add an annotation to either the ServiceAccount or the Pod (not totally decided yet) with the IAM role ARN, and the token volume, volumeMount, and required AWS environment variables (variable names TBD, but the SDKs will need a role ARN and token path) would get added via a mutating webhook. There's a rough sketch of what the SDK side of this could look like after the list below.

On a high level the user workflow would look like this:

  • Create an EKS cluster; an OIDC identity provider gets created in IAM for the cluster automatically
  • User whitelists a specific ServiceAccount namespace/name for a specific cluster to assume the preexisting IAM role, which updates the role's trust policy (similar to this, but we'll make it easier than editing the JSON yourself)
  • User annotates ServiceAccount with the IAM role ARN
  • All pods using that service account get the projected volume and environment variables added by the webhook
  • Updated AWS SDKs running inside the pod know to look for env vars specifying the role and OIDC token path.
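
To make the last two steps a bit more concrete, here's a rough sketch (not the final implementation) of what an updated SDK or a small shim could do with the injected values. The variable names AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE are placeholders, since as noted above the real names are TBD:

import os
import boto3

# Hypothetical variable names; the mutating webhook would inject the real ones.
role_arn = os.environ["AWS_ROLE_ARN"]
token_file = os.environ["AWS_WEB_IDENTITY_TOKEN_FILE"]

with open(token_file) as f:
    token = f.read()

creds = boto3.client("sts", region_name="us-west-2").assume_role_with_web_identity(
    RoleArn=role_arn, RoleSessionName="pod-session", WebIdentityToken=token
)["Credentials"]
# An updated SDK would do this transparently and refresh before Expiration.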

@mikkeloscar

Does this mean that applications have to actively implement this, or would the AWS SDK automatically do it?

It would be automatic with new versions of the SDK.

pingles commented Jan 11, 2019

This sounds cool, we'll definitely be looking to adopt (I say that as one of the creators of https://github.com/uswitch/kiam) 😀 Glad to see this in the roadmap.

Given the SDK update requirement we'd probably have to run side-by-side for a while as all our teams update their apps and libs etc but sounds like that's doable too so all good to me. Thanks to the team there for thinking on it and not just taking the first suggestion!

mustafaakin commented Jan 22, 2019

Would it be possible without upgrading all the AWS SDKs? It would be nice if this part of the SDKs, at least for Java, could be a separate component until we can upgrade.

micahhausler commented Jan 29, 2019

@mustafaakin for applications that couldn't transition right away, you could run a sidecar that would perform the sts:AssumeRoleWithWebIdentity call and expose those credentials on a localhost HTTP endpoint within the pod. You'd have to configure the application container to use the sidecar by setting the environment variable AWS_CONTAINER_CREDENTIALS_FULL_URI.
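
To sketch what such a sidecar could look like (a rough illustration, not an official implementation; the role ARN, token path, and port are placeholders, and the JSON shape follows the container-credentials format the SDKs already understand):

import json
import boto3
from http.server import BaseHTTPRequestHandler, HTTPServer

ROLE_ARN = "arn:aws:iam::111122223333:role/my-pod-role"          # placeholder
TOKEN_PATH = "/var/run/secrets/something/serviceaccount/token"   # placeholder
sts = boto3.client("sts", region_name="us-west-2")               # placeholder region

class CredsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Re-read the projected token on every request; the kubelet rotates it.
        with open(TOKEN_PATH) as f:
            token = f.read()
        creds = sts.assume_role_with_web_identity(
            RoleArn=ROLE_ARN, RoleSessionName="sidecar", WebIdentityToken=token
        )["Credentials"]
        body = json.dumps({
            "AccessKeyId": creds["AccessKeyId"],
            "SecretAccessKey": creds["SecretAccessKey"],
            "Token": creds["SessionToken"],
            "Expiration": creds["Expiration"].strftime("%Y-%m-%dT%H:%M:%SZ"),
        }).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

# The application container points AWS_CONTAINER_CREDENTIALS_FULL_URI at this endpoint.
HTTPServer(("127.0.0.1", 8080), CredsHandler).serve_forever()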

gtaylor commented Jan 29, 2019

Does this also apply to boto/boto3?

micahhausler commented Jan 29, 2019

Yes, pretty much any SDK within the last 2 years would have AWS_CONTAINER_CREDENTIALS_FULL_URI support.
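
For example, with a reasonably recent boto3/botocore and no other credentials configured, pointing that variable at a loopback endpoint (like the sidecar sketched above) should be all the application code needs; the URI here is just a placeholder:

import os
import boto3

# Placeholder URI; must be set before the SDK builds its credential chain.
os.environ["AWS_CONTAINER_CREDENTIALS_FULL_URI"] = "http://127.0.0.1:8080/"

# botocore's container credential provider picks this up automatically,
# so no explicit credential wiring is needed in application code.
print(boto3.client("sts", region_name="us-west-2").get_caller_identity()["Arn"])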

mikkeloscar commented Jan 29, 2019

@mustafaakin for applications that couldn't transition right away, you could run a sidecar that would perform the sts:AssumeRoleWithWebIdentity call and expose those credentials on a localhost HTTP endpoint within the pod. You'd have to configure the application container to use the sidecar by setting the environment variable AWS_CONTAINER_CREDENTIALS_FULL_URI.

Isn't this just a recipe for race conditions? :) If your application container starts and requests the IAM role before the sidecar container has done assumeRole, then your application fails to get the credentials.

micahhausler commented Jan 29, 2019

@mikkeloscar You are right, but I would also say it depends on the implementation of the application. Most AWS SDKs have a retry for metadata credential fetching, and some applications may not initialize the AWS SDK at startup. For those that do and exit, Kubernetes should restart that container while still bringing the sidecar online. It is not the optimal solution, but for cases where a newer SDK update is not immediately available, it could work.

schlomo commented Feb 1, 2019

@mikkeloscar @micahhausler What is important for me - as a K8S and AWS user - is that it "just works" from a usage perspective. The same way aws sts get-caller-identity just works on EC2, I expect the same inside a K8S pod.

My biggest concern with the approach in this issue is whether all the AWS SDKs support re-reading the credential file from time to time, or whether there are some out there that assume the content of a credential file never changes. IMHO this is the big benefit of the EC2 metadata interface: everybody using it knows that the information and credentials obtained from it are temporary.

In a previous setting we had a good experience with pre-fetching IAM credentials as a solution to the 1-second timeout in the AWS SDKs.

chrisz100 commented Feb 5, 2019

General question: is this going to be EKS only, or will you open-source the solution so it can be deployed on custom Kubernetes installations on AWS as well?

micahhausler commented Feb 6, 2019

EKS will have automatic setup, but this will have the capability to work with clusters on any provider.

arminc commented Feb 6, 2019

I was wondering, does this mean we will be able to use a role in another account, or will I still need to do a secondary step and assume the role myself (inside the container, for example)? It sounds like cross-account role assumption might work, which would be nice.

micahhausler commented Feb 6, 2019

Cross-account role assumption is definitely possible, but there would be some setup required in the other account. You'd have to create an OpenID Connect provider in the second account referencing the cluster's issuer URL, and update the trust policy on any roles in the second account to allow the ServiceAccount identity to assume them.
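
A rough sketch of that setup with boto3, assuming you have credentials for the second account; the issuer URL, thumbprint, audience, role name, and the exact condition key format are all placeholders here:

import json
import boto3

iam = boto3.client("iam", region_name="us-east-1")  # credentials for the second account

ISSUER_URL = "https://oidc.example.eks.amazonaws.com/id/CLUSTERID"  # placeholder issuer
ISSUER_HOST = ISSUER_URL.replace("https://", "")

# 1) Register the cluster's OIDC issuer as a federated identity provider.
provider_arn = iam.create_open_id_connect_provider(
    Url=ISSUER_URL,
    ClientIDList=["sts.amazonaws.com"],                          # placeholder audience
    ThumbprintList=["9e99a48a9960b14926bb7f3b02e22da2b0ab7280"],  # placeholder CA thumbprint
)["OpenIDConnectProviderArn"]

# 2) Allow one specific ServiceAccount to assume an existing role in this account.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Federated": provider_arn},
        "Action": "sts:AssumeRoleWithWebIdentity",
        "Condition": {"StringEquals": {
            # Assumed condition key format: "<issuer host>:sub"
            ISSUER_HOST + ":sub": "system:serviceaccount:default:my-service-account",
        }},
    }],
}
iam.update_assume_role_policy(
    RoleName="cross-account-role",                               # placeholder role name
    PolicyDocument=json.dumps(trust_policy),
)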

aavileli commented Mar 26, 2019

Without this feature we cannot move to EKS

rtkgjacobs commented Mar 27, 2019

kiam has been working superbly for us, but we're looking forward to a native AWS solution in EKS. Ideally one not requiring modification or newer AWS SDKs (fingers crossed), or I'd likely have us hold off moving from kiam until most assets out there are built against the newer AWS SDKs.

whereisaaron commented Mar 27, 2019

Hi @rtkgjacobs, sounds positive. Last I looked at kiam it expected the server component to run on master nodes, or at least not on a node with the kiam agent running. How do you handle that with EKS?

kiam components also require you to maintain internal client/server certificates, but there didn't seem to be a mechanism to rotate them? How do you handle that?

adaniline-traderev commented Mar 27, 2019

@whereisaaron We had to run the kiam server and agents in privileged mode using the host network. We use cert-manager to generate client/server certificates, which automatically renews them. We are yet to see if the kiam processes will require a restart after certificates are renewed.

rtkgjacobs commented Mar 28, 2019

I also used cert-manager to manage the certs for kiam. Kiam does not have inotify / auto-reload if the certs change, and I don't think it had a /reload style HTTP hook for a sidecar to trigger easily. We set a cert timespan we hope is longer than the wait for AWS's native IAM solution to hit EKS. Supporting reload would be an ideal design pattern for it.

byrneo commented Apr 4, 2019

@micahhausler will the OIDC provider (which gets auto-created by AWS) be dynamically configured as a federated IDP in the user account's IAM service? If so, would it be correct to think that there will be no 'actual' OIDC protocol interactions between k8s and the OIDC IDP? Is sts:AssumeRoleWithWebIdentity with the required parameters all we'll need?

btw: thanks for having this discussion in the open!

micahhausler commented Apr 4, 2019

@byrneo Correct, the OIDC IDP will get created in the user's IAM service, but no actual OAuth2 flow will happen. EKS will host an IDP for the .well-known/openid-configuration and jwks_uri bits, but that will be transparent to the user; it's just so STS can verify the RSA/ECDSA signing key.
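
For anyone curious what STS will be looking at, here is a small sketch that fetches those two documents for a hypothetical issuer URL:

import json
import urllib.request

ISSUER = "https://oidc.example.eks.amazonaws.com/id/CLUSTERID"  # placeholder issuer

# OIDC discovery document: advertises the issuer, supported claims, and jwks_uri.
with urllib.request.urlopen(ISSUER + "/.well-known/openid-configuration") as r:
    discovery = json.load(r)

# JWKS: the public keys used to verify the projected token's signature.
with urllib.request.urlopen(discovery["jwks_uri"]) as r:
    jwks = json.load(r)

print(discovery["issuer"], [k.get("kid") for k in jwks.get("keys", [])])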

lstoll commented Apr 4, 2019

@micahhausler will the discovery bits be publicly exposed as well? We're already using OIDC-esque methods for auth outside of AWS, if we could re-use this it would be a huge win.

micahhausler commented Apr 4, 2019

Yep! That is the plan. That way, you could configure projected tokens with alternate audiences (aka client_id) for use with other systems

gtaylor commented Apr 4, 2019

Whoa, that's great. So to verify understanding: we're using Okta + OIDC on our self-hosted clusters now. In the future, EKS would allow us to continue using this in addition to STS?

micahhausler commented Apr 4, 2019

@gtaylor For pods to authenticate with outside OIDC systems, yes. (This is not for user auth to the Kubernetes API server using OIDC.)

geota commented Apr 10, 2019

Hi @rtkgjacobs, sounds positive. Last I looked at kiam it expected the server component to run on master nodes, or at least not on a node with the kiam agent running. How do you handle that with EKS?

We just managed it with split node groups, using taints, tolerations, and node selectors.

pawelprazak commented Apr 10, 2019

@geota were you able to prevent pods from running on those nodes (by adding tolerations and selectors) with PSP or an admission controller?

Because normally any user can potentially add the proper tolerations and/or selectors and run on any node pool.

rtkgjacobs commented Apr 10, 2019

Hi @rtkgjacobs, sounds positive. Last I looked at kiam it expected the server component to run on master nodes, or at least not on a node with the kiam agent running. How do you handle that with EKS?

I managed to build a configuration that autoscales from our dev instances to prod. You can get both the kiam agent and server to run on a single EKS worker node (since AWS does not let you put anything on the control plane masters) by setting both to use 'host' networking for the containers. There are considerations in doing this, YMMV, etc.

Here is an example kiam server pod configuration:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  namespace: kube-system
  name: kiam-server
spec:
  replicas: 1
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9620"
      labels:
        app: kiam
        role: server
    spec:
      # We need to use node host network to bypass kiam-agents iptables re-routing
      hostNetwork: true  # <-- key emphasis
      serviceAccountName: kiam-server
      volumes:
        - name: ssl-certs
          hostPath:
            path: /etc/pki/ca-trust/extracted/pem/
        - name: tls
          secret:
            secretName: kiam-server-tls
      containers:
        - name: kiam
          image: quay.io/uswitch/kiam:v3.0-rc1
          imagePullPolicy: Always
          command:
            - /kiam
          args:
            - server
            - --json-log
            - --level=info
            - --bind=0.0.0.0:443
            - --cert=/etc/kiam/tls/tls.crt
            - --key=/etc/kiam/tls/tls.key
            - --ca=/etc/kiam/tls/ca.crt            
            - --role-base-arn-autodetect
            - --sync=1m
            - --prometheus-listen-addr=0.0.0.0:9620
            - --prometheus-sync-interval=5s
          volumeMounts:
            - mountPath: /etc/ssl/certs
              name: ssl-certs
            - mountPath: /etc/kiam/tls
              name: tls
          livenessProbe:
            exec:
              command:
              - /kiam
              - health
              - --cert=/etc/kiam/tls/tls.crt
              - --key=/etc/kiam/tls/tls.key
              - --ca=/etc/kiam/tls/ca.crt
              - --server-address=localhost:443
              - --gateway-timeout-creation=5s
              - --timeout=5s
            initialDelaySeconds: 10
            periodSeconds: 10
            timeoutSeconds: 10
          readinessProbe:
            exec:
              command:
              - /kiam
              - health
              - --cert=/etc/kiam/tls/tls.crt
              - --key=/etc/kiam/tls/tls.key
              - --ca=/etc/kiam/tls/ca.crt
              - --server-address=localhost:443
              - --gateway-timeout-creation=5s
              - --timeout=5s
            initialDelaySeconds: 3
            periodSeconds: 10
            timeoutSeconds: 10

And here is an example agent pod config (it can also run on the same EKS worker node):

apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  namespace: kube-system
  name: kiam-agent        
spec:
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9620"
      labels:
        app: kiam
        role: agent
    spec:
      # We need to use node host network so this container can manipulate the host EC2 nodes iptables and intercept the meta-api calls
      hostNetwork: true  # <-- key emphasis
      dnsPolicy: ClusterFirstWithHostNet
      volumes:
        - name: ssl-certs
          hostPath:
            path: /etc/pki/ca-trust/extracted/pem/
        - name: tls
          secret:
            secretName: kiam-server-tls
        - name: xtables
          hostPath:
            path: /run/xtables.lock
            type: FileOrCreate
      containers:
        - name: kiam
          securityContext:
            capabilities:
              add: ["NET_ADMIN"]      < -- important so  it can interact with iptables of host
          image: quay.io/uswitch/kiam:v3.0-rc1
          imagePullPolicy: Always
          command:
            - /kiam
          args:
            - agent
            - --iptables
            - --host-interface=!eth15   # https://github.com/uswitch/kiam/pull/112/files#r238483446
            - --json-log
            - --level=info
            - --port=8181
            - --cert=/etc/kiam/tls/tls.crt
            - --key=/etc/kiam/tls/tls.key
            - --ca=/etc/kiam/tls/ca.crt
            - --server-address=kiam-server:443
            - --prometheus-listen-addr=0.0.0.0:9620
            - --prometheus-sync-interval=5s
            - --gateway-timeout-creation=1s
          env:
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          volumeMounts:
            - mountPath: /etc/ssl/certs
              name: ssl-certs
            - mountPath: /etc/kiam/tls
              name: tls
            - mountPath: /var/run/xtables.lock
              name: xtables
          livenessProbe:
            httpGet:
              path: /ping
              port: 8181
            initialDelaySeconds: 3
            periodSeconds: 3

Unless BOTH the server and the agent are set to use 'host' networking, you can't expect both to collapse onto a single node.

Hope this helps. For us we wanted a design pattern that can deploy an EKS cluster with a single worker node in the ASG, and then as devs load more pods the K8s autoscaler will bring up more nodes via the AWS ASG - costs start low and can fan out automatically. Ideally that holds us over until AWS provides their native solution and we can sunset kiam.

mustafaakin commented Apr 10, 2019

We like to manage worker groups with subnet, security group, and IAM segregation, and keep a set of nodes for stuff that runs privileged (like kiam) or must work uninterrupted (like Prometheus). We do not let anyone submit YAMLs to Kubernetes directly, only via peer-reviewed automation.

aavileli commented Apr 11, 2019

@rtkgjacobs It's not a good idea to run both the agent and server on the same node. Also, wouldn't you have to allow all nodes to assume the server IAM role, which defeats the purpose of securing the pods?

aavileli commented Apr 11, 2019

@adaniline-traderev can you explain how you run kiam under EKS?

chrissnell commented Apr 11, 2019

Can we take the kiam discussion somewhere else please? I want to keep watching this issue to track AWS's progress. I don't care about third-party efforts.
