Docker images with non-root account fails to read token file #8

Open
serialx opened this issue Sep 11, 2019 · 11 comments

@serialx
serialx commented Sep 11, 2019

What happened:
I tried to install External DNS on my EKS cluster with amazon-eks-pod-identity-webhook.

What you expected to happen:
External DNS working with IAM credentials provided by amazon-eks-pod-identity-webhook

How to reproduce it (as minimally and precisely as possible):

  1. Enable IAM Roles for Service Accounts on your Cluster
  2. Set up amazon-eks-pod-identity-webhook on the cluster.
  3. Use the latest master branch of External DNS
  4. Annotate the external-dns service account:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: external-dns
  namespace: external-dns
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::<REDACTED>:role/K8sExternalDNSRole
  5. External DNS pod logs show:
time="2019-09-11T07:31:53Z" level=error msg="WebIdentityErr: unable to read file at /var/run/secrets/eks.amazonaws.com/serviceaccount/token\ncaused by: open /var/run/secrets/eks.amazonaws.com/serviceaccount/token: permission denied"

Note: I confirmed amazon-eks-pod-identity-webhook is working properly when using root accounts in the pods.
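
One way to confirm this failure mode from inside the affected container is to stat the token file and compare it with the identity of the running process. The following is a minimal diagnostic sketch, assuming a Python interpreter is available in the image. `describe_token_access` is a hypothetical helper name, and the default path is the one from the error message above.

```python
import os
import stat

# Path taken from the error message above. The webhook also exposes it to the
# container through the AWS_WEB_IDENTITY_TOKEN_FILE environment variable.
DEFAULT_TOKEN_PATH = "/var/run/secrets/eks.amazonaws.com/serviceaccount/token"


def describe_token_access(path=None):
    """Report ownership, mode, and readability of a token file.

    Hypothetical helper for illustration only. os.stat follows symlinks,
    so the result describes the real file behind `token -> ..data/token`.
    """
    path = path or os.environ.get("AWS_WEB_IDENTITY_TOKEN_FILE", DEFAULT_TOKEN_PATH)
    info = os.stat(path)
    return {
        "path": path,
        "mode": oct(stat.S_IMODE(info.st_mode)),  # permission bits only
        "owner_uid": info.st_uid,
        "owner_gid": info.st_gid,
        # os.access checks against the real UID/GID, so report those.
        "process_uid": os.getuid(),
        "process_gids": sorted(set(os.getgroups()) | {os.getgid()}),
        "readable": os.access(path, os.R_OK),
    }
```

Calling `describe_token_access()` in a failing pod and printing the result puts the file's mode and owner next to the UID and groups of the process, which makes a mismatch like the one in this report easy to spot.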

Anything else we need to know?:
I've reported this issue to external dns project: kubernetes-sigs/external-dns#1185
It also seems odd that the token files in the Kubernetes service account volume mount have permission 0644, while the EKS tokens have permission 0600.

EKS tokens seem to be 0600:

~ $ ls -al /var/run/secrets/eks.amazonaws.com/serviceaccount/
total 0
drwxrwxrwt    3 root     root           100 Sep 11 06:40 .
drwxr-xr-x    3 root     root            28 Sep 11 06:40 ..
drwxr-xr-x    2 root     root            60 Sep 11 06:40 ..2019_09_11_06_40_49.865776187
lrwxrwxrwx    1 root     root            31 Sep 11 06:40 ..data -> ..2019_09_11_06_40_49.865776187
lrwxrwxrwx    1 root     root            12 Sep 11 06:40 token -> ..data/token
~ $ ls -al /var/run/secrets/eks.amazonaws.com/serviceaccount/..data/token
-rw-------    1 root     root          1028 Sep 11 06:40 /var/run/secrets/eks.amazonaws.com/serviceaccount/..data/token

Kubernetes tokens seem to be 0644:

/run/secrets/kubernetes.io/serviceaccount $ ls -al
total 0
drwxrwxrwt    3 root     root           140 Sep 11 07:31 .
drwxr-xr-x    3 root     root            28 Sep 11 07:31 ..
drwxr-xr-x    2 root     root           100 Sep 11 07:31 ..2019_09_11_07_31_49.116680770
lrwxrwxrwx    1 root     root            31 Sep 11 07:31 ..data -> ..2019_09_11_07_31_49.116680770
lrwxrwxrwx    1 root     root            13 Sep 11 07:31 ca.crt -> ..data/ca.crt
lrwxrwxrwx    1 root     root            16 Sep 11 07:31 namespace -> ..data/namespace
lrwxrwxrwx    1 root     root            12 Sep 11 07:31 token -> ..data/token
/run/secrets/kubernetes.io/serviceaccount $ ls -al ..data/
total 12
drwxr-xr-x    2 root     root           100 Sep 11 07:31 .
drwxrwxrwt    3 root     root           140 Sep 11 07:31 ..
-rw-r--r--    1 root     root          1025 Sep 11 07:31 ca.crt
-rw-r--r--    1 root     root            12 Sep 11 07:31 namespace
-rw-r--r--    1 root     root           875 Sep 11 07:31 token

I've looked at the code, and it doesn't specify the token file permission, so the default of 0644 should apply. Maybe this is an upstream k8s bug?
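
To make the difference between the two listings concrete, here is a small sketch of the classic owner/group/other read check applied to the modes shown above. It is a simplified model, not how the kernel is implemented: it ignores ACLs and capabilities other than treating UID 0 as always allowed. `can_read` is a hypothetical name, and UID/GID 1000 is an assumed non-root identity chosen only for illustration.

```python
def can_read(mode, owner_uid, owner_gid, uid, gids):
    """Return True if a process with `uid` and group set `gids` may read a
    file with the given permission bits and ownership.

    Simplified POSIX model: root bypasses the check, and exactly one of the
    owner/group/other classes applies, tested in that order.
    """
    if uid == 0:
        return True
    if uid == owner_uid:
        return bool(mode & 0o400)  # owner read bit
    if owner_gid in gids:
        return bool(mode & 0o040)  # group read bit
    return bool(mode & 0o004)      # other read bit


# EKS token from the listing: -rw------- root root, read by an assumed UID 1000.
eks_token = can_read(0o600, 0, 0, uid=1000, gids={1000})

# Kubernetes token from the listing: -rw-r--r-- root root, same process.
k8s_token = can_read(0o644, 0, 0, uid=1000, gids={1000})

# Group-owned 0640 file, with the process holding that group (33 here) as a
# supplemental group, as in the fsGroup workaround reported later in the thread.
group_token = can_read(0o640, 0, 33, uid=1000, gids={1000, 33})
```

Under this model the first check fails and the other two succeed, which matches the behavior described in this report: only the 0600 root-owned token is unreadable to a non-root process.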

Environment:

  • AWS Region: ap-northeast-1
  • EKS Platform version (if using EKS, run aws eks describe-cluster --name <name> --query cluster.platformVersion): "eks.1"
  • Kubernetes version (if using EKS, run aws eks describe-cluster --name <name> --query cluster.version): "1.14"
  • Webhook Version: master (Commit ID: d2d6039)
@alfredkrohmer

Problem seems to be here:
https://github.com/kubernetes/kubernetes/blob/cf76868b3430be015cc1e8443a216bf863a9b9f7/pkg/volume/projected/projected.go#L357

For serviceAccountToken, the mode is hardcoded to octal 0600.

@alfredkrohmer

See kubernetes/kubernetes#82573

@micahhausler
Member

Hey, thanks for this report. We'll leave this open and resolve it once Kubernetes fixes the access mode.

@siwyd

siwyd commented Sep 13, 2019

It's a pity that this blocks the use of the native IAM integration until it's fixed. Are there any workable solutions until this is fixed upstream?

@alfredkrohmer

Seems like this is kind-of intended.

There is a workaround here:
kubernetes-sigs/external-dns#1185 (comment)

@serialx
Author

serialx commented Sep 16, 2019

Adding a securityContext to the YAML file resolved this issue. I updated the PR to revise the AWS tutorial accordingly:
kubernetes-sigs/external-dns#1185

@micahhausler
Member

I see a lot of people cross-referencing issues to this one. For anyone else stumbling across this, can you reference the upstream issue too (kubernetes/kubernetes#82573)? That will give upstream more signal that this is a painful issue.

@hasheddan

Just an update for folks here: a new file permission strategy has been implemented for 1.19 in this PR: kubernetes/kubernetes#89193

Taking a look at the KEP (kubernetes/enhancements#1598), it appears that the following chain of permission checks will take place (mostly copied directly from the KEP):

  1. If fsGroup is set, the behavior is essentially the same as what you see currently: the mode is set to 0600, the file is chowned to fsGroup, and fsGroup is added as a supplemental group for the container (this is how the fsGroup workaround described above works).
  2. If all containers in a Pod have the same RunAsUser set, then the token file will be chowned so that it can be accessed by that UID.
  3. Otherwise, the mode is set to 0644. This is a change from the previous implementation, which set 0600 by default. This decision was made to reflect the behavior of Secret.
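
The chain above can be sketched as a small decision function. This is only an illustration of the three steps as listed in this comment, not the kubelet's actual code. `token_file_perms` and `TokenFilePerms` are hypothetical names, and the 0600 mode in the RunAsUser case is an assumption, since the list does not state a mode for that step.

```python
from typing import NamedTuple, Optional, Sequence


class TokenFilePerms(NamedTuple):
    requested_mode: int
    owner_uid: int            # 0 means the file stays owned by root
    group_gid: Optional[int]  # None means the group is left unchanged


def token_file_perms(fs_group: Optional[int],
                     run_as_users: Sequence[Optional[int]]) -> TokenFilePerms:
    """Pick permissions for the projected token file.

    `run_as_users` holds the effective RunAsUser of each container in the
    Pod, with None for a container that does not set one.
    """
    # Step 1: fsGroup is set. Mode 0600, group ownership handed to fsGroup.
    # (Listings later in this thread show 0640 on disk after the group change.)
    if fs_group is not None:
        return TokenFilePerms(0o600, 0, fs_group)

    # Step 2: every container sets the same RunAsUser, so that UID owns the file.
    # The 0600 mode here is an assumption; the list above does not state it.
    users = set(run_as_users)
    if run_as_users and None not in users and len(users) == 1:
        return TokenFilePerms(0o600, run_as_users[0], None)

    # Step 3: fallback. World-readable, matching the behavior of Secret.
    return TokenFilePerms(0o644, 0, None)
```

For example, `token_file_perms(None, [2000, 2000])` takes the second branch and makes UID 2000 the owner, while a Pod in which one container leaves RunAsUser unset falls through to 0644.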

vkhromov added a commit to Yelp/paasta that referenced this issue Aug 13, 2020
AWS pod identity creates a token with `0600` permissions and
`root:root`:
```
-rw------- 1 root root 1498 Aug 13 12:47 /run/secrets/eks.amazonaws.com/serviceaccount/..2020_08_13_12_47_11.793276204/token
```
This prevents programs running inside containers using a non-root
account to read the token.  See [1] for details.

This CR works around that by adding
```
      securityContext:
              fsGroup: 33
```
into the pod spec.
After that the token is owned by the given group and has `0640`
permissions:
```
-rw-r----- 1 root www-data 1498 Aug 13 14:20 /run/secrets/eks.amazonaws.com/serviceaccount/..2020_08_13_14_20_31.793276204/token
```

The id of the group can be changed via the `fs_group` service parameter.

[1] aws/amazon-eks-pod-identity-webhook#8
vkhromov added a commit to Yelp/paasta that referenced this issue Aug 13, 2020
AWS pod identity creates a token with `0600` permissions and
`root:root`:
```
-rw------- 1 root root 1498 Aug 13 12:47 /run/secrets/eks.amazonaws.com/serviceaccount/..2020_08_13_12_47_11.793276204/token
```
This prevents programs running inside containers using a non-root
account to read the token.  See [1] for details.

This CR works around that by adding
```
      securityContext:
              fsGroup: 65534
```
into the pod spec.
After that the token is owned by the given group and has `0640`
permissions:
```
-rw-r----- 1 root nobody 1498 Aug 13 14:20 /run/secrets/eks.amazonaws.com/serviceaccount/..2020_08_13_14_20_31.793276204/token
```

The id of the group can be changed via the `fs_group` service parameter.

[1] aws/amazon-eks-pod-identity-webhook#8
@hsezhiyan

@hasheddan I came late to the discussion, but I'm wondering which Kubernetes version this solution works in:

If all containers in a Pod have the same RunAsUser set, then the token file will be chowned to be accessed by that UID.

Is it only available in v1.19? If so, is there any other workaround? For some context, our team uses Kubeflow Pipelines, which runs on top of K8s, and one limitation is that we can't set an fsGroup through the Kubeflow SDK, but we can set RunAsUser.

@metacyclic

metacyclic commented Feb 11, 2021

This fixed the issue for me (unrelated to External DNS). My container runs as user 2000, and adding fsGroup made the token accessible for IAM service accounts:

securityContext:
  fsGroup: 2000
  runAsUser: 2000

Perhaps this page should be updated to include the extra step for EKS < 1.19

@mcristina422

This is actually found in the official docs now

AWS also recently released 1.19, which fixes this.
