
Ephemeral containers volume mount permissions differ from normal containers #32351

Closed
howardjohn opened this issue Nov 11, 2021 · 23 comments · Fixed by #35053
Assignees
Labels
kind/documentation Categorizes issue or PR as related to documentation. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. sig/docs Categorizes an issue or PR as relevant to SIG Docs. sig/storage Categorizes an issue or PR as relevant to SIG Storage. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@howardjohn
Contributor

What happened?

Working: a pod with multiple containers mounting a projected volume. Note the different UID/GIDs:


apiVersion: apps/v1
kind: Deployment
metadata:
  name: shell-2
  namespace: noproxy
spec:
  selector:
    matchLabels:
      app: shell-2
  template:
    metadata:
      labels:
        app: shell-2
    spec:
      containers:
      - args:
        - /bin/sleep
        - infinity
        image: howardjohn/alpine-shell
        imagePullPolicy: IfNotPresent
        name: shell
        resources:
          limits:
            cpu: 500m
            memory: 500M
          requests:
            cpu: 10m
            memory: 128Mi
        securityContext:
          runAsUser: 1338
      - args:
        - /bin/sleep
        - infinity
        image: howardjohn/alpine-shell
        imagePullPolicy: Always
        name: ephemeral
        resources: {}
        securityContext:
          runAsUser: 1337
          runAsGroup: 1337
        volumeMounts:
        - mountPath: /var/run/secrets/tokens
          name: token
      volumes:
      - name: token
        projected:
          defaultMode: 420
          sources:
          - serviceAccountToken:
              audience: test
              expirationSeconds: 43200
              path: token

This has worked since 1.19. On 1.18 and older we needed fsGroup: 1337 as well.

Not working: doing the same with ephemeral containers

apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
  name: shell
  namespace: noproxy
spec:
  selector:
    matchLabels:
      app: shell
  template:
    metadata:
      labels:
        app: shell
    spec:
      containers:
      - args:
        - /bin/sleep
        - infinity
        image: howardjohn/alpine-shell
        imagePullPolicy: IfNotPresent
        name: shell
        resources:
          limits:
            cpu: 500m
            memory: 500M
          requests:
            cpu: 10m
            memory: 128Mi
        securityContext:
          runAsUser: 1338
      volumes:
      - name: token
        projected:
          defaultMode: 420
          sources:
          - serviceAccountToken:
              audience: test
              expirationSeconds: 43200
              path: token

Then attach an ephemeral container:

name: ephemeral
securityContext:
  runAsUser: 1337
  runAsGroup: 1337
image: howardjohn/alpine-shell
args:
  - /bin/sleep
  - infinity
volumeMounts:
- mountPath: /var/run/secrets/tokens
  name: token

When doing this, accessing the token from the ephemeral container gives a permission denied error.

If we add fsGroup: 1337 to the pod spec, it does work.
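For reference, the workaround is a pod-level fsGroup. This is a minimal sketch of the fragment in question, not a full manifest; note that it affects every volume in the pod, not just the token volume:

```yaml
# Sketch of the workaround: a pod-level fsGroup makes projected volume
# files group-readable by gid 1337, but it applies to all volumes in the pod.
spec:
  securityContext:
    fsGroup: 1337
```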

What did you expect to happen?

File permissions are the same on ephemeral containers and normal containers

How can we reproduce it (as minimally and precisely as possible)?

See above

Anything else we need to know?

No response

Kubernetes version

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.0", GitCommit:"c2b5237ccd9c0f1d600d3072634ca66cefdf272f", GitTreeState:"clean", BuildDate:"2021-08-04T18:03:20Z", GoVersion:"go1.16.6", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"23+", GitVersion:"v1.23.0-alpha.4", GitCommit:"359b722c19ff163fb267f2747d794bd667d4f91d", GitTreeState:"clean", BuildDate:"2021-11-04T17:26:07Z", GoVersion:"go1.17.2", Compiler:"gc", Platform:"linux/amd64"}

Cloud provider

Kind

OS version

# On Linux:
$ cat /etc/os-release
# paste output here
$ uname -a
# paste output here

# On Windows:
C:\> wmic os get Caption, Version, BuildNumber, OSArchitecture
# paste output here

Install tools

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

@howardjohn howardjohn added the kind/bug Categorizes issue or PR as related to a bug. label Nov 11, 2021
@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Nov 11, 2021
@k8s-ci-robot
Contributor

@howardjohn: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Nov 11, 2021
@howardjohn
Contributor Author

/sig storage

cc @verb

@k8s-ci-robot k8s-ci-robot added sig/storage Categorizes an issue or PR as relevant to SIG Storage. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Nov 11, 2021
@liggitt
Member

liggitt commented Nov 24, 2021

cc @verb

does this need to be resolved before 1.23, since ephemeral containers is beta this release?

/milestone v1.23

@verb
Contributor

verb commented Nov 25, 2021

Thanks for pinging this, I missed it. Investigating...

/assign

@verb
Contributor

verb commented Nov 29, 2021

I'm having trouble reproducing this. The projected volumes doc says that the file owner should be set based on the RunAsUser in the pod's (not the container's) SecurityContext.

The owner is sometimes taken from the container's RunAsUser, though; I suspect this happens when there's only one volumeMount for the volume. If I create a pod with a container RunAsUser and a projected volume:

apiVersion: v1
kind: Pod
metadata:
  name: t1
spec:
  containers:
  - name: shell
    image: busybox
    args:
    - sleep
    - "86400"
    securityContext:
      runAsUser: 1
    volumeMounts:
    - name: all-in-one
      mountPath: "/vol"
      readOnly: true
  volumes:
  - name: all-in-one
    projected:
      defaultMode: 0440
      sources:
      - serviceAccountToken:
          audience: test
          expirationSeconds: 43200
          path: token

Then the token gets mode 0600 (the owner matches the container's runAsUser, but the mode is not the defaultMode):

-rw-------    1 daemon   root         885 Nov 29 23:12 token

But if I try to reference the volume twice using different RunAsUsers:

apiVersion: v1
kind: Pod
metadata:
  name: t2
spec:
  containers:
  - name: shell
    image: busybox
    args:
    - sleep
    - "86400"
    securityContext:
      runAsUser: 1
    volumeMounts:
    - name: all-in-one
      mountPath: "/vol"
      readOnly: true
  - name: shell2
    image: busybox
    args:
    - sleep
    - "86400"
    securityContext:
      runAsUser: 2
    volumeMounts:
    - name: all-in-one
      mountPath: "/vol"
      readOnly: true
  volumes:
  - name: all-in-one
    projected:
      defaultMode: 0440
      sources:
      - serviceAccountToken:
          audience: test
          expirationSeconds: 43200
          path: token

Then the token gets the defaultMode, owned by root (in both containers):

-r--r-----    1 root     root         885 Nov 29 23:12 token

Now if I add an ephemeral container to t1:

  ephemeralContainers:
  - args:
    - /bin/sleep
    - infinity
    image: busybox
    imagePullPolicy: Always
    name: ephemeral
    securityContext:
      runAsUser: 5
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /vol
      name: all-in-one

Then I see the same file ownership as before:

-rw-------    1 daemon   root         885 Nov 29 23:12 token

So this differs, but is it the incorrect behavior for ephemeral containers? I'm not sure. I'm also not sure what behavior you're seeing for your first test case (shell-2) because when I try it on my test cluster I get the same behavior as t2, regardless of whether there are 1 or 2 volumeMounts, except that the file gets mode 0644.

@howardjohn Could you provide more succinct test cases with pods rather than deployments, and include the output of ls -l $path/..data/?

@msau42 @jpeeler Any idea what's the intended pod-vs-container securityContext behavior for file ownership of projected volumes?

@howardjohn
Contributor Author

t2 reproducer:

apiVersion: v1
kind: Pod
metadata:
  name: shell
spec:
  containers:
  - args:
    - /bin/sleep
    - infinity
    image: howardjohn/alpine-shell
    name: shell
    securityContext:
      runAsUser: 1338
  - args:
    - /bin/sleep
    - infinity
    image: howardjohn/alpine-shell
    name: ephemeral
    securityContext:
      runAsUser: 1337
      runAsGroup: 1337
    volumeMounts:
    - mountPath: /var/run/secrets/tokens
      name: token
  volumes:
  - name: token
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          audience: test
          expirationSeconds: 43200
          path: token

Testing that we can access the token:

$ kubectl exec shell -c ephemeral -- ls -l var/run/secrets/tokens/..data/
total 4
-rw-r--r--    1 root     root           878 Nov 29 23:38 token
$ kubectl exec shell -c ephemeral -- head -c5 var/run/secrets/tokens/token
eyJhb

With ephemeral containers:

apiVersion: v1
kind: Pod
metadata:
  name: shell
spec:
  containers:
  - args:
    - /bin/sleep
    - infinity
    image: howardjohn/alpine-shell
    name: shell
    securityContext:
      runAsUser: 1338
  volumes:
  - name: token
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          audience: test
          expirationSeconds: 43200
          path: token

Attach an ephemeral container:

name: ephemeral
securityContext:
  runAsUser: 1337
  runAsGroup: 1337
image: howardjohn/alpine-shell
args:
  - /bin/sleep
  - infinity
volumeMounts:
- mountPath: /var/run/secrets/tokens
  name: token

Testing access:

$ kubectl exec shell -c ephemeral -- ls -l var/run/secrets/tokens/..data/
total 4
-rw-------    1 1338     root           878 Nov 29 23:41 token
$ kubectl exec shell -c ephemeral -- head -c5 var/run/secrets/tokens/token
head: var/run/secrets/tokens/token: Permission denied
command terminated with exit code 1

@msau42
Member

msau42 commented Nov 29, 2021

There is special logic to handle setting permissions for projected service account tokens. I wonder if the logic is not handling ephemeral containers properly. kubernetes/kubernetes#89193

@liggitt
Member

liggitt commented Nov 30, 2021

The logic in https://github.com/kubernetes/enhancements/blob/master/keps/sig-storage/2451-service-account-token-volumes/README.md#file-permission (implemented in kubernetes/kubernetes#89193) requires visibility to all containers at pod creation time.

If all non-ephemeral containers run as the same non-root uid, the token will only be readable by that uid.
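The heuristic described here can be sketched roughly as follows. This is a hedged illustration in Python, not the actual kubelet code; the names `Container` and `token_file_ownership` are hypothetical:

```python
# Rough sketch of the KEP 2451 file-permission heuristic for projected
# service account tokens (illustrative only, not the kubelet implementation).
from dataclasses import dataclass
from typing import Optional

@dataclass
class Container:
    name: str
    run_as_user: Optional[int]  # securityContext.runAsUser, if set

def token_file_ownership(containers, default_mode=0o644):
    """Return (owner_uid, mode) for the projected token file.

    If every container known at pod creation runs as the same non-root
    uid, the token is owned by that uid with mode 0600. Ephemeral
    containers are added after pod creation, so they are never part of
    `containers` and never influence the result.
    """
    uids = {c.run_as_user for c in containers}
    if len(uids) == 1:
        (uid,) = uids
        if uid is not None and uid != 0:
            return uid, 0o600
    # Mixed or root uids: fall back to root ownership and the volume's
    # defaultMode; readability then depends on fsGroup or a wide mode.
    return 0, default_mode

# All creation-time containers share uid 1338, so the token becomes
# 1338 / 0600 and is unreadable by an ephemeral container running as 1337.
print(token_file_ownership([Container("shell", 1338)]))
```

This matches the observations above: the t1 pod (one container, uid 1) yields an 0600 token owned by that uid, while t2 (uids 1 and 2) falls back to root ownership with the 0440 defaultMode.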

@verb
Contributor

verb commented Dec 3, 2021

Thanks for the test cases and the links to the KEP, they made it much easier to figure out what's going on here.

It sounds like everything is working as intended. The KEP even mentions that ephemeral containers are excluded from the heuristic. The only question is whether we should update the projected volume docs to mention the special handling of tokens.

@howardjohn you mentioned that fsGroup works as you expected. Is that an acceptable work around for your use case?

@howardjohn
Contributor Author

Not really, since it impacts all volumes. Basically, the use case is a sidecar that wants access to a token. Setting fsGroup at the pod level impacts the main application's volumes, which typically breaks everything. The changes in KEP 2451 fix this, but not for ephemeral containers.

That being said, I do not personally have a real use case for this unless the volume could also be added dynamically, so if this is the intended behavior, that is OK with me.

@liggitt
Member

liggitt commented Dec 3, 2021

is running the ephemeral container as a non-zero uid that is different from the non-zero uids of the other containers common?

@howardjohn
Contributor Author

I highly doubt it. For context on how I found this: I was experimenting with running a service mesh sidecar as an ephemeral container. That was almost entirely just an experiment, and clearly an abuse of what ephemeral containers were designed for; I was mostly just playing around with things.

@liggitt liggitt added kind/documentation Categorizes issue or PR as related to documentation. and removed kind/bug Categorizes issue or PR as related to a bug. labels Dec 4, 2021
@liggitt
Member

liggitt commented Dec 4, 2021

since this is not a bug, switching type to documentation and removing from milestone

@sftim
Contributor

sftim commented Dec 15, 2021

I recommend opening a documentation issue against k/website that explains the docs change to make here.

Ideally: be reasonably thorough, as the technical details here seem quite subtle.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 15, 2022
@verb
Contributor

verb commented Mar 18, 2022

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 18, 2022
@verb
Contributor

verb commented Mar 18, 2022

/transfer website

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Mar 18, 2022
@k8s-ci-robot k8s-ci-robot transferred this issue from kubernetes/kubernetes Mar 18, 2022
@verb
Contributor

verb commented Mar 18, 2022

Cool, the transfer worked. Here's a summary for k/website:

Problem:

Projected volumes have special handling for serviceAccountTokens that causes them to disregard defaultMode when all containers in a pod have the same runAsUser. This undocumented behavior led to a bug report in kubernetes/kubernetes#106365.

Proposed Solution:

Update projected volumes doc to explain special behavior:

  • When all containers specified at pod creation have the same runAsUser, the serviceAccountToken file will use that user as its owner and be set to mode 0600.
  • Ephemeral containers are never present at pod creation, so an ephemeral container's runAsUser isn't considered in this behavior.
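To illustrate the proposed doc text, a manifest like the following (illustrative only, not taken from the issue) would get a token owned by uid 1000 with mode 0600, since every creation-time container shares that runAsUser:

```yaml
# Illustration: a single runAsUser across all creation-time containers
# means the projected token is owned by that uid with mode 0600.
apiVersion: v1
kind: Pod
metadata:
  name: same-uid
spec:
  containers:
  - name: app
    image: busybox
    args: ["sleep", "86400"]
    securityContext:
      runAsUser: 1000
    volumeMounts:
    - name: token
      mountPath: /var/run/secrets/tokens
  volumes:
  - name: token
    projected:
      sources:
      - serviceAccountToken:
          path: token
```

An ephemeral container attached later with a different runAsUser would not change that ownership and would be unable to read the token.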

Page to Update:
https://kubernetes.io/docs/concepts/storage/projected-volumes/

@k8s-triage-robot


/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 16, 2022
@verb
Contributor

verb commented Jul 15, 2022

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 15, 2022
@k8s-triage-robot


/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 13, 2022
@sftim
Contributor

sftim commented Oct 29, 2022

/sig docs

@k8s-ci-robot k8s-ci-robot added the sig/docs Categorizes an issue or PR as relevant to SIG Docs. label Oct 29, 2022
@sftim
Contributor

sftim commented Oct 29, 2022

/triage accepted
/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Oct 29, 2022
7 participants