
Nested data volumes (e.g. /secrets & /secrets/more) cause unexpected behavior #57421

Closed
joelsmith opened this issue Dec 19, 2017 · 2 comments · Fixed by #57422
Assignees: joelsmith
Labels: kind/bug (Categorizes issue or PR as related to a bug.), sig/node (Categorizes an issue or PR as relevant to SIG Node.), sig/storage (Categorizes an issue or PR as relevant to SIG Storage.)

Comments

joelsmith (Contributor) commented Dec 19, 2017

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened:

Data volumes backed by the API, such as secret, configmap, downwardAPI and projected volumes, appear to do strange things when nested. Each mounted volume is intended to correspond 1:1 with its API object's data, so any change to that data causes the kubelet to load a new version of the data and to replace all of the old volume data with the newly-loaded data.

  1. When the kubelet asks Docker to mount the nested volume into the base volume, Docker creates a mount point in the base directory upon which it can mount the nested volume.

  2. The kubelet views the existence of the mount point directory in the base directory as a change that puts the volume out of sync with the data.

  3. The kubelet attempts to remove the old versions of the secrets and re-symlink the data (see the layout sketch after this list).
    a. on an old-enough kernel (e.g. 3.10.0-514.21.1.el7.x86_64), the remove attempt fails because the directory is an in-use mount point and gives an error of "Device or resource busy".
    b. on a newer kernel (e.g. 3.10.0-693.5.2.el7.x86_64), the remove succeeds despite the directory being an active mount point, and the nested mount is no longer visible from within the container even though it is still mounted.

  4. On the old kernel, the kubelet continuously repeats step 3 because every re-sync iteration makes it think that it needs a refresh due to the constant presence of the mountpoint directory. Additionally, every iteration leaves behind an old copy of the data since an error is encountered prior to its removal. For example, you might end up with 100 directories that look like /secrets/..YYYY_MM_DD_HH_MM_SS.XXXXXXXXX after a few hours.
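For reference, here is a rough sketch of how one of these atomic-updated volumes is laid out and where the Docker-created mount point confuses the sync loop. Paths and timestamps are illustrative, not captured from a real run:

```sh
# Illustrative layout of the /secrets volume with /secrets/more nested in it.
$ ls -a /secrets
.  ..  ..2017_12_19_22_15_41.123456789  ..data  more  secret

# ..data points at the current timestamped payload directory, and each
# user-visible item is a symlink that resolves through it:
$ readlink /secrets/..data
..2017_12_19_22_15_41.123456789
$ readlink /secrets/secret
..data/secret

# "more" is the directory Docker created as a mount point for the nested
# volume. It is not part of the secret's payload, so on every sync the
# kubelet sees it as content that should not be there and tries to rewrite
# the whole volume (steps 2 and 3 above).
```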

What you expected to happen:
As long as the name of a nested volume doesn't interfere with the name of a data item from a parent volume, I would expect the nested volume to work.

How to reproduce it (as minimally and precisely as possible):

  1. Create a yaml file with 2 secrets and a pod that makes volumes for the 2 secrets, with one of them mounted within the other:
apiVersion: v1
kind: List
items:
- apiVersion: v1
  kind: Secret
  data:
    secret: bXkgc2VjcmV0Cg==
  metadata:
    name: secrets
  type: Opaque
- apiVersion: v1
  kind: Secret
  data:
    secret: c2VjcmV0Cg==
  metadata:
    name: moresecrets
  type: Opaque
- apiVersion: v1
  kind: Pod
  metadata:
    name: busybox-pod
  spec:
    containers:
    - image: gcr.io/google_containers/busybox:latest
      name: busybox
      command: ["/bin/sh"]
      args: ["-c", "while sleep 3600; do true; done"]
      volumeMounts:
      - mountPath: /secrets
        name: secrets
      - mountPath: /secrets/more
        name: moresecrets
    restartPolicy: OnFailure
    volumes:
    - name: secrets
      secret:
        secretName: secrets
    - name: moresecrets
      secret:
        secretName: moresecrets
  2. Create the 2 secrets and the pod. Assuming you named the above file pod.yaml:
kubectl create -f pod.yaml
  3. Verify that the bug is happening. Wait a few minutes to make sure that a sync cycle has happened, then look at the container's /secrets directory:
kubectl exec -ti busybox-pod -- sh -c 'find /secrets; echo -n Secret:; cat /secrets/more/secret'

If you have a new kernel, /secrets/more/secret won't exist. If you have an old kernel, it will exist but you should see several timestamp directories in /secrets (one for each sync iteration) that aren't being cleaned up.
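For illustration only (timestamps, the number of leftover directories, and exact messages will differ), the verification command might print something like this on the two kinds of kernel:

```sh
# Older kernel: the nested secret is still readable, but a stale timestamped
# payload directory is left behind for every failed sync iteration
# (recursive find output trimmed):
/secrets
/secrets/..2017_12_19_22_15_41.123456789
/secrets/..2017_12_19_22_15_41.123456789/secret
/secrets/..2017_12_19_22_16_43.987654321
/secrets/..2017_12_19_22_16_43.987654321/secret
/secrets/..data
/secrets/secret
/secrets/more
/secrets/more/secret
Secret:secret

# Newer kernel: the mount-point directory was removed out from under the
# nested mount, so the nested secret is no longer visible:
/secrets
/secrets/..2017_12_19_22_15_41.123456789
/secrets/..2017_12_19_22_15_41.123456789/secret
/secrets/..data
/secrets/secret
Secret:cat: can't open '/secrets/more/secret': No such file or directory
```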

Environment:

  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"10+", GitVersion:"v1.10.0-alpha.1.90+157fb7b66cff84", GitCommit:"157fb7b66cff84e77f02110718509facd13681c5", GitTreeState:"clean", BuildDate:"2017-12-19T20:57:27Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"10+", GitVersion:"v1.10.0-alpha.1.90+157fb7b66cff84", GitCommit:"157fb7b66cff84e77f02110718509facd13681c5", GitTreeState:"clean", BuildDate:"2017-12-19T20:57:27Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}
k8s-ci-robot added the kind/bug label Dec 19, 2017
k8s-github-robot added the needs-sig label (indicates an issue or PR lacks a `sig/foo` label and requires one) Dec 19, 2017
joelsmith (Contributor, Author) commented:

/sig storage

k8s-ci-robot added the sig/storage label Dec 19, 2017
k8s-github-robot removed the needs-sig label Dec 19, 2017
joelsmith (Contributor, Author) commented:

/sig storage
/sig node
/assign joelsmith

k8s-ci-robot added the sig/node label Dec 21, 2017
k8s-github-robot pushed a commit that referenced this issue Jan 18, 2018
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Rework method of updating atomic-updated data volumes

**What this PR does / why we need it**:

This change affects the way that secret, configmap, downwardAPI and projected volumes (which all use the same underlying code) implement their data update functionality.

* Instead of creating a subdirectory hierarchy that will contain symlinks to each actual data file, create only symlinks to items in the root of the volume, whether they be files or directories (see the layout sketch after this list).
* Rather than comparing the user-visible data directory to see if an update is needed, compare with the current version of the data directory.
* Fix data dir timestamp format year
* Create `..data` symlink even when a data volume has no data so consumers can have simplified update watch logic.
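As a rough sketch of what this means for the repro case above (paths and timestamps are illustrative, and this is a reading of the PR description rather than literal output), the volume root holds only per-item symlinks that resolve through `..data`, and an update becomes an atomic swap of that single symlink:

```sh
# Illustrative layout after the change; only items in the root of the volume
# get symlinks, whether they are files or directories.
$ ls -a /secrets
.  ..  ..2017_12_19_22_15_41.123456789  ..data  more  secret
$ readlink /secrets/..data
..2017_12_19_22_15_41.123456789
$ readlink /secrets/secret
..data/secret

# On an update, the kubelet writes a new timestamped payload directory and
# atomically repoints ..data at it; because the comparison is now against the
# current payload directory rather than the user-visible root, the
# Docker-created "more" mount point no longer forces the volume to be torn
# down and rebuilt.
$ readlink /secrets/..data   # after a subsequent update (illustrative)
..2017_12_19_22_30_07.222333444
```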

**Which issue(s) this PR fixes**:
Fixes #57421

**Release note**:
```release-note
Correct issues that arise when volumes are mounted beneath another secret, configmap, downwardAPI or projected volume
```