Nested volumes sometimes not mounted correctly #71800

Open
pascalgn opened this Issue Dec 6, 2018 · 2 comments

pascalgn commented Dec 6, 2018

What happened:
When running Jenkins in Kubernetes on Azure, we have two volumes: /var/jenkins_home (on a CIFS network share) and /var/jenkins_home/workspace (from a persistent volume). When the Jenkins pod starts, for example after the node has rebooted, /var/jenkins_home sometimes gets mounted but /var/jenkins_home/workspace does not.

What you expected to happen:
The path /var/jenkins_home/workspace should only get mounted after /var/jenkins_home has been mounted. I suspect a timing issue where /var/jenkins_home/workspace sometimes gets mounted before /var/jenkins_home has finished mounting (especially considering that /var/jenkins_home is on a network share).
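
To illustrate the ordering I would expect, here is a minimal Python sketch. It is only an illustration of "parents before children" and makes no claim about how the kubelet or the container runtime actually order mounts; `parent_first` is a hypothetical helper name.

```python
from pathlib import PurePosixPath


def parent_first(mount_paths):
    """Return mount paths ordered so that every parent precedes its children."""
    # Comparing paths by their components puts /a before /a/b, because a
    # tuple that is a prefix of another tuple sorts first.
    return sorted(mount_paths, key=lambda p: PurePosixPath(p).parts)


paths = [
    "/var/jenkins_home/workspace",
    "/var/jenkins_home/tools",
    "/var/jenkins_home",
]
print(parent_first(paths))
# ['/var/jenkins_home', '/var/jenkins_home/tools', '/var/jenkins_home/workspace']
```

Sorting by components rather than by raw string avoids treating a sibling such as /var/jenkins_home2 as if it were nested under /var/jenkins_home.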

How to reproduce it (as minimally and precisely as possible):

apiVersion: apps/v1beta2
kind: StatefulSet
metadata:
  name: jenkins
spec:
  serviceName: jenkins
  selector:
    matchLabels:
      app: jenkins
  replicas: 1
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: jenkins
    spec:
      containers:
        - name: jenkins
          image: jenkins:2.154
          ports:
            - containerPort: 80
              name: jenkins-http
          volumeMounts:
            - mountPath: /var/jenkins_home
              name: jenkins-home
            - mountPath: /var/jenkins_home/cache
              name: jenkins-home-cache
            - mountPath: /var/jenkins_home/caches
              name: jenkins-home-caches
            - mountPath: /var/jenkins_home/tools
              name: jenkins-home-tools
            - mountPath: /var/jenkins_home/updates
              name: jenkins-home-updates
            - mountPath: /var/jenkins_home/workspace
              name: jenkins-home-workspace
      volumes:
        - name: jenkins-home
          azureFile:
            secretName: storage-secret
            shareName: jenkins-home
            readOnly: false
        - name: jenkins-home-cache
          emptyDir: {}
        - name: jenkins-home-caches
          emptyDir: {}
        - name: jenkins-home-updates
          emptyDir: {}
  volumeClaimTemplates:
    - metadata:
        name: jenkins-home-workspace
        annotations:
          volume.beta.kubernetes.io/storage-class: managed-premium
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 75Gi
    - metadata:
        name: jenkins-home-tools
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
---
apiVersion: v1
kind: Service
metadata:
  name: jenkins
  labels:
    app: jenkins
spec:
  type: LoadBalancer
  ports:
    - port: 80
      name: service-http
  selector:
    app: jenkins

I have omitted the parts that are (probably) irrelevant.

Note that there is also a second nested mount, /var/jenkins_home/tools, which did get mounted even when /var/jenkins_home/workspace was not.

Anything else we need to know?:
The issue does not occur all the time. It can be resolved by connecting to the pod and killing Jenkins, so that the container restarts. After the container restart, all volumes are mounted correctly.

Environment:

  • Kubernetes version (use kubectl version): Client: v1.12.2, Server: v1.11.1
  • Cloud provider or hardware configuration: Azure
  • OS (e.g. from /etc/os-release): Debian GNU/Linux 9 (stretch)
  • Kernel (e.g. uname -a): Linux jenkins-0 4.15.0-1032-azure #33~16.04.1-Ubuntu SMP Fri Nov 9 22:36:11 UTC 2018 x86_64 GNU/Linux
  • Install tools: acs-engine
  • Others: -

/kind bug

Member

andyzhangx commented Dec 8, 2018

@pascalgn since jenkins-home is an existing Azure file share and jenkins-home-workspace is a new disk, mounting jenkins-home should usually be faster. That said, there should be no required ordering between mount points.

pascalgn commented Dec 8, 2018

I already suspected this might not necessarily be a Kubernetes issue, at least not exclusively. I searched a bit more and found:

> The issue is that from time to time the "/home/bigdir" folder will be empty, even though mtab think that the share is still mounted. Only thing that works is by unmounting, and then (re)mounting the bigdir share.
(from "NFS mount mounted inside another NFS mount disappears randomly")

A similar entry is "NFS mounts depending on other NFS mount fail to mount at boot".
There are workarounds mentioned, like using autofs or netfs (not sure if they also support CIFS).

So this seems to be a general issue with nested mounting, that is, mounting network shares inside other mounted network shares.

I think it would make sense to do one or more of the following:

  1. Update the documentation to warn users that network volumes should not be nested, i.e. a network volume should not be mounted inside another network volume mount point
  2. Issue a warning event when a network volume is mounted inside another network volume
  3. Implement some kind of retry logic, so when the parent directory gets remounted (for example due to network issues), the child directories should also get remounted
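
As a rough sketch of what the check behind option 2 could look like (written in Python for brevity, not a proposal for the actual Kubernetes code): `nested_network_mounts` is a hypothetical helper, and the `is_network` flag is an assumption supplied by the caller, since deciding which volume types count as "network volumes" is a separate question.

```python
from pathlib import PurePosixPath


def nested_network_mounts(mounts):
    """Find network mounts that sit inside another network mount.

    mounts: list of (mount_path, is_network) tuples.
    Returns a list of (parent_path, child_path) pairs.
    """
    network = [PurePosixPath(path) for path, is_network in mounts if is_network]
    pairs = []
    for parent in network:
        for child in network:
            # child.parents holds every ancestor directory of child.
            if parent in child.parents:
                pairs.append((str(parent), str(child)))
    return pairs


# Which volumes are network-backed is an assumption made for this example.
mounts = [
    ("/var/jenkins_home", True),
    ("/var/jenkins_home/workspace", True),
    ("/var/jenkins_home/caches", False),
]
for parent, child in nested_network_mounts(mounts):
    print(f"warning: {child} is mounted inside network volume {parent}")
# warning: /var/jenkins_home/workspace is mounted inside network volume /var/jenkins_home
```

Each pair found this way could then be turned into a warning event on the pod.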

WDYT?
