Pending pvc in WaitForFirstConsumer if pod scheduled by custom scheduler #86262

Closed

tianhe-oi opened this issue Dec 13, 2019 · 4 comments

Labels: kind/bug · sig/scheduling

Comments

tianhe-oi commented Dec 13, 2019

What happened:
A pod uses a PVC backed by a StorageClass with volumeBindingMode: WaitForFirstConsumer, and the pod specifies a custom scheduler via schedulerName.

  • Pod stuck in ContainerCreating state with message Unable to mount volumes for pod: timeout expired waiting for volumes to attach or mount for pod
  • PVC stuck in Pending state with message waiting for first consumer to be created before binding

What you expected to happen:
The pod gets scheduled with the PV attached.

How to reproduce it (as minimally and precisely as possible):
Create a StorageClass with volumeBindingMode: WaitForFirstConsumer:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
  labels:
    k8s-addon: storage-aws.addons.k8s.io
  name: test-gp2
parameters:
  type: gp2
provisioner: kubernetes.io/aws-ebs
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true

Create a PVC using that StorageClass:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: task-pv-claim
spec:
  storageClassName: test-gp2
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

Create a pod using the PVC and a custom (non-default) scheduler:

apiVersion: v1
kind: Pod
metadata:
  name: task-pv-pod
spec:
  schedulerName: YOUR_CUSTOM_SCHEDULER
  volumes:
    - name: task-pv-storage
      persistentVolumeClaim:
        claimName: task-pv-claim
  containers:
    - name: task-pv-container
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: task-pv-storage

Our custom scheduler's policy configuration:

{
    "kind": "Policy",
    "apiVersion": "v1",
    "metadata": {
        "name": "streamy-scheduler",
        "namespace": "kube-system"
    },
    "predicates": [
        {"name": "NoVolumeZoneConflict"},
        {"name": "MaxEBSVolumeCount"},
        {"name": "MaxGCEPDVolumeCount"},
        {"name": "MaxAzureDiskVolumeCount"},
        {"name": "MatchInterPodAffinity"},
        {"name": "NoDiskConflict"},
        {"name": "GeneralPredicates"},
        {"name": "PodToleratesNodeTaints"}
    ],
    "priorities": [
        {"name": "NodePreferAvoidPodsPriority", "weight": 1},
        {"name": "NodeAffinityPriority", "weight": 10},
        {"name": "TaintTolerationPriority", "weight": 1},
        {"name": "InterPodAffinityPriority", "weight": 20},
        {"name": "MostRequestedPriority", "weight": 5}
    ],
    "hardPodAffinitySymmetricWeight": 10,
    "alwaysCheckAllPredicates": false
}

Anything else we need to know?:
We are using Kubernetes version 1.14.9. This works fine in 1.13.12 but does not work in 1.14.9.
Looking at the controller-manager log, it seems the volume manager never gets notified that the pod has been scheduled.
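For context, our understanding of the mechanism (inferred, not from the logs): with WaitForFirstConsumer, dynamic provisioning does not start until the scheduler's volume binding step records the chosen node on the claim. In 1.14 the in-tree code does this by annotating the PVC, roughly like the sketch below; the node name is illustrative.

# Sketch: what the claim looks like once the scheduler's volume binding
# step has picked a node. Without this annotation, the PV controller keeps
# reporting "waiting for first consumer to be created before binding".
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: task-pv-claim
  annotations:
    volume.kubernetes.io/selected-node: ip-10-0-0-1.ec2.internal  # illustrative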

Environment:

  • Kubernetes version (use kubectl version): 1.14.9
  • Cloud provider or hardware configuration: aws
  • OS (e.g: cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Network plugin and version (if this is a network-related bug):
  • Others:
tianhe-oi added the kind/bug label on Dec 13, 2019
k8s-ci-robot added the needs-sig label on Dec 13, 2019
tianhe-oi (Author) commented:

/sig scheduling

k8s-ci-robot added the sig/scheduling label and removed the needs-sig label on Dec 13, 2019
tianhe-oi (Author) commented:

Any ideas for this?

tianhe-oi (Author) commented:

It turns out we need to enable the CheckVolumeBinding predicate in the scheduler policy; see the corrected policy below.
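For anyone else who hits this: a sketch of the corrected policy, i.e. the policy from the issue description with CheckVolumeBinding appended to the predicates list (in 1.14, this predicate is what performs the delayed volume binding for WaitForFirstConsumer claims; everything else is unchanged):

{
    "kind": "Policy",
    "apiVersion": "v1",
    "metadata": {
        "name": "streamy-scheduler",
        "namespace": "kube-system"
    },
    "predicates": [
        {"name": "NoVolumeZoneConflict"},
        {"name": "MaxEBSVolumeCount"},
        {"name": "MaxGCEPDVolumeCount"},
        {"name": "MaxAzureDiskVolumeCount"},
        {"name": "MatchInterPodAffinity"},
        {"name": "NoDiskConflict"},
        {"name": "GeneralPredicates"},
        {"name": "PodToleratesNodeTaints"},
        {"name": "CheckVolumeBinding"}
    ],
    "priorities": [
        {"name": "NodePreferAvoidPodsPriority", "weight": 1},
        {"name": "NodeAffinityPriority", "weight": 10},
        {"name": "TaintTolerationPriority", "weight": 1},
        {"name": "InterPodAffinityPriority", "weight": 20},
        {"name": "MostRequestedPriority", "weight": 5}
    ],
    "hardPodAffinitySymmetricWeight": 10,
    "alwaysCheckAllPredicates": false
}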

tianhe-oi (Author) commented:

Closing the issue.
