
Dynamic volume provisioning creates EBS volume in the wrong availability zone #39178

Closed
jimmycuadra opened this issue Dec 23, 2016 · 79 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@jimmycuadra
Contributor

What keywords did you search in Kubernetes issues before filing this one? (If you have found any duplicates, you should instead reply there.): dynamic volume provisioning


Is this a BUG REPORT or FEATURE REQUEST? (choose one): bug report

Kubernetes version (use kubectl version):

Client Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.1", GitCommit:"82450d03cb057bab0950214ef122b67c83fb11df", GitTreeState:"clean", BuildDate:"2016-12-22T13:59:22Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.1", GitCommit:"82450d03cb057bab0950214ef122b67c83fb11df", GitTreeState:"clean", BuildDate:"2016-12-14T00:52:01Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Cloud provider or hardware configuration: AWS
  • OS (e.g. from /etc/os-release): CoreOS 1185.5.0
  • Kernel (e.g. uname -a): Linux ip-10-0-1-121.ec2.internal 4.7.3-coreos-r3 #1 SMP Wed Dec 7 09:29:55 UTC 2016 x86_64 Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz GenuineIntel GNU/Linux
  • Install tools: kaws
  • Others:

What happened:

Created a stateful set with a persistent volume claim. Dynamic volume provisioning created an EBS volume in the us-east-1a availability zone, despite all the masters and nodes in the cluster being in us-east-1e. I tried it twice with the same results both times.

The PVC:

$ kubectl describe pvc -n errbit
Name:		mongodb-mongodb-0
Namespace:	errbit
StorageClass:	standard
Status:		Bound
Volume:		pvc-5fc8e90b-c8aa-11e6-9924-069508572ed2
Labels:		app=errbit
		component=mongodb
Capacity:	1Gi
Access Modes:	RWO
No events.

The PV created by dynamic provisioning:

$ kubectl describe pv
Name:		pvc-5fc8e90b-c8aa-11e6-9924-069508572ed2
Labels:		failure-domain.beta.kubernetes.io/region=us-east-1
		failure-domain.beta.kubernetes.io/zone=us-east-1a
StorageClass:	standard
Status:		Bound
Claim:		errbit/mongodb-mongodb-0
Reclaim Policy:	Delete
Access Modes:	RWO
Capacity:	1Gi
Message:
Source:
    Type:	AWSElasticBlockStore (a Persistent Disk resource in AWS)
    VolumeID:	aws://us-east-1a/vol-0d32c25dc73f029af
    FSType:	ext4
    Partition:	0
    ReadOnly:	false
No events.

The stateful set:

$ kubectl describe statefulset mongodb -n errbit
Name:			mongodb
Namespace:		errbit
Image(s):		mongo:3.4.0
Selector:		app=errbit,component=mongodb
Labels:			app=errbit,component=mongodb
Replicas:		1 current / 1 desired
Annotations:		kubectl.kubernetes.io/last-applied-configuration={"kind":"StatefulSet","apiVersion":"apps/v1beta1","metadata":{"name":"mongodb","namespace":"errbit","creationTimestamp":null,"labels":{"app":"errbit","component":"mongodb"}},"spec":{"replicas":1,"template":{"metadata":{"creationTimestamp":null,"labels":{"app":"errbit","component":"mongodb"}},"spec":{"containers":[{"name":"mongodb","image":"mongo:3.4.0","args":["--auth"],"ports":[{"name":"mongodb","containerPort":27017}],"resources":{},"volumeMounts":[{"name":"mongodb","mountPath":"/data/db"}]}]}},"volumeClaimTemplates":[{"metadata":{"name":"mongodb","creationTimestamp":null},"spec":{"accessModes":["ReadWriteOnce"],"resources":{"requests":{"storage":"1Gi"}}},"status":{}}],"serviceName":"mongodb"},"status":{"replicas":0}}
CreationTimestamp:	Thu, 22 Dec 2016 16:53:24 -0800
Pods Status:		0 Running / 1 Waiting / 0 Succeeded / 0 Failed
No volumes.
Events:
  FirstSeen	LastSeen	Count	From		SubObjectPath	Type		Reason			Message
  ---------	--------	-----	----		-------------	--------	------			-------
  12m		12m		1	{statefulset }			Normal		SuccessfulCreate	pet: mongodb-0
  11m		11m		1	{statefulset }			Normal		SuccessfulCreate	pvc: mongodb-mongodb-0

The stateful set's pod, pending due to the volume being in the wrong zone:

$ kubectl describe pod mongodb-0 -n errbit
Name:		mongodb-0
Namespace:	errbit
Node:		/
Labels:		app=errbit
		component=mongodb
Status:		Pending
IP:
Controllers:	StatefulSet/mongodb
Containers:
  mongodb:
    Image:	mongo:3.4.0
    Port:	27017/TCP
    Args:
      --auth
    Volume Mounts:
      /data/db from mongodb (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-mdqdk (ro)
    Environment Variables:	<none>
Conditions:
  Type		Status
  PodScheduled 	False
Volumes:
  mongodb:
    Type:	PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:	mongodb-mongodb-0
    ReadOnly:	false
  default-token-mdqdk:
    Type:	Secret (a volume populated by a Secret)
    SecretName:	default-token-mdqdk
QoS Class:	BestEffort
Tolerations:	<none>
Events:
  FirstSeen	LastSeen	Count	From			SubObjectPath	Type		Reason			Message
  ---------	--------	-----	----			-------------	--------	------			-------
  15m		15m		3	{default-scheduler }			Warning		FailedScheduling	[SchedulerPredicates failed due to PersistentVolume 'pvc-a5f5e714-c8a8-11e6-9924-069508572ed2' not found, which is unexpected., SchedulerPredicates failed due to PersistentVolume 'pvc-a5f5e714-c8a8-11e6-9924-069508572ed2' not found, which is unexpected.]
  14m		14m		1	{default-scheduler }			Warning		FailedScheduling	[SchedulerPredicates failed due to PersistentVolumeClaim is not bound: "mongodb-mongodb-0", which is unexpected., SchedulerPredicates failed due to PersistentVolumeClaim is not bound: "mongodb-mongodb-0", which is unexpected.]
  14m		2s		52	{default-scheduler }			Warning		FailedScheduling	pod (mongodb-0) failed to fit in any node
fit failure summary on nodes : NoVolumeZoneConflict (2)

The nodes, all in us-east-1e:

$ kubectl describe nodes | grep zone
			failure-domain.beta.kubernetes.io/zone=us-east-1e
			failure-domain.beta.kubernetes.io/zone=us-east-1e
			failure-domain.beta.kubernetes.io/zone=us-east-1e
			failure-domain.beta.kubernetes.io/zone=us-east-1e

What you expected to happen:

Dynamic volume provisioning should have created the required volume in the us-east-1e availability zone.

How to reproduce it (as minimally and precisely as possible):

Add the following storage class to the cluster:

---
kind: "StorageClass"
apiVersion: "storage.k8s.io/v1beta1"
metadata:
  name: "standard"
  annotations:
    storageclass.beta.kubernetes.io/is-default-class: "true"
provisioner: "kubernetes.io/aws-ebs"
parameters:
  type: "gp2"
  encrypted: "true"

Create the following stateful set and service:

---
kind: "Namespace"
apiVersion: "v1"
metadata:
  name: "errbit"
---
kind: "Service"
apiVersion: "v1"
metadata:
  name: "mongodb"
  namespace: "errbit"
  labels:
    app: "errbit"
    component: "mongodb"
spec:
  ports:
    - name: "mongodb"
      port: 27017
  clusterIP: "None"
  selector:
    app: "errbit"
    component: "mongodb"
---
kind: "StatefulSet"
apiVersion: "apps/v1beta1"
metadata:
  name: "mongodb"
  namespace: "errbit"
  labels:
    app: "errbit"
    component: "mongodb"
spec:
  serviceName: "mongodb"
  replicas: 1
  template:
    metadata:
      labels:
        app: "errbit"
        component: "mongodb"
    spec:
      containers:
        - name: "mongodb"
          image: "mongo:3.4.0"
          args:
            - "--auth"
          ports:
            - containerPort: 27017
              name: "mongodb"
          volumeMounts:
            - name: "mongodb"
              mountPath: "/data/db"
  volumeClaimTemplates:
    - metadata:
        name: "mongodb"
      spec:
        accessModes:
          - "ReadWriteOnce"
        resources:
          requests:
            storage: "1Gi"
@zachaller
Contributor

I am seeing this happen as well. We have nodes in every zone, but it still can't get the pod and the volume together in the same AZ. The cluster was set up with kubeadm, with modifications to make it work with AWS.

@justinsb
Member

Never heard of kaws, but kubeadm is still in alpha and cloud provider integration is not supported (i.e. expected to be broken).

You should probably be using kops to install on AWS.

@jimmycuadra
Contributor Author

kaws is our own installation system we've been using from the start. I don't think this bug has anything to do with the cluster creation tool. The correct cloud provider flags are passed to each Kubernetes component and other AWS-specific cloud provider functionality works.

@jimmycuadra
Contributor Author

If someone can point me to the area of the codebase where the decision about where to provision a dynamic volume in AWS is made, I can try to figure this out myself!

@jimmycuadra
Contributor Author

I think I see where the issue is. The docs for EBS provisioning say:

zone: AWS zone. If not specified, a random zone from those where Kubernetes cluster has a node is chosen.

However, I have not found any logic that chooses the zone that way.

aws.Cloud.CreateDisk calls its own aws.Cloud.getAllZones method to populate the list of zones to choose from when the storage class/PVC doesn't request a specific zone. But getAllZones collects the zones of all EC2 instances, without filtering to the Kubernetes cluster in any way. That list of zones is passed to volume.util.ChooseZoneForVolume, which only attempts to distribute PVs across the provided zones. As a result, if you have any EC2 instances running in a zone other than where your Kubernetes nodes are running, Kubernetes may pick the wrong zone.

@Dmitry1987
Contributor

+1, this happened to me too while trying to set up MongoDB from a Helm chart. I have a few K8s test nodes, all in the same zone, but the volumes it provisioned are in another zone, so creation of the pods got stuck on
fit failure summary on nodes : NoVolumeZoneConflict (2), PodToleratesNodeTaints (1)

Is the only way to overcome this for now to have minions in all zones, so one of them can accept your dynamic volume along with the pod? It's still a problem on AWS... if I store a few TB of data on a volume and trust it to be migrated during failover to another node in the cluster (where the failed pod will be re-created), I'll be surprised to see it stuck, because K8s will try to launch the pod on any other node with no regard for its AZ.

But this issue is related more to the cloud provider than to K8s itself... maybe on GCE it won't happen.

@Dmitry1987
Contributor

btw, I also tried to set a default storage class, in the same availability zone where I have the nodes, with these settings:

kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
  name: default
  annotations:
    storageclass.beta.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  zone: us-west-2a

But the 3 volumes got created in all 3 AZs (us-west-2a, 2b, and 2c). Weird... I have only this storage class, so why would it create the volumes in 3 AZs even when explicitly told to use 'zone: us-west-2a' in the storage class?

@jsafrane
Member

@jimmycuadra, you correctly found getAllZones, but you missed the part where it filters out all instances that are not tagged with the "KubernetesCluster" tag with a specific value; it's well hidden :-).

So, tag all your AWS instances that are part of your cluster with "KubernetesCluster=jimmy" (incl. masters!) and restart Kubernetes. It should then create volumes only in zones where there is an instance with that tag. You can run multiple clusters under one AWS account, as long as they use different values for the KubernetesCluster tag.
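For anyone on a hand-rolled cluster, a minimal sketch of that tagging step with the AWS CLI (the instance IDs and the cluster value "jimmy" are placeholders):

# Tag every instance that belongs to the cluster, masters included.
aws ec2 create-tags \
  --resources i-0aaaaaaaaaaaaaaaa i-0bbbbbbbbbbbbbbbb \
  --tags Key=KubernetesCluster,Value=jimmy

# Sanity check: list the AZs of the tagged instances.
aws ec2 describe-instances \
  --filters "Name=tag:KubernetesCluster,Values=jimmy" \
  --query "Reservations[].Instances[].Placement.AvailabilityZone"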

@justinsb, btw, is it documented anywhere?

@jsafrane
Member

@Dmitry1987, that's odd; your PVs should respect the parameters of the storage class. Show me your PVC. The storage class parameters are ignored only if you use the annotation volume.alpha.kubernetes.io/storage-class (note the alpha there, it's important).
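For illustration, a minimal PVC sketch showing the annotation in question (the names are made up; per the comment above, the alpha form bypasses StorageClass parameters, while the beta form respects them on 1.5):

kind: "PersistentVolumeClaim"
apiVersion: "v1"
metadata:
  name: "mongodb-data"
  annotations:
    # Use the beta annotation so the class parameters (type, zone, ...) are honored.
    # volume.alpha.kubernetes.io/storage-class would bypass them.
    volume.beta.kubernetes.io/storage-class: "standard"
spec:
  accessModes:
    - "ReadWriteOnce"
  resources:
    requests:
      storage: "1Gi"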

@Dmitry1987
Contributor

Hi @jsafrane, yes, it was the alpha annotation in the MongoDB Helm chart that I used; I only noticed it after some time. Thanks!

@Dmitry1987
Contributor

It worked well when I forked the chart and changed the annotation to "beta", then used it in Helm as a local file rather than from the repo link. I have this all working now :).

@jsafrane
Member

@Dmitry1987, thanks for the confirmation.

@jimmycuadra, did you try KubernetesCluster tag on your AWS instances? Can we close this issue?

@jimmycuadra
Contributor Author

@jsafrane No, I have not confirmed that that works yet. Either way, this should be left open until the need for a KubernetesCluster tag is documented, as it is apparently critical for custom clusters. This was supposed to be tracked in #11884, but that issue was closed without my questions being answered, and no one has responded since.

jimmycuadra added a commit to InQuicker/kaws that referenced this issue Jan 11, 2017
A tag with this specific key is expected by Kubernetes cloud provider logic and
used to determine which cluster a given AWS resource belongs to, even if the
clusters are isolated by VPC.

See kubernetes/kubernetes#39178 and kubernetes/kubernetes#11884.
@jimmycuadra
Contributor Author

I'm afraid I can't confirm that adding the KubernetesCluster tag with a unique value per cluster results in the behavior I'd expect. I've tagged our clusters accordingly, restarted the Kubernetes components (apiservers, controller managers, and schedulers), and created a new stateful set with the same configuration from the issue description, but Kubernetes still creates the PV in us-east-1a despite all nodes being in us-east-1e.

@jimmycuadra
Contributor Author

Would it be possible for the cloud provider to just use the Kubernetes labels on nodes to determine a zone, rather than using AWS API calls to try to determine which nodes should be used? It would need to look for any schedulable nodes (i.e. not --register-schedulable=false) and look at the failure-domain.beta.kubernetes.io/region and failure-domain.beta.kubernetes.io/zone labels.
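For reference, the zone information the cloud provider sets is already queryable from the API server, e.g.:

$ kubectl get nodes -L failure-domain.beta.kubernetes.io/zone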

@jsafrane
Member

@justinsb, this idea of using Kubernetes nodes + labels instead of loading instances from AWS looks interesting to me. Could it fit into your attempt to add a caching layer to the AWS provider?

@jimmycuadra
Contributor Author

Any updates from the storage and/or AWS teams on this? We're currently unable to use dynamic volume provisioning because of this problem.

@kingdonb

kingdonb commented Mar 1, 2017

@jimmycuadra Are you adding the tag to the nodes of an already-running cluster?

I'm not sure that will be effective. You might have to add the tag before your kube-system namespace is started for the first time. I'm not sure if you're using kubeadm or something else to start your cluster, but for me, I definitely hit this issue immediately on the first try for my AWS project team, which has instances in multiple AZs.

But if I add the KubernetesCluster tag before bringing the cluster up with kubeadm --cloud-provider=aws, I get:

I0301 13:41:22.736527       1 aws.go:836] AWS cloud filtering on tags: map[KubernetesCluster:kube-kingdon]

in my logs of kube-controller-manager-ip-172-29-151-151.ec2.internal pod. Now my PVCs seem to be creating PVs dynamically without issue, in the right AZ every time.

@kingdonb

kingdonb commented Mar 1, 2017

Something else changes after I bring up my nodes; it might be DNS propagation delay.

When my EC2 host comes up and I've configured it using kubeadm, it's a node named ip-172-29-151-151. I may kubeadm reset and kubeadm init --cloud-provider=aws again, and after some time elapses it names the node ip-172-29-151-151.ec2.internal instead. (It appears that the right thing happens if I just ensure that my hostname is set to the FQDN before bringing up the kube node.)
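One way to do that on EC2 before running kubeadm (a sketch assuming the standard instance metadata endpoint; adjust for your setup):

# Set the hostname to the private FQDN the AWS cloud provider expects,
# before kubeadm init/join registers the node.
sudo hostnamectl set-hostname "$(curl -s http://169.254.169.254/latest/meta-data/local-hostname)"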

This cluster is private-only and exists inside of a VPC, that's why I'm using private addresses...

This DNS name does not seem to resolve at any time before or after I notice this change. I haven't been able to narrow down why this happens, or what else is changing to make the node change its mind about what the actual hostname is, but after it happens I've noticed surprises in some of the pod logs to the effect of "couldn't find the node you're talking about", with references to the new/old name. That might be resulting in permission errors for the backend processes that are supposed to be creating and destroying ELBs, PVs, and SGs in my AWS team... since they look up the FQDN version of the node's hostname and can't find a node with that name.

All of this confusion has basically convinced me that I don't want to use an alpha version of kubeadm for any serious production deployment, or anything else custom, and I'll be using kops or kube-aws to build my permanent AWS deployment instead (as the official documentation recommends).

@jimmycuadra
Contributor Author

We use an in-house cluster deploy tool called kaws. I made sure to restart the Kubernetes system components (e.g. the controller manager) after applying the EC2 tags, so they should be visible to the cloud provider logic. Our nodes have always been named with the EC2 internal DNS. No problems with the cloud provider logic there; the issue is only with dynamic volume placement.

@jfoy
Contributor

jfoy commented Apr 7, 2017

Under k8s 1.5.2, we have found that tagging EC2 instances with KubernetesCluster=foo works for us, with one wrinkle -- we also have to ensure that we are exclusively tagging the Kubernetes controller and worker instances, instead of all instances under the VPC. (All instances are tagged by default when you set the tag via a CloudFormation option.)

With instance tags correctly set, we see the expected behavior: PVCs using the aws-ebs provisioner are bound to PVs created in AZs containing cluster nodes.
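For illustration, a minimal CloudFormation sketch of tagging only the instance resources rather than the whole stack (the resource name, AMI, and cluster value "foo" are placeholders):

Resources:
  WorkerInstance:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: ami-0123456789abcdef0   # placeholder AMI
      InstanceType: m4.large
      Tags:
        # Tag only controller/worker instances, not every resource in the VPC.
        - Key: KubernetesCluster
          Value: foo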

@k8s-github-robot added the needs-sig label May 31, 2017
@PaulFurtado

It looks like PVC logic will also create the volume in the wrong availability zone if you specify a nodeSelector on the pod to attach it to.

There must be a better way to select the availability zone than querying the AWS API for the KubernetesCluster tag, so that Kubernetes pod placement logic is actually considered in the process.

@jimmycuadra
Contributor Author

I think using the information the API server has about the nodes is a more resilient approach. See my previous comment: #39178 (comment)

@donjohnson

I am seeing the same underlying behavior (full unfiltered list of zones being passed to ChooseZoneForVolume, volume placed in non-k8s zone) with GKE on Google Cloud.

Can we treat this issue as platform-agnostic, or should I create a separate issue?

@spiffxp
Member

spiffxp commented Jun 16, 2017

/sig aws

@msau42
Member

msau42 commented Mar 27, 2018

@gnufied @jsafrane has the first issue been fixed for aws?

@StephanX

@msau42 In my experience, it isn't simply that volumes are provisioned in availability zones where nodes do not exist, but also that pods are scheduled independently from PVs (on statefulset creation), then PVs are created without regard for scheduled pod locations. Later, if a pod is reaped, there's no guarantee that it will be rescheduled in an AZ that matches the existing PV.

My best-effort workaround has been to create custom storage classes that pin the statefulset to a zone, which negates the benefits of a cluster that spans multiple AZs.
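For illustration, a sketch of that workaround (the class name and zone are placeholders; the zone parameter of the aws-ebs provisioner pins every volume from the class to one AZ):

kind: "StorageClass"
apiVersion: "storage.k8s.io/v1"
metadata:
  name: "gp2-us-east-1a"
provisioner: "kubernetes.io/aws-ebs"
parameters:
  type: "gp2"
  zone: "us-east-1a"

The statefulset's volumeClaimTemplates would then set storageClassName: gp2-us-east-1a, which ties the pods (via their volumes) to that single zone.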

@msau42
Member

msau42 commented Mar 27, 2018

@StephanX agree, but because the solutions for the two are completely different, I want to split them out into separate issues and track them separately.

@ankurgadgilwar

ankurgadgilwar commented Apr 18, 2018

Hello all,
I am pretty new to Kubernetes. What I am trying to do is get a volumeClaimTemplate to attach to a storage class that I have already defined.
The question I have is: would the claim bind itself to the StorageClass mentioned in the YAML file if I haven't defined the volumes part before the volumeClaimTemplates part?

volumeMounts:
  - name: zk-datadir
    mountPath: /var/lib/zookeeper
volumeClaimTemplates:
  - metadata:
      name: zk-datadir
      annotations:
        volume.alpha.kubernetes.io/storage-class: gold
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi

Will this work if I haven't defined the volume?

@ghost

ghost commented May 1, 2018

I'm seeing this issue as well, but the only workaround when dealing with large clusters across all AZs is to use an EFS mount, which is less than ideal.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Jul 30, 2018
@007

007 commented Jul 30, 2018

/remove-lifecycle stale

@k8s-ci-robot removed the lifecycle/stale label Jul 30, 2018
@piraluc

piraluc commented Aug 27, 2018

Just in case somebody wants to cross-check: After removing the - from the name of my release it worked. So instead of helm install ./foo --name foo-bar --namespace foo I used helm install ./foo --name foobar --namespace foo.

@edwardsmit

Found this issue while researching a curious problem: I have a free node where my stateful set instance could be created, but Kubernetes keeps reporting that it can't schedule the instance because there are no nodes with sufficient memory available (there is one, in zone-a), while the PVC creates a volume in zone-c. The result is that there aren't any nodes available for this instance.
I found enough pointers here to understand the problem and to manually work around it for now, but I was wondering whether any progress has been made on this issue.

@msau42
Member

msau42 commented Nov 5, 2018

@edwardsmit Topology-aware volume provisioning in 1.12 should help with provisioning volumes in zones that can meet your Pod scheduling requirements.

@edwardsmit

Thank you for the response @msau42, do you know of an issue I can follow?

@msau42
Member

msau42 commented Nov 5, 2018

Feature issue is here: kubernetes/enhancements#490
1.12 blog post with examples is here: https://kubernetes.io/blog/2018/10/11/topology-aware-volume-provisioning-in-kubernetes/
Official documentation is here: https://kubernetes.io/docs/concepts/storage/storage-classes/#volume-binding-mode

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Feb 3, 2019
@lig

lig commented Feb 4, 2019

@msau42 this one looks resolved

@bartelsb

@msau42 Does the new feature handle the case where a node with an existing pod/volume goes down and the pod gets moved to a node in a different AZ? Specifically, does Kubernetes handle copying the existing volume to a new volume in the new AZ? If not, when the pod gets redeployed in a new AZ, it would no longer have access to any data from the old volume.

I reviewed both the blog post and the official documentation, but I didn't see anything that addressed this specific case. Thanks!

@msau42
Member

msau42 commented Feb 19, 2019

No, data migration is not part of this feature. The feature only handles initial provisioning of a volume. Once the volume is provisioned, it must always be scheduled to a node in the same zone.

If you need to handle zone outages, you will need to use a storage system that does cross zone replication.

@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Mar 21, 2019
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@silashansen

We have different node types for different workloads, which is accomplished by using node taints.
Some node types run in multiple zones and others don't.
Since the unique list of zones for the entire cluster makes up the list of possible zones for the PV, we ended up with PVs in a zone that didn't have any nodes our workload could be scheduled on.

Only looking at the zones of nodes where the pod can actually be scheduled, when selecting the zone for the PV, would be great!

@RobertFr0st

This appears to me to still be an ongoing issue.

@Efrat19

Efrat19 commented Nov 16, 2021

still happens on EKS 1.21

@cprak-nydig

still running into this issue on 1.21.5

@onelapahead

So the issue has been solved, as mentioned in #39178 (comment).

You need a StorageClass that uses WaitForFirstConsumer as the volumeBindingMode. Otherwise, with the Immediate mode that most default classes use, the volume provisioner will select any AZ, which might not align with where the consumer of the volume ends up getting scheduled.
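For reference, a minimal sketch of such a class (the name is a placeholder; shown with the in-tree aws-ebs provisioner, but the same field applies to CSI-based classes). WaitForFirstConsumer delays provisioning until a pod using the claim is scheduled, so the volume is created in that pod's zone:

kind: "StorageClass"
apiVersion: "storage.k8s.io/v1"
metadata:
  name: "gp2-wait-for-consumer"
provisioner: "kubernetes.io/aws-ebs"
parameters:
  type: "gp2"
  encrypted: "true"
volumeBindingMode: "WaitForFirstConsumer"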
