AWS cluster fails to create - ebs-csi-controller stays pending #15335
Comments
Can you describe the pod to see why it's pending? It may be that your cluster doesn't have enough capacity.
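For anyone following along, here is a minimal sketch of that check, assuming `kubectl` is already pointed at the new cluster. The function name `describe_pending_pod` is made up for illustration; the namespace and the pod name in the usage line come from the validation output in the original report.

```shell
# Hypothetical helper: show the details and events of a pod in kube-system.
# $1 is the pod name as reported by cluster validation.
describe_pending_pod() {
  kubectl --namespace kube-system describe pod "$1"
}
```

Call it as `describe_pending_pod ebs-csi-controller-6c85d9666b-6bbk7` and check the Events section at the bottom of the output for a `FailedScheduling` reason.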
I'm seeing the same error as @siddharth-sable and @daniejstriata. I've run the install process 3x in different regions and accounts, following the same steps as above. My versions are a bit different though.
kops version: 1.27.0
I have the very same issue. After all these months, why is this still a problem? How do we overcome it? Launched in us-west-1b.
NODE STATUS
VALIDATION ERRORS
Validation Failed
Client Version: v1.28.2
Describing the pod showed an untolerated taint, just like the error message above. This seems to be a kops bug. How can it be overcome?
Type Reason Age From Message
Warning FailedScheduling 10m default-scheduler 0/1 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling..
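The `0/1 nodes are available` part of that event suggests the scheduler only sees one node, the tainted control-plane node, so it may be worth checking whether the worker node registered at all. A rough sketch, assuming `kubectl` access to the cluster; `list_node_taints` is a hypothetical helper name:

```shell
# Hypothetical helper: list every registered node with the keys of its taints.
# The column expression is quoted so the shell does not expand the [*].
list_node_taints() {
  kubectl get nodes -o 'custom-columns=NAME:.metadata.name,TAINTS:.spec.taints[*].key'
}
```

If only the control-plane node shows up, the pending pod has nowhere to schedule, which would match the event above.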
Used kops to stand the cluster up in a different region, us-east-2a. Very same error. This seems to be a pervasive kops issue, not an AWS region status issue.
NODE STATUS
VALIDATION ERRORS
Validation Failed
OK, I think I found a workaround. You must specify more than one availability zone when you stand up the cluster. When I specified three instead of one, it worked!
INSTANCE GROUPS
NODE STATUS
Your cluster myfirstcluster.k8s.local is ready
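As a sketch of that workaround, the create command from the original report could be pointed at several zones through a comma-separated `--zones` list. The helper name `create_multi_zone_cluster` and the zone names in the usage line are illustrative assumptions, and the `--discovery-store` and `--ssh-public-key` flags from the original command are left out for brevity:

```shell
# Hypothetical helper: create a kops cluster on AWS across several zones.
# $1 is the cluster name, $2 is a comma-separated list of availability zones.
create_multi_zone_cluster() {
  kops create cluster \
    --name="$1" \
    --cloud=aws \
    --zones="$2" \
    --yes
}
```

For example: `create_multi_zone_cluster "${NAME}" "us-east-2a,us-east-2b,us-east-2c"`. Note that `--yes` applies the changes immediately, so drop it to preview first.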
Take a look at #15852
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues. Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/lifecycle rotten
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
/kind bug
1. What `kops` version are you running? The command `kops version` will display this information.
1.25.4
1.26.2
2. What Kubernetes version are you running? `kubectl version` will print the version if a cluster is running or provide the Kubernetes version specified as a `kops` flag.
Client Version: v1.26.4
Kustomize Version: v4.5.7
Server Version: v1.25.8
3. What cloud provider are you using?
aws
4. What commands did you run? What is the simplest way to reproduce this issue?
kops-1.25.4 create cluster --name=${NAME} --cloud=aws --zones=us-east-2a --discovery-store=s3://k8s-oidc-store --ssh-public-key ~/.ssh/srv.k8s.pub --yes
5. What happened after the commands executed?
The cluster started with a master and a node, but creation never completed because validation does not get past:
Pod kube-system/ebs-csi-controller-6c85d9666b-6bbk7 system-cluster-critical pod "ebs-csi-controller-6c85d9666b-6bbk7" is pending
6. What did you expect to happen?
Creation of cluster
7. Please provide your cluster manifest. Execute `kops get --name my.example.com -o yaml` to display your cluster manifest. You may want to remove your cluster name and other sensitive information.
```
W0418 16:58:49.238805 13028 get.go:78] `kops get [CLUSTER]` is deprecated: use `kops get all [CLUSTER]`
apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
metadata:
  creationTimestamp: "2023-04-18T20:47:23Z"
  name: k8s.com
spec:
  api:
    dns: {}
  authorization:
    rbac: {}
  channel: stable
  cloudProvider: aws
  configBase: s3://kubert-store/k8s.com
  etcdClusters:
  - etcdMembers:
    - instanceGroup: master-us-east-2a
      name: a
    memoryRequest: 100Mi
    name: main
  - etcdMembers:
    - instanceGroup: master-us-east-2a
      name: a
    memoryRequest: 100Mi
    name: events
  iam:
    allowContainerRegistry: true
    legacy: false
    useServiceAccountExternalPermissions: true
  kubelet:
    anonymousAuth: false
  kubernetesApiAccess:
  kubernetesVersion: 1.25.8
  masterPublicName: api.k8s.com
  networkCIDR: 172.20.0.0/16
  networking:
    kubenet: {}
  nonMasqueradeCIDR: 100.64.0.0/10
  serviceAccountIssuerDiscovery:
    discoveryStore: s3://kubert-oidc-store/k8s.com
    enableAWSOIDCProvider: true
  sshAccess:
  subnets:
  - name: us-east-2a
    type: Public
    zone: us-east-2a
  topology:
    dns:
      type: Public
    masters: public
    nodes: public
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2023-04-18T20:47:23Z"
  labels:
    kops.k8s.io/cluster: k8s.com
  name: master-us-east-2a
spec:
  image: 099720109477/ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20230302
  instanceMetadata:
    httpPutResponseHopLimit: 3
    httpTokens: required
  machineType: t3.medium
  maxSize: 1
  minSize: 1
  role: Master
  subnets:
---
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2023-04-18T20:47:23Z"
  labels:
    kops.k8s.io/cluster: k8s.com
  name: nodes-us-east-2a
spec:
  image: 099720109477/ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20230302
  instanceMetadata:
    httpPutResponseHopLimit: 1
    httpTokens: required
  machineType: t3.medium
  maxSize: 1
  minSize: 1
  role: Node
  subnets:
```