
Cluster does not come up in Rancher when flatcar-linux-stable-2605.6.0 nodes are used #29841

Closed · sowmyav27 opened this issue Oct 29, 2020 · 6 comments
Labels: kind/bug-qa (Issues that have not yet hit a real release. Bugs introduced by a new feature or enhancement)

@sowmyav27 (Contributor) commented:

What kind of request is this (question/bug/enhancement/feature request): bug

Steps to reproduce (fewest steps possible):

  • Bring up a custom cluster on 2.5-head using flatcar-linux-stable-2605.6.0 nodes: 1 etcd, 1 control plane, and 3 worker nodes.
  • The etcd and worker nodes reach the Active state.
  • However, the cluster is in an error state with: Cluster health check failed: Failed to communicate with API server: Get "https://<controlplane ip>:6443/api/v1/namespaces/kube-system?timeout=45s": context deadline exceeded (a manual probe of this endpoint is sketched below)
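
For context, the failing health check is a plain HTTPS GET from the Rancher server to the downstream kube-apiserver. A minimal way to reproduce it by hand from the Rancher server node (a diagnostic sketch; the <controlplane ip> placeholder mirrors the error message):

    # Probe the same endpoint the failing health check uses; -k skips certificate
    # verification because the API server certificate is signed by the cluster CA.
    curl -k --max-time 45 "https://<controlplane ip>:6443/api/v1/namespaces/kube-system"
    # A timeout here (rather than an HTTP 401/403) points at the network path
    # or at kube-apiserver not listening, not at authentication.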

Expected Result:
Cluster should deploy successfully.

Other details that may be helpful:

Environment information

  • Rancher version (rancher/rancher or rancher/server image tag, or shown at the bottom left in the UI): 2.5-head, commit id: 0465f0
  • Installation option (single install/HA): Single node

Cluster information

  • Cluster type (Hosted/Infrastructure Provider/Custom/Imported): custom cluster
  • Kubernetes version (use kubectl version): 1.19.3
sowmyav27 added the kind/bug-qa label on Oct 29, 2020
sowmyav27 added this to the v2.5.2 milestone on Oct 29, 2020
sowmyav27 self-assigned this on Oct 29, 2020
maggieliu modified the milestones: v2.5.2 → v2.5.3 on Oct 29, 2020
sowmyav27 assigned aaronyeeski and unassigned sowmyav27 on Nov 11, 2020
@superseb (Contributor) commented:

@sowmyav27 @aaronrancher Can we get the exact AMI and instance types used for this?

@superseb (Contributor) commented:

Can we also confirm that the change from rancher/rke#1744 (comment) was applied?
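
The linked comment is not quoted in this thread, so the exact contents of that cluster.yml are not shown here. As an assumption on my part (illustrative, not the actual referenced file): a known pitfall on this Flatcar release line is systemd-networkd's catch-all configuration taking over CNI-created interfaces, and the commonly circulated mitigation looks roughly like:

    # Assumed/illustrative workaround, not the actual linked cluster.yml change:
    # stop systemd-networkd from managing interfaces created by the CNI.
    sudo tee /etc/systemd/network/50-ignore-cni.network <<'EOF'
    [Match]
    Name=cali* cni* flannel* veth*

    [Link]
    Unmanaged=true
    EOF
    sudo systemctl restart systemd-networkd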

@sowmyav27 (Contributor, Author) commented:

Following an offline discussion with @superseb: will validate this use case after applying this cluster.yml.

@sowmyav27 (Contributor, Author) commented:

On 2.5.2, when this cluster.yml was applied, the cluster still failed to come up.

  • Deployed a custom cluster with the cluster.yml above.
  • instance_type: t3a.medium/t3.xlarge; 1 etcd/control plane node and 2 worker nodes.
  • The cluster shows the error: Cluster health check failed: Failed to communicate with API server: Get "https://<controlplane-ip>:6443/api/v1/namespaces/kube-system?timeout=45s": context deadline exceeded

@sowmyav27 (Contributor, Author) commented:

  • On 2.5.2, after applying this cluster.yml and additionally running systemctl enable docker.service on the node, the node registered with the cluster successfully (see the sketch after this list).
  • Instance type: t3a.medium
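
For reference, the Docker half of this workaround just ensures the daemon is enabled and running on the Flatcar node before the registration command runs; a minimal sketch:

    # Enable Docker at boot and start it immediately on the Flatcar node.
    sudo systemctl enable --now docker.service
    # Confirm it will come up on reboot and is currently active.
    systemctl is-enabled docker.service   # expect: enabled
    systemctl is-active docker.service    # expect: active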

@sowmyav27 (Contributor, Author) commented on Nov 26, 2020:

On 2.5-head (commit id: 9ee39b9f):

Downstream clusters:

  • Verified deploying downstream clusters using the cluster.yml.
  • Validation tests passed on these clusters (automated checks).

HA setup:

  • Deployed an HA setup using this network setting in the RKE local cluster via cluster.yml.
  • Deployed a downstream cluster using the same config; validation tests passed on this cluster.
