[Release-1.19] Restoration fails on etcd only nodes due to invalid state of loadbalancers #1659

Closed
galal-hussein opened this issue Aug 18, 2021 · 1 comment
Assignees
Labels
kind/bug Something isn't working

Comments

@galal-hussein
Contributor

Environmental Info:
RKE2 Version: v1.19.x

Cluster Configuration:

  • 1 etcd-only server node
  • 1 control-plane server node

Describe the bug:
Upon restoring on an existing etcd-only node, RKE2 sets up the internal load balancer with the existing configuration and tries to connect to the old supervisor servers, which may or may not exist. The restore then fails with the following error:

msg="Waiting to retrieve agent configuration; server is not ready: Get \"https://127.0.0.1:6444/v1-rke2/serving-kubelet.crt\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
@galal-hussein galal-hussein added the kind/bug Something isn't working label Aug 18, 2021
@galal-hussein galal-hussein self-assigned this Aug 18, 2021
@galal-hussein galal-hussein added this to To Triage in Development [DEPRECATED] via automation Aug 18, 2021
@galal-hussein galal-hussein moved this from To Triage to Working in Development [DEPRECATED] Aug 18, 2021
@brandond brandond added this to the v1.19.14+rke2r1 milestone Aug 18, 2021
@brandond brandond moved this from Working to To Test in Development [DEPRECATED] Aug 18, 2021
@rancher-max
Contributor

Validated on v1.19.14-rc1+rke2r1

Confirmed that the load balancer gets reset during restoration:

...
INFO[0000] Running load balancer 127.0.0.1:6444 -> [127.0.0.1:9345] 
...
INFO[0000] Running load balancer 127.0.0.1:6445 -> [127.0.0.1:6443] 
...

Then, when the service is restarted afterwards without the restoration flags, the load balancer looks for the control-plane node again:

...
Aug 24 18:42:25 ip-172-31-28-36 rke2[6247]: time="2021-08-24T18:42:25Z" level=info msg="Running load balancer 127.0.0.1:6444 -> [172.31.28.195:9345 127.0.0.1:9345]"
...
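
For anyone repeating this validation, the check can be scripted. The following is only a rough sketch, not part of RKE2: the helper names `parse_lb_line` and `only_local_backends` are hypothetical, and the regular expression assumes the "Running load balancer ... -> [...]" message keeps the shape shown in the log lines above.

```python
import re

# Matches the "Running load balancer <listen> -> [<backends>]" message
# as it appears in the log excerpts above (format assumed from those lines).
LB_LINE = re.compile(r"Running load balancer (\S+) -> \[([^\]]*)\]")


def parse_lb_line(line):
    """Return (listen_address, [backend, ...]), or None for unrelated lines."""
    match = LB_LINE.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2).split()


def only_local_backends(backends):
    """True when there is at least one backend and all of them are on 127.0.0.1."""
    return bool(backends) and all(b.startswith("127.0.0.1:") for b in backends)


# Sample lines copied from the validation output above.
during_restore = "INFO[0000] Running load balancer 127.0.0.1:6444 -> [127.0.0.1:9345] "
after_restart = (
    'level=info msg="Running load balancer 127.0.0.1:6444 -> '
    '[172.31.28.195:9345 127.0.0.1:9345]"'
)

for line in (during_restore, after_restart):
    listen, backends = parse_lb_line(line)
    print(listen, backends, "reset" if only_local_backends(backends) else "has remote servers")
```

During a restore, the expectation is that every backend is local. After a normal restart, the control-plane address should show up in the list again.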

Development [DEPRECATED] automation moved this from To Test to Done Issue / Merged PR Aug 24, 2021