
capo: Failure to add kubeapi server members to LB #265

Closed
garloff opened this issue Sep 7, 2022 · 3 comments
Labels: Container (Issues or pull requests relevant for Team 2: Container Infra and Tooling), enhancement (New feature or request), on hold (Is on hold), Sprint Montreal (2023, cwk 40+41), upstream (Implemented directly in the upstream)

Comments


garloff commented Sep 7, 2022

In my testing with Cluster API, I have seen several times that control-plane nodes that were spun up successfully by capo did not show up as backend members in the load balancer pool that sits in front of the Kubernetes API.
Manually adding them quickly helps the cluster creation succeed. (The health check then tests the members and marks them ONLINE, and the rest of the cluster roll-out completes.)
If that is not done, the control-plane machines go into a failed state and the cluster needs manual intervention to recover :-(
This is with capi-1.1.5 and capo-0.6.2.
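For reference, the quick manual workaround described above can be done with the Octavia CLI (`python-openstackclient` with the load balancer plugin). The pool ID, subnet ID, node IP, and member name below are placeholders for your environment; 6443 is the usual Kubernetes API port:

```shell
# Find the pool that fronts the Kubernetes API (names are deployment-specific)
openstack loadbalancer pool list

# Add the missing control-plane node as a backend member;
# <pool-id>, <node-ip>, and <subnet-id> are placeholders
openstack loadbalancer member create \
    --name k8s-api-member-1 \
    --address <node-ip> \
    --protocol-port 6443 \
    --subnet-id <subnet-id> \
    <pool-id>

# Verify that the health monitor marks the new member ONLINE
openstack loadbalancer member list <pool-id>
```

Note that Octavia rejects member creation while the load balancer is in a PENDING_* state, so the manual fix also has to wait until the LB is ACTIVE again.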

This needs investigation:

  • Why is the member not registered? Does capo try and fail (e.g. on a PENDING_UPDATE LB state)?
  • Why is this not reconciled? A missing member can easily be added to an otherwise healthy LB; there are no difficult considerations here ...
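The reconciliation the second bullet asks for might look like the sketch below. The `FakeOctavia` stub and all names here are hypothetical, standing in for the real Octavia API, purely to illustrate the idea: wait out PENDING_UPDATE instead of failing once, then re-add any member that is missing from the pool.

```python
import time


class FakeOctavia:
    """Hypothetical stand-in for the Octavia API. The LB starts in
    PENDING_UPDATE and becomes ACTIVE after a few status polls."""

    def __init__(self, pending_polls=3):
        self._pending = pending_polls
        self.members = set()

    def lb_status(self):
        # Each poll moves the fake LB one step closer to ACTIVE.
        if self._pending > 0:
            self._pending -= 1
            return "PENDING_UPDATE"
        return "ACTIVE"

    def add_member(self, address):
        # Octavia rejects mutations while the LB is immutable.
        if self._pending > 0:
            raise RuntimeError("409 Conflict: LB immutable (PENDING_UPDATE)")
        self.members.add(address)


def reconcile_members(api, desired, retries=10, delay=0.0):
    """Ensure every desired control-plane address is a pool member.

    Waits for the LB to leave PENDING_UPDATE rather than giving up on
    the first failed attempt, then adds whatever members are missing.
    """
    for _ in range(retries):
        if api.lb_status() == "ACTIVE":
            break
        time.sleep(delay)
    else:
        raise TimeoutError("LB never became ACTIVE")
    for address in desired - api.members:
        api.add_member(address)
    return api.members


api = FakeOctavia(pending_polls=3)
members = reconcile_members(api, {"10.0.0.11", "10.0.0.12"})
print(sorted(members))  # ['10.0.0.11', '10.0.0.12']
```

The point of the loop is exactly the second bullet: a pool member that failed to register once can simply be added on the next pass, so a transient PENDING_UPDATE should never leave the machine permanently failed.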

garloff commented Sep 7, 2022

Adding @batistein, @janiskemper, @curx, @flyersa just in case they have seen this happening before and want to add color.

One additional side note: this seems to be a new phenomenon -- I have not observed this behavior with capi-1.0.5 + capo-0.5.3, which we had in R2.


garloff commented Sep 12, 2022

Well, the test environment currently exhibits this "LB and VM creation sometimes takes 5 additional minutes" behavior, and capi/capo do not seem to be very robust against it.

batistein commented

This is unfortunately only due to capo... we should definitely put resources into fixing the known issues in capo.

@garloff added the enhancement (New feature or request), upstream (Implemented directly in the upstream), and Container (Issues or pull requests relevant for Team 2: Container Infra and Tooling) labels Sep 18, 2022
@batistein removed their assignment Jan 18, 2023
@jschoone added the on hold (Is on hold) label Oct 10, 2023
@jschoone reopened this Oct 11, 2023
@jschoone closed this as not planned Oct 11, 2023
@jschoone added the Sprint Montreal (2023, cwk 40+41) label Feb 28, 2024

6 participants