Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Write ADR for load-balancers #227

Open
9 tasks
garloff opened this issue Dec 5, 2022 · 6 comments · Fixed by SovereignCloudStack/standards#169
Open
9 tasks

Write ADR for load-balancers #227

garloff opened this issue Dec 5, 2022 · 6 comments · Fixed by SovereignCloudStack/standards#169
Assignees
Labels
Container Issues or pull requests relevant for Team 2: Container Infra and Tooling SCS is standardized SCS is standardized standards Issues / ADR / pull requests relevant for standardization & certification

Comments

@garloff
Copy link
Contributor

garloff commented Dec 5, 2022

As a KaaS operator, I want to understand exactly what standards my load-balancer must fulfill by reading the relevant ADR.

TODO: Write ADR (according to SCS-0001 standard) to document the requirements tested for in #212.

Definition of Ready:

  • User Story is small enough to be finished within one sprint
  • User Story is clear and understood by the whole team
  • Acceptance criteria are defined
  • Acceptance criteria are clear and understood by the whole team

Definition of Done:

  • All acceptance criteria are met
  • Changes have been reviewed
  • CI tests have run successfully
  • Documentation has been updated
  • Release Notes have been updated
@garloff garloff added Container Issues or pull requests relevant for Team 2: Container Infra and Tooling SCS is standardized SCS is standardized standards Issues / ADR / pull requests relevant for standardization & certification labels Dec 5, 2022
@garloff garloff assigned garloff and joshmue and unassigned garloff Dec 5, 2022
@garloff
Copy link
Contributor Author

garloff commented Dec 7, 2022

nginx-ingress expects that the single nginx instance on a multi-node cluster works despite externalTrafficPolicy: Local (which means kube-proxy is NOT forwarding connections) on a multi-node cluster. Apparently, nginx-ingress expects the LoadBalancer to detect which nodes are up and load-balance only to the ones where nginx is listening.
This is the most efficient way (by avoiding unneeded forwards), but one could question whether loadbalancers should always detect nodes being down. With the default setting of the OpenStack Cloud Controller Manager, loadbalancers do NOT fulfill this: They don't automatically get a health monitor. This is not unreasonable either -- health monitors can always be requested, though unfortunately, this is not standardized but can only be achieved by an OpenStack specific annotation. Which makes things non-portable :-(
So we will have to default to health-monitors be enabled ...

@garloff
Copy link
Contributor Author

garloff commented Dec 7, 2022

TODO:

  • Investigate why the default to NOT have health monitors was chosen.
  • Investigate whether opt-out (by specific annotation) is still possible
  • Compare to other CCMs and their defaults

@garloff
Copy link
Contributor Author

garloff commented Dec 19, 2022

@garloff and @joshmue to look at OCCM in first week of Jan.

@garloff
Copy link
Contributor Author

garloff commented Jan 16, 2023

Needs more discussion:

  • Making traffic flow work (with the help of the health monitor) may result in something half working for the user:
    • externalTrafficPolicy: Local results in working traffic
    • BUT the client IPs are not preserved!

So: Investigate whether we can address the client IP thing:

  • proxy protocol (working for http/https like protocols at least)
  • generic approach using L3 loadbalancer (DSR like), maybe possible with ovn magic ...

@jschoone
Copy link

Hi @joshmue. Are we still on track with this?

@joshmue
Copy link

joshmue commented Apr 14, 2023

The ADR in SovereignCloudStack/standards#169 is pretty much complete. IIRC, the decision that is documented in this record did also match the consensus in a (team?) meeting.
Hence, it just needs to be pushed forward according to the according process.

@jschoone jschoone linked a pull request Aug 25, 2023 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Container Issues or pull requests relevant for Team 2: Container Infra and Tooling SCS is standardized SCS is standardized standards Issues / ADR / pull requests relevant for standardization & certification
Projects
Status: Blocked / On hold
Development

Successfully merging a pull request may close this issue.

3 participants