Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(kuma-cp): reserve VIPs from all meshes before allocating a new one #3949

Merged
merged 3 commits into from
Mar 1, 2022

Conversation

jakubdyszkiewicz
Copy link
Contributor

Summary

createOrUpdateVIPConfigs executed with single mesh only reserved existing addresses based on this one mesh instead of all meshes.
When allocating a new address, we were allocating an IP that was already issued for a service in a different mesh. I think that technically it's not a problem, frontend.mesh from mesh-1 can receive 240.0.0.1 and backend.mesh from mesh-2 can resolve to 240.0.0.1 as well and it will work because there is no cross mesh communication. I'd argue that it's better to try to keep separate VIPs because

  • It's consistent with Universal. Notice how Universal calls CreateOrUpdateVIPConfigs() and then createOrUpdateVIPConfigs(mesh) with all the meshes. They never overlap in that scenario
  • It's easier to troubleshoot
  • It get rid of potential blocker to implement direct multizone communication
  • reverse lookup might be confusing

Review commit by commit.

Issues resolved

No issues reported.

Documentation

No docs.

Testing

  • Unit tests
  • E2E tests
  • Manual testing on Universal
  • Manual testing on Kubernetes

Backwards compatibility

- [ ] Update UPGRADE.md with any steps users will need to take when upgrading.
- [ ] Add backport-to-stable label if the code follows our backporting policy

Signed-off-by: Jakub Dyszkiewicz <jakub.dyszkiewicz@gmail.com>
Signed-off-by: Jakub Dyszkiewicz <jakub.dyszkiewicz@gmail.com>
Signed-off-by: Jakub Dyszkiewicz <jakub.dyszkiewicz@gmail.com>
@jakubdyszkiewicz jakubdyszkiewicz requested a review from a team as a code owner February 28, 2022 14:22
Copy link
Contributor

@bartsmykla bartsmykla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@codecov-commenter
Copy link

Codecov Report

Merging #3949 (f720c77) into master (e1a88c3) will decrease coverage by 0.00%.
The diff coverage is 69.69%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3949      +/-   ##
==========================================
- Coverage   55.83%   55.83%   -0.01%     
==========================================
  Files         916      916              
  Lines       54526    54531       +5     
==========================================
- Hits        30447    30445       -2     
- Misses      21636    21642       +6     
- Partials     2443     2444       +1     
Impacted Files Coverage Δ
pkg/dns/vips_allocator.go 73.28% <69.69%> (-0.48%) ⬇️
pkg/plugins/runtime/gateway/route/sorter.go 61.53% <0.00%> (-5.13%) ⬇️
pkg/mads/server/server.go 82.40% <0.00%> (-2.78%) ⬇️
pkg/core/resources/manager/cache.go 83.11% <0.00%> (-2.60%) ⬇️
pkg/plugins/leader/postgres/leader_elector.go 97.87% <0.00%> (-2.13%) ⬇️
pkg/core/resources/model/rest/resource.go 69.23% <0.00%> (+1.28%) ⬆️
pkg/core/tokens/default_signing_key.go 72.22% <0.00%> (+5.55%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e1a88c3...f720c77. Read the comment docs.

@lahabana
Copy link
Contributor

I actually want to argue the other way round :)

The only reason why the global view exists is cp dns which should go away so the global view should disappear.

The reason for removing global state:

  • Scales with the number of meshes (it'd be nice to have meshes be autonomous and avoid things that scale across all Meshes).
  • The VIPs are only used on the DP so it doesn't really block cross mesh communication just the VIP may be different across meshes which I don't believe is an issue.
  • I don't think virtual-outbound would work cross mesh (you want DPs to be accessible Xmesh maybe not the hostnames).

This in-itself doesn't mean we shouldn't merge this. I only argue that the global state should be removed soon so maybe this wasn't worth it.

@jakubdyszkiewicz
Copy link
Contributor Author

Those are good points that I agree with.

However, I think it's slightly better to have consistency introduced in this PR for the current state of Kuma.

@lobkovilya
Copy link
Contributor

But if we prioritize removing CP DNS (so removing GlobalView), then I guess this PR becomes obsolete?

@jakubdyszkiewicz
Copy link
Contributor Author

if we prioritize

then question is when we do it and if we just remove DNS server of keep it still for smooth deprecation.

Refactor is still relevant

@jakubdyszkiewicz
Copy link
Contributor Author

after talking with @lahabana offline, we want to merge it as it's a minor improvement, but we probably get rid of DNS server in the end sooner or later.

@jakubdyszkiewicz jakubdyszkiewicz merged commit 5bf19c2 into master Mar 1, 2022
@jakubdyszkiewicz jakubdyszkiewicz deleted the fix/reserve-all branch March 1, 2022 15:13
SallyBlichWalkMe pushed a commit to SallyBlichWalkMe/kuma that referenced this pull request Apr 14, 2022
kumahq#3949)

Signed-off-by: Jakub Dyszkiewicz <jakub.dyszkiewicz@gmail.com>
Signed-off-by: Sally Blich <sally.blich@walkme.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants