fix cleaning up workloadentry with same ip and network #43951
Conversation
🤔 🐛 You appear to be fixing a bug in Go code, yet your PR doesn't include updates to any test files. Did you forget to add a test? Courtesy of your friendly test nag.
Force-pushed from bb4a4a4 to 7946e7d
Release note looks good
/test integ-security
Seems like maybe if we have:
wg1-1.2.3.4
and we get a connection for
wg2-1.2.3.4
we should just remove wg1-1.2.3.4 immediately? It's not valid to have overlapping IPs in the same network, and violating that may lead to strange behavior.
Should we remove the existing one, assuming the new one is replacing it and we're just waiting for the old one to go away? Or should we ignore the new one, so something new can't kick out an existing workload? The former sounds more realistic, I guess.
That (the old one is stale and should be removed) would be my guess, but I am not sure I fully understand the cases that lead to this. I do know we make a strong assumption that "IP is unique within a network" in a variety of places, though.
We do not have such a case. First, the #43950 experiment is not valid; we do not allow running multiple sidecars/ztunnels in a VM. For VM WorkloadEntry auto-registration, we gracefully handle reconnects.
Agreed with @hzxuzhonghu - what is the use case for running multiple docker containers, each with its own ztunnel?
The docker experiment was a way for me to test running zTunnel on "vms" or in "dedicated mode" locally. Regardless of the fact that it's not a use case to have multiple things with the same IP, we shouldn't have invalid WorkloadEntries automatically created by istiod that never get cleaned up.
The case I ran into would only occur if a proxy with the same network/ip connects but asks for a different auto-register group. So maybe the same IP ends up getting re-used to be part of a different app or something.
Regardless of all that logic, I still think this PR makes things more robust against misconfiguration. A user installing things on VMs for the first time could make the same mistake I did. Having a manual step to clean up resources that are supposedly managed by istiod doesn't seem right. One question I have about the new check: should we consider only same-namespace WorkloadEntries as duplicates?
I do not see any bad effect from having more metadata in the key.
Yeah, I think the key should map 1:1 with the WLE. It could just be the WLE name. Even if we have some logic to prevent duplicates, there can still be races when the old instance and the new instance with the same IP are connected to different istiod instances. The most common case that could cause something like this is editing the metadata to move the VM onto a different workload group and restarting the proxy.
fixes #43950
Could also use the autoregisteredWorkloadEntryName with less validation. This keys the resource to be cleaned up as well as the input.
If there were already multiple WorkloadEntries with the same network/IP, that's its own problem. This just keeps it from getting worse with stuck resources.