# Custom ipset sets and entries get reverted periodically #1677
kube-router only touches its own ipsets. You can see this from the delete logic here: https://github.com/cloudnativelabs/kube-router/blob/master/pkg/controllers/netpol/network_policy_controller.go#L709-L714

The only way I could see this happening is if you got incredibly lucky and somehow consistently managed to add your ipset in the microseconds between when kube-router saves the ipsets and then restores them (see the logic above). But for this to happen consistently seems very, very unlikely, unless something else is also going wrong on the node and causing a severe performance regression. Additionally, I tested this on my own cluster just now by creating an ipset, forcing a network policy sync, and seeing that the ipset survived. Maybe this is something that k3s is doing?

If you want to be super sure, you can increase the logging verbosity to level 3 and see the entire output of the ipset restore that kube-router is doing (https://github.com/cloudnativelabs/kube-router/blob/master/pkg/utils/ipset.go#L595).
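A minimal way to check this on your own node is a probe like the following sketch (the set name `probe-test` and the watch interval are arbitrary choices):

```sh
# Create a set that kube-router does not own and seed it with one entry.
ipset create probe-test hash:net
ipset add probe-test 1.2.3.4
# Re-list the set every few seconds; if the entries later disappear or
# revert, something on the node is rewriting sets it does not own.
watch -n 5 ipset list probe-test
```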
I'd like to increase the log verbosity level and inspect the full restored ipset save string, but I didn't find a place to inject the verbosity flag.
To be honest, I'm not really familiar with k3s's options, as I don't use it myself.
I just dialed up the verbosity of the k3s agent and found that it works too. So here's the captured ipset restore string:
I see it uses a temporary ipset.
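That temporary set is presumably the standard trick for replacing a set's contents atomically. A generic sketch of the pattern with plain ipset commands (the set names here are made up, not taken from the captured output):

```sh
# Build the new contents in a scratch set, then atomically exchange it
# with the live set (both sets must exist and be of the same type).
ipset create TMP-SET hash:net
ipset add TMP-SET 10.0.0.0/8
ipset swap TMP-SET LIVE-SET
ipset destroy TMP-SET
```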
The k3s code for netpol looks suspicious to me in its `Run` function. Can you confirm whether the call to `set.Save()` prior to controller initialization may be causing this behavior?
I'm glad that you were able to figure out how to increase the logging verbosity for kube-router within k3s; that is definitely helpful. So yes, I can confirm that it is expected that you would see your ipset in the output of kube-router's ipset restore function. This is because it reads all ipsets during the save, and then only modifies its own. What is unexpected is that it would revert a set to a previous value. Tracing through how kube-router encounters and mutates ipsets in the network policy controller, I can't see any obvious place where it would be reverting user ipset information.

One thing I did think of while reviewing the code: if the https://github.com/cloudnativelabs/kube-router/blob/master/pkg/controllers/netpol/network_policy_controller.go#L229-L243 code block was excluded from your build of k3s, it might cause issues for you, because saved data would be added into the ipset object during one run and then possibly restored in the next at https://github.com/cloudnativelabs/kube-router/blob/master/pkg/controllers/netpol/policy.go#L149-L151. I can see that upstream they have this block in their code because they link against kube-router v2.1.2, which has this fix in it. But maybe the version of k3s that you're running uses an older version of kube-router?
Thank you for the detailed explanation. I still think the initial call to `Save()` is the suspect here. Another confusing thing shows up in the log; let me include the full log, which contains hints about the calling lines in the source files:
I'm inclined to believe the k3s version I run doesn't actually link against kube-router v2.1: if you look at my log, it shows "ipset restore looks like: ...", which is consistent with the log format in v2.0, whereas in v2.1 the format was updated to contain "ipv6?". Would v2.0 explain the behavior we're seeing here?
I haven't gone in-depth on what changed between versions for this one, but I believe that v2.0 could explain it. Maybe try building k3s from HEAD and see if the problem goes away?
If my understanding is correct, the problem won't go away even if I rebuild k3s from HEAD, because of the initial call to `Save()`. To stop the reversion, either k3s could drop that initial `Save()` call, or kube-router could restrict what it saves to the sets it creates itself.

Either option would be sufficient to solve the problem I encountered. I'm personally inclined toward the second option, because logically it seems more correct for kube-router to only save what it creates itself.
I don't believe that first save does anything. Saving the ipset only puts info in the struct, but that struct doesn't get used again. I'm nearly positive that the problem you're experiencing is due to an old version of kube-router being used.
Good. Let me try upgrading my k3s and report back to you.
I tried upgrading to the latest k3s version (v1.30.0+k3s1) and still see the same problem occurring, and the logging of the ipset restore string is still in the old format. Upon closer inspection, I found that their build still appears to pull in the older kube-router.
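One way to confirm which kube-router version a k3s build links against is to inspect its Go module graph. A sketch, assuming a k3s source checkout and a Go toolchain (the module path assumes kube-router's v2 module naming):

```sh
# From the root of a k3s source checkout: show how the kube-router
# module is pinned (including any replace directive).
grep kube-router go.mod
go list -m github.com/cloudnativelabs/kube-router/v2
```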
### What happened?
I'm running the k3s agent on a gateway node in my homelab. Recently I added a feature to my firewall script that uses ipset. However, I noticed that the ipset entries get reverted every few minutes. Further digging brought me to the ipset utility defined in kube-router.
### What did you expect to happen?
I expect kube-router to leave ipsets it did not define untouched.
### How can we reproduce the behavior you experienced?
Steps to reproduce the behavior:
1. Run `ipset create test hash:net; ipset add test 1.2.3.4`.
2. (Re)start the k3s agent so that it snapshots the current ipset state.
3. Run `ipset add test 5.6.7.8`.
4. Wait a few minutes, then run `ipset list test` and observe that the output only includes `1.2.3.4`.
### System Information
- Kube-Router Version (`kube-router --version`): probably v2.1.0
- Kubernetes Version (`kubectl version`): k3s v1.29.4+k3s1
### Logs
The fork log during the ipset reversion (captured with `execsnoop-bpfcc -T -t`):
Based on my analysis, the ipset state is saved on k3s agent startup (https://github.com/k3s-io/k3s/blob/master/pkg/agent/netpol/netpol.go#L55), and then on each ipset `Restore()`, the existing ipset gets replaced by the saved one, regardless of whether a third party has modified the entries in the meantime. This means any modifications made after k3s agent startup are lost after some time. Please correct me if I'm wrong.
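The reversion can be demonstrated with plain ipset commands, with no kube-router involved at all. A sketch of the snapshot-then-restore cycle described above (the set names are made up):

```sh
ipset create demo hash:ip
ipset add demo 1.2.3.4
ipset save demo > /tmp/snapshot   # the "Save() at startup" step
ipset add demo 5.6.7.8            # a third-party change made later
# Replay the stale snapshot through a temporary set and swap it in,
# mirroring the temporary-set restore cycle seen in the logs above:
sed 's/\bdemo\b/demo-tmp/' /tmp/snapshot | ipset restore
ipset swap demo-tmp demo
ipset destroy demo-tmp
ipset list demo                   # 5.6.7.8 is gone: entries reverted
```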
### Possible fix
My suggested solution is to add a filter in `ParseIPSetSave` so that it only handles sets created by kube-router itself.
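The same filtering idea, sketched with plain shell tools on top of `ipset save` output (the `KUBE-` prefix is an assumption, standing in for whatever naming convention kube-router's own sets follow):

```sh
# Keep only create/add lines whose set name (field 2) carries the
# assumed kube-router prefix, so third-party sets are never replayed.
ipset save | awk '$1 ~ /^(create|add)$/ && $2 ~ /^KUBE-/' > /tmp/kube-only
ipset restore -exist < /tmp/kube-only
```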
### Workaround
My current workaround for this problem is to restart the `k3s agent` after changing an ipset, so that its internally saved ipset state reflects the latest entries. Unfortunately, this is not a perfect solution: there are obvious openings for race conditions, and restarting the k3s agent just for this simple goal seems gratuitous to me.
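For completeness, the workaround as a sketch; it assumes k3s was installed with the standard systemd units (typically `k3s-agent` on agent-only nodes) and uses a made-up set name:

```sh
# Apply the firewall change, then restart the agent so kube-router's
# in-memory snapshot is retaken with the new entries included.
ipset add my-firewall-set 5.6.7.8
systemctl restart k3s-agent
```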