fix(core): ignore duplicate ADDED pod watch events #1553
Conversation
Hey, are there any updates on this PR? It's been 2 months and this seems like a major issue. Without this fix we couldn't run KubeArmor for >8 hours in production before this resource leak started causing significant issues.
Hey @dqsully,
This completely makes sense. Apologies for missing it earlier; the holiday season got to us and this PR got overlooked.
We had been considering moving to informers and even have a commit for that (6dd22ef), but that hasn't gone through yet.
I have a small nitpick; it might be a bit opinionated, but maybe it's easier to understand as well?
I tried using a Kubernetes informer instead of the raw pod watch, and while it fixed the memory leak, it used significantly more memory in our Kubernetes cluster compared to the raw watch. You can see that in the cliff around 12/14 17:45 UTC on my original screenshot. (Although this probably wouldn't matter nearly as much if the KubeArmor DaemonSet pods weren't watching all pods and instead filtered the informer to only watch pods on the current node. I've actually already done this filtering in my fork, and I'll be making a PR for it soon, along with lots of other bugfixes I've made in my environment over the past couple months.)
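For illustration, here is a minimal sketch of such a node-scoped informer using client-go. This is not the fork's actual code; the `NODE_NAME` environment variable (injected via the downward API) and the resync interval are assumptions made for the sketch.

```go
package main

import (
	"os"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	// In-cluster config; assumes the pod runs with a service account that can list/watch pods.
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// NODE_NAME is assumed to be injected via the downward API (spec.nodeName)
	// in the DaemonSet manifest.
	nodeName := os.Getenv("NODE_NAME")

	// Scope the informer's list/watch to pods scheduled on this node only,
	// so the informer cache doesn't hold every pod in the cluster.
	factory := informers.NewSharedInformerFactoryWithOptions(
		client,
		30*time.Second, // resync period: arbitrary choice for this sketch
		informers.WithTweakListOptions(func(opts *metav1.ListOptions) {
			opts.FieldSelector = "spec.nodeName=" + nodeName
		}),
	)

	podInformer := factory.Core().V1().Pods().Informer()
	_ = podInformer // event handlers (AddFunc/UpdateFunc/DeleteFunc) would go here

	stop := make(chan struct{})
	factory.Start(stop)
	factory.WaitForCacheSync(stop)
}
```

The extra memory observed with the unfiltered informer comes largely from its full-cluster cache; a field selector like the one above should shrink that cache to only the node's own pods.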
Agreed, this was discussed in one of the community calls recently, but it seems like we forgot to open an issue.
I am really looking forward to them.
I will wait for the CI to pass before we merge this. Appreciate your help here.
I just went through main...dqsully:KubeArmor:other-improvements and want to pick up a lot of the changes. I would love to chat with you and get your feedback on KubeArmor in general.
You can join us on our Slack if you want real time interaction on anything.
I think I need to rebase my changes for CI to pass, lemme do that real quick.
Signed-off-by: Dakota Sullivan <djqballer@outlook.com>
Sorry, had to do one last force push to fix the commit verification (I need to fix my GitKraken config...).
Looks good now, will merge after the CI passes 🙌🏽 Great to know that you are trying to run KubeArmor in production. How are you using KubeArmor? We maintain an ADOPTERS.md and would love to hear about the use-cases if that's fine with you.
Purpose of PR?:
Fixes a memory leak I identified on our infrastructure while trying to deploy KubeArmor in our environment:
The `WatchK8sPods()` function uses raw Kubernetes watches instead of the Kubernetes SDK informer to track pods in a cluster. This isn't inherently an issue, and in KubeArmor's case it actually results in significantly lower memory usage, since KubeArmor doesn't care about most pod information. However, raw Kubernetes watches will eventually get stopped by the Kubernetes API server and you have to start a new watch, at which point every existing pod will be sent back as an "ADDED" event in the watch.

KubeArmor already restarts these watches, and even deduplicates stored pod information in an array, but it forwards every "ADDED" event verbatim to the `dm.UpdateEndPointWithPod()` function, which has no inherent event deduplication. This means that every time a watch restarts, duplicate endpoint configurations are added in memory for every pod in the cluster. And in a large cluster, this memory leak can grow pretty quickly.

My PR uses the duplicate status computed when storing pod information to determine whether an "ADDED" event should be forwarded to `dm.UpdateEndPointWithPod()` as a "MODIFIED" event instead, which greatly reduces KubeArmor memory leakage in our environment while still fully processing every pod change in the cluster.
Does this PR introduce a breaking change?
No
If the changes in this PR are manually verified, list down the scenarios covered:
Here's the average memory usage of the KubeArmor DaemonSet containers on our non-production Kubernetes cluster over the past 4 days, where I initially observed this memory leak and have been doing some testing on how to solve it:
tl;dr: the flat parts are my own changes, mostly just the changes in this PR. The positive slopes are KubeArmor v1.0.3 and v1.1.1, both with memory leaks after each redeployment.
Exactly what I did and when in this graph:

- `main` but using k8s SDK informers instead of raw watches
- `main` but ignoring duplicate "ADDED" events for `dm.UpdateEndPointWithPod()`
- `main` but with this PR's changes

This shows that with the unmodified v1.0.3 or v1.1.1 KubeArmor images, there's a clear memory leak, but with this PR's changes, while there still appears to be a small memory leak, the overall memory usage is much better.
We also ran KubeArmor v1.0.3 in production for only 8 hours until we had to take it down due to this memory leak and a knock-on CPU leak effect as well. This PR's changes have been running in our production environment for at least 24 hours now with very reasonable CPU and memory usage.