xt_ndpi module: Memory leak from never freed flow allocations after linux kernel v5.17-rc1 #147
Comments
This is very bad news, since xt_ndpi won't work without a kernel patch :(

Fix for issue #147: starting from kernel 5.15.32, the conntrack subsystem no longer uses nf_ct_hook->destroy.

It seems that the problem is not so serious.

@vel21ripn After torvalds/linux@1bc91a5 (v5.18-rc1) and torvalds/linux@1015c3d, the extension register API has been removed (according to the commit message, the destroy hook is replaced with nat_ops?), so the xt_ndpi module fails to compile on my Linux PC (v6.0.8). The v5.15.y branch does not seem to have picked up these commits so far.

This is really bad news.

I was able to replace the ndpi_nf_ct_destroy() function with livepatch. For now, I consider this option a temporary solution for newer kernels.

I looked into this problem a bit. It seems that the extension register APIs were removed with the following considerations:
So our problem is that with only NAT free logic hard coded into
Therefore, we may focus more on making

There are two options.
The second option has less overhead and can do without the rcu_pointer.

I've started working on support for 6.x kernels. The live-patching mechanism is used.

If possible, take a look at commit 3e6d0f1

For me, the build against the Arch kernel with the livepatch feature enabled now succeeds.

Tested on Arch kernel 6.1.12 and Debian kernel 6.0.8, it works successfully. Thanks for the work!

Describe the bug
In the Linux kernel, after commit torvalds/linux@6ae7989:
The nf_conntrack_put and nf_ct_put APIs no longer behave the same once ct_hook->destroy is changed. This causes the xt_ndpi kernel module to leak memory, and the module cannot be unloaded once the leak has happened, because conntrack code that uses the nf_ct_put API never calls the flow_free code of xt_ndpi.

Obtained behavior
Abnormally large memory consumption and module used counts.
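
To make the symptom concrete, here is a minimal userspace sketch of the bookkeeping this report describes. It is a toy model, not kernel code: the class and function names (ToyModule, run_connections, put_calls_hook) are hypothetical, and the assumption that the module used count equals created minus deleted flows is taken from the observation reported in this issue, not from the xt_ndpi source.

```python
class ToyModule:
    """Toy stand-in for a module that counts the flows it allocates and frees."""

    def __init__(self):
        self.created = 0
        self.deleted = 0

    def flow_alloc(self):
        self.created += 1

    def flow_free(self):
        self.deleted += 1

    @property
    def use_count(self):
        # Assumption from this report: every live flow pins the module once.
        return self.created - self.deleted


def run_connections(n, put_calls_hook):
    """Simulate n connections that are each created and later destroyed.

    put_calls_hook=True models the old behavior, where dropping the last
    reference reaches the module's destroy hook. False models the behavior
    reported here, where the entry is destroyed without the hook running.
    """
    mod = ToyModule()
    for _ in range(n):
        mod.flow_alloc()
        if put_calls_hook:
            mod.flow_free()
        # Otherwise the flow is never freed and stays counted.
    return mod
```

With the hook reached, run_connections(111, True).use_count stays at 0. With the hook bypassed, every finished connection leaves one flow behind, so the count only grows.
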
On a 4G Pi device, after xt_ndpi had been running for some days, 86% (3.2G/3.8G) of the memory was occupied, with no userspace process using more than 100MB of memory.

Checking kernel module usage with lsmod | grep xt_ndpi shows a used count of 840620:

After reading the source code, I learned that this large usage count comes exactly from xt_ndpi flows that are not freed:

1481895-633595=848300, very close to the module usage of 840620. (In fact, during my reboot debug test, they were strictly equal, e.g. created-deleted = 214-103 = 111 == 111 = lsmod usage.)

conntrack -F and conntrack timeouts do not reduce module usage at all. This is because both flush and timeout call nf_ct_delete, which in turn calls nf_ct_put, so the hooked ndpi_destroy_conntrack is never called.

Expected behavior
Figure out some way to clean up the unused flows. I have no idea how to detour nf_ct_put so that it calls the hooked ndpi function again. If the hook no longer works, maybe some GC or some iptables targets that manually free the flow would help?

Environment:
How to reproduce the reported bug
Run conntrack -F and see whether the module used count is reduced.
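
The check above can be scripted. The following is a rough sketch, assuming the standard three-column lsmod layout (Module, Size, Used by). The helper names module_use_count and leaked_flows are hypothetical. The Size value in the sample text is made up for illustration; only the used count and the created/deleted numbers come from this report.

```python
def module_use_count(lsmod_output, module="xt_ndpi"):
    """Return the 'Used by' count for a module from lsmod text, or None if absent."""
    for line in lsmod_output.splitlines():
        fields = line.split()
        # lsmod rows look like: <name> <size> <use count> [<users>]
        if len(fields) >= 3 and fields[0] == module:
            return int(fields[2])
    return None


def leaked_flows(created, deleted):
    """Flows that were allocated but never freed."""
    return created - deleted


# Illustrative sample only: the Size column (999999) is invented.
SAMPLE_LSMOD = """Module                  Size  Used by
xt_ndpi               999999  840620
"""

before = module_use_count(SAMPLE_LSMOD)
```

On a live system one would capture the output of lsmod before and after conntrack -F (for example with subprocess.run) and compare the two counts. If the count does not drop, the flows are not being freed on flush.
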