Memory leaking #146

Closed
klaus3000 opened this issue Feb 24, 2017 · 13 comments

@klaus3000

Hi!

dsc seems to leak memory:
[image]

We are seeing similar behavior on all our nodes. The more traffic a node has, the faster the memory usage grows. We are using version 2.4.0, but we also saw this behavior with older versions (e.g. pre-2.0). I can't remember whether we also had this problem with the very old Debian version (201203250530).

I once had a problem with the old 201203250530 version where a different memory leak (not growing slowly over days but consuming all memory within a few minutes) turned out to be caused by some crappy TCP traffic that made dsc loop inside the TCP reassembly code.

So now I have no idea where this is coming from. Maybe you could add some memory debugging to DSC (e.g. also reported in the XML files) to find out where this leak is coming from.

@klaus3000
Author

The recovery happens when we restart the processes or when OOM killer kills one of the running DSCs.

@jelu
Member

jelu commented Feb 24, 2017

Wow^Hopsi, will look into it!

@jelu jelu self-assigned this Feb 24, 2017
@klaus3000
Author

Note: I just checked a server where we had the old 201203250530-3 running, and the old version did not have the leak.

I now restarted one server without all the geoIP features - on Monday I will see if the leak is related to the new geoIP datasets.

@jelu
Member

jelu commented Feb 24, 2017

Great, thanks! I will run some large captures through dsc under valgrind on Monday to see if I can spot anything.

@jelu
Member

jelu commented Feb 27, 2017

Okay, I've gone through all the code and found one issue in the IPv4 fragment reassembly code: old fragments are not freed. The TCP reassembly code is OK since it clears them after 60 seconds. GeoIP seems OK too; I checked their latest code on GitHub, though you might be running an older version with memory leaks (I haven't dug through their changelog).
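
To illustrate the kind of leak described here, the following is a minimal Python sketch, not DSC's actual C code. It models a fragment table in which incomplete entries are evicted once they are older than a timeout. The names `FragmentTable`, `PendingDatagram` and `MAX_AGE` are hypothetical, and the 60-second value is an assumption borrowed from the TCP timeout mentioned above.

```python
import time
from dataclasses import dataclass, field

MAX_AGE = 60.0  # seconds; assumed value, mirrors the TCP timeout mentioned above


@dataclass
class PendingDatagram:
    first_seen: float
    fragments: dict = field(default_factory=dict)  # offset -> payload bytes


class FragmentTable:
    """Holds fragments of datagrams that are not yet fully reassembled."""

    def __init__(self, max_age=MAX_AGE):
        self.max_age = max_age
        self.pending = {}  # (src, dst, ip_id, proto) -> PendingDatagram

    def add(self, key, offset, payload, now=None):
        now = time.monotonic() if now is None else now
        # Without this call, entries whose remaining fragments never
        # arrive would stay in the table forever: that is the leak.
        self.expire(now)
        entry = self.pending.setdefault(key, PendingDatagram(first_seen=now))
        entry.fragments[offset] = payload

    def expire(self, now):
        """Drop every entry older than max_age; return how many were dropped."""
        stale = [k for k, e in self.pending.items()
                 if now - e.first_seen > self.max_age]
        for k in stale:
            del self.pending[k]
        return len(stale)
```

Scanning the table on every insert costs time proportional to the number of pending entries, so a real implementation might expire entries less often or keep them ordered by age.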

jelu added a commit to jelu/dsc that referenced this issue Feb 27, 2017
@jelu
Member

jelu commented Feb 27, 2017

Can you test the latest develop and tell me how it goes?

There may be a small performance impact if you have a lot of fragments that are not getting reassembled, because it will need to iterate over and clear the old ones.

If you see drops in packets captured during an interval, we can try adding an option to disable reassembly of IPv4 fragments (IPv6 fragments are ignored) and only process the first fragment.

@klaus3000
Author

I disabled TCP capturing and all the GeoIP features and memory still leaks. Hence, the segmentation reassembly may indeed be the problem.
I always thought DSC did not handle segments at all, hence my pcap filter currently does not capture segments:
bpf_program "(host x.x.x.x) and (udp port 53 or tcp port 53)";
Might this be the problem: our filter only sees the first segment, which is then never freed?
Which filter are you using to capture segments too?

btw: does DSC handle segmented IPv6 packets?

@jelu
Member

jelu commented Feb 28, 2017

There are two different reassemblies going on, one for TCP segments and one for IP fragmentation. The IP fragmentation reassembly has been in the code for many years; I don't know whether it has always been enabled.

I can't see why your filter would block IP fragments. Maybe you need to run tcpdump with the same filter on the same link and see if you get a lot of fragments. If you're using a SPAN/dump port from routers using jumbo frames into a normal link, it may break up a lot of packets.

The current code in DSC drops all IPv6 packets that have a fragmentation header.

jelu added a commit to jelu/dsc that referenced this issue Feb 28, 2017
- add conf option `drop_ip_fragments` to control IP fragmentation reassembly
  within pcap_layers
- Fix spacing in conf man-page
jelu added a commit to jelu/dsc that referenced this issue Feb 28, 2017
- Issue DNS-OARC#146: add conf option `drop_ip_fragments` to control IP
  fragmentation reassembly within pcap_layers
- Fix spacing in conf man-page
@jelu jelu mentioned this issue Feb 28, 2017
@jelu
Member

jelu commented Feb 28, 2017

Current develop now has the drop_ip_fragments option, if you want to try it.

@klaus3000
Author

a) I tested your memory-leak fix and it seems to work (it has only been running for 24h so far).

b) My filter (udp port 53 or tcp port 53) captures only the first fragment, as the remaining fragments do not have a UDP/TCP header. Hence the port filter will match only the first fragment.

c) What exactly does drop_ip_fragments do? Does it drop all fragments, including the first one, or only the remaining fragments?

Will DSC analyze an incomplete answer (e.g. only the first fragment was seen)?
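
Point b) follows from how IPv4 fragmentation works: only the fragment with offset 0 carries the UDP/TCP header, so a port match has nothing to test on later fragments. A minimal Python sketch, with hypothetical helper names (`classify_ipv4`, `make_header`), that classifies a raw IPv4 header by its flags and fragment-offset field (bytes 6 and 7):

```python
import struct

MF_FLAG = 0x2000      # "more fragments" bit in the flags/offset field
OFFSET_MASK = 0x1FFF  # lower 13 bits: fragment offset in 8-byte units


def classify_ipv4(header: bytes) -> str:
    """Return 'unfragmented', 'first-fragment' or 'later-fragment'."""
    if len(header) < 20 or header[0] >> 4 != 4:
        raise ValueError("not an IPv4 header")
    (flags_offset,) = struct.unpack_from("!H", header, 6)
    offset = flags_offset & OFFSET_MASK
    more = bool(flags_offset & MF_FLAG)
    if offset != 0:
        # No UDP/TCP header here, so "port 53" cannot match this packet.
        return "later-fragment"
    return "first-fragment" if more else "unfragmented"


def make_header(flags_offset: int) -> bytes:
    """Build a minimal 20-byte IPv4 header for demonstration (checksum left 0)."""
    return struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20, 1, flags_offset,
                       64, 17, 0, bytes(4), bytes(4))
```

In pcap filter syntax, later fragments can be matched with `ip[6:2] & 0x1fff != 0`, so a filter such as `(udp port 53 or tcp port 53) or (ip[6:2] & 0x1fff != 0)` should also capture them. Whether that suits a given DSC setup is untested here.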

@jelu
Member

jelu commented Mar 1, 2017

a) Great! Then I consider this issue resolved :)
b) True
c) It will drop all fragments

DSC needs the header and first question to process the query/response, otherwise it is marked as malformed.
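
As an illustration of why the header and first question are the minimum needed, here is a minimal Python sketch. It is not DSC's parser, and `parse_header_and_question` and `Malformed` are hypothetical names. It reads the 12-byte DNS header and the first question, and reports a truncated message as malformed. Name compression is not handled, on the assumption that the first question name is uncompressed.

```python
import struct


class Malformed(Exception):
    pass


def parse_header_and_question(msg: bytes):
    """Return (id, is_response, qname, qtype, qclass) or raise Malformed."""
    if len(msg) < 12:
        raise Malformed("short header")
    msg_id, flags, qdcount = struct.unpack_from("!HHH", msg, 0)
    if qdcount < 1:
        raise Malformed("no question")
    pos, labels = 12, []
    while True:
        if pos >= len(msg):
            raise Malformed("truncated name")
        length = msg[pos]
        pos += 1
        if length == 0:
            break  # root label ends the name
        if length > 63 or pos + length > len(msg):
            raise Malformed("bad label")
        labels.append(msg[pos:pos + length].decode("ascii", "replace"))
        pos += length
    if pos + 4 > len(msg):
        raise Malformed("truncated question")
    qtype, qclass = struct.unpack_from("!HH", msg, pos)
    return msg_id, bool(flags & 0x8000), ".".join(labels), qtype, qclass
```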

Making it process only the first fragment and skip reassembly would be a separate feature request, and I currently can't say when I will have time for it.

@jelu jelu closed this as completed Mar 1, 2017
@klaus3000
Author

So, when using (udp port 53 or tcp port 53), DSC will usually process fragmented packets, as the first fragment should contain the header, the question and the answer section, right?
So, will DSC wait for the other fragments until a timeout and then process the first fragment only?

@jelu
Member

jelu commented Mar 1, 2017

No, any fragmented packets are put on a list until they can be fully reassembled. Only after that are they processed. If they time out, they are dropped.
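
The behaviour described here can be sketched as follows, in Python rather than DSC's C and with hypothetical names (`Reassembler`, `add`). Fragments are stored per datagram, and a payload is returned only when every byte from offset 0 up to the end of the last fragment (the one without the more-fragments flag) is present. Until then nothing is handed on for processing.

```python
class Reassembler:
    def __init__(self):
        # key -> {"frags": {offset: bytes}, "total": int or None}
        self.pending = {}

    def add(self, key, offset, payload, more_fragments):
        """Store one fragment; return the full payload once complete, else None."""
        entry = self.pending.setdefault(key, {"frags": {}, "total": None})
        entry["frags"][offset] = payload
        if not more_fragments:
            # The last fragment fixes the total payload length.
            entry["total"] = offset + len(payload)
        if entry["total"] is None:
            return None
        # Walk the fragments in offset order and check there are no holes.
        data = bytearray()
        for off in sorted(entry["frags"]):
            if off > len(data):
                return None  # gap: keep waiting
            chunk = entry["frags"][off]
            data[off:off + len(chunk)] = chunk
        if len(data) < entry["total"]:
            return None
        del self.pending[key]
        return bytes(data[:entry["total"]])
```

Entries that never complete stay in `pending`, which is why a timeout that drops them is needed alongside this logic.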
