arpwatch 100% cpu with libpcap >=1.5.1 #333
Comments
This is in the TPACKET_V3 code, which is new in 1.5, and has not been changed much since 1.5.0, so 1.5.2 probably has the same problem. Are the machines on which you couldn't reproduce this running Linux with a 3.2 or later kernel (and thus with TPACKET_V3 support), or are they either not running Linux or running with an earlier kernel (and thus without TPACKET_V3 support)? |
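To answer the kernel question above on a given machine, a small check along these lines may help. It is only a sketch: the function names `release_at_least` and `running_kernel_is_3_2_or_later` are made up for illustration, and the kernel version is just a proxy, since the libpcap build also has to have been compiled with TPACKET_V3 support.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/utsname.h>

/* Hypothetical helper: returns 1 if a release string such as
 * "3.12.1-1-ARCH" is at least major.minor, 0 if it is older,
 * and -1 if the string cannot be parsed. */
int release_at_least(const char *release, int major, int minor)
{
    int maj = 0, min = 0;

    assert(release != NULL);
    if (sscanf(release, "%d.%d", &maj, &min) != 2)
        return -1;
    if (maj != major)
        return maj > major;
    return min >= minor;
}

/* Hypothetical helper: asks uname(2) for the running kernel release and
 * compares it with 3.2, the version mentioned in the comment above.
 * Returns 1, 0, or -1 with the same meaning as release_at_least(). */
int running_kernel_is_3_2_or_later(void)
{
    struct utsname u;

    if (uname(&u) != 0)
        return -1;
    return release_at_least(u.release, 3, 2);
}
```

Calling `running_kernel_is_3_2_or_later()` on the affected host and on the machine that does not reproduce the problem would show whether the two differ in this respect.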
The machine where I couldn't reproduce it is mostly the same: an up-to-date Arch Linux i686 with a 3.12 kernel, except that it's a VM and the network stack differs. But after more tries, I can now reproduce it on a VM. Recipe:
The key ingredient is to put an iface in a bridge. |
Sébastien Luttringer notifications@github.com wrote:
So, my guess is that it is cycling through the receive ring without stopping, not [...] I know little about the TPACKET_V3 code, so this is a guess. Clearly, you need to have an interface on the bridge interface so that at [...] |
OK, so on my Fedora 16 (3.6.11-4.fc16.x86_64 kernel) virtual machine (VMware Fusion) I did
and tried running tcpdump (with both it and the libpcap it uses built from the trunk) and arpwatch (installed from the standard RPM, and probably using the libpcap the OS comes with, which isn't 1.5) on So I'm not sure what's required to provoke the bug. Perhaps it's kernel-version related. Debugging this might require running modified versions of libpcap on the machine on which you reproduced it. |
I.e., you run strace against the arpwatch process, and it prints nothing, meaning it's probably executing no system calls, not even a |
I mean, when the process reaches 100% CPU, the poll output stops,
and the process is running (not in a syscall).
An important ingredient to trigger the 100% CPU seems to be sending ARP requests.
|
Another important ingredient is to run a version of arpwatch that's actually linked with a new version of libpcap. :-) I was able to reproduce it, and it was a simple error (the negation of Arpwatch is what triggered it because arpwatch was passing a count of 0, not -1, to Fixed in 1a52c9a. |
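The comment above is truncated in this copy, so the exact expression that was wrong is not shown here. The following is only a hypothetical sketch of the general failure mode it describes: a loop that treats a non-positive count as "no limit" in one place, but uses a wrongly negated test (`>= 0` instead of `> 0`) in another, so a count of 0 clamps every read to zero packets and the loop never makes progress. The names `drain_block` and `buggy`, and the spin guard, are made up for illustration and are not libpcap code.

```c
#include <assert.h>

/* Hypothetical model of draining one ring block that holds
 * packets_left packets. max_packets <= 0 means "no limit".
 * Returns the number of packets consumed, or -1 if the loop
 * stopped making progress (the guard replaces a real 100% CPU spin). */
int drain_block(int max_packets, int packets_left, int buggy)
{
    int pkts = 0;
    int spins = 0;

    assert(packets_left >= 0);

    /* Non-positive max_packets: keep going until the block is empty. */
    while (pkts < max_packets || max_packets <= 0) {
        int to_read;
        int limited;

        if (packets_left == 0)
            break;

        to_read = packets_left;

        /* The correct negation of "max_packets <= 0" is "max_packets > 0".
         * The buggy variant uses ">= 0", so 0 counts as a real limit. */
        limited = buggy ? (max_packets >= 0) : (max_packets > 0);
        if (limited && to_read > max_packets)
            to_read = max_packets;   /* with count 0 and the bug: 0 packets */

        packets_left -= to_read;
        pkts += to_read;

        if (++spins > 1000)          /* illustration only: detect the spin */
            return -1;
    }
    return pkts;
}
```

In this model a count of -1 behaves the same with or without the bug, which would be consistent with a caller that passes 0 being the one that triggers it.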
I've backported 1a52c9a to the 1.5 branch. |
I was able to reproduce this without any bridging, so the bridging had nothing to do with it. |
I found a regression in libpcap between 1.4.0 and >=1.5.1 which causes arpwatch to consume 100% of CPU and stop working when listening on a bridge interface on i686.
I'm currently not able to reproduce it on another computer (a VM).
I tested with a fresh 1.5.2 and the regression is still present.
I also tested with different Linux kernel versions: 3.9, 3.10, 3.11 and 3.12.
All of this was tested on an i686 Arch Linux host which acts as a router/firewall.
strace and ltrace give nothing: no output while the process is running at 100%. So I guess the process is looping inside a library call.
The gdb backtrace (see below) shows that the program is trapped in pcap_read_linux_mmap_v3 around ./pcap-linux.c:44xx (1.5.2 tarball).
Do you need more information?
[1] https://projects.archlinux.org/svntogit/packages.git/tree/trunk/PKGBUILD?h=packages/libpcap