False packet loss when using shared libpcap from Ubuntu 14.04 #34

nnposter opened this issue Dec 31, 2014

Nmap compiled on Ubuntu 14.04 against the shared (system) libpcap reports lost packets, which in turn significantly hurts performance and the quality of results. However, Wireshark running simultaneously on the same host does see those packets. The issue does not occur when Nmap is compiled with its included copy of libpcap.

Details are available at http://seclists.org/nmap-dev/2014/q2/538

Possibly related:
http://seclists.org/nmap-dev/2014/q2/548
http://seclists.org/nmap-dev/2014/q2/341

Confirmed on Ubuntu 14.04 on VirtualBox with a NAT adapter. When dynamically linked against the system libpcap, Nmap sends about 2x as many packets and takes about 4x as long.

Regarding the 2x as many packets: a dropped probe results in an increased number of retries. Since most ports are closed, this means most ports will get an extra retry. I'm seeing anywhere between 1 and 3 extra retries with the system libpcap. With the default of 1 retry (2 probes per port), that works out to 3 to 5 probes per port, i.e. between 1.5 and 2.5 times as many packets sent. Still no clue as to why it's happening, though.

OK, I'm not sure why packets are being dropped; maybe that's a bug on our end. But I do see a solid, reproducible difference between the two libpcaps in the time taken to get one packet. Command to reproduce: nmap -n -Pn -p 80 -d --packet-trace scanme.nmap.org. This sends exactly 1 SYN packet, receives exactly 1 SYN/ACK, and then the OS sends a RST (assuming no real network problems).

Within readip_pcap, we call pcap_select (from libnetutil) to wait for the pcap fd to become readable. With our included libpcap, select returns almost immediately, since a packet is available on the pcap fd. With the system libpcap, it waits out the entire timeout (at least 1 second for the first packet) before returning. The receive time recorded on the packet is the same, so there is no difference in how fast the packet is processed; the select call is the only thing delaying us.
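
For reference, pcap_select in libnetutil boils down to a select() on the pcap descriptor. A minimal sketch (illustrative only, not the verbatim libnetutil code; assumes a selectable fd, which Linux provides):

#include <pcap.h>
#include <sys/select.h>

/* Wait up to to_usec microseconds for the pcap fd to become readable.
 * Returns >0 if readable, 0 on timeout, -1 on error. */
static int pcap_select_sketch(pcap_t *pd, long to_usec) {
  int fd = pcap_get_selectable_fd(pd);
  fd_set rfds;
  struct timeval tv;

  if (fd < 0)
    return -1;
  FD_ZERO(&rfds);
  FD_SET(fd, &rfds);
  tv.tv_sec = to_usec / 1000000;
  tv.tv_usec = to_usec % 1000000;
  return select(fd + 1, &rfds, NULL, NULL, &tv);
}

The symptom above means the fd is not becoming readable even though a packet has already been captured, so the select sleeps for the full timeout.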

I don't know if this is related, but just above the pcap_select call in readip_pcap we attempt a non-blocking read by calling pcap_setnonblock followed by pcap_next. However, the man page for pcap_setnonblock says that pcap_next "will not work in non-blocking mode." It looks like we should be able to eliminate that whole code block, but I don't think it will affect this issue.
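
To make that questionable block concrete, the pattern looks roughly like this (a paraphrase; the function name and identifiers are illustrative, not the verbatim tcpip.cc code):

#include <pcap.h>

/* Paraphrase of the pre-select block in readip_pcap().
 * pd is the already-open pcap_t handle. */
static const u_char *try_nonblocking_read(pcap_t *pd, struct pcap_pkthdr *head) {
  char errbuf[PCAP_ERRBUF_SIZE];
  const u_char *p = NULL;

  /* Per pcap_setnonblock(3PCAP), pcap_next() "will not work" in
   * non-blocking mode, so this quick pre-read is unreliable at best. */
  if (pcap_setnonblock(pd, 1, errbuf) == 0) {
    p = pcap_next(pd, head);
    pcap_setnonblock(pd, 0, errbuf);
  }
  return p;  /* NULL means: fall through to pcap_select() + blocking read */
}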

OK, I've got good news and bad news, but I'm not sure which is which.

First, the difference between our included libpcap 1.5.3 and the system one is the result of a bug in our version. We introduced a configure option to turn off packet-ring capture support for some 2.6 kernels that can't produce a good 32-bit binary. This is implemented as a preprocessor macro, PCAP_SUPPORT_PACKET_RING, but we forgot to put it into libpcap/config.h.in, so this support is effectively always turned off with the included libpcap. If I "fix" this problem, then we get the packet-loss events with both the included and the system libpcap.
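
To illustrate the failure mode: the macro is consumed via preprocessor guards, so if configure never writes it into the generated config header, ring support silently compiles out. The shape of the bug (illustrative, not the verbatim source):

/* The generated config header needed this autoheader template line,
 * which was missing, so configure never defined the macro: */
#undef PCAP_SUPPORT_PACKET_RING

/* Meanwhile, guards like this in the capture code mean an undefined
 * macro silently disables the memory-mapped packet ring: */
#ifdef PCAP_SUPPORT_PACKET_RING
  /* ... set up TPACKET mmapped (packet ring) capture ... */
#else
  /* ... fall back to non-ring capture ... */
#endif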

Second, I believe the problem is related to this libpcap issue: the-tcpdump-group/libpcap#335. Note: related, but not identical. The "fix" for that issue is already in libpcap 1.5.3, so we have it. There are a couple of related changes to the Linux kernel this year that address the deficiency (one is linked from this one: torvalds/linux@41a50d6).

@dmiller-nmap dmiller-nmap added this to the Release codename Auxesis milestone Apr 29, 2015

A few final notes on this issue:

  • The most relevant bug report on libpcap for this issue is the-tcpdump-group/libpcap#380. The suggested workaround there is using a very short timeout on select(). I can confirm that this works to avoid packet loss if we add a short-timeout select before the primary one in readip_pcap, but I can't figure out how to make this a general solution that doesn't alter the timing too much or just become a busy wait. Example patch:
--- a/tcpip.cc
+++ b/tcpip.cc
@@ -1710,6 +1710,9 @@ char *readip_pcap(pcap_t *pd, unsigned int *len, long to_usec,

     if (p == NULL) {
       /* Nonblocking pcap_next didn't get anything. */
+      if (to_usec < 200000 && pcap_select(pd, to_usec) > 0)
+        p = (char *) pcap_next(pd, &head);
+      else
       if (pcap_select(pd, to_usec) == 0)
         timedout = 1;
       else
  • As noted in that libpcap issue, the underlying problem is a bug in Linux TPACKET_V3 mmapped packet capture. This bug was fixed in Linux 3.19, and I can confirm that Nmap has no further issues on Linux 4.0. Please file bug reports with your distros to backport the patch if possible: torvalds/linux@da413ee
  • As a workaround, you can configure Nmap with --disable-packet-ring --with-libpcap=included (the former option is passed along to libpcap's configure script); see the example after this list. This disables the mmapped packet-ring capture, which could slow Nmap down in very-high-packet-rate cases but is far less troublesome than this particular bug. Alternatively, you could try mucking around in libpcap/pcap-linux.c to downgrade to TPACKET_V2, which does not have this problem, but I have not tried that.
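
For example, when building from source from the top of the Nmap tree:

./configure --with-libpcap=included --disable-packet-ring
make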

Given that we have these workarounds, and the bug is demonstrated to be in the Linux kernel code, not in Nmap or libpcap, I am removing the release milestone. I will leave the bug open until either several major distros backport the kernel fix or we find a suitable workaround.

@dmiller-nmap dmiller-nmap removed this from the Release codename Auxesis milestone May 21, 2015

nnposter commented Jul 7, 2015

FWIW, I have experimented with downgrading to TPACKET_V2 instead of disabling the packet ring. The downgrade does rectify the issue, but in my light testing I have not noticed any material performance advantage over disabling the ring outright.

Here is the corresponding patch if anybody cares for it.

--- a/libpcap/pcap-linux.c
+++ b/libpcap/pcap-linux.c
@@ -188,6 +188,8 @@
 # endif /* PACKET_HOST */


+# undef TPACKET3_HDRLEN
+
  /* check for memory mapped access avaibility. We assume every needed
   * struct is defined if the macro TPACKET_HDRLEN is defined, because it
   * uses many ring related structs and macros */
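
For context on why the #undef works: libpcap only compiles its TPACKET_V3 support when the kernel headers define TPACKET3_HDRLEN, via a guard of roughly this shape in pcap-linux.c (paraphrased):

/* TPACKET_V3 support is keyed off TPACKET3_HDRLEN from the kernel
 * headers; #undef'ing it earlier in the file forces the (unaffected)
 * TPACKET_V2 code path instead. */
#ifdef TPACKET3_HDRLEN
# define HAVE_TPACKET3
#endif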

Is this fixed in Nmap 7?

mpontillo commented Sep 14, 2016 (edited)

Cross-posting here from the related Launchpad issue. Inspired by the flow-disruptor workaround, I did a proof-of-concept nmap workaround as follows:

$ svn diff libnetutil/netutil.cc
Index: libnetutil/netutil.cc
===================================================================
--- libnetutil/netutil.cc   (revision 36280)
+++ libnetutil/netutil.cc   (working copy)
@@ -4073,7 +4073,8 @@
   Strncpy(pcapdev, device, sizeof(pcapdev));
 #endif
   do {
-    pt = pcap_open_live(pcapdev, snaplen, promisc, to_ms, err0r);
+    //pt = pcap_open_live(pcapdev, snaplen, promisc, to_ms, err0r);
+    pt = pcap_create(pcapdev, err0r);
     if (!pt) {
       failed++;
       if (failed >= 3) {
@@ -4084,6 +4085,11 @@
       sleep( compute_sleep_time(failed) );
     }
   } while (!pt);
+  pcap_set_promisc(pt, promisc);
+  pcap_set_timeout(pt, to_ms);
+  pcap_set_snaplen(pt, snaplen);
+  pcap_set_immediate_mode(pt, 1);
+  pcap_activate(pt);

 #ifdef WIN32
   if (wait == WAIT_ABANDONED || wait == WAIT_OBJECT_0) {

Obviously, this is nowhere near production-ready code, but I wanted to convince myself that the pcap_set_immediate_mode() workaround could work in nmap as well.
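
For comparison, an error-checked version of that open sequence might look roughly like the sketch below (illustrative only; open_capture is a made-up helper, and pcap_set_immediate_mode requires libpcap 1.5.0 or later):

#include <stdio.h>
#include <pcap.h>

/* Open and activate a capture handle with immediate mode enabled.
 * Returns an activated handle, or NULL with a message in errbuf
 * (errbuf must be at least PCAP_ERRBUF_SIZE bytes). */
static pcap_t *open_capture(const char *dev, int snaplen, int promisc,
                            int to_ms, char *errbuf) {
  pcap_t *pt = pcap_create(dev, errbuf);
  if (pt == NULL)
    return NULL;
  if (pcap_set_snaplen(pt, snaplen) != 0 ||
      pcap_set_promisc(pt, promisc) != 0 ||
      pcap_set_timeout(pt, to_ms) != 0 ||
      pcap_set_immediate_mode(pt, 1) != 0 ||  /* deliver packets as they arrive */
      pcap_activate(pt) < 0) {                /* negative is an error; positive is a warning */
    snprintf(errbuf, PCAP_ERRBUF_SIZE, "%s", pcap_geterr(pt));
    pcap_close(pt);
    return NULL;
  }
  return pt;
}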

This caused the scan of a single host (using an Ubuntu 16.04 "Xenial" host running kernel 4.4.0) to go from taking ~45 seconds to ~5 seconds.

For the record, the test case I used was: sudo time ./nmap -sS -vv <host>.

@mpontillo Thanks for notifying us! I'd like to play around with immediate mode a bit more to see how it could best work for us, but for now I'd settle for reproducing the original bug on a kernel newer than 3.19, or at least fully describing and isolating it. Can you provide the output of nmap --version and uname -a for the setup that causes 45-second scans? Thanks!

mpontillo commented Sep 15, 2016 (edited)

Sure; below are some additional details.

First, here is my uname -a:

Linux xenial 4.4.0-36-generic #55-Ubuntu SMP Thu Aug 11 18:01:55 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

I tested with both the current version of nmap in Ubuntu 16.04, and a version I built from source (using the nmap-7.10 branch in Subversion).

The version packaged with Ubuntu (which, after running apt-get source nmap and checking debian/rules, you can tell is compiled with --with-liblua --with-liblinear --enable-ipv6) is:

Nmap version 7.01 ( https://nmap.org )
Platform: x86_64-pc-linux-gnu
Compiled with: liblua-5.2.4 openssl-1.0.2g libpcre-8.38 libpcap-1.7.4 nmap-libdnet-1.12 ipv6
Compiled without:
Available nsock engines: epoll poll select

With that version, I first saw the symptom: long scan times, and messages like the following printed to the console:

Increasing send delay for 192.168.0.9 from 0 to 5 due to 92 out of 306 dropped probes since last increase.
...
Nmap done: 1 IP address (1 host up) scanned in 45.53 seconds
           Raw packets sent: 2754 (121.160KB) | Rcvd: 1266 (50.652KB)
0.08user 0.08system 0:45.55elapsed 0%CPU (0avgtext+0avgdata 14852maxresident)k
1336inputs+0outputs (0major+1982minor)pagefaults 0swaps

I saw the same issue in the nmap-7.10 branch when I compiled from source. Then I compiled with the workaround I posted (with no ./configure arguments, so it would use the shared-library version of libpcap), as follows:

Nmap version 7.12 ( https://nmap.org )
Platform: x86_64-unknown-linux-gnu
Compiled with: nmap-liblua-5.2.4 openssl-1.0.2g libpcre-8.38 libpcap-1.7.4 nmap-libdnet-1.12 ipv6
Compiled without:
Available nsock engines: epoll poll select

The output from this version was normal, such as:

Nmap done: 1 IP address (1 host up) scanned in 5.99 seconds
           Raw packets sent: 1339 (58.900KB) | Rcvd: 1001 (40.052KB)
0.05user 0.02system 0:06.02elapsed 1%CPU (0avgtext+0avgdata 14472maxresident)k
2696inputs+0outputs (1major+1951minor)pagefaults 0swaps