Remove use of pcap-int.h on Windows #426

Closed
dmiller-nmap opened this Issue Jun 17, 2016 · 9 comments

pcap-int.h contains pcap "internals" that we shouldn't really be using. On Windows, we use it to peek into the struct pcap in order to retrieve the ADAPTER *adapter member to use in a call to PacketSetReadTimeout in tcpip.cc. The reason for this call is explained in this nmap-dev post from 2010.

Since this header contains internal implementation-specific details, we shouldn't rely on it not changing (though it hasn't in almost a decade). Instead, we should find a way to accomplish the same task (setting varying timeouts on a single pcap descriptor) in a way that works on all platforms. This might be accomplished via #123, but it might require separate or extra effort.

@hsluoyz hsluoyz added a commit to nmap/npcap that referenced this issue Jun 17, 2016

@hsluoyz hsluoyz Made pcap_set_timeout() function work even after the pcap device is opened. This patch is for the issue: nmap/nmap#426
b400111
Member

hsluoyz commented Jun 17, 2016 edited

Hi @dmiller-nmap ,

I have fixed this issue. Starting from Npcap 0.07 R16, pcap_set_timeout() can be called after the pcap device is opened; it sets the timeout in the driver immediately, just like PacketSetReadTimeout. The commit is here: nmap/npcap@b400111

Try this feature with latest Npcap installer at:
https://github.com/nmap/npcap/releases

Since Nmap will definitely use Npcap, I hope you can replace the PacketSetReadTimeout call with pcap_set_timeout() and remove the reference to pcap-int.h from Nmap's code; I will then remove pcap-int.h from the Npcap SDK.

The reason for this call is explained in this nmap-dev post from 2010.

OK, a couple of points:

Whether a given UN*X platform supports select() may depend not only on the platform but also on the OS version. The pcap_get_selectable_fd() man page discusses this in detail; newer versions of *BSD and OS X do support it correctly. (I need to update the man page to note that the workaround it describes doesn't work on Snow Leopard, due to some breakage in non-blocking mode. The breakage was fixed in Lion - and select()/poll() were also fixed to work, without the workaround, on Lion. The fix is based on the FreeBSD fix for the same problem, and is, as I remember, similar to the fixes done on other *BSDs.) I haven't tested whether select() works on BPF devices on Solaris 11 (and can't look at the source to check - thanks, Larry!). I should fire up my Solaris 11 virtual machine and try it.

If you are trying to get notified of packet arrival as soon as the packet is seen by the networking software, you can, with newer versions of libpcap, turn on "immediate mode", which means BPF immediate mode with BPF, using TPACKET_V2 on Linux with memory-mapped capture, using a zero bufmod timeout with Solaris 10 and earlier, and not turning on ENBATCH on Tru64 UNIX (and making some attempts on SunOS prior to 5.0, but I'm just guessing there, and with Endace DAG cards). On Windows, it calls PacketSetMinToCopy() with a minimum-to-copy of 0; however, WinPcap 4.1.3 isn't based on a recent enough version of libpcap to have "immediate mode", and the current Npcap code isn't based on a recent enough version of libpcap to have it, either.

"Immediate mode" has to be set before activating the pcap_t; on most platforms that's not a requirement, but it is a requirement on Linux with TPACKET_V3 (you can't just switch to TPACKET_V2, you'd have to completely destroy the ring buffer, close the PF_PACKET socket, and open a new socket and set it up from scratch).

"Immediate mode" may still do buffering, if packets arrive between the point at which the kernel delivers a wakeup and the point at which pcap actually tries getting packets, so it makes no guarantee of one-packet-per-wakeup. All it does is indicate that you should get a wakeup as soon as the first new packet arrives.

Since this header contains internal implementation-specific details, we shouldn't rely on it not changing (though it hasn't in almost a decade).

Actually, it has changed significantly. In libpcap 1.5, for example, a lot of variables were moved into per-pcap-XXX.c structures, with a void * pointer to that structure replacing them in struct pcap. Those structures are even more opaque, as in "they're not declared in any header file".

(setting varying timeouts on a single pcap descriptor) in a way that works on all platforms.

Unfortunately, on Linux, with TPACKET_V3, it looks as if that can only be set when the packet ring buffer is created, and can't be adjusted afterwards.

On {OS X, *BSD}, it could be done, and it's probably supported on Solaris 11 as well. On earlier versions of Solaris, I think it could be done as well.

But "on Linux, with TPACKET_V3" is a big roadblock, unless you don't need to change the timeouts after pcap_activate()/pcap_open_live() on Linux.

So what is the underlying issue nmap is trying to solve here?

BTW, be careful when extending libpcap APIs for Npcap - they may not be as extensible as you'd like, and you may have to maintain the source changes indefinitely if you don't discuss them on tcpdump-users first.

If supporting changing the read timeout after opening/activating, on platforms where it's possible, is necessary, that would, in libpcap, have to be done by adding a new function pointer for the "change timeout" operator, and the Linux version would, if TPACKET_V3 is being used (which it would be, with kernel versions that support it and libpcap versions that know about it, unless immediate mode is specified), just return PCAP_ERROR_ACTIVATED.
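The "change timeout" operator described above can be sketched roughly as follows. This is only an illustration of the dispatch idea, not libpcap code: `struct sketch_handle`, `sketch_set_timeout()`, the two backend functions, and `SKETCH_ERROR_ACTIVATED` (a placeholder standing in for libpcap's `PCAP_ERROR_ACTIVATED`) are all hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder for libpcap's PCAP_ERROR_ACTIVATED; hypothetical name. */
#define SKETCH_ERROR_ACTIVATED (-4)

struct sketch_handle;
typedef int (*set_timeout_op_t)(struct sketch_handle *, int timeout_ms);

/* Hypothetical handle: just enough state to show the per-backend operator. */
struct sketch_handle {
    int activated;                    /* nonzero once the device is open */
    int timeout_ms;                   /* current read timeout */
    set_timeout_op_t set_timeout_op;  /* per-backend "change timeout" */
};

/* Backend where the timeout can be changed at any time (for example BPF). */
static int set_timeout_adjustable(struct sketch_handle *h, int timeout_ms)
{
    h->timeout_ms = timeout_ms;
    return 0;
}

/* Backend where the timeout is fixed once the ring buffer exists
 * (the TPACKET_V3 case described in the thread). */
static int set_timeout_fixed_after_activate(struct sketch_handle *h, int timeout_ms)
{
    if (h->activated)
        return SKETCH_ERROR_ACTIVATED;
    h->timeout_ms = timeout_ms;
    return 0;
}

/* Public entry point: dispatch through the backend's function pointer. */
static int sketch_set_timeout(struct sketch_handle *h, int timeout_ms)
{
    assert(h != NULL && h->set_timeout_op != NULL);
    return h->set_timeout_op(h, timeout_ms);
}
```

A caller would have to check the return value and fall back to some other strategy on backends that refuse the change.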

Member

hsluoyz commented Jun 28, 2016 edited

I didn't fully understand the details here, but it seems that the latest libpcap has solved this issue? If so, I think the current approach is just a workaround. In fact, I'm working on integrating the latest libpcap with Npcap, so I personally think this issue will go away automatically when that work is done.


UPDATE:

Now Npcap has moved to latest libpcap 1.8.0. So it seems that we can use the new immediate mode @guyharris mentioned now. I think all we need to do is to call pcap_set_immediate_mode before pcap_activate, then remove the PacketSetReadTimeout call, right?

But according to @dmiller-nmap , our goal is setting varying timeouts on a single pcap descriptor. So it seems that we need to change this timeout when the adapter is opened, not just setting the mode to immediate before activating the adapter. Does libpcap have a way to do this?

Member

hsluoyz commented Aug 10, 2016 edited

Hi @bonsaiviking @dmiller-nmap ,

I have found out why Nmap crashes with Npcap built on libpcap 1.8.0. Nmap includes an internal libpcap header, pcap-int.h, and the offset of the adapter member in the pcap struct has changed in that file:

In the old 1.0.0 pcap-int.h: the offset is 0.
In the new 1.8.0 pcap-int.h, the offset is 8 bytes. There are two new function pointers before the adapter member.

So Nmap got the wrong adapter value when linking with the new Npcap. The solutions are:

  1. Update the Nmap repo's pcap-int.h to the latest trunk one.
  2. Remove the dependence on pcap-int.h.

I think we can go with 1. as a workaround, but 2. is better, because Nmap should not rely on an internal-use header. pcap-int.h is not part of the Npcap SDK either.
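The failure mode is easy to reproduce in isolation. The sketch below uses two made-up structs (`layout_old`, `layout_new`) that only imitate the change described above, a pointer member that used to come first and now follows two function pointers; they are not the real `struct pcap` definitions. On a build with 4-byte pointers the new offset comes out as 8 bytes; with 8-byte pointers it is 16.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*sketch_op_t)(void *);

/* Hypothetical "old header" layout: adapter is the first member. */
struct layout_old {
    void *adapter;
    int other;
};

/* Hypothetical "new header" layout: two function pointers were added
 * in front of adapter, so its offset moved. */
struct layout_new {
    sketch_op_t op_a;
    sketch_op_t op_b;
    void *adapter;
    int other;
};

/* Read a pointer-sized value at a given byte offset, which is what code
 * compiled against a particular header effectively does. */
static void *adapter_at(const void *handle, size_t offset)
{
    void *p;
    assert(handle != NULL);
    memcpy(&p, (const char *)handle + offset, sizeof p);
    return p;
}
```

Code built against the old layout calls the equivalent of `adapter_at(h, offsetof(struct layout_old, adapter))` on an object that actually has the new layout, and so picks up the first function pointer instead of the adapter.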

I didn't carry over the feature that lets pcap_set_timeout() be called after the pcap device is opened into the current libpcap-based code (so that trick ends in 0.08), because none of you seemed to support it. Let me know if you need any help.

OK, so:

  1. There are platforms on which the timeout cannot be adjusted after calling pcap_activate(), so doing so wouldn't be a portable solution.

  2. With WinPcap, there are no select()able handles, because select() takes fd_sets as arguments, and those are bit sets based on SOCKET values, and WinPcap doesn't use sockets. There are WaitForSingleObject()/WaitForMultipleObjects()/MsgWaitForMultipleObjects()/etc.-able handles; you get them with pcap_getevent(). So, if you're using an event loop in a program using WinPcap, don't use select(). (Yes, this means you can't write an event loop that works exactly the same in Windows and UN*X. Welcome to the world of programming for Windows and UN*X - sometimes you just have to do things differently on the two platforms.)

  3. On most if not all current versions of various UN*Xes, pcap_get_selectable_fd() should return a selectable handle.

BPF device FDs, which are what you'd get from pcap_get_selectable_fd() (or pcap_fileno()), originally weren't 100% selectable, because, when you wait for a BPF device to be readable, you wait for one of two things to happen - the BPF device's buffer filling up or the timeout expiring - and the BPF driver would not start the timeout when the select() was done, so the select() would only indicate that the BPF device was readable when its buffer filled up, which could take indefinitely long.

That bug was fixed in FreeBSD 4.6, NetBSD 3.0, OpenBSD 2.4, and OS X 10.7, and may never have been present in DragonFly BSD (if it forked from FreeBSD in 4.6 or later). So, with Lion and later, OS X does have selectable pcap file handles.

I don't know whether, on the other UN*Xes using BPF (AIX and Solaris 11), the FD is selectable. (I don't have an AIX system on which I have privileges to test this, but I should try it on my Solaris 11 VM.)

I don't know whether Tru64 UNIX's packetfilter descriptors are selectable. The packetfilter(7) man page says that "Normally, a packet filter application blocks in the read system call until a received packet is available for reading. There are several ways to avoid blocking indefinitely: an application can use the select(2) system call, it can set a ``timeout'' for the packet filter file, or it can request the delivery of a signal (see sigvec(2)) when a packet matches the filter.", but I don't know whether that means that select() and a timeout work together.

On other UN*Xes, the FDs should always have been selectable, with select() honoring the pcap_t's timeout, if the platform supports a timeout at all (sockets on Linux/IRIX, STREAMS devices on Solaris prior to Solaris 11/HP-UX).

This means that my_pcap_get_selectable_fd() and pcap_selectable_fd_valid() should really be updated to check, at your choice of run time or compile time, which version of the BPF-using OS is being used, and base their return values on that. (If you do it at run time, you probably don't want to waste cycles by doing a uname() every time it's called; avoiding that is left as an exercise to the reader.)
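One way to do the run-time check without calling uname() every time is to cache the result on first use. A minimal sketch, assuming a single-threaded caller; `cached_uname()`, `running_on()`, and the `uname_calls` counter are hypothetical helpers added for illustration and are not part of Nmap or libpcap. Deciding which `release` strings count as "fixed" for each OS is left out.

```c
#include <assert.h>
#include <string.h>
#include <sys/utsname.h>

/* Counts real uname() calls; present only to show the cache works. */
static int uname_calls = 0;

/* Return the cached uname() result, calling uname() at most once.
 * Returns NULL if uname() failed. Not thread-safe as written. */
static const struct utsname *cached_uname(void)
{
    static struct utsname cached;
    static int state = 0;   /* 0 = not tried yet, 1 = ok, -1 = failed */

    if (state == 0) {
        uname_calls++;
        state = (uname(&cached) < 0) ? -1 : 1;
    }
    return state == 1 ? &cached : NULL;
}

/* Example predicate built on the cache: is this OS named sysname? */
static int running_on(const char *sysname)
{
    const struct utsname *u = cached_uname();
    assert(sysname != NULL);
    return u != NULL && strcmp(u->sysname, sysname) == 0;
}
```

A version check would compare `cached_uname()->release` against the first fixed release for the OS in question, for instance inside pcap_selectable_fd_valid().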

As for pcap_selectable_fd_one_to_one(), if you mean that it should return 1 if pcap_dispatch() will never read more than one packet into its buffer and 0 if it could possibly read more than one packet into its buffer, then it should return 0 on:

  • all systems using BPF, with the possible exception of AIX (i.e., *BSD, OS X, and Solaris 11);
  • Windows;
  • Solaris prior to Solaris 11;
  • Linux, if the kernel and libpcap support memory-mapped capture;
  • Tru64 UNIX, as far as I know;

because all of those systems do buffering. (Well, on Linux with memory-mapped capture, it's not reading packets into the buffer - the buffer is a ring shared between the kernel and userland - but, as far as I know, it's possible that if you do a select(), get a "readable" notification, call pcap_next() or pcap_next_ex() (which calls pcap_dispatch() with a count of 1), and then do another select(), the select() could block even though there are more packets that would be picked up by pcap_next()/pcap_next_ex() without blocking, because there are no packets to pick up from the kernel, even though there are packets that have already been picked up but not processed yet.)

  4. The primary purpose of the timeout in pcap is not to prevent waiting forever if no traffic arrives, it's to prevent waiting forever for a bufferful of packets to arrive. Code should not expect a pcap_next() or pcap_next_ex() or pcap_dispatch() call to be guaranteed to return after the timeout expires - it won't necessarily do so. It does happen to do so on systems using BPF, and on Windows with WinPcap. I infer from the quote above from the Tru64 UNIX packetfilter(7) man page that it does so on Tru64 UNIX as well. It is not guaranteed to do so on Linux, and I don't think it does so on Solaris 10 and earlier (I think the bufmod STREAMS module starts the timer when the first packet arrives, not when a getmsg() is done).

So, if the goal is to wait no more than X seconds (where X could be a fraction) for a packet, in a loop that doesn't have to multiplex operations on other descriptors (which appears to be the case, from a quick look at readip_pcap() in tcpip.cc), on systems using BPF, on Windows, and on Tru64 UNIX, set the timeout to X*1000 milliseconds and just do pcap_next() in the loop (as, on those systems, the timer starts when the read is done, not when the first packet is seen) and, on other systems:

  1. put the pcap_t into non-blocking mode;
  2. do a pcap_next();
  3. if that returns NULL, assume it means "no packets available now" (or use pcap_next_ex(), which can actually distinguish between errors and "no packets available now"), and do a select() with the timeout in question (as, on most if not all non-BPF/non-Windows systems, select() should work).
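The recommendation above boils down to a per-platform choice between two wait strategies, plus a seconds-to-milliseconds conversion for the pcap timeout. A rough sketch of that decision follows; the enum names and both functions are hypothetical and exist only to make the grouping explicit.

```c
#include <assert.h>

/* Hypothetical classification of capture mechanisms named in the thread. */
enum sketch_capture_kind {
    SKETCH_BPF,             /* *BSD, OS X, Solaris 11 */
    SKETCH_WINDOWS,         /* WinPcap / Npcap */
    SKETCH_TRU64,           /* Tru64 UNIX packetfilter */
    SKETCH_LINUX,           /* PF_PACKET sockets */
    SKETCH_SOLARIS_BUFMOD,  /* Solaris 10 and earlier */
    SKETCH_OTHER
};

enum sketch_wait_strategy {
    SKETCH_BLOCKING_READ_WITH_TIMEOUT,  /* set timeout, loop on pcap_next() */
    SKETCH_NONBLOCKING_PLUS_SELECT      /* non-blocking read, then select() */
};

/* Where the read timer starts when the read is issued, a blocking read
 * with a timeout is enough; elsewhere fall back to select(). */
static enum sketch_wait_strategy choose_wait_strategy(enum sketch_capture_kind kind)
{
    switch (kind) {
    case SKETCH_BPF:
    case SKETCH_WINDOWS:
    case SKETCH_TRU64:
        return SKETCH_BLOCKING_READ_WITH_TIMEOUT;
    default:
        return SKETCH_NONBLOCKING_PLUS_SELECT;
    }
}

/* Convert a wait of X seconds (possibly fractional) to milliseconds,
 * rounding to the nearest millisecond. */
static int timeout_ms_for(double seconds)
{
    assert(seconds >= 0.0);
    return (int)(seconds * 1000.0 + 0.5);
}
```

For example, a 0.25 second wait on a BPF system would mean a 250 ms pcap timeout and a plain pcap_next() loop.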

guyharris commented Aug 11, 2016 edited

But that all raises the question "what is it you're trying to do here?" What, at a high level (above all of the implementation details of pcap on different platforms) is readip_pcap() doing? What, at a high level, are the requirements for it?

There may be a different, and better, way of accomplishing this.

dmiller-nmap commented Aug 19, 2016 edited

@guyharris Thank you so much for your analysis. I think that what I would like to do is accomplish this in 2 stages: First, a stop-the-bleeding fix that gets us functional on both WinPcap and Npcap without the use of pcap-int.h, and secondly a re-evaluation of the purpose and design of readip_pcap and the functions that call it in order to see if there is a better way to do things (i.e. #123)

To accomplish the first goal, I think it will be possible to take this approach:

  1. Remove the calls to PacketSetReadTimeout,
  2. #ifdef WIN32, use pcap_getevent and WaitForSingleObject to replace the pcap_select calls we use on all other platforms.

I'd appreciate your thoughts on that approach. As far as the intent of readip_pcap, I do believe it could be better served with a callback-style approach, since we want to:

  • process any packets received until some absolute time, at which time we want to pause and send packets.
  • take advantage of buffering to not drop any packets during the time when we are sending our own.
  • process the packets in as close to real time as possible.

I think, for example, that pcap_dispatch could be a good way to handle this, but I'd need more time to grok immediate mode and the various types of timeouts to know whether we'd need to keep time ourselves and use pcap_breakloop to stop at the appointed time.

nmap-bot closed this in f4c5c83 Aug 25, 2016

guyharris commented Aug 27, 2016 edited

we want to:

  • process any packets received until some absolute time, at which time we want to pause and send packets.

The timeout mechanisms in various capture mechanisms don't handle waiting until some absolute time; they wait for some relative time (mostly "relative to when a read is done", although on at least some versions of Solaris it's "relative to when the first packet arrives").

So you'd have to use some other mechanism for that.

If your goal is to ensure that, when that absolute time arrives, the main loop will not be blocked waiting for packets, the best way to do that, on platforms where either select()/poll() or WaitForSingleObject() can be used on the capture device, would, I suspect, be:

put the `pcap_t` in non-blocking mode;

for (;;) {
    wait for packets to arrive, using `select()`/`poll()` or `WaitForSingleObject()`,
         with a timeout that expires when the absolute time in question arrives;

    if packets have arrived, process them with `pcap_dispatch()`, which will
         process only the packets that can be immediately read;

    if the absolute time has arrived, send packets;
}

The "process input packets" and "if the absolute time has arrived, send packets" can be swapped here if that's what's appropriate.
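On UN*X, the "wait with a timeout that expires when the absolute time arrives" step of the loop above could look roughly like this. It is a sketch under assumptions: `timeval_until()` and `wait_readable_until()` are hypothetical helper names, the descriptor would in practice come from pcap_get_selectable_fd(), and gettimeofday() is wall-clock time, so a clock adjustment during the wait would shift the deadline.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Time remaining from now until deadline, clamped to zero.
 * Assumes both tv_usec fields are in the range [0, 1000000). */
static struct timeval timeval_until(struct timeval deadline, struct timeval now)
{
    struct timeval r;

    r.tv_sec = deadline.tv_sec - now.tv_sec;
    r.tv_usec = deadline.tv_usec - now.tv_usec;
    if (r.tv_usec < 0) {            /* borrow one second */
        r.tv_sec -= 1;
        r.tv_usec += 1000000;
    }
    if (r.tv_sec < 0) {             /* deadline already passed */
        r.tv_sec = 0;
        r.tv_usec = 0;
    }
    return r;
}

/* Wait until fd is readable or the absolute deadline arrives.
 * Returns 1 if readable, 0 if the deadline passed first, -1 on error
 * (including EINTR, which a caller may choose to retry). */
static int wait_readable_until(int fd, struct timeval deadline)
{
    struct timeval now, remaining;
    fd_set readfds;
    int rc;

    assert(fd >= 0 && fd < FD_SETSIZE);
    if (gettimeofday(&now, NULL) < 0)
        return -1;
    remaining = timeval_until(deadline, now);

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    rc = select(fd + 1, &readfds, NULL, NULL, &remaining);
    if (rc < 0)
        return -1;
    return rc > 0 ? 1 : 0;
}
```

In the main loop, a return of 1 would be followed by pcap_dispatch() on the non-blocking pcap_t, and a return of 0 by the send phase. On Windows the same role would be played by WaitForSingleObject() on the handle from pcap_getevent(), with the remaining time converted to milliseconds.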

  • take advantage of buffering to not drop any packets during the time when we are sending our own.

The capture mechanisms should all buffer packets that arrive while a read isn't pending, so that should be OK.

  • process the packets in as close to real time as possible.

If you mean "see incoming packets as soon as possible after they arrive", the best way to do that is to use immediate mode.
