Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Npcap 0.9990: Disrupts loopback connectivity/consumes whole core on WS2016 #26

Closed
akontsevoy opened this issue May 12, 2020 · 14 comments
Closed
Labels

Comments

@akontsevoy
Copy link

@akontsevoy akontsevoy commented May 12, 2020

Greetings,

We have reports from multiple users that Npcap version 0.9990 intermittently disrupts connectivity on the loopback interface (causing connections made on 127.0.0.1 to time out). The issue seems isolated to Windows Server 2016, with isolated reports for WS2019. When it happens, the System process is seen to consume a whole CPU. The problem is resolved when Npcap is downgraded to 0.993 (the version we bundled previously). Has this been a known issue, possibly addressed in 0.9991? The similar issues I'm finding are either Windows 10 or Windows 7.

P.S. This happens despite our software not capturing anything from NPF_Loopback (not even opening a pcap handle on that adapter).

@guyharris
Copy link

@guyharris guyharris commented May 13, 2020

What happens with Npcap 0.9991? They made some changes in 0.9991 that might address this.

@dmiller-nmap
Copy link
Contributor

@dmiller-nmap dmiller-nmap commented May 14, 2020

Thanks for the report. We will look into this; I made a fix (5a90689) to an issue nobody reported related to loopback traffic filtering, but I'm not sure if it's related.

@akontsevoy
Copy link
Author

@akontsevoy akontsevoy commented May 19, 2020

@guyharris Unfortunately, Npcap 0.9991 is affected as well.

@dmiller-nmap Not sure if it's the same issue, but on our end, we ran this problem down to open handle and memory leaks in pcap_findalldevs(). Our software calls this periodically (approx. every 30 seconds) to discover new adapters to capture traffic from. But each time this is called, it leaves behind a few open handles and blocks of memory, even after pcap_freealldevs() is called. Over time, the amount of open handles gets into 100000s range, severely impacting system performance.

On Windows Server 2016, this issue consistently reproduces with Npcap 0.9990 and 0.9991, but not 0.993. But on Windows 10 1909, it reproduces with 0.993 as well -- and even with WinPcap 4.1.3. As the number of open handles accumulates, network performance degrades significantly with 0.999x, but not with other versions.

As we need to support WinPcap as well as Npcap, we are currently developing a workaround that avoids the usage of pcap_findalldevs() altogether in favor of the usual GetAdaptersAddresses() -- relying on the fact that the adapter UUIDs returned by pcap_findalldevs() and GetAdaptersAddresses() are the same. It is understood that not all devices returned by GetAdaptersAddresses() can be opened with pcap_create("\Device\NPF_<uuid>") / pcap_activate(), but given that we are only interested in enabled non-loopback devices that have an IP address, I think that's reasonably safe (objections?). As an added bonus, GetAdaptersAddresses() returns a lot more information about adapters (including proper device descriptions) and permits a more fine-grained control over which devices we want to include or exclude. It also appears to be a lot faster, and (unlike pcap_findalldevs()) does not seem to affect the performance of the pcap handles that are currently open and capturing, and is not subject to old bugs in WinPcap where IP addresses and loopback flags are returned inconsistently.

@guyharris
Copy link

@guyharris guyharris commented May 20, 2020

As an added bonus, GetAdaptersAddresses() returns a lot more information about adapters (including proper device descriptions) and permits a more fine-grained control over which devices we want to include or exclude. It also appears to be a lot faster, and (unlike pcap_findalldevs()) does not seem to affect the performance of the pcap handles that are currently open and capturing, and is not subject to old bugs in WinPcap where IP addresses and loopback flags are returned inconsistently.

pcap_findalldevs() doesn't directly fetch the interface list; it just calls pcap_platform_finddevs() to find all the regular interfaces (as well as calling the "find devices" routines for various pcap "modules", if configured in), and the NPF version (in pcap-npf.c) calls PacketGetAdapterNames() twice - once to get the size of the adapter list so that it can allocate a buffer for the list, and once to read the list into the buffer.

PacketGetAdapterNames() calls PacketPopulateAdaptersInfoList(). PacketPopulateAdaptersInfoList() always empties out the current list and then calls, in order:

  • PacketGetAdaptersNPF();
  • PacketGetAdaptersIPH();
  • PacketAddAdapterNPF() to add the loopback adapter, if loopback support is configured in.;
  • PacketGetAdaptersAirpcap(), if the AirPcap DLL is loaded;
  • PacketGetAdaptersDag(), if the Endace DAG card support is available;

to fill in the list.

PacketGetAdaptersNPF() digs through the Registry to find adapters; for each adapter it finds, it calls PacketAddAdapterNPF().

PacketAddAdapterNPF() first iterates over the list to see if it's already been added, and just returns if it has. Otherwise, it attempts to open the adapter, using PacketOpenAdapterNPF() and, if that succeeds, adds it. (I've omitted some details such as the INFO_FLAG_DONT_EXPORT flag.)

PacketAddAdapterNPF() tries to find the IPv4 and IPv6 addresses for the adapter.

To find the IPv4 addresses, it calls PacketGetAddressesFromRegistry(). PacketGetAddressesFromRegistry() calls IsIPv4Enabled(), which calls...

...GetAdaptersAddresses().

To get all the IPv6 addresses for the adapter, it calls PacketAddIP6Addresses().

PacketAddIP6Addresses() gets the adapter's IPv6 addresses by...

...calling GetAdaptersAddresses(), looking for the adapter in the list, and getting the IPv6 addresses from that list.

So we end up calling GetAdaptersAddresses() in this process. We end up calling it twice per interface in this process.

Oh, and then pcap_platform_finddevs()() calls packet_add_if_npf() per interface; that then calls PacketGetNetInfoEx(), which again calls PacketGetAddressesFromRegistry() and PacketAddIP6Addresses(), so there are even more per-adapter GetAdaptersAddresses() calls.

This code needs some work. I have a version of libpcap's pcap-npf.c where pcap_platform_finddevs() calls GetAdaptersAddresses() directly and iterates once through the list it provides. Unfortunately, it currently has to call PacketOpenAdapter() to check whether the interface is supported by Npcap, and that ends up calling PacketFindAdInfo(), which ends up calling PacketPopulateAdaptersInfoList() if the list is empty. That's used to determine whether the device is a regular NPF device, a DAG device, or an AirPcap device.

The current master branch of libpcap includes direct support for AirPcap devices (tested on a Windows VM; it works). If we were to remove the AirPcap support from Npcap, and leave that up to libpcap, that would leave only the DAG devices. Endace currently doesn't support Windows, only Linux and FreeBSD, and libpcap already has DAG support, so that might be another case to leave up to libpcap; were Endace to support Windows again, they'll probably either contribute whatever changes are needed to pcap-dag.c or I'd nag them into doing so. :-)

So the first question is whether, were packet.dll not to provide PacketGetAdapterNames() at all, the list of adapters would be necessary.

If not, it could be removed, and PacketOpenAdapter() could just try opening the device through the driver, with the new pcap-npf.c used; that would probably be significantly faster (the profiling I did with a test program that repeatedly called pcap_findalldevs() found a lot of the time in pcap_findalldevs() is spent in repeated calls to GetAdaptersAddresses()`).

If it is necessary, perhaps it could just be a list of names, not full interface information.

Alternatively, a PacketGetAdapterInfo() routine could be provided, which just hands back a buffer full of IP_ADAPTER_ADDRESSES structures fetched from a single GetAdaptersAddresses() call. It could allocate a buffer large enough for all the adapters found by GetAdaptersAddresses() plus an extra entry, added at the beginning, for the loopback device, if present. In addition, it could adjust the "next interface" pointers so that the list in that buffer doesn't include interfaces not supported by Npcap. It would return both a pointer to the raw buffer, for the caller to hand to a packet.dll routine to free, and a pointer to the first interface supported by Npcap. That would let the caller decide what information it wants; some of the information in the IP_ADAPTER_ADDRESSES structure wouldn't be useful to pcap_findalldevs(), but might be useful to future libpcap APIs that would return more information (e.g., many of the options in a pcapng Interface Description Block).

@akontsevoy
Copy link
Author

@akontsevoy akontsevoy commented May 20, 2020

...oh, and PacketGetAdaptersIPH() also calls GetAdaptersInfo(), which is probably a wrapper for GetAdaptersAddresses() since Windows XP. :D

@guyharris
Copy link

@guyharris guyharris commented May 20, 2020

...oh, and PacketGetAdaptersIPH() also calls GetAdaptersInfo(), which is probably a wrapper for GetAdaptersAddresses() since Windows XP. :D

I'm not sure we need PacketGetAdaptersIPH() and PacketGetAdaptersNPF(), unless there are Npcap-usable adapters that PacketGetAdaptersIPH() would find but PacketGetAdaptersNPF() wouldn't find and Npcap-usable adapters that PacketGetAdaptersNPF() would find but PacketGetAdaptersIPH() would find.

@dmiller-nmap
Copy link
Contributor

@dmiller-nmap dmiller-nmap commented May 20, 2020

As a temporary workaround, you can disable all Loopback capture functions by setting the Registry DWORD value HKLM\SYSTEM\CurrentControlSet\Services\npcap\Parameters\LoopbackSupport to 0. This will take effect when the driver is restarted, but will be overwritten at installation time. This does not correspond to the /loopback_support=no installer command-line option, which instead disables the "Legacy loopback support" option, preventing the installation of the Npcap Loopback Adapter.

@fyodor fyodor transferred this issue from nmap/nmap May 20, 2020
@fyodor fyodor added the bug label May 20, 2020
@akontsevoy
Copy link
Author

@akontsevoy akontsevoy commented May 21, 2020

@dmiller-nmap This happens to remediate the leak on WS2016 where it was originally reported, but on W10 1909 the leak still reproduces as of 0.9991 (and even 0.993 which did not have loopback capture functionality in that form). So this must not be the only leak in that mess of a function...

@guyharris
Copy link

@guyharris guyharris commented May 21, 2020

I've filed #169 for the interface-enumeration changes I mentioned.

@fyodor
Copy link
Member

@fyodor fyodor commented May 26, 2020

Dan (@dmiller-nmap) had some great ideas on this when we chatted today:

I've tried in the past to make the loopback initialization functions conditional on actually needing them, but it didn't work exactly right, so I ended up backing those changes off a bit. The changes I've made recently are the latest attempt. We really want to avoid registering all our WFP resources until they're really needed, because some people have problems if that's even touched. But for some reason, it seems the registration doesn't work. I've got a debugger paused at the moment about to find out if the very first registration works and subsequent ones fail, or what. But #169 may allow us to back out these changes and still keep most of the benefit. Because it wouldn't matter very much if everything gets initialized right away when you open an adapter except that each adapter gets opened multiple times in a single pcap_findalldevs() call. So what I'm envisioning based on Guy's suggestions is this:

  1. we go through Packet.dll and reduce the adapter discovery routines for NPF and IPH (IP Helper API) to a single routine, keeping track of what information is discovered there.
  2. Replace the GetAdaptersAddresses() and Registry stuff with a IoControl that the driver can answer with the information it already has on hand.
  3. Implement a way to handle a CreateFile (Open) call for a special name like \Devices\NPF_Info that can handle this new IoControl but not any of the other capture stuff.

@guyharris
Copy link

@guyharris guyharris commented May 26, 2020

  1. Replace the GetAdaptersAddresses() and Registry stuff with a IoControl that the driver can answer with the information it already has on hand.

So the per-interface information that libpcap currently uses or may use in the future is:

  • the name to use in pcap_open_live()/pcap_open()/pcap_create();
  • the "friendly name" (that's what Microsoft calls it);
  • the "description" (hardware description);
  • the MAC address;
  • the transmit and receive link speeds;
  • all the IPv4 and IPv6 addresses, netmasks, and broadcast addresses assigned to the adapter;
  • something that can be mapped to a value like the UN*X "UP" and "RUNNING" flags - the libpcap master branch's pcap-npf.c maps the result of OID_GEN_HARDWARE_STATUS to those flags;
  • the time stamp types supported by the adapter (currently the same for all adapters, and dependent on whether the kernel provides high-precision values of the system clock, but if Microsoft ever supports getting packet time stamps from adapters that support it, that should also be provided);
  • the NdisMediumXXX value for the adapter;
  • if the adapter can support monitor mode, the NdisMediumXXX value that would be used in monitor mode, otherwise a "no monitor mode" indication.

If there were both an ioctl - and corresponding packet.dll routine, unless you decide to directly expose the BIOC ioctls and a routine to allow direct ioctl calls - to get the information for all adapters and a routine to get the information only for the currently open adapter, that would be useful (either now or in the future).

@fyodor
Copy link
Member

@fyodor fyodor commented Jun 4, 2020

From chat with @dmiller-nmap today: "We partially addressed this in Npcap 0.9992 by gating off the loopback functionality, but there's still a chance that if someone explicitly started a loopback capture there it would have the same result. It's possible even a MODE_STAT capture (just packet statistics) could trigger it, like Wireshark uses for their sparkline graphs. We just really don't know what the underlying cause is. Who knows, was a weird interaction between the WFP callback stuff and the worker thread queue that only existed in 0.9991."

So we're leaving this open for now and watching to see whether we get further reports that we can investigate.

@guyharris
Copy link

@guyharris guyharris commented Jun 4, 2020

From chat with @dmiller-nmap today: "We partially addressed this in Npcap 0.9992 by gating off the loopback functionality, but there's still a chance that if someone explicitly started a loopback capture there it would have the same result. It's possible even a MODE_STAT capture (just packet statistics) could trigger it, like Wireshark uses for their sparkline graphs.

Wireshark doesn't use MODE_STAT for the sparklines; it uses regular capture and, for each packet seen, just counts it, and the counts are used for the sparklines. (It has to work on UN*X, so it has to work on platforms without pcap_setmode(), so the code to do that had to be written anyway.)

@dmiller-nmap
Copy link
Contributor

@dmiller-nmap dmiller-nmap commented Jul 16, 2020

I wish I could have found this before Npcap 0.9995, but it'll be fixed in next release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants