Numeric adapter names can't be used for capture #522

Closed
nbastin opened this issue Jun 4, 2016 · 14 comments

Comments

@nbastin

nbastin commented Jun 4, 2016

$ tcpdump -i 3703570267332
tcpdump: Invalid adapter index

tcpdump should try to open the name before it decides it isn't there:

31231: 3703570267332: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master ovs-system state UP mode DEFAULT group default qlen 1000
    link/ether c2:22:16:68:ec:d4 brd ff:ff:ff:ff:ff:ff

Also, adapter indices aren't ifindex numbers (this is clear-ish from the man page, but not from the error message). They're numbers that tcpdump assigns on its own, and the only way to get them is to run tcpdump -D, which takes literally minutes on systems with lots of interfaces:

$ time sudo tcpdump -D
...
real    2m10.018s
user    0m0.048s
sys     0m4.680s
@infrastation
Member

The reason for the tcpdump -D interface number not being the OS ifindex is that libpcap supports other capture channels besides the common network interfaces:

$ ./tcpdump -D
1.enp0s3 [Up, Running]
2.any (Pseudo-device that captures on all interfaces) [Up, Running]
3.lo [Up, Running, Loopback]
4.nflog (Linux netfilter log (NFLOG) interface)
5.nfqueue (Linux netfilter queue (NFQUEUE) interface)
6.usbmon1 (USB bus number 1)

In the list above only numbers 1 and 3 have a corresponding ifindex.
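
To make the distinction concrete, here is a minimal Python sketch that pairs the `tcpdump -D` ordinals from the listing above with kernel ifindex values. The ifindex table is an assumption made up for illustration (the listing does not show ifindex values), and `ordinal_to_ifindex` is a hypothetical helper, not part of tcpdump or libpcap.

```python
# Ordinals and names as printed by the "tcpdump -D" listing above.
TCPDUMP_D = [
    (1, "enp0s3"),
    (2, "any"),
    (3, "lo"),
    (4, "nflog"),
    (5, "nfqueue"),
    (6, "usbmon1"),
]

# ASSUMED kernel ifindex values, for illustration only.
ASSUMED_IFINDEX = {"lo": 1, "enp0s3": 2}


def ordinal_to_ifindex(listing, ifindex_by_name):
    """Map each -D ordinal to (name, ifindex), with None when there is no ifindex."""
    return {
        ordinal: (name, ifindex_by_name.get(name))
        for ordinal, name in listing
    }


mapping = ordinal_to_ifindex(TCPDUMP_D, ASSUMED_IFINDEX)
with_ifindex = sorted(o for o, (_, idx) in mapping.items() if idx is not None)
print(with_ifindex)  # [1, 3]
```

Even where an ifindex exists, the ordinal and the ifindex need not agree: with the assumed values, ordinal 3 (`lo`) has ifindex 1. On a real Linux system the name-to-ifindex pairs could be read with Python's `socket.if_nameindex()` instead of the hard-coded table.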

That said, when a network interface name is a number, the existing convention does not work. As the comment in tcpdump.c around line 930 goes:

            /*
             * If the argument is a number, treat it as
             * an index into the list of adapters, as
             * printed by "tcpdump -D".
             *
             * This should be OK on UNIX systems, as interfaces
             * shouldn't have names that begin with digits.
             * It can be useful on Windows, where more than
             * one interface can have the same name.
             */

In this case either the existing convention should be reviewed or the system in question should use a more traditional network interface naming pattern.

@infrastation
Member

How many network interfaces does it take to slow tcpdump -D down?

@guyharris
Member

The number is not, and has never been intended to be, an ifindex. It is, and has always been intended to be, an ordinal number in the tcpdump -D list of interfaces.

On systems where interfaces have short simple names, the number convention isn't all that useful; however, tcpdump also runs on systems where interfaces have ugly names with UUIDs in them, namely all NT flavors of Windows starting with Windows 2000.

So, yes, tcpdump should try to open the name before it decides it's an ordinal number in the interface list.
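
A minimal Python sketch of that "name first, ordinal second" order follows. It is a simplification under stated assumptions: `resolve_interface` is a hypothetical helper, it checks membership in a list of names rather than actually trying to open the device, and the "No such device" message is illustrative. Only the "Invalid adapter index" text is taken from the report above.

```python
def resolve_interface(arg, device_names):
    """Return the device name that "-i arg" should open.

    device_names is the list in "tcpdump -D" order, so ordinal 1 is index 0.
    """
    # 1. An exact name match always wins, even if the name is all digits.
    if arg in device_names:
        return arg
    # 2. Only when no such interface exists is arg treated as a 1-based ordinal.
    if arg.isascii() and arg.isdigit():
        ordinal = int(arg)
        if 1 <= ordinal <= len(device_names):
            return device_names[ordinal - 1]
        raise ValueError("Invalid adapter index")
    # 3. Neither a known name nor a number (hypothetical error text).
    raise ValueError(f"No such device: {arg}")


devices = ["eth0", "any", "lo", "3703570267332"]
print(resolve_interface("3703570267332", devices))  # the numeric name itself
print(resolve_interface("3", devices))              # third entry in the list
```

With this ordering a purely numeric name shadows the ordinal of the same value: for `["2", "eth0"]`, the argument `"2"` resolves to the interface named `2`, not to the second entry.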

@guyharris
Member

$ time sudo tcpdump -D
...
real    2m10.018s
user    0m0.048s
sys     0m4.680s

OK, that's a little over 4.7 seconds of CPU time - mostly spent fetching interface information from the kernel, given that almost all of that is system time.

Most of that 2 minutes 10 seconds is presumably spent printing the list of interfaces, as 130 >> 4.7, so the fact that it takes minutes isn't the result of tcpdump or libpcap being slow, it's the result of the list being long, and therefore there's not much tcpdump can do about making it faster.
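
The arithmetic behind that comparison, as a small Python sketch. It only splits the reported wall-clock time into CPU time and everything else; it does not show where the remaining time goes. `parse_time` is a hypothetical helper for the `XmY.YYYs` format printed by `time`.

```python
def parse_time(value):
    """Convert a `time`-style value such as '2m10.018s' to seconds."""
    minutes, seconds = value.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


# Figures copied from the `time sudo tcpdump -D` output quoted above.
real = parse_time("2m10.018s")
user = parse_time("0m0.048s")
sys_ = parse_time("0m4.680s")

cpu = user + sys_
off_cpu = real - cpu
print(f"CPU time: {cpu:.3f} s")
print(f"Off-CPU:  {off_cpu:.3f} s ({100 * off_cpu / real:.1f}% of wall clock)")
```

About 125.3 of the 130 seconds, roughly 96% of the wall-clock time, is not CPU time charged to the process.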

@nbastin
Author

nbastin commented Jun 5, 2016

The time spent printing the interfaces is not that long. You can try this even on a system with only 100 interfaces or so: the wall-clock time is much longer than the time spent printing the interfaces, and the lag between running the command and any output showing up is quite a while. Much of the time seems to be spent in some never ending sequence of socket operations that look a little odd (although this isn't my jam, so maybe this looks ok):

...
socket(PF_PACKET, SOCK_RAW, 768)        = 3
ioctl(3, SIOCGIFINDEX, {ifr_name="lo", ifr_index=1}) = 0
ioctl(3, SIOCGIFHWADDR, {ifr_name="995970708465", ifr_hwaddr=4a:f0:05:46:e0:e7}) = 0
ioctl(3, SIOCGIFINDEX, {ifr_name="995970708465", ifr_index=31143}) = 0
bind(3, {sa_family=AF_PACKET, proto=0x03, if31143, pkttype=PACKET_HOST, addr(0)={0, }, 20) = 0
getsockopt(3, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
setsockopt(3, SOL_PACKET, PACKET_AUXDATA, [1], 4) = 0
getsockopt(3, SOL_PACKET, PACKET_HDRLEN, [36], [4]) = 0
setsockopt(3, SOL_PACKET, PACKET_VERSION, [2], 4) = 0
setsockopt(3, SOL_PACKET, PACKET_RESERVE, [4], 4) = 0
setsockopt(3, SOL_PACKET, PACKET_RX_RING, "\0\0\2\0\20\0\0\0\0\0\2\0\20\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 28) = 0
mmap(NULL, 2097152, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = 0x7f508a960000
setsockopt(3, SOL_PACKET, PACKET_RX_RING, {block_size=0, block_nr=0, frame_size=0, frame_nr=0}, 16) = -1 EINVAL (Invalid argument)
munmap(0x7f508a960000, 2097152)         = 0
close(3)                                = 0
socket(PF_PACKET, SOCK_RAW, 768)        = 3
ioctl(3, SIOCGIFINDEX, {ifr_name="lo", ifr_index=1}) = 0
ioctl(3, SIOCGIFHWADDR, {ifr_name="118947056582", ifr_hwaddr=0a:a8:d9:86:1d:0c}) = 0
ioctl(3, SIOCGIFINDEX, {ifr_name="118947056582", ifr_index=31144}) = 0
bind(3, {sa_family=AF_PACKET, proto=0x03, if31144, pkttype=PACKET_HOST, addr(0)={0, }, 20) = 0
getsockopt(3, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
setsockopt(3, SOL_PACKET, PACKET_AUXDATA, [1], 4) = 0
getsockopt(3, SOL_PACKET, PACKET_HDRLEN, [36], [4]) = 0
setsockopt(3, SOL_PACKET, PACKET_VERSION, [2], 4) = 0
setsockopt(3, SOL_PACKET, PACKET_RESERVE, [4], 4) = 0
setsockopt(3, SOL_PACKET, PACKET_RX_RING, "\0\0\2\0\20\0\0\0\0\0\2\0\20\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 28) = 0
mmap(NULL, 2097152, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = 0x7f508a960000
setsockopt(3, SOL_PACKET, PACKET_RX_RING, {block_size=0, block_nr=0, frame_size=0, frame_nr=0}, 16) = -1 EINVAL (Invalid argument)
munmap(0x7f508a960000, 2097152)         = 0
close(3)                                = 0
socket(PF_PACKET, SOCK_RAW, 768)        = 3
ioctl(3, SIOCGIFINDEX, {ifr_name="lo", ifr_index=1}) = 0
ioctl(3, SIOCGIFHWADDR, {ifr_name="8666457062610", ifr_hwaddr=62:e6:2e:ab:54:39}) = 0
ioctl(3, SIOCGIFINDEX, {ifr_name="8666457062610", ifr_index=31913}) = 0
bind(3, {sa_family=AF_PACKET, proto=0x03, if31913, pkttype=PACKET_HOST, addr(0)={0, }, 20) = 0
getsockopt(3, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
setsockopt(3, SOL_PACKET, PACKET_AUXDATA, [1], 4) = 0
getsockopt(3, SOL_PACKET, PACKET_HDRLEN, [36], [4]) = 0
setsockopt(3, SOL_PACKET, PACKET_VERSION, [2], 4) = 0
setsockopt(3, SOL_PACKET, PACKET_RESERVE, [4], 4) = 0
setsockopt(3, SOL_PACKET, PACKET_RX_RING, "\0\0\2\0\20\0\0\0\0\0\2\0\20\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 28) = 0
mmap(NULL, 2097152, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = 0x7f508a960000
setsockopt(3, SOL_PACKET, PACKET_RX_RING, {block_size=0, block_nr=0, frame_size=0, frame_nr=0}, 16) = -1 EINVAL (Invalid argument)
munmap(0x7f508a960000, 2097152)         = 0
close(3)                                = 0
...

@nbastin
Author

nbastin commented Jun 5, 2016

How many network interfaces does it take to slow tcpdump -D down?

288 interfaces (not counting "any") seems to take ~10 seconds on a 24-core system:

$ time sudo tcpdump -D > foo.txt

real    0m9.995s
user    0m0.053s
sys     0m0.344s
$ wc -l foo.txt
289 foo.txt

2500 interfaces take a few minutes.

@nbastin
Author

nbastin commented Jun 5, 2016

Some strace summary output for reference:

$ time sudo strace -c tcpdump -D > foo.txt
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 40.90    0.165999         115      1445       289 setsockopt
 17.91    0.072697         250       291           munmap
 14.21    0.057693         188       307           mmap
  7.05    0.028623          95       302           close
  6.08    0.024675         160       154           recvmsg
  3.85    0.015623          12      1305           ioctl
  3.47    0.014073          49       289           bind
  2.83    0.011484          26       439           lstat
  1.82    0.007404          13       577           getsockopt
  1.46    0.005923          20       295           socket
  0.11    0.000447         112         4           getdents
  0.11    0.000428          16        26        21 open
  0.08    0.000324          81         4           sendto
  0.03    0.000118          12        10           mprotect
  0.02    0.000092          46         2           write
  0.02    0.000072          36         2           openat
  0.01    0.000056           9         6           brk
  0.01    0.000050           8         6           fstat
  0.01    0.000044           7         6         6 access
  0.01    0.000027          14         2           recvfrom
  0.01    0.000024           6         4           read
  0.01    0.000021          21         1           execve
  0.00    0.000009           9         1           getsockname
  0.00    0.000008           8         1           arch_prctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.405914                  5479       316 total

real    0m9.788s
user    0m0.074s
sys     0m0.810s

Something in there waits for some period of time that isn't charged to sys, and it isn't write that's driving up the real time. It seems to scale with the number of interfaces, so my guess is setsockopt, but that's a pretty uninformed guess.
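
A rough Python sketch of how the summary scales per interface. It assumes that the 289 `bind` calls correspond to one probe per listed interface, which matches the 289 lines of foo.txt reported earlier; the call counts and timings are copied from the output above.

```python
INTERFACES = 289  # bind calls, assumed to be one per listed interface

# (syscall, calls, errors) rows copied from the strace -c summary above.
SUMMARY = [
    ("setsockopt", 1445, 289),
    ("munmap", 291, 0),
    ("mmap", 307, 0),
    ("close", 302, 0),
    ("ioctl", 1305, 0),
    ("bind", 289, 0),
    ("getsockopt", 577, 0),
    ("socket", 295, 0),
]

per_interface = {name: calls / INTERFACES for name, calls, _ in SUMMARY}
for name, ratio in per_interface.items():
    print(f"{name:<11}{ratio:5.2f} calls per interface")

# Wall-clock time that `time` does not attribute to user or system CPU.
real, user, sys_ = 9.788, 0.074, 0.810
unaccounted = real - user - sys_
print(f"unaccounted: {unaccounted:.3f} s")
```

setsockopt comes out at exactly 5 calls per interface with 1 error each, which matches the five setsockopt calls (the last one failing with EINVAL) in each block of the trace posted earlier. About 8.9 of the 9.8 seconds is not charged to user or system time; the numbers alone don't show which call is waiting.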

@infrastation
Member

Could you confirm those are the latest master branch builds of tcpdump and libpcap and also tell the distribution and version of the OS? Thank you.

@nbastin
Author

nbastin commented Jun 5, 2016

They're definitely not:

$ tcpdump -h
tcpdump version 4.5.1
libpcap version 1.5.3

The userland is Ubuntu 14.04.4 server LTS.

@infrastation
Member

Could you please check using the latest versions?

@nbastin
Author

nbastin commented Jun 7, 2016

I certainly can't change the libpcap version on production systems - I'll see what I can do about building some static binaries for testing.

That being said, the code is the same now - as identified earlier in the thread. There's nothing that has changed that will change the fact that tcpdump can't capture on these interfaces without finding their "indices" according to tcpdump. I don't care how long it takes tcpdump to generate these index numbers - I care that I can't use the actual name.

@infrastation
Member

To build the latest binary you would need a git clone of libpcap and a git clone of tcpdump, next to each other so that the tcpdump configure script can find libpcap in ../libpcap. If you build libpcap first and then tcpdump, the tcpdump binary will have libpcap linked in, so you should be able to just copy it around. But your other point is clear.

@infrastation
Member

(If anybody feels up to working on this, please feel free to do so; I am mainly collecting the details when I have a spare minute.)

bijalthanawala pushed a commit to bijalthanawala/tcpdump that referenced this issue Jul 17, 2016
…d only on failure look up numeric indexed device
guyharris added a commit that referenced this issue Aug 9, 2016
Some UN*Xes apparently can have purely numeric interface names, so we
can't unconditionally treat purely numeric interface names as interface
indices.  Do so only for "names" that don't correspond to interfaces.

Fixes GitHub issue #522.
guyharris added a commit that referenced this issue Aug 9, 2016
If N is a number, and is the name of an interface on the system, "-i N"
will now attempt to open that interface, not the Nth interface in the
list of interfaces.  See GitHub issue #522.
ghost pushed a commit to wireshark/wireshark that referenced this issue Jun 2, 2020
…umber.

Our current assumption that a numeric argument to -i must be an index
interface is not valid on Linux; see

	the-tcpdump-group/tcpdump#522

Change-Id: Ieb6e17e6ceb23095a463336f0c88182373503aa6
Reviewed-on: https://code.wireshark.org/review/37369
Reviewed-by: Guy Harris <gharris@sonic.net>
@guyharris
Member

Much of the time seems to be spent in some never ending sequence of socket operations that look a little odd (although this isn't my jam, so maybe this looks ok):

...
socket(PF_PACKET, SOCK_RAW, 768)        = 3
ioctl(3, SIOCGIFINDEX, {ifr_name="lo", ifr_index=1}) = 0
ioctl(3, SIOCGIFHWADDR, {ifr_name="995970708465", ifr_hwaddr=4a:f0:05:46:e0:e7}) = 0
ioctl(3, SIOCGIFINDEX, {ifr_name="995970708465", ifr_index=31143}) = 0
bind(3, {sa_family=AF_PACKET, proto=0x03, if31143, pkttype=PACKET_HOST, addr(0)={0, }, 20) = 0
getsockopt(3, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
setsockopt(3, SOL_PACKET, PACKET_AUXDATA, [1], 4) = 0
getsockopt(3, SOL_PACKET, PACKET_HDRLEN, [36], [4]) = 0
setsockopt(3, SOL_PACKET, PACKET_VERSION, [2], 4) = 0
setsockopt(3, SOL_PACKET, PACKET_RESERVE, [4], 4) = 0
setsockopt(3, SOL_PACKET, PACKET_RX_RING, "\0\0\2\0\20\0\0\0\0\0\2\0\20\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 28) = 0
mmap(NULL, 2097152, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = 0x7f508a960000
setsockopt(3, SOL_PACKET, PACKET_RX_RING, {block_size=0, block_nr=0, frame_size=0, frame_nr=0}, 16) = -1 EINVAL (Invalid argument)
munmap(0x7f508a960000, 2097152)         = 0
close(3)                                = 0

At the time the bug was filed, libpcap tried to open all interfaces found by the call made to find interfaces, so that only interfaces that could be opened were added.

That caused users to see no interfaces when they didn't have permission to capture. They would then ask on the Wireshark Q&A site, the Wireshark mailing lists, etc. "why isn't Wireshark seeing any interfaces?", and Wireshark developers had to spend extra time figuring out that the problem was permissions (and, if the users were running on a Debian derivative, telling them to run the installer command that gave Wireshark's dumpcap program the capability needed to capture).

So I changed libpcap so that it used different mechanisms - if any - to determine whether an interface is supported by the capture mechanism. That test dates back to issues with the loopback device on Solaris 10 and earlier: it showed up as an interface, but the DLPI capture mechanism didn't support it, and "don't show loopback interfaces" was clearly the wrong answer, as both the BSDs and Linux do support capturing on it.

For Linux, that meant "no mechanism"; if it shows up as an interface, you should be able to bind it to a PF_PACKET socket.

So current versions of libpcap should do much less work getting the interface list - and, in a quick test on an Ubuntu virtual machine with the tip of the master branch, they do; that change was made a while ago, so libpcap 1.9.x, at least, should skip that stuff.
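
The difference between the two enumeration strategies can be sketched in Python as follows. This is an illustration only, not libpcap code: `list_by_probing`, `list_without_probing`, and the two `can_open` stand-ins are hypothetical names, and a real probe would open and bind a PF_PACKET socket rather than call a Python function.

```python
def list_by_probing(names, can_open):
    """Older behaviour described above: keep only interfaces that can be opened."""
    return [name for name in names if can_open(name)]


def list_without_probing(names, is_supported=lambda name: True):
    """Newer behaviour: no trial open. The default check keeps every
    interface, mirroring the "no mechanism" case described for Linux."""
    return [name for name in names if is_supported(name)]


names = ["lo", "995970708465", "118947056582"]


def unprivileged(name):
    return False  # simulated permission failure on every open


probe_log = []


def privileged(name):
    probe_log.append(name)  # one simulated open per interface
    return True


print(list_by_probing(names, unprivileged))  # empty list: no interfaces shown
print(list_by_probing(names, privileged))    # full list, after one open each
print(list_without_probing(names))           # full list, no opens at all
```

In the probing variant the number of trial opens grows with the number of interfaces, and a permission failure hides every interface; the non-probing variant avoids both.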
