Can't filter vlan and geneve packet #1279

Open
gujun4990 opened this issue Feb 20, 2024 · 17 comments

@gujun4990

gujun4990 commented Feb 20, 2024

We want to capture packets that carry both VLAN 1264 and Geneve, for example:

16:08:46.074231 16:c5:84:65:a4:41 > ba:b3:1d:11:c8:43, ethertype 802.1Q (0x8100), length 160: vlan 1264, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 37785, offset 0, flags [DF], proto UDP (17), length 142)
    46.168.20.3.19312 > 46.168.20.5.6081: [udp sum ok] Geneve, Flags [C], vni 0x13, proto TEB (0x6558), options [class Open Virtual Networking (OVN) (0x102) type 0x80(C) len 8 data 00030002]
	fa:16:3e:4a:1c:dc > 3e:a5:be:e6:2a:4c, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 63, id 39488, offset 0, flags [DF], proto ICMP (1), length 84)
    192.168.101.92 > 172.46.0.1: ICMP echo request, id 1531, seq 18463, length 64
16:08:46.074470 ba:b3:1d:11:c8:43 > 16:c5:84:65:a4:41, ethertype 802.1Q (0x8100), length 160: vlan 1264, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 26501, offset 0, flags [DF], proto UDP (17), length 142)
    46.168.20.5.30750 > 46.168.20.3.6081: [udp sum ok] Geneve, Flags [C], vni 0x12, proto TEB (0x6558), options [class Open Virtual Networking (OVN) (0x102) type 0x80(C) len 8 data 00020003]
	fa:16:3e:23:13:44 > fa:16:3e:06:b1:cd, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 63, id 28164, offset 0, flags [none], proto ICMP (1), length 84)
    172.46.0.1 > 192.168.101.92: ICMP echo reply, id 1531, seq 18463, length 64

We filter the packets using ovs-tcpdump -i Bond1 -nnevv "(vlan 1264) and geneve", but it fails. I checked the related documentation, and from its description I think the above command isn't supported. So I want to know whether there is a filter expression that supports this situation.

@infrastation
Member

   ovs-tcpdump creates switch mirror ports in the ovs-vswitchd
   daemon and executes tcpdump to listen against those ports. When
   the tcpdump instance exits, it then cleans up the mirror port it
   created.

Does ovs-tcpdump fail in the sense it prints an error message and fails to run, or in the sense it runs, but fails to capture the expected packets? What does tcpdump --version print? Is Bond1 the correct OVS interface?

@gujun4990
Author

gujun4990 commented Feb 20, 2024

The error is below:

[root@node-2 ~]# ovs-tcpdump -i Bond1 -nnevv "(vlan 1264) and geneve"
dropped privs to tcpdump
Warning: Kernel filter failed: Invalid argument
tcpdump: listening on miBond1, link-type EN10MB (Ethernet), capture size 262144 bytes

It seems that the filter isn't supported by BPF.
The tcpdump version is below:

[root@node-2 ~]# tcpdump --version
tcpdump version 4.9.3
libpcap version 1.9.1 (with TPACKET_V3)
OpenSSL 1.1.1g FIPS  21 Apr 2020

In addition, I ran the command ovs-tcpdump -i Bond1 -Od "(vlan 1264) and geneve":

[root@node-2 ~]# ovs-tcpdump -i Bond1 -Od "(vlan 1264) and geneve"
(000) ld       #0x0
(001) st       M[4]
(002) st       M[2]
(003) ldb      [-4048]
(004) jeq      #0x1             jt 17	jf 5
(005) ld       M[0]
(006) add      #4
(007) st       M[0]
(008) ld       M[1]
(009) add      #4
(010) st       M[1]
(011) ldh      [12]
(012) jeq      #0x8100          jt 17	jf 13
(013) ldh      [12]
(014) jeq      #0x88a8          jt 17	jf 15
(015) ldh      [12]
(016) jeq      #0x9100          jt 17	jf 98
(017) ldb      [-4048]
(018) jeq      #0x1             jt 19	jf 21
(019) ldb      [-4052]
(020) ja       22
(021) ldh      [14]
(022) and      #0xfff
(023) jeq      #0x4f0           jt 24	jf 98
(024) ldx      M[1]
(025) ldh      [x + 12]
(026) jeq      #0x800           jt 27	jf 58
(027) ldx      M[0]
(028) ldb      [x + 23]
(029) jeq      #0x11            jt 30	jf 58
(030) ldx      M[0]
(031) ldh      [x + 20]
(032) jset     #0x1fff          jt 58	jf 33
(033) ldx      M[0]
(034) ldb      [x + 14]
(035) and      #0xf
(036) lsh      #2
(037) add      x
(038) tax      
(039) ldh      [x + 16]
(040) jeq      #0x17c1          jt 41	jf 58
(041) ldx      M[0]
(042) ldb      [x + 14]
(043) and      #0xf
(044) lsh      #2
(045) add      x
(046) tax      
(047) ldb      [x + 22]
(048) and      #0xc0
(049) jeq      #0x0             jt 50	jf 58
(050) ldx      M[0]
(051) ldb      [x + 14]
(052) and      #0xf
(053) lsh      #2
(054) add      x
(055) tax      
(056) txa      
(057) jeq      x                jt 76	jf 58
(058) ldx      M[1]
(059) ldh      [x + 12]
(060) jeq      #0x86dd          jt 61	jf 98
(061) ldx      M[0]
(062) ldb      [x + 20]
(063) jeq      #0x11            jt 64	jf 98
(064) ldx      M[0]
(065) ldh      [x + 56]
(066) jeq      #0x17c1          jt 67	jf 98
(067) ldx      M[0]
(068) ldb      [x + 62]
(069) and      #0xc0
(070) jeq      #0x0             jt 71	jf 98
(071) ldx      M[0]
(072) ld       #0x28
(073) add      x
(074) tax      
(075) jeq      x                jt 76	jf 98
(076) add      #22
(077) tax      
(078) add      #2
(079) st       M[2]
(080) ldb      [x + 0]
(081) and      #0x3f
(082) mul      #4
(083) add      #8
(084) add      x
(085) st       M[3]
(086) ldh      [x + 2]
(087) ldx      M[3]
(088) jeq      #0x6558          jt 89	jf 94
(089) txa      
(090) add      #12
(091) st       M[2]
(092) add      #2
(093) tax      
(094) stx      M[4]
(095) ld       #0x0
(096) jeq      #0x0             jt 97	jf 98
(097) ret      #262144
(098) ret      #0

Are M[0] and M[1] not initialized to 0?
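
A note on the question above: in this dump, (005) and (008) load M[0] and M[1] before any instruction has stored to them. As far as I know, Linux validates a classic BPF program when it is attached and refuses one that reads a scratch slot before writing it, which would be consistent with the EINVAL warning, but nothing in this thread confirms that this is the cause. The Python sketch below is a simplified check, not the kernel's algorithm: it follows only the fall-through path (000) to (010) and ignores branches. The function name is hypothetical.

```python
import re

# Fall-through path (000)-(010), copied from the -Od output above.
DUMP = """\
(000) ld       #0x0
(001) st       M[4]
(002) st       M[2]
(003) ldb      [-4048]
(004) jeq      #0x1             jt 17	jf 5
(005) ld       M[0]
(006) add      #4
(007) st       M[0]
(008) ld       M[1]
(009) add      #4
(010) st       M[1]
"""

def uninitialized_loads(dump):
    """Return (instruction index, slot) for each load from a scratch slot
    that no earlier instruction in the listing has stored to."""
    written = set()
    findings = []
    for line in dump.splitlines():
        # Only instructions whose operand is a scratch slot M[n] matter here.
        m = re.match(r"\((\d+)\)\s+(\w+)\s+M\[(\d+)\]", line)
        if not m:
            continue
        index, opcode, slot = int(m.group(1)), m.group(2), int(m.group(3))
        if opcode in ("st", "stx"):
            written.add(slot)
        elif opcode in ("ld", "ldx") and slot not in written:
            findings.append((index, slot))
    return findings

print(uninitialized_loads(DUMP))
```

On this listing the check reports the loads at (005) and (008), that is, M[0] and M[1].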

@infrastation
Member

I understand the problem is that the filter does not match any packets. If you run the same command without the filter, does it capture any packets? Does it capture the packets you are looking for? Does the problem reproduce with the latest stable versions of tcpdump and libpcap?

@gujun4990
Author

gujun4990 commented Feb 20, 2024

Running ovs-tcpdump -i Bond1 -nnevv captures all packets, and ovs-tcpdump -i Bond1 -nnevv 'vlan 1264' also works correctly.
I want to filter the packets below:
[Image: WeCom screenshot 20240220174926]
I checked another tcpdump version and got the same problem:

root@work:~# tcpdump -i ens3 -nnevv "(vlan 100) and geneve"
Warning: Kernel filter failed: Invalid argument
tcpdump: listening on ens3, link-type EN10MB (Ethernet), snapshot length 262144 bytes

^C
0 packets captured
17 packets received by filter
0 packets dropped by kernel
root@work:~# 
root@work:~# tcpdump --version
tcpdump version 4.99.1
libpcap version 1.10.1 (with TPACKET_V3)
OpenSSL 3.0.2 15 Mar 2022
root@work:~# uname -a
Linux work 5.15.0-92-generic #102-Ubuntu SMP Wed Jan 10 09:33:48 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

@infrastation
Member

If you use udp port 6081 instead of geneve in the filter, does it match the packets you expect?

@gujun4990
Author

Using ovs-tcpdump -i Bond1 -nnevv "(vlan 1264) and (udp port 6081)" works correctly. But I also want to filter on the inner packet carried by Geneve, for example, ovs-tcpdump -i Bond1 -nnevv "(vlan 1264) and geneve and icmp" to match the inner ICMP packet.

@infrastation
Member

Thank you for confirming. One more question: if you run the following, do you see the packet?

ovs-tcpdump -i Bond1 -w udp6081.pcap -c 1 "(vlan 1264) and (udp port 6081)"
tcpdump -nnevv -r udp6081.pcap "(vlan 1264) and geneve"

@gujun4990
Author

gujun4990 commented Feb 20, 2024

ovs-tcpdump -i Bond1 -w udp6081.pcap -c 1 "(vlan 1264) and (udp port 6081)"

I can get the packet using the above command.

[root@node-2 ~]# ovs-tcpdump -i Bond1 -w udp6081.pcap -c 1 "(vlan 1264) and (udp port 6081)"
dropped privs to tcpdump
tcpdump: listening on miBond1, link-type EN10MB (Ethernet), capture size 262144 bytes
1 packet captured
16 packets received by filter
0 packets dropped by kernel
[root@node-2 ~]# vi udp6081.pcap 
[root@node-2 ~]# 
[root@node-2 ~]# tcpdump -nnevv -r udp6081.pcap "(vlan 1264) and geneve"
reading from file udp6081.pcap, link-type EN10MB (Ethernet)
dropped privs to tcpdump
18:37:46.141812 ba:b3:1d:11:c8:43 > 16:c5:84:65:a4:41, ethertype 802.1Q (0x8100), length 120: vlan 1264, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 22245, offset 0, flags [DF], proto UDP (17), length 102)
    46.168.20.5.30581 > 46.168.20.3.6081: [udp sum ok] Geneve, Flags [none], vni 0x0, proto TEB (0x6558)
	2e:ad:5e:3a:d0:20 > 00:23:20:00:00:01, ethertype IPv4 (0x0800), length 66: (tos 0xc0, ttl 255, id 0, offset 0, flags [none], proto UDP (17), length 52)
    169.254.1.1.49262 > 169.254.1.0.3784: [no cksum] BFDv1, length: 24
	Control, State Up, Flags: [none], Diagnostic: No Diagnostic (0x00)
	Detection Timer Multiplier: 3 (300 ms Detection time), BFD Length: 24
	My Discriminator: 0x8e8820d8, Your Discriminator: 0xd70dd484
	  Desired min Tx Interval:     100 ms
	  Required min Rx Interval:   1000 ms
	  Required min Echo Interval:    0 ms

@infrastation
Member

Thank you. The problem reproduces during the live capture only, that's why offline GENEVE tests do not fail. This also involves Linux VLANs, but matching a UDP port after the VLAN header works correctly, so the root cause looks related to the combination of VLAN and GENEVE. Can you retry the last test using libpcap 1.10.4 and the master branch?

@gujun4990
Author

I compiled tcpdump using the master branches of libpcap and tcpdump. The problem is the same as before.

root@work:tcpdump# ./tcpdump --version
tcpdump version 5.0.0-PRE-GIT
libpcap version 1.11.0-PRE-GIT (with TPACKET_V3)
root@work:tcpdump# 
root@work:tcpdump# ./tcpdump -i ens3 -nnevv  "(vlan 1264) and geneve"
Warning: Kernel filter failed: Invalid argument
tcpdump: listening on ens3, link-type EN10MB (Ethernet), snapshot length 262144 bytes

^C
0 packets captured
9 packets received by filter
0 packets dropped by kernel

@guyharris
Member

There are, I suspect, two problems here.

Warning: Kernel filter failed: Invalid argument

That's the first one - the compiler is producing code that the kernel rejects, returning an EINVAL error.

The second problem may be that libpcap then just uses that filter in userland, but that won't work, because, thanks to Linux's filtering code receiving packets with VLAN tags removed and put into metadata, different filtering code needs to be used in the kernel than in userland. The resulting filter probably won't work in userland unless it's done before libpcap inserts the VLAN tag back into the packet data, with the userland (classic) BPF interpreter supporting the special Linux BPF loads that fetch metadata.
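
To illustrate the difference described above, here is a small Python sketch, not libpcap or kernel code. It assumes that the loads at [-4048] and [-4052] in the earlier dump are the Linux ancillary loads for "VLAN tag present" and "VLAN tag" (SKF_AD_OFF plus SKF_AD_VLAN_TAG_PRESENT and SKF_AD_VLAN_TAG). The metadata dictionary and both function names are hypothetical stand-ins.

```python
import struct

# EtherTypes the generated filter checks at offset 12: (012), (014), (016).
VLAN_ETHERTYPES = (0x8100, 0x88A8, 0x9100)

def vlan_id_in_packet(frame):
    """Userland view: the tag is in the frame, right after the MAC addresses."""
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype not in VLAN_ETHERTYPES:
        return None
    (tci,) = struct.unpack_from("!H", frame, 14)
    return tci & 0x0FFF                      # low 12 bits are the VLAN ID

def vlan_id_in_kernel(frame, metadata):
    """Kernel view: use the metadata if present, else fall back to the frame."""
    if metadata.get("vlan_tag_present"):     # hypothetical stand-in for [-4048]
        return metadata["vlan_tag"] & 0x0FFF  # hypothetical stand-in for [-4052]
    return vlan_id_in_packet(frame)

macs = bytes(12)
tagged = macs + struct.pack("!HHH", 0x8100, 1264, 0x0800)   # tag in the frame
stripped = macs + struct.pack("!H", 0x0800)                 # tag pulled out
meta = {"vlan_tag_present": 1, "vlan_tag": 1264}

print(vlan_id_in_packet(tagged))           # tag visible in the frame
print(vlan_id_in_packet(stripped))         # tag no longer in the frame
print(vlan_id_in_kernel(stripped, meta))   # tag recovered from metadata
```

The same frame therefore needs a different check depending on where the tag lives. Every later offset also moves by 4 bytes when the tag is in the frame, which appears to be what the "add #4" steps at (006) and (009) in the dump are for.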

@guyharris
Member

Yeah, the code generator is broken for this case.

@guyharris guyharris added compiling VLAN Tagged frames requiring different filtering to match labels Feb 21, 2024
@gujun4990
Author

Thank you for your analysis. Are there any workarounds to avoid the issue?

@guyharris
Member

If there's a way to disable Linux's "pull the VLAN tag out" stuff, that might work, but I'm not sure there's a way to do that.

This is the result of the BPF compiler code to handle VLANs in the Linux live capture case using a mechanism that's also used by the BPF compiler code to handle Geneve, and their uses of that mechanism step on top of each other. This will take some detangling....

@gujun4990
Author

Thanks. For now we can only parse Geneve packets layer by layer, without using the geneve keyword.
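
As a rough illustration of what parsing layer by layer could look like on bytes that have already been captured (for example with the "(vlan 1264) and (udp port 6081)" filter, which works in this thread), here is a Python sketch. It assumes a single 802.1Q tag present in the frame data, IPv4 in both the outer and inner layers, and a Geneve payload of type TEB (0x6558). The function names and the synthetic test frame are hypothetical.

```python
import struct

GENEVE_PORT = 6081        # 0x17c1 in the dump above
ETH_P_8021Q = 0x8100
ETH_P_IP = 0x0800
ETH_P_TEB = 0x6558        # Transparent Ethernet Bridging
IPPROTO_UDP = 17

def inner_ipv4_protocol(frame, vlan_id):
    """Return the inner IPv4 protocol number, or None when the frame is not
    VLAN vlan_id / IPv4 / UDP to 6081 / Geneve / Ethernet / IPv4."""
    try:
        ethertype, tci, encap_type = struct.unpack_from("!HHH", frame, 12)
        if ethertype != ETH_P_8021Q or tci & 0x0FFF != vlan_id:
            return None
        if encap_type != ETH_P_IP:
            return None
        ip = 18                                    # 14-byte Ethernet + 4-byte tag
        ihl = (frame[ip] & 0x0F) * 4               # outer IPv4 header length
        frag = struct.unpack_from("!H", frame, ip + 6)[0] & 0x1FFF
        if frame[ip + 9] != IPPROTO_UDP or frag:   # UDP, first fragment only
            return None
        udp = ip + ihl
        (dport,) = struct.unpack_from("!H", frame, udp + 2)
        if dport != GENEVE_PORT:
            return None
        geneve = udp + 8
        if frame[geneve] & 0xC0:                   # Geneve version must be 0
            return None
        opt_len = (frame[geneve] & 0x3F) * 4       # options length in bytes
        (proto,) = struct.unpack_from("!H", frame, geneve + 2)
        if proto != ETH_P_TEB:
            return None
        inner_eth = geneve + 8 + opt_len           # 8-byte fixed Geneve header
        (inner_type,) = struct.unpack_from("!H", frame, inner_eth + 12)
        if inner_type != ETH_P_IP:
            return None
        return frame[inner_eth + 14 + 9]           # inner IPv4 protocol field
    except (struct.error, IndexError):
        return None                                # truncated frame

def build_sample(vlan_id=1264, inner_proto=1):
    """Synthetic frame shaped like the packets in this issue (payloads zeroed)."""
    ipv4 = lambda proto: bytes([0x45]) + bytes(8) + bytes([proto]) + bytes(10)
    outer_eth = bytes(12) + struct.pack("!HHH", ETH_P_8021Q, vlan_id, ETH_P_IP)
    udp = struct.pack("!HHHH", 19312, GENEVE_PORT, 0, 0)
    geneve = bytes([0x02, 0x40]) + struct.pack("!H", ETH_P_TEB) + bytes(4) + bytes(8)
    inner_eth = bytes(12) + struct.pack("!H", ETH_P_IP)
    return outer_eth + ipv4(IPPROTO_UDP) + udp + geneve + inner_eth + ipv4(inner_proto)

print(inner_ipv4_protocol(build_sample(), 1264))   # inner protocol 1 is ICMP
```

The checks follow the same order as the generated filter in the earlier dump (VLAN ID, IPv4, UDP, destination port, Geneve version, options length, TEB), but every offset is computed explicitly from the start of the frame.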

@guyharris
Member

A long time ago, somebody (Dave Mills?) came up with the term "Christmas tree packet":

https://en.wikipedia.org/wiki/Christmas_tree_packet

referring to "a packet with every single option set for whatever protocol is in use."

I'd like to see somebody construct a "skyscraper packet" or a "Jenga packet":

https://en.wikipedia.org/wiki/Jenga

with every possible type of tunneling/encapsulation in it - VLAN, MPLS, Geneve, VXLAN, GRE, etc., etc., etc.

@infrastation
Member

Even if a Jenga packet always uses every type of header exactly once, quite a few protocols can encapsulate each other many different ways around. Then, given a set of allowed protocols and a matrix of how they may combine, calculating the number of different possible valid Jenga packet varieties would be a practicable combinatorial exercise. If this number does not yet have a name, let's call it a Harris number. The Harris number for various parts of the Internet, or for the same parts in different years, would be different. However measured, most of the time the value is a non-decreasing function, and each time it increases it almost certainly attracts comments from network protocol analyser developers.
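
The counting exercise can be sketched in a few lines of Python. The protocol list and the MAY_CARRY pairs below are made up purely for illustration and say nothing about which encapsulations are valid in practice. harris_number is a hypothetical name.

```python
from itertools import permutations

def harris_number(protocols, may_carry):
    """Count the orderings (outermost header first) that use every protocol
    exactly once and in which each header may carry the one that follows."""
    return sum(
        all((outer, inner) in may_carry for outer, inner in zip(stack, stack[1:]))
        for stack in permutations(protocols)
    )

# Illustrative input only: (outer, inner) pairs assumed to be allowed.
PROTOCOLS = ("VLAN", "MPLS", "GRE")
MAY_CARRY = {
    ("VLAN", "MPLS"),
    ("VLAN", "GRE"),
    ("MPLS", "GRE"),
    ("GRE", "MPLS"),
    ("GRE", "VLAN"),
}

print(harris_number(PROTOCOLS, MAY_CARRY))
```

This brute-force version tries every permutation, so it is only practical for a handful of protocols.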

On the subject matter, if the same problem stands for vlan and pppoes or vlan and mpls, it may be a good idea to untangle those cases in the same go. If it does not, maybe the solution could be copied from there.

@infrastation infrastation added BPF related Linux VLAN bug and removed compiling VLAN Tagged frames requiring different filtering to match labels Feb 21, 2024