Tests for IPv4 fragments support introduced a flake in the CI. The test
consists of sending a fragmented datagram and checking (from the
conntrack table) that all fragments were processed as expected. But
sometimes, an additional packet is observed, leading to a failure with
a message like the following:
    Failed to account for IPv4 fragments (in)
    Expected
        <[]int | len:2, cap:2>: [21, 20]
    To satisfy at least one of these matchers: [%!s(*matchers.EqualMatcher=&{[16 24]}) %!s(*matchers.EqualMatcher=&{[20 20]})]
A normal datagram, as seen by the destination pod, looks like:
    09:02:22.178149 IP (tos 0x0, ttl 63, id 61115, offset 0, flags [+], proto UDP (17), length 1444)
        10.10.0.230.12345 > testds-smpbw.69: 1416 tftp-#0
    09:02:22.178151 IP (tos 0x0, ttl 63, id 61115, offset 1424, flags [+], proto UDP (17), length 1444)
        10.10.0.230 > testds-smpbw: udp
    09:02:22.178233 IP (tos 0x0, ttl 63, id 61115, offset 2848, flags [+], proto UDP (17), length 1444)
        10.10.0.230 > testds-smpbw: udp
    09:02:22.178265 IP (tos 0x0, ttl 63, id 61115, offset 4272, flags [none], proto UDP (17), length 876)
        10.10.0.230 > testds-smpbw: udp
When reproducing the flake, we could observe the additional packet:
    09:02:26.535728 IP (tos 0x0, ttl 63, id 61232, offset 0, flags [DF], proto UDP (17), length 540)
        10.10.0.230.12345 > testds-smpbw.69: [udp sum ok] 512 tftp-#0
    09:02:26.536103 IP (tos 0x0, ttl 63, id 61233, offset 0, flags [+], proto UDP (17), length 1444)
        10.10.0.230.12345 > testds-smpbw.69: 1416 tftp-#0
    09:02:26.536162 IP (tos 0x0, ttl 63, id 61233, offset 1424, flags [+], proto UDP (17), length 1444)
        10.10.0.230 > testds-smpbw: udp
    09:02:26.536274 IP (tos 0x0, ttl 63, id 61233, offset 2848, flags [+], proto UDP (17), length 1444)
        10.10.0.230 > testds-smpbw: udp
    09:02:26.536422 IP (tos 0x0, ttl 63, id 61233, offset 4272, flags [none], proto UDP (17), length 364)
        10.10.0.230 > testds-smpbw: udp
We note that the data is split into two datagrams, the first packet
being a standalone datagram and the rest being fragmented. The total
length received is the same (3*1444 + 540 + 364 == 3*1444 + 876 +
sizeof(IP and UDP headers)).
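This equality can be checked with shell arithmetic, assuming the usual
header sizes of 20 bytes (IPv4, no options) and 8 bytes (UDP):

```shell
# Flaky run: three full fragments, the last fragment, and the standalone
# datagram, each "length" covering IP header + payload
echo $(( 3*1444 + 540 + 364 ))     # 5236
# Normal run: three full fragments plus the last one, with one extra set
# of IP (20 bytes) and UDP (8 bytes) headers for the standalone datagram
echo $(( 3*1444 + 876 + 20 + 8 ))  # 5236
```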
The fact that the first packet is 512 bytes long is a hint to the
probable cause of the flake. We send the packets with netcat, but source
the data from /dev/zero with dd, as follows:

    dd if=/dev/zero bs=512 count=10 | nc ...
Most of the time, the blocks written by dd are passed quickly enough
that netcat processes them in one go. But if the machine is under a
heavier load at that moment, a small latency may be introduced between
the blocks, and netcat then sends the data in several chunks (and hence
several datagrams).
Let's solve this by copying from /dev/zero in a single block, so that
netcat always receives the whole payload in one write.
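As a sketch of the fix (the netcat options are elided as in the original
command; bs=5120 is an assumption that preserves the original 10*512-byte
payload size), the byte counts of both variants can be compared:

```shell
# Before: ten 512-byte writes; a delay between them can split the stream
dd if=/dev/zero bs=512 count=10 2>/dev/null | wc -c    # 5120 bytes
# After: one 5120-byte write, handed to netcat in a single chunk
dd if=/dev/zero bs=5120 count=1 2>/dev/null | wc -c    # 5120 bytes
```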
Fixes: #10929
Signed-off-by: Quentin Monnet <quentin@isovalent.com>