test/K8sServices: send datagrams in one block for fragment support tests
Tests for IPv4 fragments support introduced a flake in the CI. The test
consists of sending a fragmented datagram and checking (from the
conntrack table) that all fragments were counted as expected. But
sometimes an additional packet is observed, leading to a failure and to
a message like the following:

    Failed to account for IPv4 fragments (in)
         Expected
             <[]int | len:2, cap:2>: [21, 20]
         To satisfy at least one of these matchers: [%!s(*matchers.EqualMatcher=&{[16 24]}) %!s(*matchers.EqualMatcher=&{[20 20]})]

A normal datagram, as seen by the destination pod, looks like:

    09:02:22.178149 IP (tos 0x0, ttl 63, id 61115, offset 0, flags [+], proto UDP (17), length 1444)
        10.10.0.230.12345 > testds-smpbw.69:  1416 tftp-#0
    09:02:22.178151 IP (tos 0x0, ttl 63, id 61115, offset 1424, flags [+], proto UDP (17), length 1444)
        10.10.0.230 > testds-smpbw: udp
    09:02:22.178233 IP (tos 0x0, ttl 63, id 61115, offset 2848, flags [+], proto UDP (17), length 1444)
        10.10.0.230 > testds-smpbw: udp
    09:02:22.178265 IP (tos 0x0, ttl 63, id 61115, offset 4272, flags [none], proto UDP (17), length 876)
        10.10.0.230 > testds-smpbw: udp

When reproducing the flake, we could observe the additional packet:

    09:02:26.535728 IP (tos 0x0, ttl 63, id 61232, offset 0, flags [DF], proto UDP (17), length 540)
        10.10.0.230.12345 > testds-smpbw.69: [udp sum ok]  512 tftp-#0
    09:02:26.536103 IP (tos 0x0, ttl 63, id 61233, offset 0, flags [+], proto UDP (17), length 1444)
        10.10.0.230.12345 > testds-smpbw.69:  1416 tftp-#0
    09:02:26.536162 IP (tos 0x0, ttl 63, id 61233, offset 1424, flags [+], proto UDP (17), length 1444)
        10.10.0.230 > testds-smpbw: udp
    09:02:26.536274 IP (tos 0x0, ttl 63, id 61233, offset 2848, flags [+], proto UDP (17), length 1444)
        10.10.0.230 > testds-smpbw: udp
    09:02:26.536422 IP (tos 0x0, ttl 63, id 61233, offset 4272, flags [none], proto UDP (17), length 364)
        10.10.0.230 > testds-smpbw: udp

We note that data is split into two datagrams, the first packet being
standalone and the rest being fragmented. The total length received is
the same (3*1444 + 540 + 364 == 3*1444 + 876 + sizeof(IP, UDP headers)).

The fact that the first packet is 512 bytes long is a hint to the
probable cause of the flake. We send the packets with netcat, but source
the data from /dev/zero with "dd" as follows:

    dd if=/dev/zero bs=512 count=10 | nc ...

Most of the time, the different blocks written by dd are passed quickly
enough that netcat processes them in one go. But if the machine is under
a heavier load at that moment, it is likely that a small latency is
introduced between the blocks, and netcat sends the data in several
chunks (datagrams).

Let's solve this by copying from /dev/zero in just one block.

Fixes: #10929
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
qmonnet authored and borkmann committed Apr 16, 2020
1 parent bb9c5bb commit 8b74e24
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions test/k8sT/Services.go
@@ -481,8 +481,8 @@ var _ = Describe("K8sServicesTest", func() {
// dstPort: Target endpoint port for sending the datagram
doFragmentedRequest := func(srcPod string, srcIP string, fromNode string, dstPodPort int, dstIP string, dstPort int32) {
var (
- blockSize = 512
- blockCount = 10
+ blockSize = 5120
+ blockCount = 1
srcPort = 12345
)
ciliumPodK8s1, err := kubectl.GetCiliumPodOnNode(helpers.CiliumNamespace, helpers.K8s1)