selftests: mptcp_connect.sh sometimes fails when validating SYN cookies #318

Closed
matttbe opened this issue Nov 30, 2022 · 3 comments

matttbe commented Nov 30, 2022

This failure is not frequent, but the public CI reports it at least once or twice a week.

Here is an example:

# ns1 MPTCP -> ns2 (10.0.1.2:10006      ) MPTCP	(duration   730ms) [ OK ]
# ns1 MPTCP -> ns2 (dead:beef:1::2:10007) MPTCP	write: Connection reset by peer
# copyfd_io_poll: poll timed out (events: POLLIN 1, POLLOUT 4)
# (duration 30675ms) [ FAIL ] client exit code 111, server 0
# 
# netns ns2-6386f927-RqxpSD socket stat for 10007:
# Netid State Recv-Q Send-Q Local Address:Port Peer Address:PortProcess
# TcpPassiveOpens                 1                  0.0
# TcpEstabResets                  1                  0.0
# TcpInSegs                       7                  0.0
# TcpOutSegs                      8                  0.0
# TcpOutRsts                      1                  0.0
# TcpExtSyncookiesSent            1                  0.0
# TcpExtSyncookiesRecv            2                  0.0
# TcpExtSyncookiesFailed          1                  0.0
# TcpExtTCPPureAcks               1                  0.0
# TcpExtTCPReqQFullDoCookies      1                  0.0
# TcpExtTCPOrigDataSent           6                  0.0
# MPTcpExtMPCapableSYNRX          1                  0.0
# MPTcpExtMPCapableACKRX          1                  0.0
# 
# netns ns1-6386f927-RqxpSD socket stat for 10007:
# Netid State Recv-Q Send-Q Local Address:Port Peer Address:PortProcess
# TcpActiveOpens                  1                  0.0
# TcpEstabResets                  1                  0.0
# TcpInSegs                       5                  0.0
# TcpOutSegs                      11                 0.0
# TcpOutRsts                      3                  0.0
# TcpExtTCPOrigDataSent           6                  0.0
# TcpExtTCPDelivered              1                  0.0
# MPTcpExtMPCapableSYNTX          1                  0.0
# MPTcpExtMPCapableSYNACKRX       1                  0.0
# 
# ns1 MPTCP -> ns2 (10.0.2.1:10008      ) MPTCP	(duration   691ms) [ OK ]
# ns1 MPTCP -> ns2 (dead:beef:2::1:10009) MPTCP	(duration   706ms) [ OK ]

From:

We briefly discussed this at the meeting last week, and thought it might be fixed by this series: https://lore.kernel.org/all/20221130094448.4119946-1-matthieu.baerts@tessares.net/ → We just got it with IPv4 with the patches applied, so it is not linked.
Still, it is good to keep track of this issue.
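
To compare failing runs more quickly, the counter dump printed by the selftest can be parsed and checked for rejected SYN cookies. This is only a minimal sketch: the helper names `parse_counters` and `syncookie_validation_failed` are made up for illustration, and the sample lines are copied from the ns2 dump above.

```python
import re

# Matches lines such as "# TcpExtSyncookiesFailed          1                  0.0".
# The "Netid State ..." header line does not match because its second field is not a number.
COUNTER_RE = re.compile(r"^#\s+([A-Za-z][A-Za-z0-9]*)\s+(\d+)\s+[\d.]+\s*$")


def parse_counters(dump: str) -> dict:
    """Return {counter_name: value} for every counter line found in the dump."""
    counters = {}
    for line in dump.splitlines():
        match = COUNTER_RE.match(line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def syncookie_validation_failed(counters: dict) -> bool:
    """True when the listener side counted at least one invalid SYN cookie."""
    return counters.get("TcpExtSyncookiesFailed", 0) > 0


# Lines copied from the ns2 dump of the failing IPv6 run.
SAMPLE = """\
# TcpExtSyncookiesSent            1                  0.0
# TcpExtSyncookiesRecv            2                  0.0
# TcpExtSyncookiesFailed          1                  0.0
# MPTcpExtMPCapableACKRX          1                  0.0
"""

if __name__ == "__main__":
    stats = parse_counters(SAMPLE)
    print(stats)
    print("cookie validation failed:", syncookie_validation_failed(stats))
```

In both failing runs reported here, ns2 shows `TcpExtSyncookiesRecv` at 2 and `TcpExtSyncookiesFailed` at 1, so a check like this could help to group similar CI failures.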

@matttbe matttbe added this to Needs triage in MPTCP Bugs via automation Nov 30, 2022

matttbe commented Dec 6, 2022

I just hit the same issue, but with IPv4 this time:

# ns1 TCP   -> ns1 (dead:beef:1::1:10005) MPTCP	(duration   665ms) [ OK ]
# ns1 MPTCP -> ns2 (10.0.1.2:10006      ) MPTCP	write: Connection reset by peer
# copyfd_io_poll: poll timed out (events: POLLIN 1, POLLOUT 4)
# (duration 30647ms) [ FAIL ] client exit code 111, server 0
# 
# netns ns2-638f4061-faG9Gz socket stat for 10006:
# Netid State Recv-Q Send-Q Local Address:Port Peer Address:PortProcess
# TcpPassiveOpens                 1                  0.0
# TcpEstabResets                  1                  0.0
# TcpInSegs                       7                  0.0
# TcpOutSegs                      7                  0.0
# TcpOutRsts                      1                  0.0
# TcpExtSyncookiesSent            1                  0.0
# TcpExtSyncookiesRecv            2                  0.0
# TcpExtSyncookiesFailed          1                  0.0
# TcpExtTCPPureAcks               1                  0.0
# TcpExtTCPReqQFullDoCookies      1                  0.0
# TcpExtTCPOrigDataSent           5                  0.0
# MPTcpExtMPCapableSYNRX          1                  0.0
# MPTcpExtMPCapableACKRX          1                  0.0
# 
# netns ns1-638f4061-faG9Gz socket stat for 10006:
# Netid State Recv-Q Send-Q Local Address:Port Peer Address:PortProcess
# TcpActiveOpens                  1                  0.0
# TcpEstabResets                  1                  0.0
# TcpInSegs                       5                  0.0
# TcpOutSegs                      11                 0.0
# TcpOutRsts                      3                  0.0
# TcpExtTCPOrigDataSent           6                  0.0
# TcpExtTCPDelivered              1                  0.0
# MPTcpExtMPCapableSYNTX          1                  0.0
# MPTcpExtMPCapableSYNACKRX       1                  0.0
# 
# ns1 MPTCP -> ns2 (dead:beef:1::2:10007) MPTCP	(duration  1345ms) [ OK ]

With a debug kernel (on top of export-net):

@matttbe matttbe changed the title selftests: mptcp_connect.sh sometimes fails when validating IPv6 and SYN cookies selftests: mptcp_connect.sh sometimes fails when validating SYN cookies Dec 6, 2022

matttbe commented Feb 1, 2023

Notes from the last meeting:

  • still appearing in the public CI (mostly IPv6)
  • seems difficult to reproduce
  • it might help to stress the host and lower the priority of the VM
  • maybe there is a sysctl to reduce the timeout → didn't find any
    • add losses? → maybe
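
Since the failure is hard to reproduce, one option is to run the test in a loop until it fails. The following is a rough sketch, not something taken from the CI: `run_until_failure` is a hypothetical helper, and the command to run (the selftest itself, or whatever launches the VM, possibly prefixed with `nice` to lower its priority) has to be given on the command line.

```python
import subprocess
import sys
from typing import List, Optional


def run_until_failure(cmd: List[str], max_runs: int = 100) -> Optional[int]:
    """Run cmd repeatedly.

    Return the 1-based number of the first run that exits with a non-zero
    status, or None if all max_runs runs succeed.
    """
    for attempt in range(1, max_runs + 1):
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        if result.returncode != 0:
            return attempt
    return None


if __name__ == "__main__":
    command = sys.argv[1:]
    if not command:
        sys.exit("usage: repeat.py <command> [args...]")
    failed_at = run_until_failure(command)
    if failed_at is None:
        print("no failure seen")
    else:
        print(f"failed on run {failed_at}")
```

The output of each run is discarded here to keep the sketch short; for real debugging it is probably better to keep the output of the failing run.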

matttbe pushed a commit that referenced this issue Aug 4, 2023
Add unit tests for new ldsx insns. The test includes sign-extension
with a single value or with a value range.

If cpuv4 is not supported due to
  (1) older compiler, e.g., less than clang version 18, or
  (2) test runner test_progs and test_progs-no_alu32 which tests
      cpu v2 and v3, or
  (3) non-x86_64 arch not supporting new insns in jit yet,
a dummy program is added with below output:
  #318/1   verifier_ldsx/cpuv4 is not supported by compiler or jit, use a dummy test:OK
  #318     verifier_ldsx:OK
to indicate the test passed with a dummy test instead of actually
testing cpuv4. I am using a dummy prog to avoid changing the
verifier testing infrastructure. Once clang 18 is widely available
and other architectures support cpuv4, at least for CI run,
the dummy program can be removed.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230728011304.3719139-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
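
For readers not familiar with sign-extending loads, here is a small model of what such a load computes. It is an illustrative sketch only: `sign_extend` is a made-up helper, not code from the kernel or from the test, and it assumes the loaded value ends up in a 64-bit register.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def sign_extend(value: int, bits: int) -> int:
    """Treat the low `bits` bits of value as two's complement.

    The result is returned as the 64-bit register pattern.
    """
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        # The sign bit is set: the value is negative.
        value -= 1 << bits
    return value & MASK64


if __name__ == "__main__":
    for raw, width in ((0x7F, 8), (0xFF, 8), (0x8000, 16), (0x80000000, 32)):
        print(f"{width:2d}-bit {raw:#x} -> {sign_extend(raw, width):#018x}")
```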
matttbe pushed a commit that referenced this issue Mar 27, 2024
When is64 == 1 in emit(A64_REV32(is64, dst, dst), ctx), the
generated insn reverses the byte order of both the high and low 32-bit words,
resulting in an incorrect swap, as indicated by the jit test:

[ 9757.262607] test_bpf: #312 BSWAP 16: 0x0123456789abcdef -> 0xefcd jited:1 8 PASS
[ 9757.264435] test_bpf: #313 BSWAP 32: 0x0123456789abcdef -> 0xefcdab89 jited:1 ret 1460850314 != -271733879 (0x5712ce8a != 0xefcdab89)FAIL (1 times)
[ 9757.266260] test_bpf: #314 BSWAP 64: 0x0123456789abcdef -> 0x67452301 jited:1 8 PASS
[ 9757.268000] test_bpf: #315 BSWAP 64: 0x0123456789abcdef >> 32 -> 0xefcdab89 jited:1 8 PASS
[ 9757.269686] test_bpf: #316 BSWAP 16: 0xfedcba9876543210 -> 0x1032 jited:1 8 PASS
[ 9757.271380] test_bpf: #317 BSWAP 32: 0xfedcba9876543210 -> 0x10325476 jited:1 ret -1460850316 != 271733878 (0xa8ed3174 != 0x10325476)FAIL (1 times)
[ 9757.273022] test_bpf: #318 BSWAP 64: 0xfedcba9876543210 -> 0x98badcfe jited:1 7 PASS
[ 9757.274721] test_bpf: #319 BSWAP 64: 0xfedcba9876543210 >> 32 -> 0x10325476 jited:1 9 PASS

Fix this by forcing 32bit variant of rev32.

Fixes: 1104247 ("bpf, arm64: Support unconditional bswap")
Signed-off-by: Artem Savkov <asavkov@redhat.com>
Tested-by: Puranjay Mohan <puranjay12@gmail.com>
Acked-by: Puranjay Mohan <puranjay12@gmail.com>
Acked-by: Xu Kuohai <xukuohai@huawei.com>
Message-ID: <20240321081809.158803-1-asavkov@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
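
The difference between the two forms of rev32 described in this commit message can be modelled in a few lines. This is a sketch of the semantics only, not the JIT code: the function names are made up, and the 32-bit form is assumed to leave the upper 32 bits of the register at zero.

```python
MASK32 = 0xFFFFFFFF


def bswap32(word: int) -> int:
    """Reverse the byte order of a 32-bit value."""
    return int.from_bytes((word & MASK32).to_bytes(4, "big"), "little")


def rev32_64bit(value: int) -> int:
    """Model of the 64-bit form: bytes are reversed inside each 32-bit word."""
    high = bswap32(value >> 32)
    low = bswap32(value)
    return (high << 32) | low


def rev32_32bit(value: int) -> int:
    """Model of the 32-bit form: only the low word is swapped, upper bits are zero."""
    return bswap32(value)


if __name__ == "__main__":
    value = 0x0123456789ABCDEF
    print(f"64-bit form: {rev32_64bit(value):#018x}")
    print(f"32-bit form: {rev32_32bit(value):#018x}")
```

With the input from test #313, the 32-bit form gives the expected 0xefcdab89, while the 64-bit form also leaves 0x67452301 in the upper half.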

matttbe commented Mar 28, 2024

We haven't seen this issue for a while. I suggest closing this ticket. We can re-open it later if needed.

@matttbe matttbe closed this as completed Mar 28, 2024
matttbe pushed a commit that referenced this issue May 20, 2024
With recent additions in BPF like cpu v4 instructions, the test_bpf module
exhibits the following failures:

  test_bpf: #82 ALU_MOVSX | BPF_B jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times)
  test_bpf: #83 ALU_MOVSX | BPF_H jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times)
  test_bpf: #84 ALU64_MOVSX | BPF_B jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times)
  test_bpf: #85 ALU64_MOVSX | BPF_H jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times)
  test_bpf: #86 ALU64_MOVSX | BPF_W jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times)

  test_bpf: #165 ALU_SDIV_X: -6 / 2 = -3 jited:1 ret 2147483645 != -3 (0x7ffffffd != 0xfffffffd)FAIL (1 times)
  test_bpf: #166 ALU_SDIV_K: -6 / 2 = -3 jited:1 ret 2147483645 != -3 (0x7ffffffd != 0xfffffffd)FAIL (1 times)

  test_bpf: #169 ALU_SMOD_X: -7 % 2 = -1 jited:1 ret 1 != -1 (0x1 != 0xffffffff)FAIL (1 times)
  test_bpf: #170 ALU_SMOD_K: -7 % 2 = -1 jited:1 ret 1 != -1 (0x1 != 0xffffffff)FAIL (1 times)

  test_bpf: #172 ALU64_SMOD_K: -7 % 2 = -1 jited:1 ret 1 != -1 (0x1 != 0xffffffff)FAIL (1 times)

  test_bpf: #313 BSWAP 16: 0x0123456789abcdef -> 0xefcd
  eBPF filter opcode 00d7 (@2) unsupported
  jited:0 301 PASS
  test_bpf: #314 BSWAP 32: 0x0123456789abcdef -> 0xefcdab89
  eBPF filter opcode 00d7 (@2) unsupported
  jited:0 555 PASS
  test_bpf: #315 BSWAP 64: 0x0123456789abcdef -> 0x67452301
  eBPF filter opcode 00d7 (@2) unsupported
  jited:0 268 PASS
  test_bpf: #316 BSWAP 64: 0x0123456789abcdef >> 32 -> 0xefcdab89
  eBPF filter opcode 00d7 (@2) unsupported
  jited:0 269 PASS
  test_bpf: #317 BSWAP 16: 0xfedcba9876543210 -> 0x1032
  eBPF filter opcode 00d7 (@2) unsupported
  jited:0 460 PASS
  test_bpf: #318 BSWAP 32: 0xfedcba9876543210 -> 0x10325476
  eBPF filter opcode 00d7 (@2) unsupported
  jited:0 320 PASS
  test_bpf: #319 BSWAP 64: 0xfedcba9876543210 -> 0x98badcfe
  eBPF filter opcode 00d7 (@2) unsupported
  jited:0 222 PASS
  test_bpf: #320 BSWAP 64: 0xfedcba9876543210 >> 32 -> 0x10325476
  eBPF filter opcode 00d7 (@2) unsupported
  jited:0 273 PASS

  test_bpf: #344 BPF_LDX_MEMSX | BPF_B
  eBPF filter opcode 0091 (@5) unsupported
  jited:0 432 PASS
  test_bpf: #345 BPF_LDX_MEMSX | BPF_H
  eBPF filter opcode 0089 (@5) unsupported
  jited:0 381 PASS
  test_bpf: #346 BPF_LDX_MEMSX | BPF_W
  eBPF filter opcode 0081 (@5) unsupported
  jited:0 505 PASS

  test_bpf: #490 JMP32_JA: Unconditional jump: if (true) return 1
  eBPF filter opcode 0006 (@1) unsupported
  jited:0 261 PASS

  test_bpf: Summary: 1040 PASSED, 10 FAILED, [924/1038 JIT'ed]

Fix them by adding missing processing.

Fixes: daabb2b ("bpf/tests: add tests for cpuv4 instructions")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/91de862dda99d170697eb79ffb478678af7e0b27.1709652689.git.christophe.leroy@csgroup.eu
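
As a side note on the SDIV and SMOD lines above: the wrong return values are the ones an unsigned 32-bit division or modulo would give for the same inputs. That is only an observation from the numbers in the log, not a statement about the actual JIT code. A small sketch, with made-up helper names and assuming a non-zero divisor:

```python
MASK32 = 0xFFFFFFFF


def to_signed32(value: int) -> int:
    """Interpret the low 32 bits of value as a two's-complement number."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def sdiv32(a: int, b: int) -> int:
    """Signed division truncating toward zero, returned as a 32-bit pattern."""
    sa, sb = to_signed32(a), to_signed32(b)
    quotient = abs(sa) // abs(sb)
    if (sa < 0) != (sb < 0):
        quotient = -quotient
    return quotient & MASK32


def smod32(a: int, b: int) -> int:
    """Signed remainder with the sign of the dividend, as a 32-bit pattern."""
    sa, sb = to_signed32(a), to_signed32(b)
    remainder = abs(sa) % abs(sb)
    if sa < 0:
        remainder = -remainder
    return remainder & MASK32


def udiv32(a: int, b: int) -> int:
    """Unsigned 32-bit division of the same bit patterns."""
    return (a & MASK32) // (b & MASK32)


def umod32(a: int, b: int) -> int:
    """Unsigned 32-bit remainder of the same bit patterns."""
    return (a & MASK32) % (b & MASK32)


if __name__ == "__main__":
    print(f"-6 / 2 signed:   {sdiv32(-6, 2):#x}")
    print(f"-6 / 2 unsigned: {udiv32(-6, 2):#x}")
    print(f"-7 % 2 signed:   {smod32(-7, 2):#x}")
    print(f"-7 % 2 unsigned: {umod32(-7, 2):#x}")
```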