Crash in netif_rx_internal #471

Closed
matttbe opened this issue Jan 13, 2024 · 14 comments

Comments

@matttbe
Member

matttbe commented Jan 13, 2024

A kernel panic has been detected by the CI (no debug kconfig).

Click to expand but probably ignore this one, no debug info
# INFO: validating network environment with pings
[ 2211.138427] int3: 0000 [#1] PREEMPT SMP NOPTI
[ 2211.138427] CPU: 0 PID: 21830 Comm: ping Tainted: G                 N 6.7.0-gc6465fa4649b #1
[ 2211.138427] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[ 2211.138427] RIP: 0010:__netif_receive_skb_core.constprop.0+0x39/0x10b0
[ 2211.138427] Code: 54 55 53 48 83 ec 78 48 8b 2f 48 89 7c 24 10 48 89 54 24 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 70 31 c0 48 89 6c 24 30 e9 <13> 08 00 00 0f 1f 44 00 00 48 8b 85 c8 00 00 00 48 2b 85 c0 00 00
[ 2211.138427] RSP: 0018:ffffb09700003e00 EFLAGS: 00000246
[ 2211.138427] RAX: 0000000000000000 RBX: ffff9eec3dc2ef10 RCX: ffff9eebc6205700
[ 2211.138427] RDX: ffffb09700003eb8 RSI: 0000000000000000 RDI: ffffb09700003eb0
[ 2211.138427] RBP: ffff9eebc6205700 R08: 0000000000000000 R09: 0000000000000048
[ 2211.138427] R10: 00000000000002ff R11: 020000ff01000000 R12: ffff9eebc82b5000
[ 2211.138427] R13: ffff9eec3dc2ee10 R14: 0000000000000000 R15: 0000000000000002
[ 2211.138427] FS:  00007fa1f295b1c0(0000) GS:ffff9eec3dc00000(0000) knlGS:0000000000000000
[ 2211.138427] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2211.138427] CR2: 00005595dc9df240 CR3: 0000000004758000 CR4: 00000000000006f0
[ 2211.138427] Call Trace:
[ 2211.138427]  <IRQ>
[ 2211.138427]  ? die+0x37/0x90
[ 2211.138427]  ? exc_int3+0x10b/0x110
[ 2211.138427]  ? asm_exc_int3+0x39/0x40
[ 2211.138427]  ? __netif_receive_skb_core.constprop.0+0x39/0x10b0
[ 2211.138427]  ? __netif_receive_skb_core.constprop.0+0x39/0x10b0
[ 2211.138427]  ? ip6_finish_output2+0x209/0x670
[ 2211.138427]  ? ip6_output+0x12d/0x150
[ 2211.138427]  ? unix_stream_read_generic+0x7c4/0xb70
[ 2211.138427]  ? ip6_mtu+0x46/0x50
[ 2211.138427]  __netif_receive_skb_one_core+0x3d/0x80
[ 2211.138427]  process_backlog+0x9d/0x140
[ 2211.138427]  __napi_poll+0x26/0x1b0
[ 2211.138427]  net_rx_action+0x28f/0x300
[ 2211.138427]  __do_softirq+0xc0/0x28b
[ 2211.138427]  do_softirq+0x43/0x60
[ 2211.138427]  </IRQ>
[ 2211.138427]  <TASK>
[ 2211.138427]  __local_bh_enable_ip+0x5c/0x70
[ 2211.138427]  __dev_queue_xmit+0x28e/0xd70
[ 2211.138427]  ip6_finish_output2+0x2d8/0x670
[ 2211.138427]  ? ip6_output+0x12d/0x150
[ 2211.138427]  ? ip6_mtu+0x46/0x50
[ 2211.138427]  ip6_send_skb+0x22/0x70
[ 2211.138427]  rawv6_sendmsg+0xda5/0x10c0
[ 2211.138427]  ? netfs_clear_subrequests+0x63/0x80
[ 2211.138427]  ? netfs_alloc_request+0xec/0x130
[ 2211.138427]  ? folio_add_file_rmap_ptes+0x88/0xb0
[ 2211.138427]  ? set_pte_range+0xe8/0x310
[ 2211.138427]  ? next_uptodate_folio+0x85/0x260
[ 2211.138427]  ? __sock_sendmsg+0x38/0x70
[ 2211.138427]  __sock_sendmsg+0x38/0x70
[ 2211.138427]  ? move_addr_to_kernel.part.0+0x1b/0x60
[ 2211.138427]  __sys_sendto+0xfc/0x160
[ 2211.138427]  ? ktime_get_real_ts64+0x4d/0xf0
[ 2211.138427]  __x64_sys_sendto+0x24/0x30
[ 2211.138427]  do_syscall_64+0xad/0x1a0
[ 2211.138427]  entry_SYSCALL_64_after_hwframe+0x63/0x6b
[ 2211.138427] RIP: 0033:0x7fa1f2c2da0a
[ 2211.138427] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89
[ 2211.138427] RSP: 002b:00007fff0d984668 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[ 2211.138427] RAX: ffffffffffffffda RBX: 00007fff0d985da0 RCX: 00007fa1f2c2da0a
[ 2211.138427] RDX: 0000000000000040 RSI: 00005595dcf1d300 RDI: 0000000000000003
[ 2211.138427] RBP: 00005595dcf1d300 R08: 00007fff0d987fb4 R09: 000000000000001c
[ 2211.138427] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff0d985930
[ 2211.138427] R13: 0000000000000040 R14: 00005595dcf1f4f4 R15: 00007fff0d985da0
[ 2211.138427]  </TASK>
[ 2211.138427] Modules linked in: tcp_diag act_csum act_pedit cls_fw sch_ingress xt_mark xt_statistic xt_length xt_bpf ipt_REJECT nft_tproxy nf_tproxy_ipv6 nf_tproxy_ipv4 nft_socket nf_socket_ipv4 nf_socket_ipv6 nf_tables sch_netem mptcp_diag inet_diag mptcp_token_test mptcp_crypto_test kunit
[ 2211.138427] ---[ end trace 0000000000000000 ]---
[ 2211.138427] RIP: 0010:__netif_receive_skb_core.constprop.0+0x39/0x10b0
[ 2211.138427] Code: 54 55 53 48 83 ec 78 48 8b 2f 48 89 7c 24 10 48 89 54 24 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 70 31 c0 48 89 6c 24 30 e9 <13> 08 00 00 0f 1f 44 00 00 48 8b 85 c8 00 00 00 48 2b 85 c0 00 00
[ 2211.138427] RSP: 0018:ffffb09700003e00 EFLAGS: 00000246
[ 2211.138427] RAX: 0000000000000000 RBX: ffff9eec3dc2ef10 RCX: ffff9eebc6205700
[ 2211.138427] RDX: ffffb09700003eb8 RSI: 0000000000000000 RDI: ffffb09700003eb0
[ 2211.138427] RBP: ffff9eebc6205700 R08: 0000000000000000 R09: 0000000000000048
[ 2211.138427] R10: 00000000000002ff R11: 020000ff01000000 R12: ffff9eebc82b5000
[ 2211.138427] R13: ffff9eec3dc2ee10 R14: 0000000000000000 R15: 0000000000000002
[ 2211.138427] FS:  00007fa1f295b1c0(0000) GS:ffff9eec3dc00000(0000) knlGS:0000000000000000
[ 2211.138427] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2211.138427] CR2: 00005595dc9df240 CR3: 0000000004758000 CR4: 00000000000006f0
[ 2211.138427] Kernel panic - not syncing: Fatal exception in interrupt
[ 2211.138427] Kernel Offset: 0x1c400000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)

It does not look related to MPTCP. Because of a global timeout, the trace was not decoded and the vmlinux file was not saved.

Anyway, logging it here, just in case. I just relaunched the job, hoping to be able to reproduce it (no issues on my side).
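
For reference, each frame in the undecoded trace above only carries `symbol+offset/size`; mapping it to a file and line needs the matching vmlinux, which is what was lost here. Below is a minimal Python sketch that extracts those fields from a raw log line. The helper name `parse_frame` and the regular expression are illustrative assumptions, not part of the CI tooling.

```python
import re

# symbol+offset/size, optionally preceded by "? ", which the kernel prints
# for addresses found on the stack that the unwinder could not confirm.
FRAME_RE = re.compile(r"(\? )?([A-Za-z_][\w.]*)\+(0x[0-9a-f]+)/(0x[0-9a-f]+)")


def parse_frame(line):
    """Return the frame fields of one oops line, or None if it has no frame."""
    m = FRAME_RE.search(line)
    if m is None:
        return None
    guess, symbol, offset, size = m.groups()
    return {
        "symbol": symbol,
        "offset": int(offset, 16),   # byte offset into the function
        "size": int(size, 16),       # total size of the function
        "reliable": guess is None,   # False when the frame was printed with "?"
    }


rip = parse_frame(
    "[ 2211.138427] RIP: 0010:__netif_receive_skb_core.constprop.0+0x39/0x10b0"
)
print(rip["symbol"], hex(rip["offset"]), hex(rip["size"]))
```

With the vmlinux at hand, the symbol and offset are enough for a decoder to recover the source location; without it, only the function name is usable.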

@matttbe
Member Author

matttbe commented Jan 13, 2024

Closing this as this is not related to MPTCP.

I will report it to netdev ML if I manage to reproduce it.

@matttbe matttbe closed this as completed Jan 13, 2024
@matttbe
Member Author

matttbe commented Jan 16, 2024

I probably hit the same issue again, just after sending a ping (IPv6, I suppose), this time with a decoded stack trace, still on top of net:

[   45.505495] int3: 0000 [#1] PREEMPT SMP NOPTI
[   45.505547] CPU: 1 PID: 1070 Comm: ping Tainted: G                 N 6.7.0-g244ee3389ffe #1
[   45.505547] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[   45.505547] RIP: 0010:netif_rx_internal (arch/x86/include/asm/jump_label.h:27) 
[ 45.505547] Code: 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 55 48 89 fd 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 18 31 c0 e9 <c9> 00 00 00 66 90 66 90 48 8d 54 24 10 48 89 ef 65 8b 35 17 9d 11
All code
========
   0:	0f 1f 84 00 00 00 00 	nopl   0x0(%rax,%rax,1)
   7:	00 
   8:	0f 1f 40 00          	nopl   0x0(%rax)
   c:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  11:	55                   	push   %rbp
  12:	48 89 fd             	mov    %rdi,%rbp
  15:	48 83 ec 20          	sub    $0x20,%rsp
  19:	65 48 8b 04 25 28 00 	mov    %gs:0x28,%rax
  20:	00 00 
  22:	48 89 44 24 18       	mov    %rax,0x18(%rsp)
  27:	31 c0                	xor    %eax,%eax
  29:*	e9 c9 00 00 00       	jmp    0xf7		<-- trapping instruction
  2e:	66 90                	xchg   %ax,%ax
  30:	66 90                	xchg   %ax,%ax
  32:	48 8d 54 24 10       	lea    0x10(%rsp),%rdx
  37:	48 89 ef             	mov    %rbp,%rdi
  3a:	65                   	gs
  3b:	8b                   	.byte 0x8b
  3c:	35                   	.byte 0x35
  3d:	17                   	(bad)  
  3e:	9d                   	popf   
  3f:	11                   	.byte 0x11

Code starting with the faulting instruction
===========================================
   0:	c9                   	leave  
   1:	00 00                	add    %al,(%rax)
   3:	00 66 90             	add    %ah,-0x70(%rsi)
   6:	66 90                	xchg   %ax,%ax
   8:	48 8d 54 24 10       	lea    0x10(%rsp),%rdx
   d:	48 89 ef             	mov    %rbp,%rdi
  10:	65                   	gs
  11:	8b                   	.byte 0x8b
  12:	35                   	.byte 0x35
  13:	17                   	(bad)  
  14:	9d                   	popf   
  15:	11                   	.byte 0x11
[   45.505547] RSP: 0018:ffffb106c00f0af8 EFLAGS: 00000246
[   45.505547] RAX: 0000000000000000 RBX: ffff99918827b000 RCX: 0000000000000000
[   45.505547] RDX: 000000000000000a RSI: ffff99918827d000 RDI: ffff9991819e6400
[   45.505547] RBP: ffff9991819e6400 R08: 0000000000000000 R09: 0000000000000068
[   45.505547] R10: ffff999181c104c0 R11: 736f6d6570736575 R12: ffff9991819e6400
[   45.505547] R13: 0000000000000076 R14: 0000000000000000 R15: ffff99918827c000
[   45.505547] FS:  00007fa1d06ca1c0(0000) GS:ffff9991fdc80000(0000) knlGS:0000000000000000
[   45.505547] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   45.505547] CR2: 0000559b91aac240 CR3: 0000000004986000 CR4: 00000000000006f0
[   45.505547] Call Trace:
[   45.505547]  <IRQ>
[   45.505547] ? die (arch/x86/kernel/dumpstack.c:421) 
[   45.505547] ? exc_int3 (arch/x86/kernel/traps.c:762) 
[   45.505547] ? asm_exc_int3 (arch/x86/include/asm/idtentry.h:569) 
[   45.505547] ? netif_rx_internal (arch/x86/include/asm/jump_label.h:27) 
[   45.505547] ? netif_rx_internal (arch/x86/include/asm/jump_label.h:27) 
[   45.505547] __netif_rx (net/core/dev.c:5084) 
[   45.505547] veth_xmit (drivers/net/veth.c:321) 
[   45.505547] dev_hard_start_xmit (include/linux/netdevice.h:4989) 
[   45.505547] __dev_queue_xmit (include/linux/netdevice.h:3367) 
[   45.505547] ? selinux_ip_postroute_compat (security/selinux/hooks.c:5783) 
[   45.505547] ? eth_header (net/ethernet/eth.c:85) 
[   45.505547] ip6_finish_output2 (include/net/neighbour.h:542) 
[   45.505547] ? ip6_output (include/linux/netfilter.h:301) 
[   45.505547] ? ip6_mtu (net/ipv6/route.c:3208) 
[   45.505547] ip6_send_skb (net/ipv6/ip6_output.c:1953) 
[   45.505547] icmpv6_echo_reply (net/ipv6/icmp.c:812) 
[   45.505547] ? icmpv6_rcv (net/ipv6/icmp.c:939) 
[   45.505547] icmpv6_rcv (net/ipv6/icmp.c:939) 
[   45.505547] ip6_protocol_deliver_rcu (net/ipv6/ip6_input.c:440) 
[   45.505547] ip6_input_finish (include/linux/rcupdate.h:779) 
[   45.505547] __netif_receive_skb_one_core (net/core/dev.c:5537) 
[   45.505547] process_backlog (include/linux/rcupdate.h:779) 
[   45.505547] __napi_poll (net/core/dev.c:6576) 
[   45.505547] net_rx_action (net/core/dev.c:6647) 
[   45.505547] __do_softirq (arch/x86/include/asm/jump_label.h:27) 
[   45.505547] do_softirq (kernel/softirq.c:454) 
[   45.505547]  </IRQ>
[   45.505547]  <TASK>
[   45.505547] __local_bh_enable_ip (kernel/softirq.c:381) 
[   45.505547] __dev_queue_xmit (net/core/dev.c:4379) 
[   45.505547] ip6_finish_output2 (include/linux/netdevice.h:3171) 
[   45.505547] ? ip6_output (include/linux/netfilter.h:301) 
[   45.505547] ? ip6_mtu (net/ipv6/route.c:3208) 
[   45.505547] ip6_send_skb (net/ipv6/ip6_output.c:1953) 
[   45.505547] rawv6_sendmsg (net/ipv6/raw.c:584) 
[   45.505547] ? netfs_clear_subrequests (include/linux/list.h:373) 
[   45.505547] ? netfs_alloc_request (fs/netfs/objects.c:42) 
[   45.505547] ? folio_add_file_rmap_ptes (arch/x86/include/asm/bitops.h:206) 
[   45.505547] ? set_pte_range (mm/memory.c:4529) 
[   45.505547] ? next_uptodate_folio (include/linux/xarray.h:1699) 
[   45.505547] ? __sock_sendmsg (net/socket.c:733) 
[   45.505547] __sock_sendmsg (net/socket.c:733) 
[   45.505547] ? move_addr_to_kernel.part.0 (net/socket.c:253) 
[   45.505547] __sys_sendto (net/socket.c:2191) 
[   45.505547] ? __hrtimer_run_queues (include/linux/seqlock.h:566) 
[   45.505547] ? __do_softirq (arch/x86/include/asm/jump_label.h:27) 
[   45.505547] __x64_sys_sendto (net/socket.c:2203) 
[   45.505547] do_syscall_64 (arch/x86/entry/common.c:52) 
[   45.505547] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129) 
[   45.505547] RIP: 0033:0x7fa1d099ca0a
[ 45.505547] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89
All code
========
   0:	d8 64 89 02          	fsubs  0x2(%rcx,%rcx,4)
   4:	48 c7 c0 ff ff ff ff 	mov    $0xffffffffffffffff,%rax
   b:	eb b8                	jmp    0xffffffffffffffc5
   d:	0f 1f 00             	nopl   (%rax)
  10:	f3 0f 1e fa          	endbr64 
  14:	41 89 ca             	mov    %ecx,%r10d
  17:	64 8b 04 25 18 00 00 	mov    %fs:0x18,%eax
  1e:	00 
  1f:	85 c0                	test   %eax,%eax
  21:	75 15                	jne    0x38
  23:	b8 2c 00 00 00       	mov    $0x2c,%eax
  28:	0f 05                	syscall 
  2a:*	48 3d 00 f0 ff ff    	cmp    $0xfffffffffffff000,%rax		<-- trapping instruction
  30:	77 7e                	ja     0xb0
  32:	c3                   	ret    
  33:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  38:	41 54                	push   %r12
  3a:	48 83 ec 30          	sub    $0x30,%rsp
  3e:	44                   	rex.R
  3f:	89                   	.byte 0x89

Code starting with the faulting instruction
===========================================
   0:	48 3d 00 f0 ff ff    	cmp    $0xfffffffffffff000,%rax
   6:	77 7e                	ja     0x86
   8:	c3                   	ret    
   9:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
   e:	41 54                	push   %r12
  10:	48 83 ec 30          	sub    $0x30,%rsp
  14:	44                   	rex.R
  15:	89                   	.byte 0x89
[   45.505547] RSP: 002b:00007ffe47710958 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[   45.505547] RAX: ffffffffffffffda RBX: 00007ffe47712090 RCX: 00007fa1d099ca0a
[   45.505547] RDX: 0000000000000040 RSI: 0000559b91bbd300 RDI: 0000000000000003
[   45.505547] RBP: 0000559b91bbd300 R08: 00007ffe477142a4 R09: 000000000000001c
[   45.505547] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffe47711c20
[   45.505547] R13: 0000000000000040 R14: 0000559b91bbf4f4 R15: 00007ffe47712090
[   45.505547]  </TASK>
[   45.505547] Modules linked in: mptcp_diag inet_diag mptcp_token_test mptcp_crypto_test kunit
[   45.505547] ---[ end trace 0000000000000000 ]---
[   45.505547] RIP: 0010:netif_rx_internal (arch/x86/include/asm/jump_label.h:27) 
[ 45.505547] Code: 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 55 48 89 fd 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 18 31 c0 e9 <c9> 00 00 00 66 90 66 90 48 8d 54 24 10 48 89 ef 65 8b 35 17 9d 11
All code
========
   0:	0f 1f 84 00 00 00 00 	nopl   0x0(%rax,%rax,1)
   7:	00 
   8:	0f 1f 40 00          	nopl   0x0(%rax)
   c:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  11:	55                   	push   %rbp
  12:	48 89 fd             	mov    %rdi,%rbp
  15:	48 83 ec 20          	sub    $0x20,%rsp
  19:	65 48 8b 04 25 28 00 	mov    %gs:0x28,%rax
  20:	00 00 
  22:	48 89 44 24 18       	mov    %rax,0x18(%rsp)
  27:	31 c0                	xor    %eax,%eax
  29:*	e9 c9 00 00 00       	jmp    0xf7		<-- trapping instruction
  2e:	66 90                	xchg   %ax,%ax
  30:	66 90                	xchg   %ax,%ax
  32:	48 8d 54 24 10       	lea    0x10(%rsp),%rdx
  37:	48 89 ef             	mov    %rbp,%rdi
  3a:	65                   	gs
  3b:	8b                   	.byte 0x8b
  3c:	35                   	.byte 0x35
  3d:	17                   	(bad)  
  3e:	9d                   	popf   
  3f:	11                   	.byte 0x11

Code starting with the faulting instruction
===========================================
   0:	c9                   	leave  
   1:	00 00                	add    %al,(%rax)
   3:	00 66 90             	add    %ah,-0x70(%rsi)
   6:	66 90                	xchg   %ax,%ax
   8:	48 8d 54 24 10       	lea    0x10(%rsp),%rdx
   d:	48 89 ef             	mov    %rbp,%rdi
  10:	65                   	gs
  11:	8b                   	.byte 0x8b
  12:	35                   	.byte 0x35
  13:	17                   	(bad)  
  14:	9d                   	popf   
  15:	11                   	.byte 0x11
[   45.505547] RSP: 0018:ffffb106c00f0af8 EFLAGS: 00000246
[   45.505547] RAX: 0000000000000000 RBX: ffff99918827b000 RCX: 0000000000000000
[   45.505547] RDX: 000000000000000a RSI: ffff99918827d000 RDI: ffff9991819e6400
[   45.505547] RBP: ffff9991819e6400 R08: 0000000000000000 R09: 0000000000000068
[   45.505547] R10: ffff999181c104c0 R11: 736f6d6570736575 R12: ffff9991819e6400
[   45.505547] R13: 0000000000000076 R14: 0000000000000000 R15: ffff99918827c000
[   45.505547] FS:  00007fa1d06ca1c0(0000) GS:ffff9991fdc80000(0000) knlGS:0000000000000000
[   45.505547] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   45.505547] CR2: 0000559b91aac240 CR3: 0000000004986000 CR4: 00000000000006f0
[   45.505547] Kernel panic - not syncing: Fatal exception in interrupt
[   45.505547] Kernel Offset: 0x37600000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)

https://github.com/multipath-tcp/mptcp_net-next/actions/runs/7537561906/job/20516659466
export-net/20240116T054733
244ee33

EDIT: we have CONFIG_RPS=y in the tests. The kconfig and vmlinux files are available in the artifacts linked to the test job.
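
A note on reading the `Code:` line in the trace above: the byte wrapped in `<..>` is the one at the reported RIP. The Python sketch below locates that byte and decodes the `jmp` around it; the helper names `parse_code_line` and `jmp_rel32_target` are hypothetical and only serve the illustration.

```python
def parse_code_line(line):
    """Return (opcode bytes, index of the byte marked with <..>)."""
    tokens = line.split("Code:", 1)[1].split()
    marked = None
    data = bytearray()
    for i, tok in enumerate(tokens):
        if tok.startswith("<") and tok.endswith(">"):
            marked = i
            tok = tok[1:-1]
        data.append(int(tok, 16))
    if marked is None:
        raise ValueError("no marked byte in Code: line")
    return bytes(data), marked


def jmp_rel32_target(code, start):
    """Decode an x86 `jmp rel32` (opcode 0xe9) located at offset `start`."""
    if code[start] != 0xE9:
        raise ValueError("not a jmp rel32")
    rel = int.from_bytes(code[start + 1:start + 5], "little", signed=True)
    # The displacement is relative to the end of the 5-byte instruction.
    return start + 5 + rel


CODE_LINE = (
    "[ 45.505547] Code: 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 "
    "55 48 89 fd 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 18 31 c0 "
    "e9 <c9> 00 00 00 66 90 66 90 48 8d 54 24 10 48 89 ef 65 8b 35 17 9d 11"
)

code, marked = parse_code_line(CODE_LINE)
print(hex(marked))                               # 0x2a
print(hex(jmp_rel32_target(code, marked - 1)))   # 0xf7
```

The marked byte sits at offset 0x2a, one byte into the 5-byte `jmp` that starts at 0x29 and targets 0xf7, matching the decoded listing. Since `int3` is a one-byte trap and the saved RIP points just past it, this would be consistent with an `int3` having been executed at the first byte of that `jmp`; that reading is an assumption, not something established in this thread.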

@matttbe
Member Author

matttbe commented Jan 16, 2024

Yet another one, on top of net-next + net:

# INFO: validating network environment with pings
[   46.316504] int3: 0000 [#1] PREEMPT SMP NOPTI
[   46.316504] CPU: 0 PID: 1078 Comm: ping Tainted: G                 N 6.7.0-g2572fed72ac3 #1
[   46.316504] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[   46.316504] RIP: 0010:netif_rx_internal (arch/x86/include/asm/jump_label.h:27) 
[ 46.316504] Code: 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 55 48 89 fd 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 18 31 c0 e9 <c9> 00 00 00 66 90 66 90 48 8d 54 24 10 48 89 ef 65 8b 35 d7 9c 31
All code
========
   0:	0f 1f 84 00 00 00 00 	nopl   0x0(%rax,%rax,1)
   7:	00 
   8:	0f 1f 40 00          	nopl   0x0(%rax)
   c:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  11:	55                   	push   %rbp
  12:	48 89 fd             	mov    %rdi,%rbp
  15:	48 83 ec 20          	sub    $0x20,%rsp
  19:	65 48 8b 04 25 28 00 	mov    %gs:0x28,%rax
  20:	00 00 
  22:	48 89 44 24 18       	mov    %rax,0x18(%rsp)
  27:	31 c0                	xor    %eax,%eax
  29:*	e9 c9 00 00 00       	jmp    0xf7		<-- trapping instruction
  2e:	66 90                	xchg   %ax,%ax
  30:	66 90                	xchg   %ax,%ax
  32:	48 8d 54 24 10       	lea    0x10(%rsp),%rdx
  37:	48 89 ef             	mov    %rbp,%rdi
  3a:	65                   	gs
  3b:	8b                   	.byte 0x8b
  3c:	35                   	.byte 0x35
  3d:	d7                   	xlat   %ds:(%rbx)
  3e:	9c                   	pushf  
  3f:	31                   	.byte 0x31

Code starting with the faulting instruction
===========================================
   0:	c9                   	leave  
   1:	00 00                	add    %al,(%rax)
   3:	00 66 90             	add    %ah,-0x70(%rsi)
   6:	66 90                	xchg   %ax,%ax
   8:	48 8d 54 24 10       	lea    0x10(%rsp),%rdx
   d:	48 89 ef             	mov    %rbp,%rdi
  10:	65                   	gs
  11:	8b                   	.byte 0x8b
  12:	35                   	.byte 0x35
  13:	d7                   	xlat   %ds:(%rbx)
  14:	9c                   	pushf  
  15:	31                   	.byte 0x31
[   46.316504] RSP: 0018:ffffb96ac0003af8 EFLAGS: 00000246
[   46.316504] RAX: 0000000000000000 RBX: ffff9d8088424000 RCX: 0000000000000000
[   46.316504] RDX: 000000000000000a RSI: ffff9d8088426000 RDI: ffff9d8081b1f400
[   46.316504] RBP: ffff9d8081b1f400 R08: 0000000000000000 R09: 0000000000000000
[   46.316504] R10: ffff9d8082338000 R11: 736f6d6570736575 R12: ffff9d8081b1f400
[   46.316504] R13: 0000000000000076 R14: 0000000000000000 R15: ffff9d8088422000
[   46.316504] FS:  00007f6e554d61c0(0000) GS:ffff9d80fdc00000(0000) knlGS:0000000000000000
[   46.316504] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   46.316504] CR2: 00005619ee27d240 CR3: 000000000548a000 CR4: 00000000000006f0
[   46.316504] Call Trace:
[   46.316504]  <IRQ>
[   46.316504] ? die (arch/x86/kernel/dumpstack.c:421) 
[   46.316504] ? exc_int3 (arch/x86/kernel/traps.c:762) 
[   46.316504] ? asm_exc_int3 (arch/x86/include/asm/idtentry.h:569) 
[   46.316504] ? netif_rx_internal (arch/x86/include/asm/jump_label.h:27) 
[   46.316504] ? netif_rx_internal (arch/x86/include/asm/jump_label.h:27) 
[   46.316504] __netif_rx (net/core/dev.c:5084) 
[   46.316504] veth_xmit (drivers/net/veth.c:321) 
[   46.316504] dev_hard_start_xmit (include/linux/netdevice.h:4989) 
[   46.316504] __dev_queue_xmit (include/linux/netdevice.h:3367) 
[   46.316504] ? selinux_ip_postroute_compat (security/selinux/hooks.c:5783) 
[   46.316504] ip6_finish_output2 (include/linux/netdevice.h:3171) 
[   46.316504] ? ip6_output (include/linux/netfilter.h:301) 
[   46.316504] ? ip6_mtu (net/ipv6/route.c:3208) 
[   46.316504] ip6_send_skb (net/ipv6/ip6_output.c:1953) 
[   46.316504] icmpv6_echo_reply (net/ipv6/icmp.c:812) 
[   46.316504] ? icmpv6_rcv (net/ipv6/icmp.c:939) 
[   46.316504] icmpv6_rcv (net/ipv6/icmp.c:939) 
[   46.316504] ip6_protocol_deliver_rcu (net/ipv6/ip6_input.c:440) 
[   46.316504] ip6_input_finish (include/linux/rcupdate.h:779) 
[   46.316504] __netif_receive_skb_one_core (net/core/dev.c:5537) 
[   46.316504] process_backlog (include/linux/rcupdate.h:779) 
[   46.316504] __napi_poll (net/core/dev.c:6576) 
[   46.316504] net_rx_action (net/core/dev.c:6647) 
[   46.316504] __do_softirq (arch/x86/include/asm/jump_label.h:27) 
[   46.316504] do_softirq (kernel/softirq.c:454) 
[   46.316504]  </IRQ>
[   46.316504]  <TASK>
[   46.316504] __local_bh_enable_ip (kernel/softirq.c:381) 
[   46.316504] __dev_queue_xmit (net/core/dev.c:4379) 
[   46.316504] ? selinux_ip_postroute_compat (security/selinux/hooks.c:5783) 
[   46.316504] ip6_finish_output2 (include/linux/netdevice.h:3171) 
[   46.316504] ? ip6_output (include/linux/netfilter.h:301) 
[   46.316504] ? ip6_mtu (net/ipv6/route.c:3208) 
[   46.316504] ip6_send_skb (net/ipv6/ip6_output.c:1953) 
[   46.316504] rawv6_sendmsg (net/ipv6/raw.c:584) 
[   46.316504] ? netfs_clear_subrequests (include/linux/list.h:373) 
[   46.316504] ? netfs_alloc_request (fs/netfs/objects.c:42) 
[   46.316504] ? folio_add_file_rmap_ptes (arch/x86/include/asm/bitops.h:206) 
[   46.316504] ? set_pte_range (mm/memory.c:4529) 
[   46.316504] ? next_uptodate_folio (include/linux/xarray.h:1699) 
[   46.316504] ? __sock_sendmsg (net/socket.c:733) 
[   46.316504] __sock_sendmsg (net/socket.c:733) 
[   46.316504] ? move_addr_to_kernel.part.0 (net/socket.c:253) 
[   46.316504] __sys_sendto (net/socket.c:2191) 
[   46.316504] ? ktime_get_real_ts64 (kernel/time/timekeeping.c:292 (discriminator 3)) 
[   46.316504] __x64_sys_sendto (net/socket.c:2203) 
[   46.316504] do_syscall_64 (arch/x86/entry/common.c:52) 
[   46.316504] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129) 
[   46.316504] RIP: 0033:0x7f6e557a8a0a
[ 46.316504] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89
All code
========
   0:	d8 64 89 02          	fsubs  0x2(%rcx,%rcx,4)
   4:	48 c7 c0 ff ff ff ff 	mov    $0xffffffffffffffff,%rax
   b:	eb b8                	jmp    0xffffffffffffffc5
   d:	0f 1f 00             	nopl   (%rax)
  10:	f3 0f 1e fa          	endbr64 
  14:	41 89 ca             	mov    %ecx,%r10d
  17:	64 8b 04 25 18 00 00 	mov    %fs:0x18,%eax
  1e:	00 
  1f:	85 c0                	test   %eax,%eax
  21:	75 15                	jne    0x38
  23:	b8 2c 00 00 00       	mov    $0x2c,%eax
  28:	0f 05                	syscall 
  2a:*	48 3d 00 f0 ff ff    	cmp    $0xfffffffffffff000,%rax		<-- trapping instruction
  30:	77 7e                	ja     0xb0
  32:	c3                   	ret    
  33:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  38:	41 54                	push   %r12
  3a:	48 83 ec 30          	sub    $0x30,%rsp
  3e:	44                   	rex.R
  3f:	89                   	.byte 0x89

Code starting with the faulting instruction
===========================================
   0:	48 3d 00 f0 ff ff    	cmp    $0xfffffffffffff000,%rax
   6:	77 7e                	ja     0x86
   8:	c3                   	ret    
   9:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
   e:	41 54                	push   %r12
  10:	48 83 ec 30          	sub    $0x30,%rsp
  14:	44                   	rex.R
  15:	89                   	.byte 0x89
[   46.316504] RSP: 002b:00007ffd7aa1f0b8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[   46.316504] RAX: ffffffffffffffda RBX: 00007ffd7aa207f0 RCX: 00007f6e557a8a0a
[   46.316504] RDX: 0000000000000040 RSI: 00005619effd8300 RDI: 0000000000000003
[   46.316504] RBP: 00005619effd8300 R08: 00007ffd7aa22a04 R09: 000000000000001c
[   46.316504] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffd7aa20380
[   46.316504] R13: 0000000000000040 R14: 00005619effda4f4 R15: 00007ffd7aa207f0
[   46.316504]  </TASK>
[   46.316504] Modules linked in: mptcp_diag inet_diag mptcp_token_test mptcp_crypto_test kunit
[   46.316504] ---[ end trace 0000000000000000 ]---
[   46.316504] RIP: 0010:netif_rx_internal (arch/x86/include/asm/jump_label.h:27) 
[ 46.316504] Code: 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 55 48 89 fd 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 18 31 c0 e9 <c9> 00 00 00 66 90 66 90 48 8d 54 24 10 48 89 ef 65 8b 35 d7 9c 31
All code
========
   0:	0f 1f 84 00 00 00 00 	nopl   0x0(%rax,%rax,1)
   7:	00 
   8:	0f 1f 40 00          	nopl   0x0(%rax)
   c:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  11:	55                   	push   %rbp
  12:	48 89 fd             	mov    %rdi,%rbp
  15:	48 83 ec 20          	sub    $0x20,%rsp
  19:	65 48 8b 04 25 28 00 	mov    %gs:0x28,%rax
  20:	00 00 
  22:	48 89 44 24 18       	mov    %rax,0x18(%rsp)
  27:	31 c0                	xor    %eax,%eax
  29:*	e9 c9 00 00 00       	jmp    0xf7		<-- trapping instruction
  2e:	66 90                	xchg   %ax,%ax
  30:	66 90                	xchg   %ax,%ax
  32:	48 8d 54 24 10       	lea    0x10(%rsp),%rdx
  37:	48 89 ef             	mov    %rbp,%rdi
  3a:	65                   	gs
  3b:	8b                   	.byte 0x8b
  3c:	35                   	.byte 0x35
  3d:	d7                   	xlat   %ds:(%rbx)
  3e:	9c                   	pushf  
  3f:	31                   	.byte 0x31

Code starting with the faulting instruction
===========================================
   0:	c9                   	leave  
   1:	00 00                	add    %al,(%rax)
   3:	00 66 90             	add    %ah,-0x70(%rsi)
   6:	66 90                	xchg   %ax,%ax
   8:	48 8d 54 24 10       	lea    0x10(%rsp),%rdx
   d:	48 89 ef             	mov    %rbp,%rdi
  10:	65                   	gs
  11:	8b                   	.byte 0x8b
  12:	35                   	.byte 0x35
  13:	d7                   	xlat   %ds:(%rbx)
  14:	9c                   	pushf  
  15:	31                   	.byte 0x31
[   46.316504] RSP: 0018:ffffb96ac0003af8 EFLAGS: 00000246
[   46.316504] RAX: 0000000000000000 RBX: ffff9d8088424000 RCX: 0000000000000000
[   46.316504] RDX: 000000000000000a RSI: ffff9d8088426000 RDI: ffff9d8081b1f400
[   46.316504] RBP: ffff9d8081b1f400 R08: 0000000000000000 R09: 0000000000000000
[   46.316504] R10: ffff9d8082338000 R11: 736f6d6570736575 R12: ffff9d8081b1f400
[   46.316504] R13: 0000000000000076 R14: 0000000000000000 R15: ffff9d8088422000
[   46.316504] FS:  00007f6e554d61c0(0000) GS:ffff9d80fdc00000(0000) knlGS:0000000000000000
[   46.316504] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   46.316504] CR2: 00005619ee27d240 CR3: 000000000548a000 CR4: 00000000000006f0
[   46.316504] Kernel panic - not syncing: Fatal exception in interrupt
[   46.316504] Kernel Offset: 0x32400000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Unexpected stop of the VM

https://github.com/multipath-tcp/mptcp_net-next/actions/runs/7545349968/job/20540751697
export/20240116T172013
2572fed

@matttbe matttbe changed the title from "Crash in __netif_receive_skb_core" to "Crash in netif_rx_internal" Jan 16, 2024
@matttbe
Member Author

matttbe commented Jan 17, 2024

Note that Eric suggests this is probably an issue on x86's side.

We had another stack trace:

 # INFO: validating network environment with pings
[   46.565607] int3: 0000 [#1] PREEMPT SMP NOPTI
[   46.565607] CPU: 2 PID: 1079 Comm: ping Tainted: G                 N 6.7.0-g1fd81af266b7 #1
[   46.565607] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[   46.565607] RIP: 0010:netif_rx_internal (arch/x86/include/asm/jump_label.h:27) 
[ 46.565607] Code: 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 55 48 89 fd 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 18 31 c0 e9 <c9> 00 00 00 66 90 66 90 48 8d 54 24 10 48 89 ef 65 8b 35 d7 9c d1
All code
========
   0:	0f 1f 84 00 00 00 00 	nopl   0x0(%rax,%rax,1)
   7:	00 
   8:	0f 1f 40 00          	nopl   0x0(%rax)
   c:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  11:	55                   	push   %rbp
  12:	48 89 fd             	mov    %rdi,%rbp
  15:	48 83 ec 20          	sub    $0x20,%rsp
  19:	65 48 8b 04 25 28 00 	mov    %gs:0x28,%rax
  20:	00 00 
  22:	48 89 44 24 18       	mov    %rax,0x18(%rsp)
  27:	31 c0                	xor    %eax,%eax
  29:*	e9 c9 00 00 00       	jmp    0xf7		<-- trapping instruction
  2e:	66 90                	xchg   %ax,%ax
  30:	66 90                	xchg   %ax,%ax
  32:	48 8d 54 24 10       	lea    0x10(%rsp),%rdx
  37:	48 89 ef             	mov    %rbp,%rdi
  3a:	65                   	gs
  3b:	8b                   	.byte 0x8b
  3c:	35                   	.byte 0x35
  3d:	d7                   	xlat   %ds:(%rbx)
  3e:	9c                   	pushf  
  3f:	d1                   	.byte 0xd1

Code starting with the faulting instruction
===========================================
   0:	c9                   	leave  
   1:	00 00                	add    %al,(%rax)
   3:	00 66 90             	add    %ah,-0x70(%rsi)
   6:	66 90                	xchg   %ax,%ax
   8:	48 8d 54 24 10       	lea    0x10(%rsp),%rdx
   d:	48 89 ef             	mov    %rbp,%rdi
  10:	65                   	gs
  11:	8b                   	.byte 0x8b
  12:	35                   	.byte 0x35
  13:	d7                   	xlat   %ds:(%rbx)
  14:	9c                   	pushf  
  15:	d1                   	.byte 0xd1
[   46.565607] RSP: 0018:ffff9ffc0011cc08 EFLAGS: 00000246
[   46.565607] RAX: 0000000000000000 RBX: ffff9b8983696000 RCX: 0000000000000001
[   46.565607] RDX: 0000000000000002 RSI: ffff9b8983697000 RDI: ffff9b89821cd600
[   46.565607] RBP: ffff9b89821cd600 R08: 0000000000000000 R09: 000000000000001c
[   46.565607] R10: ffff9b8983a05910 R11: ffff9b8983a05900 R12: ffff9b89821cd600
[   46.565607] R13: 000000000000002a R14: 0000000000000000 R15: ffff9b8983695000
[   46.565607] FS:  00007f5bd2bf61c0(0000) GS:ffff9b89fdd00000(0000) knlGS:0000000000000000
[   46.565607] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   46.565607] CR2: 00007ffd2b36eff8 CR3: 00000000019b0000 CR4: 00000000000006f0
[   46.565607] Call Trace:
[   46.565607]  <IRQ>
[   46.565607] ? die (arch/x86/kernel/dumpstack.c:421) 
[   46.565607] ? exc_int3 (arch/x86/kernel/traps.c:762) 
[   46.565607] ? asm_exc_int3 (arch/x86/include/asm/idtentry.h:569) 
[   46.565607] ? netif_rx_internal (arch/x86/include/asm/jump_label.h:27) 
[   46.565607] ? netif_rx_internal (arch/x86/include/asm/jump_label.h:27) 
[   46.565607] __netif_rx (net/core/dev.c:5084) 
[   46.565607] veth_xmit (drivers/net/veth.c:321) 
[   46.565607] dev_hard_start_xmit (include/linux/netdevice.h:4989) 
[   46.565607] __dev_queue_xmit (include/linux/netdevice.h:3367) 
[   46.565607] ? arp_send_dst (net/ipv4/arp.c:314) 
[   46.565607] arp_solicit (net/ipv4/arp.c:392) 
[   46.565607] ? kmem_cache_alloc (mm/slub.c:3843) 
[   46.565607] ? arp_constructor (net/ipv4/arp.c:249) 
[   46.565607] neigh_probe (arch/x86/include/asm/atomic.h:53) 
[   46.565607] __neigh_event_send (net/core/neighbour.c:1242) 
[   46.565607] neigh_resolve_output (net/core/neighbour.c:1547) 
[   46.565607] ip_finish_output2 (include/net/neighbour.h:542) 
[   46.565607] ? __ip_finish_output.part.0 (include/linux/skbuff.h:4884) 
[   46.565607] __netif_receive_skb_one_core (net/core/dev.c:5537) 
[   46.565607] process_backlog (include/linux/rcupdate.h:779) 
[   46.565607] __napi_poll (net/core/dev.c:6576) 
[   46.565607] net_rx_action (net/core/dev.c:6647) 
[   46.565607] __do_softirq (arch/x86/include/asm/jump_label.h:27) 
[   46.565607] do_softirq (kernel/softirq.c:454) 
[   46.565607]  </IRQ>
[   46.565607]  <TASK>
[   46.565607] __local_bh_enable_ip (kernel/softirq.c:381) 
[   46.565607] __dev_queue_xmit (net/core/dev.c:4379) 
[   46.565607] ip_finish_output2 (include/linux/netdevice.h:3171) 
[   46.565607] ? __ip_finish_output.part.0 (include/linux/skbuff.h:4884) 
[   46.565607] ip_push_pending_frames (net/ipv4/ip_output.c:1490) 
[   46.565607] raw_sendmsg (net/ipv4/raw.c:647) 
[   46.565607] ? folio_add_file_rmap_ptes (arch/x86/include/asm/bitops.h:206) 
[   46.565607] ? set_pte_range (mm/memory.c:4529) 
[   46.565607] ? update_load_avg (kernel/sched/fair.c:4405) 
[   46.565607] ? __sock_sendmsg (net/socket.c:733) 
[   46.565607] __sock_sendmsg (net/socket.c:733) 
[   46.565607] ? move_addr_to_kernel.part.0 (net/socket.c:253) 
[   46.565607] __sys_sendto (net/socket.c:2191) 
[   46.565607] ? __do_softirq (arch/x86/include/asm/jump_label.h:27) 
[   46.565607] __x64_sys_sendto (net/socket.c:2203) 
[   46.565607] do_syscall_64 (arch/x86/entry/common.c:52) 
[   46.565607] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129) 
[   46.565607] RIP: 0033:0x7f5bd2ec8a0a
[ 46.565607] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89
All code
========
   0:	d8 64 89 02          	fsubs  0x2(%rcx,%rcx,4)
   4:	48 c7 c0 ff ff ff ff 	mov    $0xffffffffffffffff,%rax
   b:	eb b8                	jmp    0xffffffffffffffc5
   d:	0f 1f 00             	nopl   (%rax)
  10:	f3 0f 1e fa          	endbr64 
  14:	41 89 ca             	mov    %ecx,%r10d
  17:	64 8b 04 25 18 00 00 	mov    %fs:0x18,%eax
  1e:	00 
  1f:	85 c0                	test   %eax,%eax
  21:	75 15                	jne    0x38
  23:	b8 2c 00 00 00       	mov    $0x2c,%eax
  28:	0f 05                	syscall 
  2a:*	48 3d 00 f0 ff ff    	cmp    $0xfffffffffffff000,%rax		<-- trapping instruction
  30:	77 7e                	ja     0xb0
  32:	c3                   	ret    
  33:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  38:	41 54                	push   %r12
  3a:	48 83 ec 30          	sub    $0x30,%rsp
  3e:	44                   	rex.R
  3f:	89                   	.byte 0x89

Code starting with the faulting instruction
===========================================
   0:	48 3d 00 f0 ff ff    	cmp    $0xfffffffffffff000,%rax
   6:	77 7e                	ja     0x86
   8:	c3                   	ret    
   9:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
   e:	41 54                	push   %r12
  10:	48 83 ec 30          	sub    $0x30,%rsp
  14:	44                   	rex.R
  15:	89                   	.byte 0x89
[   46.565607] RSP: 002b:00007ffd2b36eff8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[   46.565607] RAX: ffffffffffffffda RBX: 00007ffd2b3706a0 RCX: 00007f5bd2ec8a0a
[   46.565607] RDX: 0000000000000040 RSI: 00005645ab399300 RDI: 0000000000000003
[   46.565607] RBP: 00005645ab399300 R08: 00007ffd2b372920 R09: 0000000000000010
[   46.565607] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000040
[   46.565607] R13: 00007ffd2b370238 R14: 00007ffd2b36f000 R15: 00007ffd2b3706a0
[   46.565607]  </TASK>
[   46.565607] Modules linked in: mptcp_diag inet_diag mptcp_token_test mptcp_crypto_test kunit
[   46.565607] ---[ end trace 0000000000000000 ]---
[   46.565607] RIP: 0010:netif_rx_internal (arch/x86/include/asm/jump_label.h:27) 
[ 46.565607] Code: 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 55 48 89 fd 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 18 31 c0 e9 <c9> 00 00 00 66 90 66 90 48 8d 54 24 10 48 89 ef 65 8b 35 d7 9c d1
All code
========
   0:	0f 1f 84 00 00 00 00 	nopl   0x0(%rax,%rax,1)
   7:	00 
   8:	0f 1f 40 00          	nopl   0x0(%rax)
   c:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  11:	55                   	push   %rbp
  12:	48 89 fd             	mov    %rdi,%rbp
  15:	48 83 ec 20          	sub    $0x20,%rsp
  19:	65 48 8b 04 25 28 00 	mov    %gs:0x28,%rax
  20:	00 00 
  22:	48 89 44 24 18       	mov    %rax,0x18(%rsp)
  27:	31 c0                	xor    %eax,%eax
  29:*	e9 c9 00 00 00       	jmp    0xf7		<-- trapping instruction
  2e:	66 90                	xchg   %ax,%ax
  30:	66 90                	xchg   %ax,%ax
  32:	48 8d 54 24 10       	lea    0x10(%rsp),%rdx
  37:	48 89 ef             	mov    %rbp,%rdi
  3a:	65                   	gs
  3b:	8b                   	.byte 0x8b
  3c:	35                   	.byte 0x35
  3d:	d7                   	xlat   %ds:(%rbx)
  3e:	9c                   	pushf  
  3f:	d1                   	.byte 0xd1

Code starting with the faulting instruction
===========================================
   0:	c9                   	leave  
   1:	00 00                	add    %al,(%rax)
   3:	00 66 90             	add    %ah,-0x70(%rsi)
   6:	66 90                	xchg   %ax,%ax
   8:	48 8d 54 24 10       	lea    0x10(%rsp),%rdx
   d:	48 89 ef             	mov    %rbp,%rdi
  10:	65                   	gs
  11:	8b                   	.byte 0x8b
  12:	35                   	.byte 0x35
  13:	d7                   	xlat   %ds:(%rbx)
  14:	9c                   	pushf  
  15:	d1                   	.byte 0xd1
[   46.565607] RSP: 0018:ffff9ffc0011cc08 EFLAGS: 00000246
[   46.565607] RAX: 0000000000000000 RBX: ffff9b8983696000 RCX: 0000000000000001
[   46.565607] RDX: 0000000000000002 RSI: ffff9b8983697000 RDI: ffff9b89821cd600
[   46.565607] RBP: ffff9b89821cd600 R08: 0000000000000000 R09: 000000000000001c
[   46.565607] R10: ffff9b8983a05910 R11: ffff9b8983a05900 R12: ffff9b89821cd600
[   46.565607] R13: 000000000000002a R14: 0000000000000000 R15: ffff9b8983695000
[   46.565607] FS:  00007f5bd2bf61c0(0000) GS:ffff9b89fdd00000(0000) knlGS:0000000000000000
[   46.565607] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   46.565607] CR2: 00007ffd2b36eff8 CR3: 00000000019b0000 CR4: 00000000000006f0
[   46.565607] Kernel panic - not syncing: Fatal exception in interrupt
[   46.565607] Kernel Offset: 0x39a00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Unexpected stop of the VM

https://github.com/multipath-tcp/mptcp_net-next/actions/runs/7550415246/job/20556002186
On top of export.
patchew/cover.1705331716.git.pabeni@redhat.com

@matttbe
Member Author

matttbe commented Jan 17, 2024

Because this is affecting our CI, I suggest reopening it for the moment.

@matttbe matttbe reopened this Jan 17, 2024
@matttbe
Member Author

matttbe commented Jan 18, 2024

I managed to reproduce it manually by:

  • disabling KVM support: Docker is launched without --privileged mode, so QEMU falls back to TCG.
  • running only the MPTCP Connect test in a loop: echo "run_loop_n 150 run_selftest_one mptcp_connect.sh" > .virtme-exec-run
  • stopping the test right after the pings using this patch:
diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.sh b/tools/testing/selftests/net/mptcp/mptcp_connect.sh
index 7898d62fce0b..52320cb95d31 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_connect.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_connect.sh
@@ -852,6 +852,7 @@ done
 mptcp_lib_result_code "${ret}" "ping tests"
 
 stop_if_error "Could not even run ping tests"
+exit ${final_ret}
 
 [ -n "$tc_loss" ] && tc -net "$ns2" qdisc add dev ns2eth3 root netem loss random $tc_loss delay ${tc_delay}ms
 echo -n "INFO: Using loss of $tc_loss "

Sometimes, it is "quick" (~10 attempts), but sometimes it takes more than 100 attempts.
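
Since the number of attempts varies this much, it helps to record at which iteration a run fails. Below is a minimal sketch of such a loop, not the CI's actual `run_loop_n` helper: `repeat_until_fail` is a hypothetical name, and the command it wraps is whatever boots the VM and runs the test once. A kernel panic stops the VM, so the loop is assumed to run on the host side.

```shell
#!/bin/bash
# Hypothetical helper: run a command up to max_iter times and stop at the
# first failure. Prints the 1-based iteration that failed, or "none" if
# every iteration succeeded.
repeat_until_fail() {
    local max_iter=$1
    shift
    local i=1
    while [ "$i" -le "$max_iter" ]; do
        if ! "$@"; then
            echo "$i"
            return 1
        fi
        i=$((i + 1))
    done
    echo "none"
    return 0
}
```

It would be called as `repeat_until_fail 150 <command>`, where the command is assumed to exit non-zero when the VM stops unexpectedly.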

I started a Git bisect, but I can still reproduce it on a v6.4 kernel, for example.

The Cirrus CI (KVM) has never hit this, so it may be an issue with TCG, which is used here instead of KVM, or with QEMU itself. I tried to upgrade QEMU to v8 (we are currently on v6.2), but virtme sets QEMU options that are no longer supported...

@matttbe
Member Author

matttbe commented Jan 20, 2024

After a few long git bisect sessions, I managed to find a commit. If I revert this commit on top of our export branch, I can no longer reproduce the issue, or at least not within ~2000 iterations. Most of the time, I hit the panic in fewer than 50 iterations. A few times it took more than 100 iterations, up to 140. During the bisect sessions, I ended up running 200 iterations before marking a commit as good. So I guess 2000 iterations are enough to confirm that this commit has an effect.
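
To put rough numbers on these iteration counts, here is a back-of-the-envelope sketch. It assumes each iteration panics independently with the same probability `p`, which is a simplification. The value 0.02 is a hypothetical illustration chosen so that most runs fail within about 50 iterations; it is not a measurement. The chance that a bad commit survives `n` clean iterations is then `(1 - p)^n`.

```shell
#!/bin/bash
# Probability of n consecutive clean iterations on a commit that is
# actually bad, assuming independent iterations that each panic with
# probability p (p is an assumed value, not measured).
miss_probability() {
    awk -v p="$1" -v n="$2" 'BEGIN { printf "%.3g\n", exp(n * log(1 - p)) }'
}

miss_probability 0.02 200    # risk of wrongly marking a bisect step "good"
miss_probability 0.02 2000   # risk that the revert only looks like a fix
```

Under this assumption, 200 iterations leave a little under 2% chance of a wrong "good" verdict per bisect step, while 2000 clean iterations are practically impossible on a bad kernel.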

Now... surprisingly, this commit is 8e791f7 ("x86/kprobes: Drop removed INT3 handling code"): a modification in arch/x86/kernel/kprobes/core.c. I don't see the link yet.

I guess the best option is to report this to the author of the patch.

# bad: [457391b0380335d5e9a5babdec90ac53928b23b4] Linux 6.3
# good: [c9c3395d5e3dcc6daee66c6908354d47bf98cb0c] Linux 6.2
git bisect start 'v6.3' 'v6.2'
# bad: [056612fd41fef88eef22a032021cc15ef98cfc34] Merge tag 'x86-cleanups-2023-02-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad 056612fd41fef88eef22a032021cc15ef98cfc34
# bad: [3f0b0903fde584a7398f82fc00bf4f8138610b87] Merge tag 'x86_vdso_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad 3f0b0903fde584a7398f82fc00bf4f8138610b87
# good: [7dbdc16fc85bcd89a2f3698df37a7202ea266454] Merge tag 'qcom-arm64-for-6.3-2' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/dt
git bisect good 7dbdc16fc85bcd89a2f3698df37a7202ea266454
# good: [5b0ed5964928b0aaf0d644c17c886c7f5ea4bb3f] Merge tag 'for-6.3/block-2023-02-16' of git://git.kernel.dk/linux
git bisect good 5b0ed5964928b0aaf0d644c17c886c7f5ea4bb3f
# good: [6e649d08568220ee88deef0a1ad8b3a935420cf2] Merge tag 'locking-core-2023-02-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good 6e649d08568220ee88deef0a1ad8b3a935420cf2
# good: [7c4a5b89a0b5a57a64b601775b296abf77a9fe97] sched/rt: pick_next_rt_entity(): check list_entry
git bisect good 7c4a5b89a0b5a57a64b601775b296abf77a9fe97
# bad: [0246725d7399d7d6acc8fd5a1a0a1ffce9a1eaa3] Merge tag 'ras_core_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad 0246725d7399d7d6acc8fd5a1a0a1ffce9a1eaa3
# bad: [fd636b6a9bc6034f2e5bb869658898a2b472c037] x86/perf/zhaoxin: Add stepping check for ZXC
git bisect bad fd636b6a9bc6034f2e5bb869658898a2b472c037
# bad: [4cf7a136115e96241f9f1089d2b53c47accf3823] perf/core: Save the dynamic parts of sample data size
git bisect bad 4cf7a136115e96241f9f1089d2b53c47accf3823
# bad: [a018d2e3d4b1abc4a3cb64415c5d204fc5d2eafd] x86/cpufeatures: Add Architectural PerfMon Extension bit
git bisect bad a018d2e3d4b1abc4a3cb64415c5d204fc5d2eafd
# bad: [b6c00fb9949fbd073e651a77aa75faca978cf2a6] perf: Add PMU_FORMAT_ATTR_SHOW
git bisect bad b6c00fb9949fbd073e651a77aa75faca978cf2a6
# bad: [8e791f7eba4c7711f56616ae163ee3cbc00b1bf4] x86/kprobes: Drop removed INT3 handling code
git bisect bad 8e791f7eba4c7711f56616ae163ee3cbc00b1bf4
# good: [03c4c7f88709fac0e20b6a48357c73d6fc50e544] perf/x86/lbr: Simplify the exposure check for the LBR_INFO registers
git bisect good 03c4c7f88709fac0e20b6a48357c73d6fc50e544
# first bad commit: [8e791f7eba4c7711f56616ae163ee3cbc00b1bf4] x86/kprobes: Drop removed INT3 handling code
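
A session like the one above could in principle be automated with `git bisect run`, which treats exit status 0 as good, 125 as "skip this commit", and any other status from 1 to 127 as bad. The sketch below only shows that mapping. `bisect_step` is a hypothetical helper, and the build and test commands passed to it are placeholders for whatever builds the kernel and runs one VM iteration.

```shell
#!/bin/bash
# Hypothetical helper for `git bisect run`: classify the current commit.
#   $1: build command (placeholder), $2: test command (placeholder),
#   $3: number of clean iterations required before calling it "good".
bisect_step() {
    local build_cmd="$1" test_cmd="$2" max_iter="${3:-200}"
    local i

    # A commit that does not build is neither good nor bad: skip it.
    sh -c "$build_cmd" || return 125

    for ((i = 1; i <= max_iter; i++)); do
        # Any failing iteration marks the commit as bad.
        sh -c "$test_cmd" || return 1
    done

    # No failure within max_iter iterations: treat the commit as good.
    return 0
}
```

Because the panic is intermittent, a "good" verdict here is only as reliable as the iteration count, so a low count can send the bisect in the wrong direction.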

@matttbe
Member Author

matttbe commented Jan 20, 2024

Note that I just managed to reproduce it on top of the export branch (export/20240119T055335), this time after an IPv4 ping:

# INFO: set ns4-65abe8a3-O2ZCgd dev ns4eth3: ethtool -K tso off
# Created /tmp/tmp.BWY7Jw45jg (size 1924224     /tmp/tmp.BWY7Jw45jg) containing data sent by client
# Created /tmp/tmp.19cAx2Eg8O (size 2428289     /tmp/tmp.19cAx2Eg8O) containing data sent by server
# New MPTCP socket can be blocked via sysctl            [ OK ]
# INFO: validating network environment with pings
[ 1985.073189] int3: 0000 [#1] PREEMPT SMP NOPTI
[ 1985.073246] CPU: 0 PID: 3203 Comm: ping Not tainted 6.7.0-113761-g5e006770879c-dirty #250
[ 1985.073246] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[ 1985.073246] RIP: 0010:netif_rx_internal (arch/x86/include/asm/jump_label.h:27)
[ 1985.073246] Code: 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 55 48 89 fd 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 18 31 c0 e9 <c9> 00 00 00 66 90 66 90 48 8d 54 24 10 48 89 ef 65 8b 35 67 48 d0
All code
========
   0:   0f 1f 84 00 00 00 00    nopl   0x0(%rax,%rax,1)
   7:   00
   8:   0f 1f 40 00             nopl   0x0(%rax)
   c:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
  11:   55                      push   %rbp
  12:   48 89 fd                mov    %rdi,%rbp
  15:   48 83 ec 20             sub    $0x20,%rsp
  19:   65 48 8b 04 25 28 00    mov    %gs:0x28,%rax
  20:   00 00
  22:   48 89 44 24 18          mov    %rax,0x18(%rsp)
  27:   31 c0                   xor    %eax,%eax
  29:*  e9 c9 00 00 00          jmp    0xf7             <-- trapping instruction
  2e:   66 90                   xchg   %ax,%ax
  30:   66 90                   xchg   %ax,%ax
  32:   48 8d 54 24 10          lea    0x10(%rsp),%rdx
  37:   48 89 ef                mov    %rbp,%rdi
  3a:   65                      gs
  3b:   8b                      .byte 0x8b
  3c:   35                      .byte 0x35
  3d:   67                      addr32
  3e:   48                      rex.W
  3f:   d0                      .byte 0xd0

Code starting with the faulting instruction
===========================================
   0:   c9                      leave
   1:   00 00                   add    %al,(%rax)
   3:   00 66 90                add    %ah,-0x70(%rsi)
   6:   66 90                   xchg   %ax,%ax
   8:   48 8d 54 24 10          lea    0x10(%rsp),%rdx
   d:   48 89 ef                mov    %rbp,%rdi
  10:   65                      gs
  11:   8b                      .byte 0x8b
  12:   35                      .byte 0x35
  13:   67                      addr32
  14:   48                      rex.W
  15:   d0                      .byte 0xd0
[ 1985.073246] RSP: 0018:ffffb36d40003c08 EFLAGS: 00000246
[ 1985.073246] RAX: 0000000000000000 RBX: ffff9580825ca000 RCX: 0000000000000001
[ 1985.073246] RDX: 0000000000000002 RSI: ffff9580825c8000 RDI: ffff9580821cca00
[ 1985.073246] RBP: ffff9580821cca00 R08: 0000000000000000 R09: 000000000000001c
[ 1985.073246] R10: ffff9580812dcf10 R11: ffff9580812dcf00 R12: ffff9580821cca00
[ 1985.073246] R13: 000000000000002a R14: 0000000000000000 R15: ffff9580825e1800
[ 1985.073246] FS:  00007fa7c46be1c0(0000) GS:ffff9580fdc00000(0000) knlGS:0000000000000000
[ 1985.073246] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1985.073246] CR2: 00005584236b2200 CR3: 0000000002704000 CR4: 00000000000006f0
[ 1985.073246] Call Trace:
[ 1985.073246]  <IRQ>
[ 1985.073246] ? die (arch/x86/kernel/dumpstack.c:421)
[ 1985.073246] ? exc_int3 (arch/x86/kernel/traps.c:762)
[ 1985.073246] ? asm_exc_int3 (arch/x86/include/asm/idtentry.h:569)
[ 1985.073246] ? netif_rx_internal (arch/x86/include/asm/jump_label.h:27)
[ 1985.073246] ? netif_rx_internal (arch/x86/include/asm/jump_label.h:27)
[ 1985.073246] ? kmem_cache_alloc_node (mm/slub.c:3843)
[ 1985.073246] __netif_rx (net/core/dev.c:5084)
[ 1985.073246] veth_xmit (drivers/net/veth.c:321)
[ 1985.073246] dev_hard_start_xmit (include/linux/netdevice.h:4989)
[ 1985.073246] __dev_queue_xmit (include/linux/netdevice.h:3367)
[ 1985.073246] ? arp_create (net/ipv4/arp.c:577)
[ 1985.073246] arp_solicit (net/ipv4/arp.c:392)
[ 1985.073246] ? kmem_cache_alloc (mm/slub.c:3843)
[ 1985.073246] ? arp_constructor (net/ipv4/arp.c:249)
[ 1985.073246] neigh_probe (arch/x86/include/asm/atomic.h:53)
[ 1985.073246] __neigh_event_send (net/core/neighbour.c:1242)
[ 1985.073246] neigh_resolve_output (net/core/neighbour.c:1547)
[ 1985.073246] ip_finish_output2 (include/net/neighbour.h:542)
[ 1985.073246] ? __ip_finish_output.part.0 (include/linux/skbuff.h:4884)
[ 1985.073246] __netif_receive_skb_one_core (net/core/dev.c:5537)
[ 1985.073246] process_backlog (include/linux/rcupdate.h:782)
[ 1985.073246] __napi_poll (net/core/dev.c:6576)
[ 1985.073246] net_rx_action (net/core/dev.c:6647)
[ 1985.073246] __do_softirq (arch/x86/include/asm/jump_label.h:27)
[ 1985.073246] do_softirq (kernel/softirq.c:454)
[ 1985.073246]  </IRQ>
[ 1985.073246]  <TASK>
[ 1985.073246] __local_bh_enable_ip (kernel/softirq.c:381)
[ 1985.073246] __dev_queue_xmit (net/core/dev.c:4379)
[ 1985.073246] ip_finish_output2 (include/linux/netdevice.h:3171)
[ 1985.073246] ? __ip_finish_output.part.0 (include/linux/skbuff.h:4884)
[ 1985.073246] ip_push_pending_frames (net/ipv4/ip_output.c:1490)
[ 1985.073246] raw_sendmsg (net/ipv4/raw.c:647)
[ 1985.073246] ? netfs_rreq_assess (fs/netfs/io.c:101)
[ 1985.073246] ? folio_add_file_rmap_ptes (arch/x86/include/asm/bitops.h:206)
[ 1985.073246] ? set_pte_range (mm/memory.c:4529)
[ 1985.073246] ? __sock_sendmsg (net/socket.c:733)
[ 1985.073246] __sock_sendmsg (net/socket.c:733)
[ 1985.073246] ? move_addr_to_kernel.part.0 (net/socket.c:253)
[ 1985.073246] __sys_sendto (net/socket.c:2191)
[ 1985.073246] ? __rseq_handle_notify_resume (kernel/rseq.c:257)
[ 1985.073246] ? ktime_get_real_ts64 (kernel/time/timekeeping.c:292 (discriminator 3))
[ 1985.073246] __x64_sys_sendto (net/socket.c:2203)
[ 1985.073246] do_syscall_64 (arch/x86/entry/common.c:52)
[ 1985.073246] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129)
[ 1985.073246] RIP: 0033:0x7fa7c499081a
[ 1985.073246] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89
All code
========
   0:   d8 64 89 02             fsubs  0x2(%rcx,%rcx,4)
   4:   48 c7 c0 ff ff ff ff    mov    $0xffffffffffffffff,%rax
   b:   eb b8                   jmp    0xffffffffffffffc5
   d:   0f 1f 00                nopl   (%rax)
  10:   f3 0f 1e fa             endbr64
  14:   41 89 ca                mov    %ecx,%r10d
  17:   64 8b 04 25 18 00 00    mov    %fs:0x18,%eax
  1e:   00
  1f:   85 c0                   test   %eax,%eax
  21:   75 15                   jne    0x38
  23:   b8 2c 00 00 00          mov    $0x2c,%eax
  28:   0f 05                   syscall
  2a:*  48 3d 00 f0 ff ff       cmp    $0xfffffffffffff000,%rax         <-- trapping instruction
  30:   77 7e                   ja     0xb0
  32:   c3                      ret
  33:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
  38:   41 54                   push   %r12
  3a:   48 83 ec 30             sub    $0x30,%rsp
  3e:   44                      rex.R
  3f:   89                      .byte 0x89

Code starting with the faulting instruction
===========================================
   0:   48 3d 00 f0 ff ff       cmp    $0xfffffffffffff000,%rax
   6:   77 7e                   ja     0x86
   8:   c3                      ret
   9:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
   e:   41 54                   push   %r12
  10:   48 83 ec 30             sub    $0x30,%rsp
  14:   44                      rex.R
  15:   89                      .byte 0x89
[ 1985.073246] RSP: 002b:00007ffce269b368 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[ 1985.073246] RAX: ffffffffffffffda RBX: 00007ffce269ca10 RCX: 00007fa7c499081a
[ 1985.073246] RDX: 0000000000000040 RSI: 0000558423f7c300 RDI: 0000000000000003
[ 1985.073246] RBP: 0000558423f7c300 R08: 00007ffce269ec90 R09: 0000000000000010
[ 1985.073246] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000040
[ 1985.073246] R13: 00007ffce269c5a8 R14: 00007ffce269b370 R15: 00007ffce269ca10
[ 1985.073246]  </TASK>
[ 1985.073246] Modules linked in:
[ 1985.073246] ---[ end trace 0000000000000000 ]---
[ 1985.073246] RIP: 0010:netif_rx_internal (arch/x86/include/asm/jump_label.h:27)
[ 1985.073246] Code: 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 55 48 89 fd 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 18 31 c0 e9 <c9> 00 00 00 66 90 66 90 48 8d 54 24 10 48 89 ef 65 8b 35 67 48 d0
All code
========
   0:   0f 1f 84 00 00 00 00    nopl   0x0(%rax,%rax,1)
   7:   00
   8:   0f 1f 40 00             nopl   0x0(%rax)
   c:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
  11:   55                      push   %rbp
  12:   48 89 fd                mov    %rdi,%rbp
  15:   48 83 ec 20             sub    $0x20,%rsp
  19:   65 48 8b 04 25 28 00    mov    %gs:0x28,%rax
  20:   00 00
  22:   48 89 44 24 18          mov    %rax,0x18(%rsp)
  27:   31 c0                   xor    %eax,%eax
  29:*  e9 c9 00 00 00          jmp    0xf7             <-- trapping instruction
  2e:   66 90                   xchg   %ax,%ax
  30:   66 90                   xchg   %ax,%ax
  32:   48 8d 54 24 10          lea    0x10(%rsp),%rdx
  37:   48 89 ef                mov    %rbp,%rdi
  3a:   65                      gs
  3b:   8b                      .byte 0x8b
  3c:   35                      .byte 0x35
  3d:   67                      addr32
  3e:   48                      rex.W
  3f:   d0                      .byte 0xd0

Code starting with the faulting instruction
===========================================
   0:   c9                      leave
   1:   00 00                   add    %al,(%rax)
   3:   00 66 90                add    %ah,-0x70(%rsi)
   6:   66 90                   xchg   %ax,%ax
   8:   48 8d 54 24 10          lea    0x10(%rsp),%rdx
   d:   48 89 ef                mov    %rbp,%rdi
  10:   65                      gs
  11:   8b                      .byte 0x8b
  12:   35                      .byte 0x35
  13:   67                      addr32
  14:   48                      rex.W
  15:   d0                      .byte 0xd0
[ 1985.073246] RSP: 0018:ffffb36d40003c08 EFLAGS: 00000246
[ 1985.073246] RAX: 0000000000000000 RBX: ffff9580825ca000 RCX: 0000000000000001
[ 1985.073246] RDX: 0000000000000002 RSI: ffff9580825c8000 RDI: ffff9580821cca00
[ 1985.073246] RBP: ffff9580821cca00 R08: 0000000000000000 R09: 000000000000001c
[ 1985.073246] R10: ffff9580812dcf10 R11: ffff9580812dcf00 R12: ffff9580821cca00
[ 1985.073246] R13: 000000000000002a R14: 0000000000000000 R15: ffff9580825e1800
[ 1985.073246] FS:  00007fa7c46be1c0(0000) GS:ffff9580fdc00000(0000) knlGS:0000000000000000
[ 1985.073246] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1985.073246] CR2: 00005584236b2200 CR3: 0000000002704000 CR4: 00000000000006f0
[ 1985.073246] Kernel panic - not syncing: Fatal exception in interrupt
[ 1985.073246] Kernel Offset: 0x19a00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Unexpected stop of the VM
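
A note on reading the `Code:` line in the dump above, as a minimal sketch. The byte in angle brackets is the one the saved RIP points at, and a `jmp rel32` (`e9` followed by a 4-byte little-endian displacement) lands at its own offset + 5 + displacement. The helper names are hypothetical; only the byte values come from the log.

```shell
#!/bin/bash
# Print the zero-based offset of the <xx> byte in a "Code:" byte string.
code_marker_offset() {
    local i=0 byte
    for byte in $1; do            # intentional word splitting on spaces
        case "$byte" in
            "<"*">") echo "$i"; return 0 ;;
        esac
        i=$((i + 1))
    done
    return 1                      # no marked byte found
}

# Target of a 5-byte "jmp rel32" located at offset $1 with displacement $2.
jmp_rel32_target() {
    printf '0x%x\n' $(( $1 + 5 + $2 ))
}

# Byte string copied from the netif_rx_internal "Code:" line of the log.
code='0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 55 48 89 fd 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 18 31 c0 e9 <c9> 00 00 00 66 90 66 90 48 8d 54 24 10 48 89 ef 65 8b 35 67 48 d0'

code_marker_offset "$code"        # prints 42, i.e. 0x2a
jmp_rel32_target 0x29 0xc9        # prints 0xf7, matching "jmp 0xf7"
```

The marked byte sits at 0x2a, one past the `e9` opcode at 0x29, which is why the "Code starting with the faulting instruction" listing decodes into unrelated instructions such as `leave`. That is what one would expect if a one-byte `int3` had been fetched at 0x29, although the log alone does not prove it.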

What was being done in userspace:

Full version

++ dirname ./mptcp_connect.sh
+ . ./mptcp_lib.sh
++ readonly KSFT_PASS=0
++ KSFT_PASS=0
++ readonly KSFT_FAIL=1
++ KSFT_FAIL=1
++ readonly KSFT_SKIP=4
++ KSFT_SKIP=4
+++ basename ./mptcp_connect.sh
+++ sed 's/\.sh$//g'
++ readonly KSFT_TEST=mptcp_connect
++ KSFT_TEST=mptcp_connect
++ MPTCP_LIB_SUBTESTS=()
++ '[' -t 1 ']'
++ '[' 1 = 1 ']'
++ '[' '' '!=' 1 ']'
++ readonly 'MPTCP_LIB_COLOR_RED=\E[1;31m'
++ MPTCP_LIB_COLOR_RED='\E[1;31m'
++ readonly 'MPTCP_LIB_COLOR_GREEN=\E[1;32m'
++ MPTCP_LIB_COLOR_GREEN='\E[1;32m'
++ readonly 'MPTCP_LIB_COLOR_YELLOW=\E[1;33m'
++ MPTCP_LIB_COLOR_YELLOW='\E[1;33m'
++ readonly 'MPTCP_LIB_COLOR_BLUE=\E[1;34m'
++ MPTCP_LIB_COLOR_BLUE='\E[1;34m'
++ readonly 'MPTCP_LIB_COLOR_RESET=\E[0m'
++ MPTCP_LIB_COLOR_RESET='\E[0m'
++ date +%s
+ time_start=1705765027
+ optstring=S:R:d:e:l:r:h4cm:f:tC
+ ret=0
+ final_ret=0
+ sin=
+ sout=
+ cin_disconnect=
+ cin=
+ cout=
+ ksft_skip=4
+ capture=false
+ timeout_poll=30
+ timeout_test=61
+ ipv6=true
+ ethtool_random_on=true
+ tc_delay=49
+ tc_loss=76
+ testmode=
+ sndbuf=0
+ rcvbuf=0
+ options_log=true
+ do_tcp=0
+ checksum=false
+ filesize=0
+ connect_per_transfer=1
+ '[' 76 -eq 100 ']'
+ '[' 76 -ge 10 ']'
+ tc_loss=0.76%
+ getopts S:R:d:e:l:r:h4cm:f:tC option
++ date +%s
+ sec=1705765027
++ printf %x 1705765027
++ mktemp -u XXXXXX
+ rndh=65abe8a3-O2ZCgd
+ ns1=ns1-65abe8a3-O2ZCgd
+ ns2=ns2-65abe8a3-O2ZCgd
+ ns3=ns3-65abe8a3-O2ZCgd
+ ns4=ns4-65abe8a3-O2ZCgd
+ TEST_COUNT=0
+ TEST_GROUP=
+ mptcp_lib_check_mptcp
+ mptcp_lib_has_file /proc/sys/net/mptcp/enabled
+ local f=/proc/sys/net/mptcp/enabled
+ '[' -f /proc/sys/net/mptcp/enabled ']'
+ return 0
+ mptcp_lib_check_kallsyms
+ mptcp_lib_has_file /proc/kallsyms
+ local f=/proc/kallsyms
+ '[' -f /proc/kallsyms ']'
+ return 0
+ ip -Version
+ '[' 0 -ne 0 ']'
++ mktemp
+ sin=/tmp/tmp.19cAx2Eg8O
++ mktemp
+ sout=/tmp/tmp.D74uSh5z4f
++ mktemp
+ cin=/tmp/tmp.BWY7Jw45jg
++ mktemp
+ cout=/tmp/tmp.YGp9ybpEow
++ mktemp
+ capout=/tmp/tmp.FIKQQYHbaC
+ cin_disconnect=/tmp/tmp.BWY7Jw45jg.disconnect
+ cout_disconnect=/tmp/tmp.YGp9ybpEow.disconnect
+ trap cleanup EXIT
+ for i in "$ns1" "$ns2" "$ns3" "$ns4"
+ ip netns add ns1-65abe8a3-O2ZCgd
+ ip -net ns1-65abe8a3-O2ZCgd link set lo up
+ for i in "$ns1" "$ns2" "$ns3" "$ns4"
+ ip netns add ns2-65abe8a3-O2ZCgd
+ ip -net ns2-65abe8a3-O2ZCgd link set lo up
+ for i in "$ns1" "$ns2" "$ns3" "$ns4"
+ ip netns add ns3-65abe8a3-O2ZCgd
+ ip -net ns3-65abe8a3-O2ZCgd link set lo up
+ for i in "$ns1" "$ns2" "$ns3" "$ns4"
+ ip netns add ns4-65abe8a3-O2ZCgd
+ ip -net ns4-65abe8a3-O2ZCgd link set lo up
+ ip link add ns1eth2 netns ns1-65abe8a3-O2ZCgd type veth peer name ns2eth1 netns ns2-65abe8a3-O2ZCgd
+ ip link add ns2eth3 netns ns2-65abe8a3-O2ZCgd type veth peer name ns3eth2 netns ns3-65abe8a3-O2ZCgd
+ ip link add ns3eth4 netns ns3-65abe8a3-O2ZCgd type veth peer name ns4eth3 netns ns4-65abe8a3-O2ZCgd
+ ip -net ns1-65abe8a3-O2ZCgd addr add 10.0.1.1/24 dev ns1eth2
+ ip -net ns1-65abe8a3-O2ZCgd addr add dead:beef:1::1/64 dev ns1eth2 nodad
+ ip -net ns1-65abe8a3-O2ZCgd link set ns1eth2 up
+ ip -net ns1-65abe8a3-O2ZCgd route add default via 10.0.1.2
+ ip -net ns1-65abe8a3-O2ZCgd route add default via dead:beef:1::2
+ ip -net ns2-65abe8a3-O2ZCgd addr add 10.0.1.2/24 dev ns2eth1
+ ip -net ns2-65abe8a3-O2ZCgd addr add dead:beef:1::2/64 dev ns2eth1 nodad
+ ip -net ns2-65abe8a3-O2ZCgd link set ns2eth1 up
+ ip -net ns2-65abe8a3-O2ZCgd addr add 10.0.2.1/24 dev ns2eth3
+ ip -net ns2-65abe8a3-O2ZCgd addr add dead:beef:2::1/64 dev ns2eth3 nodad
+ ip -net ns2-65abe8a3-O2ZCgd link set ns2eth3 up
+ ip -net ns2-65abe8a3-O2ZCgd route add default via 10.0.2.2
+ ip -net ns2-65abe8a3-O2ZCgd route add default via dead:beef:2::2
+ ip netns exec ns2-65abe8a3-O2ZCgd sysctl -q net.ipv4.ip_forward=1
+ ip netns exec ns2-65abe8a3-O2ZCgd sysctl -q net.ipv6.conf.all.forwarding=1
+ ip -net ns3-65abe8a3-O2ZCgd addr add 10.0.2.2/24 dev ns3eth2
+ ip -net ns3-65abe8a3-O2ZCgd addr add dead:beef:2::2/64 dev ns3eth2 nodad
+ ip -net ns3-65abe8a3-O2ZCgd link set ns3eth2 up
+ ip -net ns3-65abe8a3-O2ZCgd addr add 10.0.3.2/24 dev ns3eth4
+ ip -net ns3-65abe8a3-O2ZCgd addr add dead:beef:3::2/64 dev ns3eth4 nodad
+ ip -net ns3-65abe8a3-O2ZCgd link set ns3eth4 up
+ ip -net ns3-65abe8a3-O2ZCgd route add default via 10.0.2.1
+ ip -net ns3-65abe8a3-O2ZCgd route add default via dead:beef:2::1
+ ip netns exec ns3-65abe8a3-O2ZCgd sysctl -q net.ipv4.ip_forward=1
+ ip netns exec ns3-65abe8a3-O2ZCgd sysctl -q net.ipv6.conf.all.forwarding=1
+ ip -net ns4-65abe8a3-O2ZCgd addr add 10.0.3.1/24 dev ns4eth3
+ ip -net ns4-65abe8a3-O2ZCgd addr add dead:beef:3::1/64 dev ns4eth3 nodad
+ ip -net ns4-65abe8a3-O2ZCgd link set ns4eth3 up
+ ip -net ns4-65abe8a3-O2ZCgd route add default via 10.0.3.2
+ ip -net ns4-65abe8a3-O2ZCgd route add default via dead:beef:3::2
+ false
+ true
+ set_random_ethtool_flags ns3-65abe8a3-O2ZCgd ns3eth2
+ local flags=
+ local r=25968
+ local pick1=0
+ local pick2=0
+ local pick3=0
+ '[' 0 -ne 0 ']'
+ '[' 0 -ne 0 ']'
+ '[' 0 -ne 0 ']'
+ '[' -z '' ']'
+ return
+ set_random_ethtool_flags ns4-65abe8a3-O2ZCgd ns4eth3
+ local flags=
+ local r=32681
+ local pick1=1
+ local pick2=0
+ local pick3=0
+ '[' 1 -ne 0 ']'
+ flags='tso off'
+ '[' 0 -ne 0 ']'
+ '[' 0 -ne 0 ']'
+ '[' -z 'tso off' ']'
+ set_ethtool_flags ns4-65abe8a3-O2ZCgd ns4eth3 'tso off'
+ local ns=ns4-65abe8a3-O2ZCgd
+ local dev=ns4eth3
+ local 'flags=tso off'
+ ip netns exec ns4-65abe8a3-O2ZCgd ethtool -K ns4eth3 tso off
+ '[' 0 -eq 0 ']'
+ echo 'INFO: set ns4-65abe8a3-O2ZCgd dev ns4eth3: ethtool -K tso off'
+ make_file /tmp/tmp.BWY7Jw45jg client
+ local name=/tmp/tmp.BWY7Jw45jg
+ local who=client
+ local SIZE=0
+ local ksize
+ local rem
+ '[' 0 -eq 0 ']'
+ local MAXSIZE=8388608
+ local MINSIZE=262144
+ SIZE=1924196
+ ksize=1879
+ rem=100
+ mptcp_lib_make_file /tmp/tmp.BWY7Jw45jg 1024 1879
+ local name=/tmp/tmp.BWY7Jw45jg
+ local bs=1024
+ local size=1879
+ dd if=/dev/urandom of=/tmp/tmp.BWY7Jw45jg bs=1024 count=1879
+ echo -e '\nMPTCP_TEST_FILE_END_MARKER'
+ dd if=/dev/urandom conv=notrunc of=/tmp/tmp.BWY7Jw45jg oflag=append bs=1 count=100
++ du -b /tmp/tmp.BWY7Jw45jg
+ echo 'Created /tmp/tmp.BWY7Jw45jg (size 1924224	/tmp/tmp.BWY7Jw45jg) containing data sent by client'
+ make_file /tmp/tmp.19cAx2Eg8O server
+ local name=/tmp/tmp.19cAx2Eg8O
+ local who=server
+ local SIZE=0
+ local ksize
+ local rem
+ '[' 0 -eq 0 ']'
+ local MAXSIZE=8388608
+ local MINSIZE=262144
+ SIZE=2428261
+ ksize=2371
+ rem=357
+ mptcp_lib_make_file /tmp/tmp.19cAx2Eg8O 1024 2371
+ local name=/tmp/tmp.19cAx2Eg8O
+ local bs=1024
+ local size=2371
+ dd if=/dev/urandom of=/tmp/tmp.19cAx2Eg8O bs=1024 count=2371
+ echo -e '\nMPTCP_TEST_FILE_END_MARKER'
+ dd if=/dev/urandom conv=notrunc of=/tmp/tmp.19cAx2Eg8O oflag=append bs=1 count=357
++ du -b /tmp/tmp.19cAx2Eg8O
+ echo 'Created /tmp/tmp.19cAx2Eg8O (size 2428289	/tmp/tmp.19cAx2Eg8O) containing data sent by server'
+ check_mptcp_disabled
+ local disabled_ns=ns_disabled-65abe8a3-O2ZCgd
+ ip netns add ns_disabled-65abe8a3-O2ZCgd
++ ip netns exec ns_disabled-65abe8a3-O2ZCgd sysctl net.mptcp.enabled
++ awk '{ print $3 }'
+ '[' 1 -ne 1 ']'
+ ip netns exec ns_disabled-65abe8a3-O2ZCgd sysctl -q net.mptcp.enabled=0
+ local err=0
+ grep -q '^socket: Protocol not available$'
+ LC_ALL=C
+ ip netns exec ns_disabled-65abe8a3-O2ZCgd ./mptcp_connect -p 10000 -s MPTCP 127.0.0.1
+ err=1
+ ip netns delete ns_disabled-65abe8a3-O2ZCgd
+ '[' 1 -eq 0 ']'
+ echo -e 'New MPTCP socket can be blocked via sysctl\t\t[ OK ]'
+ mptcp_lib_result_pass 'New MPTCP socket can be blocked via sysctl'
+ __mptcp_lib_result_add ok 'New MPTCP socket can be blocked via sysctl'
+ local result=ok
+ shift
+ local id=1
+ MPTCP_LIB_SUBTESTS+=("${result} ${id} - ${KSFT_TEST}: ${*}")
+ return 0
+ stop_if_error 'The kernel configuration is not valid for MPTCP'
+ log_if_error 'The kernel configuration is not valid for MPTCP'
+ local 'msg=The kernel configuration is not valid for MPTCP'
+ '[' 0 -ne 0 ']'
+ echo 'INFO: validating network environment with pings'
+ for sender in "$ns1" "$ns2" "$ns3" "$ns4"
+ do_ping ns1-65abe8a3-O2ZCgd ns1-65abe8a3-O2ZCgd 10.0.1.1
+ local listener_ns=ns1-65abe8a3-O2ZCgd
+ local connector_ns=ns1-65abe8a3-O2ZCgd
+ local connect_addr=10.0.1.1
+ local 'ping_args=-q -c 1'
+ local rc=0
+ mptcp_lib_is_v6 10.0.1.1
+ '[' -z 10.0.1.1 ']'
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 10.0.1.1
+ '[' 0 -ne 0 ']'
+ return 0
+ do_ping ns1-65abe8a3-O2ZCgd ns1-65abe8a3-O2ZCgd dead:beef:1::1
+ local listener_ns=ns1-65abe8a3-O2ZCgd
+ local connector_ns=ns1-65abe8a3-O2ZCgd
+ local connect_addr=dead:beef:1::1
+ local 'ping_args=-q -c 1'
+ local rc=0
+ mptcp_lib_is_v6 dead:beef:1::1
+ '[' -z '' ']'
+ true
+ ping_args='-q -c 1 -6'
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 -6 dead:beef:1::1
+ '[' 0 -ne 0 ']'
+ return 0
+ do_ping ns2-65abe8a3-O2ZCgd ns1-65abe8a3-O2ZCgd 10.0.1.2
+ local listener_ns=ns2-65abe8a3-O2ZCgd
+ local connector_ns=ns1-65abe8a3-O2ZCgd
+ local connect_addr=10.0.1.2
+ local 'ping_args=-q -c 1'
+ local rc=0
+ mptcp_lib_is_v6 10.0.1.2
+ '[' -z 10.0.1.2 ']'
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 10.0.1.2
+ '[' 0 -ne 0 ']'
+ return 0
+ do_ping ns2-65abe8a3-O2ZCgd ns1-65abe8a3-O2ZCgd dead:beef:1::2
+ local listener_ns=ns2-65abe8a3-O2ZCgd
+ local connector_ns=ns1-65abe8a3-O2ZCgd
+ local connect_addr=dead:beef:1::2
+ local 'ping_args=-q -c 1'
+ local rc=0
+ mptcp_lib_is_v6 dead:beef:1::2
+ '[' -z '' ']'
+ true
+ ping_args='-q -c 1 -6'
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 -6 dead:beef:1::2
+ '[' 0 -ne 0 ']'
+ return 0
+ do_ping ns2-65abe8a3-O2ZCgd ns1-65abe8a3-O2ZCgd 10.0.2.1
+ local listener_ns=ns2-65abe8a3-O2ZCgd
+ local connector_ns=ns1-65abe8a3-O2ZCgd
+ local connect_addr=10.0.2.1
+ local 'ping_args=-q -c 1'
+ local rc=0
+ mptcp_lib_is_v6 10.0.2.1
+ '[' -z 10.0.2.1 ']'
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 10.0.2.1
+ '[' 0 -ne 0 ']'
+ return 0
+ do_ping ns2-65abe8a3-O2ZCgd ns1-65abe8a3-O2ZCgd dead:beef:2::1
+ local listener_ns=ns2-65abe8a3-O2ZCgd
+ local connector_ns=ns1-65abe8a3-O2ZCgd
+ local connect_addr=dead:beef:2::1
+ local 'ping_args=-q -c 1'
+ local rc=0
+ mptcp_lib_is_v6 dead:beef:2::1
+ '[' -z '' ']'
+ true
+ ping_args='-q -c 1 -6'
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 -6 dead:beef:2::1
+ '[' 0 -ne 0 ']'
+ return 0
+ do_ping ns3-65abe8a3-O2ZCgd ns1-65abe8a3-O2ZCgd 10.0.2.2
+ local listener_ns=ns3-65abe8a3-O2ZCgd
+ local connector_ns=ns1-65abe8a3-O2ZCgd
+ local connect_addr=10.0.2.2
+ local 'ping_args=-q -c 1'
+ local rc=0
+ mptcp_lib_is_v6 10.0.2.2
+ '[' -z 10.0.2.2 ']'
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 10.0.2.2
+ '[' 0 -ne 0 ']'
+ return 0
+ do_ping ns3-65abe8a3-O2ZCgd ns1-65abe8a3-O2ZCgd dead:beef:2::2
+ local listener_ns=ns3-65abe8a3-O2ZCgd
+ local connector_ns=ns1-65abe8a3-O2ZCgd
+ local connect_addr=dead:beef:2::2
+ local 'ping_args=-q -c 1'
+ local rc=0
+ mptcp_lib_is_v6 dead:beef:2::2
+ '[' -z '' ']'
+ true
+ ping_args='-q -c 1 -6'
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 -6 dead:beef:2::2
+ '[' 0 -ne 0 ']'
+ return 0
+ do_ping ns3-65abe8a3-O2ZCgd ns1-65abe8a3-O2ZCgd 10.0.3.2
+ local listener_ns=ns3-65abe8a3-O2ZCgd
+ local connector_ns=ns1-65abe8a3-O2ZCgd
+ local connect_addr=10.0.3.2
+ local 'ping_args=-q -c 1'
+ local rc=0
+ mptcp_lib_is_v6 10.0.3.2
+ '[' -z 10.0.3.2 ']'
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 10.0.3.2
+ '[' 0 -ne 0 ']'
+ return 0
+ do_ping ns3-65abe8a3-O2ZCgd ns1-65abe8a3-O2ZCgd dead:beef:3::2
+ local listener_ns=ns3-65abe8a3-O2ZCgd
+ local connector_ns=ns1-65abe8a3-O2ZCgd
+ local connect_addr=dead:beef:3::2
+ local 'ping_args=-q -c 1'
+ local rc=0
+ mptcp_lib_is_v6 dead:beef:3::2
+ '[' -z '' ']'
+ true
+ ping_args='-q -c 1 -6'
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 -6 dead:beef:3::2
+ '[' 0 -ne 0 ']'
+ return 0
+ do_ping ns4-65abe8a3-O2ZCgd ns1-65abe8a3-O2ZCgd 10.0.3.1
+ local listener_ns=ns4-65abe8a3-O2ZCgd
+ local connector_ns=ns1-65abe8a3-O2ZCgd
+ local connect_addr=10.0.3.1
+ local 'ping_args=-q -c 1'
+ local rc=0
+ mptcp_lib_is_v6 10.0.3.1
+ '[' -z 10.0.3.1 ']'
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 10.0.3.1

Here is the stripped version to ease the reading:

+ ip netns add ns1-65abe8a3-O2ZCgd
+ ip -net ns1-65abe8a3-O2ZCgd link set lo up
+ ip netns add ns2-65abe8a3-O2ZCgd
+ ip -net ns2-65abe8a3-O2ZCgd link set lo up
+ ip netns add ns3-65abe8a3-O2ZCgd
+ ip -net ns3-65abe8a3-O2ZCgd link set lo up
+ ip netns add ns4-65abe8a3-O2ZCgd
+ ip -net ns4-65abe8a3-O2ZCgd link set lo up

+ ip link add ns1eth2 netns ns1-65abe8a3-O2ZCgd type veth peer name ns2eth1 netns ns2-65abe8a3-O2ZCgd
+ ip link add ns2eth3 netns ns2-65abe8a3-O2ZCgd type veth peer name ns3eth2 netns ns3-65abe8a3-O2ZCgd
+ ip link add ns3eth4 netns ns3-65abe8a3-O2ZCgd type veth peer name ns4eth3 netns ns4-65abe8a3-O2ZCgd
+ ip -net ns1-65abe8a3-O2ZCgd addr add 10.0.1.1/24 dev ns1eth2
+ ip -net ns1-65abe8a3-O2ZCgd addr add dead:beef:1::1/64 dev ns1eth2 nodad
+ ip -net ns1-65abe8a3-O2ZCgd link set ns1eth2 up
+ ip -net ns1-65abe8a3-O2ZCgd route add default via 10.0.1.2
+ ip -net ns1-65abe8a3-O2ZCgd route add default via dead:beef:1::2
+ ip -net ns2-65abe8a3-O2ZCgd addr add 10.0.1.2/24 dev ns2eth1
+ ip -net ns2-65abe8a3-O2ZCgd addr add dead:beef:1::2/64 dev ns2eth1 nodad
+ ip -net ns2-65abe8a3-O2ZCgd link set ns2eth1 up
+ ip -net ns2-65abe8a3-O2ZCgd addr add 10.0.2.1/24 dev ns2eth3
+ ip -net ns2-65abe8a3-O2ZCgd addr add dead:beef:2::1/64 dev ns2eth3 nodad
+ ip -net ns2-65abe8a3-O2ZCgd link set ns2eth3 up
+ ip -net ns2-65abe8a3-O2ZCgd route add default via 10.0.2.2
+ ip -net ns2-65abe8a3-O2ZCgd route add default via dead:beef:2::2
+ ip netns exec ns2-65abe8a3-O2ZCgd sysctl -q net.ipv4.ip_forward=1
+ ip netns exec ns2-65abe8a3-O2ZCgd sysctl -q net.ipv6.conf.all.forwarding=1
+ ip -net ns3-65abe8a3-O2ZCgd addr add 10.0.2.2/24 dev ns3eth2
+ ip -net ns3-65abe8a3-O2ZCgd addr add dead:beef:2::2/64 dev ns3eth2 nodad
+ ip -net ns3-65abe8a3-O2ZCgd link set ns3eth2 up
+ ip -net ns3-65abe8a3-O2ZCgd addr add 10.0.3.2/24 dev ns3eth4
+ ip -net ns3-65abe8a3-O2ZCgd addr add dead:beef:3::2/64 dev ns3eth4 nodad
+ ip -net ns3-65abe8a3-O2ZCgd link set ns3eth4 up
+ ip -net ns3-65abe8a3-O2ZCgd route add default via 10.0.2.1
+ ip -net ns3-65abe8a3-O2ZCgd route add default via dead:beef:2::1
+ ip netns exec ns3-65abe8a3-O2ZCgd sysctl -q net.ipv4.ip_forward=1
+ ip netns exec ns3-65abe8a3-O2ZCgd sysctl -q net.ipv6.conf.all.forwarding=1
+ ip -net ns4-65abe8a3-O2ZCgd addr add 10.0.3.1/24 dev ns4eth3
+ ip -net ns4-65abe8a3-O2ZCgd addr add dead:beef:3::1/64 dev ns4eth3 nodad
+ ip -net ns4-65abe8a3-O2ZCgd link set ns4eth3 up
+ ip -net ns4-65abe8a3-O2ZCgd route add default via 10.0.3.2
+ ip -net ns4-65abe8a3-O2ZCgd route add default via dead:beef:3::2
+ ip netns exec ns4-65abe8a3-O2ZCgd ethtool -K ns4eth3 tso off

+ dd if=/dev/urandom of=/tmp/tmp.BWY7Jw45jg bs=1024 count=1879
+ dd if=/dev/urandom conv=notrunc of=/tmp/tmp.BWY7Jw45jg oflag=append bs=1 count=100
+ dd if=/dev/urandom of=/tmp/tmp.19cAx2Eg8O bs=1024 count=2371
+ dd if=/dev/urandom conv=notrunc of=/tmp/tmp.19cAx2Eg8O oflag=append bs=1 count=357

+ ip netns add ns_disabled-65abe8a3-O2ZCgd
++ ip netns exec ns_disabled-65abe8a3-O2ZCgd sysctl net.mptcp.enabled | awk '{ print $3 }'
+ '[' 1 -ne 1 ']'
+ ip netns exec ns_disabled-65abe8a3-O2ZCgd sysctl -q net.mptcp.enabled=0
+ local err=0
+ ip netns exec ns_disabled-65abe8a3-O2ZCgd ./mptcp_connect -p 10000 -s MPTCP 127.0.0.1 | grep -q '^socket: Protocol not available$'
+ err=1
+ ip netns delete ns_disabled-65abe8a3-O2ZCgd
+ '[' 1 -eq 0 ']'
New MPTCP socket can be blocked via sysctl		[ OK ]

INFO: validating network environment with pings
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 10.0.1.1
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 -6 dead:beef:1::1
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 10.0.1.2
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 -6 dead:beef:1::2
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 10.0.2.1
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 -6 dead:beef:2::1
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 10.0.2.2
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 -6 dead:beef:2::2
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 10.0.3.2
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 -6 dead:beef:3::2
+ ip netns exec ns1-65abe8a3-O2ZCgd ping -q -c 1 10.0.3.1
<crash>

Kernel config:
config.gz
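
Two details of the trace above can be cross-checked offline. The sketch below is not taken from the selftests: `is_v6` is inferred from the `'[' -z '' ']'` test that precedes every `-6` ping, and `expected_file_size` assumes the marker line printed by `echo -e` is appended to the file (the trace does not show redirections). All three function names are made up for illustration.

```shell
# Hypothetical helper: true when the address contains a ':' (IPv6).
# Stripping the longest prefix matching '*:*' leaves an empty string
# only if a colon is present, which matches the '[ -z ... ]' test in the trace.
is_v6() {
    [ -z "${1##*:*}" ]
}

# Hypothetical helper: ping arguments for one address, as seen in the trace.
ping_args_for() {
    if is_v6 "$1"; then
        echo "-q -c 1 -6"
    else
        echo "-q -c 1"
    fi
}

# Hypothetical helper: expected size in bytes of a test file made of
# "$1" blocks of 1024 bytes, the marker line, then "$2" extra bytes.
# Assumption: the marker is written as "\n<marker>\n" (echo -e adds the
# trailing newline), hence the "+ 2".
expected_file_size() {
    marker='MPTCP_TEST_FILE_END_MARKER'
    echo $(( $1 * 1024 + ${#marker} + 2 + $2 ))
}
```

With the values from the trace, `expected_file_size 2371 357` gives 2428289, which matches the size reported by `du -b` for the file containing data sent by the server.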

Member Author

matttbe commented Jan 20, 2024

The issue has been reported to the x86 mailing list: Lore

matttbe pushed a commit that referenced this issue Jan 21, 2024
INT3 is used not only for software breakpoints but also for self-modifying
code in the x86 kernel, for example by jump_label and the function tracer.
Those users may fail to handle an INT3 that fires after they have removed
it, if the CPUs have not been synchronized long enough. Such a 'ghost' INT3
is then handled by no one, because every user believes it has already been
removed. Recheck whether there is still an INT3 at the exception address
and, if not, ignore the exception.

Note that kprobes previously did the same thing by itself, but that is
not a good place for it because INT3 is commonly used. Do it in the
common place so that all 'ghost' INT3s are handled.

Reported-by: Matthieu Baerts <matttbe@kernel.org>
Closes: https://lore.kernel.org/all/06cb540e-34ff-4dcd-b936-19d4d14378c9@kernel.org/
Closes: #471
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Fixes: 8e791f7 ("x86/kprobes: Drop removed INT3 handling code")
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
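
To make the "recheck" idea concrete, here is a small offline sketch (not the kernel patch itself) that applies the same test to the `Code:` line of an oops. It assumes the byte printed in `<..>` is the one at the saved RIP, which for an INT3 trap is the byte right after the `cc` opcode. `classify_int3` and its output strings are hypothetical names.

```shell
# Hypothetical helper: classify the "Code:" line of an int3 oops.
# Prints "int3-present" if the byte just before the <..> marker is cc,
# "ghost-int3" if it is anything else, "no-marker" if no <..> byte is found.
classify_int3() {
    prev=""
    for byte in $1; do
        case $byte in
            "<"*">")
                # The marked byte sits at RIP; the trap came from RIP - 1.
                if [ "$prev" = "cc" ]; then
                    echo "int3-present"
                else
                    echo "ghost-int3"
                fi
                return 0
                ;;
        esac
        prev=$byte
    done
    echo "no-marker"
    return 1
}
```

For the first report in this issue, the bytes around the marker are `e9 <13> 08 00 00`, so the function prints `ghost-int3`. This suggests the INT3 had already been replaced by the jump when the oops was dumped.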
matttbe pushed a commit that referenced this issue Jan 23, 2024
matttbe pushed a commit that referenced this issue Jan 23, 2024
matttbe pushed a commit that referenced this issue Jan 26, 2024
matttbe pushed a commit that referenced this issue Jan 26, 2024
matttbe pushed a commit that referenced this issue Jan 30, 2024
matttbe pushed a commit that referenced this issue Jan 30, 2024
matttbe pushed a commit that referenced this issue Jan 30, 2024
matttbe pushed a commit that referenced this issue Jan 30, 2024
matttbe pushed a commit that referenced this issue Jan 30, 2024
matttbe pushed a commit that referenced this issue Jan 30, 2024
matttbe pushed a commit that referenced this issue Jan 30, 2024
matttbe pushed a commit that referenced this issue Jan 30, 2024
matttbe pushed a commit that referenced this issue Jan 30, 2024
matttbe pushed a commit that referenced this issue Jan 30, 2024
Member Author

matttbe commented Jan 30, 2024

Short summary of the discussion we had on lore:

  • The code dropped by the commit I identified was masking a bug
  • This bug is in QEMU v6.2.0 (Ubuntu 22.04):

    "Intel SDM Vol3A, 9.1.3 Handling Self- and Cross-Modifying Code" says that, for cross-modifying code, what the other CPU needs to do is "Execute serializing instruction; (* For example, CPUID instruction *)". That is done in do_sync_core(), so this bug should not happen.

  • A workaround has been shared and applied in our tree: 0a9890a ("x86: Fixup from the removed INT3 if it is unhandled")
  • It looks like this bug is not visible with QEMU 8.2

Next steps are:

  • upgrade QEMU to a supported version (e.g. from Ubuntu 23.10)
  • revert the workaround
  • report the bug to Launchpad (LP)

matttbe added a commit to multipath-tcp/mptcp-upstream-virtme-docker that referenced this issue Jan 30, 2024
This is helpful for two things:

- There is a bug in the QEMU version we use, causing some issues when
  KVM is not used, see [1] [2].

- BPF selftests need a more recent version of the compiler. This is
  needed for [3].

Link: multipath-tcp/mptcp_net-next#471 [1]
Link: https://lore.kernel.org/all/06cb540e-34ff-4dcd-b936-19d4d14378c9@kernel.org/T/ [2]
Link: multipath-tcp/mptcp_net-next#406 [3]
Co-developed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
matttbe pushed a commit that referenced this issue Jan 31, 2024
matttbe pushed a commit that referenced this issue Jan 31, 2024
matttbe pushed a commit that referenced this issue Jan 31, 2024
matttbe pushed a commit that referenced this issue Jan 31, 2024
Member Author

matttbe commented Feb 1, 2024

Reopening: without the workaround (kernel patch), we still hit the issue with QEMU 8.0.4, the version installed in the Docker image the CI uses to run the tests.

Next steps: try to identify the fix on the QEMU side and have it backported (or upgrade QEMU manually?)
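
Until the QEMU fix is identified, the CI could at least warn when it runs on a version already seen failing here. The sketch below is only a guess at such a guard. The 8.2 threshold comes solely from the observations in this thread (6.2.0 and 8.0.4 affected, 8.2 apparently not), and `qemu_at_least_8_2` is a hypothetical helper expecting a `MAJOR.MINOR[.PATCH]` string.

```shell
# Hypothetical helper: succeeds if the given QEMU version is >= 8.2.
# Assumption: 8.2 is the first version where the bug is not visible;
# this is an observation from this thread, not a confirmed fix version.
qemu_at_least_8_2() {
    ver=${1%%[!0-9.]*}   # drop any suffix that is not digits or dots
    major=${ver%%.*}     # text before the first dot
    rest=${ver#*.}       # text after the first dot
    minor=${rest%%.*}    # text up to the next dot
    [ "$major" -gt 8 ] || { [ "$major" -eq 8 ] && [ "$minor" -ge 2 ]; }
}
```

The version string could be taken from the first line of `qemu-system-x86_64 --version`, e.g. with `awk 'NR == 1 { print $4 }'`, assuming the usual `QEMU emulator version X.Y.Z` format.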

matttbe reopened this Feb 1, 2024
matttbe self-assigned this Feb 1, 2024
Member Author

matttbe commented Feb 15, 2024

Note that we just hit the issue again, even with the kernel patch applied as a workaround:

+ ./mptcp_connect.sh -m mmap
+ tee /github/workspace/mptcp_connect_mmap.tap.tmp
+ /github/workspace/tools/testing/selftests/kselftest/prefix.pl
# INFO: set ns4-65ce3396-YqN01V dev ns4eth3: ethtool -K  gso off gro off
# Created /tmp/tmp.A4VMEbiPE9 (size 5537555	/tmp/tmp.A4VMEbiPE9) containing data sent by client
# Created /tmp/tmp.nXZrvFIw8k (size 4784112	/tmp/tmp.nXZrvFIw8k) containing data sent by server
# New MPTCP socket can be blocked via sysctl		[ OK ]
# INFO: validating network environment with pings
[ 1620.690258] int3: 0000 [#1] PREEMPT SMP NOPTI
[ 1620.690586] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G                 N 6.8.0-rc3-g39cb90ad6cf5 #1
[ 1620.690586] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 1620.690586] RIP: 0010:netif_rx_internal (arch/x86/include/asm/jump_label.h:27) 
[ 1620.690586] Code: e9 fd fe ff ff 0f 1f 80 00 00 00 00 0f 1f 44 00 00 53 48 89 fb 48 83 ec 18 65 48 8b 04 25 28 00 00 00 48 89 44 24 10 31 c0 cc <dc> 00 00 00 0f 1f 44 00 00 66 90 c7 44 24 08 00 00 00 00 48 89 df
All code
========
   0:	e9 fd fe ff ff       	jmp    0xffffffffffffff02
   5:	0f 1f 80 00 00 00 00 	nopl   0x0(%rax)
   c:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  11:	53                   	push   %rbx
  12:	48 89 fb             	mov    %rdi,%rbx
  15:	48 83 ec 18          	sub    $0x18,%rsp
  19:	65 48 8b 04 25 28 00 	mov    %gs:0x28,%rax
  20:	00 00 
  22:	48 89 44 24 10       	mov    %rax,0x10(%rsp)
  27:	31 c0                	xor    %eax,%eax
  29:	cc                   	int3
  2a:*	dc 00                	faddl  (%rax)		<-- trapping instruction
  2c:	00 00                	add    %al,(%rax)
  2e:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  33:	66 90                	xchg   %ax,%ax
  35:	c7 44 24 08 00 00 00 	movl   $0x0,0x8(%rsp)
  3c:	00 
  3d:	48 89 df             	mov    %rbx,%rdi

Code starting with the faulting instruction
===========================================
   0:	dc 00                	faddl  (%rax)
   2:	00 00                	add    %al,(%rax)
   4:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
   9:	66 90                	xchg   %ax,%ax
   b:	c7 44 24 08 00 00 00 	movl   $0x0,0x8(%rsp)
  12:	00 
  13:	48 89 df             	mov    %rbx,%rdi
[ 1620.690586] RSP: 0018:ffffae3bc011cbf0 EFLAGS: 00000246
[ 1620.690586] RAX: 0000000000000000 RBX: ffff998843b31800 RCX: 0000000000000002
[ 1620.690586] RDX: 0000000000000002 RSI: ffff998842c45000 RDI: ffff998843b31800
[ 1620.690586] RBP: ffff998842c44000 R08: ffff998844149a00 R09: 0000000000000000
[ 1620.690586] R10: ffff998844538980 R11: ffff9988419a7238 R12: 0000000000000000
[ 1620.690586] R13: 0000000000000046 R14: 0000000000000000 R15: ffff9988446ba000
[ 1620.690586] FS:  0000000000000000(0000) GS:ffff9988bdd00000(0000) knlGS:0000000000000000
[ 1620.690586] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1620.690586] CR2: 000055bd409dfca0 CR3: 00000000025fa000 CR4: 00000000000006f0
[ 1620.690586] Call Trace:
[ 1620.690586]  <IRQ>
[ 1620.690586] ? die (arch/x86/kernel/dumpstack.c:421) 
[ 1620.690586] ? exc_int3 (arch/x86/kernel/traps.c:781) 
[ 1620.690586] ? asm_exc_int3 (arch/x86/include/asm/idtentry.h:569) 
[ 1620.690586] ? netif_rx_internal (arch/x86/include/asm/jump_label.h:27) 
[ 1620.690586] ? netif_rx_internal (arch/x86/include/asm/jump_label.h:27) 
[ 1620.690586] __netif_rx (net/core/dev.c:5092) 
[ 1620.690586] veth_xmit (drivers/net/veth.c:374 (discriminator 2)) 
[ 1620.690586] dev_hard_start_xmit (include/linux/netdevice.h:4989) 
[ 1620.690586] __dev_queue_xmit (include/linux/netdevice.h:3367 (discriminator 25)) 
[ 1620.690586] ip6_finish_output2 (include/linux/netdevice.h:3171) 
[ 1620.690586] ? ip6_output (include/linux/netfilter.h:301 (discriminator 1)) 
[ 1620.690586] ? ip6_mtu (net/ipv6/route.c:3217) 
[ 1620.690586] ndisc_send_skb (net/ipv6/ndisc.c:512) 
[ 1620.690586] addrconf_rs_timer (net/ipv6/addrconf.c:4000) 
[ 1620.690586] ? ipv6_get_lladdr (net/ipv6/addrconf.c:3976) 
[ 1620.690586] call_timer_fn (arch/x86/include/asm/jump_label.h:27) 
[ 1620.690586] ? ipv6_get_lladdr (net/ipv6/addrconf.c:3976) 
[ 1620.690586] __run_timers (kernel/time/timer.c:1752) 
[ 1620.690586] run_timer_softirq (kernel/time/timer.c:2053 (discriminator 1)) 
[ 1620.690586] __do_softirq (arch/x86/include/asm/jump_label.h:27) 
[ 1620.690586] irq_exit_rcu (kernel/softirq.c:427) 
[ 1620.690586] sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1076 (discriminator 47)) 
[ 1620.690586]  </IRQ>
[ 1620.690586]  <TASK>
[ 1620.690586] asm_sysvec_apic_timer_interrupt (arch/x86/include/asm/idtentry.h:649) 
[ 1620.690586] RIP: 0010:default_idle (arch/x86/include/asm/irqflags.h:37) 
[ 1620.690586] Code: 89 07 49 c7 c0 08 00 00 00 4d 29 c8 4c 01 c7 4c 29 c2 e9 76 ff ff ff cc cc cc cc f3 0f 1e fa eb 07 0f 00 2d d3 54 37 00 fb f4 <fa> c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 f3 0f 1e fa 65
All code
========
   0:	89 07                	mov    %eax,(%rdi)
   2:	49 c7 c0 08 00 00 00 	mov    $0x8,%r8
   9:	4d 29 c8             	sub    %r9,%r8
   c:	4c 01 c7             	add    %r8,%rdi
   f:	4c 29 c2             	sub    %r8,%rdx
  12:	e9 76 ff ff ff       	jmp    0xffffffffffffff8d
  17:	cc                   	int3
  18:	cc                   	int3
  19:	cc                   	int3
  1a:	cc                   	int3
  1b:	f3 0f 1e fa          	endbr64
  1f:	eb 07                	jmp    0x28
  21:	0f 00 2d d3 54 37 00 	verw   0x3754d3(%rip)        # 0x3754fb
  28:	fb                   	sti
  29:	f4                   	hlt
  2a:*	fa                   	cli		<-- trapping instruction
  2b:	c3                   	ret
  2c:	66 66 2e 0f 1f 84 00 	data16 cs nopw 0x0(%rax,%rax,1)
  33:	00 00 00 00 
  37:	0f 1f 40 00          	nopl   0x0(%rax)
  3b:	f3 0f 1e fa          	endbr64
  3f:	65                   	gs

Code starting with the faulting instruction
===========================================
   0:	fa                   	cli
   1:	c3                   	ret
   2:	66 66 2e 0f 1f 84 00 	data16 cs nopw 0x0(%rax,%rax,1)
   9:	00 00 00 00 
   d:	0f 1f 40 00          	nopl   0x0(%rax)
  11:	f3 0f 1e fa          	endbr64
  15:	65                   	gs
[ 1620.690586] RSP: 0018:ffffae3bc011cbf0 EFLAGS: 00000246
[ 1620.690586] RAX: 0000000000000000 RBX: ffff998843b31800 RCX: 0000000000000002
[ 1620.690586] RDX: 0000000000000002 RSI: ffff998842c45000 RDI: ffff998843b31800
[ 1620.690586] RBP: ffff998842c44000 R08: ffff998844149a00 R09: 0000000000000000
[ 1620.690586] R10: ffff998844538980 R11: ffff9988419a7238 R12: 0000000000000000
[ 1620.690586] R13: 0000000000000046 R14: 0000000000000000 R15: ffff9988446ba000
[ 1620.690586] FS:  0000000000000000(0000) GS:ffff9988bdd00000(0000) knlGS:0000000000000000
[ 1620.690586] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1620.690586] CR2: 000055bd409dfca0 CR3: 00000000025fa000 CR4: 00000000000006f0
[ 1620.690586] Kernel panic - not syncing: Fatal exception in interrupt
[ 1620.690586] Kernel Offset: 0x1fe00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Unexpected stop of the VM

https://github.com/multipath-tcp/mptcp_net-next/actions/runs/7918253003/job/21616220869
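
For anyone reading the `Code:` lines in the trace above, here is a minimal sketch of how the marked byte can be located. It assumes the usual oops layout, where the byte at the reported RIP is wrapped in `<..>`. The helper name `find_trap` is made up for illustration and is not part of any kernel tooling. It returns the offset of the marked byte and whether the byte just before it is `cc` (int3), as in the `netif_rx_internal` dump, where `cc` sits at 0x29 and the marker at 0x2a.

```python
import re


def find_trap(code_line):
    """Return (offset, preceded_by_int3) for the <xx> byte of an oops "Code:" line."""
    # Keep only what follows "Code:", so a "[ timestamp]" prefix is tolerated.
    _, _, hexdump = code_line.partition("Code:")
    tokens = hexdump.split()
    for offset, token in enumerate(tokens):
        if re.fullmatch(r"<[0-9a-fA-F]{2}>", token):
            # 0xcc is the one-byte x86 int3 opcode.
            prev_is_int3 = offset > 0 and tokens[offset - 1].lower() == "cc"
            return offset, prev_is_int3
    raise ValueError("no <xx> marker found in Code: line")
```

Feeding it the first `Code:` line of this report should give offset 42 (0x2a) with an int3 right before it, which matches the `exc_int3` entry in the call trace.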

Member Author

matttbe commented Feb 15, 2024

Next steps: identify the fix on the QEMU side and have it backported (or upgrade QEMU manually?)

The issue has been fixed in QEMU v8.1.0, but the fix has not been backported to earlier releases, and it looks like there will not be any new v8.0 releases.

The fixes on QEMU's side:

  • fix: deba78709a ("accel/tcg: Always lock pages before translation") (not marked as a fix; the commit does not say what the reproducer was or which error was seen, and does not link to the initial bug report)
  • dependency: cb62bd15e1 ("accel/tcg: Split out cpu_exec_longjmp_cleanup")
  • extra fix: ad17868eb1 ("accel/tcg: Clear tcg_ctx->gen_tb on buffer overflow")

There are some conflicts when backporting them to v8.0.4, but nothing blocking. I resolved the conflicts and pushed these 3 commits to this branch:

https://gitlab.com/matttbe/qemu/-/commits/lp-2051965/
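
For reference, a rough sketch of the git steps such a backport would typically involve, written as a small helper that only builds the command lines and does not run them. The ordering (dependency first, then the fix, then the extra fix) and the use of `git cherry-pick -x` are assumptions on my side, not taken from the branch above. The function name `backport_commands` is hypothetical, and the branch name is simply borrowed from the URL.

```python
BASE_TAG = "v8.0.4"
BRANCH = "lp-2051965"  # borrowed from the branch URL, used here only as an example
# Assumed order: dependency, fix, extra fix.
COMMITS = ["cb62bd15e1", "deba78709a", "ad17868eb1"]


def backport_commands(base=BASE_TAG, branch=BRANCH, commits=COMMITS):
    """Build (but do not execute) the git commands for the backport."""
    cmds = [["git", "checkout", "-b", branch, base]]
    # -x records the original commit ID in each backported commit message.
    cmds += [["git", "cherry-pick", "-x", commit] for commit in commits]
    return cmds


if __name__ == "__main__":
    for cmd in backport_commands():
        print(" ".join(cmd))
```

Conflicts still have to be resolved by hand after each `cherry-pick`, followed by `git cherry-pick --continue`.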

Thanks to the Canonical Server devs, Ubuntu 23.10 (and maybe 22.04 too) will get a new QEMU version with the fixes. Once it is available, we can revert the kernel patch acting as a workaround and close this issue.

For more details: https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/2051965

matttbe added a commit that referenced this issue Mar 11, 2024
This reverts commit d942cde.

We no longer need this workaround:

- QEmu in Ubuntu 23.10 (used by the CIs) now has a fix

- Our CIs now all have KVM support

Closes: #471
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Member Author

matttbe commented Mar 11, 2024

QEMU 8.0.4+dfsg-1ubuntu3.23.10.3 in Ubuntu 23.10 now includes a fix to avoid the kernel panic.

I then reverted the workaround from our tree.

Note that the workaround is also no longer needed because all our CIs now use KVM (#474).
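
Since the QEMU fixes listed earlier are all under `accel/tcg`, the crash should only be reachable when QEMU falls back to TCG. A minimal sketch of how a CI script could pick the accelerator, assuming `/dev/kvm` being readable and writable is a good enough test for KVM availability. The helper names `pick_accel` and `accel_args` are hypothetical and not taken from our scripts.

```python
import os


def pick_accel(kvm_path="/dev/kvm"):
    """Return "kvm" if the KVM device node is usable, else "tcg"."""
    if os.access(kvm_path, os.R_OK | os.W_OK):
        return "kvm"
    return "tcg"


def accel_args(kvm_path="/dev/kvm"):
    """Build the QEMU command-line fragment selecting the accelerator."""
    return ["-accel", pick_accel(kvm_path)]
```

A runner without access to `/dev/kvm` would end up with `-accel tcg`, which is the configuration where a fixed QEMU (or the kernel workaround) still matters.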

New patches for t/upstream-net and t/upstream:

Tests are now in progress:

matttbe closed this as completed Mar 11, 2024