Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[syzkaller] WARNING in subflow_data_ready (v2.0) #39

Closed
cpaasch opened this issue Jun 10, 2020 · 5 comments
Closed

[syzkaller] WARNING in subflow_data_ready (v2.0) #39

cpaasch opened this issue Jun 10, 2020 · 5 comments
Assignees

Comments

@cpaasch
Copy link
Member

cpaasch commented Jun 10, 2020

HEAD is:

2d948a5e3ead ("mptcp: check for plain TCP sock at accept time") (HEAD) (10 minutes ago)
404b0abd5a9a ("mptcp: support IPV6_V6ONLY setsockopt") (2 hours ago)
5355be031726 ("mptcp: add REUSEADDR/REUSEADDR support") (2 hours ago)
84f9bcd3c1fc ("net: use mptcp setsockopt function for SOL_SOCKET on mptcp sockets") (2 hours ago)
0ec5e66cad3d ("selftests/mptcp: Capture pcap on both sender and receiver") (2 hours ago)
952f6a1 ("[DO-NOT-MERGE] mptcp: enabled by default") (tag: export/20200616T013825, mptcp_net-next/export) (15 hours ago)
b6b8b80 ("[DO-NOT-MERGE] mptcp: use kmalloc on kasan build") (15 hours ago)
c8807b1 ("mptcp: close poll() races") (15 hours ago)
9c7e7ff ("mptcp: add receive buffer auto-tuning") (15 hours ago)
c8c1a0f ("selftests: mptcp: add option to specify size of file to transfer") (15 hours ago)
0e68992 ("mptcp: fallback in case of simultaneous connect") (15 hours ago)
147eccc ("net: mptcp: improve fallback to TCP") (15 hours ago)
2579482 ("mptcp: introduce token KUNIT self-tests") (15 hours ago)
5d5fbf5 ("mptcp: move crypto test to KUNIT") (15 hours ago)
56530a9 ("mptcp: do nonce initialization at subflow creation time") (15 hours ago)
7a138a0 ("mptcp: refactor token container") (15 hours ago)
1dd5359 ("mptcp: add __init annotation on setup functions") (15 hours ago)
292de25 ("mptcp: drop MP_JOIN request sock on syn cookies") (15 hours ago)
9bdf579 ("mptcp: cache msk on MP_JOIN init_req") (15 hours ago)
b3a9e3b ("Linux 5.8-rc1") (tag: v5.8-rc1, netnext/master, mptcp_net-next/net-next) (2 days ago)

splat:

TCP: request_sock_subflow: Possible SYN flooding on port 20000. Sending cookies.  Check SNMP counters.
TCP: request_sock_subflow: Possible SYN flooding on port 20000. Sending cookies.  Check SNMP counters.
------------[ cut here ]------------
WARNING: CPU: 0 PID: 9370 at net/mptcp/subflow.c:885 subflow_data_ready+0x1e6/0x290 net/mptcp/subflow.c:885
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 9370 Comm: syz-executor.0 Not tainted 5.7.0 #106
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0xb7/0xfe lib/dump_stack.c:118
 panic+0x29e/0x692 kernel/panic.c:221
 __warn.cold+0x2f/0x3d kernel/panic.c:582
 report_bug+0x28b/0x2f0 lib/bug.c:195
 fixup_bug arch/x86/kernel/traps.c:105 [inline]
 fixup_bug arch/x86/kernel/traps.c:100 [inline]
 do_error_trap+0x10f/0x180 arch/x86/kernel/traps.c:197
 do_invalid_op+0x32/0x40 arch/x86/kernel/traps.c:216
 invalid_op+0x1e/0x30 arch/x86/entry/entry_64.S:1027
RIP: 0010:subflow_data_ready+0x1e6/0x290 net/mptcp/subflow.c:885
Code: 04 02 84 c0 74 06 0f 8e 91 00 00 00 41 0f b6 5e 48 31 ff 83 e3 18 89 de e8 37 ec 3d fe 84 db 0f 85 65 ff ff ff e8 fa ea 3d fe <0f> 0b e9 59 ff ff ff e8 ee ea 3d fe 48 89 ee 4c 89 ef e8 f3 77 ff
RSP: 0018:ffff88811b2099b0 EFLAGS: 00010206
RAX: ffff888111197000 RBX: 0000000000000000 RCX: ffffffff82fbc609
RDX: 0000000000000100 RSI: ffffffff82fbc616 RDI: 0000000000000001
RBP: ffff8881111bc800 R08: ffff888111197000 R09: ffffed10222a82af
R10: ffff888111541577 R11: ffffed10222a82ae R12: 1ffff11023641336
R13: ffff888111541000 R14: ffff88810fd4ca00 R15: ffff888111541570
 tcp_child_process+0x754/0x920 net/ipv4/tcp_minisocks.c:841
 tcp_v4_do_rcv+0x749/0x8b0 net/ipv4/tcp_ipv4.c:1642
 tcp_v4_rcv+0x2666/0x2e60 net/ipv4/tcp_ipv4.c:1999
 ip_protocol_deliver_rcu+0x29/0x1f0 net/ipv4/ip_input.c:204
 ip_local_deliver_finish net/ipv4/ip_input.c:231 [inline]
 NF_HOOK include/linux/netfilter.h:421 [inline]
 ip_local_deliver+0x2da/0x390 net/ipv4/ip_input.c:252
 dst_input include/net/dst.h:441 [inline]
 ip_rcv_finish net/ipv4/ip_input.c:428 [inline]
 ip_rcv_finish net/ipv4/ip_input.c:414 [inline]
 NF_HOOK include/linux/netfilter.h:421 [inline]
 ip_rcv+0xef/0x140 net/ipv4/ip_input.c:539
 __netif_receive_skb_one_core+0x197/0x1e0 net/core/dev.c:5268
 __netif_receive_skb+0x27/0x1c0 net/core/dev.c:5382
 process_backlog+0x1e5/0x6d0 net/core/dev.c:6226
 napi_poll net/core/dev.c:6671 [inline]
 net_rx_action+0x3e3/0xd70 net/core/dev.c:6739
 __do_softirq+0x18c/0x634 kernel/softirq.c:292
 do_softirq_own_stack+0x2a/0x40 arch/x86/entry/entry_64.S:1082
 </IRQ>
 do_softirq.part.0+0x26/0x30 kernel/softirq.c:337
 do_softirq arch/x86/include/asm/preempt.h:26 [inline]
 __local_bh_enable_ip+0x46/0x50 kernel/softirq.c:189
 local_bh_enable include/linux/bottom_half.h:32 [inline]
 rcu_read_unlock_bh include/linux/rcupdate.h:723 [inline]
 ip_finish_output2+0x78a/0x19c0 net/ipv4/ip_output.c:229
 __ip_finish_output+0x471/0x720 net/ipv4/ip_output.c:306
 dst_output include/net/dst.h:435 [inline]
 ip_local_out+0x181/0x1e0 net/ipv4/ip_output.c:125
 __ip_queue_xmit+0x7a1/0x14e0 net/ipv4/ip_output.c:530
 __tcp_transmit_skb+0x19dc/0x35e0 net/ipv4/tcp_output.c:1238
 __tcp_send_ack.part.0+0x3c2/0x5b0 net/ipv4/tcp_output.c:3785
 __tcp_send_ack net/ipv4/tcp_output.c:3791 [inline]
 tcp_send_ack+0x7d/0xa0 net/ipv4/tcp_output.c:3791
 tcp_rcv_synsent_state_process net/ipv4/tcp_input.c:6040 [inline]
 tcp_rcv_state_process+0x36a4/0x49c2 net/ipv4/tcp_input.c:6209
 tcp_v4_do_rcv+0x343/0x8b0 net/ipv4/tcp_ipv4.c:1651
 sk_backlog_rcv include/net/sock.h:996 [inline]
 __release_sock+0x1ad/0x310 net/core/sock.c:2548
 release_sock+0x54/0x1a0 net/core/sock.c:3064
 inet_wait_for_connect net/ipv4/af_inet.c:594 [inline]
 __inet_stream_connect+0x57e/0xd50 net/ipv4/af_inet.c:686
 inet_stream_connect+0x53/0xa0 net/ipv4/af_inet.c:725
 mptcp_stream_connect+0x171/0x5f0 net/mptcp/protocol.c:1920
 __sys_connect_file net/socket.c:1854 [inline]
 __sys_connect+0x267/0x2f0 net/socket.c:1871
 __do_sys_connect net/socket.c:1882 [inline]
 __se_sys_connect net/socket.c:1879 [inline]
 __x64_sys_connect+0x6f/0xb0 net/socket.c:1879
 do_syscall_64+0xb7/0x3d0 arch/x86/entry/common.c:295
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fb577d06469
Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ff 49 2b 00 f7 d8 64 89 01 48
RSP: 002b:00007fb5783d5dd8 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: ffffffffffffffda RBX: 000000000068bfa0 RCX: 00007fb577d06469
RDX: 000000000000004d RSI: 0000000020000040 RDI: 0000000000000003
RBP: 00000000ffffffff R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000041427c R14: 00007fb5783d65c0 R15: 0000000000000003
Dumping ftrace buffer:
   (ftrace buffer empty)
Kernel Offset: disabled
Rebooting in 1 seconds..

Syzkaller reproducer:

# {Threaded:true Collide:true Repeat:true RepeatTimes:0 Procs:1 Sandbox:none Fault:false FaultCall:-1 FaultNth:0 Leak:false NetInjection:true NetDevices:true NetReset:true Cgroups:true BinfmtMisc:true CloseFDs:true KCSAN:false DevlinkPCI:false USB:false UseTmpDir:true HandleSegv:true Repro:false Trace:false}
r0 = perf_event_open(0x0, 0x0, 0x0, 0xffffffffffffffff, 0x0)
r1 = socket$inet_mptcp(0x2, 0x1, 0x106)
r2 = socket$inet_mptcp(0x2, 0x1, 0x106)
bind$inet(r2, &(0x7f00000013c0)={0x2, 0x4e20, @multicast2}, 0x10)
connect$inet(r2, &(0x7f0000000040)={0x2, 0x0, @loopback}, 0x10)
listen(r2, 0x0)
ioctl$int_in(0xffffffffffffffff, 0x5452, &(0x7f0000000000)=0x7)
connect$inet(r1, &(0x7f0000000040)={0x2, 0x4e20, @loopback}, 0x4d)
dup3(r0, r2, 0x0)
socket$inet_icmp(0x2, 0x2, 0x1)

kernel-config:
CURRENT_CONFIG.txt

[EDIT June 22nd: Update HEAD & kconfig)

@dcaratti
Copy link
Contributor

dcaratti commented Jul 1, 2020

I reproduced this only a couple of times, most of the times the code hits the branch where the state is TCP_LISTEN

 869 static void subflow_data_ready(struct sock *sk)
 870 {
 871         struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk);
 872         struct sock *parent = subflow->conn;
 873         struct mptcp_sock *msk;
 874 
 875         msk = mptcp_sk(parent);
 876         if (inet_sk_state_load(sk) == TCP_LISTEN) {
 877                 set_bit(MPTCP_DATA_READY, &msk->flags);
 878                 parent->sk_data_ready(parent);
 879                 return; /* <-- here */
 880         }
 881 
 882         WARN_ON_ONCE(!__mptcp_check_fallback(msk) && !subflow->mp_capable &&
 883                      !subflow->mp_join);
 884         printk("%s: data ready (2)\n",__FUNCTION__);
 885 
 886         if (mptcp_subflow_data_available(sk))
 887                 mptcp_data_ready(parent, sk);
 888 }

@dcaratti
Copy link
Contributor

dcaratti commented Jul 1, 2020

... right after posting the comment above, I managed to reproduce it another couple of times. the state of the socket was TCP_CLOSE both times

@dcaratti
Copy link
Contributor

dcaratti commented Jul 2, 2020

forcing subflows in TCP_CLOSE to go through parent->sk_data_ready() as follows, proved to fix the problem (at least in my setup):

--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -873,7 +873,7 @@ static void subflow_data_ready(struct sock *sk)
        struct mptcp_sock *msk;
 
        msk = mptcp_sk(parent);
-       if (inet_sk_state_load(sk) == TCP_LISTEN) {
+       if ((1 << inet_sk_state_load(sk)) & (TCPF_LISTEN | TCPF_CLOSE)) {
                set_bit(MPTCP_DATA_READY, &msk->flags);
                parent->sk_data_ready(parent);
                return;

@cpaasch
Copy link
Member Author

cpaasch commented Jul 2, 2020

Same here! I repro'ed the issue without your change and with your change here I was no more able to reproduce.

fengguang pushed a commit to 0day-ci/linux that referenced this issue Jul 6, 2020
syzkaller was able to make the kernel reach subflow_data_ready() for a
server subflow that was closed before subflow_finish_connect() completed.
In these cases we can avoid using the path for regular/fallback MPTCP
data, and just wake the main socket, to avoid the following warning:

 WARNING: CPU: 0 PID: 9370 at net/mptcp/subflow.c:885
 subflow_data_ready+0x1e6/0x290 net/mptcp/subflow.c:885
 Kernel panic - not syncing: panic_on_warn set ...
 CPU: 0 PID: 9370 Comm: syz-executor.0 Not tainted 5.7.0 torvalds#106
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
 rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
 Call Trace:
  <IRQ>
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0xb7/0xfe lib/dump_stack.c:118
  panic+0x29e/0x692 kernel/panic.c:221
  __warn.cold+0x2f/0x3d kernel/panic.c:582
  report_bug+0x28b/0x2f0 lib/bug.c:195
  fixup_bug arch/x86/kernel/traps.c:105 [inline]
  fixup_bug arch/x86/kernel/traps.c:100 [inline]
  do_error_trap+0x10f/0x180 arch/x86/kernel/traps.c:197
  do_invalid_op+0x32/0x40 arch/x86/kernel/traps.c:216
  invalid_op+0x1e/0x30 arch/x86/entry/entry_64.S:1027
 RIP: 0010:subflow_data_ready+0x1e6/0x290 net/mptcp/subflow.c:885
 Code: 04 02 84 c0 74 06 0f 8e 91 00 00 00 41 0f b6 5e 48 31 ff 83 e3 18
 89 de e8 37 ec 3d fe 84 db 0f 85 65 ff ff ff e8 fa ea 3d fe <0f> 0b e9
 59 ff ff ff e8 ee ea 3d fe 48 89 ee 4c 89 ef e8 f3 77 ff
 RSP: 0018:ffff88811b2099b0 EFLAGS: 00010206
 RAX: ffff888111197000 RBX: 0000000000000000 RCX: ffffffff82fbc609
 RDX: 0000000000000100 RSI: ffffffff82fbc616 RDI: 0000000000000001
 RBP: ffff8881111bc800 R08: ffff888111197000 R09: ffffed10222a82af
 R10: ffff888111541577 R11: ffffed10222a82ae R12: 1ffff11023641336
 R13: ffff888111541000 R14: ffff88810fd4ca00 R15: ffff888111541570
  tcp_child_process+0x754/0x920 net/ipv4/tcp_minisocks.c:841
  tcp_v4_do_rcv+0x749/0x8b0 net/ipv4/tcp_ipv4.c:1642
  tcp_v4_rcv+0x2666/0x2e60 net/ipv4/tcp_ipv4.c:1999
  ip_protocol_deliver_rcu+0x29/0x1f0 net/ipv4/ip_input.c:204
  ip_local_deliver_finish net/ipv4/ip_input.c:231 [inline]
  NF_HOOK include/linux/netfilter.h:421 [inline]
  ip_local_deliver+0x2da/0x390 net/ipv4/ip_input.c:252
  dst_input include/net/dst.h:441 [inline]
  ip_rcv_finish net/ipv4/ip_input.c:428 [inline]
  ip_rcv_finish net/ipv4/ip_input.c:414 [inline]
  NF_HOOK include/linux/netfilter.h:421 [inline]
  ip_rcv+0xef/0x140 net/ipv4/ip_input.c:539
  __netif_receive_skb_one_core+0x197/0x1e0 net/core/dev.c:5268
  __netif_receive_skb+0x27/0x1c0 net/core/dev.c:5382
  process_backlog+0x1e5/0x6d0 net/core/dev.c:6226
  napi_poll net/core/dev.c:6671 [inline]
  net_rx_action+0x3e3/0xd70 net/core/dev.c:6739
  __do_softirq+0x18c/0x634 kernel/softirq.c:292
  do_softirq_own_stack+0x2a/0x40 arch/x86/entry/entry_64.S:1082
  </IRQ>
  do_softirq.part.0+0x26/0x30 kernel/softirq.c:337
  do_softirq arch/x86/include/asm/preempt.h:26 [inline]
  __local_bh_enable_ip+0x46/0x50 kernel/softirq.c:189
  local_bh_enable include/linux/bottom_half.h:32 [inline]
  rcu_read_unlock_bh include/linux/rcupdate.h:723 [inline]
  ip_finish_output2+0x78a/0x19c0 net/ipv4/ip_output.c:229
  __ip_finish_output+0x471/0x720 net/ipv4/ip_output.c:306
  dst_output include/net/dst.h:435 [inline]
  ip_local_out+0x181/0x1e0 net/ipv4/ip_output.c:125
  __ip_queue_xmit+0x7a1/0x14e0 net/ipv4/ip_output.c:530
  __tcp_transmit_skb+0x19dc/0x35e0 net/ipv4/tcp_output.c:1238
  __tcp_send_ack.part.0+0x3c2/0x5b0 net/ipv4/tcp_output.c:3785
  __tcp_send_ack net/ipv4/tcp_output.c:3791 [inline]
  tcp_send_ack+0x7d/0xa0 net/ipv4/tcp_output.c:3791
  tcp_rcv_synsent_state_process net/ipv4/tcp_input.c:6040 [inline]
  tcp_rcv_state_process+0x36a4/0x49c2 net/ipv4/tcp_input.c:6209
  tcp_v4_do_rcv+0x343/0x8b0 net/ipv4/tcp_ipv4.c:1651
  sk_backlog_rcv include/net/sock.h:996 [inline]
  __release_sock+0x1ad/0x310 net/core/sock.c:2548
  release_sock+0x54/0x1a0 net/core/sock.c:3064
  inet_wait_for_connect net/ipv4/af_inet.c:594 [inline]
  __inet_stream_connect+0x57e/0xd50 net/ipv4/af_inet.c:686
  inet_stream_connect+0x53/0xa0 net/ipv4/af_inet.c:725
  mptcp_stream_connect+0x171/0x5f0 net/mptcp/protocol.c:1920
  __sys_connect_file net/socket.c:1854 [inline]
  __sys_connect+0x267/0x2f0 net/socket.c:1871
  __do_sys_connect net/socket.c:1882 [inline]
  __se_sys_connect net/socket.c:1879 [inline]
  __x64_sys_connect+0x6f/0xb0 net/socket.c:1879
  do_syscall_64+0xb7/0x3d0 arch/x86/entry/common.c:295
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 RIP: 0033:0x7fb577d06469
 Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89
 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01
 f0 ff ff 73 01 c3 48 8b 0d ff 49 2b 00 f7 d8 64 89 01 48
 RSP: 002b:00007fb5783d5dd8 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
 RAX: ffffffffffffffda RBX: 000000000068bfa0 RCX: 00007fb577d06469
 RDX: 000000000000004d RSI: 0000000020000040 RDI: 0000000000000003
 RBP: 00000000ffffffff R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
 R13: 000000000041427c R14: 00007fb5783d65c0 R15: 0000000000000003

Closes: multipath-tcp/mptcp_net-next#39
Reported-by: Christoph Paasch <cpaasch@apple.com>
Fixes: e1ff9e8 ("net: mptcp: improve fallback to TCP")
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
fengguang pushed a commit to 0day-ci/linux that referenced this issue Jul 7, 2020
syzkaller was able to make the kernel reach subflow_data_ready() for a
server subflow that was closed before subflow_finish_connect() completed.
In these cases we can avoid using the path for regular/fallback MPTCP
data, and just wake the main socket, to avoid the following warning:

 WARNING: CPU: 0 PID: 9370 at net/mptcp/subflow.c:885
 subflow_data_ready+0x1e6/0x290 net/mptcp/subflow.c:885
 Kernel panic - not syncing: panic_on_warn set ...
 CPU: 0 PID: 9370 Comm: syz-executor.0 Not tainted 5.7.0 torvalds#106
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
 rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
 Call Trace:
  <IRQ>
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0xb7/0xfe lib/dump_stack.c:118
  panic+0x29e/0x692 kernel/panic.c:221
  __warn.cold+0x2f/0x3d kernel/panic.c:582
  report_bug+0x28b/0x2f0 lib/bug.c:195
  fixup_bug arch/x86/kernel/traps.c:105 [inline]
  fixup_bug arch/x86/kernel/traps.c:100 [inline]
  do_error_trap+0x10f/0x180 arch/x86/kernel/traps.c:197
  do_invalid_op+0x32/0x40 arch/x86/kernel/traps.c:216
  invalid_op+0x1e/0x30 arch/x86/entry/entry_64.S:1027
 RIP: 0010:subflow_data_ready+0x1e6/0x290 net/mptcp/subflow.c:885
 Code: 04 02 84 c0 74 06 0f 8e 91 00 00 00 41 0f b6 5e 48 31 ff 83 e3 18
 89 de e8 37 ec 3d fe 84 db 0f 85 65 ff ff ff e8 fa ea 3d fe <0f> 0b e9
 59 ff ff ff e8 ee ea 3d fe 48 89 ee 4c 89 ef e8 f3 77 ff
 RSP: 0018:ffff88811b2099b0 EFLAGS: 00010206
 RAX: ffff888111197000 RBX: 0000000000000000 RCX: ffffffff82fbc609
 RDX: 0000000000000100 RSI: ffffffff82fbc616 RDI: 0000000000000001
 RBP: ffff8881111bc800 R08: ffff888111197000 R09: ffffed10222a82af
 R10: ffff888111541577 R11: ffffed10222a82ae R12: 1ffff11023641336
 R13: ffff888111541000 R14: ffff88810fd4ca00 R15: ffff888111541570
  tcp_child_process+0x754/0x920 net/ipv4/tcp_minisocks.c:841
  tcp_v4_do_rcv+0x749/0x8b0 net/ipv4/tcp_ipv4.c:1642
  tcp_v4_rcv+0x2666/0x2e60 net/ipv4/tcp_ipv4.c:1999
  ip_protocol_deliver_rcu+0x29/0x1f0 net/ipv4/ip_input.c:204
  ip_local_deliver_finish net/ipv4/ip_input.c:231 [inline]
  NF_HOOK include/linux/netfilter.h:421 [inline]
  ip_local_deliver+0x2da/0x390 net/ipv4/ip_input.c:252
  dst_input include/net/dst.h:441 [inline]
  ip_rcv_finish net/ipv4/ip_input.c:428 [inline]
  ip_rcv_finish net/ipv4/ip_input.c:414 [inline]
  NF_HOOK include/linux/netfilter.h:421 [inline]
  ip_rcv+0xef/0x140 net/ipv4/ip_input.c:539
  __netif_receive_skb_one_core+0x197/0x1e0 net/core/dev.c:5268
  __netif_receive_skb+0x27/0x1c0 net/core/dev.c:5382
  process_backlog+0x1e5/0x6d0 net/core/dev.c:6226
  napi_poll net/core/dev.c:6671 [inline]
  net_rx_action+0x3e3/0xd70 net/core/dev.c:6739
  __do_softirq+0x18c/0x634 kernel/softirq.c:292
  do_softirq_own_stack+0x2a/0x40 arch/x86/entry/entry_64.S:1082
  </IRQ>
  do_softirq.part.0+0x26/0x30 kernel/softirq.c:337
  do_softirq arch/x86/include/asm/preempt.h:26 [inline]
  __local_bh_enable_ip+0x46/0x50 kernel/softirq.c:189
  local_bh_enable include/linux/bottom_half.h:32 [inline]
  rcu_read_unlock_bh include/linux/rcupdate.h:723 [inline]
  ip_finish_output2+0x78a/0x19c0 net/ipv4/ip_output.c:229
  __ip_finish_output+0x471/0x720 net/ipv4/ip_output.c:306
  dst_output include/net/dst.h:435 [inline]
  ip_local_out+0x181/0x1e0 net/ipv4/ip_output.c:125
  __ip_queue_xmit+0x7a1/0x14e0 net/ipv4/ip_output.c:530
  __tcp_transmit_skb+0x19dc/0x35e0 net/ipv4/tcp_output.c:1238
  __tcp_send_ack.part.0+0x3c2/0x5b0 net/ipv4/tcp_output.c:3785
  __tcp_send_ack net/ipv4/tcp_output.c:3791 [inline]
  tcp_send_ack+0x7d/0xa0 net/ipv4/tcp_output.c:3791
  tcp_rcv_synsent_state_process net/ipv4/tcp_input.c:6040 [inline]
  tcp_rcv_state_process+0x36a4/0x49c2 net/ipv4/tcp_input.c:6209
  tcp_v4_do_rcv+0x343/0x8b0 net/ipv4/tcp_ipv4.c:1651
  sk_backlog_rcv include/net/sock.h:996 [inline]
  __release_sock+0x1ad/0x310 net/core/sock.c:2548
  release_sock+0x54/0x1a0 net/core/sock.c:3064
  inet_wait_for_connect net/ipv4/af_inet.c:594 [inline]
  __inet_stream_connect+0x57e/0xd50 net/ipv4/af_inet.c:686
  inet_stream_connect+0x53/0xa0 net/ipv4/af_inet.c:725
  mptcp_stream_connect+0x171/0x5f0 net/mptcp/protocol.c:1920
  __sys_connect_file net/socket.c:1854 [inline]
  __sys_connect+0x267/0x2f0 net/socket.c:1871
  __do_sys_connect net/socket.c:1882 [inline]
  __se_sys_connect net/socket.c:1879 [inline]
  __x64_sys_connect+0x6f/0xb0 net/socket.c:1879
  do_syscall_64+0xb7/0x3d0 arch/x86/entry/common.c:295
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 RIP: 0033:0x7fb577d06469
 Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89
 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01
 f0 ff ff 73 01 c3 48 8b 0d ff 49 2b 00 f7 d8 64 89 01 48
 RSP: 002b:00007fb5783d5dd8 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
 RAX: ffffffffffffffda RBX: 000000000068bfa0 RCX: 00007fb577d06469
 RDX: 000000000000004d RSI: 0000000020000040 RDI: 0000000000000003
 RBP: 00000000ffffffff R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
 R13: 000000000041427c R14: 00007fb5783d65c0 R15: 0000000000000003

Closes: multipath-tcp/mptcp_net-next#39
Reported-by: Christoph Paasch <cpaasch@apple.com>
Fixes: e1ff9e8 ("net: mptcp: improve fallback to TCP")
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
LorenzoBianconi pushed a commit to LorenzoBianconi/net-next that referenced this issue Jul 8, 2020
syzkaller was able to make the kernel reach subflow_data_ready() for a
server subflow that was closed before subflow_finish_connect() completed.
In these cases we can avoid using the path for regular/fallback MPTCP
data, and just wake the main socket, to avoid the following warning:

 WARNING: CPU: 0 PID: 9370 at net/mptcp/subflow.c:885
 subflow_data_ready+0x1e6/0x290 net/mptcp/subflow.c:885
 Kernel panic - not syncing: panic_on_warn set ...
 CPU: 0 PID: 9370 Comm: syz-executor.0 Not tainted 5.7.0 #106
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
 rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
 Call Trace:
  <IRQ>
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0xb7/0xfe lib/dump_stack.c:118
  panic+0x29e/0x692 kernel/panic.c:221
  __warn.cold+0x2f/0x3d kernel/panic.c:582
  report_bug+0x28b/0x2f0 lib/bug.c:195
  fixup_bug arch/x86/kernel/traps.c:105 [inline]
  fixup_bug arch/x86/kernel/traps.c:100 [inline]
  do_error_trap+0x10f/0x180 arch/x86/kernel/traps.c:197
  do_invalid_op+0x32/0x40 arch/x86/kernel/traps.c:216
  invalid_op+0x1e/0x30 arch/x86/entry/entry_64.S:1027
 RIP: 0010:subflow_data_ready+0x1e6/0x290 net/mptcp/subflow.c:885
 Code: 04 02 84 c0 74 06 0f 8e 91 00 00 00 41 0f b6 5e 48 31 ff 83 e3 18
 89 de e8 37 ec 3d fe 84 db 0f 85 65 ff ff ff e8 fa ea 3d fe <0f> 0b e9
 59 ff ff ff e8 ee ea 3d fe 48 89 ee 4c 89 ef e8 f3 77 ff
 RSP: 0018:ffff88811b2099b0 EFLAGS: 00010206
 RAX: ffff888111197000 RBX: 0000000000000000 RCX: ffffffff82fbc609
 RDX: 0000000000000100 RSI: ffffffff82fbc616 RDI: 0000000000000001
 RBP: ffff8881111bc800 R08: ffff888111197000 R09: ffffed10222a82af
 R10: ffff888111541577 R11: ffffed10222a82ae R12: 1ffff11023641336
 R13: ffff888111541000 R14: ffff88810fd4ca00 R15: ffff888111541570
  tcp_child_process+0x754/0x920 net/ipv4/tcp_minisocks.c:841
  tcp_v4_do_rcv+0x749/0x8b0 net/ipv4/tcp_ipv4.c:1642
  tcp_v4_rcv+0x2666/0x2e60 net/ipv4/tcp_ipv4.c:1999
  ip_protocol_deliver_rcu+0x29/0x1f0 net/ipv4/ip_input.c:204
  ip_local_deliver_finish net/ipv4/ip_input.c:231 [inline]
  NF_HOOK include/linux/netfilter.h:421 [inline]
  ip_local_deliver+0x2da/0x390 net/ipv4/ip_input.c:252
  dst_input include/net/dst.h:441 [inline]
  ip_rcv_finish net/ipv4/ip_input.c:428 [inline]
  ip_rcv_finish net/ipv4/ip_input.c:414 [inline]
  NF_HOOK include/linux/netfilter.h:421 [inline]
  ip_rcv+0xef/0x140 net/ipv4/ip_input.c:539
  __netif_receive_skb_one_core+0x197/0x1e0 net/core/dev.c:5268
  __netif_receive_skb+0x27/0x1c0 net/core/dev.c:5382
  process_backlog+0x1e5/0x6d0 net/core/dev.c:6226
  napi_poll net/core/dev.c:6671 [inline]
  net_rx_action+0x3e3/0xd70 net/core/dev.c:6739
  __do_softirq+0x18c/0x634 kernel/softirq.c:292
  do_softirq_own_stack+0x2a/0x40 arch/x86/entry/entry_64.S:1082
  </IRQ>
  do_softirq.part.0+0x26/0x30 kernel/softirq.c:337
  do_softirq arch/x86/include/asm/preempt.h:26 [inline]
  __local_bh_enable_ip+0x46/0x50 kernel/softirq.c:189
  local_bh_enable include/linux/bottom_half.h:32 [inline]
  rcu_read_unlock_bh include/linux/rcupdate.h:723 [inline]
  ip_finish_output2+0x78a/0x19c0 net/ipv4/ip_output.c:229
  __ip_finish_output+0x471/0x720 net/ipv4/ip_output.c:306
  dst_output include/net/dst.h:435 [inline]
  ip_local_out+0x181/0x1e0 net/ipv4/ip_output.c:125
  __ip_queue_xmit+0x7a1/0x14e0 net/ipv4/ip_output.c:530
  __tcp_transmit_skb+0x19dc/0x35e0 net/ipv4/tcp_output.c:1238
  __tcp_send_ack.part.0+0x3c2/0x5b0 net/ipv4/tcp_output.c:3785
  __tcp_send_ack net/ipv4/tcp_output.c:3791 [inline]
  tcp_send_ack+0x7d/0xa0 net/ipv4/tcp_output.c:3791
  tcp_rcv_synsent_state_process net/ipv4/tcp_input.c:6040 [inline]
  tcp_rcv_state_process+0x36a4/0x49c2 net/ipv4/tcp_input.c:6209
  tcp_v4_do_rcv+0x343/0x8b0 net/ipv4/tcp_ipv4.c:1651
  sk_backlog_rcv include/net/sock.h:996 [inline]
  __release_sock+0x1ad/0x310 net/core/sock.c:2548
  release_sock+0x54/0x1a0 net/core/sock.c:3064
  inet_wait_for_connect net/ipv4/af_inet.c:594 [inline]
  __inet_stream_connect+0x57e/0xd50 net/ipv4/af_inet.c:686
  inet_stream_connect+0x53/0xa0 net/ipv4/af_inet.c:725
  mptcp_stream_connect+0x171/0x5f0 net/mptcp/protocol.c:1920
  __sys_connect_file net/socket.c:1854 [inline]
  __sys_connect+0x267/0x2f0 net/socket.c:1871
  __do_sys_connect net/socket.c:1882 [inline]
  __se_sys_connect net/socket.c:1879 [inline]
  __x64_sys_connect+0x6f/0xb0 net/socket.c:1879
  do_syscall_64+0xb7/0x3d0 arch/x86/entry/common.c:295
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 RIP: 0033:0x7fb577d06469
 Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89
 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01
 f0 ff ff 73 01 c3 48 8b 0d ff 49 2b 00 f7 d8 64 89 01 48
 RSP: 002b:00007fb5783d5dd8 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
 RAX: ffffffffffffffda RBX: 000000000068bfa0 RCX: 00007fb577d06469
 RDX: 000000000000004d RSI: 0000000020000040 RDI: 0000000000000003
 RBP: 00000000ffffffff R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
 R13: 000000000041427c R14: 00007fb5783d65c0 R15: 0000000000000003

Closes: multipath-tcp/mptcp_net-next#39
Reported-by: Christoph Paasch <cpaasch@apple.com>
Fixes: e1ff9e8 ("net: mptcp: improve fallback to TCP")
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
@pabeni
Copy link

pabeni commented Jul 9, 2020

fixed with commit:

commit d47a721
Author: Davide Caratti dcaratti@redhat.com
Date: Mon Jul 6 21:06:12 2020 +0200

mptcp: fix race in subflow_data_ready()

@pabeni pabeni closed this as completed Jul 9, 2020
jenkins-tessares pushed a commit that referenced this issue Aug 2, 2020
When an LE connection is requested and an RPA update is needed via
hci_connect_le_scan, the default scanning parameters are used rather
than the connect parameters.  This leads to significant delays in the
connection establishment process when using lower duty cycle scanning
parameters.

The patch simply looks at the pended connection list when trying to
determine which scanning parameters should be used.

Before:
< HCI Command: LE Set Extended Scan Parameters (0x08|0x0041) plen 8
                            #378 [hci0] 1659.247156
        Own address type: Public (0x00)
        Filter policy: Ignore not in white list (0x01)
        PHYs: 0x01
        Entry 0: LE 1M
          Type: Passive (0x00)
          Interval: 367.500 msec (0x024c)
          Window: 37.500 msec (0x003c)

After:
< HCI Command: LE Set Extended Scan Parameters (0x08|0x0041) plen 8
                               #39 [hci0] 7.422109
        Own address type: Public (0x00)
        Filter policy: Ignore not in white list (0x01)
        PHYs: 0x01
        Entry 0: LE 1M
          Type: Passive (0x00)
          Interval: 60.000 msec (0x0060)
          Window: 60.000 msec (0x0060)

Signed-off-by: Alain Michaud <alainm@chromium.org>
Reviewed-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org>
Reviewed-by: Yu Liu <yudiliu@google.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
matttbe pushed a commit that referenced this issue Oct 3, 2022
The following warning is displayed when the tcp6-multi-diffip11 stress
test case of the LTP test suite is tested:

watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [ns-tcpserver:48198]
CPU: 0 PID: 48198 Comm: ns-tcpserver Kdump: loaded Not tainted 6.0.0-rc6+ #39
Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : des3_ede_encrypt+0x27c/0x460 [libdes]
lr : 0x3f
sp : ffff80000ceaa1b0
x29: ffff80000ceaa1b0 x28: ffff0000df056100 x27: ffff0000e51e5280
x26: ffff80004df75030 x25: ffff0000e51e4600 x24: 000000000000003b
x23: 0000000000802080 x22: 000000000000003d x21: 0000000000000038
x20: 0000000080000020 x19: 000000000000000a x18: 0000000000000033
x17: ffff0000e51e4780 x16: ffff80004e2d1448 x15: ffff80004e2d1248
x14: ffff0000e51e4680 x13: ffff80004e2d1348 x12: ffff80004e2d1548
x11: ffff80004e2d1848 x10: ffff80004e2d1648 x9 : ffff80004e2d1748
x8 : ffff80004e2d1948 x7 : 000000000bcaf83d x6 : 000000000000001b
x5 : ffff80004e2d1048 x4 : 00000000761bf3bf x3 : 000000007f1dd0a3
x2 : ffff0000e51e4780 x1 : ffff0000e3b9a2f8 x0 : 00000000db44e872
Call trace:
 des3_ede_encrypt+0x27c/0x460 [libdes]
 crypto_des3_ede_encrypt+0x1c/0x30 [des_generic]
 crypto_cbc_encrypt+0x148/0x190
 crypto_skcipher_encrypt+0x2c/0x40
 crypto_authenc_encrypt+0xc8/0xfc [authenc]
 crypto_aead_encrypt+0x2c/0x40
 echainiv_encrypt+0x144/0x1a0 [echainiv]
 crypto_aead_encrypt+0x2c/0x40
 esp6_output_tail+0x1c8/0x5d0 [esp6]
 esp6_output+0x120/0x278 [esp6]
 xfrm_output_one+0x458/0x4ec
 xfrm_output_resume+0x6c/0x1f0
 xfrm_output+0xac/0x4ac
 __xfrm6_output+0x130/0x270
 xfrm6_output+0x60/0xec
 ip6_xmit+0x2ec/0x5bc
 inet6_csk_xmit+0xbc/0x10c
 __tcp_transmit_skb+0x460/0x8c0
 tcp_write_xmit+0x348/0x890
 __tcp_push_pending_frames+0x44/0x110
 tcp_rcv_established+0x3c8/0x720
 tcp_v6_do_rcv+0xdc/0x4a0
 tcp_v6_rcv+0xc24/0xcb0
 ip6_protocol_deliver_rcu+0xf0/0x574
 ip6_input_finish+0x48/0x7c
 ip6_input+0x48/0xc0
 ip6_rcv_finish+0x80/0x9c
 xfrm_trans_reinject+0xb0/0xf4
 tasklet_action_common.constprop.0+0xf8/0x134
 tasklet_action+0x30/0x3c
 __do_softirq+0x128/0x368
 do_softirq+0xb4/0xc0
 __local_bh_enable_ip+0xb0/0xb4
 put_cpu_fpsimd_context+0x40/0x70
 kernel_neon_end+0x20/0x40
 sha1_base_do_update.constprop.0.isra.0+0x11c/0x140 [sha1_ce]
 sha1_ce_finup+0x94/0x110 [sha1_ce]
 crypto_shash_finup+0x34/0xc0
 hmac_finup+0x48/0xe0
 crypto_shash_finup+0x34/0xc0
 shash_digest_unaligned+0x74/0x90
 crypto_shash_digest+0x4c/0x9c
 shash_ahash_digest+0xc8/0xf0
 shash_async_digest+0x28/0x34
 crypto_ahash_digest+0x48/0xcc
 crypto_authenc_genicv+0x88/0xcc [authenc]
 crypto_authenc_encrypt+0xd8/0xfc [authenc]
 crypto_aead_encrypt+0x2c/0x40
 echainiv_encrypt+0x144/0x1a0 [echainiv]
 crypto_aead_encrypt+0x2c/0x40
 esp6_output_tail+0x1c8/0x5d0 [esp6]
 esp6_output+0x120/0x278 [esp6]
 xfrm_output_one+0x458/0x4ec
 xfrm_output_resume+0x6c/0x1f0
 xfrm_output+0xac/0x4ac
 __xfrm6_output+0x130/0x270
 xfrm6_output+0x60/0xec
 ip6_xmit+0x2ec/0x5bc
 inet6_csk_xmit+0xbc/0x10c
 __tcp_transmit_skb+0x460/0x8c0
 tcp_write_xmit+0x348/0x890
 __tcp_push_pending_frames+0x44/0x110
 tcp_push+0xb4/0x14c
 tcp_sendmsg_locked+0x71c/0xb64
 tcp_sendmsg+0x40/0x6c
 inet6_sendmsg+0x4c/0x80
 sock_sendmsg+0x5c/0x6c
 __sys_sendto+0x128/0x15c
 __arm64_sys_sendto+0x30/0x40
 invoke_syscall+0x50/0x120
 el0_svc_common.constprop.0+0x170/0x194
 do_el0_svc+0x38/0x4c
 el0_svc+0x28/0xe0
 el0t_64_sync_handler+0xbc/0x13c
 el0t_64_sync+0x180/0x184

Get softirq info by bcc tool:
./softirqs -NT 10
Tracing soft irq event time... Hit Ctrl-C to end.

15:34:34
SOFTIRQ          TOTAL_nsecs
block                 158990
timer               20030920
sched               46577080
net_rx             676746820
tasklet           9906067650

15:34:45
SOFTIRQ          TOTAL_nsecs
block                  86100
sched               38849790
net_rx             676532470
timer             1163848790
tasklet           9409019620

15:34:55
SOFTIRQ          TOTAL_nsecs
sched               58078450
net_rx             475156720
timer              533832410
tasklet           9431333300

The tasklet software interrupt takes too much time. Therefore, the
xfrm_trans_reinject executor is changed from tasklet to workqueue. Add add
spin lock to protect the queue. This reduces the processing flow of the
tcp_sendmsg function in this scenario.

Fixes: acf568e ("xfrm: Reinject transport-mode packets through tasklet")
Signed-off-by: Liu Jian <liujian56@huawei.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
matttbe pushed a commit that referenced this issue Apr 25, 2024
Both the function that migrates all the chunks within a region and the
function that migrates all the entries within a chunk call
list_first_entry() on the respective lists without checking that the
lists are not empty. This is incorrect usage of the API, which leads to
the following warning [1].

Fix by returning if the lists are empty as there is nothing to migrate
in this case.

[1]
WARNING: CPU: 0 PID: 6437 at drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c:1266 mlxsw_sp_acl_tcam_vchunk_migrate_all+0x1f1/0>
Modules linked in:
CPU: 0 PID: 6437 Comm: kworker/0:37 Not tainted 6.9.0-rc3-custom-00883-g94a65f079ef6 #39
Hardware name: Mellanox Technologies Ltd. MSN3700/VMOD0005, BIOS 5.11 01/06/2019
Workqueue: mlxsw_core mlxsw_sp_acl_tcam_vregion_rehash_work
RIP: 0010:mlxsw_sp_acl_tcam_vchunk_migrate_all+0x1f1/0x2c0
[...]
Call Trace:
 <TASK>
 mlxsw_sp_acl_tcam_vregion_rehash_work+0x6c/0x4a0
 process_one_work+0x151/0x370
 worker_thread+0x2cb/0x3e0
 kthread+0xd0/0x100
 ret_from_fork+0x34/0x50
 ret_from_fork_asm+0x1a/0x30
 </TASK>

Fixes: 6f9579d ("mlxsw: spectrum_acl: Remember where to continue rehash migration")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Alexander Zubkov <green@qrator.net>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/4628e9a22d1d84818e28310abbbc498e7bc31bc9.1713797103.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants