syzkaller: WARNING: refcount bug in inet_csk_listen_stop #366

Closed
cpaasch opened this issue Feb 27, 2023 · 7 comments
Labels
bisected (Git commit introducing the bug is known), bug, reproducer (Has a simple program to reproduce the bug), syzkaller

Comments

@cpaasch
Member

cpaasch commented Feb 27, 2023

HEAD: cc518bd

------------[ cut here ]------------
refcount_t: addition on 0; use-after-free.
WARNING: CPU: 1 PID: 5939 at lib/refcount.c:25 refcount_warn_saturate+0x105/0x1b0 lib/refcount.c:25
Modules linked in:
CPU: 1 PID: 5939 Comm: syz-executor.3 Not tainted 6.2.0-gcc518bd4fe12 #6
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
RIP: 0010:refcount_warn_saturate+0x105/0x1b0 lib/refcount.c:25
Code: 01 31 ff 89 de e8 5b c5 b3 ff 84 db 0f 85 6e ff ff ff e8 4e ca b3 ff 48 c7 c7 e0 e9 56 82 c6 05 f4 52 33 01 01 e8 eb 76 9c ff <0f> 0b e9 4f ff ff ff e8 2f ca b3 ff 0f b6 1d db 52 33 01 31 ff 89
RSP: 0018:ffffc9000efa3b00 EFLAGS: 00010286
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff8880342c0000 RSI: ffffffff81096c6f RDI: 0000000000000001
RBP: ffff88802f37f380 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: ffff8880342c0850 R12: ffff88802f37b980
R13: 0000000000000000 R14: 0000000000000001 R15: ffff88802f37f398
FS:  0000000000000000(0000) GS:ffff88803ed00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f044182f445 CR3: 0000000002832006 CR4: 0000000000370ee0
Call Trace:
 <TASK>
 __refcount_add include/linux/refcount.h:199 [inline]
 __refcount_inc include/linux/refcount.h:250 [inline]
 refcount_inc include/linux/refcount.h:267 [inline]
 sock_hold include/net/sock.h:775 [inline]
 inet_csk_listen_stop+0x70e/0x910 net/ipv4/inet_connection_sock.c:1388
 __mptcp_close_ssk+0x36a/0x3a0 net/mptcp/protocol.c:2414
 mptcp_destroy_common+0x8a/0x1c0 net/mptcp/protocol.c:3267
 mptcp_destroy+0x41/0x60 net/mptcp/protocol.c:3294
 __mptcp_destroy_sock+0x6a/0x150 net/mptcp/protocol.c:2955
 __mptcp_close+0x3c8/0x4d0 net/mptcp/protocol.c:3047
 mptcp_close+0x24/0xe0 net/mptcp/protocol.c:3062
 inet_release+0x56/0xa0 net/ipv4/af_inet.c:429
 __sock_release+0x51/0xf0 net/socket.c:651
 sock_close+0x18/0x20 net/socket.c:1393
 __fput+0x113/0x430 fs/file_table.c:321
 task_work_run+0x96/0x100 kernel/task_work.c:179
 exit_task_work include/linux/task_work.h:38 [inline]
 do_exit+0x4fc/0x10c0 kernel/exit.c:869
 do_group_exit+0x51/0xf0 kernel/exit.c:1019
 get_signal+0x12b0/0x1390 kernel/signal.c:2859
 arch_do_signal_or_restart+0x25/0x260 arch/x86/kernel/signal.c:306
 exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
 exit_to_user_mode_prepare+0x131/0x1a0 kernel/entry/common.c:203
 __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
 syscall_exit_to_user_mode+0x19/0x40 kernel/entry/common.c:296
 do_syscall_64+0x46/0x90 arch/x86/entry/common.c:86
 entry_SYSCALL_64_after_hwframe+0x72/0xdc
RIP: 0033:0x7f04417b26a9
Code: Unable to access opcode bytes at 0x7f04417b267f.
RSP: 002b:00007f0440adfcd8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: 00000000003cecf0 RBX: 00000000006bbf80 RCX: 00007f04417b26a9
RDX: 0000000030000004 RSI: 0000000020000440 RDI: 0000000000000007
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006bbf8c
R13: fffffffffffffea8 R14: 00000000006bbf80 R15: 000000000001fe40
 </TASK>
irq event stamp: 10950
hardirqs last  enabled at (10962): [<ffffffff811357c6>] __up_console_sem+0x76/0x80 kernel/printk/printk.c:345
hardirqs last disabled at (10973): [<ffffffff811357ab>] __up_console_sem+0x5b/0x80 kernel/printk/printk.c:343
softirqs last  enabled at (10328): [<ffffffff81d3840c>] spin_unlock_bh include/linux/spinlock.h:395 [inline]
softirqs last  enabled at (10328): [<ffffffff81d3840c>] reqsk_queue_remove include/net/request_sock.h:213 [inline]
softirqs last  enabled at (10328): [<ffffffff81d3840c>] inet_csk_listen_stop+0x6ac/0x910 net/ipv4/inet_connection_sock.c:1381
softirqs last disabled at (10330): [<ffffffff81d37dc5>] spin_unlock_bh include/linux/spinlock.h:395 [inline]
softirqs last disabled at (10330): [<ffffffff81d37dc5>] reqsk_queue_remove include/net/request_sock.h:213 [inline]
softirqs last disabled at (10330): [<ffffffff81d37dc5>] inet_csk_listen_stop+0x65/0x910 net/ipv4/inet_connection_sock.c:1381
---[ end trace 0000000000000000 ]---
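
A note on the first line of the trace: `refcount_t: addition on 0; use-after-free.` is emitted when a reference is taken on an object whose count has already reached zero, here through `sock_hold()` inlined into `inet_csk_listen_stop()`. The following is a minimal userspace sketch of that check, not kernel code. The names `model_refcount`, `model_refcount_inc()` and `model_refcount_dec_and_test()` are hypothetical, and the model leaves out the overflow handling that the kernel's `refcount_t` also performs.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace model only; the real implementation is in
 * include/linux/refcount.h and lib/refcount.c. */
#define MODEL_REFCOUNT_SATURATED (INT_MIN / 2)

struct model_refcount {
	atomic_int refs;
};

/* Take a reference. Returns false (and prints the warning text seen in
 * the trace) if the count was already zero, i.e. the object may have
 * been freed by whoever dropped the last reference. */
bool model_refcount_inc(struct model_refcount *r)
{
	int old;

	assert(r != NULL);
	old = atomic_fetch_add_explicit(&r->refs, 1, memory_order_relaxed);
	if (old == 0) {
		/* Pin the counter so later operations cannot revive it. */
		atomic_store_explicit(&r->refs, MODEL_REFCOUNT_SATURATED,
				      memory_order_relaxed);
		fprintf(stderr, "refcount_t: addition on 0; use-after-free.\n");
		return false;
	}
	return true;
}

/* Drop a reference. Returns true when the caller released the last one
 * and is therefore responsible for freeing the object. */
bool model_refcount_dec_and_test(struct model_refcount *r)
{
	int old;

	assert(r != NULL);
	old = atomic_fetch_sub_explicit(&r->refs, 1, memory_order_release);
	if (old == 1) {
		atomic_thread_fence(memory_order_acquire);
		return true;
	}
	return false;
}
```

In this model, once the count has dropped to zero, a later `model_refcount_inc()` returns false and pins the counter at the saturated value. That corresponds to the situation the trace reports for the socket being held during listener shutdown.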

Reproducer:

Syzkaller reproducer:
# {Threaded:false Repeat:false RepeatTimes:0 Procs:1 Slowdown:1 Sandbox: SandboxArg:0 Leak:false NetInjection:false NetDevices:false NetReset:false Cgroups:false BinfmtMisc:false CloseFDs:false KCSAN:false DevlinkPCI:false NicVF:false USB:false VhciInjection:false Wifi:false IEEE802154:false Sysctl:false UseTmpDir:false HandleSegv:false Repro:false Trace:false LegacyOptions:{Collide:false Fault:false FaultCall:0 FaultNth:0}}
r0 = socket$inet_mptcp(0x2, 0x1, 0x106)
bind$inet(r0, &(0x7f0000000040)={0x2, 0x4e24, @empty}, 0x10)
listen(r0, 0x0)
r1 = socket$inet_mptcp(0x2, 0x1, 0x106)
sendmsg$inet(r1, &(0x7f0000000440)={&(0x7f00000000c0)={0x2, 0x4e24, @loopback}, 0x10, 0x0, 0x0, &(0x7f0000000480)=ANY=[], 0xc0}, 0x30000004)
r2 = dup(r1)
connect$unix(r2, &(0x7f0000000140)=@abs, 0x6e)
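
For readers who do not read syzkaller syntax, the program above can be sketched in plain C roughly as follows. This is a hand translation under stated assumptions, not the attached C-repro, which remains the authoritative reproducer. The function names `fill_inet_addr()` and `replay_syz_program()` are hypothetical, the `AF_UNIX` address is approximated by a zeroed `struct sockaddr_un`, and the send flags are passed as the raw value from the program (it includes the `MSG_FASTOPEN` and `MSG_DONTROUTE` bits). Return values are ignored on purpose, as in the syz program.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262 /* 0x106, as passed to socket$inet_mptcp */
#endif

#define SYZ_PORT       0x4e24     /* port used by both bind and sendmsg */
#define SYZ_SEND_FLAGS 0x30000004 /* raw flags value from the syz program */

/* Build an AF_INET address on SYZ_PORT; host_addr is in host byte order. */
void fill_inet_addr(struct sockaddr_in *sa, uint32_t host_addr)
{
	assert(sa != NULL);
	memset(sa, 0, sizeof(*sa));
	sa->sin_family = AF_INET;
	sa->sin_port = htons(SYZ_PORT);
	sa->sin_addr.s_addr = htonl(host_addr);
}

/* Replays the syscall sequence. Returns 0 if both MPTCP sockets could be
 * created, -1 otherwise (for example when MPTCP is not available). */
int replay_syz_program(void)
{
	struct sockaddr_in addr;
	struct sockaddr_un un;
	struct msghdr msg;
	char control[0xc0]; /* zero-filled control buffer of the same length */
	int r0, r1, r2;

	/* r0: MPTCP listener bound to 0.0.0.0:SYZ_PORT with backlog 0 */
	r0 = socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP);
	if (r0 < 0)
		return -1;
	fill_inet_addr(&addr, INADDR_ANY);
	(void)bind(r0, (struct sockaddr *)&addr, sizeof(addr));
	(void)listen(r0, 0);

	/* r1: MPTCP client; sendmsg with a destination and no payload */
	r1 = socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP);
	if (r1 < 0) {
		close(r0);
		return -1;
	}
	fill_inet_addr(&addr, INADDR_LOOPBACK);
	memset(control, 0, sizeof(control));
	memset(&msg, 0, sizeof(msg));
	msg.msg_name = &addr;
	msg.msg_namelen = sizeof(addr);
	msg.msg_control = control;
	msg.msg_controllen = sizeof(control);
	(void)sendmsg(r1, &msg, SYZ_SEND_FLAGS);

	/* r2: duplicate of r1, then connect() with an AF_UNIX address */
	r2 = dup(r1);
	memset(&un, 0, sizeof(un));
	un.sun_family = AF_UNIX;
	if (r2 >= 0)
		(void)connect(r2, (struct sockaddr *)&un, sizeof(un));

	/* In the trace above the warning fires on the close path. */
	if (r2 >= 0)
		close(r2);
	close(r1);
	close(r0);
	return 0;
}
```

Calling `replay_syz_program()` once from a small `main()` on an affected kernel is the intended use. Whether this sketch triggers the warning as reliably as the attached C-repro has not been verified here.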

Kconfig:
Kconfig_k5_lockdep.txt

C-repro:
mptcp_issue365.c.txt

@cpaasch added the bug, syzkaller, and reproducer labels on Feb 27, 2023
@cpaasch
Member Author

cpaasch commented Feb 27, 2023

C-reproducer confirmed to work.

@cpaasch
Member Author

cpaasch commented Feb 27, 2023

Bisected to: f0ed3df

So, seems like the fix for #357 was incomplete?

@cpaasch added the bisected label on Feb 27, 2023
@matttbe
Member

matttbe commented Feb 27, 2023

> Bisected to: f0ed3df

Thank you for this bisect! I sent this patch to netdev today. I will ask the net maintainers to hold on.
cc @pabeni

@matttbe
Member

matttbe commented Feb 28, 2023

I just applied new patches from Paolo fixing this:

New patches for t/upstream-net and t/upstream:

  • 430272c: "squashed" patch 1/2 in "mptcp: use the workqueue to destroy unaccepted sockets"
  • 03d9f4a: "squashed" patch 2/2 in "mptcp: fix UaF in listener shutdown"
  • debc0d5: tg:msg: add v2 note for t/mptcp-fix-UaF-in-listener-shutdown-net
  • d844182: tg:msg: add v2 note for t/mptcp-use-the-workqueue-to-destroy-unaccepted-sockets-net
  • Results: 514a81f..32ae3eb (export-net)
  • Results: 9c607bd..aea9c8c (export)

Tests are now in progress:

https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export-net/20230228T175908
https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export/20230228T175908

Please re-open it if it doesn't help.

@matttbe closed this as completed on Feb 28, 2023
@matttbe
Member

matttbe commented Feb 28, 2023

@pabeni: It looks like the new warnings you added are being hit:

# ./packetdrill/run_all.py -l -v mptcp/fastopen
(...)
[  561.421251] ------------[ cut here ]------------
[  561.427204] WARNING: CPU: 2 PID: 14229 at net/mptcp/protocol.c:2352 __mptcp_close_ssk (net/mptcp/protocol.c:2352 (discriminator 1)) 
[  561.438275] Modules linked in: xt_mark nft_compat nft_tproxy nf_tproxy_ipv6 nf_tproxy_ipv4 nft_socket nf_socket_ipv4 nf_socket_ipv6 nf_tables sch_netem mptcp_diag inet_diag mptcp_token_test mptcp_crypto_test kunit
[  561.453518] CPU: 2 PID: 14229 Comm: kworker/2:1 Tainted: G                 N 6.2.0-g496324b5ba80 #1
[  561.463336] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[  561.471713] Workqueue: events mptcp_worker
[  561.475511] RIP: 0010:__mptcp_close_ssk (net/mptcp/protocol.c:2352 (discriminator 1)) 
[ 561.480401] Code: ff e9 72 fe ff ff be 03 00 00 00 4c 89 ef e8 ff 6a 9e ff e9 84 fe ff ff be 04 00 00 00 4c 89 ef e8 ed 6a 9e ff e9 4e fe ff ff <0f> 0b eb b4 0f 0b eb 93 66 0f 1f 00 0f 1f 44 00 00 41 55 49 89 d5
All code
========
   0:	ff                   	(bad)  
   1:	e9 72 fe ff ff       	jmp    0xfffffffffffffe78
   6:	be 03 00 00 00       	mov    $0x3,%esi
   b:	4c 89 ef             	mov    %r13,%rdi
   e:	e8 ff 6a 9e ff       	call   0xffffffffff9e6b12
  13:	e9 84 fe ff ff       	jmp    0xfffffffffffffe9c
  18:	be 04 00 00 00       	mov    $0x4,%esi
  1d:	4c 89 ef             	mov    %r13,%rdi
  20:	e8 ed 6a 9e ff       	call   0xffffffffff9e6b12
  25:	e9 4e fe ff ff       	jmp    0xfffffffffffffe78
  2a:*	0f 0b                	ud2    		<-- trapping instruction
  2c:	eb b4                	jmp    0xffffffffffffffe2
  2e:	0f 0b                	ud2    
  30:	eb 93                	jmp    0xffffffffffffffc5
  32:	66 0f 1f 00          	nopw   (%rax)
  36:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  3b:	41 55                	push   %r13
  3d:	49 89 d5             	mov    %rdx,%r13

Code starting with the faulting instruction
===========================================
   0:	0f 0b                	ud2    
   2:	eb b4                	jmp    0xffffffffffffffb8
   4:	0f 0b                	ud2    
   6:	eb 93                	jmp    0xffffffffffffff9b
   8:	66 0f 1f 00          	nopw   (%rax)
   c:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  11:	41 55                	push   %r13
  13:	49 89 d5             	mov    %rdx,%r13
[  561.496040] RSP: 0018:ffff9c9008117e30 EFLAGS: 00010246
[  561.499828] RAX: 0000000000000100 RBX: ffff9a29832d1a40 RCX: 0000000000000000
[  561.505969] RDX: 0000000000000000 RSI: 00000000fffffe00 RDI: ffffffff963d9322
[  561.513316] RBP: ffff9a29891c7000 R08: ffff9a29811086b0 R09: ffff9a298503e7f4
[  561.519876] R10: 0000000000000018 R11: 0000000000000018 R12: ffff9a29831e6180
[  561.525694] R13: 0000000000000002 R14: 0000000000000000 R15: 0000000000000001
[  561.532257] FS:  0000000000000000(0000) GS:ffff9a29fdd00000(0000) knlGS:0000000000000000
[  561.538580] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  561.543285] CR2: 00007fff93d87000 CR3: 0000000006daa001 CR4: 0000000000170ee0
[  561.548558] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  561.553815] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  561.558513] Call Trace:
[  561.560415]  <TASK>
[  561.562364] mptcp_worker (net/mptcp/protocol.c:2417) 
[  561.565236] process_one_work (kernel/workqueue.c:2395) 
[  561.568346] worker_thread (include/linux/list.h:292) 
[  561.571289] ? process_one_work (kernel/workqueue.c:2480) 
[  561.574597] kthread (kernel/kthread.c:376) 
[  561.576998] ? kthread_complete_and_exit (kernel/kthread.c:331) 
[  561.580599] ret_from_fork (arch/x86/entry/entry_64.S:314) 
[  561.583185]  </TASK>
[  561.584973] ---[ end trace 0000000000000000 ]---

I have reverted the patches for now so that the CI does not report these warnings for other tests; I hope that's OK.

@matttbe reopened this on Feb 28, 2023
@cpaasch
Member Author

cpaasch commented Feb 28, 2023

Just saw the same on #367. Saw your comment here too late :)

@matttbe
Member

matttbe commented Mar 6, 2023

We can close this one; the new issue is handled in #367.

@matttbe closed this as completed on Mar 6, 2023