Self-deadlock in usrsctp_close while recv_callback is being invoked #712

Closed
JonathanLennox opened this issue Apr 22, 2024 · 1 comment
JonathanLennox commented Apr 22, 2024

I encountered a stack in which sctp_free_assoc tries to acquire inp->inp_mtx even though it is reached from sctp_close (via sctp_inpcb_free), which already holds that lock. Because usrsctplib does not use recursive mutexes, the thread deadlocks against itself.

This happened while the usrsctplib recv_callback was being invoked, which may be the cause of the problem.
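
To illustrate why relocking a non-recursive mutex is fatal, here is a minimal sketch that is independent of usrsctp. The helper name `relock_error_code` is hypothetical. It uses an error-checking mutex so the second lock attempt reports `EDEADLK` instead of hanging; with a default (normal) mutex the same call would block forever, which is the kind of hang described above.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper (not part of usrsctp): locks an error-checking,
 * non-recursive mutex twice from the same thread and returns the result
 * of the second pthread_mutex_lock() call. */
int relock_error_code(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t mtx;
    int rc;

    pthread_mutexattr_init(&attr);
    /* ERRORCHECK makes the self-deadlock visible as an error code;
     * a NORMAL mutex would simply block here and never return. */
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&mtx, &attr);
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&mtx);       /* first acquisition succeeds */
    rc = pthread_mutex_lock(&mtx);  /* relock by the current owner */

    pthread_mutex_unlock(&mtx);     /* release the single hold we have */
    pthread_mutex_destroy(&mtx);
    return rc;
}
```

Calling `relock_error_code()` from a small `main` and comparing the result against `EDEADLK` is enough to try it out; link with `-pthread`.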

Excerpted gdb info:

(gdb) info threads
  Id   Target Id                                           Frame
* 1    Thread 0xffffb0ab8020 (LWP 2195041) "crash_repro"   futex_wait (private=0, expected=2, futex_word=0xaaaad6579568)
    at ../sysdeps/nptl/futex-internal.h:146
  12   Thread 0xffffaa6cf120 (LWP 2198629) "crash_repro"   futex_wait (private=0, expected=2, futex_word=0xaaaad657e290)
    at ../sysdeps/nptl/futex-internal.h:146
(gdb) bt
#0  futex_wait (private=0, expected=2, futex_word=0xaaaad6579568) at ../sysdeps/nptl/futex-internal.h:146
#1  __GI___lll_lock_wait (futex=futex@entry=0xaaaad6579568, private=private@entry=0) at ./nptl/lowlevellock.c:49
#2  0x0000ffffb081070c in lll_mutex_lock_optimized (mutex=0xaaaad6579568) at ./nptl/pthread_mutex_lock.c:48
#3  ___pthread_mutex_lock (mutex=mutex@entry=0xaaaad6579568) at ./nptl/pthread_mutex_lock.c:93
#4  0x0000ffffb0a3e38c in sctp_free_assoc (inp=inp@entry=0xaaaad65791c0, stcb=stcb@entry=0xaaaad657da70, from_inpcbfree=from_inpcbfree@entry=2,
    from_location=from_location@entry=536870920) at ../../usrsctplib/netinet/sctp_pcb.c:5522
#5  0x0000ffffb0a40d8c in sctp_inpcb_free (inp=inp@entry=0xaaaad65791c0, immediate=immediate@entry=1, from=from@entry=1)
    at ../../usrsctplib/netinet/sctp_pcb.c:4116
#6  0x0000ffffb0a4855c in sctp_close (so=so@entry=0xaaaad657adf0) at ../../usrsctplib/netinet/sctp_usrreq.c:891
#7  0x0000ffffb09f3aa0 in sofree (so=0xaaaad657adf0) at ../../usrsctplib/user_socket.c:287
#8  0x0000ffffb09f7b80 in usrsctp_close (so=<optimized out>) at ../../usrsctplib/user_socket.c:2005
#9  0x0000aaaac3cf20f8 in close_socket (o=0xffff9400f460) at crash_repro.c:164
#4  0x0000ffffb0a3e38c in sctp_free_assoc (inp=inp@entry=0xaaaad65791c0, stcb=stcb@entry=0xaaaad657da70, from_inpcbfree=from_inpcbfree@entry=2,
    from_location=from_location@entry=536870920) at ../../usrsctplib/netinet/sctp_pcb.c:5522
5522                    SCTP_INP_READ_LOCK(inp);
(gdb) p inp->inp_mtx
$1 = {__data = {__lock = 1, __count = 0, __owner = 2195041, __nusers = 1, __kind = 2, __spins = 0, __list = {__prev = 0x0, __next = 0x0}},
  __size = "\001\000\000\000\000\000\000\000a~!\000\001\000\000\000\002", '\000' <repeats 30 times>, __align = 1}
(gdb) thread 12
[Switching to thread 12 (Thread 0xffffaa6cf120 (LWP 2198629))]
#0  futex_wait (private=0, expected=2, futex_word=0xaaaad657e290) at ../sysdeps/nptl/futex-internal.h:146
146     ../sysdeps/nptl/futex-internal.h: No such file or directory.
(gdb) bt
#0  futex_wait (private=0, expected=2, futex_word=0xaaaad657e290) at ../sysdeps/nptl/futex-internal.h:146
#1  __GI___lll_lock_wait (futex=futex@entry=0xaaaad657e290, private=private@entry=0) at ./nptl/lowlevellock.c:49
#2  0x0000ffffb081070c in lll_mutex_lock_optimized (mutex=0xaaaad657e290) at ./nptl/pthread_mutex_lock.c:48
#3  ___pthread_mutex_lock (mutex=mutex@entry=0xaaaad657e290) at ./nptl/pthread_mutex_lock.c:93
#4  0x0000ffffb0a6dfc0 in sctp_invoke_recv_callback (inp=inp@entry=0xaaaad65791c0, stcb=stcb@entry=0xaaaad657da70, control=control@entry=0xffff9400df70,
    inp_read_lock_held=inp_read_lock_held@entry=1) at ../../usrsctplib/netinet/sctputil.c:5349
#5  0x0000ffffb0a6e45c in sctp_add_to_readq (inp=0xaaaad65791c0, stcb=stcb@entry=0xaaaad657da70, control=0xffff9400df70, sb=0xaaaad657aea8,
    end=end@entry=1, inp_read_lock_held=inp_read_lock_held@entry=1, so_locked=so_locked@entry=0) at ../../usrsctplib/netinet/sctputil.c:5456
#6  0x0000ffffb0a6eb64 in sctp_notify_send_failed (stcb=stcb@entry=0xaaaad657da70, sent=sent@entry=1 '\001', error=error@entry=12,
    chk=chk@entry=0xffff940108a0, so_locked=so_locked@entry=0) at ../../usrsctplib/netinet/sctputil.c:3624
#7  0x0000ffffb0a70824 in sctp_ulp_notify (notification=notification@entry=5, stcb=stcb@entry=0xaaaad657da70, error=error@entry=12,
    data=data@entry=0xffff940108a0, so_locked=so_locked@entry=0) at ../../usrsctplib/netinet/sctputil.c:4334
#8  0x0000ffffb0a7199c in sctp_report_all_outbound (stcb=stcb@entry=0xaaaad657da70, error=error@entry=12, so_locked=so_locked@entry=0)
    at ../../usrsctplib/netinet/sctputil.c:4492
#9  0x0000ffffb0a768fc in sctp_abort_notification (so_locked=0, abort=0xffff940067bc, error=12, timeout=false, from_peer=true, stcb=0xaaaad657da70)
    at ../../usrsctplib/netinet/sctputil.c:4584
#10 sctp_abort_notification (stcb=stcb@entry=0xaaaad657da70, from_peer=from_peer@entry=true, timeout=timeout@entry=false, error=error@entry=12,
    abort=abort@entry=0xffff940067bc, so_locked=so_locked@entry=0) at ../../usrsctplib/netinet/sctputil.c:4556
#11 0x0000ffffb0a11df0 in sctp_handle_abort (abort=abort@entry=0xffff940067bc, stcb=stcb@entry=0xaaaad657da70, net=0xffff9400fb80)
    at ../../usrsctplib/netinet/sctp_input.c:851
#12 0x0000ffffb0a185f4 in sctp_process_control (m=m@entry=0xffff94006760, iphlen=iphlen@entry=0, offset=offset@entry=0xffffaa6ce66c,
    length=length@entry=20, src=src@entry=0xffffaa6ce7f8, dst=dst@entry=0xffffaa6ce808, sh=sh@entry=0xffff940067b0, ch=ch@entry=0xffff940067bc,
    inp=<optimized out>, stcb=stcb@entry=0xaaaad657da70, netp=netp@entry=0xffffaa6ce680, fwd_tsn_seen=fwd_tsn_seen@entry=0xffffaa6ce674,
    vrf_id=vrf_id@entry=0, port=port@entry=0) at ../../usrsctplib/netinet/sctp_input.c:5214
#13 0x0000ffffb0a1b184 in sctp_common_input_processing (mm=mm@entry=0xffffaa6ce7f0, iphlen=iphlen@entry=0, offset=<optimized out>, offset@entry=12,
    length=length@entry=20, src=src@entry=0xffffaa6ce7f8, dst=dst@entry=0xffffaa6ce808, sh=0xffff940067b0, ch=0xffff940067bc, compute_crc=1 '\001',
    ecn_bits=ecn_bits@entry=0 '\000', vrf_id=vrf_id@entry=0, port=port@entry=0) at ../../usrsctplib/netinet/sctp_input.c:5939
#14 0x0000ffffb09f9940 in usrsctp_conninput (addr=<optimized out>, buffer=0xffff9400bae0, length=20, ecn_bits=ecn_bits@entry=0 '\000')
    at ../../usrsctplib/user_socket.c:3321
#15 0x0000aaaac3cf1ad4 in input_packet_data (arg=0xffff9400e6f0) at crash_repro.c:373
#16 0x0000ffffb080d5c8 in start_thread (arg=0x0) at ./nptl/pthread_create.c:442
#17 0x0000ffffb0875edc in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:79
JonathanLennox commented

Unfortunately, this looks like a consequence of #710: the library unexpectedly depends on the socket's reference count not dropping to zero during sctp_common_input_processing.
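
The general pattern behind that dependency can be sketched as follows. This is not usrsctp code: `struct obj`, `obj_hold`, `obj_release`, `closing_callback`, and `process_with_hold` are all hypothetical names, and the counter is deliberately single-threaded (a real implementation would need atomics or a lock). The sketch shows processing code taking its own reference so that a callback which drops the owner's reference cannot tear the object down mid-processing.

```c
#include <stdbool.h>

/* Hypothetical reference-counted object; not a usrsctp type. */
struct obj {
    int  refs;   /* outstanding references (not thread-safe in this sketch) */
    bool freed;  /* set when the last reference is dropped */
};

static void obj_hold(struct obj *o) { o->refs++; }

static void obj_release(struct obj *o)
{
    if (--o->refs == 0) {
        o->freed = true;  /* stand-in for the real teardown */
    }
}

/* Stand-in for a user callback that closes the object,
 * dropping the owner's reference. */
static void closing_callback(struct obj *o) { obj_release(o); }

/* Holds an extra reference for the duration of processing.
 * Returns true if the object was still alive right after the callback. */
bool process_with_hold(struct obj *o)
{
    bool alive_after_callback;

    obj_hold(o);                     /* processing takes its own reference */
    closing_callback(o);             /* owner's reference goes away here */
    alive_after_callback = !o->freed;
    obj_release(o);                  /* last reference: teardown happens now */
    return alive_after_callback;
}
```

Without the `obj_hold` call, the count would reach zero inside the callback and the rest of the processing function would touch a torn-down object.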
