Deadlock: usrsctp_conninput and user_sctp_timer_iterate #649

Open
eremeev opened this issue Feb 18, 2022 · 3 comments

@eremeev

eremeev commented Feb 18, 2022

My application appears to deadlock.
I am using commit 5f3540a.

SCTP runs over DTLS/UDP.
I used https://github.com/jitsi/jitsi-sctp/blob/master/jniwrapper/native/src/org_jitsi_modified_sctp4j_SctpJni.c as a template: the receive callback and the send-threshold callback are passed to usrsctp_socket, and the send callback is passed to usrsctp_init.

It seems that user_sctp_timer_iterate deadlocks with usrsctp_conninput.
usrsctp_conninput and usrsctp_close run on the DTLS threads (IO threads).
Each SctpSocket is accessed from only one thread.

Please see the thread dumps:

Thread 1607 (Thread 0x7f4524e7c700 (LWP 60587)):
#0  0x00007f45c3e784ed in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007f45c3e73dcb in _L_lock_883 () from /lib64/libpthread.so.0
#2  0x00007f45c3e73c98 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007f4527b24873 in sctp_invoke_recv_callback (inp=0x7f44c8ea4200, stcb=0x7f44c8ed4400, control=0x7f44df79f120, inp_read_lock_held=0) at netinet/sctputil.c:5328
#4  0x00007f4527b255fc in sctp_add_to_readq (inp=0x7f44c8ea4200, stcb=0x7f44c8ed4400, control=0x7f44df79f120, sb=0x7f44c8e5beb8, end=1, inp_read_lock_held=0, so_locked=0) at netinet/sctputil.c:5435
#5  0x00007f4527b1e5e9 in sctp_notify_send_failed (stcb=0x7f44c8ed4400, sent=1 '\001', error=0, chk=0x7f455f741680, so_locked=0) at netinet/sctputil.c:3627
#6  0x00007f4527b208bc in sctp_ulp_notify (notification=5, stcb=0x7f44c8ed4400, error=0, data=0x7f455f741680, so_locked=0) at netinet/sctputil.c:4318
#7  0x00007f4527b211c2 in sctp_report_all_outbound (stcb=0x7f44c8ed4400, error=0, so_locked=0) at netinet/sctputil.c:4474
#8  0x00007f4527b223b1 in sctp_abort_notification (stcb=0x7f44c8ed4400, from_peer=false, timeout=false, error=0, abort=0x0, so_locked=0) at netinet/sctputil.c:4565
#9  0x00007f4527b2274f in sctp_abort_an_association (inp=0x7f44c8ea4200, stcb=0x7f44c8ed4400, op_err=0x0, timedout=false, so_locked=0) at netinet/sctputil.c:4755
#10 0x00007f4527aa48c5 in sctp_chunk_retransmission (inp=0x7f44c8ea4200, stcb=0x7f44c8ed4400, asoc=0x7f44c8ed4458, cnt_out=0x7f4524e7a550, now=0x7f4524e7a590, now_filled=0x7f4524e7a558, fr_done=0x7f4524e7a55c, so_locked=0) at netinet/sctp_output.c:10184
#11 0x00007f4527aa5cbd in sctp_chunk_output (inp=0x7f44c8ea4200, stcb=0x7f44c8ed4400, from_where=1, so_locked=0) at netinet/sctp_output.c:10648
#12 0x00007f4527b174a3 in sctp_timeout_handler (t=0x7f44c8e5b030) at netinet/sctputil.c:1917
#13 0x00007f4527a74ddc in sctp_handle_tick (elapsed_ticks=10) at netinet/sctp_callout.c:172
#14 0x00007f4527a75028 in user_sctp_timer_iterate (arg=0x0) at netinet/sctp_callout.c:214
#15 0x00007f45c3e71dd5 in start_thread () from /lib64/libpthread.so.0
#16 0x00007f45c3996ead in clone () from /lib64/libc.so.6

Thread 811 (Thread 0x7f448b02c700 (LWP 61482)):
#0  0x00007f45c3e784ed in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007f45c3e73dcb in _L_lock_883 () from /lib64/libpthread.so.0
#2  0x00007f45c3e73c98 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007f4527b222de in sctp_abort_notification (stcb=0x7f44c8ed4400, from_peer=false, timeout=false, error=0, abort=0x0, so_locked=0) at netinet/sctputil.c:4562
#4  0x00007f4527b2274f in sctp_abort_an_association (inp=0x7f44c8ea4200, stcb=0x7f44c8ed4400, op_err=0x0, timedout=false, so_locked=0) at netinet/sctputil.c:4755
#5  0x00007f4527aa48c5 in sctp_chunk_retransmission (inp=0x7f44c8ea4200, stcb=0x7f44c8ed4400, asoc=0x7f44c8ed4458, cnt_out=0x7f448b029be0, now=0x7f448b029c20, now_filled=0x7f448b029be8, fr_done=0x7f448b029bec, so_locked=0) at netinet/sctp_output.c:10184
#6  0x00007f4527aa5cbd in sctp_chunk_output (inp=0x7f44c8ea4200, stcb=0x7f44c8ed4400, from_where=3, so_locked=0) at netinet/sctp_output.c:10648
#7  0x00007f4527a8ada9 in sctp_common_input_processing (mm=0x7f448b029e80, iphlen=0, offset=132, length=132, src=0x7f448b029ea0, dst=0x7f448b029eb0, sh=0x7f44c622ac50, ch=0x7f44c622ac5c, compute_crc=1 '\001', ecn_bits=0 '\000', vrf_id=0, port=0) at netinet/sctp_input.c:6155
#8  0x00007f4527a723a4 in usrsctp_conninput (addr=0x2144, buffer=0x7f44c622b800, length=132, ecn_bits=0 '\000') at user_socket.c:3336

Many other threads are stuck with the following stack:

Thread 1133 (Thread 0x7f455827b700 (LWP 61126)):
#0  0x00007f45c3e752ce in pthread_rwlock_wrlock () from /lib64/libpthread.so.0
#1  0x00007f4527ac4cd2 in sctp_inpcb_free (inp=0x7f44fb7c2600, immediate=1, from=1) at netinet/sctp_pcb.c:3907
#2  0x00007f4527adcf4f in sctp_close (so=0x7f44ef6cf380) at netinet/sctp_usrreq.c:855
#3  0x00007f4527a68cb8 in sofree (so=0x7f44ef6cf380) at user_socket.c:287
#4  0x00007f4527a6fd76 in usrsctp_close (so=0x7f44ef6cf380) at user_socket.c:2020
@tuexen
Member

tuexen commented Feb 18, 2022

Which locks are Thread 1607 and Thread 811 waiting for, and which threads own those locks?

@eremeev
Author

eremeev commented Mar 14, 2022

Sorry for the late answer.
I have looked through the code. According to the code:
Thread 1607: locks TCB -> locks INP_READ -> unlocks TCB -> unlocks INP_READ -> locks TCB (after that, nothing happens)
Thread 811: locks TCB in sctp_send_abort_tcb -> locks TCB_SEND (after that, nothing happens)
I cannot find the place where Thread 1607 locks TCB_SEND, but it is highly likely that Thread 1607 holds TCB_SEND.

@tuexen
Member

tuexen commented Mar 14, 2022

I'm working on removing the TCB_SEND lock (for reasons other than avoiding a deadlock). Let me finish this, and then it would be great if you could test whether the problem persists.
