Runtime assert in System.Runtime.Remoting #6998

Closed
kumpera opened this Issue Feb 13, 2018 · 7 comments

kumpera commented Feb 13, 2018

Found in CI.

Looks like this:

***** /mnt/jenkins/workspace/test-mono-pull-request-coop-i386/mcs/class/lib/net_4_x-linux/tests/net_4_x_System.Runtime.Remoting_test.dll
***** MonoTests.Remoting.ActivationTests
***** MonoTests.Remoting.ActivationTests.TestCreateHttpCao
***** MonoTests.Remoting.ActivationTests.TestCreateTcpCao
***** MonoTests.Remoting.ActivationTests.TestCreateTcpWkoSingleCall
***** MonoTests.Remoting.ActivationTests.TestCreateTcpWkoSingleton
* Assertion at mono-threads.c:563, condition `info' not met

Runtime backtrace useless.

kumpera commented Feb 13, 2018

marek-safar commented Feb 15, 2018

Is this really coop specific?

marek-safar commented Feb 15, 2018

@lambdageek lambdageek self-assigned this Mar 8, 2018

lambdageek commented Mar 8, 2018

We have a theory for https://jenkins.mono-project.com/job/test-mono-pull-request-coop/1682

The issue is these two threads:

Thread 1 (Thread 0x7f81db92c740 (LWP 86025)):
#0  0x00007f81dadf9536 in do_futex_wait.constprop () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007f81dadf95e4 in __new_sem_wait_slow.constprop.0 () from /lib/x86_64-linux-gnu/libpthread.so.0
#2  0x000055956730f941 in mono_os_sem_wait (flags=<optimized out>, sem=<optimized out>) at ../../mono/utils/mono-os-semaphore.h:209
#3  mono_os_sem_timedwait (sem=0x559567669160 <suspend_semaphore>, flags=MONO_SEM_FLAGS_NONE, timeout_ms=4294967295) at ../../mono/utils/mono-os-semaphore.h:242
#4  mono_threads_wait_pending_operations () at mono-threads.c:246
#5  0x00005595673105d8 in mono_thread_info_abort_socket_syscall_for_close (tid=<optimized out>) at mono-threads.c:1150
#6  0x0000559567310f47 in mono_thread_info_finish_interrupt (token=0x7f81a80093a0) at mono-threads.c:1554
#7  0x0000559567255f6d in async_abort_internal (thread=0x7f8197562b78, install_async_abort=1) at threads.c:4988
#8  0x0000559567259bc7 in ves_icall_System_Threading_Thread_Abort (thread=0x7f8197562b78, state=<optimized out>) at threads.c:2419
#9  0x000000004099fbf6 in ?? ()
#10 0x00007fffc5532ef0 in ?? ()
#11 0x00007fffc5532f00 in ?? ()
#12 0x00007f8197529550 in ?? ()
#13 0x00007f81da70a5a8 in ?? ()
#14 0x00007f81da727a78 in ?? ()
#15 0x00007fffc5533080 in ?? ()
#16 0x00007fffc5532f70 in ?? ()
#17 0x00007fffc5532e40 in ?? ()
#18 0x00007f81da727a78 in ?? ()
#19 0x000000004099fb6b in ?? ()
#20 0x00007f81da727a78 in ?? ()
#21 0x00000000409a0c8b in ?? ()
#22 0x0000559567313aa8 in mono_threads_enter_gc_safe_region_unbalanced_with_info (info=0x7fffc5532ef0, stackdata=<optimized out>) at mono-threads-coop.c:241
#23 0x0000000000000001 in ?? ()
#24 0x0000000000000006 in ?? ()
#25 0x00007fffc5533020 in ?? ()
#26 0x0000000040985974 in ?? ()
#27 0x0000000000000000 in ?? ()

Thread 1 is in mono_thread_info_abort_socket_syscall_for_close and has just sent an abort-syscall signal to Thread 30:

Thread 30 (Thread 0x7f8190d21700 (LWP 86455)):
#0  0x00007f81dadfab3a in waitpid () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00005595670eff8e in mono_handle_native_crash (signal=0x55956733ffd6 "SIGABRT", ctx=<optimized out>, info=<optimized out>) at mini-exceptions.c:2723
#2  <signal handler called>
#3  0x00007f81da866fcf in raise () from /lib/x86_64-linux-gnu/libc.so.6
#4  0x00007f81da8683fa in abort () from /lib/x86_64-linux-gnu/libc.so.6
#5  0x0000559567305a34 in mono_log_write_logfile (log_domain=<optimized out>, level=<optimized out>, hdr=<optimized out>, message=0x7f81b402d7c0 "* Assertion at mono-threads.c:563, condition `info' not met\n") at mono-log-common.c:135
#6  0x000055956731ae60 in monoeg_g_logv (log_domain=log_domain@entry=0x0, log_level=log_level@entry=G_LOG_LEVEL_ERROR, format=format@entry=0x559567324138 "* Assertion at %s:%d, condition `%s' not met\n", args=args@entry=0x7f8190d206b8) at goutput.c:115
#7  0x000055956731afb6 in monoeg_assertion_message (format=format@entry=0x559567324138 "* Assertion at %s:%d, condition `%s' not met\n") at goutput.c:135
#8  0x000055956730f0e1 in mono_thread_info_current () at mono-threads.c:563
#9  0x00005595673131fc in suspend_signal_handler (_dummy=<optimized out>, info=<optimized out>, context=0x7f8190d20980) at mono-threads-posix-signals.c:134
#10 <signal handler called>
#11 0x00007f81da918597 in madvise () from /lib/x86_64-linux-gnu/libc.so.6
#12 0x00007f81dadf172a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#13 0x00007f81da91caff in clone () from /lib/x86_64-linux-gnu/libc.so.6

But Thread 30 is already shutting down when it receives the signal, and it can no longer look up its own thread info. (It has returned into start_thread and already freed all its TLS variables; that madvise call is quite late in the pthreads thread shutdown code.)
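To illustrate why a thread can lose access to its own TLS data while it is still technically alive, here is a minimal sketch using plain POSIX thread-specific data. It is not Mono code: `info_key`, `info_destructor`, `worker` and `tls_visible_during_teardown` are hypothetical names, and the sketch assumes the thread info lives in a pthread key with a destructor. POSIX sets the key's value to NULL before calling the destructor, so a lookup made during teardown already comes back empty.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the TLS key that holds a thread's info pointer. */
static pthread_key_t info_key;
/* 1 if the slot still held a value inside the destructor, 0 if it was NULL. */
static int seen_in_destructor = -1;

/* Runs on the exiting thread, after its start routine has returned. */
static void info_destructor(void *old_value)
{
    (void)old_value;
    /* POSIX clears the slot before invoking the destructor, so this
     * lookup no longer finds the value the thread stored earlier. */
    seen_in_destructor = (pthread_getspecific(info_key) != NULL);
}

static void *worker(void *arg)
{
    pthread_setspecific(info_key, arg);
    return NULL; /* returning triggers the key destructors */
}

/* Returns 0 if TLS was already empty during teardown, 1 if it was still
 * set, and -1 if the key or the thread could not be created. */
int tls_visible_during_teardown(void)
{
    int dummy_info = 42;
    pthread_t th;

    seen_in_destructor = -1;
    if (pthread_key_create(&info_key, info_destructor) != 0)
        return -1;
    if (pthread_create(&th, NULL, worker, &dummy_info) != 0) {
        pthread_key_delete(info_key);
        return -1;
    }
    pthread_join(th, NULL);
    pthread_key_delete(info_key);
    return seen_in_destructor;
}
```

A signal handler that runs in this window and asserts that the thread info is non-NULL would fail in the same way as frame #8 above.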

This happens because mono_thread_info_abort_socket_syscall_for_close in Thread 1 races with the shutdown of Thread 30: Thread 1 looks up the thread info for Thread 30 while it's still running, but by the time Thread 1 takes the thread info suspend lock, Thread 30 is basically destroyed.

mono/mono/utils/mono-threads.c

Lines 1137 to 1149 in c2bf82f

	info = mono_thread_info_lookup (tid);
	if (!info)
		return;
	if (mono_thread_info_run_state (info) == STATE_DETACHED) {
		mono_hazard_pointer_clear (hp, 1);
		return;
	}
	mono_thread_info_suspend_lock ();
	mono_threads_begin_global_suspend ();
	mono_threads_suspend_abort_syscall (info);

We need to do another info = mono_thread_info_lookup (tid) once we take the thread info suspend lock.
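A minimal sketch of that idea, doing the lookup only once the lock is held. This is not the actual Mono code: `toy_thread_info`, `registry`, `toy_lookup_locked`, `toy_register`, `toy_unregister` and `toy_abort_syscall` are hypothetical names, a plain pthread mutex stands in for the thread info suspend lock, and bumping a counter stands in for sending the signal. It assumes threads are removed from the registry while holding that same lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for MonoThreadInfo. */
typedef struct {
    int tid;
    int signals_sent;
} toy_thread_info;

#define MAX_THREADS 8
static toy_thread_info *registry[MAX_THREADS];
static pthread_mutex_t suspend_lock = PTHREAD_MUTEX_INITIALIZER;

/* Caller must hold suspend_lock. */
toy_thread_info *toy_lookup_locked(int tid)
{
    for (size_t i = 0; i < MAX_THREADS; i++)
        if (registry[i] && registry[i]->tid == tid)
            return registry[i];
    return NULL;
}

/* Returns 1 on success, 0 if the registry is full. */
int toy_register(toy_thread_info *info)
{
    int ok = 0;
    pthread_mutex_lock(&suspend_lock);
    for (size_t i = 0; i < MAX_THREADS && !ok; i++) {
        if (!registry[i]) {
            registry[i] = info;
            ok = 1;
        }
    }
    pthread_mutex_unlock(&suspend_lock);
    return ok;
}

/* Removal happens under the same lock the signalling side takes. */
void toy_unregister(int tid)
{
    pthread_mutex_lock(&suspend_lock);
    for (size_t i = 0; i < MAX_THREADS; i++)
        if (registry[i] && registry[i]->tid == tid)
            registry[i] = NULL;
    pthread_mutex_unlock(&suspend_lock);
}

/* Returns 1 if the target was "signalled", 0 if it was already gone. */
int toy_abort_syscall(int tid)
{
    int sent = 0;
    pthread_mutex_lock(&suspend_lock);
    /* Look up only after taking the lock: the entry cannot be removed
     * between this lookup and the "signal" below. */
    toy_thread_info *info = toy_lookup_locked(tid);
    if (info) {
        info->signals_sent++; /* stands in for sending the signal */
        sent = 1;
    }
    pthread_mutex_unlock(&suspend_lock);
    return sent;
}
```

With this shape, a caller that targets a thread which has already unregistered gets 0 back and nothing is signalled, instead of acting on a stale pointer obtained before the lock was taken.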

lambdageek commented Mar 8, 2018

lambdageek added a commit to lambdageek/mono that referenced this issue Mar 8, 2018

[threads] Fix race between abort socket syscall and thread shutdown
The race happens when we call mono_thread_info_abort_socket_syscall_for_close on
a thread that's about to shut down.

We look up the MonoThreadInfo for that thread outside the
mono_thread_info_suspend_lock, and then use it to send the signal once we've
taken the lock.

In between those two actions, the other thread already finished
unregister_thread and pthreads may already have destroyed its TLS keys and
otherwise started to dismantle the thread.

The fix is to do a second mono_thread_info_lookup once we have the lock and
only signal the other thread if it is still around.

Fixes mono#6998

lambdageek commented Mar 8, 2018

Caught it in gdb on my Linux VM (it needs a pretty high concurrent load to reproduce). I can confirm that the thread issuing the abort syscall is trying to signal the thread that's shutting down.

monojenkins added a commit to monojenkins/mono that referenced this issue Mar 9, 2018

Fix race between abort socket syscall and thread shutdown

@luhenry luhenry closed this in #7507 Mar 9, 2018

luhenry added a commit that referenced this issue Mar 9, 2018

[2018-02] [threads] Fix race between abort socket syscall and thread shutdown (#7521)

* Fix race between abort socket syscall and thread shutdown

Fixes #6998

* Simplify mono_thread_info_abort_socket_syscall_for_close

Call mono_thread_info_lookup while holding the thread info suspend lock.
We no longer need to check for STATE_DETACHED because removing MonoThreadInfo on
detach is done while holding the suspend lock.

luhenry added a commit that referenced this issue Mar 9, 2018

[threads] Fix race between abort socket syscall and thread shutdown (#7507)

* [threads] Fix race between abort socket syscall and thread shutdown

Fixes #6998

* [threads] Simplify mono_thread_info_abort_socket_syscall_for_close

jonpryor added a commit to xamarin/xamarin-android that referenced this issue Apr 25, 2018

Bump to mono/2018-02/0c5a524e (#1289)
Bumps to Java.Interop/master/0afb2b0f
Bumps to llvm/master/a9cfb50e.

Fixes: https://bugzilla.xamarin.com/show_bug.cgi?id=11771
Fixes: https://bugzilla.xamarin.com/show_bug.cgi?id=15051
Fixes: https://bugzilla.xamarin.com/show_bug.cgi?id=19436
Fixes: https://bugzilla.xamarin.com/show_bug.cgi?id=45901
Fixes: https://bugzilla.xamarin.com/show_bug.cgi?id=56071
Fixes: https://bugzilla.xamarin.com/show_bug.cgi?id=58413
Fixes: https://bugzilla.xamarin.com/show_bug.cgi?id=58413
Fixes: https://bugzilla.xamarin.com/show_bug.cgi?id=58413
Fixes: https://bugzilla.xamarin.com/show_bug.cgi?id=59184
fixes: https://bugzilla.xamarin.com/show_bug.cgi?id=60065
Fixes: https://bugzilla.xamarin.com/show_bug.cgi?id=60225
Fixes: https://bugzilla.xamarin.com/show_bug.cgi?id=60298
Fixes: https://bugzilla.xamarin.com/show_bug.cgi?id=60359
Fixes: https://bugzilla.xamarin.com/show_bug.cgi?id=60568
Fixes: https://bugzilla.xamarin.com/show_bug.cgi?id=60756
Fixes: https://bugzilla.xamarin.com/show_bug.cgi?id=60848
Fixes: https://bugzilla.xamarin.com/show_bug.cgi?id=60862
Fixes: https://bugzilla.xamarin.com/show_bug.cgi?id=60900
Fixes: https://bugzilla.xamarin.com/show_bug.cgi?id=60904
Fixes: https://bugzilla.xamarin.com/show_bug.cgi?id=60986
Fixes: https://github.com/mono/mono/issues/59400
Fixes: mono/mono#6169
Fixes: mono/mono#6187
Fixes: mono/mono#6192
Fixes: mono/mono#6255
Fixes: mono/mono#6264
Fixes: mono/mono#6266
Fixes: mono/mono#6281
Fixes: mono/mono#6283
Fixes: mono/mono#6320
Fixes: mono/mono#6339
Fixes: mono/mono#6343
Fixes: mono/mono#6349
Fixes: mono/mono#6379
Fixes: mono/mono#6383
Fixes: mono/mono#6401.
Fixes: mono/mono#6411
Fixes: mono/mono#6414
Fixes: mono/mono#6490
Fixes: mono/mono#6721
Fixes: mono/mono#6767
Fixes: mono/mono#6777
Fixes: mono/mono#6848
Fixes: mono/mono#6940
Fixes: mono/mono#6948
Fixes: mono/mono#6998
Fixes: mono/mono#7016
Fixes: mono/mono#7085
Fixes: mono/mono#7086
Fixes: mono/mono#7095
Fixes: mono/mono#7137
Fixes: mono/mono#7184
Fixes: mono/mono#7240
Fixes: mono/mono#7262
Fixes: mono/mono#7289
Fixes: mono/mono#7338
Fixes: mono/mono#7356
Fixes: mono/mono#7364
Fixes: mono/mono#7378
Fixes: mono/mono#7389
Fixes: mono/mono#7460
Fixes: mono/mono#7535
Fixes: mono/mono#7536
Fixes: mono/mono#7610
Fixes: mono/mono#7624
Fixes: mono/mono#7637
Fixes: mono/mono#7655
Fixes: mono/mono#7657
Fixes: mono/mono#7685
Fixes: mono/mono#7786
Fixes: mono/mono#7792
Fixes: mono/mono#7822
Fixes: mono/mono#7860
Fixes: mono/mono#8089
Fixes: mono/mono#8267
Fixes: mono/mono#8409
Fixes: xamarin/maccore#628
Fixes: xamarin/maccore#629
Fixes: xamarin/maccore#673
Fixes: xamarin/maccore#673
Fixes: #1561

jonpryor added a commit to xamarin/xamarin-android that referenced this issue Aug 8, 2018

Bump to mono/mono:2018-04@f3a2216b (#1503)
Fixes: #1130
Fixes: #1561 (comment)
Fixes: #1845
Fixes: #1951

Context: https://bugzilla.xamarin.com/show_bug.cgi?id=10087
Context: https://bugzilla.xamarin.com/show_bug.cgi?id=11771
Context: https://bugzilla.xamarin.com/show_bug.cgi?id=12850
Context: https://bugzilla.xamarin.com/show_bug.cgi?id=18941
Context: https://bugzilla.xamarin.com/show_bug.cgi?id=19436
Context: https://bugzilla.xamarin.com/show_bug.cgi?id=25444
Context: https://bugzilla.xamarin.com/show_bug.cgi?id=33208
Context: https://bugzilla.xamarin.com/show_bug.cgi?id=58413
Context: https://bugzilla.xamarin.com/show_bug.cgi?id=59184
Context: https://bugzilla.xamarin.com/show_bug.cgi?id=59400
Context: https://bugzilla.xamarin.com/show_bug.cgi?id=59779
Context: https://bugzilla.xamarin.com/show_bug.cgi?id=60065
Context: https://bugzilla.xamarin.com/show_bug.cgi?id=60843
Context: mono/mono#6174
Context: mono/mono#6178
Context: mono/mono#6180
Context: mono/mono#6181
Context: mono/mono#6186
Context: mono/mono#6187
Context: mono/mono#6211
Context: mono/mono#6266
Context: mono/mono#6579
Context: mono/mono#6666
Context: mono/mono#6752
Context: mono/mono#6801
Context: mono/mono#6812
Context: mono/mono#6848
Context: mono/mono#6940
Context: mono/mono#6948
Context: mono/mono#6998
Context: mono/mono#6999
Context: mono/mono#7016
Context: mono/mono#7085
Context: mono/mono#7086
Context: mono/mono#7095
Context: mono/mono#7134
Context: mono/mono#7137
Context: mono/mono#7145
Context: mono/mono#7184
Context: mono/mono#7240
Context: mono/mono#7262
Context: mono/mono#7289
Context: mono/mono#7338
Context: mono/mono#7356
Context: mono/mono#7364
Context: mono/mono#7378
Context: mono/mono#7389
Context: mono/mono#7449
Context: mono/mono#7460
Context: mono/mono#7535
Context: mono/mono#7536
Context: mono/mono#7537
Context: mono/mono#7565
Context: mono/mono#7588
Context: mono/mono#7596
Context: mono/mono#7610
Context: mono/mono#7613
Context: mono/mono#7620
Context: mono/mono#7624
Context: mono/mono#7637
Context: mono/mono#7655
Context: mono/mono#7657
Context: mono/mono#7661
Context: mono/mono#7685
Context: mono/mono#7696
Context: mono/mono#7729
Context: mono/mono#7786
Context: mono/mono#7792
Context: mono/mono#7805
Context: mono/mono#7822
Context: mono/mono#7828
Context: mono/mono#7860
Context: mono/mono#7864
Context: mono/mono#7903
Context: mono/mono#7920
Context: mono/mono#8089
Context: mono/mono#8143
Context: mono/mono#8267
Context: mono/mono#8311
Context: mono/mono#8340
Context: mono/mono#8409
Context: mono/mono#8417
Context: mono/mono#8430
Context: mono/mono#8698
Context: mono/mono#8701
Context: mono/mono#8712
Context: mono/mono#8721
Context: mono/mono#8726
Context: mono/mono#8866
Context: mono/mono#9023
Context: mono/mono#9031
Context: mono/mono#9033
Context: mono/mono#9044
Context: mono/mono#9179
Context: mono/mono#9318
Context: mono/mono#9318
Context: xamarin/maccore#628
Context: xamarin/maccore#629
Context: xamarin/maccore#673

jonpryor added a commit to xamarin/xamarin-android that referenced this issue Aug 13, 2018

Bump to mono/mono:2018-04@f3a2216b (#1503)