unhandled-exception-2.exe crashes crash reporting #10031

Closed
alexanderkyte opened this issue Aug 10, 2018 · 6 comments

Comments


@alexanderkyte alexanderkyte commented Aug 10, 2018

There is an unhandled-exception-2.exe crash on CI that seems to hit the assertion at mini-exceptions.c:1196.

I need to investigate this for bugs week. If the domain isn't set, stack walking shouldn't die spectacularly. It should indicate failure to the caller.
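
To make "indicate failure to the caller" concrete, here is a minimal sketch of the shape I have in mind. It is not the runtime's code: `UnwindState`, `walk_stack_checked`, and `count_frame` are hypothetical names, and the `valid` flag is only assumed to play the role of whatever precondition is missing (an unset domain, an unusable unwind state). The walker reports the problem with a return value and keeps `assert` for genuine programmer errors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the runtime's unwind state. */
typedef struct {
    bool valid;         /* assumed: false when no usable context was captured */
    int  frame_count;   /* illustrative: number of frames available to walk */
} UnwindState;

typedef void (*FrameCallback)(int frame_index, void *user_data);

/* Walks the frames and returns true, or returns false without walking
 * when the state cannot be used. It never aborts on a bad state. */
static bool walk_stack_checked(const UnwindState *state,
                               FrameCallback func, void *user_data)
{
    assert(func != NULL);              /* programmer error: still asserted */
    if (state == NULL || !state->valid)
        return false;                  /* runtime condition: reported to caller */
    for (int i = 0; i < state->frame_count; i++)
        func(i, user_data);
    return true;
}

/* Example callback: counts the frames it is shown. */
static void count_frame(int frame_index, void *user_data)
{
    (void)frame_index;
    (*(int *)user_data)++;
}
```

A caller would then check the return value and decide whether to retry, skip the thread, or log, instead of taking the whole process down.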

@alexanderkyte alexanderkyte self-assigned this Aug 10, 2018

@alexanderkyte alexanderkyte commented Aug 10, 2018

@akoeplinger @lambdageek please include any information you know about this that I haven't mentioned.

@akoeplinger akoeplinger added this to Bugs Pool in Bugs Week via automation Aug 10, 2018
akoeplinger added a commit that referenced this issue Aug 10, 2018
See #10031
@alexanderkyte alexanderkyte moved this from Bugs Pool to In Progress in Bugs Week Aug 13, 2018

@akoeplinger akoeplinger commented Aug 14, 2018

It seems to occur on unhandled-exception-4.exe too (see Jenkins PR), albeit way more infrequently than with -2.exe. Disabled it as well.

akoeplinger added a commit that referenced this issue Aug 14, 2018

@alexanderkyte alexanderkyte commented Aug 16, 2018

It sounds like @lambdageek is debugging this one as well. We agreed that it seems to be a race between mono_thread_interruption_checkpoint_void and getting the info block to pass to mono_thread_suspend_all_other_threads.


@alexanderkyte alexanderkyte commented Aug 16, 2018

frame #9: 0x0000000103ccd441 mono`mono_log_write_logfile(log_domain=0x0000000000000000, level=G_LOG_LEVEL_ERROR, hdr=0, message="* Assertion at mini-exceptions.c:1112, condition `state->valid' not met\n") at mono-log-common.c:135
   frame #10: 0x0000000103cc33bc mono`structured_log_adapter(log_domain=0x0000000000000000, log_level=G_LOG_LEVEL_ERROR, message="* Assertion at mini-exceptions.c:1112, condition `state->valid' not met\n", user_data=0x0000000000000000) at mono-logger.c:466
   frame #11: 0x0000000103cf74fa mono`monoeg_g_logstr(log_domain=0x0000000000000000, log_level=G_LOG_LEVEL_ERROR, msg="* Assertion at mini-exceptions.c:1112, condition `state->valid' not met\n") at goutput.c:117
 * frame #12: 0x0000000103cf6ff1 mono`monoeg_g_logv_nofree(log_domain=0x0000000000000000, log_level=G_LOG_LEVEL_ERROR, format="* Assertion at %s:%d, condition `%s' not met\n", args=0x00007ffeec44c4c0) at goutput.c:128
   frame #13: 0x0000000103cf7324 mono`monoeg_assertion_message(format="* Assertion at %s:%d, condition `%s' not met\n") at goutput.c:163
   frame #14: 0x00000001038d09d4 mono`mono_walk_stack_with_state(func=(mono`last_managed at threads.c:5259), state=0x00007fb57603c208, unwind_options=MONO_UNWIND_NONE, user_data=0x00007ffeec44c6f8) at mini-exceptions.c:1112
   frame #15: 0x0000000103b86df0 mono`mono_thread_info_get_last_managed(info=0x00007fb57603c000) at threads.c:5279
   frame #16: 0x0000000103b87014 mono`async_suspend_critical(info=0x00007fb57603c000, ud=0x00007ffeec44c808) at threads.c:5385
   frame #17: 0x0000000103ce44f2 mono`mono_thread_info_safe_suspend_and_run(id=0x0000700005ffb000, interrupt_kernel=1, callback=(mono`async_suspend_critical at threads.c:5378), user_data=0x00007ffeec44c808) at mono-threads.c:1205
   frame #18: 0x0000000103b811c6 mono`async_suspend_internal(thread=0x000000010459c650, interrupt=1) at threads.c:5423
   frame #19: 0x0000000103b80f68 mono`mono_thread_suspend_all_other_threads at threads.c:3720
   frame #20: 0x0000000103adc91b mono`ves_icall_System_Environment_Exit(result=255) at icall.c:6836

is racing with

thread #4, name = 'tid_1b03'
    frame #0: 0x00007fff74039cee libsystem_kernel.dylib`__psynch_cvwait + 10
    frame #1: 0x00007fff74176662 libsystem_pthread.dylib`_pthread_cond_wait + 732
    frame #2: 0x0000000103cc232d mono`mono_os_cond_wait(cond=0x0000700005bf4768, mutex=0x0000000103dde4e8) at mono-os-mutex.h:173
    frame #3: 0x0000000103cc2065 mono`mono_os_event_wait_multiple(events=0x0000700005bf47c8, nevents=1, waitall=1, timeout=4294967295, alertable=1) at os-event-unix.c:190
    frame #4: 0x0000000103cc1c63 mono`mono_os_event_wait_one(event=0x00007fb576a00b70, timeout=4294967295, alertable=1) at os-event-unix.c:94
    frame #5: 0x0000000103b862b6 mono`self_suspend_internal at threads.c:5452
    frame #6: 0x0000000103b7be40 mono`mono_thread_execute_interruption(pexc=0x0000700005bf4c00) at threads.c:4843
    frame #7: 0x0000000103b86127 mono`mono_thread_execute_interruption_ptr at threads.c:4876
    frame #8: 0x0000000103b829f5 mono`mono_thread_interruption_checkpoint_request(bypass_abort_protection=0) at threads.c:4983
    frame #9: 0x0000000103b8296b mono`mono_thread_interruption_checkpoint at threads.c:4994
    frame #10: 0x0000000103b82a3d mono`mono_thread_interruption_checkpoint_void at threads.c:5006
    frame #11: 0x0000000103a5e231 mono`monitor_thread(unused=0x0000000000000000) at threadpool-worker-default.c:712
    frame #12: 0x0000000103b8540f mono`start_wrapper_internal(start_info=0x0000000000000000, stack_ptr=0x0000700005bf5000) at threads.c:1167
    frame #13: 0x0000000103b8500c mono`start_wrapper(data=0x00007fb576a00b90) at threads.c:1227
    frame #14: 0x00007fff741756c1 libsystem_pthread.dylib`_pthread_body + 340
    frame #15: 0x00007fff7417556d libsystem_pthread.dylib`_pthread_start + 377
    frame #16: 0x00007fff74174c5d libsystem_pthread.dylib`thread_start + 13
lambdageek added a commit to lambdageek/mono that referenced this issue Aug 16, 2018
So the issue is that calling begin_suspend_for_blocking_thread under hybrid
suspend will perform a preemptive suspension of the victim thread.

The preemptive suspend doesn't always succeed.
In particular, thread_state_init_from_sigctx or thread_state_init_from_handle
could return FALSE and info->suspend_can_continue will be set to FALSE.
When the thread_state_init_* functions return FALSE, they also set the
MonoThreadUnwindState:valid to FALSE.

So it is important to check the value of MonoThreadInfo:suspend_can_continue
before proceeding with a suspension.

As a result, in suspend_sync in the ReqSuspendInitSuspendRunning case we call
check_async_suspend, which checks (for preemptively suspended threads) whether
suspend_can_continue is true.  If it isn't, we resume the victim thread and
loop once more in suspend_sync_nolock and try again.

We didn't have a check_async_suspend call in the ReqSuspendInitSuspendBlocking
case, so a thread could be considered suspended even if we couldn't capture a
valid MonoThreadUnwindState for it.

That trips the assertion reached through the call to
mono_thread_info_get_last_managed in async_suspend_critical, which needs a
valid MonoThreadUnwindState.

Fixes mono#10031
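
The check-and-retry described in this commit message can be modeled roughly as follows. This is a simplified sketch, not the code from the commit: `ThreadModel`, `preemptive_suspend`, `resume`, `suspend_with_check`, and the `failures_left` counter are hypothetical, and only `suspend_can_continue` and the unwind state's `valid` flag correspond to names mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a victim thread. */
typedef struct {
    bool unwind_state_valid;     /* models MonoThreadUnwindState:valid */
    bool suspend_can_continue;   /* models MonoThreadInfo:suspend_can_continue */
    bool suspended;
    int  failures_left;          /* illustrative: captures that fail first */
} ThreadModel;

/* Stand-in for a preemptive suspend that may fail to capture state. */
static void preemptive_suspend(ThreadModel *t)
{
    bool captured = (t->failures_left == 0);
    if (!captured)
        t->failures_left--;
    t->suspended = true;
    t->unwind_state_valid = captured;
    t->suspend_can_continue = captured;
}

static void resume(ThreadModel *t)
{
    t->suspended = false;
}

/* Retries until the victim is suspended with a usable unwind state.
 * Returns the number of attempts, or -1 once max_attempts is exhausted. */
static int suspend_with_check(ThreadModel *t, int max_attempts)
{
    assert(t != NULL);
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        preemptive_suspend(t);
        if (t->suspend_can_continue)
            return attempt;      /* safe to walk the victim's stack */
        resume(t);               /* let it run, then try again */
    }
    return -1;
}
```

The bug described above corresponds to skipping the `suspend_can_continue` test on one path: the victim is then treated as suspended while `unwind_state_valid` is still false.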

@lambdageek lambdageek commented Aug 16, 2018

@alexanderkyte Thanks for helping to track this down! Take a look at #10161 for an attempted fix.

monojenkins added a commit to monojenkins/mono that referenced this issue Aug 16, 2018
@alexanderkyte alexanderkyte moved this from In Progress to Done in Bugs Week Aug 17, 2018
lambdageek added a commit that referenced this issue Aug 21, 2018
…locking case (#10162)

* [coop] Check async suspend status in ReqSuspendInitSuspendBlocking case

Fixes #10031

* fixup - remove coop assertion in check_async_suspend

The result for coop could be either BeginSuspendOkCooperative or
BeginSuspendOkNoWait, depending on whether the thread was running (Ok) or
blocking (OkNoWait).

* [coop] suspend_can_continue if async raced with self-suspend (osx + win32)

Companion to 006d6ce from earlier this year
which applied this fix for POSIX signal based suspend.  This is the
corresponding OSX and Win32 fix.

In hybrid suspend, in the case where a suspend initiator needs to preemptively
suspend a thread, but the thread self-suspended after the suspend initiator
put it into the blocking_suspend_requested state, we want to allow the suspend
to continue.  We do have to call the syscall to undo the native async suspend,
but after that, we can leave the victim to continue waiting for a resume.

This should not have any effect on full cooperative or full preemptive suspend
- in the full coop case we never call async suspend, in the full preemptive
case the thread never self-suspends in blocking.
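
A rough model of the race handling described in this last bullet, under stated assumptions: `Victim`, `VictimState`, and `finish_async_suspend` are hypothetical names, and the "undo" step is represented by clearing a flag rather than by the real OS call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
    VICTIM_BLOCKING_SUSPEND_REQUESTED,  /* initiator asked; victim still blocking */
    VICTIM_SELF_SUSPENDED               /* victim saw the request and parked itself */
} VictimState;

typedef struct {
    VictimState state;
    bool natively_suspended;    /* OS-level suspend currently in effect */
    bool suspend_can_continue;
} Victim;

/* Called by the initiator right after the native suspend call returns. */
static void finish_async_suspend(Victim *v, bool captured_context)
{
    assert(v != NULL && v->natively_suspended);
    if (v->state == VICTIM_SELF_SUSPENDED) {
        /* The victim won the race: undo the native suspend and let it
         * keep waiting for its resume on its own. */
        v->natively_suspended = false;
        v->suspend_can_continue = true;
        return;
    }
    /* Otherwise the suspend is only usable if a context was captured. */
    v->suspend_can_continue = captured_context;
}
```

In the self-suspended case the outcome does not depend on `captured_context`, because the victim is already parked at a known point.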
lambdageek added a commit that referenced this issue Aug 21, 2018
…se (#10161)
@marek-safar marek-safar moved this from Done to Archived in Bugs Week Sep 24, 2018
jonpryor added a commit to xamarin/xamarin-android that referenced this issue Dec 6, 2018
Bumps to mono/api-snapshot@b99fc87.
Bumps to mono/bockbuild@5af573e.
Bumps to mono/boringssl@41221b4.
Bumps to mono/corefx@23d0b58.
Bumps to mono/corert@af496fc.
Bumps to mono/linker@7af03ce.
Bumps to mono/NUnitLite@00e259a.
Bumps to mono/reference-assemblies@9325826.
Bumps to mono/roslyn-binaries@249709f.
Bumps to mono/xunit-binaries@bb58347.

	$ git diff --shortstat b63e5378..23f2024a      # mono 
	 1630 files changed, 50926 insertions(+), 92212 deletions(-)

Fixes: mono/mono#6352
Fixes: mono/mono#6947
Fixes: mono/mono#6992
Fixes: mono/mono#7615
Fixes: mono/mono#8340
Fixes: mono/mono#8407
Fixes: mono/mono#8575
Fixes: mono/mono#8627
Fixes: mono/mono#8707
Fixes: mono/mono#8766
Fixes: mono/mono#8848
Fixes: mono/mono#8866
Fixes: mono/mono#8935
Fixes: mono/mono#9010
Fixes: mono/mono#9023
Fixes: mono/mono#9031
Fixes: mono/mono#9033
Fixes: mono/mono#9106
Fixes: mono/mono#9109
Fixes: mono/mono#9155
Fixes: mono/mono#9179
Fixes: mono/mono#9232
Fixes: mono/mono#9234
Fixes: mono/mono#9262
Fixes: mono/mono#9277
Fixes: mono/mono#9292
Fixes: mono/mono#9318
Fixes: mono/mono#9318
Fixes: mono/mono#9332
Fixes: mono/mono#9407
Fixes: mono/mono#9421
Fixes: mono/mono#9505
Fixes: mono/mono#9542
Fixes: mono/mono#9581
Fixes: mono/mono#9623
Fixes: mono/mono#9684
Fixes: mono/mono#9750
Fixes: mono/mono#9753
Fixes: mono/mono#9772
Fixes: mono/mono#9839
Fixes: mono/mono#9869
Fixes: mono/mono#9921
Fixes: mono/mono#9943
Fixes: mono/mono#9947
Fixes: mono/mono#9973
Fixes: mono/mono#9996
Fixes: mono/mono#10000
Fixes: mono/mono#10031
Fixes: mono/mono#10035
Fixes: mono/mono#10227
Fixes: mono/mono#10243
Fixes: mono/mono#10303
Fixes: mono/mono#10448
Fixes: mono/mono#10483
Fixes: mono/mono#10488
Fixes: mono/mono#10863
Fixes: mono/mono#11123
Fixes: mono/mono#11138
Fixes? mono/mono#11146
Fixes: mono/mono#11202
Fixes: mono/mono#11378
Fixes: mono/mono#11479
Fixes: mono/mono#11613
Fixes: #1951
Fixes: xamarin/xamarin-macios#4347
Fixes: xamarin/xamarin-macios#4617
Fixes: xamarin/xamarin-macios#4984