intermittent segfault / abort / stuck processes when setting schedulers_online system flag starting in OTP 20 #4809
Comments
@rickard-green: any suggestions as to next steps?
@zerth Sorry I haven't had the time to dig deeper into this yet. I'm hoping to be able to have a look at this next week.
@rickard-green: thanks, it looks like #4980 resolves the issue! Applying it to OTP-23.3.4.4 and OTP-24.0.2, the above repro example runs cleanly in the standard and debug emulators, and the behavior above no longer occurs. Without the patch, one of the following still occurs:
Will look into enabling more extensive local testing.
…aint * rickard/schedulers-online-fix/GH-4809/OTP-17500: Fix erlang:system_flag(schedulers_online, _)
…aint-24 * rickard/schedulers-online-fix/GH-4809/OTP-17500: Fix erlang:system_flag(schedulers_online, _)
#4980 has been released in the OTP 24.0.3 patch now.
…aint-23 * rickard/schedulers-online-fix/GH-4809/OTP-17500: Fix erlang:system_flag(schedulers_online, _)
…aint-22 * rickard/schedulers-online-fix/GH-4809/OTP-17500: Fix erlang:system_flag(schedulers_online, _)
Describe the bug
When adjusting the number of schedulers online via erlang:system_flag/2 on a Linux x86_64 (Broadwell/Skylake) build, a segfault in erl_process.c:add2runq sometimes occurs:
In a debug emulator, an abort in erl_process.c:erts_set_schedulers_online occurs instead:
To Reproduce
This will intermittently abort (debug emulator) or segfault (standard emulator) on a Linux x86_64 Broadwell or Skylake host (can't reproduce on an OS X Intel host) under various OTP versions since 20:
For cases not resulting in a segfault on the standard emulator, it seems the Q processes stop getting scheduled (haven't checked whether they become deadlocked).
Expected behavior
The number of online schedulers is adjusted randomly for a while followed by a graceful emulator shutdown.
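For readers unfamiliar with the flag, the expected behavior can be illustrated with a minimal sketch. This is not the reporter's reproduction case, which is not preserved in this text; the module name sched_toggle and the function run/1 are hypothetical, and the sketch spawns none of the Q worker processes mentioned above. It only shows the documented calls involved: erlang:system_info(schedulers) for the upper bound and erlang:system_flag(schedulers_online, N) to change the value.

```erlang
-module(sched_toggle).
-export([run/1]).

%% Hypothetical sketch: set schedulers_online to a random valid value
%% Iterations times, then restore the value that was in effect at the start.
run(Iterations) when is_integer(Iterations), Iterations >= 0 ->
    %% schedulers_online must stay within 1..erlang:system_info(schedulers).
    Max = erlang:system_info(schedulers),
    Original = erlang:system_info(schedulers_online),
    ok = loop(Iterations, Max),
    _ = erlang:system_flag(schedulers_online, Original),
    ok.

loop(0, _Max) ->
    ok;
loop(N, Max) ->
    %% rand:uniform(Max) returns an integer in 1..Max.
    Target = rand:uniform(Max),
    %% system_flag/2 returns the previous setting, which is ignored here.
    _Previous = erlang:system_flag(schedulers_online, Target),
    loop(N - 1, Max).
```

After compiling the module, calling sched_toggle:run(1000) followed by init:stop() from an Erlang shell should, on an unaffected emulator, return ok and then shut down gracefully.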
Affected versions
I couldn't get git bisect run working properly and so attempted a manual bisect based on tags. I was able to reproduce starting with OTP-20.0-rc1, but not with OTP-19.3.6.9. Under OTP-24.0-rc3, the above only aborts in the debug emulator and doesn't segfault in the standard emulator, but in the standard emulator the Q processes appear to become stuck (number of schedulers online stops changing).