Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Debug ERTS] random Segmentation fault on startup: beam/erl_time_sup.c:219:update_last_mtime() Assertion failed: mtime >= esdp->last_monotonic_time #5554

Closed
yossarin opened this issue Dec 28, 2021 · 4 comments · Fixed by #5825
Assignees
Labels
bug Issue is reported as a bug in progress team:VM Assigned to OTP team VM

Comments

@yossarin
Copy link

yossarin commented Dec 28, 2021

Describe the bug

I have built debug ERTS on MacOS, and sometimes I get this random Segmentation fault on startup of the app when I load a certain NIF lib (erlkaf).

To Reproduce
I work with a 3rd part lib, erlkaf. I think this will be hard to reproduce on other machines.

What I am doing is simply running Elixirs mix test inside LLDB to catch the Segmentation fault.

$ cerl -debug -lldb -pa /Users/y/.asdf/installs/elixir/1.12.0-otp-24/bin/../lib/eex/ebin /Users/y/.asdf/installs/elixir/1.12.0-otp-24/bin/../lib/elixir/ebin /Users/y/.asdf/installs/elixir/1.12.0-otp-24/bin/../lib/ex_unit/ebin /Users/y/.asdf/installs/elixir/1.12.0-otp-24/bin/../lib/iex/ebin /Users/y/.asdf/installs/elixir/1.12.0-otp-24/bin/../lib/logger/ebin /Users/y/.asdf/installs/elixir/1.12.0-otp-24/bin/../lib/mix/ebin -elixir ansi_enabled true -noshell -s elixir start_cli -extra /Users/y/.asdf/installs/elixir/1.12.0-otp-24/bin/mix test
(lldb) target create "beam.debug.smp"
Current executable set to 'beam.debug.smp' (x86_64).
(lldb) settings set -- target.run-args  "--" "-root" "/Users/y/.asdf/plugins/erlang/kerl-home/builds/asdf_24.0.1/otp_src_24.0.1" "-progname" "/Users/y/.asdf/plugins/erlang/kerl-home/builds/asdf_24.0.1/otp_src_24.0.1/bin/cerl" "-debug" "--" "-home" "/Users/y" "--" "-pa" "/Users/y/.asdf/installs/elixir/1.12.0-otp-24/bin/../lib/eex/ebin" "/Users/y/.asdf/installs/elixir/1.12.0-otp-24/bin/../lib/elixir/ebin" "/Users/y/.asdf/installs/elixir/1.12.0-otp-24/bin/../lib/ex_unit/ebin" "/Users/y/.asdf/installs/elixir/1.12.0-otp-24/bin/../lib/iex/ebin" "/Users/y/.asdf/installs/elixir/1.12.0-otp-24/bin/../lib/logger/ebin" "/Users/y/.asdf/installs/elixir/1.12.0-otp-24/bin/../lib/mix/ebin" "-elixir" "ansi_enabled" "true" "-noshell" "-s" "elixir" "start_cli" "-extra" "/Users/y/.asdf/installs/elixir/1.12.0-otp-24/bin/mix" "test"
(lldb) command source -s 0 '/tmp/.cerllldb.14946'
Executing commands in '/tmp/.cerllldb.14946'.
(lldb) env TERM=dumb
(lldb) command script import /Users/y/.asdf/plugins/erlang/kerl-home/builds/asdf_24.0.1/otp_src_24.0.1/erts/etc/unix/etp.py
(lldb) run
Process 15102 launched: '/Users/y/.asdf/plugins/erlang/kerl-home/builds/asdf_24.0.1/otp_src_24.0.1/bin/x86_64-apple-darwin20.2.0/beam.debug.smp' (x86_64)
librdkafka fork already exist. delete deps/librdkafka for a fresh checkout ...
concurrentqueue fork already exist. delete deps/concurrentqueue for a fresh checkout ...
make[1]: `/Users/y/sportening/superbet_erlkaf/c_src/../priv/erlkaf_nif.so' is up to date.
===> Analyzing applications...
===> Compiling erlkaf
Loading library: "/Users/y/sportening/we-api-user-account/_build/test/lib/erlkaf/priv/erlkaf_nif" 
beam/erl_time_sup.c:219:update_last_mtime() Assertion failed: mtime >= esdp->last_monotonic_time
Process 15102 stopped
* thread #5, name = '1_scheduler', stop reason = signal SIGABRT
    frame #0: 0x00007fff20305462 libsystem_kernel.dylib`__pthread_kill + 10
libsystem_kernel.dylib`__pthread_kill:
->  0x7fff20305462 <+10>: jae    0x7fff2030546c            ; <+20>
    0x7fff20305464 <+12>: movq   %rax, %rdi
    0x7fff20305467 <+15>: jmp    0x7fff202ff6a1            ; cerror_nocancel
    0x7fff2030546c <+20>: retq   
Target 0: (beam.debug.smp) stopped.
(lldb) bt
* thread #5, name = '1_scheduler', stop reason = signal SIGABRT
  * frame #0: 0x00007fff20305462 libsystem_kernel.dylib`__pthread_kill + 10
    frame #1: 0x00007fff20333610 libsystem_pthread.dylib`pthread_kill + 263
    frame #2: 0x00007fff20286720 libsystem_c.dylib`abort + 120
    frame #3: 0x00000001003fe86d beam.debug.smp`erl_assert_error(expr="mtime >= esdp->last_monotonic_time", func="update_last_mtime", file="beam/erl_time_sup.c", line=219) at sys.c:956:5
    frame #4: 0x0000000100291228 beam.debug.smp`update_last_mtime(esdp=0x000000014201e480, mtime=39004477000) at erl_time_sup.c:219:2
    frame #5: 0x00000001002922f3 beam.debug.smp`erts_get_monotonic_time(esdp=0x000000014201e480) at erl_time_sup.c:1806:5
    frame #6: 0x00000001000188f9 beam.debug.smp`scheduler_wait(fcalls=0x0000700006f3ccb0, esdp=0x000000014201e480, rq=0x000000014201bc40) at erl_process.c:3432:32
    frame #7: 0x00000001000109ec beam.debug.smp`erts_schedule(esdp=0x000000014201e480, p=0x00000001426e0f88, calls=3982) at erl_process.c:9568:6
    frame #8: 0x0000000100037bae beam.debug.smp`erts_debug_schedule(esdp=0x0000000000000000, c_p=0x00000001426e0f88, calls=3982) at beam_asm.cpp:323:11
(lldb) c
Process 15102 resuming
Process 15102 exited with status = 6 (0x00000006) Terminated due to signal 6
(lldb) ^D

Expected behavior
No random Segmentation faults on ERTS startup.

Affected versions
$ cerl
Erlang/OTP 24 [erts-12.0.1] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1] [jit]

Additional context
$ uname -v
Darwin Kernel Version 20.2.0: Wed Dec 2 20:39:59 PST 2020; root:xnu-7195.60.75~1/RELEASE_X86_64

@yossarin yossarin added the bug Issue is reported as a bug label Dec 28, 2021
@yossarin
Copy link
Author

yossarin commented Dec 28, 2021

Seems like related to this: asdf-vm/asdf-erlang#116 (comment)

As @garazdawi said:

Thanks @benoitc, so they have changed something with time... seems like the monotonic time primitive from the kernel does not return monotonic time values... this is not going to be an easy thing to fix...

@garazdawi
Copy link
Contributor

Can you print the values of mtime and esdp->last_monotonic_time?

Does the same thing happen if you do not start within lldb?

@garazdawi garazdawi added the team:VM Assigned to OTP team VM label Dec 29, 2021
@rickard-green rickard-green self-assigned this Jan 3, 2022
@rickard-green
Copy link
Contributor

This should be fixed by #5825 and will be released in a patch soon.

@rickard-green rickard-green linked a pull request Mar 26, 2022 that will close this issue
@rickard-green
Copy link
Contributor

The fix in #5825 has now been released in the OTP 24.3.3, OTP 23.3.4.13, and OTP 22.3.4.25 patch packages.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Issue is reported as a bug in progress team:VM Assigned to OTP team VM
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants