Segmentation fault while writing crash dump #5981

Closed
max-au opened this issue May 10, 2022 · 1 comment
Labels: bug, in progress, team:VM

max-au commented May 10, 2022

Describe the bug
BEAM dumps core at this line: https://github.com/erlang/otp/blob/master/erts/emulator/beam/erl_process.h#L2897

This line assumes that esdp->current_process is not NULL, which is not always the case.

The trivial fix (checking that it is not NULL) would sit on a hot path. Given that the only known way to reproduce this is while writing a crash dump, I'd be interested in a more efficient fix.
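
To make the "trivial fix" concrete, here is a minimal sketch of a NULL-guarded seed computation. It is not the ERTS source: the types `sched_data_t` and `process_t`, the `reductions` field, and the functions `mix32` and `sched_random_guarded` are hypothetical stand-ins, and the mixing function is a generic 32-bit finalizer chosen only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the scheduler structures. */
typedef struct { uint32_t reductions; } process_t;
typedef struct {
    uint32_t no;                 /* scheduler number */
    process_t *current_process;  /* may be NULL, e.g. during a crash dump */
} sched_data_t;

/* Generic 32-bit mixing step (assumption: any reasonable hash would do). */
static uint32_t mix32(uint32_t x)
{
    x ^= x >> 16;
    x *= 0x85ebca6bu;
    x ^= x >> 13;
    x *= 0xc2b2ae35u;
    x ^= x >> 16;
    return x;
}

/* Guarded variant: only read per-process state when a process exists.
 * The extra branch is the hot-path cost the report wants to avoid. */
static uint32_t sched_random_guarded(const sched_data_t *esdp,
                                     uint32_t additional_seed)
{
    uint32_t seed = additional_seed ^ esdp->no;
    if (esdp->current_process != NULL)
        seed ^= esdp->current_process->reductions;
    return mix32(seed);
}
```

With the guard in place, a thread without a current process still gets a value instead of dereferencing NULL, at the price of one branch per call.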

To reproduce

It happens when BEAM is writing a crash dump and dumping an ordered_set CATree table.

#0  0x00000000007878d0 in erts_sched_local_random (additional_seed=140670306335632) at beam/erl_process.h:2899
#1  do_random_join_with_low_probability (seed=140670306335632, tb=0x7ff05ba3b858) at beam/erl_db_catree.c:757
#2  runlock_base_node (tb=0x7ff05ba3b858, base_node=<optimized out>) at beam/erl_db_catree.c:841
#3  unlock_iter_base_node (iter=0x7ff098b8d390) at beam/erl_db_catree.c:928
#4  catree_find_nextprev_root (iter=iter@entry=0x7ff098b8d390, search_keyp=search_keyp@entry=0x0, forward=1) at beam/erl_db_catree.c:1755
#5  0x000000000078834e in catree_find_next_root (keyp=0x0, iter=0x7ff098b8d390) at beam/erl_db_catree.c:2205
#6  db_print_catree (to=0x832660 <erts_write_fp>, to_arg=0x7ff0901ab600, show=0, tbl=0x7ff05ba3b858) at beam/erl_db_catree.c:2205
#7  0x00000000006276d6 in print_table (to=to@entry=0x832660 <erts_write_fp>, to_arg=to_arg@entry=0x7ff0901ab600, show=show@entry=0, tb=tb@entry=0x7ff05ba3b858) at beam/erl_db.c:5291
#8  0x00000000006396e7 in db_info_print (vpdbip=<synthetic pointer>, tb=0x7ff05ba3b858) at beam/erl_db.c:5319
#9  erts_db_foreach_table (func=<optimized out>, alive_only=1, arg=<synthetic pointer>) at beam/erl_db.c:5356
#10 db_info (to=to@entry=0x832660 <erts_write_fp>, to_arg=to_arg@entry=0x7ff0901ab600, show=show@entry=0) at beam/erl_db.c:5330
#11 0x000000000067d36a in erl_crash_dump_v (file=file@entry=0x0, line=line@entry=0, fmt=fmt@entry=0x8b2dde "erl_child_setup closed\n", args=args@entry=0x7ff098b8e6c0) at beam/break.c:1026
#12 0x00000000005372e1 in erts_exit_vv (n=-3, flush_async=flush_async@entry=0, fmt=fmt@entry=0x8b2dde "erl_child_setup closed\n", args1=args1@entry=0x7ff098b8e6c0, args2=args2@entry=0x7ff098b8e6d8) at beam/erl_init.c:2594
#13 0x0000000000537410 in erts_exit (n=n@entry=-3, fmt=fmt@entry=0x8b2dde "erl_child_setup closed\n") at beam/erl_init.c:2621
#14 0x00000000007a00d3 in forker_ready_input (e=<optimized out>, fd=<optimized out>) at sys/unix/sys_drivers.c:1768
#15 0x00000000005e8875 in erts_port_task_execute (runq=runq@entry=0x7ff09b03db40, curr_port_pp=curr_port_pp@entry=0x7ff09b03eee0) at beam/erl_port_task.c:1764
#16 0x0000000000467bc4 in erts_schedule (esdp=<optimized out>, p=<optimized out>, calls=<optimized out>) at beam/erl_process.c:9788
@max-au max-au added the bug Issue is reported as a bug label May 10, 2022
@IngelaAndin IngelaAndin added the team:VM Assigned to OTP team VM label May 11, 2022
sverker added a commit to sverker/otp that referenced this issue Jun 13, 2022
Seen to cause problem (erlangGH-5981) when crash dump is iterating over
ETS ordered_set with write_concurrency (catree) and the
thread doing it does not have a 'current_process'.

Change erts_sched_local_random to instead use a single mutated rand_state.
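
The idea in the commit message, keeping one random state per scheduler and mutating it on every call, can be sketched as follows. This is an illustration under assumptions, not the merged patch: `sched_rand_t`, `sched_rand_init`, and `sched_rand_next` are hypothetical names, and xorshift32 is swapped in as a stand-in generator because the actual update step is not shown in this thread.

```c
#include <stdint.h>

/* Hypothetical per-scheduler data holding a single mutable random state. */
typedef struct {
    uint32_t no;          /* scheduler number */
    uint32_t rand_state;  /* must stay nonzero for xorshift32 */
} sched_rand_t;

/* Seed the state from the scheduler number; OR with 1 keeps it nonzero. */
static void sched_rand_init(sched_rand_t *esdp)
{
    esdp->rand_state = (esdp->no * 2654435761u) | 1u;
}

/* Advance the state with one xorshift32 step and fold in the caller's seed.
 * No per-process data is read, so a missing current process is harmless. */
static uint32_t sched_rand_next(sched_rand_t *esdp, uint32_t additional_seed)
{
    uint32_t x = esdp->rand_state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    esdp->rand_state = x;
    return x + additional_seed;
}
```

Because the state lives in the scheduler data itself, this shape avoids both the NULL dereference and the extra branch on the hot path.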
sverker added a commit to sverker/otp that referenced this issue Jun 20, 2022
sverker commented Jun 20, 2022

Fix #6080 merged to maint for OTP 25.1.

@sverker sverker closed this as completed Sep 26, 2022