segfault when crash dump #8179

Closed
qzhuyan opened this issue Feb 22, 2024 · 5 comments · Fixed by #8181
Labels: bug (Issue is reported as a bug), team:VM (Assigned to OTP team VM)

Comments

qzhuyan commented Feb 22, 2024

Describe the bug
The emulator segfaults while writing a crash dump.

To Reproduce

Runtime terminating during boot ({{no_namespace,<<"config">>,#{fields=>[#{default=>#{hocon=><<"true">>,oneliner=>true},name=><<"enable">>,type=>#{name=><<"boolean()">>,kind=>primitive},desc=><<"Enable or disable this bridge">>,aliases=>[],raw_default=>true},#{name=><<"tags">>,type=>#{kind=>array,elements=>#{name=><<"binary()">>,kind=>primitive}},desc=><<"Tags to annotate this config entry.">>,aliases=>[],importance=>low},#{default=>#{hocon=><<"\"\"">>,oneliner=>true},name=><<"description">>,type=>#{name=><<"string()">>,kind=>primitive},desc=><<"Descriptive text.">>,aliases=>[],importance=>low,raw_default=><<>>},#{default=>#{hocon=><<"{}">>,oneliner=>true},name=><<"resource_opts">>,type=>#{name=><<"bridge_mqtt:creation_opts">>,kind=>struct},desc=><<"Resource options.">>,aliases=>[],raw_default=>#{}},#{name=><<"mode">>,type=>#{symbols=>[<<"cluster_shareload">>],kind=>enum},desc=><<"Deprecated since v5.1.0 & e5.1.0.">>,aliases=>[]},#{name=><<"server">>,type=>#{name=><<"string()">>,kind=>primitive},desc=><<"The 

Crash dump is being written to: erl_crash.dump.../home/ubuntu/repo/emqx/build: line 125: 724326 Segmentation fault      (core dumped) erl -enable-feature maybe_expr -noshell -eval "ok = emqx_conf:dump_schema('$docdir', $SCHEMA_MODULE),          halt(0)."
make: *** [Makefile:175: emqx] Error 139

I don't think you will be able to run this command yourself: it is invoked by an EMQX build script with a dirty context.

But it looks related to a race where halt(0) is called while something is being formatted.

Expected behavior
The crash dump is written successfully, without a segfault.

Affected versions
The OTP versions that are affected by this bug.

Erlang/OTP 26 [erts-14.2.1] [source] [64-bit] [smp:6:6] [ds:6:6:10] [async-threads:1] [jit] [lttng]

Eshell V14.2.1 (press Ctrl+G to abort, type help(). for help)
[erl_crash_3.dump.gz](https://github.com/erlang/otp/files/14377431/erl_crash_3.dump.gz)

Additional context

(gdb) bt
#0  I_lg (xl=4398007891115, x=0xffff740db5c8) at beam/big.c:1636
#1  big_integer_estimate (x=x@entry=281472628798914, base=base@entry=10)
    at beam/big.c:1915
#2  0x0000aaaae0ec0fac in print_term (dcount=0xffff6e366658, obj=<optimized out>, 
    arg=0xffff5c214d80, fn=0xaaaae104e704 <erts_write_fp>)
    at beam/erl_printf_term.c:481
#3  erts_printf_term (fn=0xaaaae104e704 <erts_write_fp>, arg=0xffff5c214d80, 
    term=<optimized out>, precision=<optimized out>) at beam/erl_printf_term.c:793
#4  0x0000aaaae104c1c0 in erts_printf_format (
    fn=fn@entry=0xaaaae104e704 <erts_write_fp>, arg=arg@entry=0xffff5c214d80, 
    fmt=fmt@entry=0xaaaae10bd8d8 "B%T\n", ap=...) at common/erl_printf_format.c:817
#5  0x0000aaaae104f454 in erts_vcbprintf (cb_fn=0xaaaae104e704 <erts_write_fp>, 
    cb_arg=cb_arg@entry=0xffff5c214d80, format=format@entry=0xaaaae10bd8d8 "B%T\n", 
    arglist=...) at common/erl_printf.c:485
#6  0x0000aaaae0e99ad0 in erts_print (to=to@entry=0xaaaae104e704 <erts_write_fp>, 
    arg=arg@entry=0xffff5c214d80, format=format@entry=0xaaaae10bd8d8 "B%T\n")
    at beam/utils.c:441
#7  0x0000aaaae0f6dc24 in dump_module_literals (
    to=to@entry=0xaaaae104e704 <erts_write_fp>, to_arg=to_arg@entry=0xffff5c214d80, 
    lit_area=<optimized out>) at beam/erl_process_dump.c:933
#8  0x0000aaaae0f6f368 in dump_literals (to_arg=0xffff5c214d80, 
    to=0xaaaae104e704 <erts_write_fp>) at beam/erl_process_dump.c:872
#9  erts_deep_process_dump (to=to@entry=0xaaaae104e704 <erts_write_fp>, 
    to_arg=to_arg@entry=0xffff5c214d80) at beam/erl_process_dump.c:97
#10 0x0000aaaae0f4e900 in erl_crash_dump_v (file=file@entry=0x0, line=line@entry=0, 
    fmt=fmt@entry=0xaaaae10bc4b0 "%s\n", args=...) at beam/break.c:1043
#11 0x0000aaaae0e5eb9c in erts_exit_vv (n=n@entry=-3, flush=flush@entry=0, 
    fmt=fmt@entry=0xaaaae10bc4b0 "%s\n", args1=..., args2=...)
    at beam/erl_init.c:2673
#12 0x0000aaaae0e5ece0 in erts_exit (n=n@entry=-3, 
    fmt=fmt@entry=0xaaaae10bc4b0 "%s\n") at beam/erl_init.c:2700
#13 0x0000aaaae0eaa4ec in halt_2 (A__p=0xaaaae49881c0, BIF__ARGS=0xffff6e367f80, 
    A__I=<optimized out>) at beam/bif.c:4334
#14 0x0000ffff704f48cc in ?? ()
#15 0x0000ffff7055bb0c in ?? ()

attachments:

crashdump:
erl_crash_3.dump.gz

bt full: https://gist.github.com/qzhuyan/d540e9afeddafd971a768f9a11647477
minimal environment to reproduce: #8179 (comment)

qzhuyan added the bug label on Feb 22, 2024
garazdawi commented

Can you reproduce it easily? Can you print what you get if you do bt full in gdb?
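
For reference, the `bt full` output requested above can be collected from a core file without an interactive session. This is a minimal sketch, not the maintainers' procedure: the helper name `collect_bt` and both paths are hypothetical, and the output is only useful if the emulator binary carries debug symbols.

```shell
#!/bin/sh
# Hypothetical helper: print a full backtrace from a core file.
# Arguments (both are assumptions about the local setup):
#   $1  path to the emulator binary that crashed (for example .../erts-14.2.1/bin/beam.smp)
#   $2  path to the core file written when the emulator segfaulted
collect_bt() {
    beam="$1"
    core="$2"
    # -batch makes gdb exit after running its commands; -ex runs a single gdb command.
    gdb -batch -ex "bt full" "$beam" "$core"
}

# Example invocation (placeholder paths):
# collect_bt /usr/local/lib/erlang/erts-14.2.1/bin/beam.smp ./core > bt_full.txt
```

Redirecting the output to a file makes it easy to attach to the issue or paste into a gist.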

qzhuyan commented Feb 23, 2024

@garazdawi

For the bt full output, see https://gist.github.com/qzhuyan/d540e9afeddafd971a768f9a11647477

I managed to reproduce it twice in 10 tries. I am trying to build a minimal reproducible environment.

garazdawi self-assigned this on Feb 23, 2024

qzhuyan commented Feb 23, 2024

gh_issu_otp_8179.tar.gz

@garazdawi I managed to reproduce the issue with the files in the archive above, using this command:

erl -pa emqx_new/lib/*/ebin/  -enable-feature maybe_expr -noshell -eval 'ok = emqx_conf:dump_schema("docgen/", emqx_conf_schema)'

garazdawi commented

Thanks for the reproducer. Using a debug emulator, I get this assertion failure:

beam/erl_process_dump.c:1038:dump_module_literals() Assertion failed: ((ErlFunThing*)(htop))->num_free == 0
Aborted

I was able to reproduce it with 26.2, but not with 27.0-rc1. I think it is this commit that fixes the issue in 27.

Maybe @jhogberg has an idea for a fix for 26?

garazdawi commented

Fix available in #8181

IngelaAndin added the team:VM label on Feb 23, 2024
qzhuyan pushed a commit to emqx/otp that referenced this issue Mar 6, 2024