FreeBSD test segfaults masquerading as unhandled task errors #41943

Closed
ararslan opened this issue Aug 20, 2021 · 1 comment · Fixed by JuliaPackaging/Yggdrasil#3856 or #42970
Labels
kind:bug Indicates an unexpected problem or unintended behavior system:freebsd Affects only FreeBSD

ararslan commented Aug 20, 2021

The FreeBSD buildbots have been hitting UNHANDLED TASK ERROR: EOFError: read end of file while running the test suite on FreeBSD 12, and I can reproduce it locally. What's actually happening is a segfault, but you wouldn't know that unless you check /var/log/messages or notice the coredump file left behind.
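For anyone who wants to confirm that an EOFError in the test output is really a crashed process, here is a minimal sketch that scans syslog lines for signal exits. The message shape it matches is an assumption based on what the FreeBSD kernel typically logs, and the function name is hypothetical, so adjust the pattern to whatever your /var/log/messages actually contains.

```python
import re

# Assumed shape of the kernel message logged when a process dies on a signal, e.g.
#   "... kernel: pid 1234 (julia), jid 0, uid 1001: exited on signal 11 (core dumped)"
# The exact wording may differ between FreeBSD releases.
SIGNAL_EXIT = re.compile(
    r"pid (?P<pid>\d+) \((?P<name>[^)]+)\).*?exited on signal (?P<signal>\d+)"
    r"(?P<core> \(core dumped\))?"
)


def find_signal_exits(lines, process_name="julia"):
    """Return (pid, signal, core_dumped) for each matching log line.

    `process_name` is matched as a substring so that a debug binary
    such as "julia-debug" is picked up as well.
    """
    hits = []
    for line in lines:
        match = SIGNAL_EXIT.search(line)
        if match and process_name in match.group("name"):
            hits.append(
                (
                    int(match.group("pid")),
                    int(match.group("signal")),
                    match.group("core") is not None,
                )
            )
    return hits
```

To use it, pass an open file handle for /var/log/messages (reading it may require elevated privileges) and look for signal 11 entries whose timestamps line up with the failed test run.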

Running the tests with a debug build and examining the backtrace of the coredump with GDB, I get:

#0  0x00000008006e44cf in _ULx86_64_dwarf_search_unwind_table () from /usr/home/julia/Desktop/julia/usr/lib/libunwind.so.8
#1  0x00000008006d80f5 in _ULx86_64_Iextract_dynamic_proc_info () from /usr/home/julia/Desktop/julia/usr/lib/libunwind.so.8
#2  0x00000008006d821c in local_find_proc_info () from /usr/home/julia/Desktop/julia/usr/lib/libunwind.so.8
#3  0x00000008006d8167 in _ULx86_64_Ifind_dynamic_proc_info () from /usr/home/julia/Desktop/julia/usr/lib/libunwind.so.8
#4  0x00000008006e11f4 in fetch_proc_info () from /usr/home/julia/Desktop/julia/usr/lib/libunwind.so.8
#5  0x00000008006e092c in find_reg_state () from /usr/home/julia/Desktop/julia/usr/lib/libunwind.so.8
#6  0x00000008006e07ef in _ULx86_64_dwarf_step () from /usr/home/julia/Desktop/julia/usr/lib/libunwind.so.8
#7  0x00000008006da949 in _ULx86_64_step () from /usr/home/julia/Desktop/julia/usr/lib/libunwind.so.8
#8  0x000000080154cd34 in jl_unw_step (cursor=0x8598a52e0, from_signal_handler=0, ip=0x8598a5258, sp=0x8598a5250) at stackwalk.c:545
#9  0x000000080154afde in jl_unw_stepn (cursor=0x8598a52e0, bt_data=0x80616f940, bt_size=0x8598a52d8, sp=0x0, maxsize=80000, skip=0, ppgcstack=0x8598a56d8,
    from_signal_handler=0) at stackwalk.c:99
#10 0x000000080154b2dd in rec_backtrace (bt_data=0x80616f940, maxsize=80000, skip=2) at stackwalk.c:214
#11 0x000000080150cb9c in record_backtrace (ptls=0x801beb980, skip=1) at task.c:309
#12 0x000000080150cb0c in jl_throw (e=0x806cb2b40) at task.c:605
#13 0x0000000879a96d37 in chkfullrank () at /usr/home/julia/Desktop/julia/usr/share/julia/stdlib/v1.8/LinearAlgebra/src/cholesky.jl:698
#14 #cholesky!#147 () at /usr/home/julia/Desktop/julia/usr/share/julia/stdlib/v1.8/LinearAlgebra/src/cholesky.jl:308
#15 cholesky!##kw () at /usr/home/julia/Desktop/julia/usr/share/julia/stdlib/v1.8/LinearAlgebra/src/cholesky.jl:306
#16 julia_#cholesky!#149_12913 (tol=0, check=1 '\001', A=<error reading variable: Cannot access memory at address 0x0>)
    at /usr/home/julia/Desktop/julia/usr/share/julia/stdlib/v1.8/LinearAlgebra/src/cholesky.jl:334
#17 0x0000000879a96f27 in cholesky!##kw () at /usr/home/julia/Desktop/julia/usr/share/julia/stdlib/v1.8/LinearAlgebra/src/cholesky.jl:327
#18 julia_#cholesky#152_12910 (tol=0, check=1 '\001', A=<error reading variable: Cannot access memory at address 0x0>)
    at /usr/home/julia/Desktop/julia/usr/share/julia/stdlib/v1.8/LinearAlgebra/src/cholesky.jl:459
#19 0x0000000879a97134 in julia_cholesky_12907 (A=<error reading variable: Cannot access memory at address 0x877dd0070>)
    at /usr/home/julia/Desktop/julia/usr/share/julia/stdlib/v1.8/LinearAlgebra/src/cholesky.jl:459
#20 0x0000000879a97215 in jfptr_cholesky_12908 ()
#21 0x00000008014e52dd in _jl_invoke (F=0x80e5a1dd0 <jl_system_image_data+43801616>, args=0x8598a5f98, nargs=2, mfunc=0x873092900, world=31263) at gf.c:2245
#22 0x00000008014e5383 in jl_apply_generic (F=0x80e5a1dd0 <jl_system_image_data+43801616>, args=0x8598a5f98, nargs=2) at gf.c:2427
#23 0x0000000801507040 in jl_apply (args=0x8598a5f90, nargs=3) at ./julia.h:1771
#24 0x0000000801506d01 in do_call (args=0x85c891a78, nargs=3, s=0x8598a6b30) at interpreter.c:125
#25 0x000000080150552b in eval_value (e=0x8090995b0, s=0x8598a6b30) at interpreter.c:214
#26 0x000000080150665d in eval_stmt_value (stmt=0x8090995b0, s=0x8598a6b30) at interpreter.c:165
#27 0x00000008015046d4 in eval_body (stmts=0x877f29c40, s=0x8598a6b30, ip=252, toplevel=1) at interpreter.c:579
#28 0x0000000801504190 in eval_body (stmts=0x877f29c40, s=0x8598a6b30, ip=249, toplevel=1) at interpreter.c:512
#29 0x0000000801504190 in eval_body (stmts=0x877f29c40, s=0x8598a6b30, ip=62, toplevel=1) at interpreter.c:512
#30 0x0000000801504190 in eval_body (stmts=0x877f29c40, s=0x8598a6b30, ip=10, toplevel=1) at interpreter.c:512
#31 0x0000000801504e34 in jl_interpret_toplevel_thunk (m=0x807eb9600, src=0x809bb3210) at interpreter.c:727
#32 0x000000080152d9ff in jl_toplevel_eval_flex (m=0x807eb9600, e=0x809085c70, fast=1, expanded=1) at toplevel.c:885
#33 0x000000080152e449 in jl_eval_module_expr (parent_module=0x807ebba90, ex=0x807b4e0d0) at toplevel.c:196
#34 0x000000080152c8d3 in jl_toplevel_eval_flex (m=0x807ebba90, e=0x807b4e0d0, fast=1, expanded=0) at toplevel.c:673
#35 0x000000080152d52e in jl_toplevel_eval_flex (m=0x807ebba90, e=0x807c7ded0, fast=1, expanded=0) at toplevel.c:830
#36 0x000000080152f054 in jl_toplevel_eval (m=0x807ebba90, v=0x807c7ded0) at toplevel.c:894
#37 0x000000080152f2ef in jl_toplevel_eval_in (m=0x807ebba90, ex=0x807c7ded0) at toplevel.c:944
#38 0x000000080b6ba028 in eval () at boot.jl:373
#39 japi1_include_string_39775 (mapexpr=..., mod=0x80639d390, code=0x59ac, filename=0x58) at loading.jl:1207
#40 0x00000008014dbe57 in jl_fptr_args (f=0x80ecbb1e0 <jl_system_image_data+51245088>, args=0x8598a8c20, nargs=4,
    m=0x80ecbb4f0 <jl_system_image_data+51245872>) at gf.c:2014
#41 0x00000008014e51f5 in _jl_invoke (F=0x80ecbb1e0 <jl_system_image_data+51245088>, args=0x8598a8c20, nargs=4,
    mfunc=0x80bf62d10 <jl_system_image_data+3697488>, world=31247) at gf.c:2226
#42 0x00000008014e5383 in jl_apply_generic (F=0x80ecbb1e0 <jl_system_image_data+51245088>, args=0x8598a8c20, nargs=4) at gf.c:2427
#43 0x000000080b88709f in japi1__include_32638 (mapexpr=0x80ca386b0 <jl_system_image_data+15058160>, mod=0x80639d390, _path=0x58) at loading.jl:1264
#44 0x000000081547ffa6 in include () at Base.jl:420
#45 macro expansion () at /usr/home/julia/Desktop/julia/test/testdefs.jl:24
#46 macro expansion () at /usr/home/julia/Desktop/julia/usr/share/julia/stdlib/v1.8/Test/src/Test.jl:1283
#47 macro expansion () at /usr/home/julia/Desktop/julia/test/testdefs.jl:23
#48 macro expansion () at timing.jl:368
#49 julia_#runtests#1_917 (seed=107125921925593505577364040855661097293, name=<error reading variable: Cannot access memory at address 0x0>,
    path=<error reading variable: Cannot access memory at address 0x0>, isolate=1 '\001') at /usr/home/julia/Desktop/julia/test/testdefs.jl:21
#50 0x0000000815480ab7 in runtests##kw () at /usr/home/julia/Desktop/julia/test/testdefs.jl:6
#51 julia_runtests##kw_914 (name=<error reading variable: Cannot access memory at address 0x80>,
    path=<error reading variable: Cannot access memory at address 0x877dd0070>) at /usr/home/julia/Desktop/julia/test/testdefs.jl:6
#52 0x0000000815480b1a in jfptr_runtests##kw_915 ()
#53 0x00000008014e52dd in _jl_invoke (F=0x80639a668, args=0x8598a9838, nargs=4, mfunc=0x806ffe2c0, world=31247) at gf.c:2245
#54 0x00000008014e5383 in jl_apply_generic (F=0x80639a668, args=0x8598a9838, nargs=4) at gf.c:2427
#55 0x00000008014f7030 in jl_apply (args=0x8598a9830, nargs=5) at ./julia.h:1771
#56 0x00000008014f6dce in do_apply (args=0x8598a9a68, nargs=3, iterate=0x80ec994c0 <jl_system_image_data+51106560>) at builtins.c:713
#57 0x00000008014f5fdf in jl_f__apply_iterate (F=0x0, args=0x8598a9a60, nargs=4) at builtins.c:721
#58 0x0000000815475586 in julia_#106_765 () at /usr/home/julia/Desktop/julia/usr/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:278
#59 0x0000000815475812 in julia_run_work_thunk_762 (thunk=..., print_error=0 '\000')
    at /usr/home/julia/Desktop/julia/usr/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:63
#60 0x000000081547599f in macro expansion () at /usr/home/julia/Desktop/julia/usr/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:278
#61 julia_#105_759 () at task.jl:411
#62 0x0000000815475c60 in jfptr_#105_760 ()
#63 0x00000008014e51f5 in _jl_invoke (F=0x806fdcd80, args=0x807cfef08, nargs=0, mfunc=0x8079d6b80, world=31247) at gf.c:2226
#64 0x00000008014e5383 in jl_apply_generic (F=0x806fdcd80, args=0x807cfef08, nargs=0) at gf.c:2427
#65 0x000000080150bdd0 in jl_apply (args=0x807cfef00, nargs=1) at ./julia.h:1771
#66 0x000000080150dc2f in start_task () at task.c:881

@vchuravy looked at this a bit and posited that it was a crash in the unwinder while reading process data, caused by a bug in libunwind and/or buggy DWARF emission, and that something may be wrong with asynchronous unwind tables. He recommended the following:

  1. Build Julia with -fno-asynchronous-unwind-tables. I tried that but the crash persisted.
  2. Swap nongnu libunwind for LLVM libunwind. Work in progress in #41955 (WIP: Use LLVM libunwind on FreeBSD).
  3. Upgrade LLVM libunwind from 11.0.1. Work in progress in JuliaPackaging/Yggdrasil#3504 (Add LLVM libunwind v12.0.1).

He also noted the following upstream bug reports, which may be a useful breadcrumb (for folks who understand these things 😅)

ararslan added the kind:bug and system:freebsd labels Aug 20, 2021