Julia crashed during a long-running job on a supercomputer. The code performs program synthesis: it generates many Julia programs and uses metaprogramming to evaluate them on examples.
The generated programs contain a relatively high number of expressions (see below).
The code is parallel, and when I ran it on the supercomputer it had 40 threads available.
After 2 hours of running, the code crashed with a long, unexpected Julia internal error. Previously, the same code had run fine for 3 h 30 min on the same supercomputer. I asked a friend to run the same code and he got a different error, along the lines of "illegal instruction at <address>".
I think the issue might be caused by creating a lot of functions at runtime and evaluating them.
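For readers unfamiliar with the pattern, here is a minimal sketch of what "creating functions at runtime and evaluating them" on several threads can look like. It is not the HerbSearch.jl code: `run_all` and the example expressions are hypothetical, and the only things taken from the stack trace below are the use of `invokelatest` inside a `Threads.@threads` loop and a call to `vcat` in the generated function.

```julia
# Hypothetical sketch, not the actual HerbSearch.jl implementation.
# Each expression is wrapped in an anonymous function that is compiled at runtime.
function run_all(exprs::Vector{Expr}, input)
    # Define one new function per generated program (sequentially).
    fs = [Core.eval(@__MODULE__, :(x -> $e)) for e in exprs]
    results = Vector{Any}(undef, length(fs))
    # The first call of each function triggers type inference and compilation,
    # so several threads may be compiling freshly generated code at the same time.
    Threads.@threads for i in eachindex(fs)
        # The functions were defined after run_all started, so they live in a
        # newer world age; invokelatest is needed to call them from here.
        results[i] = Base.invokelatest(fs[i], input)
    end
    return results
end

results = run_all([:(x + 1), :(x * 2), :(vcat(x, x))], 3)
println(results)
```

In a long-running job this loop body would run many thousands of times, each time with new functions, which is why the compiler shows up so prominently in the trace.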
versioninfo() gives this output:
Julia Version 1.8.2
Commit 36034abf26 (2022-09-29 15:21 UTC)
Platform Info:
OS: Linux (x86_64-redhat-linux)
CPU: 32 × Intel(R) Xeon(R) Gold 6226R CPU @ 2.90GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-13.0.1 (ORCJIT, cascadelake)
Threads: 1 on 32 virtual cores
Environment:
LD_LIBRARY_PATH = /apps/arch/2023r1/software/linux-rhel8-skylake_avx512/gcc-8.5.0/julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/lib:/apps/arch/2023r1/software/linux-rhel8-skylake_avx512/gcc-8.5.0/utf8proc-2.8.0-x4aisz75qo5zpts7gsvultt5atqfelpy/lib64:/apps/arch/2023r1/software/linux-rhel8-skylake_avx512/gcc-8.5.0/suite-sparse-5.13.0-2n3or2e7vdtishslwcbsoy4bzbb6m2pv/lib:/apps/arch/2023r1/software/linux-rhel8-skylake_avx512/gcc-8.5.0/pcre2-10.39-olknsbg7ndigpkwjvc6fe6jpccrl5gfg/lib:/apps/arch/2023r1/software/linux-rhel8-skylake_avx512/gcc-8.5.0/p7zip-16.02-vjnx4hx7dk5tqe4e64vrgtr262f2c5qo/lib:/apps/arch/2023r1/software/linux-rhel8-skylake_avx512/gcc-8.5.0/openlibm-0.8.1-fyhgnpse34kjaea7fmouiunzip3dixql/lib:/apps/arch/2023r1/software/linux-rhel8-skylake_avx512/gcc-8.5.0/openblas-0.3.21-vu34ahxt36eavzfax7wo2vuy4aqp67ar/lib:/apps/arch/2023r1/software/linux-rhel8-skylake_avx512/gcc-8.5.0/nghttp2-1.47.0-kliudnwhedtbsqeqwyv5utbwkdlhiuo6/lib:/apps/arch/2023r1/software/linux-rhel8-skylake_avx512/gcc-8.5.0/mpfr-4.1.0-p4g4m7mf6nszjry2ev5rglwxeur2beak/lib:/apps/arch/2023r1/software/linux-rhel8-skylake_avx512/gcc-8.5.0/mbedtls-2.28.0-vyqhwpiizy75jclqpxwtau46is3n7t4d/lib:/apps/arch/2023r1/software/linux-rhel8-skylake_avx512/gcc-8.5.0/llvm-13.0.1-luuedcob23h45ategu2xuvx4frzgch7u/lib:/apps/arch/2023r1/software/linux-rhel8-skylake_avx512/gcc-8.5.0/zlib-1.2.13-3rvw6sqjn7egwnc3eqd3gxt7o256tza3/lib:/apps/arch/2023r1/software/linux-rhel8-skylake_avx512/gcc-8.5.0/ncurses-6.3-mcxazirdzaumrhp5sqbmfv5mwrtcnvrw/lib:/apps/arch/2023r1/software/linux-rhel8-skylake_avx512/gcc-8.5.0/libedit-3.1-20210216-t2eycez3pw2gjcljwufzdceqf4kapenn/lib:/apps/arch/2023r1/software/linux-rhel8-skylake_avx512/gcc-8.5.0/hwloc-2.8.0-nxja7lwcnuwvv4wi7uwldyohbebrbdjo/lib:/apps/arch/2023r1/software/linux-rhel8-skylake_avx512/gcc-8.5.0/binutils-2.38-uosga7lb6uqsgymbtd3z5j7fpqa2eijv/lib:/apps/arch/2023r1/software/linux-rhel8-skylake_avx512/gcc-8.5.0/libuv-julia-1.44.2-6an4m3tg3o27dvv7lvsbgiubrydngdud/lib:/apps/arch/2023r1/
software/linux-rhel8-skylake_avx512/gcc-8.5.0/libunwind-1.6.2-a6k25wrx3ke5jrn6nccq7aqvu3yefjws/lib:/apps/arch/2023r1/software/linux-rhel8-skylake_avx512/gcc-8.5.0/libssh2-1.10.0-zj7tm75qbmsvsdjb4hdnd73sqabupn4m/lib64:/apps/arch/2023r1/software/linux-rhel8-skylake_avx512/gcc-8.5.0/libgit2-1.3.1-sb3tsdauyjgt57mgbi3um7kaxoaeudpr/lib64:/apps/arch/2023r1/software/linux-rhel8-skylake_avx512/gcc-8.5.0/libblastrampoline-5.2.0-fxmhgvychxlqv2s7pixujeumobisdyna/lib:/apps/arch/2023r1/software/linux-rhel8-skylake_avx512/gcc-8.5.0/gmp-6.2.1-r3qf755lclsogljrjm5ama6nlflodejr/lib:/apps/arch/2023r1/software/linux-rhel8-skylake_avx512/gcc-8.5.0/dsfmt-2.2.5-zqas5ww2t5cit4za6rs6vbucryp3i7df/lib:/apps/arch/2023r1/software/linux-rhel8-skylake_avx512/gcc-8.5.0/curl-7.78.0-kvmgnhao4wqc7qnju7vgb2ojjxvkhash/lib:/cm/shared/apps/slurm/current/lib64
JULIA_ROOT = /apps/arch/2023r1/software/linux-rhel8-skylake_avx512/gcc-8.5.0/julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor
Julia 1.8.2 was already installed on the supercomputer.
An MWE is quite difficult to produce, though. Try running this file from the HerbSearch.jl package: clone the repository, switch to the meta-search branch, and run the work_place.jl file using a few threads.
Error:
Internal error: encountered unexpected error in runtime:
MethodError(f=Core.Compiler.widenconst, args=(deepcopy_internal(Any, Base.IdDict{Any, Any}) from deepcopy_internal(Any, Base.IdDict{K, V} where V where K),), world=0x0000000000001342)
jl_method_error_bare at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/gf.c:1879
jl_method_error at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/gf.c:1897
jl_lookup_generic_ at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/gf.c:2530 [inlined]
ijl_apply_generic at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/gf.c:2545
sroa_pass! at ./compiler/ssair/passes.jl:766
run_passes at ./compiler/optimize.jl:542
optimize at ./compiler/optimize.jl:504 [inlined]
_typeinf at ./compiler/typeinfer.jl:257
typeinf at ./compiler/typeinfer.jl:213
typeinf_edge at ./compiler/typeinfer.jl:877
abstract_call_method at ./compiler/abstractinterpretation.jl:647
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:139
abstract_call_known at ./compiler/abstractinterpretation.jl:1716
abstract_call at ./compiler/abstractinterpretation.jl:1786
jfptr_abstract_call_14004 at /apps/arch/2023r1/software/linux-rhel8-skylake_avx512/gcc-8.5.0/julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/lib/julia/sys.so (unknown line)
_jl_invoke at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/gf.c:2367 [inlined]
ijl_apply_generic at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/gf.c:2549
return_type_tfunc at ./compiler/tfuncs.jl:2058
abstract_call_known at ./compiler/abstractinterpretation.jl:1665
abstract_call at ./compiler/abstractinterpretation.jl:1786
abstract_call at ./compiler/abstractinterpretation.jl:1753
abstract_eval_statement at ./compiler/abstractinterpretation.jl:1910
typeinf_local at ./compiler/abstractinterpretation.jl:2360
typeinf_nocycle at ./compiler/abstractinterpretation.jl:2482
_typeinf at ./compiler/typeinfer.jl:230
typeinf at ./compiler/typeinfer.jl:213
typeinf_edge at ./compiler/typeinfer.jl:877
abstract_call_method at ./compiler/abstractinterpretation.jl:647
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:139
abstract_call_known at ./compiler/abstractinterpretation.jl:1716
abstract_call at ./compiler/abstractinterpretation.jl:1786
abstract_call at ./compiler/abstractinterpretation.jl:1753
abstract_eval_statement at ./compiler/abstractinterpretation.jl:1910
typeinf_local at ./compiler/abstractinterpretation.jl:2386
typeinf_nocycle at ./compiler/abstractinterpretation.jl:2482
_typeinf at ./compiler/typeinfer.jl:230
typeinf at ./compiler/typeinfer.jl:213
typeinf_edge at ./compiler/typeinfer.jl:877
abstract_call_method at ./compiler/abstractinterpretation.jl:647
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:139
abstract_call_known at ./compiler/abstractinterpretation.jl:1716
abstract_call at ./compiler/abstractinterpretation.jl:1786
abstract_call at ./compiler/abstractinterpretation.jl:1753
abstract_eval_statement at ./compiler/abstractinterpretation.jl:1910
typeinf_local at ./compiler/abstractinterpretation.jl:2360
typeinf_nocycle at ./compiler/abstractinterpretation.jl:2482
_typeinf at ./compiler/typeinfer.jl:230
typeinf at ./compiler/typeinfer.jl:213
typeinf_edge at ./compiler/typeinfer.jl:877
abstract_call_method at ./compiler/abstractinterpretation.jl:647
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:139
abstract_call_known at ./compiler/abstractinterpretation.jl:1716
abstract_call at ./compiler/abstractinterpretation.jl:1786
abstract_call at ./compiler/abstractinterpretation.jl:1753
abstract_eval_statement at ./compiler/abstractinterpretation.jl:1910
typeinf_local at ./compiler/abstractinterpretation.jl:2360
typeinf_nocycle at ./compiler/abstractinterpretation.jl:2482
_typeinf at ./compiler/typeinfer.jl:230
typeinf at ./compiler/typeinfer.jl:213
typeinf_edge at ./compiler/typeinfer.jl:877
abstract_call_method at ./compiler/abstractinterpretation.jl:647
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:139
abstract_call_known at ./compiler/abstractinterpretation.jl:1716
abstract_call at ./compiler/abstractinterpretation.jl:1786
abstract_call at ./compiler/abstractinterpretation.jl:1753
abstract_eval_statement at ./compiler/abstractinterpretation.jl:1910
typeinf_local at ./compiler/abstractinterpretation.jl:2360
typeinf_nocycle at ./compiler/abstractinterpretation.jl:2482
_typeinf at ./compiler/typeinfer.jl:230
typeinf at ./compiler/typeinfer.jl:213
typeinf_edge at ./compiler/typeinfer.jl:877
abstract_call_method at ./compiler/abstractinterpretation.jl:647
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:139
abstract_call_known at ./compiler/abstractinterpretation.jl:1716
abstract_call at ./compiler/abstractinterpretation.jl:1786
abstract_apply at ./compiler/abstractinterpretation.jl:1357
abstract_call_known at ./compiler/abstractinterpretation.jl:1620
abstract_call at ./compiler/abstractinterpretation.jl:1786
abstract_call at ./compiler/abstractinterpretation.jl:1753
abstract_eval_statement at ./compiler/abstractinterpretation.jl:1910
typeinf_local at ./compiler/abstractinterpretation.jl:2386
typeinf_nocycle at ./compiler/abstractinterpretation.jl:2482
_typeinf at ./compiler/typeinfer.jl:230
typeinf at ./compiler/typeinfer.jl:213
typeinf_ext at ./compiler/typeinfer.jl:967
typeinf_ext_toplevel at ./compiler/typeinfer.jl:1000
typeinf_ext_toplevel at ./compiler/typeinfer.jl:996
jfptr_typeinf_ext_toplevel_12208 at /apps/arch/2023r1/software/linux-rhel8-skylake_avx512/gcc-8.5.0/julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/lib/julia/sys.so (unknown line)
_jl_invoke at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/gf.c:2367 [inlined]
ijl_apply_generic at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/gf.c:2549
jl_apply at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/julia.h:1839 [inlined]
jl_type_infer at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/gf.c:319
jl_generate_fptr_impl at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/jitlayers.cpp:319
jl_compile_method_internal at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/gf.c:2081 [inlined]
jl_compile_method_internal at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/gf.c:2025
_jl_invoke at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/gf.c:2359 [inlined]
ijl_apply_generic at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/gf.c:2549
jl_apply at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/julia.h:1839 [inlined]
do_apply at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/builtins.c:730
_cat_t at ./abstractarray.jl:1737
_jl_invoke at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/gf.c:2367 [inlined]
ijl_apply_generic at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/gf.c:2549
jl_apply at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/julia.h:1839 [inlined]
do_apply at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/builtins.c:730
_cat at ./abstractarray.jl:1728
_jl_invoke at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/gf.c:2367 [inlined]
ijl_apply_generic at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/gf.c:2549
jl_apply at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/julia.h:1839 [inlined]
do_apply at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/builtins.c:730
#cat#155 at ./abstractarray.jl:1916
jl_apply at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/julia.h:1839 [inlined]
do_apply at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/builtins.c:730
cat##kw at ./abstractarray.jl:1916
jl_apply at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/julia.h:1839 [inlined]
do_apply at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/builtins.c:730
vcat at ./abstractarray.jl:1815
_jl_invoke at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/gf.c:2367 [inlined]
ijl_apply_generic at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/gf.c:2549
#39441 at /scratch/nfilat/HerbSearch.jl/src/meta_search/meta_grammar_definition.jl:6
_jl_invoke at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/gf.c:2367 [inlined]
ijl_apply_generic at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/gf.c:2549
jl_apply at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/julia.h:1839 [inlined]
jl_f__call_latest at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/builtins.c:774
#invokelatest#2 at ./essentials.jl:729 [inlined]
invokelatest at ./essentials.jl:726 [inlined]
evaluate_meta_program at /scratch/nfilat/HerbSearch.jl/src/meta_search/meta_grammar_definition.jl:44 [inlined]
macro expansion at ./timing.jl:463 [inlined]
macro expansion at /scratch/nfilat/HerbSearch.jl/src/meta_search/meta_runner.jl:85 [inlined]
#236#threadsfor_fun#110 at ./threadingconstructs.jl:84
#236#threadsfor_fun at ./threadingconstructs.jl:51 [inlined]
#1 at ./threadingconstructs.jl:30
jl_apply at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/julia.h:1839 [inlined]
start_task at /tmp/spack-stage/feverdij/spack-stage-julia-1.8.2-nr7lebgwpvj42wydlkrslxjhewgahaor/spack-src/src/task.c:931
The following error also appears somewhere in the stack trace:
double free or corruption (out)
_int_free at /lib64/libc.so.6 (unknown line)
I am also uploading the whole output file I got from the supercomputer (it contains my own prints too): run.txt
In the file you can see that the GC is sometimes taking up to 30% of the time of running a program.
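As a rough illustration of how such a GC share can be measured, the sketch below uses `@timed`, which returns both the elapsed time and the time spent in garbage collection. The helper name `gc_fraction` and the workload are hypothetical; the trace only shows that a macro from timing.jl wraps the evaluation, so treating it as `@timed` is an assumption.

```julia
# Hypothetical helper: fraction of wall-clock time spent in GC for one call.
function gc_fraction(f)
    stats = @timed f()  # NamedTuple with fields value, time, bytes, gctime, gcstats
    return stats.time > 0 ? stats.gctime / stats.time : 0.0
end

frac = gc_fraction(() -> sum(rand(10^6)))
println("GC share of run time: ", round(100 * frac; digits = 1), "%")
```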
An example of an expression taken from the file is this one:
nicolaefilat changed the title from "Julia crashes when running a long task on a super computer that uses meta prorgamming and threads" to "illegal instruction at <address> for a long running threaded program" on Mar 26, 2024
Hi @nsajko. I can't upgrade to Julia 1.10, because that would require upgrading the Julia version on the supercomputer, which is quite difficult, and I do not have permission to do so.
@nicolaefilat You can contact the support of your system and tell them that I'd like a newer version of Julia 🙂 or you can always download the official binary of julia from https://julialang.org/downloads/ in your userspace 😉