Segmentation fault during NVPTX emission due to debuginfo #316

Closed
leios opened this issue May 20, 2022 · 17 comments · Fixed by #323

@leios

leios commented May 20, 2022

I have been messing around with the new CUDA tests in Enzyme and realized that they all fail on and after CUDA#36488fb336b3f518857e933d2c44e94c3bafddc6: JuliaGPU/CUDA.jl@6c771c6 (Full error message below).

@vchuravy seems to think Enzyme might be outputting iffy debug info. Related: https://github.com/JuliaGPU/CUDA.jl/blob/04b98b1ebee0911be7902659f20e28446d8de080/src/compiler/gpucompiler.jl#L40

Note that the code runs fine when Julia is started with -g0.
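Since the crash depends on the debug level, it may help to record which level a session was started with when reporting. A minimal sketch, assuming the internal `Base.JLOptions()` struct keeps its `debug_level` field (it is not a documented stable API); the helper name `debug_level` is made up for illustration:

```julia
# Hypothetical helper: the debug info level this session was started with,
# i.e. the value passed via the -g flag. Relies on an internal struct.
debug_level() = Int(Base.JLOptions().debug_level)

if debug_level() == 0
    println("Started with -g0: no debug info is requested.")
else
    println("Debug level is ", debug_level(), "; the segfault may be reachable.")
end
```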

@maleadt

Error:

signal (11): Segmentation fault
in expression starting at /home/leios/projects/CESMIX/tests/enzyme_cuda_test.jl:18
_ZN4llvm10DwarfDebug23emitInitialLocDirectiveERKNS_15MachineFunctionEj at /home/leios/builds/julia-1.7.1/bin/../lib/julia/libLLVM-12jl.so (unknown line)
_ZN4llvm10AsmPrinter31emitInitialRawDwarfLocDirectiveERKNS_15MachineFunctionE at /home/leios/builds/julia-1.7.1/bin/../lib/julia/libLLVM-12jl.so (unknown line)
_ZN4llvm15NVPTXAsmPrinter22emitFunctionEntryLabelEv at /home/leios/builds/julia-1.7.1/bin/../lib/julia/libLLVM-12jl.so (unknown line)
_ZN4llvm10AsmPrinter18emitFunctionHeaderEv at /home/leios/builds/julia-1.7.1/bin/../lib/julia/libLLVM-12jl.so (unknown line)
_ZN4llvm10AsmPrinter16emitFunctionBodyEv at /home/leios/builds/julia-1.7.1/bin/../lib/julia/libLLVM-12jl.so (unknown line)
_ZN4llvm15NVPTXAsmPrinter20runOnMachineFunctionERNS_15MachineFunctionE at /home/leios/builds/julia-1.7.1/bin/../lib/julia/libLLVM-12jl.so (unknown line)
_ZN4llvm19MachineFunctionPass13runOnFunctionERNS_8FunctionE at /home/leios/builds/julia-1.7.1/bin/../lib/julia/libLLVM-12jl.so (unknown line)
_ZN4llvm13FPPassManager13runOnFunctionERNS_8FunctionE at /home/leios/builds/julia-1.7.1/bin/../lib/julia/libLLVM-12jl.so (unknown line)
_ZN4llvm13FPPassManager11runOnModuleERNS_6ModuleE at /home/leios/builds/julia-1.7.1/bin/../lib/julia/libLLVM-12jl.so (unknown line)
_ZN4llvm6legacy15PassManagerImpl3runERNS_6ModuleE at /home/leios/builds/julia-1.7.1/bin/../lib/julia/libLLVM-12jl.so (unknown line)
_ZL21LLVMTargetMachineEmitP23LLVMOpaqueTargetMachineP16LLVMOpaqueModuleRN4llvm17raw_pwrite_streamE19LLVMCodeGenFileTypePPc at /home/leios/builds/julia-1.7.1/bin/../lib/julia/libLLVM-12jl.so (unknown line)
LLVMTargetMachineEmitToMemoryBuffer at /home/leios/builds/julia-1.7.1/bin/../lib/julia/libLLVM-12jl.so (unknown line)
LLVMTargetMachineEmitToMemoryBuffer at /home/leios/.julia/packages/LLVM/szqwr/lib/12/libLLVM_h.jl:965 [inlined]
emit at /home/leios/.julia/packages/LLVM/szqwr/src/targetmachine.jl:45
mcgen at /home/leios/.julia/packages/GPUCompiler/XyxTy/src/mcgen.jl:74
unknown function (ip: 0x7f9d0c50e65f)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2247 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2429
macro expansion at /home/leios/.julia/packages/TimerOutputs/LDL7n/src/TimerOutput.jl:252 [inlined]
macro expansion at /home/leios/.julia/packages/GPUCompiler/XyxTy/src/driver.jl:421 [inlined]
macro expansion at /home/leios/.julia/packages/TimerOutputs/LDL7n/src/TimerOutput.jl:252 [inlined]
macro expansion at /home/leios/.julia/packages/GPUCompiler/XyxTy/src/driver.jl:418 [inlined]
#emit_asm#143 at /home/leios/.julia/packages/GPUCompiler/XyxTy/src/utils.jl:64
emit_asm##kw at /home/leios/.julia/packages/GPUCompiler/XyxTy/src/utils.jl:62
unknown function (ip: 0x7f9d0c504bf8)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2247 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2429
cufunction_compile at /home/leios/.julia/packages/CUDA/SrFuA/src/compiler/execution.jl:337
#260 at /home/leios/.julia/packages/CUDA/SrFuA/src/compiler/execution.jl:330 [inlined]
JuliaContext at /home/leios/.julia/packages/GPUCompiler/XyxTy/src/driver.jl:74
unknown function (ip: 0x7f9d0c52eeda)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2247 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2429
cufunction_compile at /home/leios/.julia/packages/CUDA/SrFuA/src/compiler/execution.jl:329
cached_compilation at /home/leios/.julia/packages/GPUCompiler/XyxTy/src/cache.jl:90
#cufunction#255 at /home/leios/.julia/packages/CUDA/SrFuA/src/compiler/execution.jl:301
cufunction at /home/leios/.julia/packages/CUDA/SrFuA/src/compiler/execution.jl:295
unknown function (ip: 0x7f9d0c52e8ed)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2247 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2429
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1788 [inlined]
do_call at /buildworker/worker/package_linux64/build/src/interpreter.c:126
eval_value at /buildworker/worker/package_linux64/build/src/interpreter.c:215
eval_body at /buildworker/worker/package_linux64/build/src/interpreter.c:461
eval_body at /buildworker/worker/package_linux64/build/src/interpreter.c:516
eval_body at /buildworker/worker/package_linux64/build/src/interpreter.c:516
jl_interpret_toplevel_thunk at /buildworker/worker/package_linux64/build/src/interpreter.c:731
jl_toplevel_eval_flex at /buildworker/worker/package_linux64/build/src/toplevel.c:885
jl_toplevel_eval_flex at /buildworker/worker/package_linux64/build/src/toplevel.c:830
jl_toplevel_eval_in at /buildworker/worker/package_linux64/build/src/toplevel.c:944
eval at ./boot.jl:373 [inlined]
include_string at ./loading.jl:1196
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2247 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2429
_include at ./loading.jl:1253
include at ./client.jl:451
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2247 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2429
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1788 [inlined]
do_call at /buildworker/worker/package_linux64/build/src/interpreter.c:126
eval_value at /buildworker/worker/package_linux64/build/src/interpreter.c:215
eval_stmt_value at /buildworker/worker/package_linux64/build/src/interpreter.c:166 [inlined]
eval_body at /buildworker/worker/package_linux64/build/src/interpreter.c:587
jl_interpret_toplevel_thunk at /buildworker/worker/package_linux64/build/src/interpreter.c:731
jl_toplevel_eval_flex at /buildworker/worker/package_linux64/build/src/toplevel.c:885
jl_toplevel_eval_flex at /buildworker/worker/package_linux64/build/src/toplevel.c:830
jl_toplevel_eval_in at /buildworker/worker/package_linux64/build/src/toplevel.c:944
eval at ./boot.jl:373 [inlined]
eval_user_input at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/REPL/src/REPL.jl:150
repl_backend_loop at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/REPL/src/REPL.jl:244
start_repl_backend at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/REPL/src/REPL.jl:229
#run_repl#47 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/REPL/src/REPL.jl:362
run_repl at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/REPL/src/REPL.jl:349
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2247 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2429
#930 at ./client.jl:394
jfptr_YY.930_40362.clone_1 at /home/leios/builds/julia-1.7.1/lib/julia/sys.so (unknown line)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2247 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2429
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1788 [inlined]
jl_f__call_latest at /buildworker/worker/package_linux64/build/src/builtins.c:757
#invokelatest#2 at ./essentials.jl:716 [inlined]
invokelatest at ./essentials.jl:714 [inlined]
run_main_repl at ./client.jl:379
exec_options at ./client.jl:309
_start at ./client.jl:495
jfptr__start_40531.clone_1 at /home/leios/builds/julia-1.7.1/lib/julia/sys.so (unknown line)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2247 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2429
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1788 [inlined]
true_main at /buildworker/worker/package_linux64/build/src/jlapi.c:559
jl_repl_entrypoint at /buildworker/worker/package_linux64/build/src/jlapi.c:701
main at julia (unknown line)
__libc_start_call_main at /usr/lib/libc.so.6 (unknown line)
__libc_start_main at /usr/lib/libc.so.6 (unknown line)
unknown function (ip: 0x400808)
Allocations: 65094343 (Pool: 65071237; Big: 23106); GC: 69
Segmentation fault (core dumped)
@vchuravy vchuravy added the cuda label May 20, 2022
@vchuravy vchuravy added this to the release-0.10 milestone May 20, 2022
@maleadt

maleadt commented May 20, 2022

The check is there because ptxas often doesn't like our (legal) debug info. The crash here is within our own LLVM, so some Julia code is probably messing up the debug info indeed. Try running with LLVM assertions enabled; it might reveal something.

@leios
Author

leios commented May 22, 2022

OK, I built Julia again with:

LLVM_ASSERTIONS=1
FORCE_ASSERTIONS=1

and got this segfault:

signal (11): Segmentation fault
in expression starting at /home/leios/projects/CESMIX/tests/enzyme_cuda_test.jl:18
__atomic_add at /usr/include/c++/11.2.0/ext/atomicity.h:71 [inlined]
__atomic_add_dispatch at /usr/include/c++/11.2.0/ext/atomicity.h:111 [inlined]
_M_add_ref_copy at /usr/include/c++/11.2.0/bits/shared_ptr_base.h:148 [inlined]
__shared_count at /usr/include/c++/11.2.0/bits/shared_ptr_base.h:712 [inlined]
__shared_ptr at /usr/include/c++/11.2.0/bits/shared_ptr_base.h:1152 [inlined]
shared_ptr at /usr/include/c++/11.2.0/bits/shared_ptr.h:150 [inlined]
ThreadSafeContext at /home/leios/builds/julia/usr/include/llvm/ExecutionEngine/Orc/ThreadSafeModule.h:29 [inlined]
getContext at /home/leios/builds/julia/usr/include/llvm/ExecutionEngine/Orc/ThreadSafeModule.h:153 [inlined]
jl_create_native_impl at /home/leios/builds/julia/src/aotcompile.cpp:268
#compile_method_instance#54 at /home/leios/.julia/packages/GPUCompiler/XyxTy/src/jlgen.jl:366
compile_method_instance##kw at /home/leios/.julia/packages/GPUCompiler/XyxTy/src/jlgen.jl:328 [inlined]
macro expansion at /home/leios/.julia/packages/TimerOutputs/LDL7n/src/TimerOutput.jl:252 [inlined]
#irgen#56 at /home/leios/.julia/packages/GPUCompiler/XyxTy/src/irgen.jl:5
irgen##kw at /home/leios/.julia/packages/GPUCompiler/XyxTy/src/irgen.jl:3 [inlined]
macro expansion at /home/leios/.julia/packages/GPUCompiler/XyxTy/src/driver.jl:208 [inlined]
macro expansion at /home/leios/.julia/packages/TimerOutputs/LDL7n/src/TimerOutput.jl:252 [inlined]
macro expansion at /home/leios/.julia/packages/GPUCompiler/XyxTy/src/driver.jl:207 [inlined]
#emit_llvm#116 at /home/leios/.julia/packages/GPUCompiler/XyxTy/src/utils.jl:64
unknown function (ip: 0x7f01d53ec62f)
_jl_invoke at /home/leios/builds/julia/src/gf.c:2373 [inlined]
ijl_invoke at /home/leios/builds/julia/src/gf.c:2399
unknown function (ip: 0x7f01d53aa4d9)
unknown function (ip: 0x7f01d53aa499)
emit_llvm##kw at /home/leios/.julia/packages/GPUCompiler/XyxTy/src/utils.jl:62 [inlined]
cufunction_compile at /home/leios/.julia/packages/CUDA/SrFuA/src/compiler/execution.jl:336
#260 at /home/leios/.julia/packages/CUDA/SrFuA/src/compiler/execution.jl:330 [inlined]
Context at /home/leios/.julia/packages/LLVM/szqwr/src/core/context.jl:22
JuliaContext at /home/leios/.julia/packages/GPUCompiler/XyxTy/src/driver.jl:72
unknown function (ip: 0x7f01d53ab25a)
_jl_invoke at /home/leios/builds/julia/src/gf.c:2373 [inlined]
ijl_apply_generic at /home/leios/builds/julia/src/gf.c:2574
cufunction_compile at /home/leios/.julia/packages/CUDA/SrFuA/src/compiler/execution.jl:329
cached_compilation at /home/leios/.julia/packages/GPUCompiler/XyxTy/src/cache.jl:90
#cufunction#255 at /home/leios/.julia/packages/CUDA/SrFuA/src/compiler/execution.jl:301
cufunction at /home/leios/.julia/packages/CUDA/SrFuA/src/compiler/execution.jl:294
unknown function (ip: 0x7f01d5389cbd)
_jl_invoke at /home/leios/builds/julia/src/gf.c:2373 [inlined]
ijl_apply_generic at /home/leios/builds/julia/src/gf.c:2574
jl_apply at /home/leios/builds/julia/src/julia.h:1841 [inlined]
do_call at /home/leios/builds/julia/src/interpreter.c:126
eval_value at /home/leios/builds/julia/src/interpreter.c:215
eval_body at /home/leios/builds/julia/src/interpreter.c:467
eval_body at /home/leios/builds/julia/src/interpreter.c:522
eval_body at /home/leios/builds/julia/src/interpreter.c:522
jl_interpret_toplevel_thunk at /home/leios/builds/julia/src/interpreter.c:750
jl_toplevel_eval_flex at /home/leios/builds/julia/src/toplevel.c:912
jl_toplevel_eval_flex at /home/leios/builds/julia/src/toplevel.c:856
ijl_toplevel_eval_in at /home/leios/builds/julia/src/toplevel.c:971
eval at ./boot.jl:370 [inlined]
include_string at ./loading.jl:1349
_jl_invoke at /home/leios/builds/julia/src/gf.c:2373 [inlined]
ijl_apply_generic at /home/leios/builds/julia/src/gf.c:2574
_include at ./loading.jl:1409
include at ./client.jl:472
unknown function (ip: 0x7f01d5300062)
_jl_invoke at /home/leios/builds/julia/src/gf.c:2373 [inlined]
ijl_apply_generic at /home/leios/builds/julia/src/gf.c:2574
jl_apply at /home/leios/builds/julia/src/julia.h:1841 [inlined]
do_call at /home/leios/builds/julia/src/interpreter.c:126
eval_value at /home/leios/builds/julia/src/interpreter.c:215
eval_stmt_value at /home/leios/builds/julia/src/interpreter.c:166 [inlined]
eval_body at /home/leios/builds/julia/src/interpreter.c:598
jl_interpret_toplevel_thunk at /home/leios/builds/julia/src/interpreter.c:750
jl_toplevel_eval_flex at /home/leios/builds/julia/src/toplevel.c:912
jl_toplevel_eval_flex at /home/leios/builds/julia/src/toplevel.c:856
ijl_toplevel_eval_in at /home/leios/builds/julia/src/toplevel.c:971
eval at ./boot.jl:370 [inlined]
eval_user_input at /home/leios/builds/julia/usr/share/julia/stdlib/v1.9/REPL/src/REPL.jl:152
repl_backend_loop at /home/leios/builds/julia/usr/share/julia/stdlib/v1.9/REPL/src/REPL.jl:247
start_repl_backend at /home/leios/builds/julia/usr/share/julia/stdlib/v1.9/REPL/src/REPL.jl:232
#run_repl#47 at /home/leios/builds/julia/usr/share/julia/stdlib/v1.9/REPL/src/REPL.jl:369
run_repl at /home/leios/builds/julia/usr/share/julia/stdlib/v1.9/REPL/src/REPL.jl:355
jfptr_run_repl_51305 at /home/leios/builds/julia/usr/lib/julia/sys.so (unknown line)
_jl_invoke at /home/leios/builds/julia/src/gf.c:2373 [inlined]
ijl_apply_generic at /home/leios/builds/julia/src/gf.c:2574
#963 at ./client.jl:415
jfptr_YY.963_37757 at /home/leios/builds/julia/usr/lib/julia/sys.so (unknown line)
_jl_invoke at /home/leios/builds/julia/src/gf.c:2373 [inlined]
ijl_apply_generic at /home/leios/builds/julia/src/gf.c:2574
jl_apply at /home/leios/builds/julia/src/julia.h:1841 [inlined]
jl_f__call_latest at /home/leios/builds/julia/src/builtins.c:774
run_main_repl at ./client.jl:400
exec_options at ./client.jl:314
_start at ./client.jl:516
jfptr__start_35163 at /home/leios/builds/julia/usr/lib/julia/sys.so (unknown line)
_jl_invoke at /home/leios/builds/julia/src/gf.c:2373 [inlined]
ijl_apply_generic at /home/leios/builds/julia/src/gf.c:2574
jl_apply at /home/leios/builds/julia/src/julia.h:1841 [inlined]
true_main at /home/leios/builds/julia/src/jlapi.c:566
jl_repl_entrypoint at /home/leios/builds/julia/src/jlapi.c:710
main at /home/leios/builds/julia/cli/loader_exe.c:59
__libc_start_call_main at /usr/lib/libc.so.6 (unknown line)
__libc_start_main at /usr/lib/libc.so.6 (unknown line)
_start at /home/leios/builds/julia/julia (unknown line)
Allocations: 50925879 (Pool: 50891399; Big: 34480); GC: 51
Segmentation fault (core dumped)

@leios
Author

leios commented May 22, 2022

Honestly a bit surprised to see __atomic_add here. I noticed the shared_ptr underneath and naively thought this might be due to an add on shmem, but here is the test I am running:

using CUDA
using Enzyme
using Test

function mul_kernel(A)
    i = threadIdx().x
    if i <= length(A)
        A[i] *= A[i]
    end
    return nothing
end

function grad_mul_kernel(A, dA)
    Enzyme.autodiff_deferred(mul_kernel, Const, Duplicated(A, dA))
    return nothing
end

@testset "mul_kernel" begin
    A = CUDA.ones(64,)
    @cuda threads=length(A) mul_kernel(A)
    dA = similar(A)
    dA .= 1
    @cuda threads=length(A) grad_mul_kernel(A, dA)

    println(A, '\n', dA)
    @test all(dA .== 2)
end

There is no shmem or add there.

@wsmoses
Member

wsmoses commented May 22, 2022

The atomic add is coming from the derivative. It appears that A[i] wasn't proven to be an entirely thread-local location, so it falls back to an atomic add.

Edit: never mind, that is coming from some other calling-context code.

@maleadt

maleadt commented May 23, 2022

ThreadSafeContext at /home/leios/builds/julia/usr/include/llvm/ExecutionEngine/Orc/ThreadSafeModule.h:29 [inlined]

Are you using Julia 1.9? That version is not supported. You need to use an LLVM assertions build based on Julia's release-1.8 branch for now.

@leios
Author

leios commented May 23, 2022

I was using Julia v1.7.1 for the initial segfault, but built the latest version of Julia for the LLVM assertions segfault above. Here is the error on release-1.8:

julia> include("enzyme_cuda_test.jl")
[ Info: Precompiling CUDA [052768ef-5323-5732-b1bb-66c8b64840ba]
WARNING: could not import Printf.ini_hex into BFloat16s
WARNING: could not import Printf.ini_HEX into BFloat16s
[ Info: Precompiling Enzyme [7da242da-08ed-463a-9acd-ee780be4f1d9]

signal (11): Segmentation fault
in expression starting at /home/leios/projects/CESMIX/tests/enzyme_cuda_test.jl:18
_ZN4llvm10DwarfDebug23emitInitialLocDirectiveERKNS_15MachineFunctionEj at /home/leios/builds/julia/usr/bin/../lib/libLLVM-13jl.so (unknown line)
_ZN4llvm10AsmPrinter31emitInitialRawDwarfLocDirectiveERKNS_15MachineFunctionE at /home/leios/builds/julia/usr/bin/../lib/libLLVM-13jl.so (unknown line)
_ZN4llvm15NVPTXAsmPrinter22emitFunctionEntryLabelEv at /home/leios/builds/julia/usr/bin/../lib/libLLVM-13jl.so (unknown line)
_ZN4llvm10AsmPrinter18emitFunctionHeaderEv at /home/leios/builds/julia/usr/bin/../lib/libLLVM-13jl.so (unknown line)
_ZN4llvm10AsmPrinter16emitFunctionBodyEv at /home/leios/builds/julia/usr/bin/../lib/libLLVM-13jl.so (unknown line)
_ZN4llvm15NVPTXAsmPrinter20runOnMachineFunctionERNS_15MachineFunctionE at /home/leios/builds/julia/usr/bin/../lib/libLLVM-13jl.so (unknown line)
_ZN4llvm19MachineFunctionPass13runOnFunctionERNS_8FunctionE at /home/leios/builds/julia/usr/bin/../lib/libLLVM-13jl.so (unknown line)
_ZN4llvm13FPPassManager13runOnFunctionERNS_8FunctionE at /home/leios/builds/julia/usr/bin/../lib/libLLVM-13jl.so (unknown line)
_ZN4llvm13FPPassManager11runOnModuleERNS_6ModuleE at /home/leios/builds/julia/usr/bin/../lib/libLLVM-13jl.so (unknown line)
_ZN4llvm6legacy15PassManagerImpl3runERNS_6ModuleE at /home/leios/builds/julia/usr/bin/../lib/libLLVM-13jl.so (unknown line)
_ZL21LLVMTargetMachineEmitP23LLVMOpaqueTargetMachineP16LLVMOpaqueModuleRN4llvm17raw_pwrite_streamE19LLVMCodeGenFileTypePPc at /home/leios/builds/julia/usr/bin/../lib/libLLVM-13jl.so (unknown line)
LLVMTargetMachineEmitToMemoryBuffer at /home/leios/builds/julia/usr/bin/../lib/libLLVM-13jl.so (unknown line)
LLVMTargetMachineEmitToMemoryBuffer at /home/leios/.julia/packages/LLVM/szqwr/lib/13/libLLVM_h.jl:947 [inlined]
emit at /home/leios/.julia/packages/LLVM/szqwr/src/targetmachine.jl:45
mcgen at /home/leios/.julia/packages/GPUCompiler/XyxTy/src/mcgen.jl:74
unknown function (ip: 0x7f1c9a8d462f)
_jl_invoke at /home/leios/builds/julia/src/gf.c:2339 [inlined]
ijl_apply_generic at /home/leios/builds/julia/src/gf.c:2540
macro expansion at /home/leios/.julia/packages/TimerOutputs/LDL7n/src/TimerOutput.jl:252 [inlined]
macro expansion at /home/leios/.julia/packages/GPUCompiler/XyxTy/src/driver.jl:421 [inlined]
macro expansion at /home/leios/.julia/packages/TimerOutputs/LDL7n/src/TimerOutput.jl:252 [inlined]
macro expansion at /home/leios/.julia/packages/GPUCompiler/XyxTy/src/driver.jl:418 [inlined]
#emit_asm#143 at /home/leios/.julia/packages/GPUCompiler/XyxTy/src/utils.jl:64
emit_asm##kw at /home/leios/.julia/packages/GPUCompiler/XyxTy/src/utils.jl:62 [inlined]
cufunction_compile at /home/leios/.julia/packages/CUDA/SrFuA/src/compiler/execution.jl:337
#260 at /home/leios/.julia/packages/CUDA/SrFuA/src/compiler/execution.jl:330 [inlined]
JuliaContext at /home/leios/.julia/packages/GPUCompiler/XyxTy/src/driver.jl:74
unknown function (ip: 0x7f1c9a8f23ba)
_jl_invoke at /home/leios/builds/julia/src/gf.c:2339 [inlined]
ijl_apply_generic at /home/leios/builds/julia/src/gf.c:2540
cufunction_compile at /home/leios/.julia/packages/CUDA/SrFuA/src/compiler/execution.jl:329
cached_compilation at /home/leios/.julia/packages/GPUCompiler/XyxTy/src/cache.jl:90
#cufunction#255 at /home/leios/.julia/packages/CUDA/SrFuA/src/compiler/execution.jl:301
cufunction at /home/leios/.julia/packages/CUDA/SrFuA/src/compiler/execution.jl:295
unknown function (ip: 0x7f1c9a8f1d0d)
_jl_invoke at /home/leios/builds/julia/src/gf.c:2339 [inlined]
ijl_apply_generic at /home/leios/builds/julia/src/gf.c:2540
jl_apply at /home/leios/builds/julia/src/julia.h:1831 [inlined]
do_call at /home/leios/builds/julia/src/interpreter.c:126
eval_value at /home/leios/builds/julia/src/interpreter.c:215
eval_body at /home/leios/builds/julia/src/interpreter.c:467
eval_body at /home/leios/builds/julia/src/interpreter.c:522
eval_body at /home/leios/builds/julia/src/interpreter.c:522
jl_interpret_toplevel_thunk at /home/leios/builds/julia/src/interpreter.c:750
jl_toplevel_eval_flex at /home/leios/builds/julia/src/toplevel.c:906
jl_toplevel_eval_flex at /home/leios/builds/julia/src/toplevel.c:850
ijl_toplevel_eval_in at /home/leios/builds/julia/src/toplevel.c:965
eval at ./boot.jl:368 [inlined]
include_string at ./loading.jl:1277
_jl_invoke at /home/leios/builds/julia/src/gf.c:2339 [inlined]
ijl_apply_generic at /home/leios/builds/julia/src/gf.c:2540
_include at ./loading.jl:1334
include at ./client.jl:476
unknown function (ip: 0x7f1d14160582)
_jl_invoke at /home/leios/builds/julia/src/gf.c:2339 [inlined]
ijl_apply_generic at /home/leios/builds/julia/src/gf.c:2540
jl_apply at /home/leios/builds/julia/src/julia.h:1831 [inlined]
do_call at /home/leios/builds/julia/src/interpreter.c:126
eval_value at /home/leios/builds/julia/src/interpreter.c:215
eval_stmt_value at /home/leios/builds/julia/src/interpreter.c:166 [inlined]
eval_body at /home/leios/builds/julia/src/interpreter.c:598
jl_interpret_toplevel_thunk at /home/leios/builds/julia/src/interpreter.c:750
jl_toplevel_eval_flex at /home/leios/builds/julia/src/toplevel.c:906
jl_toplevel_eval_flex at /home/leios/builds/julia/src/toplevel.c:850
ijl_toplevel_eval_in at /home/leios/builds/julia/src/toplevel.c:965
eval at ./boot.jl:368 [inlined]
eval_user_input at /home/leios/builds/julia/usr/share/julia/stdlib/v1.8/REPL/src/REPL.jl:151
repl_backend_loop at /home/leios/builds/julia/usr/share/julia/stdlib/v1.8/REPL/src/REPL.jl:247
start_repl_backend at /home/leios/builds/julia/usr/share/julia/stdlib/v1.8/REPL/src/REPL.jl:232
#run_repl#47 at /home/leios/builds/julia/usr/share/julia/stdlib/v1.8/REPL/src/REPL.jl:369
run_repl at /home/leios/builds/julia/usr/share/julia/stdlib/v1.8/REPL/src/REPL.jl:356
jfptr_run_repl_63419 at /home/leios/builds/julia/usr/lib/julia/sys.so (unknown line)
_jl_invoke at /home/leios/builds/julia/src/gf.c:2339 [inlined]
ijl_apply_generic at /home/leios/builds/julia/src/gf.c:2540
#964 at ./client.jl:419
jfptr_YY.964_46638 at /home/leios/builds/julia/usr/lib/julia/sys.so (unknown line)
_jl_invoke at /home/leios/builds/julia/src/gf.c:2339 [inlined]
ijl_apply_generic at /home/leios/builds/julia/src/gf.c:2540
jl_apply at /home/leios/builds/julia/src/julia.h:1831 [inlined]
jl_f__call_latest at /home/leios/builds/julia/src/builtins.c:769
#invokelatest#2 at ./essentials.jl:729 [inlined]
invokelatest at ./essentials.jl:727 [inlined]
run_main_repl at ./client.jl:404
exec_options at ./client.jl:318
_start at ./client.jl:522
jfptr__start_50700 at /home/leios/builds/julia/usr/lib/julia/sys.so (unknown line)
_jl_invoke at /home/leios/builds/julia/src/gf.c:2339 [inlined]
ijl_apply_generic at /home/leios/builds/julia/src/gf.c:2540
jl_apply at /home/leios/builds/julia/src/julia.h:1831 [inlined]
true_main at /home/leios/builds/julia/src/jlapi.c:567
jl_repl_entrypoint at /home/leios/builds/julia/src/jlapi.c:711
main at /home/leios/builds/julia/cli/loader_exe.c:59
__libc_start_call_main at /usr/lib/libc.so.6 (unknown line)
__libc_start_main at /usr/lib/libc.so.6 (unknown line)
_start at /home/leios/builds/julia/julia (unknown line)
Allocations: 93837590 (Pool: 93793947; Big: 43643); GC: 85
Segmentation fault (core dumped)

@leios
Author

leios commented May 23, 2022

This looks quite similar to the 1.7.1 segfault to me. Maybe I built Julia with LLVM assertions incorrectly? I created a Make.user with:

[leios@noema julia]$ cat Make.user 
LLVM_ASSERTIONS=1
FORCE_ASSERTIONS=1

and then ran make -j 12. Did I need to do something else to specifically use the Make.user file?

@maleadt

maleadt commented May 23, 2022

You can check with LLVM.jl if your LLVM build has assertions.

@leios
Author

leios commented May 23, 2022

Pasting here (so I remember how to check whether I built with LLVM assertions in the future):

julia> llvm_assertions = try
           cglobal((:_ZN4llvm24DisableABIBreakingChecksE, Base.libllvm_path()), Cvoid)
           false
       catch
           true
       end
true
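As far as I understand it, the check above works because LLVM only defines the `llvm::DisableABIBreakingChecks` symbol in builds without ABI-breaking checks, so failing to find it suggests an assertions build. The same idea can be sketched without try/catch using `Libdl`. The function name `llvm_assertions_enabled` is hypothetical, and the sketch assumes `Base.libllvm_path()` returns a path (or `nothing`) as it does in the Julia versions discussed here:

```julia
using Libdl

# Hypothetical helper: true if the LLVM that Julia links against appears to
# have assertions enabled, false if not, nothing if libLLVM cannot be located.
function llvm_assertions_enabled()
    path = Base.libllvm_path()
    path === nothing && return nothing
    handle = Libdl.dlopen(String(path))
    # With throw_error=false, dlsym returns nothing when the symbol is absent.
    sym = Libdl.dlsym(handle, :_ZN4llvm24DisableABIBreakingChecksE;
                      throw_error=false)
    return sym === nothing
end

println("LLVM assertions enabled: ", llvm_assertions_enabled())
```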

@maleadt

maleadt commented May 23, 2022

Can you dump the bitcode before it calls emit (and crashes)? Does that file then also crash when emitting using llc?

@leios
Author

leios commented May 23, 2022

Here is a quick reproducer (made with @device_code dir="/tmp/whatever" @cuda threads=length(A) grad_mul_kernel(A, dA)):

reproducer.txt

I think you can run it with llc reproducer.ll (after renaming reproducer.txt to reproducer.ll).
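For anyone who wants to script this from Julia, here is a small sketch that only builds the `llc` command without running it. The helper name `llc_command`, the `sm_35` target and the output file name are arbitrary assumptions, not something taken from this thread; `-mcpu` and `-o` are standard `llc` options.

```julia
# Hypothetical helper: build (but do not run) an llc invocation for a dumped
# module. The target triple is already recorded in the .ll file itself.
function llc_command(input::AbstractString;
                     mcpu::AbstractString="sm_35",
                     output::AbstractString="reproducer.ptx")
    return `llc -mcpu=$mcpu $input -o $output`
end

cmd = llc_command("reproducer.ll")
println(cmd)
# run(cmd)  # uncomment on a machine where a matching llc is on PATH
```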

@maleadt

maleadt commented May 23, 2022

Reduced:

; ModuleID = 'reduced.ll'
source_filename = "text"
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v16:16:16-v32:32:32-v64:64:64-v128:128:128-n16:32:64"
target triple = "nvptx64-nvidia-cuda"

define void @kernel() local_unnamed_addr {
top:
  %0 = load float, float addrspace(1)* undef, align 4
  %1 = fmul float %0, %0
  store float %1, float addrspace(1)* undef, align 4, !dbg !4
  ret void
}

!llvm.module.flags = !{!0, !1}
!llvm.dbg.cu = !{!2}

!0 = !{i32 2, !"Dwarf Version", i32 4}
!1 = !{i32 2, !"Debug Info Version", i32 3}
!2 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !3, isOptimized: false, runtimeVersion: 0, emissionKind: LineTablesOnly)
!3 = !DIFile(filename: "foo", directory: ".")
!4 = !DILocation(line: 0, scope: !5)
!5 = distinct !DISubprogram(scope: !6, spFlags: DISPFlagDefinition, unit: !2)
!6 = !DIFile(filename: "bar.jl", directory: ".")
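One thing that stands out in the reduced module is that the store carries a `!dbg` location while the `define` line itself has no `!dbg` attachment. A rough way to spot that pattern in a textual IR dump is sketched below. This is a plain line-based scan, not a real IR parser, and the function name `functions_missing_subprogram` is made up for illustration:

```julia
# Hypothetical helper: names of functions that are defined without a `!dbg`
# attachment even though instructions in their body carry `!dbg` locations.
function functions_missing_subprogram(ir::AbstractString)
    suspicious = String[]
    current = nothing        # name of the function being scanned, if any
    header_has_dbg = false
    body_has_dbg = false
    for line in split(ir, '\n')
        stripped = strip(line)
        if startswith(stripped, "define ")
            m = match(r"@([\w.$]+)\s*\(", stripped)
            current = m === nothing ? "<unknown>" : String(m.captures[1])
            header_has_dbg = occursin("!dbg", stripped)
            body_has_dbg = false
        elseif current !== nothing && stripped == "}"
            # End of the function body: report it if only the body had !dbg.
            if body_has_dbg && !header_has_dbg
                push!(suspicious, current)
            end
            current = nothing
        elseif current !== nothing && occursin("!dbg", stripped)
            body_has_dbg = true
        end
    end
    return suspicious
end

sample = """
define void @kernel() local_unnamed_addr {
top:
  store float %1, float addrspace(1)* undef, align 4, !dbg !4
  ret void
}
define void @ok() !dbg !5 {
top:
  ret void, !dbg !4
}
"""
println(functions_missing_subprogram(sample))
```

On the sample, only `kernel` is reported, because `ok` has a `!dbg` attachment on its definition.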

@maleadt

maleadt commented May 23, 2022

Looks fixed on LLVM ToT. I'll try bisecting.

@maleadt

maleadt commented May 23, 2022

dd75a6b2ae5c9c6628fb855473dc2f31073440d0 is the first good commit
commit dd75a6b2ae5c9c6628fb855473dc2f31073440d0
Author: Johannes Doerfert <johannes@jdoerfert.de>
Date:   Wed Oct 27 17:00:41 2021 -0500

    [DWARF][FIX] Try not to crash for nvptx with missing debug information

    This prevents crashes in the OpenMP offload pipeline as not everything
    is properly annotated with debug information, e.g., the runtimes we link
    in. While we might want to have them annotated, it seems to be generally
    useful to gracefully handle missing debug info rather than crashing.

    TODO: A test is missing and can hopefully be distilled prior to landing.

    This fixes #51079.

    Differential Revision: https://reviews.llvm.org/D116959

 llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp         |  5 ++++
 .../DebugInfo/NVPTX/crash-missing-DISubprogram.ll  | 27 ++++++++++++++++++++++
 2 files changed, 32 insertions(+)
 create mode 100644 llvm/test/DebugInfo/NVPTX/crash-missing-DISubprogram.ll

@vchuravy This (and the reduced bitcode) should help you figure out what's wrong. Backporting the commit is the easy fix, but Enzyme.jl is probably somehow botching the debug info.

@vchuravy
Member

vchuravy commented May 23, 2022 via email

@wsmoses
Member

wsmoses commented May 25, 2022

@maleadt @vchuravy this does not appear to be a bug in Enzyme.jl, since even the original module (i.e. without Enzyme codegen) triggers the problem. Specifically, there is no debug info on the function itself, whereas the instructions inside it do have debug info.

mod = ; ModuleID = 'text'
source_filename = "text"
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v16:16:16-v32:32:32-v64:64:64-v128:128:128-n16:32:64-ni:10:11:12:13"
target triple = "nvptx64-nvidia-cuda"

%printf_args.3 = type { i64 }

@0 = private unnamed_addr constant [36 x i8] c"ERROR: Out-of-bounds array access.\0A\00", align 1
@exception = private unnamed_addr constant [10 x i8] c"exception\00", align 1
@1 = private unnamed_addr constant [110 x i8] c"WARNING: could not signal exception status to the host, execution will continue.\0A         Please file a bug.\0A\00", align 1
@2 = private unnamed_addr constant [108 x i8] c"ERROR: a %s was thrown during kernel execution.\0A       Run Julia on debug level 2 for device stack traces.\0A\00", align 1

; Function Attrs: readnone
declare {}*** @julia.get_pgcstack() local_unnamed_addr #0

; Function Attrs: noinline noreturn
define internal fastcc noalias nonnull align 536870912 dereferenceable(4294967295) {} addrspace(10)* @julia__throw_boundserror_3669() unnamed_addr #1 !dbg !39 {
top:
  %0 = call {}*** @julia.get_pgcstack()
  %1 = call i32 @vprintf(i8* noundef getelementptr inbounds ([36 x i8], [36 x i8]* @0, i64 0, i64 0), i8* noundef null), !dbg !41
  call fastcc void @gpu_report_exception(i64 noundef ptrtoint ([10 x i8]* @exception to i64)), !dbg !57
  call fastcc void @gpu_signal_exception(), !dbg !57
  call void asm sideeffect "exit;", ""() #4, !dbg !57
  unreachable, !dbg !57
}

declare i32 @vprintf(i8*, i8*) local_unnamed_addr

define internal fastcc void @gpu_report_exception(i64 noundef zeroext %0) unnamed_addr #2 !dbg !58 {
top:
  %1 = alloca %printf_args.3, align 8
  %2 = call {}*** @julia.get_pgcstack()
  %3 = bitcast %printf_args.3* %1 to i8*, !dbg !59
  call void @llvm.lifetime.start.p0i8(i64 noundef 8, i8* noundef nonnull %3), !dbg !59
  %4 = getelementptr inbounds %printf_args.3, %printf_args.3* %1, i64 0, i32 0, !dbg !59
  store i64 ptrtoint ([10 x i8]* @exception to i64), i64* %4, align 8, !dbg !59
  %5 = call i32 @vprintf(i8* noundef getelementptr inbounds ([108 x i8], [108 x i8]* @2, i64 0, i64 0), i8* noundef nonnull %3), !dbg !59
  call void @llvm.lifetime.end.p0i8(i64 noundef 8, i8* noundef nonnull %3), !dbg !59
  ret void, !dbg !66
}

define internal fastcc void @gpu_signal_exception() unnamed_addr #2 !dbg !67 {
top:
  %0 = call {}*** @julia.get_pgcstack()
  %state.i = call [1 x i64] @julia.gpu.state_getter(), !dbg !68
  %state.i.fca.0.extract = extractvalue [1 x i64] %state.i, 0, !dbg !68
  %.not = icmp eq i64 %state.i.fca.0.extract, 0, !dbg !77
  br i1 %.not, label %L12, label %L8, !dbg !77

L8:                                               ; preds = %top
  %1 = inttoptr i64 %state.i.fca.0.extract to i64*, !dbg !78
  store i64 1, i64* %1, align 1, !dbg !78, !tbaa !83
  call void @llvm.nvvm.membar.sys(), !dbg !87
  br label %L15, !dbg !90

L12:                                              ; preds = %top
  %2 = call i32 @vprintf(i8* noundef getelementptr inbounds ([110 x i8], [110 x i8]* @1, i64 0, i64 0), i8* noundef null), !dbg !91
  br label %L15, !dbg !91

L15:                                              ; preds = %L12, %L8
  ret void, !dbg !97
}

; Function Attrs: readnone
declare [1 x i64] @julia.gpu.state_getter() local_unnamed_addr #3

; Function Attrs: nounwind
declare void @llvm.nvvm.membar.sys() #4

; Function Attrs: argmemonly nofree nosync nounwind willreturn
declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #5

; Function Attrs: argmemonly nofree nosync nounwind willreturn
declare void @llvm.lifetime.end.p0i8(i64 immarg, i8* nocapture) #5

define void @julia_mul_kernel_3665_inner7({ i8 addrspace(1)*, i64, [1 x i64], i64 } %0) local_unnamed_addr #6 {
entry:
  %1 = alloca [1 x i64], align 8
  %2 = alloca { i8 addrspace(1)*, i64, [1 x i64], i64 }, align 8
  %.fca.0.extract = extractvalue { i8 addrspace(1)*, i64, [1 x i64], i64 } %0, 0
  %.fca.0.gep = getelementptr inbounds { i8 addrspace(1)*, i64, [1 x i64], i64 }, { i8 addrspace(1)*, i64, [1 x i64], i64 }* %2, i64 0, i32 0
  store i8 addrspace(1)* %.fca.0.extract, i8 addrspace(1)** %.fca.0.gep, align 8
  %.fca.1.extract = extractvalue { i8 addrspace(1)*, i64, [1 x i64], i64 } %0, 1
  %.fca.1.gep = getelementptr inbounds { i8 addrspace(1)*, i64, [1 x i64], i64 }, { i8 addrspace(1)*, i64, [1 x i64], i64 }* %2, i64 0, i32 1
  store i64 %.fca.1.extract, i64* %.fca.1.gep, align 8
  %.fca.2.0.extract = extractvalue { i8 addrspace(1)*, i64, [1 x i64], i64 } %0, 2, 0
  %.fca.2.0.gep = getelementptr inbounds { i8 addrspace(1)*, i64, [1 x i64], i64 }, { i8 addrspace(1)*, i64, [1 x i64], i64 }* %2, i64 0, i32 2, i64 0
  store i64 %.fca.2.0.extract, i64* %.fca.2.0.gep, align 8
  %.fca.3.extract = extractvalue { i8 addrspace(1)*, i64, [1 x i64], i64 } %0, 3
  %.fca.3.gep = getelementptr inbounds { i8 addrspace(1)*, i64, [1 x i64], i64 }, { i8 addrspace(1)*, i64, [1 x i64], i64 }* %2, i64 0, i32 3
  store i64 %.fca.3.extract, i64* %.fca.3.gep, align 8
  %3 = bitcast [1 x i64]* %1 to i8*
  call void @llvm.lifetime.start.p0i8(i64 noundef 8, i8* noundef nonnull align 8 dereferenceable(8) %3) #7
  %4 = call {}*** @julia.get_pgcstack()
  %5 = getelementptr inbounds [1 x i64], [1 x i64]* %1, i64 0, i64 0, !dbg !98
  store i64 1, i64* %5, align 8, !dbg !98, !tbaa !109
  %6 = icmp slt i64 %.fca.2.0.extract, 1, !dbg !111
  br i1 %6, label %L14.i, label %julia_mul_kernel_3665_inner.exit, !dbg !131

L14.i:                                            ; preds = %entry
  %7 = call fastcc nonnull {} addrspace(10)* @julia__throw_boundserror_3669() #8, !dbg !131
  unreachable, !dbg !131

julia_mul_kernel_3665_inner.exit:                 ; preds = %entry
  %8 = bitcast i8 addrspace(1)* %.fca.0.extract to float addrspace(1)*, !dbg !131
  store float 2.000000e+00, float addrspace(1)* %8, align 4, !dbg !132, !tbaa !144
  call void @llvm.lifetime.end.p0i8(i64 noundef 8, i8* noundef nonnull %3), !dbg !147
  ret void
}

attributes #0 = { readnone "enzyme_inactive" }
attributes #1 = { noinline noreturn "probe-stack"="inline-asm" }
attributes #2 = { "enzyme_inactive" "probe-stack"="inline-asm" }
attributes #3 = { readnone }
attributes #4 = { nounwind }
attributes #5 = { argmemonly nofree nosync nounwind willreturn }
attributes #6 = { "probe-stack"="inline-asm" }
attributes #7 = { willreturn }
attributes #8 = { noreturn "probe-stack"="inline-asm" }

!llvm.module.flags = !{!0, !1}
!llvm.dbg.cu = !{!2, !5, !7, !9, !10, !11, !13, !14, !15, !16, !17, !18, !19, !20, !21, !22, !23, !24, !25, !26, !27, !28, !29, !30, !31, !32, !33, !34, !35, !36, !38}

!0 = !{i32 2, !"Dwarf Version", i32 4}
!1 = !{i32 2, !"Debug Info Version", i32 3}
!2 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !3, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !4, nameTableKind: None)
!3 = !DIFile(filename: "/mnt/Data/git/Enzyme.jl/hil.jl", directory: ".")
!4 = !{}
!5 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !6, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !4, nameTableKind: None)
!6 = !DIFile(filename: "/home/wmoses/.julia/packages/CUDA/fAEDi/src/device/quirks.jl", directory: ".")
!7 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !8, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !4, nameTableKind: None)
!8 = !DIFile(filename: "/home/wmoses/.julia/packages/GPUCompiler/XyxTy/src/runtime.jl", directory: ".")
!9 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !8, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !4, nameTableKind: None)
!10 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !8, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !4, nameTableKind: None)
!11 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !12, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !4, nameTableKind: None)
!12 = !DIFile(filename: "/home/wmoses/.julia/packages/CUDA/fAEDi/src/device/runtime.jl", directory: ".")
!13 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !8, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !4, nameTableKind: None)
!14 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !8, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !4, nameTableKind: None)
!15 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !12, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !4, nameTableKind: None)
!16 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !8, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !4, nameTableKind: None)
!17 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !8, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !4, nameTableKind: None)
!18 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !12, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !4, nameTableKind: None)
!19 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !8, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !4, nameTableKind: None)
!20 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !12, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !4, nameTableKind: None)
!21 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !8, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !4, nameTableKind: None)
!22 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !8, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !4, nameTableKind: None)
!23 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !8, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !4, nameTableKind: None)
!24 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !8, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !4, nameTableKind: None)
!25 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !8, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !4, nameTableKind: None)
!26 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !8, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !4, nameTableKind: None)
!27 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !8, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !4, nameTableKind: None)
!28 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !8, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !4, nameTableKind: None)
!29 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !8, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !4, nameTableKind: None)
!30 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !8, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !4, nameTableKind: None)
!31 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !8, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !4, nameTableKind: None)
!32 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !8, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !4, nameTableKind: None)
!33 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !8, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !4, nameTableKind: None)
!34 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !8, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !4, nameTableKind: None)
!35 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !8, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !4, nameTableKind: None)
!36 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !37, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !4, nameTableKind: None)
!37 = !DIFile(filename: "/home/wmoses/.julia/packages/CUDA/fAEDi/src/device/intrinsics/memory_dynamic.jl", directory: ".")
!38 = distinct !DICompileUnit(language: DW_LANG_Julia, file: !12, producer: "julia", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !4, nameTableKind: None)
!39 = distinct !DISubprogram(name: "#throw_boundserror", linkageName: "julia_#throw_boundserror_3669", scope: null, file: !6, line: 40, type: !40, scopeLine: 40, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !5, retainedNodes: !4)
!40 = !DISubroutineType(types: !4)
!41 = !DILocation(line: 45, scope: !42, inlinedAt: !44)
!42 = distinct !DISubprogram(name: "macro expansion;", linkageName: "macro expansion", scope: !43, file: !43, type: !40, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !5, retainedNodes: !4)
!43 = !DIFile(filename: "/home/wmoses/.julia/packages/LLVM/gE6U9/src/interop/base.jl", directory: ".")
!44 = !DILocation(line: 38, scope: !45, inlinedAt: !47)
!45 = distinct !DISubprogram(name: "macro expansion;", linkageName: "macro expansion", scope: !46, file: !46, type: !40, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !5, retainedNodes: !4)
!46 = !DIFile(filename: "/home/wmoses/.julia/packages/CUDA/fAEDi/src/device/intrinsics/output.jl", directory: ".")
!47 = !DILocation(line: 38, scope: !48, inlinedAt: !49)
!48 = distinct !DISubprogram(name: "_cuprintf;", linkageName: "_cuprintf", scope: !46, file: !46, type: !40, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !5, retainedNodes: !4)
!49 = !DILocation(line: 173, scope: !45, inlinedAt: !50)
!50 = !DILocation(line: 0, scope: !51, inlinedAt: !53)
!51 = distinct !DISubprogram(name: "macro expansion;", linkageName: "macro expansion", scope: !52, file: !52, type: !40, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !5, retainedNodes: !4)
!52 = !DIFile(filename: "none", directory: ".")
!53 = !DILocation(line: 0, scope: !54, inlinedAt: !55)
!54 = distinct !DISubprogram(name: "_cuprint;", linkageName: "_cuprint", scope: !52, file: !52, type: !40, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !5, retainedNodes: !4)
!55 = !DILocation(line: 222, scope: !45, inlinedAt: !56)
!56 = !DILocation(line: 3, scope: !39)
!57 = !DILocation(line: 4, scope: !39)
!58 = distinct !DISubprogram(name: "report_exception", linkageName: "julia_report_exception_2467", scope: null, file: !12, line: 49, type: !40, scopeLine: 49, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !11, retainedNodes: !4)
!59 = !DILocation(line: 45, scope: !60, inlinedAt: !61)
!60 = distinct !DISubprogram(name: "macro expansion;", linkageName: "macro expansion", scope: !43, file: !43, type: !40, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !11, retainedNodes: !4)
!61 = !DILocation(line: 38, scope: !62, inlinedAt: !63)
!62 = distinct !DISubprogram(name: "macro expansion;", linkageName: "macro expansion", scope: !46, file: !46, type: !40, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !11, retainedNodes: !4)
!63 = !DILocation(line: 38, scope: !64, inlinedAt: !65)
!64 = distinct !DISubprogram(name: "_cuprintf;", linkageName: "_cuprintf", scope: !46, file: !46, type: !40, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !11, retainedNodes: !4)
!65 = !DILocation(line: 50, scope: !58)
!66 = !DILocation(line: 54, scope: !58)
!67 = distinct !DISubprogram(name: "signal_exception", linkageName: "julia_signal_exception_2745", scope: null, file: !12, line: 35, type: !40, scopeLine: 35, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !20, retainedNodes: !4)
!68 = !DILocation(line: 45, scope: !69, inlinedAt: !70)
!69 = distinct !DISubprogram(name: "macro expansion;", linkageName: "macro expansion", scope: !43, file: !43, type: !40, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !20, retainedNodes: !4)
!70 = !DILocation(line: 0, scope: !71, inlinedAt: !72)
!71 = distinct !DISubprogram(name: "macro expansion;", linkageName: "macro expansion", scope: !52, file: !52, type: !40, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !20, retainedNodes: !4)
!72 = !DILocation(line: 0, scope: !73, inlinedAt: !74)
!73 = distinct !DISubprogram(name: "kernel_state;", linkageName: "kernel_state", scope: !52, file: !52, type: !40, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !20, retainedNodes: !4)
!74 = !DILocation(line: 33, scope: !75, inlinedAt: !76)
!75 = distinct !DISubprogram(name: "exception_flag;", linkageName: "exception_flag", scope: !12, file: !12, type: !40, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !20, retainedNodes: !4)
!76 = !DILocation(line: 36, scope: !67)
!77 = !DILocation(line: 37, scope: !67)
!78 = !DILocation(line: 118, scope: !79, inlinedAt: !81)
!79 = distinct !DISubprogram(name: "unsafe_store!;", linkageName: "unsafe_store!", scope: !80, file: !80, type: !40, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !20, retainedNodes: !4)
!80 = !DIFile(filename: "pointer.jl", directory: ".")
!81 = !DILocation(line: 118, scope: !79, inlinedAt: !82)
!82 = !DILocation(line: 38, scope: !67)
!83 = !{!84, !84, i64 0}
!84 = !{!"jtbaa_data", !85, i64 0}
!85 = !{!"jtbaa", !86, i64 0}
!86 = !{!"jtbaa"}
!87 = !DILocation(line: 121, scope: !88, inlinedAt: !90)
!88 = distinct !DISubprogram(name: "threadfence_system;", linkageName: "threadfence_system", scope: !89, file: !89, type: !40, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !20, retainedNodes: !4)
!89 = !DIFile(filename: "/home/wmoses/.julia/packages/CUDA/fAEDi/src/device/intrinsics/synchronization.jl", directory: ".")
!90 = !DILocation(line: 39, scope: !67)
!91 = !DILocation(line: 45, scope: !69, inlinedAt: !92)
!92 = !DILocation(line: 38, scope: !93, inlinedAt: !94)
!93 = distinct !DISubprogram(name: "macro expansion;", linkageName: "macro expansion", scope: !46, file: !46, type: !40, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !20, retainedNodes: !4)
!94 = !DILocation(line: 38, scope: !95, inlinedAt: !96)
!95 = distinct !DISubprogram(name: "_cuprintf;", linkageName: "_cuprintf", scope: !46, file: !46, type: !40, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !20, retainedNodes: !4)
!96 = !DILocation(line: 41, scope: !67)
!97 = !DILocation(line: 46, scope: !67)
!98 = !DILocation(line: 654, scope: !99, inlinedAt: !101)
!99 = distinct !DISubprogram(name: "checkbounds;", linkageName: "checkbounds", scope: !100, file: !100, type: !40, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!100 = !DIFile(filename: "abstractarray.jl", directory: ".")
!101 = distinct !DILocation(line: 151, scope: !102, inlinedAt: !104)
!102 = distinct !DISubprogram(name: "#arrayset;", linkageName: "#arrayset", scope: !103, file: !103, type: !40, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!103 = !DIFile(filename: "/home/wmoses/.julia/packages/CUDA/fAEDi/src/device/array.jl", directory: ".")
!104 = distinct !DILocation(line: 194, scope: !105, inlinedAt: !106)
!105 = distinct !DISubprogram(name: "setindex!;", linkageName: "setindex!", scope: !103, file: !103, type: !40, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!106 = distinct !DILocation(line: 6, scope: !107, inlinedAt: !108)
!107 = distinct !DISubprogram(name: "mul_kernel", linkageName: "julia_mul_kernel_3665", scope: null, file: !3, line: 5, type: !40, scopeLine: 5, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!108 = distinct !DILocation(line: 0, scope: !107)
!109 = !{!110, !110, i64 0}
!110 = !{!"jtbaa_stack", !85, i64 0}
!111 = !DILocation(line: 479, scope: !112, inlinedAt: !114)
!112 = distinct !DISubprogram(name: "max;", linkageName: "max", scope: !113, file: !113, type: !40, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!113 = !DIFile(filename: "promotion.jl", directory: ".")
!114 = distinct !DILocation(line: 398, scope: !115, inlinedAt: !117)
!115 = distinct !DISubprogram(name: "OneTo;", linkageName: "OneTo", scope: !116, file: !116, type: !40, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!116 = !DIFile(filename: "range.jl", directory: ".")
!117 = distinct !DILocation(line: 411, scope: !115, inlinedAt: !118)
!118 = distinct !DILocation(line: 413, scope: !119, inlinedAt: !120)
!119 = distinct !DISubprogram(name: "oneto;", linkageName: "oneto", scope: !116, file: !116, type: !40, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!120 = distinct !DILocation(line: 221, scope: !121, inlinedAt: !123)
!121 = distinct !DISubprogram(name: "map;", linkageName: "map", scope: !122, file: !122, type: !40, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!122 = !DIFile(filename: "tuple.jl", directory: ".")
!123 = distinct !DILocation(line: 95, scope: !124, inlinedAt: !125)
!124 = distinct !DISubprogram(name: "axes;", linkageName: "axes", scope: !100, file: !100, type: !40, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!125 = distinct !DILocation(line: 116, scope: !126, inlinedAt: !127)
!126 = distinct !DISubprogram(name: "axes1;", linkageName: "axes1", scope: !100, file: !100, type: !40, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!127 = distinct !DILocation(line: 335, scope: !128, inlinedAt: !129)
!128 = distinct !DISubprogram(name: "eachindex;", linkageName: "eachindex", scope: !100, file: !100, type: !40, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!129 = distinct !DILocation(line: 641, scope: !99, inlinedAt: !130)
!130 = distinct !DILocation(line: 656, scope: !99, inlinedAt: !101)
!131 = !DILocation(line: 656, scope: !99, inlinedAt: !101)
!132 = !DILocation(line: 45, scope: !133, inlinedAt: !134)
!133 = distinct !DISubprogram(name: "macro expansion;", linkageName: "macro expansion", scope: !43, file: !43, type: !40, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!134 = distinct !DILocation(line: 44, scope: !135, inlinedAt: !137)
!135 = distinct !DISubprogram(name: "macro expansion;", linkageName: "macro expansion", scope: !136, file: !136, type: !40, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!136 = !DIFile(filename: "/home/wmoses/.julia/packages/LLVM/gE6U9/src/interop/pointer.jl", directory: ".")
!137 = distinct !DILocation(line: 44, scope: !138, inlinedAt: !139)
!138 = distinct !DISubprogram(name: "pointerset;", linkageName: "pointerset", scope: !136, file: !136, type: !40, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!139 = distinct !DILocation(line: 84, scope: !140, inlinedAt: !141)
!140 = distinct !DISubprogram(name: "unsafe_store!;", linkageName: "unsafe_store!", scope: !136, file: !136, type: !40, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!141 = distinct !DILocation(line: 162, scope: !142, inlinedAt: !143)
!142 = distinct !DISubprogram(name: "arrayset_bits;", linkageName: "arrayset_bits", scope: !103, file: !103, type: !40, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !2, retainedNodes: !4)
!143 = distinct !DILocation(line: 153, scope: !102, inlinedAt: !104)
!144 = !{!145, !145, i64 0, i64 0}
!145 = !{!"custom_tbaa_addrspace(1)", !146, i64 0}
!146 = !{!"custom_tbaa"}
!147 = !DILocation(line: 7, scope: !107, inlinedAt: !108)

@wsmoses
Member

wsmoses commented May 25, 2022

Okay, turns out it's in an Enzyme.jl wrapper LLVM function.
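
For anyone trying to spot this kind of function in a dump: in the IR above, `@julia_mul_kernel_3665_inner7` is defined without a `!dbg` attachment even though its instructions carry `!dbg` locations, so it is presumably the wrapper in question. Below is a rough sketch of a text-level check for that pattern. It is not part of Enzyme.jl, GPUCompiler, or LLVM; the function name `functions_missing_subprogram` and the sample IR are made up for illustration, and it assumes the usual textual IR layout where a function body ends with a line containing only `}`.

```python
import re

# Matches the start of a function definition and captures its name
# (either a quoted name or a plain identifier).
DEFINE_RE = re.compile(r'^\s*define\b.*?@("[^"]+"|[\w.$-]+)\s*\(')


def functions_missing_subprogram(ir_text):
    """Return names of defined functions whose instructions carry !dbg
    locations but whose define line has no !dbg attachment."""
    suspects = []
    name = None              # function currently being scanned, if any
    has_subprogram = False   # define line has a !dbg attachment
    has_locations = False    # at least one instruction has a !dbg location
    for line in ir_text.splitlines():
        if name is None:
            m = DEFINE_RE.match(line)
            if m:
                name = m.group(1)
                has_subprogram = "!dbg !" in line
                has_locations = False
            continue
        if line.strip() == "}":
            # End of the function body: record it if it looks inconsistent.
            if has_locations and not has_subprogram:
                suspects.append(name)
            name = None
        elif "!dbg !" in line:
            has_locations = True
    return suspects


# Hypothetical, heavily reduced IR used only to exercise the check.
SAMPLE = """\
define void @wrapper(i64 %0) {
entry:
  call void @inner(), !dbg !5
  ret void
}

define internal void @inner() !dbg !3 {
top:
  ret void, !dbg !4
}
"""

print(functions_missing_subprogram(SAMPLE))
```

This is only a text heuristic. It makes no claim about what the NVPTX backend actually requires, and it will miss anything that does not follow the plain `define ... { ... }` layout.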
