Segfault during trampoline allocation when querying occupancy from multiple threads #707

Closed
JuliaLang/julia
#39621
@norci

Description

I'm using the master branch.

I always get this error in my project, but I'm not able to reproduce it with a minimal example.

The operation is something like

x ./= sum(x; dims = 2)

Update:
It fails only occasionally, and only when the operation runs in a few @async tasks.
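
For readers trying to picture the pattern, here is a rough sketch of it: a few @async tasks that each repeatedly run the normalization above. It is not a confirmed reproducer, since no minimal case was found. The names normalize_rows! and run_tasks, the array sizes, and the task and iteration counts are all made up for illustration. The sketch runs on plain CPU arrays; the crashing case presumably uses CUDA.jl arrays instead (the log goes through CUDA's mapreduce.jl and occupancy.jl), and it also contains a threadingconstructs.jl frame, so the real project may be using Threads.@spawn rather than @async.

```julia
# Hypothetical helper: the operation from the report, written as a function.
function normalize_rows!(x::AbstractMatrix)
    x ./= sum(x; dims = 2)   # divide each row by its sum, in place
    return x
end

# Hypothetical driver: start `ntasks` tasks that each normalize their own array.
# `make_array` is a zero-argument function that builds the array to work on.
function run_tasks(make_array; ntasks = 4, iters = 100)
    results = Vector{Any}(undef, ntasks)
    @sync for i in 1:ntasks
        @async begin
            x = make_array()
            for _ in 1:iters
                normalize_rows!(x)
            end
            results[i] = x
        end
    end
    return results
end

# CPU stand-in. With CUDA.jl loaded, one would pass something like
# () -> CUDA.rand(Float32, 16, 16) instead (assumption: x is a CuArray).
out = run_tasks(() -> rand(Float32, 16, 16))
```

On the GPU path, each call to sum(x; dims = 2) goes through launch_configuration, which is where the backtrace below ends up, so concurrent tasks would reach that code at roughly the same time.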

Log:

signal (11): Segmentation fault
in expression starting at REPL[1]:1
trampoline_alloc at /buildworker/worker/package_linux64/build/src/runtime_ccall.cpp:244 [inlined]
jl_get_cfunction_trampoline at /buildworker/worker/package_linux64/build/src/runtime_ccall.cpp:350
#33 at /julia_depot/dev/CUDA/lib/cudadrv/occupancy.jl:64 [inlined]
lock at ./lock.jl:187
unknown function (ip: 0x7fe4ff04df05)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2238 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2420
#launch_configuration#31 at /julia_depot/dev/CUDA/lib/cudadrv/occupancy.jl:63
launch_configuration##kw at /julia_depot/dev/CUDA/lib/cudadrv/occupancy.jl:56 [inlined]
#mapreducedim!#335 at /julia_depot/dev/CUDA/src/mapreduce.jl:194
mapreducedim!##kw at /julia_depot/dev/CUDA/src/mapreduce.jl:143 [inlined]
#_mapreduce#17 at /julia_depot/packages/GPUArrays/WV76E/src/host/mapreduce.jl:62
_mapreduce##kw at /julia_depot/packages/GPUArrays/WV76E/src/host/mapreduce.jl:34 [inlined]
#mapreduce#15 at /julia_depot/packages/GPUArrays/WV76E/src/host/mapreduce.jl:28 [inlined]
mapreduce at /julia_depot/packages/GPUArrays/WV76E/src/host/mapreduce.jl:28 [inlined]
#_sum#684 at ./reducedim.jl:878 [inlined]
_sum at ./reducedim.jl:878 [inlined]
#_sum#683 at ./reducedim.jl:877 [inlined]
_sum at ./reducedim.jl:877 [inlined]
#sum#681 at ./reducedim.jl:873 [inlined]
sum at ./reducedim.jl:873 [inlined]
unknown function (ip: 0x7fe4ff170e31)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2238 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2420
run at /julia_depot/dev/ReinforcementLearningCore/src/core/experiment.jl:45
unknown function (ip: 0x7fe4fc27e9af)
#63 at /julia_depot/dev/CUDA/src/state.jl:540 [inlined]
task_local_storage at ./task.jl:276
stream! at /julia_depot/dev/CUDA/src/state.jl:537
macro expansion at /julia_depot/dev/CUDA/lib/nvtx/highlevel.jl:73 [inlined]
#31 at ./threadingconstructs.jl:169
unknown function (ip: 0x7fe4fc27ab7c)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2238 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2420
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1703 [inlined]
start_task at /buildworker/worker/package_linux64/build/src/task.c:839
unknown function (ip: (nil))
Allocations: 422072602 (Pool: 421921897; Big: 150705); GC: 145
Segmentation fault (core dumped)

Labels: bug