[AMDGPU] double free or corruption (!prev) in runTreeUpDown while linking RCCL #165062

@cgmb

Description

When building RCCL using LLVM 21 on Debian, I am seeing a crash in the linking stage. Admittedly, because RCCL is built with -fgpu-rdc, the link step accounts for most of the compilation process. The crash occurs while compiling the function void (anonymous namespace)::runTreeUpDown<unsigned int, FuncPreMulSum<unsigned int>, ProtoSimple<1, 1, 2, 0, 0>, 2>(int, int, ncclDevWorkColl*), as shown in the backtrace:

double free or corruption (!prev)
PLEASE submit a bug report to 
https://github.com/llvm/llvm-project/issues/ and include the crash 
backtrace.
Stack dump:
0.    Running pass 'CallGraph Pass Manager' on module 'ld-temp.o'.
1.    Running pass 'Machine Instruction Scheduler' on function 
'@_ZN12_GLOBAL__N_113runTreeUpDownIj13FuncPreMulSumIjE11ProtoSimpleILi1ELi1ELi2ELi0ELi0EELi2EEEviiP15ncclDevWorkColl'
double free or corruption (!prev)
clang++-21: error: unable to execute command: Aborted
clang++-21: error: amdgcn-link command failed due to signal (use -v to 
see invocation)
Debian clang version 21.1.4 (1)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/lib/llvm-21/bin
clang++-21: warning: treating 'c' input as 'c++' when in C++ mode, this 
behavior is deprecated [-Wdeprecated]
clang++-21: warning: treating 'c' input as 'c++' when in C++ mode, this 
behavior is deprecated [-Wdeprecated]
clang++-21: warning: treating 'c' input as 'c++' when in C++ mode, this 
behavior is deprecated [-Wdeprecated]
clang++-21: note: diagnostic msg: Error generating preprocessed source(s).
failed to execute:/usr/bin/clang++-21 --driver-mode=g++ --hip-link 
--offload-arch=gfx803 --offload-arch=gfx900 --offload-arch=gfx906 
--offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx1010 
--offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 
--offload-arch=gfx1102  -fPIC -gz -g -O2 
-ffile-prefix-map=/root/rccl/rccl=. -Xarch_host -fstack-protector-strong 
-fstack-clash-protection -Wformat -Werror=format-security -Xarch_host 
-fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2 -O3 -DNDEBUG -Xlinker 
--dependency-file=CMakeFiles/rccl.dir/link.d -v -Wl,-z,relro -Wl,-z,now 
-shared -Wl,-soname,librccl.so.1 -o "librccl.so.1.0" 
CMakeFiles/rccl.dir/hipify/src/bootstrap.cc.o 
CMakeFiles/rccl.dir/hipify/src/channel.cc.o 
CMakeFiles/rccl.dir/hipify/src/collectives.cc.o 
CMakeFiles/rccl.dir/hipify/src/debug.cc.o 
CMakeFiles/rccl.dir/hipify/src/enqueue.cc.o 
CMakeFiles/rccl.dir/hipify/src/group.cc.o 
CMakeFiles/rccl.dir/hipify/src/init.cc.o 
CMakeFiles/rccl.dir/hipify/src/init_nvtx.cc.o 
CMakeFiles/rccl.dir/hipify/src/net.cc.o 
CMakeFiles/rccl.dir/hipify/src/msccl.cc.o 
CMakeFiles/rccl.dir/hipify/src/proxy.cc.o 
CMakeFiles/rccl.dir/hipify/src/rccl_wrap.cc.o 
CMakeFiles/rccl.dir/hipify/src/register.cc.o 
CMakeFiles/rccl.dir/hipify/src/transport.cc.o 
CMakeFiles/rccl.dir/hipify/src/device/common.cu.cpp.o 
CMakeFiles/rccl.dir/hipify/src/device/onerank.cu.cpp.o 
CMakeFiles/rccl.dir/hipify/src/graph/connect.cc.o 
CMakeFiles/rccl.dir/hipify/src/graph/paths.cc.o 
CMakeFiles/rccl.dir/hipify/src/graph/rings.cc.o 
CMakeFiles/rccl.dir/hipify/src/graph/rome_models.cc.o 
CMakeFiles/rccl.dir/hipify/src/graph/search.cc.o 
CMakeFiles/rccl.dir/hipify/src/graph/topo.cc.o 
CMakeFiles/rccl.dir/hipify/src/graph/trees.cc.o 
CMakeFiles/rccl.dir/hipify/src/graph/tuning.cc.o 
CMakeFiles/rccl.dir/hipify/src/graph/xml.cc.o 
CMakeFiles/rccl.dir/hipify/src/misc/alt_rsmi.cc.o 
CMakeFiles/rccl.dir/hipify/src/misc/archinfo.cc.o 
CMakeFiles/rccl.dir/hipify/src/misc/argcheck.cc.o 
CMakeFiles/rccl.dir/hipify/src/misc/api_trace.cc.o 
CMakeFiles/rccl.dir/hipify/src/misc/ibvsymbols.cc.o 
CMakeFiles/rccl.dir/hipify/src/misc/ibvwrap.cc.o 
CMakeFiles/rccl.dir/hipify/src/misc/ipcsocket.cc.o 
CMakeFiles/rccl.dir/hipify/src/misc/npkit.cc.o 
CMakeFiles/rccl.dir/hipify/src/misc/nvmlwrap_stub.cc.o 
CMakeFiles/rccl.dir/hipify/src/misc/param.cc.o 
CMakeFiles/rccl.dir/hipify/src/misc/profiler.cc.o 
CMakeFiles/rccl.dir/hipify/src/misc/rocm_smi_wrap.cc.o 
CMakeFiles/rccl.dir/hipify/src/misc/rocmwrap.cc.o 
CMakeFiles/rccl.dir/hipify/src/misc/roctx.cc.o 
CMakeFiles/rccl.dir/hipify/src/misc/shmutils.cc.o 
CMakeFiles/rccl.dir/hipify/src/misc/signals.cc.o 
CMakeFiles/rccl.dir/hipify/src/misc/socket.cc.o 
CMakeFiles/rccl.dir/hipify/src/misc/strongstream.cc.o 
CMakeFiles/rccl.dir/hipify/src/misc/tuner.cc.o 
CMakeFiles/rccl.dir/hipify/src/misc/utils.cc.o 
CMakeFiles/rccl.dir/hipify/src/misc/msccl/msccl_lifecycle.cc.o 
CMakeFiles/rccl.dir/hipify/src/misc/msccl/msccl_parser.cc.o 
CMakeFiles/rccl.dir/hipify/src/misc/msccl/msccl_setup.cc.o 
CMakeFiles/rccl.dir/hipify/src/misc/msccl/msccl_status.cc.o 
CMakeFiles/rccl.dir/hipify/src/transport/coll_net.cc.o 
CMakeFiles/rccl.dir/hipify/src/transport/generic.cc.o 
CMakeFiles/rccl.dir/hipify/src/transport/net_tmp.cc.o 
CMakeFiles/rccl.dir/hipify/src/transport/net_ib.cc.o 
CMakeFiles/rccl.dir/hipify/src/transport/net_socket.cc.o 
CMakeFiles/rccl.dir/hipify/src/transport/nvls.cc.o 
CMakeFiles/rccl.dir/hipify/src/transport/p2p.cc.o 
CMakeFiles/rccl.dir/hipify/src/transport/shm.cc.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_gather_sum_i8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_bf16.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_bf8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_f16.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_f32.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_f64.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_f8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_u32.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_u64.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_minmax_u8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_bf16.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_bf8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_f16.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_f32.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_f64.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_f8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_u32.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_u64.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_premulsum_u8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_bf16.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_bf8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_f16.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_f32.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_f64.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_f8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_u32.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_u64.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_prod_u8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_bf16.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_bf8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_f16.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_f32.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_f64.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_f8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_u32.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_u64.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sum_u8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_i32.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_i64.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_i8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_u32.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_u64.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/all_reduce_sumpostdiv_u8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/alltoall_pivot_sum_i8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/broadcast_sum_i8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/device_table.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/host_table.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_bf16.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_bf8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_f16.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_f32.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_f64.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_f8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_u32.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_u64.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_minmax_u8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_bf16.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_bf8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_f16.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_f32.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_f64.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_f8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_u32.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_u64.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_premulsum_u8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_bf16.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_bf8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_f16.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_f32.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_f64.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_f8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_u32.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_u64.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_prod_u8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_bf16.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_bf8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_f16.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_f32.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_f64.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_f8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_u32.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_u64.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_minmax_u8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_bf16.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_bf8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_f16.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_f32.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_f64.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_f8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_u32.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_u64.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_premulsum_u8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_bf16.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_bf8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_f16.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_f32.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_f64.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_f8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_u32.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_u64.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_prod_u8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_bf16.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_bf8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_f16.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_f32.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_f64.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_f8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_u32.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_u64.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sum_u8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_i32.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_i64.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_i8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_u32.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_u64.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_scatter_sumpostdiv_u8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_bf16.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_bf8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_f16.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_f32.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_f64.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_f8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_u32.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_u64.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_sum_u8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_i32.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_i64.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_i8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_u32.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_u64.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/reduce_sumpostdiv_u8.cpp.o 
CMakeFiles/rccl.dir/hipify/gensrc/sendrecv_sum_i8.cpp.o 
CMakeFiles/rccl.dir/git_version.cpp.o -fgpu-rdc -ldl 
/usr/lib/x86_64-linux-gnu/librocm_smi64.so.1.0 
/usr/lib/x86_64-linux-gnu/libamdhip64.so.5.7.31921 --hip-link 
/usr/lib/llvm-21/lib/clang/21/lib/linux/libclang_rt.builtins-x86_64.a 
-lpthread -lrt -ldl
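
Because the log above is hard-wrapped, the mangled function name ends up on a different line from the pass that was running. As a rough sketch (not part of the original report), the stack dump frames could be pulled out with a few lines of Python. The helper name `parse_stack_dump` is hypothetical, and the regular expression only assumes the `Running pass '...' on module|function '...'` wording seen in this log.

```python
import re


def parse_stack_dump(log: str) -> list[tuple[str, str, str]]:
    """Return (pass_name, kind, target) for each 'Running pass' frame."""
    # The log is hard-wrapped, so rejoin the lines before matching.
    flat = " ".join(line.strip() for line in log.splitlines())
    pattern = re.compile(r"Running pass '([^']+)' on (module|function) '([^']+)'")
    return pattern.findall(flat)


# Excerpt copied from the stack dump in this report.
log = """\
Stack dump:
0.    Running pass 'CallGraph Pass Manager' on module 'ld-temp.o'.
1.    Running pass 'Machine Instruction Scheduler' on function
'@_ZN12_GLOBAL__N_113runTreeUpDownIj13FuncPreMulSumIjE11ProtoSimpleILi1ELi1ELi2ELi0ELi0EELi2EEEviiP15ncclDevWorkColl'
"""

frames = parse_stack_dump(log)
for pass_name, kind, target in frames:
    print(f"{pass_name}: {kind} {target}")
```

The innermost frame is the last one in the list, which here is the Machine Instruction Scheduler running on the runTreeUpDown instantiation.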

This output was produced using clang-21 1:21.1.4-1 on Debian Unstable. I'll try to follow up on this issue after going through the instructions for identifying whether this is a front-end, middle-end, or back-end issue, and after attempting to reproduce it with the official LLVM 21.1.4 release.
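
One possible first step toward a smaller reproducer is to find out which of the ten GPU targets triggers the crash. The sketch below is only an illustration: it rewrites a link command so that it keeps a single `--offload-arch=` flag at a time. The helper name `per_arch_commands` is hypothetical, the example command is an abbreviated stand-in for the full invocation above (`a.o` and `out.so` are placeholders), and it assumes the objects can be linked for one target in isolation, which has not been verified here.

```python
import shlex

PREFIX = "--offload-arch="


def per_arch_commands(command: str) -> dict[str, str]:
    """Map each offload arch to a copy of the command that targets only it."""
    args = shlex.split(command)
    archs = [a[len(PREFIX):] for a in args if a.startswith(PREFIX)]
    if not archs:
        return {}
    # Keep the single remaining flag where the first arch flag used to be.
    first = next(i for i, a in enumerate(args) if a.startswith(PREFIX))
    head = args[:first]
    tail = [a for a in args[first:] if not a.startswith(PREFIX)]
    commands = {}
    for arch in dict.fromkeys(archs):  # de-duplicate, preserve order
        commands[arch] = shlex.join(head + [PREFIX + arch] + tail)
    return commands


# Abbreviated, hypothetical stand-in for the failing link command.
example = (
    "clang++-21 --hip-link --offload-arch=gfx803 --offload-arch=gfx900 "
    "-fgpu-rdc -shared -o out.so a.o"
)
for arch, cmd in per_arch_commands(example).items():
    print(arch, "->", cmd)
```

Each generated command would then be run separately. If only some targets crash, the report can name them and the reduction can focus on those.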

Originally reported by @tflink in Debian Bug #1118478.

Labels

crash (Prefer [crash-on-valid] or [crash-on-invalid])
needs-reduction (Large reproducer that should be reduced into a simpler form)
