ubuntu16.04 build from source caffe2 #12913

Open
jyh890622 opened this issue Oct 21, 2018 · 10 comments

@jyh890622

❓ Questions and Help

Please note that this issue tracker is not a help form and this issue will be closed.

cm@cm-Vostro-2421:~/pytorch$ python setup.py install
Building wheel torch-1.0.0a0+ed02619
running install
setup.py::run()
running build_deps
setup.py::build_deps::run()

  • SYNC_COMMAND=cp
    ++ command -v rsync
  • '[' -x /usr/bin/rsync ']'
  • SYNC_COMMAND='rsync -lptgoD'
  • CMAKE_COMMAND=cmake
    ++ command -v cmake3
  • [[ -x '' ]]
  • USE_CUDA=0
  • USE_ROCM=0
  • USE_NNPACK=0
  • USE_MKLDNN=0
  • USE_GLOO_IBVERBS=0
  • CAFFE2_STATIC_LINK_CUDA=0
  • RERUN_CMAKE=1
  • [[ 8 -gt 0 ]]
  • case "$1" in
  • USE_CUDA=1
  • shift
  • [[ 7 -gt 0 ]]
  • case "$1" in
  • USE_NNPACK=1
  • shift
  • [[ 6 -gt 0 ]]
  • case "$1" in
  • break
  • CMAKE_INSTALL='make install'
  • BUILD_SHARED_LIBS=ON
  • USER_CFLAGS=
  • USER_LDFLAGS=
  • [[ -n '' ]]
  • [[ -n '' ]]
  • [[ -n '' ]]
    ++ uname
  • '[' Linux == Darwin ']'
    +++ dirname ../tools/build_pytorch_libs.sh
    ++ cd ../tools/..
    +++ pwd
    ++ printf '%q\n' /home/cm/pytorch
  • BASE_DIR=/home/cm/pytorch
  • TORCH_LIB_DIR=/home/cm/pytorch/torch/lib
  • INSTALL_DIR=/home/cm/pytorch/torch/lib/tmp_install
  • THIRD_PARTY_DIR=/home/cm/pytorch/third_party
  • C_FLAGS=' -I"/home/cm/pytorch/torch/lib/tmp_install/include" -I"/home/cm/pytorch/torch/lib/tmp_install/include/TH" -I"/home/cm/pytorch/torch/lib/tmp_install/include/THC" -I"/home/cm/pytorch/torch/lib/tmp_install/include/THS" -I"/home/cm/pytorch/torch/lib/tmp_install/include/THCS" -I"/home/cm/pytorch/torch/lib/tmp_install/include/THNN" -I"/home/cm/pytorch/torch/lib/tmp_install/include/THCUNN"'
  • C_FLAGS=' -I"/home/cm/pytorch/torch/lib/tmp_install/include" -I"/home/cm/pytorch/torch/lib/tmp_install/include/TH" -I"/home/cm/pytorch/torch/lib/tmp_install/include/THC" -I"/home/cm/pytorch/torch/lib/tmp_install/include/THS" -I"/home/cm/pytorch/torch/lib/tmp_install/include/THCS" -I"/home/cm/pytorch/torch/lib/tmp_install/include/THNN" -I"/home/cm/pytorch/torch/lib/tmp_install/include/THCUNN" -DOMPI_SKIP_MPICXX=1'
  • LDFLAGS='-L"/home/cm/pytorch/torch/lib/tmp_install/lib" '
  • LD_POSTFIX=.so
    ++ uname
  • [[ Linux == \D\a\r\w\i\n ]]
  • [[ 0 -eq 1 ]]
  • LDFLAGS='-L"/home/cm/pytorch/torch/lib/tmp_install/lib" -Wl,-rpath,$ORIGIN'
  • CPP_FLAGS=' -std=c++11 '
  • GLOO_FLAGS='-DBUILD_TEST=OFF '
  • THD_FLAGS=
  • NCCL_ROOT_DIR=/home/cm/pytorch/torch/lib/tmp_install
  • [[ 1 -eq 1 ]]
  • GLOO_FLAGS+='-DUSE_CUDA=1 -DNCCL_ROOT_DIR=/home/cm/pytorch/torch/lib/tmp_install'
  • [[ 0 -eq 1 ]]
  • CWRAP_FILES='/home/cm/pytorch/torch/lib/ATen/Declarations.cwrap;/home/cm/pytorch/torch/lib/THNN/generic/THNN.h;/home/cm/pytorch/torch/lib/THCUNN/generic/THCUNN.h;/home/cm/pytorch/torch/lib/ATen/nn.yaml'
  • CUDA_NVCC_FLAGS=' -I"/home/cm/pytorch/torch/lib/tmp_install/include" -I"/home/cm/pytorch/torch/lib/tmp_install/include/TH" -I"/home/cm/pytorch/torch/lib/tmp_install/include/THC" -I"/home/cm/pytorch/torch/lib/tmp_install/include/THS" -I"/home/cm/pytorch/torch/lib/tmp_install/include/THCS" -I"/home/cm/pytorch/torch/lib/tmp_install/include/THNN" -I"/home/cm/pytorch/torch/lib/tmp_install/include/THCUNN" -DOMPI_SKIP_MPICXX=1'
  • [[ -z '' ]]
  • CUDA_DEVICE_DEBUG=0
  • '[' -z '' ']'
    ++ getconf _NPROCESSORS_ONLN
  • MAX_JOBS=4
  • BUILD_TYPE=Release
  • [[ -n '' ]]
  • [[ -n '' ]]
  • echo 'Building in Release mode'
    Building in Release mode
  • mkdir -p /home/cm/pytorch/torch/lib/tmp_install
  • for arg in '"$@"'
  • [[ nccl == \n\c\c\l ]]
  • pushd /home/cm/pytorch/third_party
    ~/pytorch/third_party ~/pytorch/build
  • build_nccl
  • mkdir -p build/nccl
  • pushd build/nccl
    ~/pytorch/third_party/build/nccl ~/pytorch/third_party ~/pytorch/build
  • [[ 1 -eq 1 ]]
  • cmake ../../nccl -DCMAKE_MODULE_PATH=/home/cm/pytorch/cmake/Modules_CUDA_fix -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/home/cm/pytorch/torch/lib/tmp_install '-DCMAKE_C_FLAGS= -I"/home/cm/pytorch/torch/lib/tmp_install/include" -I"/home/cm/pytorch/torch/lib/tmp_install/include/TH" -I"/home/cm/pytorch/torch/lib/tmp_install/include/THC" -I"/home/cm/pytorch/torch/lib/tmp_install/include/THS" -I"/home/cm/pytorch/torch/lib/tmp_install/include/THCS" -I"/home/cm/pytorch/torch/lib/tmp_install/include/THNN" -I"/home/cm/pytorch/torch/lib/tmp_install/include/THCUNN" -DOMPI_SKIP_MPICXX=1 ' '-DCMAKE_CXX_FLAGS= -I"/home/cm/pytorch/torch/lib/tmp_install/include" -I"/home/cm/pytorch/torch/lib/tmp_install/include/TH" -I"/home/cm/pytorch/torch/lib/tmp_install/include/THC" -I"/home/cm/pytorch/torch/lib/tmp_install/include/THS" -I"/home/cm/pytorch/torch/lib/tmp_install/include/THCS" -I"/home/cm/pytorch/torch/lib/tmp_install/include/THNN" -I"/home/cm/pytorch/torch/lib/tmp_install/include/THCUNN" -DOMPI_SKIP_MPICXX=1 -std=c++11 ' -DCMAKE_SHARED_LINKER_FLAGS= -DCMAKE_UTILS_PATH=/home/cm/pytorch/cmake/public/utils.cmake -DNUM_JOBS=4
    -- The C compiler identification is GNU 5.4.0
    -- The CXX compiler identification is GNU 5.4.0
    -- Check for working C compiler: /usr/bin/cc
    -- Check for working C compiler: /usr/bin/cc -- works
    -- Detecting C compiler ABI info
    -- Detecting C compiler ABI info - done
    -- Detecting C compile features
    -- Detecting C compile features - done
    -- Check for working CXX compiler: /usr/bin/c++
    -- Check for working CXX compiler: /usr/bin/c++ -- works
    -- Detecting CXX compiler ABI info
    -- Detecting CXX compiler ABI info - done
    -- Detecting CXX compile features
    -- Detecting CXX compile features - done
    -- Looking for pthread.h
    -- Looking for pthread.h - found
    -- Looking for pthread_create
    -- Looking for pthread_create - not found
    -- Looking for pthread_create in pthreads
    -- Looking for pthread_create in pthreads - not found
    -- Looking for pthread_create in pthread
    -- Looking for pthread_create in pthread - found
    -- Found Threads: TRUE
    -- Found CUDA: /usr/local/cuda (found suitable version "8.0", minimum required is "7.0")
    -- Autodetected CUDA architecture(s): 2.1(2.0)
    -- Set NVCC_GENCODE for building NCCL: -gencode=arch=compute_20,code=sm_21
    -- Configuring done
    -- Generating done
    -- Build files have been written to: /home/cm/pytorch/third_party/build/nccl
  • make install -j4
    Scanning dependencies of target nccl
    [100%] Generating lib/libnccl.so
    make[3]: warning: -jN forced in submake: disabling jobserver mode.
    Generating nccl.h.in > nccl.h
    Compiling init.cu > /home/cm/pytorch/third_party/build/nccl/obj/init.o
    Compiling ring.cu > /home/cm/pytorch/third_party/build/nccl/obj/ring.o
    Compiling bootstrap.cu > /home/cm/pytorch/third_party/build/nccl/obj/bootstrap.o
    Compiling transport.cu > /home/cm/pytorch/third_party/build/nccl/obj/transport.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    init.cu:52:1: warning: ‘ncclNet’ initialized and declared ‘extern’
    ncclNet_t* ncclNet = NULL;
    ^
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    Compiling misc/group.cu > /home/cm/pytorch/third_party/build/nccl/obj/misc/group.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    Compiling misc/nvmlwrap.cu > /home/cm/pytorch/third_party/build/nccl/obj/misc/nvmlwrap.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    Compiling misc/ibvwrap.cu > /home/cm/pytorch/third_party/build/nccl/obj/misc/ibvwrap.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    Compiling misc/rings.cu > /home/cm/pytorch/third_party/build/nccl/obj/misc/rings.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    Compiling misc/utils.cu > /home/cm/pytorch/third_party/build/nccl/obj/misc/utils.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    Compiling misc/enqueue.cu > /home/cm/pytorch/third_party/build/nccl/obj/misc/enqueue.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    Compiling transport/p2p.cu > /home/cm/pytorch/third_party/build/nccl/obj/transport/p2p.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    Compiling transport/shm.cu > /home/cm/pytorch/third_party/build/nccl/obj/transport/shm.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    Compiling transport/net.cu > /home/cm/pytorch/third_party/build/nccl/obj/transport/net.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    Compiling transport/net_socket.cu > /home/cm/pytorch/third_party/build/nccl/obj/transport/net_socket.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    Compiling transport/net_ib.cu > /home/cm/pytorch/third_party/build/nccl/obj/transport/net_ib.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    Compiling collectives/all_reduce.cu > /home/cm/pytorch/third_party/build/nccl/obj/collectives/all_reduce.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    Compiling collectives/all_gather.cu > /home/cm/pytorch/third_party/build/nccl/obj/collectives/all_gather.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    Compiling collectives/broadcast.cu > /home/cm/pytorch/third_party/build/nccl/obj/collectives/broadcast.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    Compiling collectives/reduce.cu > /home/cm/pytorch/third_party/build/nccl/obj/collectives/reduce.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    Compiling collectives/reduce_scatter.cu > /home/cm/pytorch/third_party/build/nccl/obj/collectives/reduce_scatter.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    Compiling all_reduce.cu > /home/cm/pytorch/third_party/build/nccl/obj/collectives/device/all_reduce_prod.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    Grabbing nccl.h > /home/cm/pytorch/third_party/build/nccl/include/nccl.h
    Compiling broadcast.cu > /home/cm/pytorch/third_party/build/nccl/obj/collectives/device/broadcast_prod.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    Compiling reduce.cu > /home/cm/pytorch/third_party/build/nccl/obj/collectives/device/reduce_prod.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    Compiling all_gather.cu > /home/cm/pytorch/third_party/build/nccl/obj/collectives/device/all_gather_prod.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    Compiling reduce_scatter.cu > /home/cm/pytorch/third_party/build/nccl/obj/collectives/device/reduce_scatter_prod.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    Compiling all_reduce.cu > /home/cm/pytorch/third_party/build/nccl/obj/collectives/device/all_reduce_min.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    Compiling broadcast.cu > /home/cm/pytorch/third_party/build/nccl/obj/collectives/device/broadcast_min.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    Compiling reduce.cu > /home/cm/pytorch/third_party/build/nccl/obj/collectives/device/reduce_min.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    Compiling all_gather.cu > /home/cm/pytorch/third_party/build/nccl/obj/collectives/device/all_gather_min.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    Compiling reduce_scatter.cu > /home/cm/pytorch/third_party/build/nccl/obj/collectives/device/reduce_scatter_min.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    Compiling all_reduce.cu > /home/cm/pytorch/third_party/build/nccl/obj/collectives/device/all_reduce_max.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    Compiling broadcast.cu > /home/cm/pytorch/third_party/build/nccl/obj/collectives/device/broadcast_max.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    Compiling reduce.cu > /home/cm/pytorch/third_party/build/nccl/obj/collectives/device/reduce_max.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    Compiling all_gather.cu > /home/cm/pytorch/third_party/build/nccl/obj/collectives/device/all_gather_max.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    Compiling reduce_scatter.cu > /home/cm/pytorch/third_party/build/nccl/obj/collectives/device/reduce_scatter_max.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    Compiling all_reduce.cu > /home/cm/pytorch/third_party/build/nccl/obj/collectives/device/all_reduce_sum.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    Compiling broadcast.cu > /home/cm/pytorch/third_party/build/nccl/obj/collectives/device/broadcast_sum.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    Compiling reduce.cu > /home/cm/pytorch/third_party/build/nccl/obj/collectives/device/reduce_sum.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    Compiling all_gather.cu > /home/cm/pytorch/third_party/build/nccl/obj/collectives/device/all_gather_sum.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    Compiling reduce_scatter.cu > /home/cm/pytorch/third_party/build/nccl/obj/collectives/device/reduce_scatter_sum.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    Compiling functions.cu > /home/cm/pytorch/third_party/build/nccl/obj/collectives/device/functions.o
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    ptxas warning : Too big maxrregcount value specified 96, will be ignored
    nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
    nvlink fatal : Internal error: reference to deleted section
    Makefile:83: recipe for target '/home/cm/pytorch/third_party/build/nccl/obj/collectives/device/devlink.o' failed
    make[5]: *** [/home/cm/pytorch/third_party/build/nccl/obj/collectives/device/devlink.o] Error 1
    Makefile:45: recipe for target 'devicelib' failed
    make[4]: *** [devicelib] Error 2
    Makefile:24: recipe for target 'src.build' failed
    make[3]: *** [src.build] Error 2
    CMakeFiles/nccl.dir/build.make:60: recipe for target 'lib/libnccl.so' failed
    make[2]: *** [lib/libnccl.so] Error 2
    CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/nccl.dir/all' failed
    make[1]: *** [CMakeFiles/nccl.dir/all] Error 2
    Makefile:127: recipe for target 'all' failed
    make: *** [all] Error 2
    Failed to run 'bash ../tools/build_pytorch_libs.sh --use-cuda --use-nnpack nccl caffe2 libshm gloo c10d THD'
zou3519 added the caffe2 label Oct 22, 2018
@zhaiyuqiang

zhaiyuqiang commented Nov 3, 2018

I have the same problem. How can I solve it?
My cuDNN version is 7.0; does that matter?

@gerhc

gerhc commented Nov 3, 2018

I got this error too. The cause was that I was trying to compile master instead of the 0.4.1 tag.

@II-Matto

II-Matto commented Nov 3, 2018

Thanks, @gerhc. I also encountered the nvlink fatal : Internal error: reference to deleted section error with the latest clone, and after I switched to the 0.4.1 tag as you mentioned, everything went well.

Hi, @jyh890622 @zhaiyuqiang. You can run git checkout tags/v0.4.1 followed by git submodule update --recursive to check out the 0.4.1 tag, then rebuild Caffe2 to see whether the errors are resolved.
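
The tag switch suggested above could be wrapped in a small shell function, roughly as follows. This is only a sketch: the function name `checkout_release_tag` and its arguments are hypothetical, and `--init` is added to the submodule update on the assumption that some submodules may not have been initialized yet.

```shell
# Sketch: switch a PyTorch checkout to a release tag and resync submodules.
# checkout_release_tag is an illustrative helper, not part of PyTorch.
checkout_release_tag() {
    local repo_dir="$1"          # path to the pytorch clone
    local tag="${2:-v0.4.1}"     # tag reported to build cleanly in this thread

    # Check out the tag (this leaves the repository on a detached HEAD).
    git -C "$repo_dir" checkout "tags/$tag" || return 1

    # Bring third_party submodules (including nccl) in line with the tag.
    git -C "$repo_dir" submodule update --init --recursive
}

# Usage (hypothetical path):
#   checkout_release_tag ~/pytorch v0.4.1
```

After the checkout, the build is started again with python setup.py install as before.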

@zhaiyuqiang

Thank you very much, @II-Matto @gerhc.
After I reinstalled NCCL, the problem went away.

$ git clone https://github.com/NVIDIA/nccl.git

$ cd nccl

$ sudo make -j4

Then copy the files in build/lib and build/include to /usr/local/.
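
The copy step could look something like the sketch below. The helper name `install_nccl_build` is hypothetical, and placing the files in the `lib` and `include` subdirectories of the prefix is an assumption; the comment above only says the files go to /usr/local/.

```shell
# Sketch: copy a locally built NCCL into an install prefix.
# install_nccl_build is a hypothetical helper; the lib/ and include/
# destinations under the prefix are an assumption.
install_nccl_build() {
    local nccl_dir="$1"               # root of the nccl clone, after make
    local prefix="${2:-/usr/local}"   # destination prefix (may need sudo)

    # Stop early if the build output is missing.
    if [ ! -d "$nccl_dir/build/lib" ] || [ ! -d "$nccl_dir/build/include" ]; then
        echo "no build output under $nccl_dir/build" >&2
        return 1
    fi

    mkdir -p "$prefix/lib" "$prefix/include" || return 1
    # -R copies any subdirectories; -P keeps symlinks such as
    # libnccl.so -> libnccl.so.2 as symlinks instead of duplicating files.
    cp -RP "$nccl_dir"/build/lib/* "$prefix/lib/" || return 1
    cp -RP "$nccl_dir"/build/include/* "$prefix/include/"
}

# Usage (hypothetical paths; writing to /usr/local usually requires root):
#   install_nccl_build ~/nccl /usr/local
```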

@Mrils

Mrils commented Nov 6, 2018

@zhaiyuqiang
Hello, which version of NCCL are you using? After I reinstalled NCCL, I got this error:
[ 86%] Linking CXX executable ../bin/batch_matmul_op_gpu_test
/usr/local/cuda/lib64/libnccl.so.2: undefined reference to `__fatbinwrap_66_tmpxft_00001101_00000000_13_cuda_device_runtime_compute_62_cpp1_ii_8b1a5d37'
collect2: error: ld returned 1 exit status

Did you run into this error, and if so, how did you solve it?

@lizxchen

@zhaiyuqiang I reinstalled NCCL, but the problem persists. What should I do?

@YingXiuHe

I solved this problem. When you hit it, try deleting the two folders nccl and nccl_external-prefix under the build folder, then run 'python setup.py install' again.
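
A minimal sketch of that cleanup, assuming the two folder names given above. The helper name `clean_nccl_build_dirs` is hypothetical, and the location of the build folder is passed in rather than guessed.

```shell
# Sketch: remove stale NCCL build output so the next build regenerates it.
# clean_nccl_build_dirs is a hypothetical helper.
clean_nccl_build_dirs() {
    local build_dir="${1:-build}"   # build folder inside the pytorch checkout

    # Only these two folders are removed; the rest of the build is kept.
    rm -rf "$build_dir/nccl" "$build_dir/nccl_external-prefix"
}

# Usage (hypothetical path), followed by the rebuild from the comment above:
#   clean_nccl_build_dirs ~/pytorch/build
#   python setup.py install
```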

@zhaiyuqiang

I just ran git pull on the latest nccl repo, and the latest commit is:

commit b56650c7f59b8cd40d18809784a6d6be38ef8acb
Author: David Addison <daddison@nvidia.com>
Date:   Wed Oct 24 14:44:59 2018 -0700

    2.3.7-1
    
    Improved LL tuning for multi-node jobs.
    Improved bootstrap for large job scaling.
    Fixed a hang during bootstrap due to socket reuse.
    Added operation name to the COLL INFO logging.

and I didn't change anything.

@zhaiyuqiang

@lizxchen After you reinstalled NCCL, do you still see the same problem?

@rmorros

rmorros commented Nov 29, 2018

I installed a system-wide NCCL and the problem is solved now:

Download nccl-repo-ubuntu1604-2.2.13-ga-cuda8.0_1-1_amd64.deb from the NVIDIA site (change the version to match your architecture).

sudo dpkg -i nccl-repo-ubuntu1604-2.2.13-ga-cuda8.0_1-1_amd64.deb

sudo apt-get install libnccl2
sudo apt-get install libnccl-dev
export USE_SYSTEM_NCCL=ON

python setup.py install
...
-- USE_NCCL : ON
-- USE_SYSTEM_NCCL : ON
...

and the error is gone.
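
Before setting USE_SYSTEM_NCCL, it may help to confirm that the packages actually put a header and a shared library on disk. The sketch below does that with `find`; the helper name `has_system_nccl` is hypothetical, and searching under /usr is an assumption about where the packages install their files.

```shell
# Sketch: check that NCCL headers and libraries exist under a root folder.
# has_system_nccl is a hypothetical helper; /usr is an assumed default.
has_system_nccl() {
    local root="${1:-/usr}"
    local header lib

    # Take the first match of each; unreadable folders are ignored.
    header=$(find "$root" -name 'nccl.h' 2>/dev/null | head -n 1)
    lib=$(find "$root" -name 'libnccl.so*' 2>/dev/null | head -n 1)

    # Succeed only when both a header and a library were found.
    [ -n "$header" ] && [ -n "$lib" ]
}

# Usage: opt in to the system copy only when it is present.
#   if has_system_nccl /usr; then
#       export USE_SYSTEM_NCCL=ON
#       python setup.py install
#   fi
```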
