install error #944

Closed
wangguangyuan opened this issue Mar 23, 2019 · 2 comments
Comments

@wangguangyuan

wangguangyuan commented Mar 23, 2019

HOROVOD_NCCL_HOME=/home/guangyuan/lib/nccl_2.4.2-1_cuda10 HOROVOD_GPU_ALLREDUCE=NCCL conda install --no-cache-dir horovod
usage: conda [-h] [-V] command ...
conda: error: unrecognized arguments: --no-cache-dir --user
(tf1.0) [guangyuan@172-16-30-154 ~]$ module load basic
(tf1.0) [guangyuan@172-16-30-154 ~]$ HOROVOD_NCCL_HOME=/home/guangyuan/lib/nccl_2.4.2-1_cuda10 HOROVOD_GPU_ALLREDUCE=NCCL conda install --no-cache-dir horovod --user
usage: conda [-h] [-V] command ...
conda: error: unrecognized arguments: --no-cache-dir --user
(tf1.0) [guangyuan@172-16-30-154 ~]$ HOROVOD_NCCL_HOME=/home/guangyuan/lib/nccl_2.4.2-1_cuda10 HOROVOD_GPU_ALLREDUCE=NCCL pip install --no-cache-dir horovod
Collecting horovod
Downloading https://files.pythonhosted.org/packages/89/70/327e1ce9bee0fb8a879b98f8265fb7a41ae6d04a3ee019b2bafba8b66333/horovod-0.16.1.tar.gz (2.6MB)
100% |████████████████████████████████| 2.6MB 4.0MB/s
Requirement already satisfied: cffi>=1.4.0 in ./miniconda3/envs/tf1.0/lib/python3.6/site-packages (from horovod) (1.12.2)
Requirement already satisfied: cloudpickle in ./miniconda3/envs/tf1.0/lib/python3.6/site-packages (from horovod) (0.8.0)
Requirement already satisfied: psutil in ./miniconda3/envs/tf1.0/lib/python3.6/site-packages (from horovod) (5.6.1)
Requirement already satisfied: six in ./miniconda3/envs/tf1.0/lib/python3.6/site-packages (from horovod) (1.12.0)
Requirement already satisfied: pycparser in ./miniconda3/envs/tf1.0/lib/python3.6/site-packages (from cffi>=1.4.0->horovod) (2.19)
Installing collected packages: horovod
Running setup.py install for horovod ... error
Complete output from command /home/guangyuan/miniconda3/envs/tf1.0/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-kskj8uzi/horovod/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-6oosh180/install-record.txt --single-version-externally-managed --compile:
running install
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.6
creating build/lib.linux-x86_64-3.6/horovod
copying horovod/__init__.py -> build/lib.linux-x86_64-3.6/horovod
creating build/lib.linux-x86_64-3.6/horovod/tensorflow
copying horovod/tensorflow/compression.py -> build/lib.linux-x86_64-3.6/horovod/tensorflow
copying horovod/tensorflow/mpi_ops.py -> build/lib.linux-x86_64-3.6/horovod/tensorflow
copying horovod/tensorflow/__init__.py -> build/lib.linux-x86_64-3.6/horovod/tensorflow
copying horovod/tensorflow/util.py -> build/lib.linux-x86_64-3.6/horovod/tensorflow
creating build/lib.linux-x86_64-3.6/horovod/_keras
copying horovod/_keras/callbacks.py -> build/lib.linux-x86_64-3.6/horovod/_keras
copying horovod/_keras/__init__.py -> build/lib.linux-x86_64-3.6/horovod/_keras
creating build/lib.linux-x86_64-3.6/horovod/spark
copying horovod/spark/__init__.py -> build/lib.linux-x86_64-3.6/horovod/spark
creating build/lib.linux-x86_64-3.6/horovod/run
copying horovod/run/__init__.py -> build/lib.linux-x86_64-3.6/horovod/run
copying horovod/run/run.py -> build/lib.linux-x86_64-3.6/horovod/run
copying horovod/run/task_fn.py -> build/lib.linux-x86_64-3.6/horovod/run
creating build/lib.linux-x86_64-3.6/horovod/keras
copying horovod/keras/callbacks.py -> build/lib.linux-x86_64-3.6/horovod/keras
copying horovod/keras/__init__.py -> build/lib.linux-x86_64-3.6/horovod/keras
creating build/lib.linux-x86_64-3.6/horovod/torch
copying horovod/torch/compression.py -> build/lib.linux-x86_64-3.6/horovod/torch
copying horovod/torch/mpi_ops.py -> build/lib.linux-x86_64-3.6/horovod/torch
copying horovod/torch/__init__.py -> build/lib.linux-x86_64-3.6/horovod/torch
creating build/lib.linux-x86_64-3.6/horovod/common
copying horovod/common/__init__.py -> build/lib.linux-x86_64-3.6/horovod/common
creating build/lib.linux-x86_64-3.6/horovod/mxnet
copying horovod/mxnet/mpi_ops.py -> build/lib.linux-x86_64-3.6/horovod/mxnet
copying horovod/mxnet/__init__.py -> build/lib.linux-x86_64-3.6/horovod/mxnet
creating build/lib.linux-x86_64-3.6/horovod/tensorflow/keras
copying horovod/tensorflow/keras/callbacks.py -> build/lib.linux-x86_64-3.6/horovod/tensorflow/keras
copying horovod/tensorflow/keras/__init__.py -> build/lib.linux-x86_64-3.6/horovod/tensorflow/keras
creating build/lib.linux-x86_64-3.6/horovod/spark/task
copying horovod/spark/task/mpirun_exec_fn.py -> build/lib.linux-x86_64-3.6/horovod/spark/task
copying horovod/spark/task/__init__.py -> build/lib.linux-x86_64-3.6/horovod/spark/task
copying horovod/spark/task/task_service.py -> build/lib.linux-x86_64-3.6/horovod/spark/task
creating build/lib.linux-x86_64-3.6/horovod/spark/driver
copying horovod/spark/driver/job_id.py -> build/lib.linux-x86_64-3.6/horovod/spark/driver
copying horovod/spark/driver/mpirun_rsh.py -> build/lib.linux-x86_64-3.6/horovod/spark/driver
copying horovod/spark/driver/__init__.py -> build/lib.linux-x86_64-3.6/horovod/spark/driver
copying horovod/spark/driver/driver_service.py -> build/lib.linux-x86_64-3.6/horovod/spark/driver
creating build/lib.linux-x86_64-3.6/horovod/run/task
copying horovod/run/task/__init__.py -> build/lib.linux-x86_64-3.6/horovod/run/task
copying horovod/run/task/task_service.py -> build/lib.linux-x86_64-3.6/horovod/run/task
creating build/lib.linux-x86_64-3.6/horovod/run/util
copying horovod/run/util/network.py -> build/lib.linux-x86_64-3.6/horovod/run/util
copying horovod/run/util/cache.py -> build/lib.linux-x86_64-3.6/horovod/run/util
copying horovod/run/util/threads.py -> build/lib.linux-x86_64-3.6/horovod/run/util
copying horovod/run/util/__init__.py -> build/lib.linux-x86_64-3.6/horovod/run/util
creating build/lib.linux-x86_64-3.6/horovod/run/driver
copying horovod/run/driver/__init__.py -> build/lib.linux-x86_64-3.6/horovod/run/driver
copying horovod/run/driver/driver_service.py -> build/lib.linux-x86_64-3.6/horovod/run/driver
creating build/lib.linux-x86_64-3.6/horovod/run/common
copying horovod/run/common/__init__.py -> build/lib.linux-x86_64-3.6/horovod/run/common
creating build/lib.linux-x86_64-3.6/horovod/run/common/service
copying horovod/run/common/service/__init__.py -> build/lib.linux-x86_64-3.6/horovod/run/common/service
copying horovod/run/common/service/driver_service.py -> build/lib.linux-x86_64-3.6/horovod/run/common/service
copying horovod/run/common/service/task_service.py -> build/lib.linux-x86_64-3.6/horovod/run/common/service
creating build/lib.linux-x86_64-3.6/horovod/run/common/util
copying horovod/run/common/util/host_hash.py -> build/lib.linux-x86_64-3.6/horovod/run/common/util
copying horovod/run/common/util/timeout.py -> build/lib.linux-x86_64-3.6/horovod/run/common/util
copying horovod/run/common/util/secret.py -> build/lib.linux-x86_64-3.6/horovod/run/common/util
copying horovod/run/common/util/network.py -> build/lib.linux-x86_64-3.6/horovod/run/common/util
copying horovod/run/common/util/safe_shell_exec.py -> build/lib.linux-x86_64-3.6/horovod/run/common/util
copying horovod/run/common/util/__init__.py -> build/lib.linux-x86_64-3.6/horovod/run/common/util
copying horovod/run/common/util/codec.py -> build/lib.linux-x86_64-3.6/horovod/run/common/util
creating build/lib.linux-x86_64-3.6/horovod/torch/mpi_lib
copying horovod/torch/mpi_lib/__init__.py -> build/lib.linux-x86_64-3.6/horovod/torch/mpi_lib
creating build/lib.linux-x86_64-3.6/horovod/torch/mpi_lib_impl
copying horovod/torch/mpi_lib_impl/__init__.py -> build/lib.linux-x86_64-3.6/horovod/torch/mpi_lib_impl
running build_ext
gcc -pthread -B /home/guangyuan/miniconda3/envs/tf1.0/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -std=c++11 -fPIC -O2 -Wall -mf16c -mavx -I/home/guangyuan/miniconda3/envs/tf1.0/include/python3.6m -c build/temp.linux-x86_64-3.6/test_compile/test_cpp_flags.cc -o build/temp.linux-x86_64-3.6/test_compile/test_cpp_flags.o
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++ [enabled by default]
gcc -pthread -shared -B /home/guangyuan/miniconda3/envs/tf1.0/compiler_compat -L/home/guangyuan/miniconda3/envs/tf1.0/lib -Wl,-rpath=/home/guangyuan/miniconda3/envs/tf1.0/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.6/test_compile/test_cpp_flags.o -o build/temp.linux-x86_64-3.6/test_compile/test_cpp_flags.so
gcc -pthread -B /home/guangyuan/miniconda3/envs/tf1.0/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/guangyuan/miniconda3/envs/tf1.0/include/python3.6m -c build/temp.linux-x86_64-3.6/test_compile/test_link_flags.cc -o build/temp.linux-x86_64-3.6/test_compile/test_link_flags.o
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++ [enabled by default]
gcc -pthread -shared -B /home/guangyuan/miniconda3/envs/tf1.0/compiler_compat -L/home/guangyuan/miniconda3/envs/tf1.0/lib -Wl,-rpath=/home/guangyuan/miniconda3/envs/tf1.0/lib -Wl,--no-as-needed -Wl,--sysroot=/ -Wl,--version-script=horovod.lds build/temp.linux-x86_64-3.6/test_compile/test_link_flags.o -o build/temp.linux-x86_64-3.6/test_compile/test_link_flags.so
gcc -pthread -B /home/guangyuan/miniconda3/envs/tf1.0/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -std=c++11 -fPIC -O2 -Wall -mf16c -mavx -I/usr/local/cuda/include -I/home/guangyuan/miniconda3/envs/tf1.0/include/python3.6m -c build/temp.linux-x86_64-3.6/test_compile/test_cuda.cc -o build/temp.linux-x86_64-3.6/test_compile/test_cuda.o
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++ [enabled by default]
gcc -pthread -shared -B /home/guangyuan/miniconda3/envs/tf1.0/compiler_compat -L/home/guangyuan/miniconda3/envs/tf1.0/lib -Wl,-rpath=/home/guangyuan/miniconda3/envs/tf1.0/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.6/test_compile/test_cuda.o -L/usr/local/cuda/lib -L/usr/local/cuda/lib64 -lcudart -o build/temp.linux-x86_64-3.6/test_compile/test_cuda.so
gcc -pthread -B /home/guangyuan/miniconda3/envs/tf1.0/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -std=c++11 -fPIC -O2 -Wall -mf16c -mavx -I/home/guangyuan/lib/nccl_2.4.2-1_cuda10/include -I/usr/local/cuda/include -I/home/guangyuan/miniconda3/envs/tf1.0/include/python3.6m -c build/temp.linux-x86_64-3.6/test_compile/test_nccl.cc -o build/temp.linux-x86_64-3.6/test_compile/test_nccl.o
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++ [enabled by default]
gcc -pthread -shared -B /home/guangyuan/miniconda3/envs/tf1.0/compiler_compat -L/home/guangyuan/miniconda3/envs/tf1.0/lib -Wl,-rpath=/home/guangyuan/miniconda3/envs/tf1.0/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.6/test_compile/test_nccl.o -L/home/guangyuan/lib/nccl_2.4.2-1_cuda10/lib -L/home/guangyuan/lib/nccl_2.4.2-1_cuda10/lib64 -L/usr/local/cuda/lib -L/usr/local/cuda/lib64 -lnccl_static -o build/temp.linux-x86_64-3.6/test_compile/test_nccl.so
collect2: error: ld returned 1 exit status
error: NCCL 2.0 library or its later version was not found (see error above).
Please specify correct NCCL location with the HOROVOD_NCCL_HOME environment variable or combination of HOROVOD_NCCL_INCLUDE and HOROVOD_NCCL_LIB environment variables.

HOROVOD_NCCL_HOME - path where NCCL include and lib directories can be found
HOROVOD_NCCL_INCLUDE - path to NCCL include directory
HOROVOD_NCCL_LIB - path to NCCL lib directory

----------------------------------------

Command "/home/guangyuan/miniconda3/envs/tf1.0/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-kskj8uzi/horovod/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-6oosh180/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-install-kskj8uzi/horovod/

@alsrgv
Member

alsrgv commented Mar 24, 2019

@wangguangyuan, what's the output of find /home/guangyuan/lib/nccl_2.4.2-1_cuda10?
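
A narrower check than a full find listing may also help here. The sketch below is not from the Horovod project; it only looks for the files implied by the flags in the failing test_nccl step of the log (-I.../include, -L.../lib, -L.../lib64, -lnccl_static). The helper name check_nccl_home is hypothetical, and the assumption that the header is named nccl.h and the static archive libnccl_static.a is based on how NCCL is normally packaged, not on anything reported in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether an NCCL directory contains the files
# that the test_nccl compile and link steps in the log would look for.
check_nccl_home() {
    local home="$1"
    local status=0
    local dir
    local found=""

    # The compile step passes -I<home>/include, so the header should be here.
    if [ -f "$home/include/nccl.h" ]; then
        echo "ok: $home/include/nccl.h"
    else
        echo "missing: $home/include/nccl.h"
        status=1
    fi

    # The link step passes -L<home>/lib -L<home>/lib64 -lnccl_static,
    # so the static archive should be in one of those two directories.
    for dir in "$home/lib" "$home/lib64"; do
        if [ -f "$dir/libnccl_static.a" ]; then
            found="$dir/libnccl_static.a"
            break
        fi
    done
    if [ -n "$found" ]; then
        echo "ok: $found"
    else
        echo "missing: libnccl_static.a under $home/lib or $home/lib64"
        status=1
    fi

    return "$status"
}
```

Running check_nccl_home /home/guangyuan/lib/nccl_2.4.2-1_cuda10 would then show which of the two pieces, if any, is absent. If the header and library live in separate locations, the error text in the log names HOROVOD_NCCL_INCLUDE and HOROVOD_NCCL_LIB as the variables for pointing at each one individually.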

@stale

stale bot commented Nov 7, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Nov 7, 2020
@stale stale bot closed this as completed Nov 14, 2020