cudaMemcpyAsync failed: invalid argument during training #404
Comments
I hit the same issue during training; the job stops when it occurs.
@ppwwyyxx, @winwinJJiang, anything in dmesg?
No, nothing was printed in dmesg on the day the error happened.
I'd recommend running the job with NCCL_DEBUG=INFO to see whether NCCL reports anything.
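(For reference, a minimal sketch of enabling this from inside the training script; NCCL reads the variable when it initializes, so it must be set before the first collective. The alternative is exporting NCCL_DEBUG=INFO in the launch environment.)

import os
# Must run before Horovod triggers NCCL initialization (the first allreduce)
os.environ.setdefault("NCCL_DEBUG", "INFO")
import horovod.torch as hvd
hvd.init()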
The job was run with NCCL_DEBUG=INFO. It only prints some normal output at the very beginning of training and nothing afterwards.
I'm afraid I don't know how to reproduce it; so far I've only seen it once. Today I saw an issue (tensorflow/tensorflow#21338) which basically says that an unchecked CUDA error in a buggy TensorFlow op may leak into other ops, making the other op appear to fail. I guess this is a possible explanation.
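(A general debugging aid for this class of problem, not something suggested in the thread itself: forcing synchronous kernel launches makes an asynchronous CUDA error surface at the op that actually caused it, rather than leaking into a later one.)

import os
# Must be set before CUDA initializes, i.e. before any torch.cuda use;
# equivalently, set it in the launch environment: CUDA_LAUNCH_BLOCKING=1 python train.py
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"
import torch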
I'm also receiving this error in PyTorch, but I'm unable to find a reproduction scenario; it occurs randomly some time after starting a new epoch.
I have published a branch with debug code to narrow down the issue. @andfoy, @ppwwyyxx, @abidmalikwaterloo, could you try running it and report whether you observe any issues? The primary difference in the debug branch is that it checks CUDA errors both before and after calls to cudaMemcpyAsync.
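(The debug branch itself lives in Horovod's C++ core, but the before/after idea can be sketched in PyTorch terms; run_checked is a hypothetical helper for illustration, not part of the branch.)

import torch

def run_checked(fn):
    # An error raised by this synchronize was caused by an *earlier* async op
    torch.cuda.synchronize()
    result = fn()
    # An error raised by this synchronize was caused by fn itself
    torch.cuda.synchronize()
    return result

# Usage: loss = run_checked(lambda: model(inputs))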
The following message is interesting:

In file included from /home/amalik/Pytorch_virtual_enviornment/lib/python2.7/site-packages/torch/utils/ffi/../../lib/include/THC/THC.h:4:0

I installed PyTorch following the instructions on the site.
Also got the same error.
@abidmalikwaterloo, could you try specifying
@mrfox321, could you try running from the debug branch, as described in #404 (comment), to help narrow down this issue?
@alsrgv I tried to build from scratch. I installed PyTorch with conda install pytorch torchvision -c pytorch, and I am getting the following message when running:

(/home/amalik/PyTorchHorovod) [amalik@node04 PyTorchHorovod]$ pip install --user -v --no-cache-dir git+https://github.com/uber/horovod@debug_before_memcpy

It is complaining about MPI??
@abidmalikwaterloo, do you have HOROVOD_MPICXX_SHOW set? It appears that it's set to something incorrect.
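(For context: Horovod's setup.py runs the command in HOROVOD_MPICXX_SHOW, by default mpicxx -show, to discover MPI compile and link flags, so a wrong value produces MPI-related build errors. A typical override, assuming Open MPI's compiler wrapper is on the PATH, looks like:)

HOROVOD_MPICXX_SHOW="mpicxx -show" pip install --user -v --no-cache-dir git+https://github.com/uber/horovod@debug_before_memcpy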
@alsrgv It seems that I haven't gotten any errors yet with this new setting. I also changed the virtual environment. Currently, I am testing it extensively with different runtime variables to check whether the failure is nondeterministic.
@alsrgv Finally
@alsrgv I managed to replicate the error once again using your debugging build. Here is the error traceback (printed identically by both worker processes):

Traceback (most recent call last):
  File "/home/eamargffoy/anaconda3/envs/parallel/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/eamargffoy/anaconda3/envs/parallel/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/media/SSD1/score-textseg/ref_score_net/train.py", line 495, in <module>
    train_loss = train(epoch)
  File "/media/SSD1/score-textseg/ref_score_net/train.py", line 363, in train
    optimizer.step()
  File "/home/eamargffoy/anaconda3/envs/parallel/lib/python3.6/site-packages/horovod/torch/__init__.py", line 88, in step
    self.synchronize()
  File "/home/eamargffoy/anaconda3/envs/parallel/lib/python3.6/site-packages/horovod/torch/__init__.py", line 84, in synchronize
    synchronize(handle)
  File "/home/eamargffoy/anaconda3/envs/parallel/lib/python3.6/site-packages/horovod/torch/mpi_ops.py", line 417, in synchronize
    mpi_lib.horovod_torch_wait_and_clear(handle)
  File "/home/eamargffoy/anaconda3/envs/parallel/lib/python3.6/site-packages/torch/utils/ffi/__init__.py", line 197, in safe_call
    result = torch._C._safe_call(*args, **kwargs)
torch.FatalError: cudaMemcpyAsync1 failed: invalid argument
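(For context, a minimal Horovod/PyTorch loop that produces the call chain in the traceback above, where hvd.DistributedOptimizer.step waits on the asynchronous allreduce via synchronize; the model, data, and hyperparameters here are placeholders, not the reporters' actual code.)

import torch
import horovod.torch as hvd

hvd.init()
torch.cuda.set_device(hvd.local_rank())

model = torch.nn.Linear(10, 1).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01 * hvd.size())
# Wrap the optimizer so gradients are allreduced across ranks on every step
optimizer = hvd.DistributedOptimizer(optimizer,
                                     named_parameters=model.named_parameters())
# Start all ranks from identical weights
hvd.broadcast_parameters(model.state_dict(), root_rank=0)

for step in range(100):
    inputs = torch.randn(32, 10).cuda()
    targets = torch.randn(32, 1).cuda()
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(inputs), targets)
    loss.backward()
    optimizer.step()  # waits on the async allreduce; the cudaMemcpyAsync
                      # error in this thread surfaces inside this call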
@abidmalikwaterloo, @andfoy, thanks for reproducing this issue. It certainly narrows it down to a single cudaMemcpyAsync call.
@alsrgv FYI, I'm running the experiments but have been unable to get resources because of the long queue on the cluster. I will update as soon as I see the crash.
I haven't seen such errors since, so I'm closing this issue.
Software:
horovod 0.13.10
TF v1.9.0-0-g25c197e023 (1.9.0)
CUDA 9.0
Open MPI 2.1.1
NCCL 2.2.13
I ran a job on 6 nodes (48 GPUs). It's a very normal Horovod job with an allreduce every step and a broadcast once in a while. It ran well for 17 hours until Horovod threw this error on rank 3:
Posting it in case someone sees similar issues. I understand this is probably not a reproducible error, and perhaps it's not a Horovod issue at all.
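(For context, a minimal sketch of the kind of TF 1.x + Horovod job described above: an allreduce every step via hvd.DistributedOptimizer, with the broadcast represented here by the startup broadcast hook. The model and step count are placeholders, not the actual job.)

import tensorflow as tf
import horovod.tensorflow as hvd

hvd.init()

# Pin each rank to one GPU
config = tf.ConfigProto()
config.gpu_options.visible_device_list = str(hvd.local_rank())

x = tf.random_normal([32, 10])
w = tf.get_variable("w", [10, 1])
loss = tf.reduce_mean(tf.square(tf.matmul(x, w)))

# Gradients are allreduced across all 48 GPUs on every step
opt = hvd.DistributedOptimizer(tf.train.GradientDescentOptimizer(0.01))
train_op = opt.minimize(loss, global_step=tf.train.get_or_create_global_step())

hooks = [hvd.BroadcastGlobalVariablesHook(0),       # broadcast initial state from rank 0
         tf.train.StopAtStepHook(last_step=1000)]
with tf.train.MonitoredTrainingSession(hooks=hooks, config=config) as sess:
    while not sess.should_stop():
        sess.run(train_op)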