libgcc_s.so.1 must be installed for pthread_cancel to work #41661

ruotianluo · 2020-07-20T03:36:31Z

🐛 Bug

Only get error with nightly, 1.5.1 works fine.

(Edit: I have seen this error reported in other places. The main problem with it is that it hides the original error trace.)

How to reproduce

>>> import torchvision
>>> x = torchvision.models.resnet.resnet50(True)
Downloading: "https://download.pytorch.org/models/resnet50-19c8e357.pth" to /home-nfs/rluo/.cache/torch/hub/checkpoints/resnet50-19c8e357.pth
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 97.8M/97.8M [00:01<00:00, 65.4MB/s]
libgcc_s.so.1 must be installed for pthread_cancel to work
Aborted

By stepping in, it seems the error occurs at the end of downloading.
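
Since the abort hides the Python-level trace, one thing that might help is the standard library's faulthandler module, which dumps Python tracebacks when the process receives a fatal signal such as SIGABRT. This is only a sketch: the helper name `enable_crash_tracebacks` is made up for illustration, and whether the dump shows anything useful for this particular abort is an assumption.

```python
import faulthandler
import sys


def enable_crash_tracebacks(stream=None):
    """Ask CPython to dump Python tracebacks on fatal signals.

    faulthandler installs handlers for SIGSEGV, SIGFPE, SIGABRT, SIGBUS and
    SIGILL, so an abort raised from native code should still print where the
    Python code was at that moment.
    """
    target = stream if stream is not None else sys.stderr
    faulthandler.enable(file=target, all_threads=True)
    return faulthandler.is_enabled()


# Intended use (not executed here): call enable_crash_tracebacks() first,
# then run the failing lines, e.g. torchvision.models.resnet.resnet50(True).
```

The same effect is available without code changes by running the script with `python -X faulthandler`.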

Environment

Collecting environment information...
PyTorch version: 1.7.0.dev20200709+cu101
Is debug build: No
CUDA used to build PyTorch: 10.1

OS: CentOS Linux 7 (Core)
GCC version: (GCC) 7.5.0
CMake version: version 3.14.0

Python version: 3.7
Is CUDA available: Yes
CUDA runtime version: 10.1.243
GPU models and configuration:
GPU 0: GeForce GTX 1080 Ti
GPU 1: GeForce GTX 1080 Ti

Nvidia driver version: 418.43
cuDNN version: Could not collect

Versions of relevant libraries:
[pip] detectron-pytorch==0.1
[pip] gluoncv-torch==0.0.3
[pip] numpy==1.18.4
[pip] numpydoc==0.9.2
[pip] pytorch-lightning==0.8.6.dev0
[pip] pytorch-pretrained-bert==0.6.2
[pip] torch==1.7.0.dev20200709+cu101
[pip] torchvision==0.8.0.dev20200719+cu101
[conda] blas 1.0 mkl
[conda] cudatoolkit 10.1.243 h6bb024c_0
[conda] detectron-pytorch 0.1 dev_0
[conda] gluoncv-torch 0.0.3 pypi_0 pypi
[conda] magma-cuda102 2.5.2 1 pytorch
[conda] mkl 2019.5 281 conda-forge
[conda] mkl-include 2020.1 217
[conda] mkl-service 2.3.0 py37he904b0f_0
[conda] mkl_fft 1.0.15 py37ha843d7b_0
[conda] mkl_random 1.1.0 py37hd6b4f25_0
[conda] numpy 1.17.2 pypi_0 pypi
[conda] numpydoc 0.9.2 py_0
[conda] pytorch-lightning 0.8.6.dev0 dev_0
[conda] pytorch-pretrained-bert 0.6.2 pypi_0 pypi
[conda] torch 1.7.0.dev20200709+cu101 pypi_0 pypi
[conda] torchvision 0.8.0.dev20200719+cu101 pypi_0 pypi

cc @ezyang @gchanan @zou3519 @seemethere @malfet


gchanan · 2020-07-20T23:23:16Z

very likely an issue with your setup; googling around that error message shows a lot of examples.

gchanan · 2020-07-20T23:23:51Z

@malfet would know more though.

SteffenCzolbe · 2020-07-31T18:21:28Z

Dumped hours of debugging time into this issue since upgrading to torch 1.6.0, still have no clue what causes it. Some of my models work fine, while others abort with the previously mentioned error.

Only fix I found was downgrading to 1.5.1 :/

ezyang · 2020-08-02T20:01:46Z

raising priority based on activity

gchanan · 2020-08-03T17:31:06Z

@SteffenCzolbe can you post the environment information (see above example, which is on master).

ruotianluo · 2020-08-03T21:20:16Z

I did a binary search. I am not having trouble (getting regular error trace) with 1.6.0.dev20200411+cu101.
But got:

terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted

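with the nightlies from dev20200412 through dev20200421.

With dev20200422 through dev20200424 I just get a segmentation fault, for dev20200425 through dev20200427 there are no proper wheels, and starting with dev20200428 I get the error in the title.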

The code I use is here https://gist.github.com/ruotianluo/54c25460b2ca43a274f50e1a7daa409a.
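
For anyone who wants to repeat the search over nightlies, a minimal sketch of the bisection is below. It assumes the builds are sorted oldest to newest and that once a build misbehaves every later one does too. The `fails` callback is a hypothetical stand-in for "install this nightly and run the script", and dates without wheels would have to be removed from the list first.

```python
from typing import Callable, Optional, Sequence


def first_failing(builds: Sequence[str], fails: Callable[[str], bool]) -> Optional[str]:
    """Return the earliest build for which fails(build) is True.

    Assumes builds are sorted oldest to newest and that once a build fails,
    every later build fails too.
    """
    lo, hi = 0, len(builds)
    while lo < hi:
        mid = (lo + hi) // 2
        if fails(builds[mid]):
            hi = mid       # the first failure is at mid or earlier
        else:
            lo = mid + 1   # the first failure is after mid
    return builds[lo] if lo < len(builds) else None


# Hypothetical predicate: pretend everything from dev20200412 on misbehaves.
nightlies = [f"1.6.0.dev202004{day:02d}" for day in range(11, 29)]
print(first_failing(nightlies, lambda build: build >= "1.6.0.dev20200412"))
```

Each probe costs a full install, so the logarithmic number of steps is the main reason to bisect rather than try every date.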

mattip · 2020-08-09T15:33:42Z

I cannot reproduce in a cuda-less docker environment using the pypa/manylinux2014 image based on CentOS 7.8.2003 and torch 1.6.0 official wheels. It does seem strange that @ruotianluo has two different NumPy versions.
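
A quick way to spot this kind of mismatch in a collect_env report is sketched below. It only compares the `[pip]`/`[pip3]` and `[conda]` lines by package name; the function name `find_version_conflicts` is made up for this example, and it will not notice packages that are named differently in the two lists (for instance `torch` versus `pytorch`).

```python
import re
from collections import defaultdict


def find_version_conflicts(report: str) -> dict:
    """Map package name -> set of versions when pip and conda lines disagree."""
    versions = defaultdict(set)
    for line in report.splitlines():
        line = line.strip()
        pip = re.match(r"\[pip3?\]\s+(\S+)==(\S+)", line)
        conda = re.match(r"\[conda\]\s+(\S+)\s+(\S+)", line)
        if pip:
            versions[pip.group(1)].add(pip.group(2))
        elif conda:
            versions[conda.group(1)].add(conda.group(2))
    # Keep only packages that were reported with more than one version.
    return {name: found for name, found in versions.items() if len(found) > 1}


# Lines taken from the environment report at the top of this issue.
sample = """\
[pip] numpy==1.18.4
[pip] torch==1.7.0.dev20200709+cu101
[conda] numpy 1.17.2 pypi_0 pypi
[conda] torch 1.7.0.dev20200709+cu101 pypi_0 pypi
"""
print(find_version_conflicts(sample))
```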

ruotianluo · 2020-08-09T19:22:20Z

FYI. I seem to be able to get correct error trace with stable 1.6.0 now. But still fail with torch==1.6.0.dev20200428+cu101.
current env:

Collecting environment information...
PyTorch version: 1.6.0+cu101
Is debug build: No
CUDA used to build PyTorch: 10.1

OS: CentOS Linux 7 (Core)
GCC version: (GCC) 7.5.0
CMake version: version 3.14.0

Python version: 3.7
Is CUDA available: Yes
CUDA runtime version: 10.1.243
GPU models and configuration:
GPU 0: GeForce GTX 1080 Ti
GPU 1: GeForce GTX 1080 Ti

Nvidia driver version: 418.43
cuDNN version: Could not collect

Versions of relevant libraries:
[pip] detectron-pytorch==0.1
[pip] gluoncv-torch==0.0.3
[pip] numpy==1.18.4
[pip] numpydoc==0.9.2
[pip] pytorch-pretrained-bert==0.6.2
[pip] torch==1.6.0+cu101
[pip] torchvision==0.7.0+cu101
[conda] blas                      1.0                         mkl
[conda] cudatoolkit               10.1.243             h6bb024c_0
[conda] detectron-pytorch         0.1                       dev_0    <develop>
[conda] gluoncv-torch             0.0.3                    pypi_0    pypi
[conda] magma-cuda102             2.5.2                         1    pytorch
[conda] mkl                       2019.5                      281    conda-forge
[conda] mkl-include               2020.1                      217
[conda] mkl-service               2.3.0            py37he904b0f_0
[conda] mkl_fft                   1.0.15           py37ha843d7b_0
[conda] mkl_random                1.1.0            py37hd6b4f25_0
[conda] numpy                     1.18.4                   pypi_0    pypi
[conda] numpydoc                  0.9.2                      py_0
[conda] pytorch-pretrained-bert   0.6.2                    pypi_0    pypi
[conda] torch                     1.6.0+cu101              pypi_0    pypi
[conda] torchvision               0.7.0+cu101              pypi_0    pypi

rgommers · 2020-08-09T20:43:48Z

Other info that will be relevant:

how did you install GCC?
how did you install PyTorch and torchvision?

@ruotianluo looking at your package info, you are using a mix of pip- and conda-installed packages - this is never a good idea and will lead to the kind of issue you're seeing.

> FYI. I seem to be able to get correct error trace with stable 1.6.0 now.

If you could reproduce this in a clean environment where all packages are installed with the same package manager, then that would really help (and hint at a real issue). E.g.:

conda create -n issue41661 python=3.7
conda activate issue41661
pip install numpy
pip install torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html
python my_script_that_is_failing.py

ruotianluo · 2020-08-09T21:44:42Z

I built my gcc from source. (I used to use conda to install torch and I switched to pip. That's why there is something left in the conda env. (magma/cudatoolkit))

Using a clean conda env as you suggested: the same result. Getting "libgcc_s.so.1 must be installed for pthread_cancel to work" with torch==1.6.0.dev20200428+cu101, and correct error trace with stable 1.6.0.

Maybe it has been fixed in 1.6.0??

Collecting environment information...
PyTorch version: 1.6.0.dev20200428+cu101
Is debug build: No
CUDA used to build PyTorch: 10.1

OS: CentOS Linux 7 (Core)
GCC version: (GCC) 7.5.0
CMake version: version 2.8.12.2

Python version: 3.7
Is CUDA available: Yes
CUDA runtime version: 10.1.243
GPU models and configuration:
GPU 0: GeForce GTX 1080 Ti
GPU 1: GeForce GTX 1080 Ti

Nvidia driver version: 418.43
cuDNN version: Could not collect

Versions of relevant libraries:
[pip3] numpy==1.19.1
[pip3] torch==1.6.0.dev20200428+cu101
[pip3] torchvision==0.7.0+cu101
[conda] numpy                     1.19.1                   pypi_0    pypi
[conda] torch                     1.6.0.dev20200428+cu101          pypi_0    pypi
[conda] torchvision               0.7.0+cu101              pypi_0    pypi

mattip · 2020-08-09T21:55:18Z

> I built my gcc from source.

Try installing the conda compilers: conda install compilers. You may have built your libgcc without compatible pthread support. Installing the conda compilers should supply a compatible path/to/conda/env/lib/libgcc_s.so.1.

ruotianluo · 2020-08-10T02:40:40Z

Installed gcc:

(issue41661) [rluo@gpu20 ~]$ conda list | grep  gcc
_libgcc_mutex             0.1                        main
gcc_impl_linux-64         7.3.0                habb00fd_1
gcc_linux-64              7.3.0                h553295d_9
libgcc-ng                 9.1.0                hdf63c60_0
(issue41661) [rluo@gpu20 ~]$ ls ~/rluo/local/anaconda3/envs/issue41661/lib/libgcc_s.so.1
/home-nfs/rluo/rluo/local/anaconda3/envs/issue41661/lib/libgcc_s.so.1

Still fail with torch==1.6.0.dev20200428+cu101

mattip · 2020-08-10T07:22:36Z

Are you using the pytorch you think you are?
$ python -c "import torch; print(torch._C)"
What does ldd show for that file?
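
Both checks can be combined in a few lines of Python. This is a rough sketch that shells out to the system `ldd`; the helper names `parse_ldd` and `ldd` are made up here, and the last step needs torch installed, so it is left as a comment.

```python
import re
import subprocess


def parse_ldd(output: str) -> dict:
    """Parse `ldd` output into {library name: resolved path}."""
    resolved = {}
    for line in output.splitlines():
        # Matches lines such as "libgcc_s.so.1 => /some/dir/libgcc_s.so.1 (0x...)".
        match = re.match(r"\s*(\S+)\s+=>\s+(\S+)\s+\(0x[0-9a-f]+\)", line)
        if match:
            resolved[match.group(1)] = match.group(2)
    return resolved


def ldd(path: str) -> dict:
    """Run the system `ldd` on a shared object and parse the result."""
    done = subprocess.run(["ldd", path], capture_output=True, text=True, check=True)
    return parse_ldd(done.stdout)


# Intended use (needs torch installed):
#   import torch
#   print(torch._C.__file__)
#   print(ldd(torch._C.__file__).get("libgcc_s.so.1"))
```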

ruotianluo · 2020-08-10T13:06:40Z

	linux-vdso.so.1 =>  (0x00007ffc1bbfb000)
	libshm.so => /share/data/vision-greg/rluo/local/anaconda3/envs/issue41661/lib/python3.7/site-packages/torch/lib/libshm.so (0x00007f30d3d43000)
	libtorch_python.so => /share/data/vision-greg/rluo/local/anaconda3/envs/issue41661/lib/python3.7/site-packages/torch/lib/libtorch_python.so (0x00007f30d2c9e000)
	libstdc++.so.6 => /share/data/vision-greg/rluo/local/gcc-7.5.0/lib64/libstdc++.so.6 (0x00007f30d291b000)
	libm.so.6 => /usr/lib64/libm.so.6 (0x00007f30d2619000)
	libgcc_s.so.1 => /share/data/vision-greg/rluo/local/gcc-7.5.0/lib64/libgcc_s.so.1 (0x00007f30d2402000)
	libpthread.so.0 => /usr/lib64/libpthread.so.0 (0x00007f30d21e6000)
	libc.so.6 => /usr/lib64/libc.so.6 (0x00007f30d1e19000)
	libtorch.so => /share/data/vision-greg/rluo/local/anaconda3/envs/issue41661/lib/python3.7/site-packages/torch/lib/libtorch.so (0x00007f30d1c05000)
	librt.so.1 => /usr/lib64/librt.so.1 (0x00007f30d19fd000)
	libtorch_cpu.so => /share/data/vision-greg/rluo/local/anaconda3/envs/issue41661/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so (0x00007f30c2626000)
	libtorch_cuda.so => /share/data/vision-greg/rluo/local/anaconda3/envs/issue41661/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so (0x00007f3085e2e000)
	libnvToolsExt-3965bdd0.so.1 => /share/data/vision-greg/rluo/local/anaconda3/envs/issue41661/lib/python3.7/site-packages/torch/lib/libnvToolsExt-3965bdd0.so.1 (0x00007f3085c24000)
	libc10_cuda.so => /share/data/vision-greg/rluo/local/anaconda3/envs/issue41661/lib/python3.7/site-packages/torch/lib/libc10_cuda.so (0x00007f30859f5000)
	libc10.so => /share/data/vision-greg/rluo/local/anaconda3/envs/issue41661/lib/python3.7/site-packages/torch/lib/libc10.so (0x00007f3085799000)
	libdl.so.2 => /usr/lib64/libdl.so.2 (0x00007f3085595000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f30d4168000)
	libgomp-7c85b1e2.so.1 => /share/data/vision-greg/rluo/local/anaconda3/envs/issue41661/lib/python3.7/site-packages/torch/lib/libgomp-7c85b1e2.so.1 (0x00007f308536b000)
	libcudart-1b201d85.so.10.1 => /share/data/vision-greg/rluo/local/anaconda3/envs/issue41661/lib/python3.7/site-packages/torch/lib/libcudart-1b201d85.so.10.1 (0x00007f30850ec000)

mattip · 2020-08-10T13:18:36Z

So installing the conda compilers was not useful, since the "wrong" libgcc_s.so.1 is being picked up. Do you have LD_LIBRARY_PATH defined?

ruotianluo · 2020-08-10T13:25:20Z

Yes.

/share/data/vision-greg/rluo/local/gcc-7.5.0/lib64:/share/data/vision-greg/rluo/local/gcc-7.5.0/lib:/share/data/vision-greg/rluo/local/anaconda3/lib:/share/data/vision-greg/common/libjpeg/lib:/share/data/vision-greg/rluo/local/nccl/lib:/share/data/vision-greg/common/boost-1.57/lib:/share/data/vision-greg/rluo/local/cuda-10.1/lib64:/share/data/vision-greg/rluo/local/cuda-10.1/extras/CUPTI/lib64:/share/data/vision-greg/rluo/local/cudnn-7.6.4-for-cuda-10.1/lib64:/share/data/vision-greg/rluo/local/gcc-7.5.0/lib64:/share/data/vision-greg/rluo/local/gcc-7.5.0/lib:/share/data/vision-greg/rluo/local/anaconda3/lib:/share/data/vision-greg/common/libjpeg/lib:/share/data/vision-greg/rluo/local/nccl/lib:/share/data/vision-greg/common/boost-1.57/lib:/share/data/vision-greg/rluo/local/cuda-10.1/lib64:/share/data/vision-greg/rluo/local/cuda-10.1/extras/CUPTI/lib64:/share/data/vision-greg/rluo/local/cudnn-7.6.4-for-cuda-10.1/lib64:

mattip · 2020-08-10T13:51:40Z

Can you put /share/data/vision-greg/rluo/local/anaconda3/envs/issue41661/lib before the rest so it picks up the conda-provided libgcc_s.so.1?
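
To see which entry currently wins, something like the sketch below lists the LD_LIBRARY_PATH directories that contain the library, in search order. The function name `providers` is made up for this example, and it deliberately ignores the other places the dynamic loader looks (RPATH/RUNPATH, the ld.so cache, the default system directories), so treat the result as a hint only.

```python
import os


def providers(library: str, search_path: str) -> list:
    """List directories on a colon-separated path that contain `library`, in order."""
    hits = []
    for directory in search_path.split(":"):
        # Empty entries and repeated directories are skipped.
        if not directory or directory in hits:
            continue
        if os.path.exists(os.path.join(directory, library)):
            hits.append(directory)
    return hits


print(providers("libgcc_s.so.1", os.environ.get("LD_LIBRARY_PATH", "")))
```

The first directory in the printed list is the one the loader would be expected to pick from LD_LIBRARY_PATH.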

ruotianluo · 2020-08-10T15:05:20Z

	linux-vdso.so.1 =>  (0x00007ffe8ffa5000)
	libshm.so => /share/data/vision-greg/rluo/local/anaconda3/envs/issue41661/lib/python3.7/site-packages/torch/lib/libshm.so (0x00007f325d4cf000)
	libtorch_python.so => /share/data/vision-greg/rluo/local/anaconda3/envs/issue41661/lib/python3.7/site-packages/torch/lib/libtorch_python.so (0x00007f325c42a000)
	libstdc++.so.6 => /share/data/vision-greg/rluo/local/anaconda3/envs/issue41661/lib/libstdc++.so.6 (0x00007f325d99e000)
	libm.so.6 => /usr/lib64/libm.so.6 (0x00007f325c128000)
	libgcc_s.so.1 => /share/data/vision-greg/rluo/local/anaconda3/envs/issue41661/lib/libgcc_s.so.1 (0x00007f325d971000)
	libpthread.so.0 => /usr/lib64/libpthread.so.0 (0x00007f325bf0c000)
	libc.so.6 => /usr/lib64/libc.so.6 (0x00007f325bb3f000)
	libtorch.so => /share/data/vision-greg/rluo/local/anaconda3/envs/issue41661/lib/python3.7/site-packages/torch/lib/libtorch.so (0x00007f325b92b000)
	librt.so.1 => /usr/lib64/librt.so.1 (0x00007f325b723000)
	libtorch_cpu.so => /share/data/vision-greg/rluo/local/anaconda3/envs/issue41661/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so (0x00007f324c34c000)
	libtorch_cuda.so => /share/data/vision-greg/rluo/local/anaconda3/envs/issue41661/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so (0x00007f320fb54000)
	libnvToolsExt-3965bdd0.so.1 => /share/data/vision-greg/rluo/local/anaconda3/envs/issue41661/lib/python3.7/site-packages/torch/lib/libnvToolsExt-3965bdd0.so.1 (0x00007f320f94a000)
	libc10_cuda.so => /share/data/vision-greg/rluo/local/anaconda3/envs/issue41661/lib/python3.7/site-packages/torch/lib/libc10_cuda.so (0x00007f320f71b000)
	libc10.so => /share/data/vision-greg/rluo/local/anaconda3/envs/issue41661/lib/python3.7/site-packages/torch/lib/libc10.so (0x00007f320f4bf000)
	libdl.so.2 => /usr/lib64/libdl.so.2 (0x00007f320f2bb000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f325d8f4000)
	libgomp-7c85b1e2.so.1 => /share/data/vision-greg/rluo/local/anaconda3/envs/issue41661/lib/python3.7/site-packages/torch/lib/libgomp-7c85b1e2.so.1 (0x00007f320f091000)
	libcudart-1b201d85.so.10.1 => /share/data/vision-greg/rluo/local/anaconda3/envs/issue41661/lib/python3.7/site-packages/torch/lib/libcudart-1b201d85.so.10.1 (0x00007f320ee12000)

It can pick up the conda libgcc_s now. But I still get "libgcc_s.so.1 must be installed for pthread_cancel to work"

mattip · 2020-08-12T17:17:34Z

I am thinking this is not connected to pytorch; rather, any program you compile that uses pthread_cancel will show this error on your system. The man page has an example: does that compile and run?

ruotianluo · 2020-08-12T20:04:43Z

Yes. It runs correctly.

rgommers · 2020-08-16T13:52:45Z

> So installing the conda compilers was not useful, since the "wrong" libgcc_s.so.1 is being picked up. Do you have LD_LIBRARY_PATH defined?

It looks to me like @ruotianluo installed compilers without rebuilding PyTorch and Torchvision with those compilers, or is still working in a conda env that's somehow messed up. I'd suggest the current back-and-forth isn't all that helpful. There are two people who reported this issue, but there's no reproducer. We need a full reproducer, either with Docker with a system GCC from the distro's package manager, or in a clean conda env with conda compilers.

vanewu · 2020-09-01T09:16:52Z

After I upgraded to 1.6.0, I also encountered the same problem. I have checked that libgcc_s.so.1 exists. My GCC is also 7.5.
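
A file existing on disk does not by itself show that the dynamic loader can open it from inside the Python process. A minimal check, assuming nothing beyond the standard library, is to try loading it with ctypes; the helper name `can_load` is made up for this sketch.

```python
import ctypes


def can_load(name: str = "libgcc_s.so.1"):
    """Try to load a shared library by name; return (True, "") or (False, reason)."""
    try:
        ctypes.CDLL(name)
    except OSError as error:
        # The loader's own message usually says why the load failed.
        return False, str(error)
    return True, ""


print(can_load())
```

Running this in the same environment (same shell, same LD_LIBRARY_PATH) as the failing script is what makes the result meaningful.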

rgommers · 2020-09-01T09:21:40Z

@kenjewu thanks for the report. Could you please add the output of python torch/utils/collect_env.py? And how you installed GCC?

philokey · 2020-09-25T12:13:40Z

@ruotianluo Did you solve this problem? I also encounter it when I use PyTorch 1.6.

evaldsurtans · 2020-11-04T19:12:23Z

For me it consistently happens when I install latest pytorch and use it with:

if torch.cuda.device_count() > 1:
    model = torch.nn.DataParallel(model, dim=0)

Without DataParallel it works fine. I checked that LD_LIBRARY_PATH contains a directory with libgcc_s.so.1.

brando90 · 2020-11-07T16:03:11Z

> Dumped hours of debugging time into this issue since upgrading to torch 1.6.0, still have no clue what causes it. Some of my models work fine, while others abort with the previously mentioned error.
>
> Only fix I found was downgrading to 1.5.1 :/

How do you downgrade?

dnaaun · 2020-11-24T22:18:19Z

@brando90, if you are using pip, you can downgrade by doing pip install torch==1.5.1 (assuming 1.5.1 is an actual version that exists, didn't check). What will definitely work is something like pip install 'torch<1.6'

ritvik1512 · 2020-11-29T14:13:26Z

Any updates on this? Updating to 1.6.0 gives the same error.

mattip · 2020-11-30T13:31:50Z

So far we have theorized that the wrong libgcc_s.so.1 is being picked up. This could be shown by someone with the problem compiling the test program in the man page for pthread_cancel and, if it runs correctly, trying to figure out which libgcc the test program is using versus which libgcc pytorch is using.
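
On Linux, the comparison can be made from /proc/<pid>/maps, which lists every file mapped into a process. The sketch below is an assumption about how one might do it, not something tested against the failing setups; the helper names are made up for illustration.

```python
def mapped_libraries(maps_text: str, needle: str) -> list:
    """Return the distinct file paths in /proc/<pid>/maps text that contain `needle`."""
    paths = []
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, pathname (pathname is optional).
        fields = line.split(None, 5)
        if len(fields) == 6 and needle in fields[5] and fields[5] not in paths:
            paths.append(fields[5])
    return paths


def loaded_in_this_process(needle: str = "libgcc_s") -> list:
    """Linux only: report which matching shared objects the current process has mapped."""
    with open("/proc/self/maps") as handle:
        return mapped_libraries(handle.read(), needle)
```

Calling `loaded_in_this_process()` right after `import torch` gives the PyTorch side; for the compiled test program, the same parser can be fed the contents of that program's own /proc/<pid>/maps while it is running.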

Since I cannot reproduce nor help without more information, I am unassigning myself from the issue.

rgommers · 2020-11-30T18:18:36Z

> So far we have theorized that the wrong libgcc_s.so.1 is being picked up.

I don't think it is actually. A test with the exact conda env in #41661 (comment), which has a pip-installed pytorch 1.6.0 shows libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 - and running the 1.6.0 test suite against this pip-installed pytorch works just fine.

> Since I cannot reproduce nor help without more information

Yep, me neither. If anyone who encounters this issue could put together a reproducer, that would be super helpful. Try one of:

shrubb · 2021-02-25T18:53:16Z

Had this error too with pip-installed PyTorch and with GCC built from source. This helped.

tejas-gokhale · 2021-06-21T23:56:25Z

I can confirm that I got the same error with torch version '1.9.0+cu102', and downgrading to previous versions solved it.

seyeeet · 2021-06-25T20:06:03Z

I also get this error with pytorch 1.9.0+cu102

malfet · 2021-06-25T22:09:10Z

@seyeeet can you run `python3 -m torch.utils.collect_env` and share its output here?

seyeeet · 2021-06-26T19:50:33Z

@malfet
yep

Collecting environment information...
PyTorch version: 1.9.0
Is debug build: False
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A

OS: CentOS Linux release 7.6.1810 (Core)  (x86_64)
GCC version: (GCC) 8.2.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.9

Python version: 3.6 (64-bit runtime)
Python platform: Linux-3.10.0-957.1.3.el7.x86_64-x86_64-with-centos-7.6.1810-Core
Is CUDA available: False
CUDA runtime version: 10.2.89
GPU models and configuration: Could not collect
Nvidia driver version: Could not collect
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.19.5
[pip3] numpydoc==1.1.0
[pip3] torch==1.9.0
[pip3] torch-summary==1.4.5
[pip3] torchaudio==0.9.0a0+33b2469
[pip3] torchfile==0.1.0
[pip3] torchtext==0.10.0
[pip3] torchvision==0.10.0
[conda] blas                      1.0                         mkl
[conda] cudatoolkit               10.2.89              hfd86e86_1
[conda] ffmpeg                    4.3                  hf484d3e_0    pytorch
[conda] mkl                       2018.0.3                      1
[conda] mkl-service               1.1.2            py36h90e4bf4_5
[conda] mkl_fft                   1.0.4            py36h4414c95_1
[conda] mkl_random                1.0.1            py36h4414c95_1
[conda] numpy                     1.19.5                   pypi_0    pypi
[conda] numpydoc                  1.1.0              pyhd3eb1b0_1
[conda] pytorch                   1.9.0           py3.6_cuda10.2_cudnn7.6.5_0    pytorch
[conda] torch-summary             1.4.5                    pypi_0    pypi
[conda] torchaudio                0.9.0                      py36    pytorch
[conda] torchfile                 0.1.0                    pypi_0    pypi
[conda] torchtext                 0.10.0                     py36    pytorch
[conda] torchvision               0.10.0               py36_cu102    pytorch

bonlime · 2021-08-04T17:48:52Z

I've also encountered this issue after installing nightly build of PyTorch. On latest stable 1.9 release it works. For me the output of python3 -m torch.utils.collect_env is as follows:

Collecting environment information...
PyTorch version: 1.10.0.dev20210804+cu111
Is debug build: False
CUDA used to build PyTorch: 11.1
ROCM used to build PyTorch: N/A

OS: CentOS Linux release 7.9.2009 (Core) (x86_64)
GCC version: (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5)
Clang version: 3.8.0 (tags/RELEASE_380/final)
CMake version: version 2.8.12.2
Libc version: glibc-2.17

Python version: 3.8.5 (default, Jul 29 2020, 13:59:36)  [GCC 4.8.5 20150623 (Red Hat 4.8.5-39)] (64-bit runtime)
Python platform: Linux-5.4.15-1.el7.elrepo.x86_64-x86_64-with-glibc2.17
Is CUDA available: False
CUDA runtime version: 10.2.89
GPU models and configuration: Could not collect
Nvidia driver version: Could not collect
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.18.5
[pip3] torch==1.10.0.dev20210804+cu111
[pip3] torchaudio==0.9.0
[pip3] torchvision==0.10.0+cu111
[conda] Could not collect

zeakey · 2021-11-27T12:48:12Z

Same error with torch==1.8.1, gcc==7.3.0.
The output of python3 -m torch.utils.collect_env is:

Collecting environment information...
PyTorch version: 1.8.1+cu111
Is debug build: False
CUDA used to build PyTorch: 11.1
ROCM used to build PyTorch: N/A

OS: Tencent tlinux 2.2 (Final) (x86_64)
GCC version: (GCC) 7.3.0
Clang version: Could not collect
CMake version: version 3.18.5

Python version: 3.8 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: 
GPU 0: A100-SXM4-40GB
GPU 1: A100-SXM4-40GB
GPU 2: A100-SXM4-40GB
GPU 3: A100-SXM4-40GB
GPU 4: A100-SXM4-40GB
GPU 5: A100-SXM4-40GB
GPU 6: A100-SXM4-40GB
GPU 7: A100-SXM4-40GB

Nvidia driver version: 450.80.02
cuDNN version: Probably one of the following:
/usr/lib64/libcudnn.so.8.0.5
/usr/lib64/libcudnn_adv_infer.so.8.0.5
/usr/lib64/libcudnn_adv_train.so.8.0.5
/usr/lib64/libcudnn_cnn_infer.so.8.0.5
/usr/lib64/libcudnn_cnn_train.so.8.0.5
/usr/lib64/libcudnn_ops_infer.so.8.0.5
/usr/lib64/libcudnn_ops_train.so.8.0.5
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.21.4
[pip3] torch==1.8.1+cu111
[pip3] torchvision==0.9.1+cu111
[conda] Could not collect

GangLiTarheel · 2022-10-26T16:32:22Z

Same error:

$ python3 -m torch.utils.collect_env
Collecting environment information...
PyTorch version: 1.10.2+cu102
Is debug build: False
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A

OS: CentOS Linux release 7.8.2003 (Core) (x86_64)
GCC version: (GCC) 9.4.0
Clang version: Could not collect
CMake version: version 3.18.0
Libc version: glibc-2.17

Python version: 3.9.1 (default, Dec 11 2020, 14:32:07)  [GCC 7.3.0] (64-bit runtime)
Python platform: Linux-3.10.0-1160.15.2.el7.x86_64-x86_64-with-glibc2.17
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: 
GPU 0: NVIDIA GeForce GTX 1080
GPU 1: NVIDIA GeForce GTX 1080
GPU 2: NVIDIA GeForce GTX 1080
GPU 3: NVIDIA GeForce GTX 1080

Nvidia driver version: 470.57.02
cuDNN version: Probably one of the following:
/usr/lib64/libcudnn.so.8.2.4
/usr/lib64/libcudnn_adv_infer.so.8.2.4
/usr/lib64/libcudnn_adv_train.so.8.2.4
/usr/lib64/libcudnn_cnn_infer.so.8.2.4
/usr/lib64/libcudnn_cnn_train.so.8.2.4
/usr/lib64/libcudnn_ops_infer.so.8.2.4
/usr/lib64/libcudnn_ops_train.so.8.2.4
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.20.2
[pip3] torch==1.10.2
[pip3] torchaudio==0.8.0a0+e4e171a
[pip3] torchvision==0.9.1
[conda] _tflow_select             2.3.0                       mkl  
[conda] blas                      1.0                         mkl  
[conda] cudatoolkit               11.1.74              h6bb024c_0    nvidia
[conda] ffmpeg                    4.3                  hf484d3e_0    pytorch
[conda] mkl                       2021.2.0           h06a4308_296  
[conda] mkl-service               2.3.0            py39h27cfd23_1  
[conda] mkl_fft                   1.3.0            py39h42c9631_2  
[conda] mkl_random                1.2.1            py39ha9443f7_2  
[conda] numpy                     1.20.2           py39h2d18471_0  
[conda] numpy-base                1.20.2           py39hfae3a4d_0  
[conda] tensorflow                2.4.1           mkl_py39h4683426_0  
[conda] tensorflow-base           2.4.1           mkl_py39h43e0292_0  
[conda] torch                     1.10.2                   pypi_0    pypi
[conda] torchaudio                0.8.1                      py39    pytorch
[conda] torchvision               0.9.1                py39_cu111    pytorch

elvinagam · 2022-11-15T16:16:40Z

It is so weird that this issue is everywhere. I believe it is caused by the older Ubuntu version of the server/setup that you are using. Try reproducing the same error on a server running Ubuntu 20 or newer.

parvathirajan · 2023-01-05T12:22:29Z

+1

I'm getting this issue while running pipenv install

gchanan added the module: binaries and triaged labels Jul 20, 2020

ruotianluo closed this as completed Jul 22, 2020

ruotianluo reopened this Jul 29, 2020

ezyang added the high priority label Aug 2, 2020

pytorch-probot bot added the triage review label Aug 2, 2020

mruberry removed the triage review label Aug 3, 2020

mattip self-assigned this Aug 9, 2020

francoishernandez mentioned this issue Sep 8, 2020

What version of GCC do I need? OpenNMT/OpenNMT-py#1863

Closed

mattip removed their assignment Nov 30, 2020
