Cannot import torchdata when installed with conda on WSL #961

Open
erip opened this issue Jan 24, 2023 · 17 comments

@erip
Contributor

erip commented Jan 24, 2023

🐛 Describe the bug

$ conda install -c pytorch torchdata -y
$ python -c "import torchdata"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/erip/anaconda3/envs/tmp-dev/lib/python3.10/site-packages/torchdata/__init__.py", line 7, in <module>
    from torchdata import _extension  # noqa: F401
  File "/home/erip/anaconda3/envs/tmp-dev/lib/python3.10/site-packages/torchdata/_extension.py", line 34, in <module>
    _init_extension()
  File "/home/erip/anaconda3/envs/tmp-dev/lib/python3.10/site-packages/torchdata/_extension.py", line 31, in _init_extension
    from torchdata import _torchdata as _torchdata
ImportError: /usr/lib/x86_64-linux-gnu/libp11-kit.so.0: undefined symbol: ffi_type_pointer, version LIBFFI_BASE_7.0
$ conda uninstall torchdata -y && pip install torchdata
$ python -c "import torchdata"
$

Versions

PyTorch version: 1.13.1
Is debug build: False
CUDA used to build PyTorch: 11.7
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.5 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.31

Python version: 3.10.9 (main, Jan 11 2023, 15:21:40) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-5.15.79.1-microsoft-standard-WSL2-x86_64-with-glibc2.31
Is CUDA available: True
CUDA runtime version: 11.7.99
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 3090
Nvidia driver version: 528.02
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] torch==1.13.1
[pip3] torchdata==0.5.1
[pip3] torchtext==0.14.1
[conda] blas 1.0 mkl
[conda] mkl 2022.1.0 hc2b9512_224
[conda] pytorch 1.13.1 py3.10_cuda11.7_cudnn8.5.0_0 pytorch
[conda] pytorch-cuda 11.7 h67b0de4_1 pytorch
[conda] pytorch-mutex 1.0 cuda pytorch
[conda] torchdata 0.5.1 pypi_0 pypi
[conda] torchtext 0.14.1 py310 pytorch

@ejguan
Contributor

ejguan commented Jan 24, 2023

Could you try installing libffi from conda?

@erip
Contributor Author

erip commented Jan 24, 2023

I did and it didn't help (no change in the error). This is a clean install, so it seems like an issue with the conda packaging of torchdata. Note that the wheel is fine.

@ejguan
Contributor

ejguan commented Jan 24, 2023

Do you mind checking the libffi version?
If it's installed from conda, you should be able to find it via conda list.

@ejguan
Contributor

ejguan commented Jan 24, 2023

When torchdata 0.5.1 was compiled, it used libffi 3.4.2. It seems your environment is trying to use LIBFFI_BASE_7.0.

@erip
Contributor Author

erip commented Jan 24, 2023

Yes, here are the relevant details:

$ conda list | grep libffi
libffi                    3.4.2                h6a678d5_6
$ python -c "import torchdata"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/erip/anaconda3/envs/tmp-dev/lib/python3.10/site-packages/torchdata/__init__.py", line 7, in <module>
    from torchdata import _extension  # noqa: F401
  File "/home/erip/anaconda3/envs/tmp-dev/lib/python3.10/site-packages/torchdata/_extension.py", line 34, in <module>
    _init_extension()
  File "/home/erip/anaconda3/envs/tmp-dev/lib/python3.10/site-packages/torchdata/_extension.py", line 31, in _init_extension
    from torchdata import _torchdata as _torchdata
ImportError: /usr/lib/x86_64-linux-gnu/libp11-kit.so.0: undefined symbol: ffi_type_pointer, version LIBFFI_BASE_7.0

It seems like the right lib is coming along for the ride, but something is causing it to be linked improperly. I tried messing with LD_LIBRARY_PATH as well but it didn't seem to help.
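
One way to narrow down which libffi actually ends up mapped into the Python process is to read /proc/self/maps after importing ctypes, whose extension module typically depends on libffi. The following is a minimal sketch and was not run in this thread; `mapped_libraries` is a hypothetical helper name, and the approach assumes a Linux /proc filesystem.

```python
import os


def mapped_libraries(maps_text: str, needle: str) -> list[str]:
    """Return the sorted, unique file paths in a /proc/<pid>/maps dump
    whose basename contains `needle`."""
    paths = set()
    for line in maps_text.splitlines():
        # Columns: address perms offset dev inode [path]
        fields = line.split(None, 5)
        if len(fields) == 6:
            path = fields[5].strip()
            if path.startswith("/") and needle in os.path.basename(path):
                paths.add(path)
    return sorted(paths)


if __name__ == "__main__" and os.path.exists("/proc/self/maps"):
    # Importing ctypes usually pulls in the libffi that _ctypes links against.
    import ctypes  # noqa: F401

    with open("/proc/self/maps") as fh:
        for path in mapped_libraries(fh.read(), "libffi"):
            print(path)
```

If the printed path lies under $CONDA_PREFIX, the process has already loaded the conda copy before any system library asks for libffi.so.7.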

@ejguan
Contributor

ejguan commented Jan 25, 2023

TBH, I don't know. How about trying to reinstall cffi with pip as well?

@erip
Contributor Author

erip commented Jan 25, 2023

It doesn't seem to help: with a force reinstall, cffi 1.15.1 gets (re)installed but the error is the same. I can dig into this a bit more soon, but the short-term workaround is to pip install torchdata.

@ejguan
Contributor

ejguan commented Jan 25, 2023

Yeah. I think pip works simply because we statically link those libraries into the wheel, which prevents this scenario of finding the wrong shared lib.

@erip
Contributor Author

erip commented Jan 25, 2023

One thing I intend to do is look in $CONDA_PREFIX to see if the right version of libffi is there. I'm not sure what the offending lib is from the error message (looks like a crypto lib?) so I'm not sure how it interacts with the search path either...
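
For reading the error itself: the message names the object that failed to resolve a symbol first (here libp11-kit.so.0, part of p11-kit, a PKCS#11 module loader), then the symbol it wanted and the symbol version it expected. A small sketch that splits such a message into its parts follows; `parse_undefined_symbol` is a hypothetical helper, and the pattern only assumes the message shape seen in the traceback above.

```python
import re

# Shape: "<library>: undefined symbol: <symbol>[, version <version>]"
_UNDEFINED_SYMBOL = re.compile(
    r"^(?P<library>\S+): undefined symbol: (?P<symbol>[^,\s]+)"
    r"(?:, version (?P<version>\S+))?$"
)


def parse_undefined_symbol(message: str):
    """Split a dynamic-loader 'undefined symbol' message into its parts.

    Returns a dict with 'library', 'symbol' and 'version' keys,
    or None if the message has a different shape.
    """
    match = _UNDEFINED_SYMBOL.match(message.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    msg = (
        "/usr/lib/x86_64-linux-gnu/libp11-kit.so.0: undefined symbol: "
        "ffi_type_pointer, version LIBFFI_BASE_7.0"
    )
    print(parse_undefined_symbol(msg))
```

Read this way, the failing object is the system libp11-kit rather than torchdata's own extension, and the missing piece is ffi_type_pointer at version LIBFFI_BASE_7.0.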

@erip
Contributor Author

erip commented Jan 26, 2023

After conda installing torchdata, I see the following:

$ find $CONDA_PREFIX -name "*libffi*" | xargs strings | grep LIBFFI_BASE_ | sort -u
LIBFFI_BASE_7.0
LIBFFI_BASE_7.1
LIBFFI_BASE_8.0

I think this suggests that the right libffi might not be getting bundled, though I'm not positive since 3.4.2 is clearly installed as reported by conda.
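
The pipeline above merges the strings of every matching file, so it does not show which file carries which tag. A hedged sketch of a per-file report is below. `version_tags` and `tags_by_file` are hypothetical helper names; symlinks are resolved so that several names pointing at one file are reported once. Finding a tag string in a file does not by itself prove that the file defines that symbol version.

```python
import os
import re

# Matches tags such as LIBFFI_BASE_7.0 or LIBFFI_GO_CLOSURE_8.0
_TAG = re.compile(rb"LIBFFI_[A-Z_]+_\d+\.\d+")


def version_tags(data: bytes) -> list[str]:
    """Return the sorted, unique LIBFFI_* version tags found in raw bytes."""
    return sorted({m.decode("ascii") for m in _TAG.findall(data)})


def tags_by_file(prefix: str) -> dict[str, list[str]]:
    """Map each real libffi file under `prefix` to the tags it contains."""
    result = {}
    for root, _dirs, files in os.walk(prefix):
        for name in files:
            if "libffi" not in name:
                continue
            # Resolve symlinks so libffi.so.7 -> libffi.so.8.x is visible.
            real = os.path.realpath(os.path.join(root, name))
            if real in result or not os.path.isfile(real):
                continue
            with open(real, "rb") as fh:
                result[real] = version_tags(fh.read())
    return result


if __name__ == "__main__":
    prefix = os.environ.get("CONDA_PREFIX")
    if prefix:
        for path, tags in sorted(tags_by_file(prefix).items()):
            print(path, tags)
```

Comparing the entry that libffi.so.7 resolves to against the tag in the error message would show whether the file loaded under that name really offers LIBFFI_BASE_7.0.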

@ejguan
Contributor

ejguan commented Jan 27, 2023

Do you mind sharing the result of ldd torchdata/_torchdata.so?

@erip
Contributor Author

erip commented Jan 27, 2023

$ conda create -n tmp -y python=3.10 -q && conda activate tmp && conda install -c pytorch torchdata -y -q
$ ldd $CONDA_PREFIX/lib/python3.10/site-packages/torchdata/_torchdata.so
        linux-vdso.so.1 (0x00007ffeb9bb5000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007faad19c0000)
        libz.so.1 => /home/erip/anaconda3/envs/tmp/lib/python3.10/site-packages/torchdata/../../../libz.so.1 (0x00007faad19a2000)
        libcurl.so.4 => /lib/x86_64-linux-gnu/libcurl.so.4 (0x00007faad1910000)
        libssl.so.1.1 => /home/erip/anaconda3/envs/tmp/lib/python3.10/site-packages/torchdata/../../../libssl.so.1.1 (0x00007faad187f000)
        libcrypto.so.1.1 => /home/erip/anaconda3/envs/tmp/lib/python3.10/site-packages/torchdata/../../../libcrypto.so.1.1 (0x00007faad15b2000)
        libstdc++.so.6 => /home/erip/anaconda3/envs/tmp/lib/python3.10/site-packages/torchdata/../../../libstdc++.so.6 (0x00007faad139c000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007faad124d000)
        libgcc_s.so.1 => /home/erip/anaconda3/envs/tmp/lib/python3.10/site-packages/torchdata/../../../libgcc_s.so.1 (0x00007faad1233000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007faad1041000)
        /lib64/ld-linux-x86-64.so.2 (0x00007faad210f000)
        libnghttp2.so.14 => /lib/x86_64-linux-gnu/libnghttp2.so.14 (0x00007faad1018000)
        libidn2.so.0 => /lib/x86_64-linux-gnu/libidn2.so.0 (0x00007faad0ff7000)
        librtmp.so.1 => /lib/x86_64-linux-gnu/librtmp.so.1 (0x00007faad0fd5000)
        libssh.so.4 => /lib/x86_64-linux-gnu/libssh.so.4 (0x00007faad0f67000)
        libpsl.so.5 => /lib/x86_64-linux-gnu/libpsl.so.5 (0x00007faad0f54000)
        libgssapi_krb5.so.2 => /lib/x86_64-linux-gnu/libgssapi_krb5.so.2 (0x00007faad0f07000)
        libldap_r-2.4.so.2 => /lib/x86_64-linux-gnu/libldap_r-2.4.so.2 (0x00007faad0eb1000)
        liblber-2.4.so.2 => /lib/x86_64-linux-gnu/liblber-2.4.so.2 (0x00007faad0ea0000)
        libbrotlidec.so.1 => /lib/x86_64-linux-gnu/libbrotlidec.so.1 (0x00007faad0e90000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007faad0e8a000)
        libunistring.so.2 => /lib/x86_64-linux-gnu/libunistring.so.2 (0x00007faad0d08000)
        libgnutls.so.30 => /lib/x86_64-linux-gnu/libgnutls.so.30 (0x00007faad0b32000)
        libhogweed.so.5 => /lib/x86_64-linux-gnu/libhogweed.so.5 (0x00007faad0afb000)
        libnettle.so.7 => /lib/x86_64-linux-gnu/libnettle.so.7 (0x00007faad0abf000)
        libgmp.so.10 => /lib/x86_64-linux-gnu/libgmp.so.10 (0x00007faad0a3b000)
        libkrb5.so.3 => /lib/x86_64-linux-gnu/libkrb5.so.3 (0x00007faad095e000)
        libk5crypto.so.3 => /lib/x86_64-linux-gnu/libk5crypto.so.3 (0x00007faad092d000)
        libcom_err.so.2 => /lib/x86_64-linux-gnu/libcom_err.so.2 (0x00007faad0926000)
        libkrb5support.so.0 => /lib/x86_64-linux-gnu/libkrb5support.so.0 (0x00007faad0917000)
        libresolv.so.2 => /lib/x86_64-linux-gnu/libresolv.so.2 (0x00007faad08f9000)
        libsasl2.so.2 => /lib/x86_64-linux-gnu/libsasl2.so.2 (0x00007faad08dc000)
        libgssapi.so.3 => /lib/x86_64-linux-gnu/libgssapi.so.3 (0x00007faad0897000)
        libbrotlicommon.so.1 => /lib/x86_64-linux-gnu/libbrotlicommon.so.1 (0x00007faad0874000)
        libp11-kit.so.0 => /lib/x86_64-linux-gnu/libp11-kit.so.0 (0x00007faad073e000)
        libtasn1.so.6 => /lib/x86_64-linux-gnu/libtasn1.so.6 (0x00007faad0728000)
        libkeyutils.so.1 => /lib/x86_64-linux-gnu/libkeyutils.so.1 (0x00007faad071f000)
        libheimntlm.so.0 => /lib/x86_64-linux-gnu/libheimntlm.so.0 (0x00007faad0713000)
        libkrb5.so.26 => /lib/x86_64-linux-gnu/libkrb5.so.26 (0x00007faad0680000)
        libasn1.so.8 => /lib/x86_64-linux-gnu/libasn1.so.8 (0x00007faad05da000)
        libhcrypto.so.4 => /lib/x86_64-linux-gnu/libhcrypto.so.4 (0x00007faad05a2000)
        libroken.so.18 => /lib/x86_64-linux-gnu/libroken.so.18 (0x00007faad0587000)
        libffi.so.7 => /home/erip/anaconda3/envs/tmp/lib/python3.10/site-packages/torchdata/../../../libffi.so.7 (0x00007faad0576000)
        libwind.so.0 => /lib/x86_64-linux-gnu/libwind.so.0 (0x00007faad054c000)
        libheimbase.so.1 => /lib/x86_64-linux-gnu/libheimbase.so.1 (0x00007faad053a000)
        libhx509.so.5 => /lib/x86_64-linux-gnu/libhx509.so.5 (0x00007faad04ec000)
        libsqlite3.so.0 => /home/erip/anaconda3/envs/tmp/lib/python3.10/site-packages/torchdata/../../../libsqlite3.so.0 (0x00007faad039e000)
        libcrypt.so.1 => /lib/x86_64-linux-gnu/libcrypt.so.1 (0x00007faad0363000)

@ejguan
Contributor

ejguan commented Jan 27, 2023

So, it correctly finds the conda-provided libffi:

libffi.so.7 => /home/erip/anaconda3/envs/tmp/lib/python3.10/site-packages/torchdata/../../../libffi.so.7 (0x00007faad0576000)

Could you please share if there is libffi under /lib/x86_64-linux-gnu?
And, ldd /lib/x86_64-linux-gnu/libp11-kit.so.0?
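
In the ldd output above, libcurl.so.4, libgnutls.so.30 and libp11-kit.so.0 resolve under /lib/x86_64-linux-gnu, while libffi.so.7, libssl and libcrypto resolve inside the conda environment. To make that split easier to see, here is a minimal sketch that sorts ldd output into "inside the prefix" and "outside the prefix". `split_by_prefix` is a hypothetical helper, and it assumes the usual `soname => path (address)` line format.

```python
import os


def split_by_prefix(ldd_text: str, prefix: str) -> tuple[dict[str, str], dict[str, str]]:
    """Split `ldd` output into libraries resolved inside `prefix` and outside it.

    Returns two dicts mapping soname -> normalized resolved path.
    """
    prefix = os.path.normpath(prefix)
    inside, outside = {}, {}
    for line in ldd_text.splitlines():
        if "=>" not in line:
            continue  # skips linux-vdso and the dynamic loader itself
        soname, _, rest = line.partition("=>")
        fields = rest.split()
        if not fields or not fields[0].startswith("/"):
            continue  # e.g. "not found"
        # normpath collapses the "../../.." segments in the resolved path
        path = os.path.normpath(fields[0])
        is_inside = path == prefix or path.startswith(prefix + os.sep)
        (inside if is_inside else outside)[soname.strip()] = path
    return inside, outside
```

Feeding it the output of `ldd .../torchdata/_torchdata.so` together with $CONDA_PREFIX would list the system libraries (including libp11-kit.so.0) that end up sharing a process with the environment's libffi.so.7, which may be where the mismatch comes from.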

@erip
Contributor Author

erip commented Jan 27, 2023

Could you please share if there is libffi under /lib/x86_64-linux-gnu?

Yes, it seems like it.

$ find /lib/x86_64-linux-gnu/ -name "*libffi*"
/lib/x86_64-linux-gnu/pkgconfig/libffi.pc
/lib/x86_64-linux-gnu/libffi.so.7
/lib/x86_64-linux-gnu/libffi.a
/lib/x86_64-linux-gnu/libffi_pic.a
/lib/x86_64-linux-gnu/libffi.so
/lib/x86_64-linux-gnu/libffi.so.7.1.0

And, ldd /lib/x86_64-linux-gnu/libp11-kit.so.0?

$ ldd /lib/x86_64-linux-gnu/libp11-kit.so.0
        linux-vdso.so.1 (0x00007ffc8eb97000)
        libffi.so.7 => /lib/x86_64-linux-gnu/libffi.so.7 (0x00007fb798408000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fb798402000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fb7983df000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fb7981ed000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fb798555000)

@ejguan
Contributor

ejguan commented Jan 27, 2023

I guess there might be some backward-compatibility (BC) breaking changes between the system libffi.so.7.1.0 and the libffi.so.7 from conda.

But it's still unclear to me why conda wants to install another libffi.so.7 when there is already one on your system. Can you please check when libffi.so.7 is installed by conda?
Is it when the new conda environment is created, or when pytorch or torchdata is installed?
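
One way to see which conda package put a given file into the environment is to look through the per-package records in the environment's conda-meta directory. The sketch below assumes each record is a JSON file named after the package, with a "files" list of paths relative to the environment root. `owners_of` is a hypothetical helper name.

```python
import glob
import json
import os


def owners_of(prefix: str, relpath: str) -> list[str]:
    """Return the conda-meta record names (name-version-build) of every
    installed package whose record lists `relpath` among its files."""
    owners = []
    pattern = os.path.join(prefix, "conda-meta", "*.json")
    for record_path in sorted(glob.glob(pattern)):
        with open(record_path) as fh:
            record = json.load(fh)
        if relpath in record.get("files", []):
            # The record file name is "<name>-<version>-<build>.json"
            owners.append(os.path.basename(record_path)[: -len(".json")])
    return owners


if __name__ == "__main__":
    prefix = os.environ.get("CONDA_PREFIX")
    if prefix:
        print(owners_of(prefix, "lib/libffi.so.7"))
```

Running it once right after `conda create` and again after installing torchdata would show whether the file is present from environment creation or only arrives with a later package.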

@erip
Contributor Author

erip commented Jan 27, 2023

It seems to be at environment creation time, so perhaps this is an upstream issue with conda on WSL. 😞

@erip
Contributor Author

erip commented Jan 27, 2023

I created an issue upstream which is linked here. Hopefully they'll have some input.
