
Installation runs into CUDA problem #212

Closed
cap-jmk opened this issue Feb 18, 2022 · 13 comments

Comments

@cap-jmk commented Feb 18, 2022

The bug reported at rusty1s/pytorch_sparse#180 and pyg-team/pytorch_geometric#4095 propagates to deeptime, too. Pinning PyTorch to an older version in the setup might help.

@clonker (Member) commented Feb 18, 2022

This seems to be a problem with incompatible pytorch and pytorch_sparse versions. Here we only depend on pytorch, and only weakly; there is no explicit dependency.
For that reason I am a bit uncomfortable pinning a version in the setup, as it would introduce a hard dependency.

@cap-jmk (Author) commented Feb 19, 2022

I experienced the error while installing deeptime in an isolated conda environment on the latest Ubuntu release. Since pip was pulling the default PyTorch build, the error occurred with plain PyTorch, too. It also occurs on Colab when using PyTorch. As far as I know, the error does not stem from any particular Python package but from the CUDA compilation, so it is independent of a specific Python package. Either way, deeptime is unusable in that case, so I recommend fixing the error.

@clonker (Member) commented Feb 19, 2022

That is very odd, deeptime is not supposed to pull pytorch at all. Can you try again in an isolated environment and paste the output here?
If you have a look here you can see that pytorch is only an "extras" dependency, so a mere pip install deeptime shouldn't pull it.
Here is what you can run to check the installed dependencies of a pip package (example output for a test installation of mine):

```
~  pip show deeptime
Name: deeptime
Version: 0.4.1
[...]
Requires: numpy, scikit-learn, scipy, threadpoolctl
Required-by:
```
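For illustration, an optional ("extras") dependency of this kind is declared roughly as follows. The extras group name and package lists below are illustrative assumptions, not deeptime's actual setup:

```python
# Sketch of an extras declaration: torch is only installed when the user
# explicitly requests the extra, e.g. `pip install deeptime[deep-learning]`.
# Group name and package lists are illustrative assumptions.
install_requires = ["numpy", "scikit-learn", "scipy", "threadpoolctl"]
extras_require = {"deep-learning": ["torch"]}

# A plain `pip install deeptime` resolves only install_requires,
# so torch is never pulled implicitly:
assert "torch" not in install_requires
assert "torch" in extras_require["deep-learning"]
```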

Please let me know what you find, thanks!

@cap-jmk (Author) commented Feb 21, 2022

However, torch is required to use deeptime's deep-learning functionality. Maybe you can try to reproduce the error by pulling the default torch on a machine with CUDA 11.1. While the bug is present, I think users will wonder why they can't use the full functionality of deeptime, or why the import of deeptime fails at all. They might conclude the library is faulty and skip it.

@clonker (Member) commented Feb 21, 2022

Ah, now I see what you mean - I think it's a good idea to catch such an import error. 🙂 Pinning the version in the setup doesn't seem very sensible to me though, as we do not depend on pytorch.
Here is what happens: deeptime checks whether pytorch is installed and, if so, imports certain deep learning submodules. I will add a check that torch can actually be imported successfully rather than just checking whether the namespace is available.
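Such a guard might look like the following sketch (the helper name and module layout are illustrative, not deeptime's actual code):

```python
import importlib


def try_import(name):
    """Return the named module if it imports cleanly, else None.

    Catching Exception rather than only ImportError also covers the case
    where the package is installed but raises at import time, e.g. due
    to a broken CUDA/toolchain setup.
    """
    try:
        return importlib.import_module(name)
    except Exception:
        return None


# Deep-learning submodules would only be exposed when torch really imports:
torch = try_import("torch")
HAS_TORCH = torch is not None
```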

@cap-jmk (Author) commented Feb 21, 2022

Great fix 🚀
Beneath the torch bug, I noticed another, similar one. When installing from pip, the numerical module does not always get the right C++ compilation; installing from conda works, though. The error looks similar to the other one:

```
undefined symbol: _ZNSt15__exception_ptr13exception_ptr10_M_releaseE
```

Ref: pybind/pybind11#3623

I am not sure if it is worth fixing at all. Just wanted to report in case there is some inconsistency in the distributions.
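For diagnosing errors of this kind, one can check at runtime whether the libstdc++ that the dynamic loader picks up actually exports the missing symbol. The sketch below uses ctypes and assumes a glibc/Linux system where ctypes.util can locate libstdc++:

```python
import ctypes
import ctypes.util

# The mangled symbol from the error message above
# (demangled: std::__exception_ptr::exception_ptr::_M_release()).
SYMBOL = "_ZNSt15__exception_ptr13exception_ptr10_M_releaseE"


def has_symbol(library, symbol):
    """Return True if the loaded shared library exports the given symbol."""
    return hasattr(library, symbol)


libname = ctypes.util.find_library("stdc++")
if libname is not None:
    libstdcpp = ctypes.CDLL(libname)
    print(f"{SYMBOL} present in {libname}: {has_symbol(libstdcpp, SYMBOL)}")
else:
    print("libstdc++ not found; likely not a glibc/Linux system")
```

If the symbol is missing from the system libstdc++ but the locally compiled extension references it, the toolchain that built the extension was newer than the runtime libraries, which matches the pybind11 issue linked above.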

@clonker (Member) commented Feb 21, 2022

Ah, thank you for bringing it to my attention! That is one of the drawbacks of using an sdist over a binary distribution with pip. On the other hand, I do like that it is compiled locally. Basically a toolchain setup problem... not sure how one would even go about fixing that, aside from using a binary distribution of course :)

@cap-jmk (Author) commented Feb 21, 2022

From the user's perspective, I think either works. But when building packages that have deeptime as a dependency, it would be useful to be able to pull it reliably from pip. Otherwise, distributing the new package via PyPI inherits the same problem, and the bug would propagate forever…
If the faulty behaviour is present, one could also redirect the user to the conda build or provide additional instructions. Maybe a simple test during the setup procedure would help decide what to do. What do you think?

Conda is not an option in every environment.

@clonker (Member) commented Mar 2, 2022

Hey @MQSchleich, I've been experimenting a bit with CMake as the primary build system; I'd imagine it is a bit more robust with respect to incompatible toolchains. Also, the initial pytorch issue should be fixed on the branch of PR #215 - if you'd like and have some time, I'd appreciate it if you could try it out and see whether the problem persists.

@cap-jmk (Author) commented Mar 7, 2022

@clonker, did you upload it to PyPI yet? I tried it out on the problematic machine, and the problem did indeed persist...

@clonker (Member) commented Mar 7, 2022

No, it's not on PyPI yet; you'll have to install directly from the repository:

```
pip install git+https://github.com/deeptime-ml/deeptime.git@main
```

@clonker (Member) commented Apr 12, 2022

Ping on this one: with the new version it should also work via a plain pip install deeptime.

@clonker (Member) commented Aug 18, 2022

I assume this is either no longer an issue or abandoned, please feel free to reopen otherwise. :)

@clonker closed this as completed Aug 18, 2022