can't run train.py w/o compiling Cauchy kernel for CUDA #17

Closed
DorotheaKolossa opened this issue Mar 21, 2022 · 5 comments

DorotheaKolossa commented Mar 21, 2022

Dear all,

I am having trouble compiling the Cauchy kernel, and although I have installed pykeops, running train.py always results in errors like this:

RuntimeError: [KeOps] This KeOps shared object has been compiled without cuda support:

  1. to perform computations on CPU, simply set tagHostDevice to 0
  2. to perform computations on GPU, please recompile the formula with a working version of cuda.

The only thing that fixed the issue for me was commenting out the following try/catch. Without that (sorry for its ugliness...), the code never fell back to the slow kernel. Now it does, but that is certainly not the right way for me to go about it ;)

I wonder if the try/catch needs to check whether the kernel actually runs, not just whether it can be imported?

'''
try:
    import pykeops
    from src.models.functional.cauchy import cauchy_conj
    has_pykeops = True
except ImportError:
    has_pykeops = False
    from src.models.functional.cauchy import cauchy_conj_slow
    if not has_cauchy_extension:
        log.error(
            "Falling back on slow Cauchy kernel. Install at least one of pykeops or the CUDA extension for efficiency."
        )
'''

has_pykeops = False
from src.models.functional.cauchy import cauchy_conj_slow
if not has_cauchy_extension:
    log.error(
        "Falling back on slow Cauchy kernel. Install at least one of pykeops or the CUDA extension for efficiency."
    )

@albertfgu
Contributor

Your pykeops isn't installed correctly. If you want to go back to the slow kernel, you can simply uninstall pykeops, since it isn't being used.
However, I would recommend checking the pykeops installation by following the instructions on the website. In my experience, this error is most commonly caused by cmake not being installed correctly. Did you pip install pykeops and cmake in a fresh environment?
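
One way to check the installation from Python is to run the self-tests that pykeops ships, `pykeops.test_numpy_bindings()` and `pykeops.test_torch_bindings()`. The wrapper below is only a sketch: `run_pykeops_self_test` is a hypothetical helper name, and since the self-tests report their result by printing a message (whether they also raise on failure may depend on the pykeops version), the printed output is what should be read.

```python
import importlib
import importlib.util


def run_pykeops_self_test(module_name="pykeops"):
    """Run the bundled binding self-tests if the module is installed.

    Returns False when the module is missing, True when both calls
    completed. A True result only means the calls returned; check the
    printed messages to see whether the bindings actually work.
    """
    if importlib.util.find_spec(module_name) is None:
        print(f"{module_name} is not installed in this environment")
        return False
    mod = importlib.import_module(module_name)
    mod.test_numpy_bindings()  # compiles and runs a small formula via NumPy
    mod.test_torch_bindings()  # same check via PyTorch (requires torch)
    return True


if __name__ == "__main__":
    run_pykeops_self_test()
```

If the tests fail after a CUDA or compiler change, running them again in a fresh environment, as suggested above, rules out stale build artifacts.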

@DorotheaKolossa
Author

Thank you for your support! I can see the point for sure, but just being able to run the code slowly (to be able to look at the details in a debugger, etc.) is very interesting for me, and it's what I've been doing :)

So now I also tried to run w/ pykeops uninstalled, but this throws the following error:

File "[...] /lib/python3.8/site-packages/hydra/_internal/utils.py", line 587, in _locate
    raise ImportError(
ImportError: Encountered error: No module named 'pykeops' when loading module 'src.models.sequence.ss.s4.S4'

Maybe, if you'd like to allow running without a working pykeops installation, the try/catch above might be the easiest place to change after all?

(But I am also totally fine with closing the issue, perhaps after adding a note somewhere in the docs on how to avoid this kind of error?)

@albertfgu
Contributor

Hmm, I haven't encountered this error even when I uninstall pykeops. Would it work if you try a clean installation of the environment but without pykeops? That seems to work for me. Otherwise, I'm not sure what's causing this.

You're right that a better try/catch would actually try calling the function, but I think it's also useful to have a very explicit error when the installation isn't working. That way, people who are actually trying to use S4 for performance won't hit a silent bug where the code seems to be working but is slower than expected.
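
Both goals in this comment, an explicit error by default and a slow path for people who ask for it, might be combined along the following lines. This is a sketch under stated assumptions, not something S4 provides: the environment variable `ALLOW_SLOW_CAUCHY` and the function `handle_fast_kernel_failure` are hypothetical names invented for the example.

```python
import logging
import os

log = logging.getLogger(__name__)

# Hypothetical opt-in switch; not an existing S4 or pykeops setting.
ALLOW_SLOW_ENV = "ALLOW_SLOW_CAUCHY"


def handle_fast_kernel_failure(err, environ=os.environ):
    """Decide what to do after the fast kernel failed a trial call.

    Default: raise a loud, explicit error so a broken installation is
    never mistaken for a working but slow one. Opt-in: return "slow"
    when the user has set ALLOW_SLOW_CAUCHY=1.
    """
    if environ.get(ALLOW_SLOW_ENV) == "1":
        log.warning("Fast Cauchy kernel unavailable (%s); using the slow kernel by request.", err)
        return "slow"
    raise RuntimeError(
        f"Fast Cauchy kernel is installed but not working: {err}. "
        f"Fix the installation, or set {ALLOW_SLOW_ENV}=1 to use the slow kernel."
    ) from err
```

With this shape, a debugging session would set the variable once, while a performance run with a broken installation would still stop immediately.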

@DorotheaKolossa
Author

Thanks again for your help. In the new install, completely without any prior pykeops, the error is indeed caught.

@albertfgu
Contributor

Glad that worked! I'm not sure what caused the issue as uninstalling pykeops still works for me, but at least it works now :)
