Interchangeability of cauchy kernel methods #81
Comments
Sorry for the late response. I have never tried this; I think it might depend on the problem characteristics. It's likely that there are small numerical differences between the implementations, and they might compound depending on how you're using them (e.g. in autoregressive generation, which involves repeated computations). I've never tried using the extension in a notebook. It can sometimes be tricky to get the notebook to use the same pip environment that has the extension installed; could that be related to the issue? Is it unable to find the extension at all, or is it trying to use the extension and giving errors?
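One way to tell "cannot find the extension" apart from "found it but it errors" is to ask the notebook kernel which interpreter it is running and whether it can locate the compiled module. The sketch below uses only the standard library. The module name `cauchy_mult` is an assumption about how the compiled extension is installed, and `describe_environment` is a hypothetical helper, so substitute whatever name your build actually produces.

```python
import importlib.util
import sys


def describe_environment(module_name="cauchy_mult"):
    """Report the running interpreter and whether `module_name` is importable.

    `module_name` defaults to an assumed name for the compiled extension;
    pass the name your own build installs.
    """
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        # find_spec can raise for malformed names or missing parent packages.
        spec = None
    return {
        "python": sys.executable,  # interpreter backing this kernel
        "module": module_name,
        "found": spec is not None,  # False means the kernel cannot see it
        "location": getattr(spec, "origin", None),
    }


# Run this in a notebook cell and again in the terminal where training works.
print(describe_environment())
```

If `python` differs between the notebook and the terminal, the notebook kernel is probably running in a different environment from the one where the extension was built. If `found` is true but importing still fails, the problem lies elsewhere, for example tensors that are not on a CUDA device.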
Thank you for your reply! Yeah, this was also my guess for what is happening (and seems like a trade-off between training and inference speed). I managed to solve the problem in my notebook by pushing the model onto a GPU since the extension seems to require CUDA. Small note - lines 78 and 102 of cauchy.py have a small syntax error where:
doesn't end up throwing an error unless it is changed to the following:
I can open a pull request if that's helpful. Thanks again for the help!
Thanks for the catch! I've made the fix and it will be incorporated in the next release.
Hi Albert,
I've been training an image generation model using the S4 module and the cauchy extension (compiled code on local machine). Within a model trained with the cauchy extension, would you expect performance differences if the naive implementation (slow kernel) was used for evaluation? Or is the naive implementation less robust than the extension?
My goal is to visually inspect a few things, but I am experiencing problems using the extension in a Jupyter notebook (even when placing all tensors/models on a GPU).
Thanks again for your time,
Tommy
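For readers who want a feel for how two implementations of the same Cauchy kernel can disagree slightly, here is a minimal pure-Python sketch. It assumes the kernel is the sum over j of v_j / (z_i - w_j), and it compares two summation orders as a stand-in for two implementations. It does not call the compiled extension, and the inputs and the function names `cauchy_naive`, `cauchy_reversed` and `max_abs_diff` are made up for illustration.

```python
import cmath


def cauchy_naive(v, z, w):
    """For each z_i, return sum_j v[j] / (z_i - w[j]), summed left to right."""
    return [sum(vj / (zi - wj) for vj, wj in zip(v, w)) for zi in z]


def cauchy_reversed(v, z, w):
    """Same quantity, summed in reverse order (stand-in for another kernel)."""
    return cauchy_naive(v[::-1], z, w[::-1])


def max_abs_diff(a, b):
    """Largest elementwise absolute difference between two complex lists."""
    return max(abs(x - y) for x, y in zip(a, b))


# Illustrative inputs only: poles w in the left half-plane, z on the unit circle.
w = [complex(-0.5, j) for j in range(1, 9)]
v = [complex(1.0 / j, 0.1 * j) for j in range(1, 9)]
z = [cmath.exp(2j * cmath.pi * k / 16) for k in range(16)]

diff = max_abs_diff(cauchy_naive(v, z, w), cauchy_reversed(v, z, w))
print(f"max abs difference between summation orders: {diff:.3e}")
```

A discrepancy at the level of floating-point rounding is harmless for a single evaluation. Whether it matters when outputs are fed back in repeatedly, as in autoregressive generation, depends on the model and would have to be checked by comparing both code paths on the same checkpoint and inputs.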