
Interchangeability of cauchy kernel methods #81

Closed
tommybotch opened this issue Dec 24, 2022 · 3 comments

Comments

@tommybotch

Hi Albert,

I've been training an image generation model using the S4 module and the cauchy extension (compiled code on local machine). Within a model trained with the cauchy extension, would you expect performance differences if the naive implementation (slow kernel) was used for evaluation? Or is the naive implementation less robust than the extension?

My goal is to visually inspect a few things, but I'm experiencing problems using the extension in a Jupyter notebook (even when placing all tensors/models on a GPU).
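For context, the reduction both methods compute can be sketched in a few lines of plain Python (names and shapes here are illustrative, not the repo's actual API, which operates on batched complex torch tensors):

```python
def cauchy_naive(v, z, w):
    """Naive Cauchy kernel: for each z_j, compute sum_i v_i / (z_j - w_i).

    Plain-Python sketch for clarity; both the slow PyTorch version and the
    fused CUDA extension compute this same reduction, just with different
    memory layouts and evaluation orders.
    """
    return [sum(vi / (zj - wi) for vi, wi in zip(v, w)) for zj in z]

# Example: v = [1, 1], poles w = [0, i], evaluated at z = 2
print(cauchy_naive([1.0, 1.0], [2.0 + 0j], [0.0 + 0j, 1j]))
```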

Thanks again for your time,
Tommy

@tommybotch tommybotch changed the title Interchangeability of cauchy kernels Interchangeability of cauchy kernel methods Dec 24, 2022
@albertfgu
Contributor

Sorry for the late response. I have never tried this; I think it might depend on the problem characteristics. There are likely small numerical differences between the implementations, and depending on how you're using them they might compound (e.g. autoregressive generation involves repeated computations, so the differences could accumulate).
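As a toy illustration (not the S4 code): floating-point reductions are order-sensitive, so two mathematically identical implementations that accumulate in different orders can disagree in the last bits, and those bits matter more when the output is fed back in repeatedly:

```python
import math

# The mathematical sum is 1.0, but naive left-to-right addition loses
# the 1.0 when it is absorbed by 1e16; exact summation recovers it.
vals = [1e16, 1.0, -1e16]
print(sum(vals))        # 0.0
print(math.fsum(vals))  # 1.0
```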

I've never tried using the extension in a notebook. Getting the notebook to use the same pip environment that has the extension can sometimes be tricky; could that be related to the issue? Is it unable to find the extension at all, or is it trying to use the extension and hitting errors?

@tommybotch
Author

tommybotch commented Jan 4, 2023

Thank you for your reply! Yeah, this was also my guess for what is happening (and seems like a trade-off between training and inference speed). I managed to solve the problem in my notebook by pushing the model onto a GPU since the extension seems to require CUDA.

Small note - lines 78 and 102 of cauchy.py have a small precedence bug (the code is syntactically valid, but `not` binds tighter than `and`), where:

```python
if not v.is_cuda and z.is_cuda and w.is_cuda: raise NotImplementedError(f'Only support CUDA tensors')
```

never raises when all of the tensors are on the CPU, unless it is changed to the following:

```python
if not (v.is_cuda and z.is_cuda and w.is_cuda): raise NotImplementedError(f'Only support CUDA tensors')
```
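Stripped down to plain booleans (hypothetical `*_cuda` flags standing in for the `.is_cuda` checks), the difference is:

```python
# `not` binds tighter than `and`, so the original check parses as
# ((not v.is_cuda) and z.is_cuda and w.is_cuda): it only fires in the
# mixed case where v is on CPU while z and w are on CUDA. With all
# three tensors on the CPU it stays silent.
v_cuda = z_cuda = w_cuda = False  # everything on CPU

buggy = not v_cuda and z_cuda and w_cuda
assert buggy is False  # check never fires, so no error is raised

fixed = not (v_cuda and z_cuda and w_cuda)
assert fixed is True   # correctly detects that not all tensors are on CUDA
```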

I can open a pull request if that's helpful. Thanks again for the help!

@albertfgu
Contributor

Thanks for the catch! I've made the fix and it will be incorporated in the next release.
