Collaborate on Implementation? #1
Comments
@calclavia hey Henry! Yeah, I want to build this! Do you have a plan for cutting down memory usage for the autoregressive case? Custom CUDA like what EPFL did?
@calclavia feel free to email me, happy to chat more :)
@calclavia I think I'm just going to go it alone and finish it tonight, I'm too tired to collaborate lol sorry!
@calclavia we should keep in touch and exchange our findings testing this in the wild
Go go guys! ;)
@calclavia committed and deployed my first draft! I'll keep digging into the JAX code tomorrow to see what other essential details I'm missing
@calclavia turns out the CUDA code will definitely be necessary, since I got word that the number of random features is encouraged to be 256 or above. Let me know if you'd like to collaborate on this :)
@calclavia never mind, a new paper has found a way to do the causal case without CUDA code. I'm going to wrap up the project today
Sorry for the late response - was busier than I thought this week. Looking forward to your wrap up. :) |
I was planning on implementing this in PyTorch as well and started a repo: https://github.com/calclavia/Performer-Pytorch
I've implemented the kernel so far. If the author(s) of this repo want to collaborate, I'd be happy to contribute.