Thanks for the blog post! If you wouldn't mind, I have a couple of comments/suggestions:
For PyTorch-like convolutions, the library to use is NNlib for an interface like torch.nn.functional, or Flux for one like torch.nn. These should be tuned for batched, multithreaded CPU + GPU workloads. I would be surprised if an implementation using PyTorch's (I)FFT functionality could beat FFTW.jl, because the latter wraps an optimized C library!
Thanks for reading and for the tips and links. I've used Zygote in the past and I'd like to get more involved with the FluxML ecosystem, in particular for experiments with neural cellular automata. I've updated my performance tests and I'm going to be writing a follow-up in the next few days.
What's new:
I wrote a NumPy implementation that is much closer to what my Julia implementation is doing, namely using FFTs for convolutions. Unsurprisingly, Julia is quite a bit faster, especially considering ...
I noticed and fixed an issue in my Julia implementation that was performing two FFT convolutions for each update (and only using one of them). Fixing this makes the Julia implementation much faster, and it is now faster than the PyTorch implementation for small grid dimensions.
I also upgraded the PyTorch version I am using to 1.9.0 from 1.5.1, so CARLE is faster now too.
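For anyone following along, here is a rough sketch of what an FFT-based convolution update can look like in NumPy. It is not the code from the blog post or from CARLE. The function names (make_moore_kernel_fft, fft_neighbor_count, life_step) are made up for illustration, and the rule is assumed to be Conway's Life (B3/S23) on a toroidal grid. The kernel FFT is computed once and reused, so each update does a single forward and inverse transform of the grid.

```python
import numpy as np


def make_moore_kernel_fft(shape):
    """Precompute the FFT of a Moore-neighborhood kernel (hypothetical helper).

    The 8 neighbor offsets are placed around the origin with wraparound,
    so multiplying in Fourier space gives a circular convolution.
    Assumes both grid dimensions are at least 3.
    """
    kernel = np.zeros(shape)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if (dy, dx) != (0, 0):
                kernel[dy % shape[0], dx % shape[1]] = 1.0
    return np.fft.rfft2(kernel)


def fft_neighbor_count(grid, kernel_fft):
    """Count live neighbors with one forward and one inverse FFT."""
    conv = np.fft.irfft2(np.fft.rfft2(grid) * kernel_fft, s=grid.shape)
    # The result is real-valued with small floating point error; round to ints.
    return np.rint(conv).astype(np.int64)


def life_step(grid, kernel_fft):
    """One update of Conway's Life (B3/S23, assumed here) on a torus."""
    n = fft_neighbor_count(grid, kernel_fft)
    born = (grid == 0) & (n == 3)
    survive = (grid == 1) & ((n == 2) | (n == 3))
    return (born | survive).astype(grid.dtype)


# Usage: a horizontal blinker on a 5x5 grid becomes vertical after one step.
grid = np.zeros((5, 5), dtype=np.int64)
grid[2, 1:4] = 1
kernel_fft = make_moore_kernel_fft(grid.shape)
next_grid = life_step(grid, kernel_fft)
```

Passing kernel_fft in as an argument, rather than rebuilding it inside the step function, is one way to keep redundant transforms out of the update loop.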
I'm not too worried about the name. I thought it was a little off-putting at first and now I'm happy to make fun of myself for overthinking it :)