Excessive Caching in Neural Network #257
Comments
I've reproduced this and deduced that we weren't enabling scoped alias analysis, so Enzyme, lacking that improved alias information, had to conservatively cache extra values. Happily, turning it on eliminates the caches completely (a PR for this is in progress).
Thanks for the quick fix! 👍
FYI, I cloned a fresh Enzyme revision 6117bbd, which should contain the current fix, then rebuilt and reinstalled Enzyme.
The bug is also present with [...]. I'll try clang's mainline.
Just to confirm, can you paste the output of the analysis? Specifically, the thing to look for is that there are no more lines like [...]. There will still exist some lines like [...].
OK, it still spits out plenty of lines, but looking more carefully there is no longer a "remark: Caching instruction". There is still a "remark: Load must be recomputed", though (I care less about this, provided it's not quadratic). Here is the new output:
Hello,
I'm trying to build a proto-neural-network with Enzyme, i.e. two successive matrix-vector products.
I tried to keep the code as simple and minimalist as possible.
The code runs fine, but when I pass -Rpass=enzyme it reports that it is caching and recomputing, whereas it shouldn't need any memory allocation (I'm preallocating the intermediate buffers) or any recomputation (I'm preserving the intermediate layers).
I have put restrict everywhere I can, but what am I doing wrong?
Thanks
bugDense.cpp
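The attached file is not reproduced in this thread. As a rough illustration of the kind of code described above (two successive matrix-vector products with caller-preallocated intermediate buffers and restrict-qualified pointers), here is a minimal sketch. All names (`dense`, `two_layer_loss`, the buffer parameters) and the sum-of-squares loss are hypothetical and are not taken from bugDense.cpp; the call into Enzyme itself is omitted because it requires the plugin.

```cpp
#include <cstddef>

// Hypothetical dense layer: out = W * in, with W stored row-major (rows x cols).
// __restrict__ is a Clang/GCC extension promising that the buffers do not overlap.
static void dense(const double* __restrict__ W,
                  const double* __restrict__ in,
                  double* __restrict__ out,
                  std::size_t rows, std::size_t cols) {
    for (std::size_t i = 0; i < rows; ++i) {
        double acc = 0.0;
        for (std::size_t j = 0; j < cols; ++j)
            acc += W[i * cols + j] * in[j];
        out[i] = acc;
    }
}

// Two successive matrix-vector products followed by a sum-of-squares loss
// (the loss is an assumption made for this sketch).
// `hidden` and `out` are preallocated by the caller, so the forward pass
// allocates nothing and the intermediate layer stays available afterwards.
double two_layer_loss(const double* __restrict__ W1,
                      const double* __restrict__ W2,
                      const double* __restrict__ x,
                      double* __restrict__ hidden,
                      double* __restrict__ out,
                      std::size_t n_in, std::size_t n_hidden, std::size_t n_out) {
    dense(W1, x, hidden, n_hidden, n_in);     // first layer:  hidden = W1 * x
    dense(W2, hidden, out, n_out, n_hidden);  // second layer: out = W2 * hidden
    double loss = 0.0;
    for (std::size_t i = 0; i < n_out; ++i)
        loss += out[i] * out[i];
    return loss;
}
```

A function shaped like `two_layer_loss` is what one would hand to Enzyme for differentiation; whether the remarks report any caching for it depends on the alias information available to the compiler, as discussed in the comments.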
Compilation with:
clang bugDense.cpp -lstdc++ -lm -fno-exceptions -Rpass=enzyme -Xclang -load -Xclang /usr/local/lib/ClangEnzyme-11.so -O2 -o bugDense
Output: