Unnecessary integer moves #245
Comments
Thanks for this report. I will investigate.
I reproduced the issue. Looks like a bug all right. Debugging.
The issue turned out to be unexpected expressions ("t + 0", for example) in the stencil indices. These should now be recognized (as equivalent to "t" in the example) and handled properly. I pushed a change to the "develop" branch if you want to test it before I merge it into "master", probably sometime tomorrow, assuming all regression tests pass. Performance of the test case you provided increased 1.86x on my system, consistent with your measurement.
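To illustrate the kind of normalization described above, here is a minimal C++ sketch that reduces index strings such as "t + 0" and "t" to the same canonical form. It is not the YASK compiler's implementation; `IndexExpr` and `parse_index` are hypothetical names made up for this example, and only the forms `name`, `name + k`, and `name - k` are handled.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical canonical form of a stencil index: a dimension name
// plus a constant integer offset ("t + 0" and "t" both become {t, 0}).
struct IndexExpr {
    std::string dim;
    int offset = 0;
    bool operator==(const IndexExpr& o) const {
        return dim == o.dim && offset == o.offset;
    }
};

// Parse "name", "name + k", or "name - k" (k a non-negative integer).
// Returns std::nullopt for anything else.
std::optional<IndexExpr> parse_index(const std::string& text) {
    std::size_t i = 0;
    auto at = [&](std::size_t k) { return static_cast<unsigned char>(text[k]); };
    auto skip_ws = [&] { while (i < text.size() && std::isspace(at(i))) ++i; };

    skip_ws();
    // The dimension name must start with a letter or underscore.
    if (i == text.size() || !(std::isalpha(at(i)) || text[i] == '_')) return std::nullopt;
    IndexExpr e;
    while (i < text.size() && (std::isalnum(at(i)) || text[i] == '_')) e.dim += text[i++];

    skip_ws();
    if (i == text.size()) return e;  // plain "name": offset stays 0

    const char sign = text[i++];
    if (sign != '+' && sign != '-') return std::nullopt;

    skip_ws();
    if (i == text.size() || !std::isdigit(at(i))) return std::nullopt;
    int value = 0;
    while (i < text.size() && std::isdigit(at(i))) value = value * 10 + (text[i++] - '0');

    skip_ws();
    if (i != text.size()) return std::nullopt;  // trailing garbage

    e.offset = (sign == '+') ? value : -value;
    return e;
}
```

With this sketch, `parse_index("t + 0")` and `parse_index("t")` compare equal, so a later pass that looks for accesses at offset 0 would treat them the same way.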
The "t+0" occurs because the stencil itself is generated by a different tool. Thank you for the hotfix; it works.
For some stencils, the YASK compiler generates unnecessary integer move operations. They are caused by the construction of unaligned vectors, which is not required in many cases.
As an example, see the attached RHS_LC stencil. When compiled with a fold of 'x=1,y=1,z=4', radius 4, and arch=hsw, it has many unaligned vector constructions inside 'calc_loop_of_clusters', most of which are not necessary. (I mean the code after the comment "Construct unaligned vector starting at ..." in the generated file yask_stencil_code.hpp.)
These unnecessary moves generate unwanted 'movq' instructions in AVX/AVX2 code, which hurts performance a lot for this kernel. For example, on an Intel Haswell CPU (E5-2695), avoiding them gave a 1.8x performance gain on 1 socket.
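For readers unfamiliar with the term, the following C++ sketch shows conceptually what separates an aligned access from an unaligned one under a fold of z=4. It is not taken from yask_stencil_code.hpp; `kFold`, `Vec`, `is_aligned`, and `construct_unaligned` are hypothetical names, and the element-wise loop only stands in for whatever instructions the generated code actually uses.

```cpp
#include <array>
#include <cassert>

// Assumed fold length along z, matching the 'z=4' fold in the report.
constexpr int kFold = 4;

using Vec = std::array<double, kFold>;

// An access at offset dz along z falls on a vector boundary only when
// dz is a multiple of the fold length (negative offsets included).
constexpr bool is_aligned(int dz) {
    return ((dz % kFold) + kFold) % kFold == 0;
}

// Build the vector that starts `shift` elements into `lo` and takes its
// remaining elements from the next aligned vector `hi`.
// Precondition: 0 <= shift < kFold.
Vec construct_unaligned(const Vec& lo, const Vec& hi, int shift) {
    Vec out{};
    for (int i = 0; i < kFold; ++i) {
        const int src = i + shift;
        out[i] = (src < kFold) ? lo[src] : hi[src - kFold];
    }
    return out;
}
```

In this model, an access with `is_aligned(dz)` true can use an existing aligned vector directly, and only the remaining offsets need `construct_unaligned`. The report is about constructions emitted in cases where the first path would have sufficed.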
yask_movq_issue.zip