Non-Temporal Memory Transfers for ARM #961
base: master
Make ldnp load 64-bit registers (instead of 32-bit-wide registers)
a6de5f6 to 332803d
Works on my M2. I tried a convergence test (Snell's law) and got the same error on both this branch and master. The performance difference is hard to quantify. As this is a laptop, the difference between runs with the same settings is smaller than the difference between the settings.
Codecov Report
All modified and coverable lines are covered by tests ✅

@@           Coverage Diff           @@
##           master     #961   +/-   ##
=======================================
  Coverage   13.82%   13.82%
=======================================
  Files         268      268
  Lines       14984    14984
=======================================
  Hits         2071     2071
  Misses      12913    12913
@jwjeremy, do you have capacity to test this branch and the master branch on your MacBook before the training next week?
This PR adds non-temporal loads/stores for ARM processors.
It is functional in principle, but not yet fully tested. In particular, the memory ordering for the pure AArch64 path may need more verification.
Opened as a PR for better visibility.
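For readers unfamiliar with the instructions involved, here is a minimal sketch of a non-temporal copy on AArch64 using `ldnp`/`stnp` with 64-bit (`x`) registers. It is not the code from this PR: the helper name `copyPairsNonTemporal` is hypothetical, the trailing `dmb ish` barrier is a conservative assumption rather than a verified requirement, and on non-AArch64 targets the sketch simply falls back to `std::memcpy`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the PR): copies `pairs` blocks of two
// 64-bit words from src to dst.
inline void copyPairsNonTemporal(std::uint64_t* dst,
                                 const std::uint64_t* src,
                                 std::size_t pairs) {
  assert(dst != nullptr && src != nullptr);
#if defined(__aarch64__) && (defined(__GNUC__) || defined(__clang__))
  for (std::size_t i = 0; i < pairs; ++i) {
    std::uint64_t lo;
    std::uint64_t hi;
    // The %x modifier selects the 64-bit register name (x0..x30), so each
    // ldnp moves 16 bytes; a %w operand would only move 2 x 32 bits.
    asm volatile("ldnp %x0, %x1, [%2]"
                 : "=&r"(lo), "=&r"(hi)
                 : "r"(src + 2 * i)
                 : "memory");
    asm volatile("stnp %x0, %x1, [%2]"
                 :
                 : "r"(lo), "r"(hi), "r"(dst + 2 * i)
                 : "memory");
  }
  // Assumption: a full barrier after the loop is a conservative choice to
  // order the non-temporal accesses; a weaker barrier may be sufficient.
  asm volatile("dmb ish" ::: "memory");
#else
  // Portable fallback so the sketch also builds on other architectures.
  std::memcpy(dst, src, pairs * 2 * sizeof(std::uint64_t));
#endif
}
```

The non-temporal variants are only a hint to the memory system that the data is unlikely to be reused soon, so the copied values are the same as with a regular `ldp`/`stp` pair; only caching behaviour and ordering guarantees may differ.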