Sample matrix multiply code to show the effect of blocking and data alignment

The code mm.c accompanies two papers at software.intel.com that discuss memory layout and performance. A simple matrix multiply is reordered and blocked to show the resulting performance improvement. An exercise is included to show the impact on performance when the matrices are not aligned on cache-line boundaries.
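The sketch below illustrates the three ideas named above: loop reordering, blocking, and cache-line alignment. It is not taken from mm.c. The function names (alloc_matrix, mm_naive, mm_reordered, mm_blocked), the 64-byte cache-line size, and the block size of 32 are all assumptions chosen for illustration, and the matrices are assumed to be square, row-major arrays of doubles.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHELINE 64 /* assumed cache-line size in bytes */
#define BLOCK 32     /* assumed tile edge; tune for the target cache */

/* Allocate an n x n matrix of doubles that starts on a cache-line boundary. */
static double *alloc_matrix(size_t n)
{
    size_t bytes = n * n * sizeof(double);
    /* C11 aligned_alloc expects the size to be a multiple of the alignment. */
    bytes = (bytes + CACHELINE - 1) / CACHELINE * CACHELINE;
    double *p = aligned_alloc(CACHELINE, bytes);
    assert(p == NULL || (uintptr_t)p % CACHELINE == 0);
    return p;
}

/* Textbook i-j-k order: the inner loop strides down a column of b. */
static void mm_naive(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
}

/* Reordered i-k-j: the inner loop walks b and c with unit stride.
   c must be zeroed by the caller. */
static void mm_reordered(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; i++)
        for (size_t k = 0; k < n; k++) {
            double aik = a[i * n + k];
            for (size_t j = 0; j < n; j++)
                c[i * n + j] += aik * b[k * n + j];
        }
}

/* Blocked i-k-j: works on BLOCK x BLOCK tiles so the data touched by the
   inner loops can stay in cache. c must be zeroed by the caller. */
static void mm_blocked(size_t n, const double *a, const double *b, double *c)
{
    for (size_t ii = 0; ii < n; ii += BLOCK)
        for (size_t kk = 0; kk < n; kk += BLOCK)
            for (size_t jj = 0; jj < n; jj += BLOCK) {
                /* Clamp tile ends so n need not be a multiple of BLOCK. */
                size_t i_end = ii + BLOCK < n ? ii + BLOCK : n;
                size_t k_end = kk + BLOCK < n ? kk + BLOCK : n;
                size_t j_end = jj + BLOCK < n ? jj + BLOCK : n;
                for (size_t i = ii; i < i_end; i++)
                    for (size_t k = kk; k < k_end; k++) {
                        double aik = a[i * n + k];
                        for (size_t j = jj; j < j_end; j++)
                            c[i * n + j] += aik * b[k * n + j];
                    }
            }
}
```

To try the misalignment exercise with this sketch, one option is to pass pointers offset by a few elements from the aligned base (for example alloc a slightly larger buffer and use p + 1) and compare timings against the aligned case. Memory returned by alloc_matrix is released with free.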
drmackay/samplematrixcode