-
Notifications
You must be signed in to change notification settings - Fork 908
Closed
Description
In the matrix multiplication code, noticed that the last dimension of looping, k, does not match the last dimension of memory access, j:
kactl/content/data-structures/Matrix.h
Lines 18 to 23 in ba85dcd
| M operator*(const M& m) const { | |
| M a; | |
| rep(i,0,N) rep(j,0,N) | |
| rep(k,0,N) a.d[i][j] += d[i][k]*m.d[k][j]; | |
| return a; | |
| } |
I believe this can be made more cache-friendly simply by swapping order of summation so that memory accesses are linear.
For example, here is a one-change that speeds up matrix multiplication by 50% locally:
rep(i,0,N) rep(j,0,N)
rep(k,0,N) a.d[i][k] += d[i][j] * m.d[j][k];Metadata
Metadata
Assignees
Labels
No labels