Improve BLAS Level 3 multi-threads performance on ICT Loongson 3A #47

xianyi opened this Issue Jul 25, 2011 · 0 comments



xianyi commented Jul 25, 2011

The average speedup of multi-threaded dgemm is only about 2.83x on 4 cores.
This needs optimization.
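
To put the 2.83x figure in perspective, here is a minimal sketch in C that turns a measured speedup into a parallel efficiency and into the serial fraction Amdahl's law would imply. The function names are hypothetical and are not part of the project. Reading the number through Amdahl's law is an assumption made only for illustration: the real loss could just as well come from memory bandwidth or shared-cache contention rather than serial code.

```c
#include <stdio.h>

/* Hypothetical helpers for illustration; not part of the library. */

/* Parallel efficiency: measured speedup divided by the number of cores. */
double parallel_efficiency(double speedup, int cores) {
    return speedup / (double)cores;
}

/* Amdahl's law gives S = 1 / (s + (1 - s) / n) for serial fraction s
 * on n cores. Solving for s yields s = (n / S - 1) / (n - 1).
 * Assumes cores > 1 and speedup > 0. */
double amdahl_serial_fraction(double speedup, int cores) {
    return ((double)cores / speedup - 1.0) / ((double)cores - 1.0);
}

/* Print both figures for one measurement. */
void report(double speedup, int cores) {
    printf("efficiency: %.1f%%, implied serial fraction: %.1f%%\n",
           100.0 * parallel_efficiency(speedup, cores),
           100.0 * amdahl_serial_fraction(speedup, cores));
}
```

Calling `report(2.83, 4)` shows an efficiency of roughly 71%, which under the Amdahl assumption corresponds to a serial fraction of roughly 14%.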

ghost assigned xianyi Jul 25, 2011

xianyi added a commit that referenced this issue Sep 5, 2011

Refs #47. On Loongson 3A, set DGEMM_R parameter depending on different number of threads. It would improve double precision BLAS3 on multi-threads.
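
The commit message does not show how the value is chosen, so the following is only a sketch of the general idea in C, not the code from the commit. It assumes that the goal is to keep each thread's packed panel of B inside its share of a cache that all threads compete for. The function `pick_gemm_r`, its parameters, the half-of-share budget, and the unroll width of 4 are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper, not taken from the referenced commit.
 * Picks a column-block size R for double-precision GEMM so that one
 * thread's Q x R panel of doubles fits in half of its share of a
 * cache shared by all threads. */
size_t pick_gemm_r(size_t shared_cache_bytes, size_t gemm_q, int threads) {
    const size_t unroll_n = 4;           /* assumed kernel unroll width */
    if (threads < 1) threads = 1;
    if (gemm_q == 0) return unroll_n;    /* avoid division by zero */

    /* Budget half of this thread's share; the rest is left for A and C. */
    size_t budget = shared_cache_bytes / (size_t)threads / 2;

    /* Number of columns whose Q doubles fit in the budget. */
    size_t r = budget / (gemm_q * sizeof(double));

    /* Round down to a multiple of the unroll width, with a floor. */
    r -= r % unroll_n;
    return r < unroll_n ? unroll_n : r;
}
```

With an assumed 4 MiB shared cache and `gemm_q` of 256, this sketch gives 1024 columns for one thread and 256 columns for four threads, so the block shrinks as more threads compete for the same cache.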

xianyi closed this Nov 3, 2011
