Further shortcuts for (small) cases that do not need buffer allocation by martin-frbg · Pull Request #3252 · OpenMathLib/OpenBLAS

martin-frbg · 2021-05-27T20:43:17Z

For problem sizes that are too small to benefit from multithreading, we can also skip the locking-intensive allocation and freeing of a temporary buffer if the data does not need compacting or sorting. This PR speeds up the single and double precision real versions of GER, SPR, SPR2, SYR2 and TRSV as well as the single and double precision complex versions of SYR and TRSV.
On x86_64, the same method should be applicable to SYMV, as none of the present kernels make use of the buffer array under these circumstances, but this does not hold for all architectures.
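
To illustrate the idea, here is a minimal sketch of such a shortcut for a GER-style rank-1 update. It is not the code from this PR: `ger_sketch`, `ger_kernel`, `SMALL_PROBLEM_LIMIT` and the `buffer_allocations` counter are hypothetical names, the cutoff value is arbitrary, plain `malloc` stands in for the library's locked buffer pool, and only positive strides are handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical cutoff (an assumption for this sketch); the real library
 * derives its own limit from build-time settings. */
#define SMALL_PROBLEM_LIMIT ((size_t)8192)

/* Counts how often the slow path allocated a temporary buffer. */
static int buffer_allocations = 0;

/* Rank-1 update on contiguous vectors: A += alpha * x * y^T (column-major). */
static void ger_kernel(size_t m, size_t n, double alpha,
                       const double *x, const double *y,
                       double *a, size_t lda)
{
    for (size_t j = 0; j < n; j++)
        for (size_t i = 0; i < m; i++)
            a[i + j * lda] += alpha * x[i] * y[j];
}

/* Returns 0 on success, -1 if the temporary buffer cannot be allocated.
 * Assumes incx > 0 and incy > 0 to keep the sketch short. */
int ger_sketch(size_t m, size_t n, double alpha,
               const double *x, ptrdiff_t incx,
               const double *y, ptrdiff_t incy,
               double *a, size_t lda)
{
    if (m == 0 || n == 0 || alpha == 0.0)
        return 0;

    /* Shortcut: with unit strides nothing needs compacting, and a small
     * problem would not be threaded anyway, so skip the buffer entirely. */
    if (incx == 1 && incy == 1 && m * n <= SMALL_PROBLEM_LIMIT) {
        ger_kernel(m, n, alpha, x, y, a, lda);
        return 0;
    }

    /* General path: pack the strided vectors into a temporary buffer. */
    double *buffer = malloc((m + n) * sizeof *buffer);
    if (buffer == NULL)
        return -1;
    buffer_allocations++;

    double *xp = buffer;
    double *yp = buffer + m;
    for (size_t i = 0; i < m; i++)
        xp[i] = x[(ptrdiff_t)i * incx];
    for (size_t j = 0; j < n; j++)
        yp[j] = y[(ptrdiff_t)j * incy];

    ger_kernel(m, n, alpha, xp, yp, a, lda);
    free(buffer);
    return 0;
}
```

The shortcut only pays off because the kernel never touches the buffer in the unit-stride case. Where a kernel does rely on it, as noted for SYMV on some architectures, the early exit would not be safe.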



martin-frbg added 2 commits May 27, 2021 22:39

Add shortcuts for (small) cases that do not need expensive buffer allocation

d6d7a66

Fix copy-paste errors in variables used

1217eb9

martin-frbg mentioned this pull request May 29, 2021

Functions are not guarded against early threading #1886

Closed

martin-frbg added 2 commits May 29, 2021 15:40

revert symv changes for now

734bd26

Add shortcuts for (small) cases that do not need expensive buffer allocation

f84197c

martin-frbg added this to the 0.3.16 milestone Jun 15, 2021

martin-frbg merged commit baf03a0 into OpenMathLib:develop Jun 15, 2021

martin-frbg mentioned this pull request Oct 20, 2021

segfault in memset via zgemv_n on Haswell #3419

Closed

martin-frbg added a commit that referenced this pull request Oct 20, 2021

Merge pull request #3420 from martin-frbg/issue3419

059d3a0

Revert wrong ZTRSV optimization from #3252

martin-frbg added a commit that referenced this pull request Oct 25, 2021

Merge pull request #3422 from martin-frbg/issue3421

03f1354

Revert invalid trsv shortcut from PR #3252

martin-frbg mentioned this pull request Dec 8, 2022

ZSYTRF yields wrong result when OpenBLAS is built using CMake #3856

Closed

martin-frbg merged 4 commits into OpenMathLib:develop from martin-frbg:more_shortcuts
