Optimized ?trtrs #2252
The quick exit on NOOP is missing from the code.
The AppVeyor failure is unrelated (and my fault), but I do not yet understand what happened to the Travis OSX builds.
This should fix it... I'm surprised it even got picked up; I thought extended precision was disabled.
@thrasibule extended precision is alive on *BSD; it is just that OSX is the only system of that kind in CI.
Unfortunately I am seeing a considerable number of test failures in the LAPACK EIG and LIN tests (available through
I think I've fixed it all. I'm still seeing some errors in lapack-test, but no more than before this PR.
Thank you. The remaining error in STFSM/CTFSM is indeed unrelated to the PR (probably caused by some inaccuracy in the corresponding trsm, and definitely not a recent regression).
closes #2251
I've written it on the same model as the optimized ?getrs implementation. The idea is to use ?trsv instead of ?trsm when the right-hand side has only one column. I've checked that this gives the expected speedup.
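For readers unfamiliar with the approach, here is a minimal, self-contained sketch of the dispatch idea. It is not the code from this PR: `sketch_trsv_upper`, `sketch_trsm_upper` and `sketch_trtrs_upper` are hypothetical names, the solvers are naive reference loops rather than OpenBLAS kernels, and only the upper-triangular, non-transposed, non-unit-diagonal, column-major case is covered. The quick return when `n` or `nrhs` is zero and the singularity check that returns a 1-based index are assumptions modelled on the usual LAPACK conventions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for ?trsv: solve U*x = b in place for one vector.
   U is n-by-n upper triangular, column-major, leading dimension lda. */
static void sketch_trsv_upper(int n, const double *a, int lda, double *x) {
    for (int j = n - 1; j >= 0; j--) {
        x[j] /= a[j + (size_t)j * lda];
        for (int i = 0; i < j; i++)
            x[i] -= x[j] * a[i + (size_t)j * lda];
    }
}

/* Hypothetical stand-in for ?trsm: solve U*X = B in place for nrhs columns. */
static void sketch_trsm_upper(int n, int nrhs, const double *a, int lda,
                              double *b, int ldb) {
    for (int k = n - 1; k >= 0; k--) {
        double d = a[k + (size_t)k * lda];
        for (int c = 0; c < nrhs; c++)
            b[k + (size_t)c * ldb] /= d;
        for (int i = 0; i < k; i++) {
            double f = a[i + (size_t)k * lda];
            for (int c = 0; c < nrhs; c++)
                b[i + (size_t)c * ldb] -= f * b[k + (size_t)c * ldb];
        }
    }
}

/* Dispatcher: returns 0 on success, or i+1 if the i-th diagonal entry
   (0-based) is exactly zero, in which case b is left untouched. */
int sketch_trtrs_upper(int n, int nrhs, const double *a, int lda,
                       double *b, int ldb) {
    assert(n >= 0 && nrhs >= 0 && lda >= n && ldb >= n);

    /* Quick exit on a no-op problem (assumed convention). */
    if (n == 0 || nrhs == 0) return 0;

    /* Check for singularity before touching b. */
    for (int i = 0; i < n; i++)
        if (a[i + (size_t)i * lda] == 0.0) return i + 1;

    if (nrhs == 1)
        sketch_trsv_upper(n, a, lda, b);            /* vector path */
    else
        sketch_trsm_upper(n, nrhs, a, lda, b, ldb); /* matrix path */
    return 0;
}
```

Both paths give the same result for a single column; the point of the split is only that a vector kernel avoids the packing and blocking overhead a matrix kernel typically carries.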
The only thing I'm not quite sure about, and that someone more familiar with the codebase could probably help me with, is these lines:
https://github.com/thrasibule/OpenBLAS/blob/trtrs/interface/lapack/trtrs.c#L142-L143.
I used the same parameters as in ?getrs
and here:
https://github.com/thrasibule/OpenBLAS/blob/trtrs/interface/lapack/trtrs.c#L165
Again, I've used the same parameters as ?getrs. These look like operation counts, so they should probably be half of that, but I'm not sure what these macros do.
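If those arguments are indeed operation counts (an assumption; what the macros actually expect is not confirmed here), a rough tally for a single right-hand side of length $n$ is consistent with the "half" guess. A non-unit triangular solve costs $n$ divisions plus $n(n-1)/2$ multiplications and as many subtractions, while ?getrs performs a unit-lower solve followed by an upper solve:

```latex
\begin{align*}
\text{?trtrs (one triangular solve):}\quad & n + 2\cdot\frac{n(n-1)}{2} = n^2 \\
\text{?getrs (unit-lower + upper solve):}\quad & n(n-1) + n^2 = 2n^2 - n \approx 2n^2
\end{align*}
```

For `nrhs` right-hand sides both counts scale by `nrhs`, so the ratio stays at roughly one half.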
This is a work in progress; I can reorganize the commits and maybe add some tests.