OpenBLAS uses the reference implementation of ?trtrs, which is quite slow when the right-hand side has only one column: it calls ?trsm by default, and ?trsm is much slower than ?trsv in that case. This comes up regularly and surprises people (see here for a NumPy discussion, or here for the same discovery in Julia). I noticed that ?getrs has an optimized implementation that uses a simple trick (?trsv when b is a column vector, ?trsm otherwise), so I propose doing the same for ?trtrs. Does that make sense?
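To make the proposed dispatch concrete, here is a rough pure-Python sketch. It is not OpenBLAS code: `trsv_lower`, `trsm_lower`, and `trtrs_lower` are hypothetical stand-ins for the lower-triangular, non-transposed, non-unit-diagonal case only, and it says nothing about performance. It only shows where the column-count check would sit, after the singularity check that ?trtrs performs on the diagonal.

```python
# Illustrative sketch only: plain-Python stand-ins for ?trsv / ?trsm and a
# ?trtrs-style driver. All function names here are hypothetical.

def trsv_lower(a, x):
    """Solve a @ y = x in place, for lower-triangular a and one vector x."""
    n = len(x)
    for i in range(n):
        s = x[i]
        for j in range(i):
            s -= a[i][j] * x[j]
        x[i] = s / a[i][i]
    return x


def trsm_lower(a, b):
    """Solve a @ y = b in place, for lower-triangular a and a matrix b."""
    n = len(b)
    nrhs = len(b[0]) if n else 0
    for i in range(n):
        for k in range(nrhs):
            s = b[i][k]
            for j in range(i):
                s -= a[i][j] * b[j][k]
            b[i][k] = s / a[i][i]
    return b


def trtrs_lower(a, b):
    """Return (info, kernel).

    info > 0 is the 1-based index of a zero diagonal entry (no solve is done);
    kernel names which stand-in routine handled the solve.
    """
    n = len(a)
    # Singularity check on the diagonal, done before any solve.
    for i in range(n):
        if a[i][i] == 0:
            return i + 1, None
    # Proposed trick: a single right-hand side goes to the vector kernel.
    if n and len(b[0]) == 1:
        col = [row[0] for row in b]
        trsv_lower(a, col)
        for i in range(n):
            b[i][0] = col[i]
        return 0, "trsv"
    trsm_lower(a, b)
    return 0, "trsm"
```

Calling `trtrs_lower(a, b)` with `b` given as a list of rows overwrites `b` with the solution, mirroring the in-place convention of the LAPACK routine. In a real implementation the single-column branch would pass the existing storage of `b` straight to ?trsv rather than copying it.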