-
Notifications
You must be signed in to change notification settings - Fork 152
Add direct calls to BLAS to compute SVDs #1259
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
fredrikekre
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe put these these in a separate file like BLAS.jl or something?
|
Thank you! The code looks generally fine to me but we still want to support Julia 1.6 here, which does not have |
|
I don't know how difficult it is to support both with and without |
|
A very simple change to merge this soon would be to just conditionally load @static if VERSION >= v"1.7"
include("blas.jl")
endand import |
|
Hi! Sorry, I missed the notifications! Yes, I am totally fine to allow this improvement only for Julia >= 1.7. I will make that change and submit a new commit. Thanks! |
|
Done @mateuszbaran ! |
mateuszbaran
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Could you just bump version to 1.9.5?
|
Sure! |
|
@mateuszbaran Done! |
This PR implements direct calls to BLAS library for computing SVDs, reducing the allocations if the input types are
SMatrixouMMatrixwith real numbers.The approach is the convert the input matrix to
MMatrix, call BLAS function, and convert it back to the input type. SinceMMatrixdoes not leave the function scope, the compiler is clever enough to avoid allocations.With this PR, we can call
svdvals,svd, andpinvwith 0 allocations.The current implementation of
svdandsvdvalsconverts the static matrix toMatrixand callLinearAlgebra.svd. Here, we are calling the same BLAS function as in LinearAlgebra stdlib. Hence, the performance gain here seems "free" except for maintaining new code inside StaticArrays.This PR does not add any noticeable delay when importing StaticArrays:
Before: 0.126202 seconds (253.52 k allocations: 13.714 MiB)
After: 0.128506 seconds (253.58 k allocations: 13.751 MiB)
Here are some benchmarks comparing the new and old implementation:
svdWe also have a performance boost when computing
pinvbecause it basically callssvd. For example, computingpinvin a 3 x 3 matrix takes 0.995 us in this commit versus 1.462 us inmaster.Closes #1255