WIP: Add specialized methods for inverses of small matrices #12454
function small_inv{T}(A::StridedMatrix{T})
    n = chksquare(A)
    d = det(A)
    if d == zero(T)
Is exact comparison with zero reasonable here?
Probably not. I thought about it, but I didn't know what would be reasonable given the generic T. eps(T) maybe?
I think it is fine to throw for exact zeros only. That is similar to what LAPACK does.
These functions actually assume commutativity for multiplication of the elements so the elements should probably be restricted to subtypes of |
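For illustration, the "throw for exact zeros only" behaviour discussed above can be sketched for the 2x2 case as follows. This is a minimal sketch in current Julia syntax, not the PR's code: the name `inv2x2_adjugate` is hypothetical, and restricting the element type to `Number` is an assumption made here so that element multiplication commutes.

```julia
using LinearAlgebra

# Hypothetical helper (not from the PR): closed-form inverse of a 2x2 matrix.
# Assumption: T<:Number, so that multiplication of the elements commutes.
function inv2x2_adjugate(A::AbstractMatrix{T}) where {T<:Number}
    size(A) == (2, 2) || throw(DimensionMismatch("expected a 2x2 matrix"))
    a, b, c, d = A[1, 1], A[1, 2], A[2, 1], A[2, 2]
    dt = a * d - b * c
    # Throw only when the determinant is exactly zero; no tolerance is applied.
    iszero(dt) && throw(SingularException(2))
    # Adjugate divided by the determinant.
    return [d -b; -c a] / dt
end
```

A nearly singular input does not throw here; it returns a result with large entries, which matches the exact-zero convention described in the thread.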
(force-pushed: 0070817 to adff6bf)
I restricted the types for when the |
Could someone restart the AppVeyor build for me?
You need to rebase and force-push here; you hit a build-number collision (appveyor/ci#353).
(force-pushed: adff6bf to af2360e)
Note that the timings here are with the hand-coded determinants in #12452. These might be unacceptable due to numerical instability. With only #12460, the timings in this PR are:
f(10^5, 2);
# 0.045274 seconds (700.00 k allocations: 41.199 MB, 8.82% gc time)
f(10^5, 3);
# 0.052139 seconds (700.00 k allocations: 51.880 MB, 6.88% gc time)
which is only about a factor of 2 faster.
This looks pretty reasonable. @andreasnoack, please review and merge if you think this is good to go.
There are two issues here.
Couldn't you hand-code a 2x2 LU, for example, to work around the numerical stability issue?
@mlubin Sure. I think that is the way to proceed, but the LU-based inverse is a little verbose. The 2x2 is okay, but the 3x3 case is quite messy and has six branches. You can try out
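To make the 2x2 LU idea concrete, here is a minimal sketch of a hand-coded 2x2 inverse with partial pivoting. It is not the code referred to in the thread: the name `inv2x2_lu` is hypothetical, it is written in current Julia syntax, and the `Number` restriction is an assumption of this sketch. The single branch on the pivot row is what grows to six branches in the 3x3 case.

```julia
using LinearAlgebra

# Hypothetical helper (not from the PR): 2x2 inverse via LU with partial pivoting.
function inv2x2_lu(A::AbstractMatrix{T}) where {T<:Number}
    size(A) == (2, 2) || throw(DimensionMismatch("expected a 2x2 matrix"))
    # Partial pivoting: use the row whose first entry is larger in magnitude.
    swap = abs(A[2, 1]) > abs(A[1, 1])
    p, q = swap ? (2, 1) : (1, 2)
    u11, u12 = A[p, 1], A[p, 2]
    iszero(u11) && throw(SingularException(1))
    l = A[q, 1] / u11            # multiplier stored in L
    u22 = A[q, 2] - l * u12      # remaining pivot of U
    iszero(u22) && throw(SingularException(2))
    # Entries of inv(U) * inv(L), obtained by back substitution.
    x22 = inv(u22)
    x21 = -l * x22
    x12 = -u12 * x22 / u11
    x11 = (1 - u12 * x21) / u11
    B = [x11 x12; x21 x22]
    # With P*A = L*U, inv(A) = inv(U)*inv(L)*P, so a row swap becomes a column swap.
    return swap ? B[:, [2, 1]] : B
end
```

For example, `inv2x2_lu([0.0 1.0; 2.0 3.0])` takes the pivoting branch, where a pivot-free elimination would divide by zero.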
Closing this because a user can easily add their own hand-coded inverses if they feel the default speed is not fast enough. It is better to provide numerically stable functions by default.
Same as #12452 but for inverses
The following benchmark shows the difference between master and PR (using #12452):
Master
PR