Faster linear algebra in \ #84
Comments
That should be: "if there are m nonzero diagonals below the main diagonal, then after a few starting rotations, an alternation of floor(m/2) and ceil(m/2) rotations zeroing the anti-diagonal elements can be done 'in parallel'." This would also reduce the calls to slnorm.
Hm, I'm not sure I understand, but maybe you mean row 1 eliminates row 2 while row 3 eliminates row 4 and row 5 eliminates row 6, and so on. Then, once the even rows are eliminated, row 1 eliminates row 3 while row 5 eliminates row 7, and so on. Could work. Though it probably needs multithreading in Julia (which is coming), and even then I don't have any experience with parallel code, so there could be memory issues. There's probably research on this.
I mean in the picture: first we zero A_{2,1} using A_{1,1}, then A_{3,1} is zeroed by the updated A_{1,1}. After that, the anti-diagonals can be zeroed simultaneously: A_{4,1} is zeroed by A_{1,1} while A_{3,2} is zeroed by A_{2,2}, and so on. But I guess this is similar to doing it vertically as you suggested.
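To make the ordering above concrete, here is a minimal Python sketch. It is not code from this package: the function names `antidiagonal_schedule` and `apply_schedule` are hypothetical, indices are 1-based to match the A_{i,j} notation, and the matrix is stored densely for simplicity. Entries with the same i + j form one group, and the rotations in a group touch disjoint row pairs, so in principle they could run concurrently.

```python
import math

def antidiagonal_schedule(n, m):
    """Group the in-band subdiagonal entries (i, j) of an n x n matrix with
    m nonzero subdiagonals by anti-diagonal i + j (1-based indices)."""
    groups = []
    for total in range(3, 2 * n):            # smallest anti-diagonal holds (2, 1)
        group = [(total - j, j) for j in range(1, n + 1)
                 if j < total - j <= min(n, j + m)]
        if group:
            groups.append(group)
    return groups

def apply_schedule(A, groups):
    """Zero each scheduled entry A[i][j] with a Givens rotation of rows j and i.
    The groups are processed in order; this sketch runs them sequentially."""
    A = [[float(v) for v in row] for row in A]
    n = len(A)
    for group in groups:
        for i, j in group:                   # row pairs within a group are disjoint
            a, b = A[j - 1][j - 1], A[i - 1][j - 1]
            r = math.hypot(a, b)
            if r == 0.0:
                continue                     # entry is already zero
            c, s = a / r, b / r
            for k in range(n):
                x, y = A[j - 1][k], A[i - 1][k]
                A[j - 1][k] = c * x + s * y
                A[i - 1][k] = -s * x + c * y
    return A
```

For example, `antidiagonal_schedule(8, 3)` starts with the single rotations for A_{2,1} and A_{3,1}, then pairs A_{4,1} with A_{3,2}, and after that the group sizes alternate between ceil(m/2) = 2 and floor(m/2) = 1.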
Is there a reason Givens rotations are used instead of Householder transformations?
I'm under the impression that they are better for banded solves, but I don't know why that would be.
The new implementation is much faster, so I'll close this issue. While it's possible that it can be made even faster, it might be at the point of diminishing returns.
I just did Givens by hand, which slows down too much for large bandwidth. I'm sure this could be sped up by a large factor, either by using LAPACK on sub-blocks, vectorization, or other tricks.
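For readers unfamiliar with the approach, the following is a rough Python sketch of what a hand-written Givens solve for a banded system might look like. It is an illustration only and not the implementation in this package: the name `givens_banded_solve`, the dense list-of-lists storage, and the bandwidth parameters `m` (subdiagonals) and `u` (superdiagonals) are all assumptions. Each of the roughly n·m rotations updates up to m + u + 1 columns, which is one plausible reason the cost grows quickly with bandwidth.

```python
import math

def givens_banded_solve(A, b, m, u):
    """Solve A x = b for an n x n matrix with m subdiagonals and u
    superdiagonals, using Givens rotations restricted to the band."""
    n = len(A)
    R = [[float(v) for v in row] for row in A]
    y = [float(v) for v in b]
    for j in range(n):
        hi = min(n, j + m + u + 1)           # fill-in reaches at most column j + m + u
        for i in range(j + 1, min(n, j + m + 1)):
            r = math.hypot(R[j][j], R[i][j])
            if r == 0.0:
                continue                     # nothing to eliminate
            c, s = R[j][j] / r, R[i][j] / r
            for k in range(j, hi):           # rotate rows j and i inside the band
                R[j][k], R[i][k] = (c * R[j][k] + s * R[i][k],
                                    -s * R[j][k] + c * R[i][k])
            y[j], y[i] = c * y[j] + s * y[i], -s * y[j] + c * y[i]
    # Back substitution on the upper-triangular factor (bandwidth m + u).
    x = [0.0] * n
    for j in reversed(range(n)):
        hi = min(n, j + m + u + 1)
        x[j] = (y[j] - sum(R[j][k] * x[k] for k in range(j + 1, hi))) / R[j][j]
    return x
```

The sketch assumes a nonsingular matrix and does no pivoting or blocking; the block-wise LAPACK and vectorization ideas mentioned above would replace the inner loop over `k`.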