Sort singular values #46
Changes from 12 commits
@@ -2,7 +2,9 @@

extern crate rulinalg;
extern crate test;
extern crate rand;

mod linalg {
    mod matrix;
    mod svd;
}
@@ -0,0 +1,29 @@
use test::Bencher;
use rand;
use rand::{Rng, SeedableRng};
use rulinalg::matrix::Matrix;

fn reproducible_random_matrix(rows: usize, cols: usize) -> Matrix<f64> {
    const STANDARD_SEED: [usize; 4] = [12, 2049, 4000, 33];
    let mut rng = rand::StdRng::from_seed(&STANDARD_SEED);
    let elements: Vec<_> = rng.gen_iter::<f64>().take(rows * cols).collect();
    Matrix::new(rows, cols, elements)
}

#[bench]
fn svd_10_10(b: &mut Bencher) {
    let mat = reproducible_random_matrix(10, 10);

    b.iter(|| mat.clone().svd())
}

#[bench]
fn svd_100_100(b: &mut Bencher) {
    let mat = reproducible_random_matrix(100, 100);

    b.iter(|| mat.clone().svd())
}
@@ -0,0 +1,24 @@
use libnum::Float;
use std::f32;
use std::f64;

/// Expose the machine epsilon of floating point numbers.
/// This trait should only need to exist for a short time,
/// until the Float trait from the Num crate has the same
/// capabilities.
pub trait MachineEpsilon: Float {
    /// Returns the machine epsilon for the given Float type.
    fn epsilon() -> Self;
}

impl MachineEpsilon for f32 {
    fn epsilon() -> f32 {
        f32::EPSILON
    }
}

impl MachineEpsilon for f64 {
    fn epsilon() -> f64 {
        f64::EPSILON
    }
}
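Later in this PR, the convergence test in the SVD loop is changed from a `T::min_positive_value()` threshold to `eps * scale`, with `eps = 3 * T::epsilon()`. A standalone sketch (plain `std` Rust, not rulinalg code; the variable names are illustrative) of why the scaled-epsilon threshold is the right one:

```rust
fn main() {
    // A super-diagonal entry that is "numerically zero" relative to its
    // neighbouring diagonal entries.
    let b_sup_diag = 1e-20_f64;
    let diag_scale = 1.0_f64; // stands in for |b[i,i]| + |b[i+1,i+1]|

    // f64::MIN_POSITIVE (~2.2e-308) is only undercut by exact zeros and
    // subnormals, so the old threshold essentially never triggers.
    assert!(!(b_sup_diag <= f64::MIN_POSITIVE * diag_scale));

    // The PR's threshold scales machine epsilon (~2.2e-16) instead,
    // so the entry is correctly treated as converged.
    let eps = 3.0 * f64::EPSILON;
    assert!(b_sup_diag <= eps * diag_scale);
}
```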
@@ -23,6 +23,7 @@ use error::{Error, ErrorKind};

use libnum::{One, Zero, Float, Signed};
use libnum::{cast, abs};
use epsilon::MachineEpsilon;

impl<T: Any + Float> Matrix<T> {
    /// Cholesky decomposition

@@ -305,18 +306,119 @@ impl<T: Any + Float> Matrix<T> {
    }
}
impl<T: Any + Float + Signed> Matrix<T> {

/// Ensures that all singular values in the given singular value decomposition
/// are non-negative, making the necessary corrections to the singular vectors.
///
/// The SVD is represented by matrices `(b, u, v)`, where `b` is the diagonal matrix
/// containing the singular values, `u` is the matrix of left singular vectors
/// and `v` is the matrix of right singular vectors.
fn correct_svd_signs<T>(mut b: Matrix<T>, mut u: Matrix<T>, mut v: Matrix<T>)
    -> (Matrix<T>, Matrix<T>, Matrix<T>) where T: Any + Float + Signed {

    // When correcting the signs of the singular vectors, we can choose
    // to correct EITHER u or v. We make the choice depending on which matrix
    // has the fewest rows. Later we will need to multiply all elements in
    // columns by -1, which might be significantly faster in corner cases
    // if we pick the matrix with the fewest rows.
    {
        let shortest_matrix = if u.rows() <= v.rows() { &mut u } else { &mut v };
        let column_length = shortest_matrix.rows();
        let num_singular_values = cmp::min(b.rows(), b.cols());

        for i in 0 .. num_singular_values {
            if b[[i, i]] < T::zero() {
                // Swap sign of singular value and corresponding column
                b[[i, i]] = b[[i, i]].abs();

                // Access the column as a slice and flip its sign
                let mut column = shortest_matrix.sub_slice_mut([0, i], column_length, 1);
                column *= -T::one();
            }
        }
    }
    (b, u, v)
}

[Review thread]
- I believe […]
- You are right. I wrote this function when I thought […]
- I agree, it makes sense to keep it in whether we need it or not.

[Review thread]
- It's very minor but we can remove some bound checks here by using the […]
- I considered this, but I came to the conclusion that the effect here would presumably be completely negligible, in which case I think going […]
- That sounds sensible - let's keep it as the safe bound-checked version.
fn sort_svd<T>(mut b: Matrix<T>, mut u: Matrix<T>, mut v: Matrix<T>)
    -> (Matrix<T>, Matrix<T>, Matrix<T>) where T: Any + Float + Signed {

    assert!(u.cols() == b.cols() && b.cols() == v.cols());

    // This unfortunately incurs two allocations, since we have no (simple)
    // way to iterate over a matrix diagonal, only to copy it into a new Vector.
    let mut indexed_sorted_values: Vec<_> = b.diag().into_vec()
                                             .into_iter()
                                             .enumerate()
                                             .collect();

    // Sorting a vector of indices simultaneously with the singular values
    // gives us a mapping between old and new (final) column indices.
    indexed_sorted_values.sort_by(|&(_, ref x), &(_, ref y)|
        x.partial_cmp(y)
         .expect("All singular values should be finite, and thus sortable.")
         .reverse()
    );

    // Set the diagonal elements of the singular value matrix
    for (i, &(_, value)) in indexed_sorted_values.iter().enumerate() {
        b[[i, i]] = value;
    }

    // Assuming N columns, the simultaneous sorting of indices and singular values yields
    // a set of N (i, j) pairs which correspond to columns that must be swapped. However,
    // for any (i, j) in this set, there is also (j, i). Keeping both of these would make us
    // swap the columns back and forth, so we must remove the duplicates. We can avoid
    // any further sorting, hash sets or similar by noting that we can simply
    // remove any (i, j) for which j >= i. This also removes (i, i) pairs,
    // i.e. columns that don't need to be swapped.
    let swappable_pairs = indexed_sorted_values.into_iter()
        .enumerate()
        .map(|(new_index, (old_index, _))| (old_index, new_index))
        .filter(|&(old_index, new_index)| old_index < new_index);

    for (old_index, new_index) in swappable_pairs {
        u.swap_cols(old_index, new_index);
        v.swap_cols(old_index, new_index);
    }

    (b, u, v)
}

[Review thread]
- Would this panic be user error or an algorithmic failure? (Or both I guess...) I'm wondering whether it is better to propagate this by returning a […]
- My reasoning for a panic was this: We are using the successful result of […]
- Yes that does make sense, just wanted to check!
impl<T: Any + Float + Signed + MachineEpsilon> Matrix<T> {
    /// Singular Value Decomposition
    ///
    /// Computes the SVD using the Golub-Reinsch algorithm.
    ///
    /// Returns Σ, U, V, such that `self` = U Σ V<sup>T</sup>. Σ is a diagonal matrix whose elements
    /// correspond to the non-negative singular values of the matrix. The singular values are ordered in
    /// non-increasing order. U and V have orthonormal columns, and each column represents the
    /// left and right singular vectors for the corresponding singular value in Σ, respectively.
    ///
    /// Denoting the dimensions of `self` as M x N (rows x cols), the dimensions of the returned
    /// matrices are as follows:
    ///
    /// - `Σ`: N x N
    /// - `U`: M x N
    /// - `V`: N x N
    ///
    /// # Failures
    ///
    /// This function may fail in some cases. The current decomposition, whilst being
    /// efficient, is fairly basic. Hopefully the algorithm can be made not to fail in the near future.
    pub fn svd(self) -> Result<(Matrix<T>, Matrix<T>, Matrix<T>), Error> {
        let (b, u, v) = try!(self.svd_unordered());
        Ok(sort_svd(b, u, v))
    }

[Review thread]
- I think that there is a minor mistake here in the dimensions listed below. I think that […]
- Oh! Yeah, you're completely right. We transpose the matrix before applying […]
- Great, thanks! You may want to verify that I haven't got the dimensions I gave wrong. I only skimmed the source without checking the output (I'll have some time to check them properly later today if you can't).
- Yeah, I checked it by hand, as well as in the code. I think you are correct (easy to confuse). In the end I chose the verbose way of documenting it, as I found that to be clearer. Let me know if you want it differently phrased or so.
- I think the verbose way is much better - it is pretty complex so taking more space to explain is sensible. All looks good now so I'll merge. Thanks again for your work on this!
    fn svd_unordered(self) -> Result<(Matrix<T>, Matrix<T>, Matrix<T>), Error> {
        let (b, u, v) = try!(self.svd_golub_reinsch());

        // The Golub-Reinsch implementation sometimes spits out negative singular values,
        // so we need to correct these.
        Ok(correct_svd_signs(b, u, v))
    }

    fn svd_golub_reinsch(mut self) -> Result<(Matrix<T>, Matrix<T>, Matrix<T>), Error> {
        let mut flipped = false;

        // The algorithm assumes rows > cols. If this is not the case we transpose and fix later.

@@ -325,6 +427,7 @@ impl<T: Any + Float + Signed> Matrix<T> {
            flipped = true;
        }

        let eps = T::from(3.0).unwrap() * T::epsilon();
        let n = self.cols;

        // Get the bidiagonal decomposition

@@ -346,8 +449,7 @@ impl<T: Any + Float + Signed> Matrix<T> {
        unsafe {
            b_ii = *b.get_unchecked([i, i]);
            b_sup_diag = b.get_unchecked([i, i + 1]).abs();
            diag_abs_sum = eps * (b_ii.abs() + b.get_unchecked([i + 1, i + 1]).abs());
        }
        if b_sup_diag <= diag_abs_sum {
            // Adjust q or p to define boundaries of sup-diagonal box

@@ -382,7 +484,7 @@ impl<T: Any + Float + Signed> Matrix<T> {
            b_sup_diag = *b.get_unchecked([i, i + 1]);
        }

        if b_ii.abs() < eps {
            let (c, s) = Matrix::<T>::givens_rot(b_ii, b_sup_diag);
            let givens = Matrix::new(2, 2, vec![c, s, -s, c]);
            let b_i = MatrixSliceMut::from_matrix(&mut b, [i, i], 1, 2);
@@ -1063,6 +1165,7 @@ impl<T> Matrix<T> where T: Any + Copy + One + Zero + Neg<Output=T> +
mod tests {
    use matrix::{Matrix, BaseMatrix};
    use vector::Vector;
    use super::sort_svd;

    fn validate_bidiag(mat: &Matrix<f64>,
                       b: &Matrix<f64>,

@@ -1123,6 +1226,8 @@ mod tests {
        for (idx, row) in b.iter_rows().enumerate() {
            assert!(!row.iter().take(idx).any(|&x| x > 1e-10));
            assert!(!row.iter().skip(idx + 1).any(|&x| x > 1e-10));
            // Assert non-negativity of diagonal elements
            assert!(row[idx] >= 0.0);
        }

        let recovered = u * b * v.transpose();

@@ -1134,36 +1239,119 @@ mod tests {
            .iter()
            .zip(recovered.data().iter())
            .any(|(&x, &y)| (x - y).abs() > 1e-10));
        // The transposition is due to the fact that there do not exist
        // any column iterators at the moment, and we need to simultaneously iterate
        // over the columns. Once they exist, we should rewrite
        // the iterators below to use iter_cols() or similar instead.
        let ref u_transposed = u.transpose();
        let ref v_transposed = v.transpose();
        let ref mat_transposed = mat.transpose();

        let mut singular_triplets = u_transposed.iter_rows()
            .zip(b.diag().into_iter())
            .zip(v_transposed.iter_rows())
            // Chained zipping results in a nested tuple. Flatten it.
            .map(|((u_col, singular_value), v_col)|
                (Vector::new(u_col), singular_value, Vector::new(v_col)));

        assert!(singular_triplets.by_ref()
            // For a matrix M, each singular value σ and its left and right singular
            // vectors u and v satisfy M v = σ u, so we take the difference
            .map(|(ref u, sigma, ref v)| mat * v - u * sigma)
            .flat_map(|v| v.into_vec().into_iter())
            .all(|x| x.abs() < 1e-10));

        assert!(singular_triplets.by_ref()
            // For a matrix M, each singular value σ and its left and right singular
            // vectors u and v satisfy M^T u = σ v, so we take the difference
            .map(|(ref u, sigma, ref v)| mat_transposed * u - v * sigma)
            .flat_map(|v| v.into_vec().into_iter())
            .all(|x| x.abs() < 1e-10));
    }
    #[test]
    fn test_sort_svd() {
        let u = Matrix::new(2, 3, vec![1.0, 2.0, 3.0,
                                       4.0, 5.0, 6.0]);
        let b = Matrix::new(3, 3, vec![4.0, 0.0, 0.0,
                                       0.0, 8.0, 0.0,
                                       0.0, 0.0, 2.0]);
        let v = Matrix::new(3, 3, vec![21.0, 22.0, 23.0,
                                       24.0, 25.0, 26.0,
                                       27.0, 28.0, 29.0]);
        let (b, u, v) = sort_svd(b, u, v);

        assert_eq!(b.data(), &vec![8.0, 0.0, 0.0,
                                   0.0, 4.0, 0.0,
                                   0.0, 0.0, 2.0]);
        assert_eq!(u.data(), &vec![2.0, 1.0, 3.0,
                                   5.0, 4.0, 6.0]);
        assert_eq!(v.data(), &vec![22.0, 21.0, 23.0,
                                   25.0, 24.0, 26.0,
                                   28.0, 27.0, 29.0]);
    }
    #[test]
    fn test_svd_tall_matrix() {
        // Note: This matrix is not arbitrary. It has been constructed specifically so that
        // the "natural" order of the singular values is not sorted by default.
        let mat = Matrix::new(5, 4,
            vec![ 3.61833700244349288, -3.28382346228211697,  1.97968027781346501, -0.41869628192662156,
                  3.96046289599926427,  0.70730060716580723, -2.80552479438772817, -1.45283286109873933,
                  1.44435028724617442,  1.27749196276785826, -1.09858397535426366, -0.03159619816434689,
                  1.13455445826500667,  0.81521390274755756,  3.99123446373437263, -2.83025703359666192,
                 -3.30895752093770579, -0.04979044289857298,  3.03248594516832792,  3.85962479743330977]);
        let (b, u, v) = mat.clone().svd().unwrap();

        let expected_values = vec![8.0, 6.0, 4.0, 2.0];

        validate_svd(&mat, &b, &u, &v);

        // Assert the singular values are what we expect
        assert!(expected_values.iter()
            .zip(b.diag().data().iter())
            .all(|(expected, actual)| (expected - actual).abs() < 1e-14));
    }
    #[test]
    fn test_svd_short_matrix() {
        // Note: This matrix is not arbitrary. It has been constructed specifically so that
        // the "natural" order of the singular values is not sorted by default.
        let mat = Matrix::new(4, 5,
            vec![ 3.61833700244349288,  3.96046289599926427,  1.44435028724617442,  1.13455445826500645, -3.30895752093770579,
                 -3.28382346228211697,  0.70730060716580723,  1.27749196276785826,  0.81521390274755756, -0.04979044289857298,
                  1.97968027781346545, -2.80552479438772817, -1.09858397535426366,  3.99123446373437263,  3.03248594516832792,
                 -0.41869628192662156, -1.45283286109873933, -0.03159619816434689, -2.83025703359666192,  3.85962479743330977]);
        let (b, u, v) = mat.clone().svd().unwrap();

        let expected_values = vec![8.0, 6.0, 4.0, 2.0];

        validate_svd(&mat, &b, &u, &v);

        // Assert the singular values are what we expect
        assert!(expected_values.iter()
            .zip(b.diag().data().iter())
            .all(|(expected, actual)| (expected - actual).abs() < 1e-14));
    }
    #[test]
    fn test_svd_square_matrix() {
        let mat = Matrix::new(5, 5,
            vec![1.0, 2.0, 3.0,  4.0, 5.0,
                 2.0, 4.0, 1.0,  2.0, 1.0,
                 3.0, 1.0, 7.0,  1.0, 1.0,
                 4.0, 2.0, 1.0, -1.0, 3.0,
                 5.0, 1.0, 1.0,  3.0, 2.0]);

        let expected_values = vec![12.1739747429271112, 5.2681047320525831, 4.4942269799769843,
                                    2.9279675877385123, 2.8758200827412224];

        let (b, u, v) = mat.clone().svd().unwrap();
        validate_svd(&mat, &b, &u, &v);

        // Assert the singular values are what we expect
        assert!(expected_values.iter()
            .zip(b.diag().data().iter())
            .all(|(expected, actual)| (expected - actual).abs() < 1e-12));
    }

    #[test]
[Review thread]
- Maybe I'm just tired - but don't we want to multiply the rows of `v` by `-1` instead of the columns? `b * v` is basically scaling the rows of `v` by the respective singular values? If this is true then it becomes a little trickier to determine the faster approach. It will probably almost always be faster to flip the sign on the row due to cache access and vectorization.

  Actually I just realised that we transpose `v` at the end, so this is correct. I'm leaving this comment to highlight the fact that we may want to modify the algorithm so that we are operating on rows (not in this PR of course).
- It's also possible to collect the indices of all columns that need to be flipped, and then iterate over each row, flipping the sign of each element as we go. It's also possible to do the same for the sorting, which would likely be faster. However, the implementation would be slightly more complex, and since the overhead seems to be more or less negligible at this point, I opted for the simpler approach of just dealing with columns directly. As you say, we may want to revisit this in the future!
- Sounds good to me!
[Review thread]
- Although I just realized that a `100 x 100` matrix (which I used in my benchmarks) would still fit in L2 cache. Guess we should do some tests for matrices so big that they also don't fit in L3? Perhaps the results would be significantly different then...
- Let's try that too. In particular the case n x m where n >> m, e.g. 10000 x 10. This is a bit more representative of cases we may see in rusty-machine.
- Will do that next chance I get. Thinking a little about it though, I suspect that we have the following performance characteristics. First, consider that if `m >= n`, SVD is usually something like `O(m n^2)` flops or so, and I'm not sure about the memory accesses of Golub-Reinsch. The ordering of the singular vectors is roughly `O(n^2)` memory accesses in the worst case. Based on this and our current benchmarks, I would expect the following: […] Looking forward to trying the benchmarks out!

  By the way, you may find the following interesting. I spent Fall 2014 and Spring 2015 at UC Berkeley. Among other things, I took some courses under Prof. James Demmel. Prof. Demmel is a leading figure in the research of what they call "communication-avoiding algorithms". In essence, the goal of these algorithms is to minimize not just the number of flops required for an algorithm: they also use a simple mathematical model of cache behavior, so that you can develop algorithms that provably minimize (again in a big-O sense) the amount of communication, i.e. the number of expensive memory accesses. I mention this because I remember that at some point Prof. Demmel referred me to certain papers on these algorithms. Their publications can be found here. I think in particular this survey paper may be of interest. It's fairly up to date (published in 2014) and discusses the communication optimality of various algorithms in linear algebra.

  However, I want again to stress that we should ensure correctness of the algorithms in `rulinalg` (by extensive testing against expected error bounds etc.) before this should become a topic. Still, thought I'd mention it, because I think there's a lot of interesting stuff in there :)
- I must have missed this comment before. Thank you for pointing out those papers! They look like really interesting reads - though at a first glance pretty challenging. The survey paper looks like a really good start; it looks like I could learn a lot from it.