Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add pinv function (Moore-Penrose Pseudo-inverse) #299

Draft
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

emiddleton
Copy link

@emiddleton emiddleton commented Jun 1, 2021

This is my work in progress adding Moore-Penrose Pseudo-inverse of a Matrices #292. I have added all the tests suggested in @jturner314 #292 (comment) but it still needs more documentation and the tests could do with some cleanup. I wrote some functions for creating various ranked matrices but I think there might be a better way to do this. I added a rank function to help create random matrices of various ranks.

@emiddleton emiddleton changed the title WIP: pinv implementation with tests Add pinv function (Moore-Penrose Pseudo-inverse of a Matrices) Jun 1, 2021
@emiddleton emiddleton changed the title Add pinv function (Moore-Penrose Pseudo-inverse of a Matrices) Add pinv function (Moore-Penrose Pseudo-inverse) Jun 1, 2021
@jturner314
Copy link
Member

I haven't had a chance to review the code, but I thought I'd respond to this portion of your comment:

I wrote some functions for creating various ranked matrices but I think there might be a better way to do this. I added a rank function to help create random matrices of various ranks.

Probably the simplest approach is to take advantage of the properties of the rank of matrix products and the fact that matrices generated with ndarray_linalg::generate::random are almost always full-rank. So, given two random matrices of shape m × r and r × n, where r <= min(m, n), their product will almost always have rank r.

use ndarray::prelude::*;
use ndarray::linalg::general_mat_mul;
use ndarray_linalg::{Scalar, generate::random};

/// Returns an array with the specified shape and rank.
///
/// # Panics
///
/// Panics if the rank is impossible to achieve for the given shape, i.e. if
/// it's less than the minimum of the number of rows and number of columns.
fn random_with_rank<A, Sh>(shape: Sh, rank: usize) -> Array2<A>
where
    A: Scalar,
    Sh: ShapeBuilder<Dim = Ix2>,
{
    let mut out = Array2::zeros(shape);
    assert!(rank <= usize::min(out.nrows(), out.ncols()));
    for _ in 0..10 {
        let left: Array2<A> = random([out.nrows(), rank]);
        let right: Array2<A> = random([rank, out.ncols()]);
        general_mat_mul(A::one(), &left, &right, A::zero(), &mut out);
        if let Ok(out_rank) = out.rank() {
            if out_rank == rank {
                return out;
            }
        }
    }
    unreachable!("Failed to generate random matrix of desired rank within 10 tries. This is very unlikely.");
}

This implementation uses general_mat_mul to write the result of the matrix multiplication into out, so that we have easy control over its layout.

@codecov
Copy link

codecov bot commented Jun 2, 2021

Codecov Report

Merging #299 (970232c) into master (082f01d) will increase coverage by 0.17%.
The diff coverage is 97.43%.

❗ Current head 970232c differs from pull request most recent head 942b7d7. Consider uploading reports for the commit 942b7d7 to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master     #299      +/-   ##
==========================================
+ Coverage   89.01%   89.19%   +0.17%     
==========================================
  Files          71       75       +4     
  Lines        3578     3656      +78     
==========================================
+ Hits         3185     3261      +76     
- Misses        393      395       +2     
Impacted Files Coverage Δ
ndarray-linalg/src/pinv.rs 93.10% <93.10%> (ø)
ndarray-linalg/src/rank.rs 100.00% <100.00%> (ø)
ndarray-linalg/tests/pinv.rs 100.00% <100.00%> (ø)
ndarray-linalg/tests/rank.rs 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 082f01d...942b7d7. Read the comment docs.

@benkay86
Copy link
Contributor

@emiddleton, I had the same thought, but you beat me to it! I wrote fully-tested implementation here, but had not yet covered the in-place use case.

The main difference between our implementations is that my algorithm attempts QR decomposition first and then falls back to singular value decomposition. My reasoning is that the most common use case for computing the Moore-Penrose pseudoinverse is performing least-squares regression, where we would like to know the pseudoinverse of the design matrix x. Most of the time x will have full column rank. When x is full rank, QR decomposition is about 4 times faster than singular value decomposition with dgesvd() and at least twice as fast as degsdd(). There is a separate pinv_svd() method when the caller believes the matrix is rank-deficient.

What are your thoughts?

@benkay86 benkay86 mentioned this pull request Jul 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants