ENH: Add wrapper for ?gejsv #11354

Closed · mdhaber opened this issue Jan 13, 2020 · 12 comments · Fixed by #11664
Labels: scipy.linalg, enhancement (A new feature or improvement)
Milestone: 1.5.0

@mdhaber (Contributor) commented Jan 13, 2020

See NumFOCUS Small Development Grant Proposal 2019 Round 3 proposal for background.

Add wrapper for ?gejsv.

Suggested signature:

! sva, u, v, work, iwork, info = gejsv(a, lwork=max(2*M+N, 4*N+N*N, 2*N+N*N+6), joba="C", jobu="U", jobv="V", jobr="R", jobt="N", jobp="N", overwrite_a=0)

Test idea (a rough code sketch follows the list):

  • Generate a random A.
  • Perform the SVD with ?gejsv.
  • Compare A against u @ SIGMA @ v.T, where SIGMA is np.diag(work[1]/work[0]*sva[:N]) (I think; see the second thought below).
  • Compare u.T @ u against the identity, e.g. np.testing.assert_allclose(u.T @ u, np.eye(u.shape[1])), with appropriate tolerances.
  • Compare v @ v.T against the identity.
  • Check info == 0.
  • Check iwork[0] == min(m, n).
  • Check iwork[1] == min(m, n).
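
A rough sketch of this basic test, purely hypothetical: it assumes the wrapper ends up exposed as scipy.linalg.lapack.dgejsv and returns sva, u, v, work, iwork, info in the order of the suggested signature.

import numpy as np
from numpy.testing import assert_allclose
from scipy.linalg import lapack  # dgejsv assumed available once the wrapper exists

m, n = 10, 6                          # tall matrix; vary the shape per thought 3 below
A = np.random.rand(m, n)
sva, u, v, work, iwork, info = lapack.dgejsv(A)

SIGMA = np.diag(work[1] / work[0] * sva[:n])              # scaling per thought 2 below
assert_allclose(A, u[:, :n] @ SIGMA @ v.T, atol=1e-13)    # u[:, :n] in case a full U is returned
assert_allclose(u.T @ u, np.eye(u.shape[1]), atol=1e-13)  # u has orthonormal columns
assert_allclose(v @ v.T, np.eye(n), atol=1e-13)           # v is orthogonal
assert info == 0
assert iwork[0] == min(m, n) and iwork[1] == min(m, n)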

Thoughts:

  1. Computing the optimal LWORK looks quite complicated. To do it properly, it looks like we would need to create six other lwork routines (DGEQP3, DGEQRF, DPOCON, DGELQF, DGEQRF, DORMLQ) in order to write a ?gejsv_lwork routine. That is outside the scope of what I intended for this project, and I think it can be left to future work. For now, the default value shown above is sufficient for any set of options.
  2. The documentation about the scaling of the singular values in SVA is confusing. In the description of SVA: "The singular values of A are (WORK(1)/WORK(2)) * SVA(1:N)". In the description of WORK: "SCALE = WORK(2) / WORK(1) is the scaling factor such that SCALE*SVA(1:N) are the computed singular values of A." These seem contradictory.
  3. We should test with M < N, M = N, and M > N.
  4. We might test with non-full-row/column-rank matrices. In that case, the routine should succeed, but I think iwork[0] and iwork[1] will now be the rank of the matrix. One way to generate such a matrix is to compute the SVD of a random matrix, zero k of the singular values, and multiply the factors back together.
  5. There are 6x4x4x2x2x2 = 768 combinations of job? options. I don't think we need to test every combination. I suggest we just vary one thing at a time from the defaults.
  6. For joba, I don't think we need to verify that the different options are doing what they say they will; we should just check that the basic test above passes with every option.
  7. For jobu, I think we can test options U and F with the basic test above. I'll describe tests with options W and N below.
  8. For jobv, we can test option V with the basic test above. I don't want to dig into what option J means; maybe just test that the code runs without error. I'll describe tests with options W and N below.
  9. For jobr, test with both options N and R normally.
  10. For jobt, test with option N. I'll describe a test with option T below.
  11. For jobp, test with options P and N normally.
  12. I'd suggest passing an invalid value for each job? option and confirming that the returned info indicates which argument is invalid (a rough sketch follows this list).
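
For thought 12, a hypothetical check (reusing A from the sketch above, and assuming an invalid job character comes back as a negative info rather than raising):

sva, u, v, work, iwork, info = lapack.dgejsv(A, joba='X')  # 'X' is not a valid joba
assert info < 0  # a negative info should point at the offending argument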

Test with jobu='N' and jobv ='N': since only the singular values are generated, we can't reproduce the original matrix from the factors or test u or v for orthogonality. Just check that the singular values are correct.
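
One hypothetical form of that check, reusing A and n from the sketch above and comparing against scipy.linalg.svdvals (this assumes the scaling from thought 2 and that sva comes back sorted in decreasing order):

from scipy.linalg import svdvals

sva, u, v, work, iwork, info = lapack.dgejsv(A, jobu='N', jobv='N')
assert_allclose(work[1] / work[0] * sva[:n], svdvals(A), atol=1e-13)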

Test with jobt='T', jobu='W', jobv='V', and m==n: do we need a special test for this? Originally I thought we might need to do something special; now I think we might be able to throw these options in with the regular tests above.

We could get fancy and design special tests with numerically non-full-(column/row) rank matrices to confirm that this routine is doing the things it claims, but I don't think that's necessary.

Maybe make sure there is reasonable behavior when:

  • A is not 2D?
  • A is 1 x 1
  • A is empty
  • lwork is zero

Also: implement all examples from the NAG manual as additional tests.

mdhaber added the scipy.linalg and enhancement (A new feature or improvement) labels on Jan 13, 2020
mdhaber added this to the 1.5.0 milestone on Jan 13, 2020
@mdhaber (Contributor, Author) commented Jan 13, 2020

@swallan

@swallan (Contributor) commented Feb 12, 2020

Here's the branch that I started with the tests for this routine.

@mdhaber, you commented

We could get fancy and design special tests with numerically non-full-(column/row) rank matrices to confirm that this routine is doing the things it claims, but I don't think that's necessary.

Is this in reference to point number 4?

  4. We might test with non-full-row/column-rank matrices [...]

I assumed that reasonable behavior for a malformed input matrix would be a nonzero info pointing toward the A parameter, and that invalid job? values would also result in a nonzero info, but I can update the tests to expect the specific index of the job? argument if that is what info should return.

Additionally, I could only find the NAG page for the real version of this function. I didn't see a page or any examples for the complex version, so there is only one NAG test.

@mdhaber (Contributor, Author) commented Feb 12, 2020

No, those were separate ideas.

I wrote that because, according to the documentation:

DGEJSV can sometimes compute tiny singular values and their singular vectors much
more accurately than other SVD routines, see below under Further Details.

What I meant by the first comment was

We could get fancy and design special tests with very tiny singular values to confirm that this routine achieves better accuracy than other SVD routines, but I don't think that's necessary.

The second comment is different. There, the idea is:

  • to try the routine with matrices whose singular values are exactly zero (not just tiny), and
  • to check that iwork[0] and iwork[1] correctly report the rank of the matrix.

(The rank of the matrix is min(m, n) minus the number of zero singular values. If we create a matrix with some zero singular values, then we know what the rank of the matrix is; we're checking that iwork correctly reports that.)

@mdhaber (Contributor, Author) commented Feb 13, 2020

Here's a little code that generates a random matrix, performs an SVD, zeros some of the singular values, creates a new matrix from the modified SVD, and checks that the rank of the new matrix is what we want it to be. It's just a way of creating a random matrix with a given rank. There are simpler ways (e.g. zeroing some of the rows or setting rows to linear combinations of other rows), but in some sense the resulting matrices are not as "random" as the ones this approach produces.

import numpy as np
from scipy.linalg import svd

m = 10
n = 15
k = 8  # desired rank
A = np.random.rand(m, n)
U, s, Vh = svd(A, full_matrices=False)
# zero out min(m, n) - k randomly chosen singular values
s[np.random.permutation(min(m, n))[k:]] = 0
# rebuild the matrix; U * s scales the columns of U by the singular values
A2 = U * s @ Vh
print(np.linalg.matrix_rank(A2) == k)
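
A hypothetical continuation of this snippet, assuming the dgejsv wrapper from the suggested signature is available (the transpose is only there to pass a tall matrix):

from scipy.linalg import lapack  # dgejsv assumed available once the wrapper exists

sva, u, v, work, iwork, info = lapack.dgejsv(A2.T)  # 15 x 10, still rank k
assert info == 0
assert iwork[0] == k and iwork[1] == k  # reported rank, per thought 4 in the issue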

@mdhaber (Contributor, Author) commented Feb 13, 2020

I assumed that reasonable behavior for a malformed input matrix would be a nonzero info pointing toward the A parameter, and that invalid job? values would also result in a nonzero info, but I can update the tests to expect the specific index of the job? argument if that is what info should return.

I'm confused. Is that a separate question? Could you rephrase / expand?

@mdhaber (Contributor, Author) commented Feb 13, 2020

Additionally, I could only find the NAG page for the real version of this function. I didn't see a page or any examples for the complex version, so there is only one NAG test.

No problem.

@swallan (Contributor) commented Feb 13, 2020

I'm confused.

Sorry, let me clarify. It was a separate point.
The documentation says:

< 0: if INFO = -i, then the i-th argument had an illegal value.

In the past, info has usually been positive; will it now be negative?

I was not completely sure what value info would return in the event of A being malformed (such as A is None, A is 1x1, A is not 2D), so I had assumed that it would be 1 since A is the first argument.

Similarly, when testing for invalid job inputs, will info be the index of the invalid input argument? So, if joba="k" is the third argument, info would then be 3? For now I have left this as asserting that info != 0.

@mdhaber (Contributor, Author) commented Feb 16, 2020

I would assume it would be -1 and -3 in those cases, respectively. For whatever reason, it does appear that it will report negative values for bad inputs.

@swallan (Contributor) commented Mar 4, 2020

@mdhaber
Here's the branch that has the tests for gejsv on it. When I wrote it, I split it into:

  • a general test, with a test for each job* value
  • a specific test, with things like the special job* combinations, rank modifications, and error tests.

To follow the model of the other tests, I think that I'll break up the ironically titled 'specific' tests into actual specific tests for each thing.

In broad terms, what are your thoughts on the parametrization? This one's lambdas look prettier since they line up more nicely than the other ones, I think.

@mdhaber (Contributor, Author) commented Mar 4, 2020

I think the parametrization looks great. Breaking up the specific tests sounds good. I think I mistyped above when I wrote:

Test with jobu='N' and jobv ='N'

I meant "test with jobu='N' or jobv ='N'".
Let's incorporate these with the general tests; then you'll need something like:

if jobu != 'N' and jobv != 'N':
    assert_allclose(A, u @ SIGMA @ v.T, rtol=rtol, atol=atol)
if jobu != 'N':
    assert_allclose(u.T @ u, np.identity(u.shape[1]), rtol=rtol, atol=atol)
if jobv != 'N':
    assert_allclose(v @ v.T, np.identity(n), rtol=rtol, atol=atol)

And let's add the test for the eigenvalues all the time, but not like this:

assert_allclose(sva, eig(A), rtol=rtol, atol=atol)

I think you need to multiply by the ratio of the work values, like you did with SIGMA.

assert_allclose(work[1] / work[0] * sva[:n], eig(A), rtol=rtol, atol=atol)

Actually, don't make SIGMA a diagonal matrix; keep it a 1D vector. Remember that right-multiplying by a diagonal matrix is the same as multiplying the columns by the corresponding elements of the diagonal; left-multiplying by a diagonal matrix is the same as multiplying the rows by the corresponding elements of the diagonal.
Very explicitly, letting x be an arbitrary vector of the right shape:

np.allclose(A @ np.diag(x), A * x[np.newaxis, :])  # same as A * x, too
np.allclose(np.diag(x) @ A, x[:, np.newaxis] * A)

Consequently, I suggest you define

sigma = work[1] / work[0] * sva[:n] # not a matrix

and then for the tests:

np.allclose(u @ SIGMA @ v.T, u * sigma @ v.T)
np.allclose(sigma, eig(A), rtol=rtol, atol=atol)

If this looks confusing, I can explain when I see you.

@mdhaber (Contributor, Author) commented Mar 8, 2020

@Kai-Striega can you start a branch for this so @swallan can submit a PR?

@mdhaber (Contributor, Author) commented Mar 13, 2020

The stuff I wrote about eig/"eigenvalues" above should be replaced with svd/"singular values". Can we discuss this in person sometime next week?
