
TST: complex example checks to PROPACK tests #21

Merged: 7 commits merged into propack from propack-test-bench-refactors on Jul 17, 2021

Conversation

@mckib2 mckib2 (Owner) commented Jun 28, 2021

  • add PROPACK's complex Harwell-Boeing Exchange Format reader to the complex64 and complex128 example checks
  • remove example matrix files from the scipy source tree (rely on those in the PROPACK submodule)
  • I think I like having the MatrixMarket files in scipy -- that's more in line with what's already there, and the files are comparable in size to other benchmark data files

TODO:

  • tolerances for u, vh complex example checks

@mckib2 mckib2 changed the title Propack test bench refactors TST: add complex example checks to PROPACK tests Jun 28, 2021
@mckib2 mckib2 changed the title TST: add complex example checks to PROPACK tests TST: complex example checks to PROPACK tests Jun 28, 2021
def load_sigma(folder, precision="double", irl=False):
    dtype = {"single": np.float32, "double": np.float64}[precision]
@mckib2 mckib2 (Owner, Author)

This shows up in a lot of places; we should probably refactor so everything uses the same mapping.

@mdhaber mdhaber (Collaborator) Jun 29, 2021

Sounds like a good idea. Maybe call it _dtype_map?
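A minimal sketch of that refactor, assuming the _dtype_map name from the suggestion above; the complex entries are an assumption based on the complex8/complex16 precision names PROPACK uses (the complex16 example is discussed below):

import numpy as np

# One shared mapping instead of repeating the dict literal in each loader.
# The complex keys are assumed to mirror PROPACK's precision names.
_dtype_map = {
    "single": np.float32,
    "double": np.float64,
    "complex8": np.complex64,
    "complex16": np.complex128,
}


def load_sigma(folder, precision="double", irl=False):
    dtype = _dtype_map[precision]  # look up the shared mapping
    ...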

assert_allclose(np.abs(u), np.abs(u_expected), atol=atol)
assert_allclose(np.abs(vt), np.abs(vt_expected), atol=atol)

# TODO: complex types seem to have trouble with this, maybe adjust atol for
@mckib2 mckib2 (Owner, Author) Jun 28, 2021

@mdhaber Even when I adjust to atol=1e-1 on complex data, I'm getting a lot of mismatches. Any hints, or is it fine to exclude for now? The singular values seem fine; it's just the left/right singular vectors that are having issues.

@mdhaber mdhaber (Collaborator) Jun 29, 2021

There are two problems here.

The expected results that are being loaded don't seem correct.

For the complex16 example:

def reconstruct_A(u, s, vh):
    return u @ np.diag(s) @ vh

A0 = reconstruct_A(u_expected, s_expected, vh_expected)
A1 = reconstruct_A(u, s, vh)

print(np.linalg.norm(A.todense()))  # norm of original matrix - 110.2050939958339
print(np.linalg.norm(A0 - A))  # norm of residual of expected partial SVD - 179.4476639715223
print(np.linalg.norm(A1 - A))  # norm of residual of actual partial SVD - 1.0994110349274073

1.0994110349274073 is small relative to 110.2050939958339; 179.4476639715223 is not. This suggests that our code is producing a partial SVD that is much more consistent with the original matrix.

Also:

from scipy.linalg import svd
u3, s3, vh3 = svd(A.todense())
u3 = u3[:,:k]
s3 = s3[:k]
vh3 = vh3[:k]
A3 = reconstruct_A(u3, s3, vh3)
print(np.linalg.norm(A3 - A))  # 1.0994110349274073
print(np.linalg.norm(A3 - A1))  # 2.4623325485074163e-10

So the reconstructed matrix from our code is very similar to the reconstructed matrix produced from the (partial; k = 200) results of scipy.linalg.svd.

The original A matrix has a lot of repeated singular values.

This makes it difficult to compare the SVDs, because the corresponding columns of u and rows of vh can be in a different order. Actually, it's probably much worse than that: I don't think the singular vectors are unique at all; they probably just have to span the same subspace.

For instance, the first 27 columns of u and u3 agree.

np.testing.assert_allclose(np.abs(u[:, :27]), np.abs(u3[:, :27]), atol=1e-14)  # True
np.testing.assert_allclose(np.abs(u[:, :27]), np.abs(u_expected[:, :27]), atol=1e-14)  #  Also True, actually

These correspond with the first 27 singular values:

>>> s[:27]
array([70.32203242, 70.00692295, 26.7388179 , 26.41915262, 12.73844424,
       12.24801509,  7.99152204,  7.67632184,  7.31533747,  6.87598475,
        5.42325413,  4.91629832,  4.26967853,  3.98062428,  3.80209103,
        3.69441692,  3.52584012,  3.02172029,  3.01505871,  2.68859418,
        2.65081908,  2.51221116,  2.44594626,  2.37044294,  2.14864846,
        2.13901198,  2.04126998])

And then we start repeating the singular value 2:

>>> s[27:42]
array([2.        , 2.        , 2.        , 2.        , 2.        ,
       2.        , 2.        , 2.        , 2.        , 2.        ,
       2.        , 2.        , 2.        , 2.        , 1.96937591])

Immediately we run into trouble.

>>> np.testing.assert_allclose(np.abs(u[:, 27]), np.abs(u3[:, 27]), atol=1e-14)
AssertionError: 
Not equal to tolerance rtol=1e-07, atol=1e-14

Mismatched elements: 21 / 1280 (1.64%)
Max absolute difference: 0.50134403
Max relative difference: 5.10131082e+108
 x: array([2.932184e-01, 4.016259e-18, 4.501122e-01, ..., 1.263908e-19,
       1.980111e-19, 2.619319e-19])
 y: array([0.000000e+000, 1.835095e-017, 5.646847e-006, ..., 3.626856e-127,
       1.112345e-125, 5.134600e-128])

I suppose I'd suggest comparing just the first 27 singular vectors against the expected results, and noting this in the comments. Then maybe check for orthogonality of the singular vectors, and perhaps compare results against scipy.linalg.svd. For instance, you could check the norm of the difference between the reconstructed matrices; that's pretty convincing to me if it's tiny w.r.t. the norm of the matrix itself.
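A sketch of that testing strategy, assuming the u, s, vh, A, u_expected, and k names from the snippets above; the tolerances here are placeholders, not vetted values:

import numpy as np
from scipy.linalg import svd

k_unique = 27  # leading singular values that are well separated (see above)

# 1. Elementwise check of the well-separated singular vectors only;
#    abs() ignores the per-vector sign/phase ambiguity.
np.testing.assert_allclose(np.abs(u[:, :k_unique]),
                           np.abs(u_expected[:, :k_unique]), atol=1e-14)

# 2. Singular vectors should be orthonormal regardless of their ordering.
np.testing.assert_allclose(u.conj().T @ u, np.eye(u.shape[1]), atol=1e-10)

# 3. Compare reconstructions against a scipy.linalg.svd reference; the
#    difference should be tiny relative to the norm of A itself.
A_dense = A.todense()
u3, s3, vh3 = svd(A_dense)
A3 = u3[:, :k] @ np.diag(s3[:k]) @ vh3[:k]
A1 = u @ np.diag(s) @ vh
assert np.linalg.norm(A3 - A1) < 1e-7 * np.linalg.norm(A_dense)

# 4. For the degenerate block (the repeated singular value 2), only the
#    spanned subspace is well defined, so compare orthogonal projectors
#    rather than individual vectors.
P1 = u[:, 27:41] @ u[:, 27:41].conj().T
P3 = u3[:, 27:41] @ u3[:, 27:41].conj().T
assert np.linalg.norm(P1 - P3) < 1e-7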

@mckib2 mckib2 (Owner, Author)

This is a good solution: it now compares only the first 27 singular vectors for the complex example matrices and implements the norm comparison against np.linalg.svd.

@mckib2 mckib2 (Owner, Author) commented Jun 28, 2021

@mdhaber I don't think any changes here will clobber anything in gh-20, but if you can glance at the complex example tolerances, that would be helpful. I spent all afternoon puzzling over writing a Fortran routine and using f2py with it, but I'm fairly happy with how it turned out, and it's been a good learning experience.

As mentioned in the description, I'm leaning toward keeping the .mtx files in the benchmarks directory like you had them, but moving the coord and cua files out in favor of the ones in the PROPACK submodule. That minimizes duplication, seems consistent with what's already happening with benchmark data, and the .mtx files "weigh" about the same as other data there.

Ping me when you're happy with gh-20 (I made a couple of minor comments) and I can merge. I'll merge this one afterwards if you're okay with the contents. If there are outstanding issues/concerns, we can move comments about those to the main PR (gh-9).

@mckib2 mckib2 (Owner, Author) commented Jul 17, 2021

@mdhaber Good to merge?

@mckib2 mckib2 mentioned this pull request Jul 17, 2021
@mdhaber mdhaber (Collaborator) commented Jul 17, 2021

Yeah, I think so. Let me run it locally and I'll let you know.

@mdhaber mdhaber (Collaborator) left a comment

Looks fine. I'll take a closer look when everything is together in the main PR. Just checking - auto-generated pyf files are supposed to be added here, or are they created on the fly during build? (Or should those readers go in the submodule?)

@mckib2 mckib2 (Owner, Author) commented Jul 17, 2021

> Looks fine. I'll take a closer look when everything is together in the main PR. Just checking - auto-generated pyf files are supposed to be added here, or are they created on the fly during build? (Or should those readers go in the submodule?)

Yep, the .pyf files are generated by the f2py command and then manually edited with contextual information, the same way the main PROPACK .pyf interface files are handled.
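For reference, a signature file of that sort is typically produced with a command along these lines and then edited by hand; the module and file names here are placeholders, not the actual ones in this PR:

# Generate the .pyf signature file from the Fortran source:
f2py -m _hb_reader -h _hb_reader.pyf hb_reader.f
# After hand-editing intents/dimensions in the .pyf, build the module:
f2py -c _hb_reader.pyf hb_reader.f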

@mckib2 mckib2 merged commit f071c34 into propack Jul 17, 2021
@mckib2 mckib2 deleted the propack-test-bench-refactors branch July 17, 2021 21:02
@mdhaber mdhaber (Collaborator) commented Jul 17, 2021

K.

mckib2 pushed a commit that referenced this pull request Dec 27, 2021