TST tests for non-canonical input to sparse matrix operations #3254

Merged (5 commits) on Feb 1, 2014

jnothman
Contributor

A number of sparse matrix formats are designed to treat duplicate entries as their sum; they prefer indices in sorted order, but should still work when the indices are unsorted. The current test suite largely compares functionality against numpy arrays/matrices, and so constructs sparse matrices from those, producing only canonical sparse forms.

This introduces tests for non-canonical forms, but turns up many test failures. I don't intend to fix them all here, but we can perhaps add known-failure decorators.
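To make "canonical" versus "non-canonical" concrete, here is a minimal pure-Python sketch of how a CSR triple is read when duplicates are summed and column indices may be unsorted. It is not SciPy's implementation; the name `csr_to_dense` and the list-based representation are assumptions made for illustration only.

```python
def csr_to_dense(data, indices, indptr, n_cols):
    """Expand a CSR triple into a dense list of rows, summing duplicate entries.

    Hypothetical helper for illustration; plain lists stand in for arrays.
    """
    n_rows = len(indptr) - 1
    dense = [[0] * n_cols for _ in range(n_rows)]
    for row in range(n_rows):
        # Entries of this row live in data[indptr[row]:indptr[row + 1]].
        for k in range(indptr[row], indptr[row + 1]):
            dense[row][indices[k]] += data[k]  # += is what sums duplicates
    return dense


# Canonical form: sorted column indices within each row, no duplicates.
canonical = csr_to_dense([1, 2, 3], [0, 2, 1], [0, 2, 3], n_cols=3)

# Non-canonical form of the same matrix: row 0 is unsorted, and the
# value 2 at position (0, 2) is split into two entries, 1 + 1.
non_canonical = csr_to_dense([1, 1, 1, 3], [2, 0, 2, 1], [0, 3, 4], n_cols=3)
```

Both calls describe the same matrix, which is the property the new tests rely on: an operation applied to either form should give the same result.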

if indptr is None:
    return (data,) + inds
else:
    return (data,) + inds + 2 * indptr
Member

This probably should read 2 * (indptr,)

Contributor Author

No, we're duplicating the data entries; 2*indptr fixes the indptr to point correctly.

Contributor Author

And I'd created this helper function in part to work with LIL as well, until I realised (I suppose) that LIL's not meant to handle duplicates like COO and CSR.

Member

Then it probably should be (data,) + inds + (2 * indptr,)?
indptr is an array and cannot be added to a tuple (check the Travis-CI output)

Contributor Author

Yes, that ;)
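
A small sketch of the point settled in this thread, assuming `indptr` is a NumPy array as in the CSR/CSC formats: multiplying the array by 2 doubles every offset, which matches `data` and `indices` arrays in which each entry is repeated twice, and the result must be wrapped in a one-element tuple before concatenation. The name `duplicate_entries` and the choice to halve each value are hypothetical; the helper in the PR may split values differently.

```python
import numpy as np


def duplicate_entries(data, indices, indptr):
    """Repeat every stored entry twice without changing the represented matrix.

    Hypothetical illustration, not the helper from the PR.
    """
    half = np.repeat(data / 2.0, 2)    # each value v becomes the pair v/2, v/2
    indices2 = np.repeat(indices, 2)   # each column index now appears twice
    indptr2 = 2 * indptr               # elementwise: each row holds twice as many entries
    # indptr2 is an array, so wrap it in a tuple to get a flat tuple of arrays.
    return (half,) + (indices2,) + (indptr2,)


data = np.array([1.0, 2.0, 3.0])
indices = np.array([0, 2, 1])
indptr = np.array([0, 2, 3])
new_data, new_indices, new_indptr = duplicate_entries(data, indices, indptr)
```

By contrast, `2 * (indptr,)` would build a tuple holding the same array twice, which is not what the duplicated entries need.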

@pv
Member

pv commented Jan 30, 2014

This test would indeed be useful to add.
For the compressed formats, one could also check if things work when the indices array is too big and contains crap beyond indptr[-1].

knownfailures can be added by overriding the corresponding functions in the Test*NonCanonical classes.

@jnothman
Contributor Author

> For the compressed formats, one could also check if things work when the indices array is too big and contains crap beyond indptr[-1].

Do we want this case to work? Or do you mean that we should test that every method throws an error (do we need to be validating that often?)?

@pv
Member

pv commented Jan 30, 2014

I was wondering whether len(indices) = len(data) = indptr[-1] should be taken as an invariant in the code or not. But maybe it's clearest to assume it's an invariant (doesn't need to be checked, except maybe in __init__ and in self.check_format).

@jnothman
Contributor Author

I think we can assume len(indices) == len(data) == indptr[-1] except where there are functions for the user to set these (__init__). If the user manually changes these things, it's their problem.

I'm pushing some known failures...
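
The invariant under discussion can be written down as a short check. This is only a sketch of what a constructor-time validation could look like, using plain lists; the function name `check_compressed_invariant` is an assumption and does not correspond to SciPy's `check_format`.

```python
def check_compressed_invariant(data, indices, indptr):
    """Raise ValueError unless len(indices) == len(data) == indptr[-1].

    Hypothetical helper for illustration only.
    """
    if not (len(indices) == len(data) == indptr[-1]):
        raise ValueError(
            "expected len(indices) == len(data) == indptr[-1], got "
            f"{len(indices)}, {len(data)}, {indptr[-1]}"
        )


# A consistent triple passes silently.
check_compressed_invariant([1, 2, 3], [0, 2, 1], [0, 2, 3])

# An indices array with trailing junk beyond indptr[-1] is rejected.
try:
    check_compressed_invariant([1, 2, 3], [0, 2, 1, 7, 7], [0, 2, 3])
    rejected = False
except ValueError:
    rejected = True
```

Running such a check once at construction is what lets the remaining methods treat the relation as an invariant rather than re-validating on every call.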

@jnothman
Contributor Author

That's a whole lot of failures, and some are truly broken (abs, add_sub, bool, minmax, sparse_format_conversions, unary_ufunc_overrides; in CSR/CSC: sparse boolean indexing, broadcast element-wise multiply, inverse, solve and getnnz_axis).

I'll rebase on master and try the changes to min/max.

@jnothman
Contributor Author

It'll be nice to remove many of the known failures when #3233 is merged, but those cases largely throw an error at the moment, while other cases will silently return the wrong values, and at a minimum should have comments to note this fact.

@coveralls

Coverage Status

Coverage remained the same when pulling bd391ab on jnothman:test_noncanonical_sparse into 4844c63 on scipy:master.

@jnothman
Contributor Author

It turns out add_sub and mu were my fault for not handling uints.

@pv
Member

pv commented Jan 31, 2014

The remaining test_mu failures are due to use of assert_array_almost_equal. This function uses absolute tolerances, and it is best to never use it.
assert_allclose is a better alternative.
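
The difference between the two comparisons can be sketched in plain Python. The two functions below restate, as I understand the NumPy documentation, the default acceptance criteria of `assert_array_almost_equal` (absolute, `decimal=6`) and `assert_allclose` (relative, `rtol=1e-7`, `atol=0`) for a single pair of scalars; the function names are hypothetical and this is not NumPy's code.

```python
def almost_equal_abs(actual, desired, decimal=6):
    """Absolute criterion: |desired - actual| < 1.5 * 10**(-decimal)."""
    return abs(desired - actual) < 1.5 * 10 ** (-decimal)


def allclose_rel(actual, desired, rtol=1e-7, atol=0.0):
    """Relative criterion: |actual - desired| <= atol + rtol * |desired|."""
    return abs(actual - desired) <= atol + rtol * abs(desired)


# Large values: a relative error of about 1e-12 is a large absolute error.
big_actual, big_desired = 1e9 * (1 + 1e-12), 1e9
# Small values: two numbers that differ by a factor of two.
small_actual, small_desired = 2e-9, 1e-9

abs_rejects_big = not almost_equal_abs(big_actual, big_desired)
abs_accepts_small = almost_equal_abs(small_actual, small_desired)
rel_accepts_big = allclose_rel(big_actual, big_desired)
rel_rejects_small = not allclose_rel(small_actual, small_desired)
```

The absolute test fails an accurate result at large magnitude and passes a badly wrong one at small magnitude, while the relative test behaves sensibly in both cases.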

@coveralls

Coverage Status

Coverage remained the same when pulling faedf01 on jnothman:test_noncanonical_sparse into 233ad82 on scipy:master.

pv added a commit that referenced this pull request Feb 1, 2014
TST: sparse: tests for non-canonical input to sparse matrix operations

A number of sparse matrix formats are designed to treat duplicate values
as their sum; they prefer indices in sorted order, but should be capable
when unsorted. The current test suite largely compares functionality to
numpy arrays/matrices, and so constructs sparse matrices from those,
producing only canonical sparse forms.

These commits introduce tests for non-canonical forms.
pv merged commit 2b1c323 into scipy:master Feb 1, 2014
@pv
Member

pv commented Feb 1, 2014

Thanks, merged.
