# TST tests for non-canonical input to sparse matrix operations #3254

Merged
merged 5 commits into from Feb 1, 2014

## Conversation

Projects
None yet
3 participants
Contributor

### jnothman commented Jan 29, 2014

 A number of sparse matrix formats are designed to treat duplicate values as their sum; they prefer indices in sorted order, but should be capable when unsorted. The current test suite largely compares functionality to numpy arrays/matrices, and so constructs sparse matrices from those, producing only canonical sparse forms. This introduces tests for non-canonical forms, but finds many test failures. I don't intend to fix them all here, but we can perhaps add known failure decorations.

### pv and 1 other commented on an outdated diff Jan 30, 2014

scipy/sparse/tests/test_base.py
 + if np.issubdtype(data.dtype, np.bool_): + if indptr is None: + return (data,) + inds + else: + return (data,) + inds + (indptr,) + + data = data.repeat(2, axis=0) + data[::2] *= 2 + data[1::2] *= -1 + + inds = tuple(indices.repeat(2) for indices in inds) + + if indptr is None: + return (data,) + inds + else: + return (data,) + inds + 2 * indptr

#### pv Jan 30, 2014

Owner

This probably should read `2 * (indptr,)`

#### jnothman Jan 30, 2014

Contributor

No, we're duplicating the data entries; `2*indptr` fixes the indptr to point correctly.

#### jnothman Jan 30, 2014

Contributor

And I'd created this helper function in part to work with LIL as well, until I realised (I suppose) that LIL's not meant to handle duplicates like COO and CSR.

#### pv Jan 30, 2014

Owner

Then it probably should be `(data,) + inds + (2 * indptr,)`?
indptr is an array and cannot be added to a tuple (check the Travis-CI output)

Contributor

Yes, that ;)

### pv and 1 other commented on an outdated diff Jan 30, 2014

scipy/sparse/tests/test_base.py
 + indices[start:stop] = indices[start:stop][::-1] + data[start:stop] = data[start:stop][::-1] + + if 'shape' not in kwargs: + kwargs['shape'] = M.shape + + NC = construct((data, indices, indptr), **kwargs) + assert_array_almost_equal(M.A, NC.A) + return NC + + +class TestCSRNonCanonical(_NonCanonicalCompressedMixin, TestCSR): + pass + + +class TestCSCNonCanonical(_NonCanonicalCompressedMixin, TestCSR):

#### pv Jan 30, 2014

Owner

Inherits from TestCSR, not from TestCSC

Contributor

Whoops. Thanks.

### pv commented on an outdated diff Jan 30, 2014

scipy/sparse/tests/test_base.py
 + kwargs['shape'] = M.shape + + NC = construct((data, indices, indptr), **kwargs) + assert_array_almost_equal(M.A, NC.A) + return NC + + +class TestCSRNonCanonical(_NonCanonicalCompressedMixin, TestCSR): + pass + + +class TestCSCNonCanonical(_NonCanonicalCompressedMixin, TestCSR): + pass + + +class TestBSRNonCanonical(_NonCanonicalCompressedMixin, TestCSR):

Owner

As above

Owner

### pv commented Jan 30, 2014

 This test would indeed be useful to add. For the compressed formats, one could also check if things work when the `indices` array is too big and contains crap beyond `indptr[-1]`. knownfailures can be added by overriding the corresponding functions in the Test*NonCanonical classes.
Contributor

### jnothman commented Jan 30, 2014

 For the compressed formats, one could also check if things work when the indices array is too big and contains crap beyond indptr[-1]. Do we want this case to work? Or do you mean that we should test that every method throws an error (do we need to be validating that often?)?
Owner

### pv commented Jan 30, 2014

 I was wondering whether len(indices) = len(data) = indptr[-1] should be taken as an invariant in the code or not. But maybe it's clearest to assume it's an invariant (doesn't need to be checked, except maybe in `__init__` and in `self.check_format`).
Contributor

### jnothman commented Jan 30, 2014

 I think we can assume len(indices) == len(data) == indptr[-1] except where there are functions for the user to set these (init). If the user manually changes these things, it's their problem. I'm pushing some known failures... On 31 January 2014 07:05, Pauli Virtanen notifications@github.com wrote: I was wondering whether len(indices) = len(data) = indptr[-1] should be taken as an invariant in the code or not. But maybe it's clearest to assume it's an invariant (doesn't need to be checked, except maybe in __init__and in self.check_format). Reply to this email directly or view it on GitHubhttps://github.com/scipy/scipy/pull/3254#issuecomment-33727268 .
Contributor

### jnothman commented Jan 31, 2014

 That's a whole lot of failures, and some are truly broken (abs, add_sub, bool, minmax, sparse_format_conversions, unary_ufunc_overrides; in CSR/C: sparse boolean indexing, broadcast element-wise multiply, inverse, solve and getnnz_axis). I'll rebase on master and try the changes to min/max.

### jnothman added some commits Jan 29, 2014

 jnothman `TST tests for non-canonical input to sparse matrix operations` `4eabdc2` jnothman `TST correct logic for noncanonical sparse tests` `3eaaca5` jnothman `TST add known failures for non-canonical sparse matrices` `bd391ab`
Contributor

### jnothman commented Jan 31, 2014

 It'll be nice to remove many of the known failures when #3233 is merged, but those cases largely throw an error at the moment, while other cases will silently return the wrong values, and at a minimum should have comments to note this fact.

### coveralls commented Jan 31, 2014

 Coverage remained the same when pulling bd391ab on jnothman:test_noncanonical_sparse into 4844c63 on scipy:master.
 jnothman `TST fix tests to avoid negating unsigned datatypes` `f37f094`
Contributor

### jnothman commented Jan 31, 2014

 It turns out `add_sub` and `mu` were my fault for not handling uints.
Owner

### pv commented Jan 31, 2014

 The remaining `test_mu` failures are due to use of `assert_array_almost_equal`. This function uses absolute tolerances, and it is best to never use it. `assert_allclose` is a better alternative.
 jnothman `TST use assert_allclose instead of _array_equal` `faedf01`

### coveralls commented Feb 1, 2014

 Coverage remained the same when pulling faedf01 on jnothman:test_noncanonical_sparse into 233ad82 on scipy:master.

### coveralls commented Feb 1, 2014

 Coverage remained the same when pulling faedf01 on jnothman:test_noncanonical_sparse into 233ad82 on scipy:master.

### pv merged commit `2b1c323` into scipy:masterFeb 1, 2014 1 check passed

#### 1 check passed

default The Travis CI build passed
Details
Owner

### pv commented Feb 1, 2014

 Thanks, merged.