Skip to content

Conversation

ivirshup
Copy link
Contributor

@ivirshup ivirshup commented Mar 4, 2024

Reference issue

What does this implement/fix?

This coerces the indices/ indptr arrays instead of erroring

Additional information

I'm not totally sure on the implementation. It could be worth bundling this behaviour into a method, and maybe warning.

@ivirshup
Copy link
Contributor Author

ivirshup commented Mar 5, 2024

I've modified my PR so that it no longer does inplace modification of the input array. This better fits existing code like: _matmul_sparse and _binopt.

This will have higher peak memory (when the dtypes need to be converted) and may happen more than once.

@ivirshup ivirshup marked this pull request as ready for review March 5, 2024 13:01
Copy link
Member

@perimosocordiae perimosocordiae left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @ivirshup. Ideally we wouldn't need to defend against these kinds of index dtype mismatches, but this is a reasonable improvement and it won't cause slowdowns for well-formed sparse arrays.

@tylerjereddy tylerjereddy added this to the 1.13.0 milestone Mar 8, 2024
@tylerjereddy tylerjereddy changed the title Fix fancy indexing on compressed sparse arrays with mixed indices/ indptr dtypes BUG: Fix fancy indexing on compressed sparse arrays with mixed indices/ indptr dtypes Mar 8, 2024
@tylerjereddy tylerjereddy added the defect A clear bug or issue that prevents SciPy from being installed or used as expected label Mar 8, 2024

indices = [([2, 3, 4], slice(None)), (slice(None), [2, 3, 4])]
for idx, mtx in product(indices, [base_mtx, indptr_64bit, indices_64bit]):
np.testing.assert_array_equal(mtx[idx].toarray(), base_mtx[idx].toarray())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like CJ already approved, but I did notice that when I revert your source change this regression test still passes.

Something like this, which borrows from your original issue reproducer, fails before and passes after the source patch:

--- a/scipy/sparse/tests/test_csr.py
+++ b/scipy/sparse/tests/test_csr.py
@@ -183,4 +183,6 @@ def test_mixed_index_dtype_int_indexing(cls):
 
     indices = [([2, 3, 4], slice(None)), (slice(None), [2, 3, 4])]
     for idx, mtx in product(indices, [base_mtx, indptr_64bit, indices_64bit]):
-        np.testing.assert_array_equal(mtx[idx].toarray(), base_mtx[idx].toarray())
\ No newline at end of file
+        np.testing.assert_array_equal(mtx[idx].toarray(), base_mtx[idx].toarray())
+        base_mtx.indptr = base_mtx.indptr.astype(np.int64)
+        base_mtx[[1, 2], :]

Any chance we could do something like that? I'm not a sparse regular, so maybe it can be cleaner than that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the catch! I've corrected this, though am a little unsure in what exactly is different.

@tylerjereddy
Copy link
Contributor

Ok, I checked locally that we have fail before/pass after with the test now.

The linter is complaining about a few line lengths, but also some other things you didn't touch (same file, but different lines).

The other CI failures were recently fixed in main by Matt H.

I'll go ahead and merge this, overriding the linter as we get ready to branch, etc. I don't think that'll make main fail the linter, but if it does we'll deal with it...

@tylerjereddy tylerjereddy merged commit 33e93e0 into scipy:main Mar 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
defect A clear bug or issue that prevents SciPy from being installed or used as expected scipy.sparse
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: csr_row_index and csr_column_index error for mixed indices/indptr dtype when they should probably just convert
3 participants