BUG: Fix fancy indexing on compressed sparse arrays with mixed indices/indptr dtypes #20183
I've modified my PR so that it no longer modifies the input array in place. This better fits the existing code. It will have higher peak memory (when the dtypes need to be converted), and the conversion may happen more than once.
Thanks, @ivirshup. Ideally we wouldn't need to defend against these kinds of index dtype mismatches, but this is a reasonable improvement and it won't cause slowdowns for well-formed sparse arrays.
scipy/sparse/tests/test_csr.py
Outdated
indices = [([2, 3, 4], slice(None)), (slice(None), [2, 3, 4])]
for idx, mtx in product(indices, [base_mtx, indptr_64bit, indices_64bit]):
    np.testing.assert_array_equal(mtx[idx].toarray(), base_mtx[idx].toarray())
It looks like CJ already approved, but I did notice that when I revert your source change this regression test still passes.
Something like this, which borrows from your original issue reproducer, fails before and passes after the source patch:
--- a/scipy/sparse/tests/test_csr.py
+++ b/scipy/sparse/tests/test_csr.py
@@ -183,4 +183,6 @@ def test_mixed_index_dtype_int_indexing(cls):
     indices = [([2, 3, 4], slice(None)), (slice(None), [2, 3, 4])]
     for idx, mtx in product(indices, [base_mtx, indptr_64bit, indices_64bit]):
-        np.testing.assert_array_equal(mtx[idx].toarray(), base_mtx[idx].toarray())
\ No newline at end of file
+        np.testing.assert_array_equal(mtx[idx].toarray(), base_mtx[idx].toarray())
+    base_mtx.indptr = base_mtx.indptr.astype(np.int64)
+    base_mtx[[1, 2], :]
Any chance we could do something like that? I'm not a sparse regular, so maybe it can be cleaner than that.
Thanks for the catch! I've corrected this, though I'm a little unsure what exactly is different.
Ok, I checked locally that we have fail before/pass after with the test now. The linter is complaining about a few line lengths, but also some other things you didn't touch (same file, but different lines). The other CI failures were recently fixed in I'll go ahead and merge this, overriding the linter as we get ready to branch, etc. I don't think that'll make
Reference issue
csr_row_index and csr_column_index error for mixed indices/indptr dtype when they should probably just convert #20182

What does this implement/fix?
This coerces the indices/indptr arrays instead of erroring.

Additional information
I'm not totally sure on the implementation. It could be worth bundling this behaviour into a method, and maybe warning.
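The coercion idea described here could be bundled into a small helper along these lines. This is a rough sketch using NumPy only; `coerce_index_dtypes` is a hypothetical name and not part of SciPy, and the actual patch may be structured differently.

```python
import numpy as np

def coerce_index_dtypes(indices, indptr):
    """Hypothetical helper: return (indices, indptr) with one shared dtype.

    The inputs are never modified in place. An array is copied only
    when its dtype differs from the common dtype.
    """
    common = np.result_type(indices.dtype, indptr.dtype)
    # astype(copy=False) returns the input array itself when no
    # conversion is needed, so well-formed inputs pay no extra cost.
    return (indices.astype(common, copy=False),
            indptr.astype(common, copy=False))

# Assumed example arrays with mismatched index dtypes.
indices = np.array([0, 2, 1], dtype=np.int32)
indptr = np.array([0, 1, 3], dtype=np.int64)

new_indices, new_indptr = coerce_index_dtypes(indices, indptr)
```

Because the helper returns new arrays rather than reassigning attributes, peak memory rises only when a conversion is actually required, and the caller's arrays keep their original dtypes.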