Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop astype from overwriting arrays when copy=True #8679

Closed

Conversation

perimosocordiae
Copy link
Member

Fixes gh-8678 by adding a new kwarg to the private _deduped_data method, which prevents sum_duplicates from modifying the source matrix when it shouldn't.

@pv
Copy link
Member

pv commented Apr 7, 2018

Test failures

@pv pv added the needs-work Items that are pending response from the author label Apr 7, 2018
@lesteve
Copy link
Contributor

lesteve commented Apr 23, 2018

Interesting, the diff looks fine to me: copy=False in by default _dedup_data should not affect the existing code and the rest of the changes looks good. Any suggestions about why the CIs are failing?

@pv
Copy link
Member

pv commented Apr 23, 2018 via email

@pv
Copy link
Member

pv commented Apr 23, 2018 via email

@lesteve
Copy link
Contributor

lesteve commented Apr 23, 2018

OK thanks for the details, I was asking to see if anyone had some insights off the top off their head just in case.

@pv
Copy link
Member

pv commented Apr 23, 2018 via email

@clementkng
Copy link
Contributor

Hi @pv, I would like to try working towards closing this PR. I took a look at the bug report, and was initially thinking of adding a test to duplicate the behavior and after that work on fixing the build issues. From your comments above, are you implying that the fix to this issue is to name the helper function _deduped_data to something else or would we need to explore an alternate fix to the original bug?

@rth
Copy link
Contributor

rth commented Jun 28, 2019

@clementkng If you are interested in investigating this that would be great! A test to reproduce the initial issue would be useful in any case.

are you implying that the fix to this issue is to name the helper function _deduped_data to something else or would we need to explore an alternate fix to the original bug?

Just changing the name of that function is not going to help I imagine. I'm not sure how to avoid the issue mentioned in #8679 (comment).

@clementkng
Copy link
Contributor

Thanks @rth. I was just trying to guage based off @pv's comments

The problem is that a "deduped_data" operation is impossible:

whether a solution to the bug was possible at all, or no longer possible due to how deduplication now modifies the index arrays.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-work Items that are pending response from the author
Projects
None yet
Development

Successfully merging this pull request may close these issues.

csr_matrix.astype raise ValueError when indices are not sorted and indices or indptr are read-only
6 participants