ENH: sparse: Add copy parameter to all .toXXX() methods in sparse matrices #5829
@@            master   #5829   diff @@
======================================
  Files          238     238
  Stmts        43658   43688    +30
  Branches      8197    8197
  Methods          0       0
======================================
+ Hit          34086   34114    +28
  Partial       2602    2602
- Missed        6970    6972     +2
Looks good to me. I don't like that there are some I'm +1 to merge, but I'll wait a bit to see if anyone else has comments first. |
Some of these methods have no docstrings, and the ones that do promise a copy. We should document that To prevent copy-pasting docstrings everywhere, I guess we could...
@larsmans Unfortunately, not all matrices support all the |
I'd rather not add an extra layer of method call indirection to all of these methods. Perhaps we can do something like the suggestions in http://stackoverflow.com/q/8100166/10601 ? |
Specifically, we could do something as simple as:

    # in coo.py, for example, after the class definition
    vars(coo_matrix)['tocsr'].__doc__ = spmatrix.tocsr.__doc__

There's probably a better way to do this, but that's the general idea. I wouldn't be surprised if numpy already has a mechanism for this sort of thing.
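A minimal, self-contained sketch of the docstring-copying idea suggested above. The classes `Base` and `Child` are hypothetical stand-ins for `spmatrix` and `coo_matrix`; nothing here is SciPy's actual code.

```python
class Base:
    """Hypothetical stand-in for the shared base class."""

    def tocsr(self, copy=False):
        """Convert this matrix to CSR format.

        With copy=False, the data may be shared with the result.
        """
        raise NotImplementedError


class Child(Base):
    """Hypothetical stand-in for one concrete format."""

    def tocsr(self, copy=False):
        # No docstring here on purpose; it is filled in below.
        return ("csr", copy)


# After the class definition, reuse the base-class docstring.
# vars(Child)['tocsr'] is the plain function object, whose __doc__ is writable.
vars(Child)['tocsr'].__doc__ = Base.tocsr.__doc__
```

This avoids both copy-pasted docstrings and an extra layer of method-call indirection, at the cost of one assignment per overridden method.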
A cursory look at the documentation guide does not indicate that such a thing is possible or usual. Also, I may be flawed in my understanding, but if we define In any case, what should be the next steps? |
We are. I guess @perimosocordiae's solution is simpler. The sparse matrix classes already have some example of |
@larsmans That includes stubbing methods which do not exist across all the matrices (e.g. Also, indeed, doc copying happens in |
@musically-ut That looks like a good template to follow. |
-    def tocsr(self):
-        return self.tocoo(copy=False).tocsr()
+    def tocsr(self, copy=False):
+        return self.tocoo(copy=copy).tocsr()
The latter `tocsr` always makes a copy, so it seems to me the call to `tocoo` can always have `copy=False` as it did before?
In other words, `return self.tocoo(copy=False).tocsr(copy=copy)`.
I actually was tempted to change this to:

    if copy:
        # One of the following depending on which is more efficient:
        # return self.tocoo(copy=False).tocsr(copy=True)
        return self.tocoo(copy=True).tocsr(copy=False)
    else:
        return self.tocoo(copy=False).tocsr(copy=False)

Also, I am not sure I see a reason why `copy=False` should be the default.
I'll be very surprised if it affects the correctness of the copying. (Does it?)
What @pv is saying here is that due to the current implementation of `coo_matrix.tocsr()`, it will always produce a copy no matter what. So doing `self.tocoo(copy=False)` is always acceptable, so long as you then convert to CSC/CSR format.
He also suggests using the copy kwarg in the final `.tocsr()` call, because that will protect this method from causing trouble in the future if the CSR/CSC conversion changes to allow non-copying behavior.
The same goes for the BSR->COO->CSC conversion below.
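The reasoning above can be sketched with toy classes. `ToyBSR`, `ToyCOO` and `ToyCSR` are hypothetical stand-ins, not SciPy's classes, and the sketch assumes (as the comment describes) that the COO to CSR step always allocates new storage.

```python
class ToyCSR:
    """Hypothetical stand-in for the final format."""

    def __init__(self, data):
        self.data = data


class ToyCOO:
    """Hypothetical stand-in for the intermediate format."""

    def __init__(self, data):
        self.data = data

    def tocsr(self, copy=False):
        # Assumption taken from the discussion: this step always builds
        # new storage today, whatever `copy` says.
        return ToyCSR(list(self.data))


class ToyBSR:
    """Hypothetical stand-in for the source format."""

    def __init__(self, data):
        self.data = data

    def tocoo(self, copy=True):
        # Share the underlying list unless a copy is requested.
        return ToyCOO(list(self.data) if copy else self.data)

    def tocsr(self, copy=False):
        # The intermediate never needs its own copy; forwarding `copy` to
        # the last step keeps the contract if that step ever stops copying.
        return self.tocoo(copy=False).tocsr(copy=copy)


bsr = ToyBSR([1.0, 2.0, 3.0])
csr = bsr.tocsr(copy=True)
```

Copying at the intermediate step as well would only add a second, redundant allocation in this setup.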
@musically-ut I'd like to merge this soon, but I think it should include the suggested changes for BSR's |
(Branch force-pushed from 0b02d16 to 09e9bf2.)
@perimosocordiae Sorry about the unclean rebase, but could you tell me if the last two commits are in line with the kind of changes you had in mind? If so, I can base it off master, create a new branch and make another PR. ~ PS: I was seeing an odd test failure with |
Yeah, the last two commits look pretty good, minus some small nitpicks. It should be possible to clean up the history in the existing branch and force push, but you're free to close this and make a new PR if that's easier on your end. |
(Branch force-pushed from 09e9bf2 to 71c44f2.)
Great! I have fixed the typos (except one indicies in PS: There was one other marginally unrelated test failure on my local computer in |
        With copy=False, the data/indices may be shared between this matrix and
        the resultant dia_matrix.
        """
        return self.tocoo(copy=False).todia(copy=False)
This looks like a bug: one of these should have copy=copy
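To illustrate why a dropped flag matters when neither step is forced to copy, here is a small sketch. The `Toy*` classes and the `todia_buggy` / `todia_fixed` names are hypothetical and only mimic the shape of the flagged line; they are not the code in the PR.

```python
class ToyDIA:
    """Hypothetical stand-in for the target format."""

    def __init__(self, data):
        self.data = data


class ToyCOO:
    """Hypothetical stand-in for the intermediate format."""

    def __init__(self, data):
        self.data = data

    def todia(self, copy=False):
        # Unlike the CSR case discussed earlier, this toy step can share storage.
        return ToyDIA(list(self.data) if copy else self.data)


class ToyCSR:
    """Hypothetical stand-in for the source format."""

    def __init__(self, data):
        self.data = data

    def tocoo(self, copy=True):
        return ToyCOO(list(self.data) if copy else self.data)

    def todia_buggy(self, copy=False):
        # `copy` is accepted but never forwarded, so the caller's request is ignored.
        return self.tocoo(copy=False).todia(copy=False)

    def todia_fixed(self, copy=False):
        # Forward the flag to exactly one step of the chain.
        return self.tocoo(copy=copy).todia(copy=False)


def shares_storage(a, b):
    """Return True when both toy matrices hold the very same list object."""
    return a.data is b.data


m = ToyCSR([1, 2, 3])
```

A check along the lines of `shares_storage(m, m.todia_fixed(copy=True))` is one way such a regression could be caught in a test.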
(Branch force-pushed from 71c44f2 to 3a8ebf0.)
I've taken care of the issues raised. Also, should I work on including the benchmarks as a part of this PR itself? Note that I am planning on having a more comprehensive set of benchmarks at musically-ut/scipy-sparse-benchmarks anyway. |
ENH: sparse: Add copy parameter to all .toXXX() methods in sparse matrices
Thanks @musically-ut, merged. We can add benchmarks in a separate PR. |
This closes #5822.
I have added `copy=False` as the default arguments for all `.to???` conversion methods in sparse matrices. The only places where I left the defaults in place were `bsr.py` and `csr.py`'s `.tobsr` method.
All tests pass and no extra functionality has been added, requiring no additional unit-tests.