Make COO canonical in the constructor. #141
Codecov Report
@@ Coverage Diff @@
## master #141 +/- ##
=========================================
- Coverage 95.91% 95.81% -0.1%
=========================================
Files 10 10
Lines 1223 1195 -28
=========================================
- Hits 1173 1145 -28
Misses 50 50
|
2dc7322 to d4564f8
@nils-werner This should address your concerns in #58. You're welcome to review as well. |
I don't have enough experience with performance applications to know if this is a good or a bad idea performance-wise in the common case. I recommend that we ping the original author of the comment, @perimosocordiae, for his thoughts. It might also be good to ask for more information and possibly for a wider viewpoint from some of the author maintainers of the library. Perhaps this is a good use of the scipy-dev mailing list? cc @stefanv |
This isn't meant to improve performance (since we already skip these steps when already done for an object). I believe @perimosocordiae was arguing maintenance wise. We don't have all the extra fluff in the object, and don't have to think "is it canonical now?" because it always will be. |
I still recommend asking more broadly within the scipy community. They may
have reasons for the sum_duplicates behavior. Do we know why this is? Are
there cases where this is important? I don't personally have enough
hands-on experience with the uses of COO to make this kind of decision.
…On Mon, Apr 23, 2018 at 11:20 AM, Hameer Abbasi ***@***.***> wrote:
This isn't meant to improve performance (since we already skip these steps
when already done for an object). I believe @perimosocordiae
<https://github.com/perimosocordiae> was arguing maintenance wise. We
don't have all the extra fluff in the object, and don't have to think "is
it canonical now?" because it always will be.
|
I think this is a good idea. At least with scipy's sparse matrices, so many operations require canonical (or at least duplicate-summed) matrices that the odds of constructing a COO matrix with dups and never indirectly calling [...]
The downside is that you're explicitly disallowing some uses of the object, like representing a multi-graph. I'd argue that those users are better served by an object designed for that purpose, though, and since this package is very new you're unlikely to break anyone's existing code. Pinging the scipy-dev list for objections is a good idea. |
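To make the multi-graph trade-off concrete, here is a minimal pure-Python sketch. It is not the library's code: `sum_duplicates` below is a hypothetical stand-in that mimics what duplicate-summing does to a list of (coordinate, value) pairs. Two parallel edges between the same pair of nodes collapse into a single entry holding their sum, so the individual edges can no longer be recovered afterwards.

```python
from collections import defaultdict


def sum_duplicates(entries):
    """Hypothetical helper: merge entries that share a coordinate by summing."""
    totals = defaultdict(float)
    for coord, value in entries:
        totals[coord] += value
    # Sorted coordinates with summed values: the "canonical" form discussed here.
    return sorted(totals.items())


# A multi-graph stored COO-style: two parallel edges 0 -> 1, one edge 1 -> 2.
edges = [((0, 1), 2.0), ((1, 2), 5.0), ((0, 1), 3.0)]
canonical = sum_duplicates(edges)
print(canonical)
```

After canonicalization only two entries remain, and the weights 2.0 and 3.0 on the parallel edges are indistinguishable from a single edge of weight 5.0.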
I posted it to the Scipy-Dev mailing list. Link to thread. |
Allowing another day for comments and then merging, in the absence of objections. |
Sounds good. Thanks for doing this @hameerabbasi !
…On Wed, Apr 25, 2018 at 4:09 AM, Hameer Abbasi ***@***.***> wrote:
Allowing another day for comments and then merging, in the absence of
objections.
|
Just to be clear, this "canonicalizes" COO arrays, right? So you can still provide non-canonical inputs to the COO constructor, and this will convert them into canonical form? If so, this makes complete sense to me. |
Yes, that's what it does. In addition, you can pass two flags (as before), |
Based on recommendations in scipy/scipy#8162 (comment) I believe that we should make COO always be canonical.
This has several advantages:
- x.sum_duplicates() calls just disappear.
- sorted=?, duplicates=? was just confusing to someone who didn't know how COO really worked.
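
The idea in this description can be sketched with a toy class. This is only an illustration under stated assumptions: `TinyCOO` and its attributes are hypothetical names, it uses plain Python lists rather than arrays, and it is not the implementation in this PR. The point is that canonicalizing once in the constructor means no later method has to ask whether the object is canonical yet.

```python
class TinyCOO:
    """Hypothetical toy COO container; not the library's actual class."""

    def __init__(self, coords, data, shape):
        # Canonicalize once, up front: sum duplicate coordinates...
        totals = {}
        for coord, value in zip(coords, data):
            coord = tuple(coord)
            totals[coord] = totals.get(coord, 0) + value
        # ...then sort entries by coordinate (lexicographic order).
        ordered = sorted(totals.items())
        self.coords = [coord for coord, _ in ordered]
        self.data = [value for _, value in ordered]
        self.shape = tuple(shape)

    @property
    def nnz(self):
        # No "is it canonical now?" check needed: every instance already is.
        return len(self.data)


# Non-canonical input: unsorted, with (1, 0) given twice.
x = TinyCOO(coords=[(1, 0), (0, 1), (1, 0)], data=[4, 1, 6], shape=(2, 2))
print(x.coords, x.data, x.nnz)
```

Callers can still pass unsorted input with duplicates; the stored form is always sorted and duplicate-free, which is the behaviour confirmed in the discussion above.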