REF: refactor Graph.to_W to avoid perf bottleneck #691
Conversation
I'd say this is an "enhancement" due to the performance improvement.
Thinking about this more, I think that we should use the same logic in …
Codecov report: all modified and coverable lines are covered by tests ✅

@@           Coverage Diff           @@
##            main    #691     +/-  ##
=======================================
+ Coverage   84.7%   85.0%   +0.3%
=======================================
  Files        141     141
  Lines      15200   15203      +3
=======================================
+ Hits       12880   12924     +44
+ Misses      2320    2279     -41
I got a question about the performance of the weights builders in the workshop yesterday, and I had to proudly scoff that they're the fastest you can find... 'Martin uses these routinely on datasets in the millions' :D
Looks nice. The idea is that the groupby already gives you all the info you need, so just bypass the apply and build the lists cheaply?
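For anyone skimming later, a minimal sketch of that pattern (the toy adjacency Series, its index names, and its values here are illustrative, not the actual diff): instead of `groupby(...).apply(...)`, which pays Python-level overhead per group plus apply's result-inference machinery, iterate the groups once and build the plain dicts directly.

```python
import pandas as pd

# Hypothetical adjacency in the Graph style: weights indexed by a
# (focal, neighbor) MultiIndex. Values and labels are illustrative only.
adjacency = pd.Series(
    [1.0, 1.0, 1.0, 1.0],
    index=pd.MultiIndex.from_tuples(
        [("a", "b"), ("a", "c"), ("b", "a"), ("c", "a")],
        names=["focal", "neighbor"],
    ),
)

# Slow pattern: apply invokes a Python callable once per group.
slow_neighbors = adjacency.groupby(level="focal").apply(
    lambda s: s.index.get_level_values("neighbor").tolist()
)

# Cheaper pattern: one pass over the groups, building the neighbors and
# weights dicts that W wants directly.
neighbors, weights = {}, {}
for focal, chunk in adjacency.groupby(level="focal"):
    neighbors[focal] = chunk.index.get_level_values("neighbor").tolist()
    weights[focal] = chunk.tolist()
```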
Agreed, let's adopt this same style elsewhere.
Do we know what's causing the failures on macOS & ubuntu-dev?
The mac test looks like it's probably a fluke. The ubuntu stuff is all over the place.
Builders are fast, but some compatibility bits seem to be worse :). I'm noticing it only now that I regularly use Graph on large data. There's a reason it is still experimental :)
This looks fine to me! I think we may want to consider some benchmarks with asv for these kinds of things... I think that construction, serialization, conversion, lag, and standardization are the big targets?
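If we go the asv route, something along these lines could be a starting point. The file name, class name, and lattice size are placeholders; only `weights.lat2W`, `graph.Graph.from_W`, and `Graph.to_W` come from the library itself.

```python
# benchmarks/bench_graph.py -- hypothetical asv suite; names and sizes
# are placeholders, not part of this PR.
from libpysal import graph, weights


class TimeGraphConversion:
    def setup(self):
        # Built by asv before timing, so construction isn't counted
        # in the conversion benchmarks below.
        self.w = weights.lat2W(100, 100)
        self.g = graph.Graph.from_W(self.w)

    def time_to_W(self):
        # The conversion path this PR refactors.
        self.g.to_W()

    def time_from_W(self):
        graph.Graph.from_W(self.w)
```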
Looks like all failures will be fixed by #692. Going ahead with that merge now to see if we can get this green.
All green for now (the macOS failure just before had to do with a connection issue for …). Not sure if we wanted to see about implementing @ljwolf's idea for …
Not going to do …
Closes #672
For an adjacency of 1,404,080 edges, the conversion takes 42.7s on main and 4.5s with this PR.
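A rough way to reproduce a comparison like this — the lattice below is a stand-in, since the 1.4M-edge adjacency from the report isn't included in the repo:

```python
import time

from libpysal import graph, weights

# ~1M directed edges of rook contiguity as a stand-in dataset.
w = weights.lat2W(500, 500)
g = graph.Graph.from_W(w)

start = time.perf_counter()
g.to_W()
print(f"to_W: {time.perf_counter() - start:.1f}s")
```

Run once on main and once on this branch to compare.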
@jGaboardi not sure how to properly label this...