
REF: refactor Graph.to_W to avoid perf bottleneck #691

Merged · 2 commits · Mar 9, 2024

Conversation

martinfleis
Member

Closes #672

For adjacency of 1,404,080 edges, the conversion time is 42.7s on main and 4.5s in this PR.

@jGaboardi not sure how to properly label this...

@jGaboardi
Member

I'd say this is an "enhancement" due to the performance improvement.

@martinfleis
Member Author

Thinking about this more, I think we should use the same logic in the weights and neighbors properties, which are unbearably slow for large graphs. Hard to catch these things when developing on small graphs...

codecov bot commented Mar 8, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 85.0%. Comparing base (018f1e2) to head (bcabdbc).
Report is 4 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##            main    #691     +/-   ##
=======================================
+ Coverage   84.7%   85.0%   +0.3%     
=======================================
  Files        141     141             
  Lines      15200   15203      +3     
=======================================
+ Hits       12880   12924     +44     
+ Misses      2320    2279     -41     
Files                     Coverage Δ
libpysal/graph/base.py    97.9% <100.0%> (+<0.1%) ⬆️

... and 11 files with indirect coverage changes

@knaaptime
Member

I got a question about the performance of the weights builders in the workshop yesterday, and I had to proudly scoff that they're the fastest you can find... 'Martin uses these routinely on datasets in the millions' :D

@knaaptime left a comment
Member

Looks nice. The idea is that the groupby already gives you all the info you need, so you just bypass the apply and build the lists cheaply?
Agreed, let's adopt this same style elsewhere
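To illustrate the pattern being discussed, here is a minimal sketch of the "bypass the apply" idea on a toy adjacency. The Series layout (a MultiIndex of `focal`/`neighbor` with weight values) and the variable names are assumptions for illustration, not the exact libpysal internals or the code in this PR.

```python
import pandas as pd

# Hypothetical adjacency in a Graph-like layout: a Series indexed by
# (focal, neighbor) pairs with edge weights as values.
adjacency = pd.Series(
    [1.0, 1.0, 1.0, 1.0],
    index=pd.MultiIndex.from_tuples(
        [("a", "b"), ("b", "a"), ("b", "c"), ("c", "b")],
        names=["focal", "neighbor"],
    ),
)

# Slow pattern: groupby().apply() pays a Python-level callback per group.
neighbors_slow = (
    adjacency.reset_index(level=1)
    .groupby(level=0)["neighbor"]
    .apply(list)
    .to_dict()
)

# Faster pattern: iterate the grouper once and build plain dicts directly
# from the index labels, skipping the apply machinery entirely.
grouper = adjacency.groupby(level=0)
neighbors_fast = {
    focal: sub.index.get_level_values("neighbor").tolist()
    for focal, sub in grouper
}
weights_fast = {focal: sub.tolist() for focal, sub in grouper}
```

Both variants produce the same neighbors dict; the second simply avoids the per-group `apply` overhead, which is where the cost shows up on million-edge adjacencies.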

@jGaboardi
Member

Do we know what's causing the failures on macOS & ubuntu-dev?

@knaaptime
Member

The mac test looks like it's probably a fluke. The ubuntu stuff is all over the place.

@martinfleis
Member Author

Builders are fast, but some compatibility bits seem to be worse :). I'm noticing it only now that I regularly use Graph on large data. There's a reason it is still experimental :)

@ljwolf
Member

ljwolf commented Mar 8, 2024

This looks fine to me! I think we may want to consider some benchmarks with asv for these kinds of things... I think that construction, serialization, conversion, lag, and standardization are the big targets?
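For reference, asv benchmarks are plain Python classes whose `time_*` methods asv discovers and times. The sketch below shows what a benchmark for the adjacency-to-dict conversion might look like; the class name, data layout, and sizes are assumptions, not project code.

```python
import numpy as np
import pandas as pd


class TimeAdjacencyOps:
    """asv-style benchmark: asv runs setup() once per round, then times time_*()."""

    def setup(self):
        # Random adjacency of 10k edges as a stand-in for real data.
        rng = np.random.default_rng(0)
        n_edges = 10_000
        self.adjacency = pd.Series(
            np.ones(n_edges),
            index=pd.MultiIndex.from_arrays(
                [rng.integers(0, 500, n_edges), rng.integers(0, 500, n_edges)],
                names=["focal", "neighbor"],
            ),
        )

    def time_to_dicts(self):
        # The conversion under test: build neighbors dicts from the grouper.
        grouper = self.adjacency.groupby(level=0)
        {f: s.index.get_level_values("neighbor").tolist() for f, s in grouper}
```

Construction, serialization, conversion, lag, and standardization would each get their own `time_*` method in the same style.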

@jGaboardi
Member

Looks like all the failures will be fixed by #692. Going ahead with that merge now; we'll see if we can get this green.

@jGaboardi
Member

All green (for now --> the macOS failure just before had to do with a connection issue for geodatasets).

Not sure if we wanted to see about implementing @ljwolf's idea for asv benchmarking here, or whether that's something for the future. Thinking it's probably for the future, but I'll wait on merging until confirmed.

@martinfleis
Member Author

Not going to do asv here. I'm generally a bit skeptical about it, given we have it in geopandas and no one is running it or anything.

@martinfleis martinfleis merged commit 8748bdd into pysal:main Mar 9, 2024
11 checks passed
@martinfleis martinfleis deleted the to_w branch March 9, 2024 08:06
Development

Successfully merging this pull request may close these issues.

Graph.to_W is slow
4 participants