ENH: Speed up common/non_neighbors by using _adj dict operations #7244
This looks good. The reasons for the generator approach are old and no longer relevant (sets were slower and _adj was only used in the base classes). I am not sure that we need a deprecation -- it is true that the type of the return value is changed from generator to set. But that will only break code that is calling There are quite a few other functions in the
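To make the generator-to-set change concrete, here is a minimal sketch with plain Python objects. The names `as_generator` and `as_set` are hypothetical, and the `next()` case is only one example of iterator-specific usage; it may or may not be the case the comment above had in mind.

```python
neighbors = ["b", "c"]

as_generator = (n for n in neighbors)  # stand-in for the old return value
as_set = set(neighbors)                # stand-in for the new return value

# Code that only iterates (or sorts) sees the same contents either way.
assert sorted(as_generator) == sorted(as_set) == ["b", "c"]

# A generator is exhausted after one pass; a set can be iterated again.
assert list(as_generator) == []
assert sorted(as_set) == ["b", "c"]

# Iterator-only operations such as next() do not work on a set directly.
try:
    next(as_set)
    set_supports_next = True
except TypeError:
    set_supports_next = False
```

Wrapping the result in `iter(...)` would restore iterator behaviour for any caller that needs it.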
The `non_neighbors` change looks good too.
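As a rough illustration of the `non_neighbors` idea, the sketch below uses a plain dict-of-dicts as a stand-in for `G._adj` (an assumption, so that it runs without NetworkX). Both function names are hypothetical; the generator version is only a plausible shape for the older approach, not a copy of it.

```python
# Hypothetical adjacency mapping: node -> {neighbor: edge_data}
adj = {
    "a": {"b": {}},
    "b": {"a": {}, "c": {}},
    "c": {"b": {}},
    "d": {},
}

def non_neighbors_generator(adj, v):
    """Lazily yield nodes that are neither v nor adjacent to v."""
    excluded = set(adj[v]) | {v}
    return (n for n in adj if n not in excluded)

def non_neighbors_set(adj, v):
    """Same result, computed with set operations on dict key views."""
    return adj.keys() - adj[v].keys() - {v}

print(non_neighbors_set(adj, "a"))
```

Dict key views support `-`, `&`, and `|` directly and return ordinary sets, so no intermediate `set(...)` calls are needed.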
I think I have covered the all
It would be cool to see if these changes make any difference. That could help motivate us to do similar things elsewhere. Perhaps timing the tests is a place to start -- though they are not big graphs there...
Other possibilities:
1292: Maybe `node in G._adj` is faster than `node in G`?
654: values[n] -> v because otherwise why loop over items()
902-904: return from inside the if-structure rather than assigning to values
That's what I found -- and I think this is getting into the weeds. :)
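One way to check the `node in G._adj` versus `node in G` question is a quick `timeit` comparison. The sketch below uses a hypothetical `TinyGraph` class whose `__contains__` delegates to its `_adj` dict inside a try/except; that wrapper is an assumption about how a graph class might implement membership, not code taken from the PR.

```python
import timeit

class TinyGraph:
    """Hypothetical stand-in for a graph class that stores adjacency in `_adj`."""

    def __init__(self, adj):
        self._adj = adj

    def __contains__(self, node):
        # Assumed wrapper: delegate to the dict and treat unhashable input as absent.
        try:
            return node in self._adj
        except TypeError:
            return False

G = TinyGraph({i: {} for i in range(1000)})

# Time the same membership test through the wrapper and on the dict directly.
via_graph = timeit.timeit(lambda: 500 in G, number=100_000)
via_adj = timeit.timeit(lambda: 500 in G._adj, number=100_000)

print(f"node in G:      {via_graph:.4f} s")
print(f"node in G._adj: {via_adj:.4f} s")
```

Note that the two spellings are not fully interchangeable under this assumption: an unhashable `node` makes the direct dict lookup raise `TypeError`, while the wrapper returns `False`.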
This changes behavior.
Nice catch! I guess that raises the question of whether a single try/except with a loop inside is better than many try/excepts inside a loop. That should be a Python-dependent question, maybe answered elsewhere -- but I didn't find an answer in a quick search.
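The two placements of try/except are not only a performance question; they can also produce different results. The sketch below is a generic illustration with hypothetical function names and a plain dict standing in for an adjacency mapping. It is not the specific code change flagged above.

```python
adj = {"a": {"b": {}}, "b": {"a": {}}}
nodes = ["a", "missing", "b"]

def degrees_try_outside(adj, nodes):
    """One try/except around the whole loop: the first failure ends the loop."""
    result = {}
    try:
        for n in nodes:
            result[n] = len(adj[n])
    except KeyError:
        pass
    return result

def degrees_try_inside(adj, nodes):
    """One try/except per item: a failure skips only that item."""
    result = {}
    for n in nodes:
        try:
            result[n] = len(adj[n])
        except KeyError:
            continue
    return result

print(degrees_try_outside(adj, nodes))
print(degrees_try_inside(adj, nodes))
```

With the try outside the loop, processing stops at `"missing"` and `"b"` is never reached; with the try inside, `"b"` is still handled.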
Co-authored-by: Dan Schult <dschult@colgate.edu>
I approve this PR.
Style changes, if you want:
- in the test module, change `nbors` -> `nbrs`
- the suggestion below for avoiding ugly (IMO) line breaks
I hope this doesn't distract the discussion too much... but since this PR largely deals with performance, I thought it'd be a good opportunity to try adding some benchmarks! I went ahead and pushed up benchmarks for
Which on my system (M1 Mac running Asahi Linux) gives
Let's merge this in with the benchmarks :)
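For readers who want to try a comparison locally, here is a rough `timeit` sketch. It is not the benchmark code added in this PR: the ring-lattice builder, the function names, and the graph size are all assumptions, and a plain dict-of-dicts stands in for `G._adj`.

```python
import timeit

def build_adj(n, k):
    """Hypothetical ring lattice: each node is linked to its k nearest nodes on each side."""
    adj = {i: {} for i in range(n)}
    for i in range(n):
        for step in range(1, k + 1):
            j = (i + step) % n
            adj[i][j] = {}
            adj[j][i] = {}
    return adj

def common_generator(adj, u, v):
    # Generator-style approach: test each neighbor of u against v's neighbors.
    return (w for w in adj[u] if w in adj[v] and w not in (u, v))

def common_set(adj, u, v):
    # Set-style approach: intersect the key views and drop the endpoints.
    return (adj[u].keys() & adj[v].keys()) - {u, v}

adj = build_adj(n=2000, k=50)

t_gen = timeit.timeit(lambda: list(common_generator(adj, 0, 10)), number=2000)
t_set = timeit.timeit(lambda: common_set(adj, 0, 10), number=2000)

print(f"generator: {t_gen:.4f} s")
print(f"set ops:   {t_set:.4f} s")
```

Timings depend on the machine, the Python version, and the graph's degree distribution, so any numbers from this sketch are indicative only.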
…workx#7244)

* ENH: Speed up common_neighbors by using _adj dict operations
* Speed up non_neighbors
* need for speed
* Update networkx/classes/function.py

  Co-authored-by: Dan Schult <dschult@colgate.edu>
* Update networkx/algorithms/link_prediction.py

  Co-authored-by: Dan Schult <dschult@colgate.edu>
* Add benchmarks for non_neighbors.
* Add benchmarks for common_neighbors.

---------

Co-authored-by: Dan Schult <dschult@colgate.edu>
Co-authored-by: Ross Barnowski <rossbar@berkeley.edu>
Co-authored-by: Ross Barnowski <rossbar@caltech.edu>
I'm not sure why we should not use direct set operations on the `_adj` dictionary to find common neighbors. This will probably require a deprecation, as we will change the return type (generator to `set`).

This was spurred by #7243.
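The idea in the description can be sketched as follows, using a plain dict-of-dicts as a stand-in for `G._adj` (an assumption, so the example runs without NetworkX). The function name `common_neighbors_set` is hypothetical and this is not necessarily the exact code merged in the PR.

```python
# Hypothetical adjacency mapping: node -> {neighbor: edge_data}
adj = {
    "a": {"b": {}, "c": {}},
    "b": {"a": {}, "c": {}, "d": {}},
    "c": {"a": {}, "b": {}, "d": {}},
    "d": {"b": {}, "c": {}},
}

def common_neighbors_set(adj, u, v):
    """Intersect the two neighbor key views, then drop u and v themselves."""
    return (adj[u].keys() & adj[v].keys()) - {u, v}

print(common_neighbors_set(adj, "a", "d"))
print(common_neighbors_set(adj, "a", "b"))
```

The parentheses are there for readability. Since `-` binds more tightly than `&` in Python, the unparenthesized form `adj[u].keys() & adj[v].keys() - {u, v}` would subtract first and intersect second, which gives the same result here.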