NX 3.0: changed return type of directed_combinatorial_laplacian to SciPy Sparse Matrix #4141
Conversation
Hello @willpeppo! Thanks for updating this PR. There are currently no PEP 8 issues detected in this Pull Request. Cheers! Please install and run psf/black. Comment last updated at 2020-08-07 19:08:15 UTC
Woohoo! Congrats Will, and thanks for sticking with us through the afternoon!
I took another look at this and noticed a few things.
The original implementation of _transition_matrix actually gives different return types based on walk_type (or, in the case of walk_type=None, based on the properties of the input graph). Walk types of random and lazy give scipy.sparse.csr_matrix outputs, whereas pagerank gives dense numpy.matrix outputs. For example, with v2.5:
>>> G = nx.complete_graph(32, create_using=nx.DiGraph) # strongly-connected to avoid div-by-zero warnings
>>> from networkx.linalg.laplacianmatrix import _transition_matrix
>>> _transition_matrix(G, walk_type="random")
<32x32 sparse matrix of type '<class 'numpy.float64'>'
with 992 stored elements in Compressed Sparse Row format>
>>> _transition_matrix(G, walk_type="lazy")
<32x32 sparse matrix of type '<class 'numpy.float64'>'
with 1024 stored elements in Compressed Sparse Row format>
>>> _transition_matrix(G, walk_type="pagerank")
matrix([[ ...
...
]])
32c6b7a illustrates how this could be modified to ensure that sparsity is maintained throughout the computations in _transition_matrix for all modes, though the trick with dangling will result in sparse efficiency warnings.

Following up on the dangling line: this seems like a way to avoid div-by-zero errors. A question for those more familiar with the algorithm(s): is this procedure specific to the pagerank method? Could the same procedure be used for the random and lazy modes? A rough sketch of my reading of the dangling fix follows.
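For concreteness, here is a minimal sketch of how such a dangling fix could work. This is my own reconstruction, not the actual NetworkX code; the toy graph and variable names are hypothetical. The idea is that rows with zero out-degree are replaced with a uniform distribution before row-normalizing, which avoids the division by zero. Doing that row assignment directly on a CSR matrix is what would emit SparseEfficiencyWarning; LIL format sidesteps it here.

import numpy as np
import scipy.sparse as sp
import networkx as nx

G = nx.DiGraph([(0, 1), (1, 2)])  # node 2 is "dangling": it has no out-edges
n = G.number_of_nodes()

# Sparse adjacency; LIL tolerates row assignment, unlike CSR which warns
A = nx.adjacency_matrix(G).astype(float).tolil()

outdeg = np.asarray(A.sum(axis=1)).ravel()
for d in np.where(outdeg == 0)[0]:
    A[d, :] = 1.0 / n  # uniform row for dangling nodes avoids div-by-zero

A = A.tocsr()
T = sp.diags(1.0 / np.asarray(A.sum(axis=1)).ravel()) @ A  # row-stochastic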
The pagerank walk (as I understand it based not on NetworkX, but from teaching) is a random walk over the webpage network where at each step, with probability alpha, you take a normal random step chosen uniformly from all possible links, but with probability (1 - alpha) you jump to a webpage chosen randomly with equal probability for all webpages. The idea is that webpage users follow links approximately randomly but every once in a while get bored and just pick a webpage at random. Mathematically, this removes all "absorbing states" from the process, and the Perron-Frobenius theorem holds with a unique steady-state vector of probabilities for each webpage. In terms of "pagerank power", no webpage collects pagerank power just because random walkers don't have a way to leave.

Let Q be a matrix with all entries 1 (an n x n matrix, same shape as the transition matrix). Let T be the transition matrix for randomly walking on the webpages. Then the pagerank transition matrix will be:

    alpha * T + (1 - alpha) * Q / n

A lazy walk is a random walk where not taking a step is just as likely as any of the links, so the transition matrix gets 1/degree(v) added to the diagonal in the row corresponding to node v. A random walk should follow some edge (with equal chance) out of the node at each timestep.

Notice that the pagerank transition matrix is NOT a sparse matrix. It has a non-zero entry in every position. (The chance of going from any webpage to any other webpage is nonzero.) So maybe it shouldn't be stored as a scipy sparse matrix. It isn't sparse.

Hope this helps....
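A short sketch of the point above; this is my own construction, not the NetworkX implementation, with alpha = 0.85 as a conventional choice and the common (I + T)/2 convention for the lazy walk. The random and lazy transition matrices can stay sparse, but the pagerank matrix is dense by construction, since the teleportation term (1 - alpha) * Q / n touches every entry.

import numpy as np
import scipy.sparse as sp
import networkx as nx

G = nx.complete_graph(32, create_using=nx.DiGraph)
n = G.number_of_nodes()
A = nx.adjacency_matrix(G).astype(float)   # sparse adjacency
outdeg = np.asarray(A.sum(axis=1)).ravel() # out-degrees (no dangling nodes here)

T_random = sp.diags(1.0 / outdeg) @ A      # sparse: step uniformly along an out-edge
T_lazy = 0.5 * (sp.eye(n) + T_random)      # sparse: stay put with probability 1/2

alpha = 0.85
# Dense by construction: the all-ones teleportation term fills every entry
T_pagerank = alpha * T_random.toarray() + (1 - alpha) * np.ones((n, n)) / n

assert np.allclose(T_pagerank.sum(axis=1), 1.0)  # each row is a distribution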
It definitely does, thanks for taking the time to lay this out. Given that the ...
Okay, I've looked at this (and related ...). It's not worth the effort to remove the uses of ... That's my opinion anyway :) This issue proved to be very thorny during the sprint and doesn't really have a clear-cut solution as far as I can see.
This one has also been superseded by #5139.
This is a change for nx 3.0.