
NX 3.0: changed return type of directed_combinatorial_laplacian to SciPy Sparse Matrix #4141

Closed
wants to merge 3 commits

Conversation

willpeppo
Contributor

This is a change for nx 3.0.

@pep8speaks

pep8speaks commented Aug 7, 2020

Hello @willpeppo! Thanks for updating this PR.

There are currently no PEP 8 issues detected in this Pull Request. Cheers! Please install and run psf/black.

Comment last updated at 2020-08-07 19:08:15 UTC

Contributor

@ericmjl ericmjl left a comment


Woohoo! Congrats Will, and thanks for sticking with us through the afternoon!

@willpeppo
Contributor Author

@ericmjl, @rossbar, @dschult thanks! That was awesome hanging out with you guys today. I really appreciate it, and thanks for sticking with ME. I learned a lot and look forward to collaborating on some other things...

Contributor

@rossbar rossbar left a comment


I took another look at this and noticed a few things.

The original implementation of _transition_matrix actually gives different return types based on walk_type (or, in the case of walk_type=None, based on the properties of the input graph). The random and lazy walk types give scipy.sparse.csr_matrix outputs, whereas pagerank gives a dense numpy.matrix output. For example, with v2.5:

>>> G = nx.complete_graph(32, create_using=nx.DiGraph)  # strongly-connected to avoid div-by-zero warnings
>>> from networkx.linalg.laplacianmatrix import _transition_matrix
>>> _transition_matrix(G, walk_type="random")
<32x32 sparse matrix of type '<class 'numpy.float64'>'
        with 992 stored elements in Compressed Sparse Row format>
>>> _transition_matrix(G, walk_type="lazy")
<32x32 sparse matrix of type '<class 'numpy.float64'>'
        with 1024 stored elements in Compressed Sparse Row format>
>>> _transition_matrix(G, walk_type="pagerank")
matrix([[ ...
   ...
]])

32c6b7a illustrates how this could be modified to ensure that sparsity is maintained throughout the computations in _transition_matrix for all modes, though the trick with dangling will result in sparse efficiency warnings.

Following up on the dangling line: this seems like a way to avoid div-by-zero errors. A question for those more familiar with the algorithm(s): Is this procedure specific to the pagerank method? Could the same procedure be used for the random and lazy modes?
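For reference, the "dangling" trick can be illustrated with a small dense sketch (a hypothetical example, not the actual NetworkX code): rows with zero out-degree are replaced by uniform rows before row-normalizing, which is exactly what avoids the div-by-zero.

```python
import numpy as np

def fix_dangling(A):
    """Replace all-zero rows (dangling nodes) with uniform rows, then
    row-normalize.  Hypothetical illustration of the div-by-zero
    workaround discussed above; done on a scipy.sparse matrix instead,
    the row assignment is what triggers sparse efficiency warnings."""
    A = A.astype(float).copy()
    n = A.shape[0]
    dangling = A.sum(axis=1) == 0     # nodes with no out-edges
    A[dangling] = 1.0 / n             # uniform jump from dangling nodes
    return A / A.sum(axis=1, keepdims=True)
```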

@dschult
Member

dschult commented Oct 23, 2020

The pagerank walk (as I understand it, based not on NetworkX but on teaching) is a random walk over the webpage network where, at each step, with probability alpha you take a normal random step chosen uniformly from all possible links, but with probability (1-alpha) you jump to a webpage chosen at random with equal probability for all webpages. The idea is that webpage users follow links approximately randomly, but every once in a while get bored and just pick a webpage at random. Mathematically, this removes all "absorbing states" from the process, and the Perron-Frobenius theorem holds with a unique steady-state vector of probabilities for each webpage. In terms of "pagerank power", no webpage collects pagerank power just because random walkers don't have a way to leave.

Let Q be a matrix with all entries 1 (an n×n matrix, the same shape as the transition matrix). Let T be the transition matrix for randomly walking on the webpages. Then the pagerank transition matrix is: alpha*T + (1-alpha)*Q/n

A lazy walk is a random walk where not taking a step is just as likely as following any single link... So the transition matrix gets 1/degree(v) added to the diagonal in the row corresponding to node v.

A random walk should follow some edge (with equal chance) out of the node at each time step.

Notice that the pagerank transition matrix is NOT a sparse matrix. It has a non-zero entry in every position. (The chance of going from any webpage to any other webpage is nonzero.) So, maybe it shouldn't be stored as a scipy sparse matrix. It isn't sparse.
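The three walk types described above can be sketched in a few lines of numpy (a hypothetical illustration, not the actual NetworkX implementation; it assumes every node has at least one out-edge, and the lazy-walk rows are renormalized so each still sums to 1):

```python
import networkx as nx
import numpy as np

def transition_matrices(G, alpha=0.95):
    """Dense sketches of the three walk types discussed above.

    Assumes every node has at least one out-edge (e.g. a strongly
    connected graph), so the row normalizations are well defined.
    """
    A = nx.to_numpy_array(G)                       # adjacency matrix
    n = A.shape[0]
    deg = A.sum(axis=1)                            # out-degrees
    # random walk: uniform step over out-edges, P(u -> v) = 1/deg(u)
    T = A / deg[:, None]
    # lazy walk: staying put is as likely as following any single edge,
    # i.e. add 1 to the diagonal of A and renormalize each row
    T_lazy = (A + np.eye(n)) / (deg + 1)[:, None]
    # pagerank walk: alpha*T + (1 - alpha)*Q/n with Q the all-ones matrix;
    # every entry is nonzero, so the result is inherently dense
    T_pagerank = alpha * T + (1 - alpha) * np.ones((n, n)) / n
    return T, T_lazy, T_pagerank
```

Note how T and T_lazy keep (roughly) the sparsity pattern of A, while T_pagerank has no zero entries at all, which matches the point that it shouldn't be stored as a scipy sparse matrix.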

Hope this helps....

@rossbar
Contributor

rossbar commented Oct 23, 2020

Hope this helps....

It definitely does, thanks for taking the time to lay this out. Given that the pagerank method gives a non-sparse matrix by design, that would seem to eliminate the motivation for these changes IMO.

@rossbar
Contributor

rossbar commented Feb 5, 2021

Okay, I've looked at this (and related np.matrix issues) again and have formed the following opinion:

It's not worth the effort to remove the uses of np.matrix in these cases. np.matrix is baked into scipy.sparse, so problems like this will continue to arise any time scipy.sparse is used. IMO it wouldn't be worth it to ask people contributing algorithms to work around np.matrix, and it's likewise probably not worth the effort of developers/maintainers to figure out how to work around it either. The benefits of scipy.sparse far outweigh the drawbacks of dropping it (i.e. not having sparse representations), so my preference would be to kick the can on this issue until a more suitable solution can be found. In principle, a better solution would be to solve this problem upstream, either by removing the dependence on np.matrix from scipy.sparse or by investigating and switching to an alternative sparse library that supports nD arrays.

That's my opinion anyway :) This issue proved to be very thorny during the sprint and doesn't really have a clear-cut solution as far as I can see.

Base automatically changed from master to main March 4, 2021 18:20
@rossbar rossbar mentioned this pull request Apr 12, 2021
@rossbar
Contributor

rossbar commented Jan 27, 2022

This one has also been superseded by #5139.

@rossbar rossbar closed this Jan 27, 2022