
NX 3.0: changed return type of directed_combinatorial_laplacian to SciPy Sparse Matrix #4141

Closed
wants to merge 3 commits

Conversation

willpeppo
Contributor

This is a change for nx 3.0.

@pep8speaks

pep8speaks commented Aug 7, 2020

Hello @willpeppo! Thanks for updating this PR.

There are currently no PEP 8 issues detected in this Pull Request. Cheers! Please install and run psf/black.

Comment last updated at 2020-08-07 19:08:15 UTC

Contributor

@ericmjl ericmjl left a comment


Woohoo! Congrats Will, and thanks for sticking with us through the afternoon!

@willpeppo
Contributor Author

@ericmjl, @rossbar, @dschult thanks! That was awesome hanging out with you guys today. I really appreciate it, and thanks for sticking with ME. I learned a lot and look forward to collaborating on some other things...

Contributor

@rossbar rossbar left a comment


I took another look at this and noticed a few things.

The original implementation of _transition_matrix actually gives different return types based on walk_type (or, in the case of walk_type=None, based on the properties of the input graph). The random and lazy walk types give scipy.sparse.csr_matrix outputs, whereas pagerank gives a dense numpy.matrix output. For example, with v2.5:

>>> G = nx.complete_graph(32, create_using=nx.DiGraph)  # strongly-connected to avoid div-by-zero warnings
>>> from networkx.linalg.laplacianmatrix import _transition_matrix
>>> _transition_matrix(G, walk_type="random")
<32x32 sparse matrix of type '<class 'numpy.float64'>'
        with 992 stored elements in Compressed Sparse Row format>
>>> _transition_matrix(G, walk_type="lazy")
<32x32 sparse matrix of type '<class 'numpy.float64'>'
        with 1024 stored elements in Compressed Sparse Row format>
>>> _transition_matrix(G, walk_type="pagerank")
matrix([[ ...
   ...
]])

32c6b7a illustrates how this could be modified to ensure that sparsity is maintained throughout the computations in _transition_matrix for all modes, though the trick with dangling will result in sparse efficiency warnings.

Following up on the dangling line: this seems like a way to avoid div-by-zero errors. A question for those more familiar with the algorithm(s): Is this procedure specific to the pagerank method? Could the same procedure be used for the random and lazy modes?
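For reference, the "dangling" trick can be illustrated with a small dense sketch (a hypothetical example, not the actual NetworkX code): rows with zero out-degree are replaced by uniform rows before row-normalizing, which is exactly what avoids the div-by-zero.

```python
import numpy as np

def fix_dangling(A):
    """Replace all-zero rows (dangling nodes) with uniform rows, then
    row-normalize.  Hypothetical illustration of the div-by-zero
    workaround discussed above; done on a scipy.sparse matrix instead,
    the row assignment is what triggers sparse efficiency warnings."""
    A = A.astype(float).copy()
    n = A.shape[0]
    dangling = A.sum(axis=1) == 0     # nodes with no out-edges
    A[dangling] = 1.0 / n             # uniform jump from dangling nodes
    return A / A.sum(axis=1, keepdims=True)
```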

@dschult
Member

dschult commented Oct 23, 2020

The pagerank walk (as I understand it, based not on NetworkX but on teaching) is a random walk over the webpage network where, at each step, with probability alpha you take a normal random step chosen uniformly from all possible links, but with probability (1-alpha) you jump to a webpage chosen at random with equal probability for all webpages. The idea is that webpage users follow links approximately randomly, but every once in a while get bored and just pick a webpage at random. Mathematically, this removes all "absorbing states" from the process, and the Perron-Frobenius theorem holds with a unique steady-state vector of probabilities for each webpage. In terms of "pagerank power", no webpage collects pagerank power just because random walkers don't have a way to leave.

Let Q be a matrix with all entries 1 (an n×n matrix, the same shape as the transition matrix). Let T be the transition matrix for randomly walking on the webpages. Then the pagerank transition matrix is: alpha*T + (1-alpha)*Q/n

A lazy walk is a random walk where not taking a step is just as likely as following any single link... So the transition matrix gets 1/degree(v) added to the diagonal in the row corresponding to node v.

A random walk should follow some edge (with equal chance) out of the node at each time step.

Notice that the pagerank transition matrix is NOT a sparse matrix. It has a non-zero entry in every position. (The chance of going from any webpage to any other webpage is nonzero.) So, maybe it shouldn't be stored as a scipy sparse matrix. It isn't sparse.
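The three walk types described above can be sketched in a few lines of numpy (a hypothetical illustration, not the actual NetworkX implementation; it assumes every node has at least one out-edge, and the lazy-walk rows are renormalized so each still sums to 1):

```python
import networkx as nx
import numpy as np

def transition_matrices(G, alpha=0.95):
    """Dense sketches of the three walk types discussed above.

    Assumes every node has at least one out-edge (e.g. a strongly
    connected graph), so the row normalizations are well defined.
    """
    A = nx.to_numpy_array(G)                       # adjacency matrix
    n = A.shape[0]
    deg = A.sum(axis=1)                            # out-degrees
    # random walk: uniform step over out-edges, P(u -> v) = 1/deg(u)
    T = A / deg[:, None]
    # lazy walk: staying put is as likely as following any single edge,
    # i.e. add 1 to the diagonal of A and renormalize each row
    T_lazy = (A + np.eye(n)) / (deg + 1)[:, None]
    # pagerank walk: alpha*T + (1 - alpha)*Q/n with Q the all-ones matrix;
    # every entry is nonzero, so the result is inherently dense
    T_pagerank = alpha * T + (1 - alpha) * np.ones((n, n)) / n
    return T, T_lazy, T_pagerank
```

Note how T and T_lazy keep (roughly) the sparsity pattern of A, while T_pagerank has no zero entries at all, which matches the point that it shouldn't be stored as a scipy sparse matrix.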

Hope this helps....

@rossbar
Contributor

rossbar commented Oct 23, 2020

Hope this helps....

It definitely does, thanks for taking the time to lay this out. Given that the pagerank method gives a non-sparse matrix by design, that would seem to eliminate the motivation for these changes IMO.

@rossbar
Contributor

rossbar commented Feb 5, 2021

Okay, I've looked at this (and related np.matrix issues) again and have formed the following opinion:

It's not worth the effort to remove the uses of np.matrix in these cases. np.matrix is baked into scipy.sparse, so problems like this will continue to arise any time scipy.sparse is used. IMO it wouldn't be worth it to ask people contributing algorithms to work around np.matrix, and it's likewise probably not worth the effort of developers/maintainers to figure out how to work around it either. The benefits of scipy.sparse far outweigh the drawbacks of dropping it (i.e. not having sparse representations), so my preference would be to kick the can on this issue until a more suitable solution can be found. In principle, a better solution would be to solve this problem upstream, either by removing the dependence on np.matrix from scipy.sparse or by investigating and switching to an alternative sparse library that supports nD arrays.

That's my opinion anyway :) This issue proved to be very thorny during the sprint and doesn't really have a clear-cut solution as far as I can see.

Base automatically changed from master to main March 4, 2021 18:20
@rossbar rossbar mentioned this pull request Apr 12, 2021
@rossbar
Contributor

rossbar commented Jan 27, 2022

This one has also been superseded by #5139.

@rossbar rossbar closed this Jan 27, 2022