
Algebraic connectivity, Fiedler vector and spectral ordering #1134

Merged
59 commits merged into networkx:master on May 12, 2014

Conversation

ysitu
Contributor

@ysitu ysitu commented May 1, 2014

This addresses #1123.

This basically involves finding the second smallest eigenvalue of the Laplacian matrix of a graph and the corresponding eigenvector. Supposedly, this should be handled by SciPy, but I included one extra algorithm so that for each of my eight test problems, at least one algorithm solves it within 10 seconds.

Algorithms

To be clear up front, NumPy alone, or anything based on dense matrices, will not do the job: it becomes a memory bomb when used on large graphs. SciPy provides two sparse solvers, eigsh and lobpcg. eigsh is the implicitly restarted Lanczos iteration from ARPACK. lobpcg stands for "locally optimal block preconditioned conjugate gradient". I know about the "PCG" part but not the "LOB" part.

The algorithm not from SciPy is TraceMin. I am using a variant specifically designed for finding Fiedler vectors. TraceMin relies on a linear system solver. PCG is an obvious option, but personally I prefer sparse direct solvers: they are robust, and their comfort zone easily covers the graph sizes within NetworkX's reach. SciPy bundles SuperLU as splu. There are better options, but they are shut out by licensing issues—UMFPACK, CHOLMOD and PARDISO all fall into this category.
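For reference, a minimal sketch (not the code in this PR) of how the two SciPy solvers can be pointed at a graph Laplacian; the graph, tolerance, and random seed below are placeholders:

```python
import networkx as nx
import numpy as np
from scipy.sparse.linalg import eigsh, lobpcg

G = nx.path_graph(1000)                    # placeholder graph
L = nx.laplacian_matrix(G).astype(float)   # sparse Laplacian

# ARPACK (implicitly restarted Lanczos): ask for the two smallest eigenpairs.
vals, vecs = eigsh(L, k=2, which='SM')
algebraic_connectivity = vals[1]           # second smallest eigenvalue
fiedler_vector = vecs[:, 1]                # its eigenvector

# LOBPCG: start from a random block constrained (via Y) to stay orthogonal
# to the constant vector, which is the eigenvector of the zero eigenvalue.
n = L.shape[0]
X = np.random.default_rng(0).standard_normal((n, 1))
Y = np.ones((n, 1))
vals2, vecs2 = lobpcg(L, X, Y=Y, largest=False, tol=1e-8, maxiter=2000)
# vals2[0] approximates the algebraic connectivity, vecs2[:, 0] the Fiedler vector.
```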

Benchmarking

Long-n (n = 32768, m = 63472, weighted)
----------------------------------------------------------------------
eigsh               (did not converge) 2191.91
lobpcg              0.0302404279791517   38.62
TraceMin_pcg        0.0302404277402188   49.09
TraceMin_chol       0.0302404277401874    0.95   (Cholesky      0.058)
TraceMin_lu         0.0302404277402692    1.11   (LU            0.145)

Random4-n (n = 32768, m = 131056, weighted)
----------------------------------------------------------------------
eigsh                 3900.84878990967   29.43
lobpcg                3900.84878990965    4.03
TraceMin_pcg          3900.84878997654   26.30
TraceMin_chol         3900.84878997652   76.62   (Cholesky     21.657)
TraceMin_lu           3900.84878997653  819.56   (LU          615.625)

Square-n (n = 32761, m = 65160, weighted)
----------------------------------------------------------------------
eigsh                 3.88833036807499  680.97
lobpcg                3.88833036795032  253.81
TraceMin_pcg          3.88833036815728   49.98
TraceMin_chol         3.88833036815718    1.62   (Cholesky      0.097)
TraceMin_lu           3.88833036815734    1.92   (LU            0.249)

ak6 (n = 32774, m = 49159)
----------------------------------------------------------------------
eigsh               (did not converge) 1826.40
lobpcg            5.70035769998342e-08  102.51
TraceMin_pcg      5.72572519105717e-08   80.71
TraceMin_chol     5.72572519439957e-08    0.62   (Cholesky      0.014)
TraceMin_lu       5.72572486979079e-08    0.70   (LU            0.095)

gl6 (n = 32786, m = 93145)
----------------------------------------------------------------------
eigsh              0.00026223266873431   94.72
lobpcg            0.000262232668724498    4.54
TraceMin_pcg      0.000262232668735754    5.78
TraceMin_chol     0.000262232668735774    1.55   (Cholesky      0.283)
TraceMin_lu       0.000262232668735766    3.06   (LU            1.637)

gw6 (n = 32768, m = 93184)
----------------------------------------------------------------------
eigsh                0.152240934977433    1.16
lobpcg               0.152240934977427    2.60
TraceMin_pcg         0.152240934977428    2.56
TraceMin_chol        0.152240934977426   11.75   (Cholesky      7.699)
TraceMin_lu          0.152240934977426  345.57   (LU          331.141)

netgen-6 (n = 8000, m = 14999, weighted)
----------------------------------------------------------------------
eigsh                 74.0188255406067    4.29
lobpcg                74.0188255406006    0.73
TraceMin_pcg          74.0188255457315    1.28
TraceMin_chol         74.0188255457307    0.55   (Cholesky      0.126)
TraceMin_lu           74.0188255457307    1.93   (LU            1.299)

wlm6 (n = 8194, m = 179238)
----------------------------------------------------------------------
eigsh              0.00829854600893233   13.50
lobpcg             0.00829854600893219    2.75
TraceMin_pcg       0.00829854600893260    2.92
TraceMin_chol      0.00829854600893253    1.41   (Cholesky      0.089)
TraceMin_lu        0.00829854600893275    1.73   (LU            0.344)

The second column is the computed eigenvalue, and the third column is the time in seconds. The numbers in parentheses are the Cholesky/LU factorization times. Cholesky factorization relies on the GPL-licensed scikits.sparse wrapper for CHOLMOD; for this reason, the related code is not in this PR. I suppose that scikits.sparse could be stripped down to use only the LGPL portion of CHOLMOD and become LGPL itself, but I am not familiar enough with Cython to do that work.

From the data, it is evident that eigsh is not nearly as robust as the rest, although it can be fast at times. lobpcg and TraceMin_chol/lu both have good and bad cases, while TraceMin_pcg sits somewhere in between. It should be noted that for lobpcg and TraceMin_pcg, I am using only Jacobi preconditioning. (In separate tests, passing the Cholesky factorization as the preconditioner to lobpcg improves its bad cases by at least an order of magnitude but produces incorrect results in one case.) lobpcg is very sensitive to the initial guess and can fail to converge if it is not good enough. I produce an estimate of the Fiedler vector using the reverse Cuthill–McKee ordering, as sketched below. The same estimate does not seem to help TraceMin.
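A hedged sketch of that kind of RCM-based starting vector (the exact construction in the PR may differ): rank each vertex by its position in the reverse Cuthill–McKee ordering of the Laplacian, center the ranks, and normalize.

```python
import numpy as np
from scipy.sparse.csgraph import reverse_cuthill_mckee

def rcm_fiedler_guess(L):
    """Crude Fiedler vector estimate from the RCM ordering of the Laplacian L."""
    order = reverse_cuthill_mckee(L.tocsr())  # permutation array over the vertices
    n = L.shape[0]
    x = np.empty(n)
    x[order] = np.arange(n)                   # position of each vertex in the ordering
    x -= x.mean()                             # orthogonal to the constant vector
    return x / np.linalg.norm(x)
```

Passed as the initial block to lobpcg, together with a simple Jacobi preconditioner built from the diagonal of L, a guess like this is the kind of starting point described above.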

The effectiveness of Cholesky/LU factorization depends heavily on the structure of the matrix and on the matrix reordering algorithm. For example, both gl6 and gw6 are three-dimensional grid graphs, but gw6 has a larger bisection width and thus takes a long time to factorize. The METIS library used by CHOLMOD is much more effective at discovering small bisections than the multiple minimum degree algorithm used by SuperLU; as a result, CHOLMOD is more than 40 times faster at factorizing gw6. (As a separate issue, it may be of interest to some users if NetworkX included graph ordering and partitioning functionality.)
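To make the reordering effect concrete, here is a rough, hypothetical illustration using SuperLU's own column-permutation options on a small 3-D grid Laplacian (shifted to be nonsingular); the graph size is a placeholder, not one of the benchmark problems above.

```python
import time
import networkx as nx
from scipy.sparse import identity
from scipy.sparse.linalg import splu

G = nx.grid_graph(dim=(20, 20, 20))             # small stand-in for a 3-D grid graph
L = nx.laplacian_matrix(G).astype(float)
A = (L + 1e-2 * identity(L.shape[0])).tocsc()   # shift: the Laplacian itself is singular

for spec in ('COLAMD', 'MMD_AT_PLUS_A', 'NATURAL'):
    t0 = time.perf_counter()
    lu = splu(A, permc_spec=spec)
    fill = lu.L.nnz + lu.U.nnz                  # fill-in grows with a poor ordering
    print('%-15s %7.2fs  nnz(L)+nnz(U) = %d' % (spec, time.perf_counter() - t0, fill))
```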

ysitu added 30 commits April 26, 2014 11:47
@argriffing

what else is left to do

Looks good. I think the tracemin implementation could be vectorized more, but that doesn't need to be in this PR. By the way, applying Laplacian system solvers (Lx = b) to graph algorithms is trendy, so maybe more of these efficient algorithms could be added separately later.

@ysitu
Contributor Author

ysitu commented May 6, 2014

@chebee7i The remaining question is whether I should restore the Cholesky solver from scikits.sparse.

@argriffing

The question is whether to paste scikits.sparse code into networkx so that if users have CHOLMOD separately installed then they can use it through this wrapper?

@ysitu
Contributor Author

ysitu commented May 6, 2014

@argriffing Importing from scikits.sparse is good enough.
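For context, the kind of optional import being discussed might look like the sketch below. The module and attribute names follow the scikits.sparse CHOLMOD wrapper as commonly described and should be treated as assumptions; the fallback uses SciPy's bundled SuperLU.

```python
# Hypothetical optional-dependency pattern: prefer CHOLMOD via scikits.sparse
# if the user has it installed, otherwise fall back to SuperLU from SciPy.
try:
    from scikits.sparse.cholmod import cholesky   # GPL-licensed wrapper, optional

    def make_solver(A):
        factor = cholesky(A.tocsc())
        return lambda b: factor(b)                # solve A x = b with the Cholesky factor
except ImportError:
    from scipy.sparse.linalg import splu

    def make_solver(A):
        lu = splu(A.tocsc())
        return lu.solve                           # solve A x = b with the LU factors
```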

@chebee7i
Member

chebee7i commented May 6, 2014

TLDR: I vote for including it.

My personal opinion is that open source software packages are too fussy in deciding whether to use other open source packages. If a mistake is made (with respect to licensing), it's not as if it was done in bad faith. So if someone upstream makes an issue of it and has weight to throw around, then we change it. No big deal. We are not lawyers, and the legal status of linking and derivative works, especially as it relates to Python's run-time compiled modules, is still undecided. Given our project's relevance, we are also probably one of the least likely projects to face litigation.

IMO, simply typing: import X where X is some GPL or LGPL module does not mean that NetworkX is "linking" to X or is a derivative work of X. The meaning of X is determined at runtime. I can, on my computer, put anything I want in my PYTHONPATH with the name X that supports the same public API that NetworkX is expecting---and that thing I put in my PYTHONPATH need not be the GPL'd or LGPL'd package X. So it doesn't logically follow that "import X" means we are "using" GPL/LGPL'd code. This remains true even when we distribute wheel files for NetworkX (of course, this would get messier if the NetworkX wheel included, say, compiled Cython libraries).

There is an interesting discussion in this thread; here are two views similar to what I just described:

Grant Edwards: https://groups.google.com/forum/#!msg/comp.lang.python/zYZGRRqs_Mk/cMJkm9gfi-0J
Alex Martelli: https://groups.google.com/d/msg/comp.lang.python/zYZGRRqs_Mk/kxt6qtLjR_oJ

@ysitu
Contributor Author

ysitu commented May 7, 2014

@argriffing I wonder what you mean by "vectorization" here. I suppose that it is not the same thing as what you get from icc/ifort.

BTW, for the orthonormalization step in TraceMIN I had wanted to use scipy.linalg.orth. But I think that it is a memory bomb for tall, skinny input. Can you help confirm?

@argriffing

I wonder what you mean by "vectorization" here.

@ysitu I just meant avoiding the Python loops.

@argriffing

I had wanted to use scipy.linalg.orth. But I think that it is a memory bomb

I haven't checked this in detail, but given that its first line https://github.com/scipy/scipy/blob/master/scipy/linalg/decomp_svd.py#L201 calls svd without specifying extra options, and that svd without extra options computes full MxM and NxN matrices https://github.com/scipy/scipy/blob/master/scipy/linalg/decomp_svd.py#L15, I would agree that it appears to unnecessarily bomb the memory. Maybe add a SciPy issue, PR, or test? (fixed)
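A small sketch of the memory issue and the usual workaround (the sizes below are arbitrary): an economy-size QR or a thin SVD keeps the factors n-by-k, whereas calling svd with full matrices on a tall, skinny block allocates an n-by-n array.

```python
import numpy as np
from scipy.linalg import qr, svd

n, k = 100000, 4
X = np.random.default_rng(0).standard_normal((n, k))

Q, _ = qr(X, mode='economic')           # Q is n-by-k: fine for tall, skinny input
U, s, Vh = svd(X, full_matrices=False)  # thin SVD: U is also n-by-k
# svd(X) with full_matrices=True would build an n-by-n U:
# 100000 x 100000 doubles is roughly 80 GB.
```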

@argriffing

I've added a PR to fix the orth problem in scipy/scipy#3626

@ysitu
Contributor Author

ysitu commented May 10, 2014

I have no further changes planned. Here are the latest benchmarking results (omitting ARPACK and SuperLU to save testing time):

Long-n (n = 32768, m = 63472, weighted)
----------------------------------------------------------------------
lobpcg              0.0302404277260102   27.56
TraceMIN_pcg         0.030240428135769   22.76
TraceMIN_chol        0.030240427731946    0.97

Random4-n (n = 32768, m = 131056, weighted)
----------------------------------------------------------------------
lobpcg                3900.84878990957    4.25
TraceMIN_pcg          3900.84878997912   11.25
TraceMIN_chol         3900.84878991507   38.24

Square-n (n = 32761, m = 65160, weighted)
----------------------------------------------------------------------
lobpcg                3.88833036817791  129.01
TraceMIN_pcg          3.88833036816306   10.04
TraceMIN_chol         3.88833036813151    1.31

ak6 (n = 32774, m = 49159)
----------------------------------------------------------------------
lobpcg            5.70036654894323e-08   71.02
TraceMIN_pcg      5.79241219647608e-08   42.79
TraceMIN_chol     5.70164761757119e-08    0.67

gl6 (n = 32786, m = 93145)
----------------------------------------------------------------------
lobpcg            0.000262232668737997    4.02
TraceMIN_pcg      0.000262232668735729    2.96
TraceMIN_chol     0.000262232668735724    1.50

gw6 (n = 32768, m = 93184)
----------------------------------------------------------------------
lobpcg               0.152240934977426    2.76
TraceMIN_pcg         0.152240934977427    1.39
TraceMIN_chol        0.152240934977427   11.69

netgen-6 (n = 8000, m = 14999, weighted)
----------------------------------------------------------------------
lobpcg                74.0188255406044    0.71
TraceMIN_pcg          74.0188255429854    0.74
TraceMIN_chol         74.0188255417018    0.48

wlm6 (n = 8194, m = 179238)
----------------------------------------------------------------------
lobpcg             0.00829854600893217    2.83
TraceMIN_pcg       0.00829854600893189    2.00
TraceMIN_chol      0.00829854600893205    1.51

@jtorrents jtorrents added this to the networkx-1.9 milestone May 11, 2014
@jtorrents
Member

Great work @ysitu! This is very, very nice. I've tagged this as 1.9 because it seems that it's ready for merging, but I didn't look at it closely. Any more comments?

@argriffing

Any more comments?

It looks well tested so I assume it works, and I can vouch that it provides functions that are useful for spectral graph theory. I hope it is merged soon!

@chebee7i
Member

Merging...

chebee7i added a commit that referenced this pull request May 12, 2014
Algebraic connectivity, Fiedler vector and spectral ordering
@chebee7i chebee7i merged commit 3eb0369 into networkx:master May 12, 2014
@ysitu ysitu deleted the fiedler_nochol branch May 12, 2014 05:15
@ysitu ysitu added the 1.9 label May 13, 2014