WIP: Use SparseMatrixCSC to store graphs #150

jpfairbanks · 2015-09-03T02:49:11Z

Opening this PR as a place to anchor discussion.

jpfairbanks · 2015-09-03T03:00:05Z

src/core.jl

 """Returns the backwards adjacency list of a graph.
 For each vertex the Array of `dst` for each edge emanating from that vertex."""
 badj(g::SimpleGraph) = g.badjlist
 badj(g::SimpleGraph, v::Int) = g.badjlist[v]

+badj(g::SimpleSparseGraph, v::Int) = g.m[v,:]'.rowval
+badj(g::SimpleSparseGraph) = @inbounds [badj(g,i) for i in 1:nv(g)]
+


We could store tril(m)' to get efficient access to badj(g,i), just like fadj.
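A minimal sketch of that idea, assuming a recent Julia where sparse matrices live in the `SparseArrays` standard library (the PR itself targets Julia 0.4). The convention `m[dst, src]` is inferred from the diff above, and `colneighbors`, `fadj_sketch`, and `badj_sketch` are hypothetical names used only for illustration. It stores the full transpose rather than `tril(m)'`, to keep the example small.

```julia
using SparseArrays

# Assumed convention: m[dst, src] is true for an edge src -> dst,
# so column v of m holds the forward neighbors of v.
# Edges: 1->2, 1->3, 2->3
src = [1, 1, 2]
dst = [2, 3, 3]
m = sparse(dst, src, fill(true, 3), 3, 3)

# Store the transpose once; column v of mt is row v of m.
mt = copy(transpose(m))

# Row indices of the stored entries in column v (a cheap lookup in CSC).
colneighbors(a::SparseMatrixCSC, v::Integer) = rowvals(a)[nzrange(a, v)]

fadj_sketch(v) = colneighbors(m, v)    # destinations of edges leaving v
badj_sketch(v) = colneighbors(mt, v)   # sources of edges arriving at v
```

With the transpose stored, both lookups read a contiguous slice of `rowvals`, instead of slicing a row of `m` and transposing it on every call.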

sbromberger · 2015-09-06T02:11:32Z

See:

…ptimizations. katz centrality now supports receive and broadcast

sbromberger · 2015-09-08T04:47:44Z

@jpfairbanks would you mind going through operators.jl and finding ways of optimizing now that we're storing adjacency matrices directly as sparse matrices? Accessors are fmat() and bmat() for forward and backward adjacencies, respectively. (Undirected graphs have bmat == fmat.)

sbromberger · 2015-09-08T15:50:15Z

On a related note: since I don't anticipate tagging this before 0.4, I've gone ahead and stripped out @compat and added julia 0.4 in REQUIRES. The tipping point was the realization late last night that 0.3 doesn't have | defined for sparse matrices (even in Compat).

Any objections to this approach? We'll make the current version (0.3.3) the last supported 0.3 version and I'll tag 0.4.0 for 0.4 (nice symmetry there).

codecov-io · 2015-09-08T16:48:55Z

Current coverage is `86.17%`

Merging #150 into master will decrease coverage by 13.83% as of 706e161

@@            master    #150   diff @@
======================================
  Files           24      23     -1
  Stmts         1166    1309   +143
  Branches         0       0       
  Methods          0       0       
======================================
- Hit           1166    1128    -38
  Partial          0       0       
- Missed           0     181   +181

Review entire Coverage Diff as of 706e161


sbromberger · 2015-09-08T18:32:19Z

Todo:

parallelize dijkstra and other functions
code coverage to 100%
figure out elegant way to specify parallel or nonparallel operation.

sbromberger · 2015-09-11T18:18:34Z

What do people think about having a LightGraphs variable called parallel that was 0..n, where n is the number of workers? It could be initialized to either 0 (forcing single-process functions) or to nworkers().

This variable could be used throughout the package to determine which method (for parallelized functions) should be used.
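One way the proposal could look, as a sketch only: `PARALLEL`, `set_parallel!`, and `sum_of_squares` are hypothetical names, not part of LightGraphs, and the example assumes a recent Julia where `nworkers()` and `pmap` come from the `Distributed` standard library.

```julia
using Distributed

# Hypothetical package-level setting: 0 forces single-process code paths,
# any value in 1:nworkers() allows the parallel ones.
const PARALLEL = Ref(0)

# Clamp the requested value to the range 0..nworkers() and store it.
function set_parallel!(n::Integer)
    PARALLEL[] = clamp(n, 0, nworkers())
    return PARALLEL[]
end

# Illustrative function that picks its implementation from the setting.
function sum_of_squares(xs)
    if PARALLEL[] > 0
        return sum(pmap(x -> x^2, xs))   # distributed over the available workers
    else
        return sum(x -> x^2, xs)         # plain single-process loop
    end
end
```

A caller would run `set_parallel!(0)` to force the serial path, or `set_parallel!(nworkers())` to opt in to the parallel one. An alternative design would be a keyword argument on each function, which avoids global state at the cost of a wider API.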

…mes in dijkstra and betweenness centrality

IainNZ · 2015-09-21T01:19:31Z

src/core.jl

    edges::Set{Edge}
-    fadjlist::Vector{Vector{Int}} # [src]: (dst, dst, dst)
-    badjlist::Vector{Vector{Int}} # [dst]: (src, src, src)
+    fm::SparseMatrixCSC{Bool, Int}


Why do you need two matrices to represent a directed graph?

~~Primarily because it's very efficient to look up forward and backward adjacencies this way. Consider an edge (1,2). It will have fadjlist[1] = [2, ...] and badjlist[2] = [1, ...].~~

Note also that we're changing the game a bit with the move to sparse matrices for version 0.4. (Check out the 'sparsemx' branch for details). Directed Graphs will have two sparse matrices (accessible via fmat and bmat); undirected only needs one.

Sorry, just saw that you're commenting on sparsemx. The reason is still for efficient forward and backward adjacencies. Technically we can just transpose whenever we need the other, but I don't know what the performance implications are for sparse matrices.
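The trade-off between storing both matrices and transposing on demand can be sketched as follows. This assumes a recent Julia with the `SparseArrays` standard library and the `m[dst, src]` storage convention; the variable names `fm`, `bm`, and `um` are illustrative and are not the package's `fmat`/`bmat` accessors.

```julia
using SparseArrays

# Directed edges 1->2 and 2->3, stored as fm[dst, src] (assumed convention).
fm = sparse([2, 3], [1, 2], fill(true, 2), 3, 3)

# Transposing on demand materializes a new CSC matrix on every call;
# keeping the result around as a second matrix avoids repeating that work.
bm = copy(transpose(fm))

# An undirected graph stores each edge in both directions, so its matrix
# equals its own transpose and a single copy is enough.
um = fm .| bm
```

Materializing the transpose touches every stored entry, so doing it per query would cost work roughly proportional to the number of edges, which is the reason to keep both matrices for directed graphs.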

IainNZ · 2015-09-21T18:19:59Z

Have you considered converting the adjacency list representation to the sparse matrix representation on-demand only for the operations that would benefit from it?

Using a sparse matrix as essentially an adjacency list with a contiguous backing store seems really unorthodox, and I'm not sure it makes much sense for the most common use cases. It is also (slightly) wasteful, requiring a useless extra byte for each edge, and makes operations such as dynamically adding or removing nodes and edges more expensive/complicated.
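A sketch of the on-demand conversion, assuming a recent Julia with the `SparseArrays` standard library. `adjlist_to_sparse` is a hypothetical helper, not a LightGraphs function, and the `A[dst, src]` layout is an assumption chosen so that each column lists a vertex's forward neighbors.

```julia
using SparseArrays

# Hypothetical helper: build a Bool adjacency matrix from a forward adjacency
# list, with A[dst, src] == true for each edge src -> dst.
function adjlist_to_sparse(fadjlist::Vector{Vector{Int}})
    n = length(fadjlist)
    srcs = Int[]
    dsts = Int[]
    for (s, nbrs) in enumerate(fadjlist)
        for d in nbrs
            push!(srcs, s)
            push!(dsts, d)
        end
    end
    return sparse(dsts, srcs, fill(true, length(srcs)), n, n)
end

# Edges: 1->2, 1->3, 2->3
fadjlist = [[2, 3], [3], Int[]]
A = adjlist_to_sparse(fadjlist)

# The value array holds one Bool per stored edge: the extra byte mentioned above.
bytes_for_values = sizeof(nonzeros(A))
```

Keeping the adjacency list as the primary store leaves `push!`-style edge insertion cheap, and the matrix is only built for the operations that benefit from it.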

sbromberger · 2015-09-21T18:23:34Z

@IainNZ Yes - in fact, that's what we currently do.

This effort is by no means a fait accompli - it's really just a proof of concept to see whether it makes sense to change the underlying representation of graphs. Right now there's not much arguing against making a switch, but it's still early days and roadblocks can come up. Your opinions/objections are very much welcome!

sbromberger · 2015-10-18T16:18:34Z

After this effort, I'm not convinced that this provides any significant benefit, and it has some risks that may cause problems down the road. It seems, frankly, that sparse matrices are "second class citizens" right now in Julia, and if that ever changes we could see a slew of modifications to the class that may require lots of work here.

I'll close this out now but will be incorporating the non-sparse improvements into the current master.

wip first cut

e441183

jpfairbanks reviewed Sep 3, 2015

sbromberger added 4 commits September 3, 2015 17:06

degree optimizations

715ba0b

fixes

d248fe4

default distance flexibility

509d284

parallel betweenness is working

a82b2c0

sbromberger added 3 commits September 6, 2015 08:40

removed dependency on ParallelSparseMatMul, lots of other changes / optimizations. katz centrality now supports receive and broadcast

d07c9ef

split out sparse into separate files

98ea88d

all tests working

65dbc12

sbromberger added 3 commits September 8, 2015 09:38

update travis for codecov.io

c31e05f

drop support for 0.3

18507b1

nightly test only

928601d

sbromberger added 2 commits September 8, 2015 10:21

travis new infrastructure

ba3936c

travis fix

082e1ae

sbromberger added 2 commits September 10, 2015 12:21

parallelsparsematmul, other optimizations, parallel dijkstra testing

0368f66

faster methods, fixes to tests

e1ce65e

sbromberger mentioned this pull request Sep 11, 2015

Placeholder for graph parallelism #41

Closed

sbromberger changed the title ~~Use SparseMatrixCSC to store graphs~~ WIP: Use SparseMatrixCSC to store graphs Sep 11, 2015

more efficient smallgraphs, global parallelize, change to function names in dijkstra and betweenness centrality

203f7b3

This was referenced Sep 14, 2015

betweenness_centrality docstring #145

Closed

Opt/connected components #152

Merged

IainNZ reviewed Sep 21, 2015

sbromberger closed this Oct 18, 2015

sbromberger mentioned this pull request Sep 16, 2016

Parallel Push-Relabel algorithm #445

Closed

sbromberger deleted the sparsemx branch August 15, 2017 01:01
