[Feature Request] Count triangles for edges #967

Open
opened this Issue Oct 28, 2016 · 8 comments

None yet

4 participants

 The function count_triangles() gives the number of triangles each vertex is in. Feature request: a function to count the number of triangles each edge occurs in. Justification: the number of triangles an edge occurs in is important for computing certain clustering algorithms, for example the k-truss algorithm which requires the "support" (number of triangles) for each edge. For reference of k-truss, see: https://arxiv.org/pdf/1205.6693.pdf
Contributor
 This is very easy to compute without igraph. Suppose i-j is an edge. Take rows i and j of the adjacency matrix and compute their scalar product. That's your answer. On 28 October 2016 at 20:04, alexperrone notifications@github.com wrote: The function count_triangles() gives the number of triangles each vertex is in. Feature request: a function to count the number of triangles each edge occurs in. Justification: the number of triangles an edge occurs in is important for computing certain clustering algorithms, for example the k-truss algorithm which requires the "support" (number of triangles) for each edge. For reference of k-truss, see: https://arxiv.org/pdf/1205.6693.pdf — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub #967, or mute the thread https://github.com/notifications/unsubscribe-auth/ABKBx03H9GeYRqOCni4cDf4PEF62EXsHks5q4jlCgaJpZM4Kjs_P .
 That seems to work, but how to do it in igraph? I still need the edge IDs, for example. It took over 1 second just to run `get.adjacency(g)` on a small (8 nodes, 14 edges) undirected igraph object `g`, before even doing the i-j scalar product. Whereas `count_triangles(g)` is basically instant to run (~.001 seconds).
Member
 @alexperrone I am not sure what you are doing, but `get.adjacency` does not take 1 second for me: ```❯ g <- sample_gnm(8, 14) ❯ system.time(A <- get.adjacency(g)) user system elapsed 0.001 0.000 0.001 `````
 Wow, I was doing the equivalent of: ``````> g <- sample_gnm(8, 14) > plot(g) > system.time(A <- get.adjacency(g)) user system elapsed 0.050 0.006 0.054 > system.time(count_triangles(g)) user system elapsed 0.002 0.000 0.002 `````` Apparently, the system.time somehow is sensitive to commands before it, like plotting. I don't know how I got 1 second on my data (I think more intensive plotting preceding), but anyway, you should be able to reproduce above to see how one could get perception that .05 >> .002
Member
 Because it probably contains garbage collection.
 Thanks to @szhorvat I was able to accomplish this. However, it still may be good to have in igraph. Here's what I did. ``````library(igraph) set.seed(1) g <- sample_gnm(6, 9) plot(g) A <- get.adjacency(g) triangle_matrix <- as.matrix(Matrix::tcrossprod(A)) # scalar product of rows E(g)\$num_triangles <- triangle_matrix[ends(g, E(g))] # look up each edge E(g)[[]] # show results ``````
referenced this issue Nov 2, 2016
Closed

Initially slow computations times for get.adjacency #970

 Any ideas on how to do this for each edge that would be more efficient than matrix multiplication shown in previous post (Oct 29)? Or do you think it is not worth it to pursue? The functions in `igraph/src/triangles.c`, e.g. `igraph_transitivity_local_undirected`, compute for each vertex and seem difficult to transform to do it for each edge.
Member
commented Dec 13, 2016
 If you are using a sparse matrix for `A`, then this is probably as good as it gets - the alternative would be to query the neighbor sets for each vertex in advance, and do repeated set intersections for pairs of vertices i and j if they are connected by an edge. This is exactly what the matrix multiplication does anyway. There might be some leeway in optimizing the set intersection operation, but it would have a measurable effect only if the entire looping over the edges and the set intersection operation are both implemented in C -- otherwise a sparse matrix multiplication (which I assume is also implemented in C) probably beats it hands down.