Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhancement of is_triangle_free' addition of triangles_count' and a minor change in `spanning_trees_count' #13503

Closed
sagetrac-azi mannequin opened this issue Sep 17, 2012 · 39 comments

Comments

@sagetrac-azi
Copy link
Mannequin

sagetrac-azi mannequin commented Sep 17, 2012

The main change of this patch is to improve the `is_triangle_free' method. The introduced algorithm uses the fact that given that A is the adjacency matrix of a simple graph G, then the number of triangles in G is tr(A^3)/6.

The change has been tested resulting in the following benchmarks:

  1. Testing all graphs of order up to 10 for triangles: (old: 201m44.339s, new: 87m38.146s)
  2. Testing all triangle-free graphs of order 12 for triangles: (old: 15m56.679s, new 9m41.116s)
  3. Testing all graphs of order up to 10 that do contain triangles: (old: 0m10.305s, new : 0m5.798s)

I can provide the test programs if needed.

Since there are many applications in which one needs to count the number of triangles in a graph the named routine was also added.

Finally a minor change has been made to the generic_graph function spanning_trees_count() which calls abs on a determinant that is always positive.

Apply:

CC: @rbeezer @nathanncohen

Component: graph theory

Keywords: triangles, number of triangles

Author: Jernej Azarija, David Coudert

Reviewer: Jernej Azarija

Merged: sage-5.6.beta1

Issue created by migration from https://trac.sagemath.org/ticket/13503

@sagetrac-azi sagetrac-azi mannequin added this to the sage-5.6 milestone Sep 17, 2012
@kini
Copy link
Contributor

kini commented Sep 17, 2012

Author: Jernej Azarija

@sagetrac-azi sagetrac-azi mannequin added the s: needs review label Sep 25, 2012
@sagetrac-azi
Copy link
Mannequin Author

sagetrac-azi mannequin commented Sep 26, 2012

comment:4

The patch needs a fix of the documentation before its ready for review.

@dcoudert
Copy link
Contributor

comment:5

Hello,

I added Nathann in Cc because we had a discussion about counting triangles last week ;)

The proposed method for testing if the graph is triangle free is not a good one. Indeed, the time complexity for testing if the graph is triangle free is at most N^w, where 2 <= w < 3 is the best possible exponent for computing the square of a matrix of side N. You compute the square of the adjacency matrix to get all paths of length 2, and then you test for every edge (u,v) if there is a path of length 2 from u to v. Furthermore, the basic algorithm of testing if some neighbors of a vertex are incident is fast.

Let us compare the following method with your proposition for testing if the graph is triangle free.

def my_is_triangle_free(G):
    for u in G.vertex_iterator():
        if any(G.has_edge(v,w) for v,w in combinations_iterator(G.neighbors(u),2)):
            return False
    #
    return True

We get the following running times:

sage: G = graphs.GridGraph([5,2]); M = G.adjacency_matrix() # 10 nodes
sage: %timeit G.is_triangle_free()
625 loops, best of 3: 603 µs per loop
sage: %timeit (M^3).trace() > 0
625 loops, best of 3: 101 µs per loop
sage: %timeit my_is_triangle_free(G)
625 loops, best of 3: 376 µs per loop
sage:
sage: G = graphs.GridGraph([10,5]); M = G.adjacency_matrix() # 50 nodes
sage: %timeit G.is_triangle_free()
25 loops, best of 3: 9.49 ms per loop
sage: %timeit (M^3).trace() > 0
125 loops, best of 3: 3.92 ms per loop
sage: %timeit my_is_triangle_free(G)
125 loops, best of 3: 2.67 ms per loop
sage:
sage: G = graphs.GridGraph([10,10]); M = G.adjacency_matrix() # 100 nodes
sage: %timeit G.is_triangle_free()
25 loops, best of 3: 31.5 ms per loop
sage: %timeit (M^3).trace() > 0
25 loops, best of 3: 12.3 ms per loop
sage: %timeit my_is_triangle_free(G)
125 loops, best of 3: 5.71 ms per loop
sage: 
sage: G = graphs.GridGraph([50,20]); M = G.adjacency_matrix() # 1000 nodes
sage: %timeit G.is_triangle_free()
5 loops, best of 3: 1.1 s per loop
sage: %timeit (M^3).trace() > 0
5 loops, best of 3: 1.04 s per loop
sage: %timeit my_is_triangle_free(G)
5 loops, best of 3: 63.8 ms per loop
sage:
sage: G = graphs.RandomGNP(10,.1); M = G.adjacency_matrix()  # 10 nodes
sage: %timeit G.is_triangle_free()
625 loops, best of 3: 492 µs per loop
sage: %timeit (M^3).trace() > 0
625 loops, best of 3: 90.5 µs per loop
sage: %timeit my_is_triangle_free(G)
625 loops, best of 3: 160 µs per loop
sage: 
sage: G = graphs.RandomGNP(50,.1); M = G.adjacency_matrix()  # 50 nodes
sage: %timeit G.is_triangle_free()
25 loops, best of 3: 9.92 ms per loop
sage: %timeit (M^3).trace() > 0
125 loops, best of 3: 3.93 ms per loop
sage: %timeit my_is_triangle_free(G)
625 loops, best of 3: 110 µs per loop
sage:
sage: G = graphs.RandomGNP(100,.2); M = G.adjacency_matrix() # 100 nodes, dense
sage: %timeit G.is_triangle_free()
5 loops, best of 3: 40 ms per loop
sage: %timeit (M^3).trace() > 0
25 loops, best of 3: 12.9 ms per loop
sage: %timeit my_is_triangle_free(G)
625 loops, best of 3: 125 µs per loop
sage:
sage: G = graphs.RandomGNP(100,.01); M = G.adjacency_matrix()  # 100 nodes, sparse
sage: %timeit G.is_triangle_free()
25 loops, best of 3: 29.7 ms per loop
sage: %timeit (M^3).trace() > 0
25 loops, best of 3: 12.2 ms per loop
sage: %timeit my_is_triangle_free(G)
125 loops, best of 3: 2.05 ms per loop
sage: 
sage: G = graphs.RandomGNP(1000,.01); M = G.adjacency_matrix() # 1000 nodes
sage: %timeit G.is_triangle_free()
5 loops, best of 3: 33.9 s per loop
sage: %timeit (M^3).trace() > 0
5 loops, best of 3: 33.8 s per loop
sage: %timeit my_is_triangle_free(G)
625 loops, best of 3: 1.24 ms per loop

For very small graphs it is apparently faster to do (M^3).trace() > 0, but for large graphs, basic iterations are way faster because we stop computations as soon as a triangle is found (if any). Therefore, a better solution could be an hybrid method using matrix multiplication if the graph is very small, and basic iterations otherwise.


For counting triangles, the situation is rather different and the density of the graph matters. Furthermore, it is interesting to use networkx graphs:

def my_triangles_count(G):
    tr = 0
    for u in G.vertex_iterator():
        tr += sum(G.has_edge(v,w) for v,w in combinations_iterator(G.neighbors(u),2))
    return tr/3

def my_triangles_countnx(G):
    tr = 0
    ggnx = G.networkx_graph()
    for u in ggnx.nodes_iter():
        tr += sum(ggnx.has_edge(v,w) for v,w in combinations_iterator(ggnx.neighbors(u),2))
    return tr/3

We get the following running time:

sage: G = graphs.GridGraph([5,2])
sage: %timeit G.triangles_count()
625 loops, best of 3: 608 µs per loop
sage: %timeit my_triangles_count(G)
625 loops, best of 3: 444 µs per loop
sage: %timeit my_triangles_countnx(G)
625 loops, best of 3: 353 µs per loop
sage:
sage: G = graphs.GridGraph([10,5])
sage: %timeit G.triangles_count()
25 loops, best of 3: 9.44 ms per loop
sage: %timeit my_triangles_count(G)
125 loops, best of 3: 2.98 ms per loop
sage: %timeit my_triangles_countnx(G)
125 loops, best of 3: 1.99 ms per loop
sage:
sage: G = graphs.RandomGNP(10,.1)
sage: %timeit G.triangles_count()
625 loops, best of 3: 493 µs per loop
sage: %timeit my_triangles_count(G)
625 loops, best of 3: 217 µs per loop
sage: %timeit my_triangles_countnx(G)
625 loops, best of 3: 218 µs per loop
sage:
sage: G = graphs.RandomGNP(50,.1)
sage: %timeit G.triangles_count()
25 loops, best of 3: 9.79 ms per loop
sage: %timeit my_triangles_count(G)
125 loops, best of 3: 5.49 ms per loop
sage: %timeit my_triangles_countnx(G)
125 loops, best of 3: 2.78 ms per loop
sage:
sage: G = graphs.RandomGNP(50,.5)
sage: %timeit G.triangles_count()
25 loops, best of 3: 15 ms per loop
sage: %timeit my_triangles_count(G)
5 loops, best of 3: 82.4 ms per loop
sage: %timeit my_triangles_countnx(G)
25 loops, best of 3: 23.4 ms per loop
sage:
sage: G = graphs.RandomGNP(100,.5)
sage: %timeit G.triangles_count()
5 loops, best of 3: 58.4 ms per loop
sage: %timeit my_triangles_count(G)
5 loops, best of 3: 716 ms per loop
sage: %timeit my_triangles_countnx(G)
5 loops, best of 3: 181 ms per loop
sage:
sage: G = graphs.RandomGNP(100,.1)
sage: %timeit G.triangles_count()
25 loops, best of 3: 35.4 ms per loop
sage: %timeit my_triangles_count(G)
25 loops, best of 3: 33.6 ms per loop
sage: %timeit my_triangles_countnx(G)
25 loops, best of 3: 12.8 ms per loop
sage:
sage: G = graphs.RandomGNP(100,.05)
sage: %timeit G.triangles_count()
25 loops, best of 3: 32.3 ms per loop
sage: %timeit my_triangles_count(G)
25 loops, best of 3: 10.8 ms per loop
sage: %timeit my_triangles_countnx(G)
125 loops, best of 3: 5.53 ms per loop

As you can see, when the graph is dense, the matrix multiplication method is the fastest, but for spare graphs it is better to use iterations and to convert the graph into a networkx graph. However, I don't know how to choose the threshold and I don't have better proposals in mind yet.

@dcoudert
Copy link
Contributor

Reviewer: David Coudert

@sagetrac-azi
Copy link
Mannequin Author

sagetrac-azi mannequin commented Sep 29, 2012

comment:6

Hello,

I agree with what you said.

As you mentioned it is hard to deduce a threshold that is going to be good for all platforms (and times) so I suppose we could simply add an optional argument for the user to specify which algorithm to use. We could then also add the even faster triangle detection algorithms based on probability.

@dcoudert
Copy link
Contributor

comment:7

Yes, adding an argument to specify the algorithm is the best solution.

@sagetrac-azi
Copy link
Mannequin Author

sagetrac-azi mannequin commented Sep 29, 2012

comment:8

Also not to forget that the code for triangle detection would be much much faster if we compute AAA in the real field. Could you also check that? To compare with your other results?

@dcoudert
Copy link
Contributor

comment:9

It is true that AAA is a bit faster, but not in the real field where it seems very slow.

@rbeezer
Copy link
Mannequin

rbeezer mannequin commented Sep 29, 2012

comment:10

Replying to @dcoudert:

It is true that AAA is a bit faster, but not in the real field where it seems very slow.

RR = RealField(53) is arbitrary precision reals, so perhaps predictably slow, given the generality.

RDF (real double field) is floating point doubles. Whatever the 53-bit precision IEEE format is. Try adjacency matrices over that ring. They might be an improvement.

@dcoudert
Copy link
Contributor

comment:11

Since the adjacency matrix is a matrix of integers (in fact booleans or bits), I expect matrix multiplication to be significantly faster than in any other rings.

@nathanncohen
Copy link
Mannequin

nathanncohen mannequin commented Sep 30, 2012

comment:12

HMmmm... I just read your tests, and did not get this one :

sage: %timeit G.is_triangle_free()
5 loops, best of 3: 1.1 s per loop
sage: %timeit (M^3).trace() > 0
5 loops, best of 3: 1.04 s per loop
sage: %timeit my_is_triangle_free(G)
5 loops, best of 3: 63.8 ms per loop

Could it be that there would be another difference in runtimes if one requires the adjacency matrix to be dense ? :-)

Nathann

@dcoudert
Copy link
Contributor

comment:13

If the graph is dense, the first visited vertex will have a triangle with high probability. So after a very small number of iterations and so operations, the algorithm returns False. But with matrix multiplications, you have to pay the full cost whatever the result.

Of course, if the graph is dense and triangle free (not so many such graphs), my algorithm will take long time, and possibly more than matrix multiplications, but in average it is faster. Do your own experiments to convince yourself.

So for testing if G is triangle free, it is good to have a parameter to choose the algorithm, and a default behavior like: if G has less than 15 nodes, then do matrix multiplications, else use other algo.

For counting triangles, the situation is different and we clearly have to take into account the density to choose the best algorithm.

@dcoudert
Copy link
Contributor

comment:14

Two interesting (worst case) examples:

sage: G = graphs.CompleteBipartiteGraph(100,100); M = G.adjacency_matrix()
sage: G.density()
100/199
sage: %timeit not (M*M*M).trace()
5 loops, best of 3: 59.8 ms per loop
sage: %timeit my_is_triangle_free(G)
5 loops, best of 3: 5.66 s per loop
sage:
sage: G = graphs.RandomGNP(200,100/199); M = G.adjacency_matrix()
sage: G.density()
10009/19900
sage: %timeit not (M*M*M).trace()
5 loops, best of 3: 63.8 ms per loop
sage: %timeit my_is_triangle_free(G)
625 loops, best of 3: 300 µs per loop

@rbeezer
Copy link
Mannequin

rbeezer mannequin commented Sep 30, 2012

comment:15

Replying to @dcoudert:

Since the adjacency matrix is a matrix of integers (in fact booleans or bits), I expect matrix multiplication to be significantly faster than in any other rings.

I'd guess that managing (potentially) large integers has more overhead than managing doubles:

sage: G = graphs.CompleteBipartiteGraph(100, 100)
sage: M = G.am()
sage: timeit("M*M*M")
25 loops, best of 3: 31 ms per loop
sage: MR = M.change_ring(RDF)
sage: timeit("MR*MR*MR")     
125 loops, best of 3: 2.4 ms per loop
sage: (MR*MR*MR).trace()
0.0

I can't recall seeing a Sage class for integers of a bounded size.

@dcoudert
Copy link
Contributor

comment:16

I'm certainly not used to matrix multiplications and rings in sage, and I'm surprised that computing with doubles is faster than with integer. For instance, if G has 10^6 vertices, then since M is a matrix of 0 and 1's, the largest value in M^3 is in 10^18 < 2^63 and so all computations are doable using 64bits integers.

Following your example, and after some search, I found a much faster solution: use GF(2)

sage: G = graphs.CompleteBipartiteGraph(100,100); M = G.am(); M
200 x 200 dense matrix over Integer Ring (type 'print M.str()' to see all of the entries)
sage: %timeit not (M*M*M).trace()
5 loops, best of 3: 39.5 ms per loop
sage: MR = M.change_ring(RDF); MR
200 x 200 dense matrix over Real Double Field (type 'print MR.str()' to see all of the entries)
sage: %timeit (MR*MR*MR).trace() == 0.0
125 loops, best of 3: 1.47 ms per loop
sage: M2 = M.change_ring(GF(2)); M2
200 x 200 dense matrix over Finite Field of size 2 (type 'print M2.str()' to see all of the entries)
sage: %timeit not (M2*M2*M2).trace()
625 loops, best of 3: 201 µs per loop

However, this is not always the best solution and it is extremely slow for sparse matrix O_o

sage: G = graphs.RandomGNP(1000,.1); M = G.am(); M2 = M.change_ring(GF(2)); M2
1000 x 1000 dense matrix over Finite Field of size 2 (type 'print M2.str()' to see all of the entries)
sage: %timeit my_is_triangle_free(G)
625 loops, best of 3: 424 µs per loop
sage: %timeit not (M2*M2*M2).trace()
125 loops, best of 3: 4.09 ms per loop
sage: MR = M.change_ring(RDF)
sage: %timeit (MR*MR*MR).trace() == 0.0
5 loops, best of 3: 96.9 ms per loop
sage:
sage: G = graphs.RandomGNP(1000,.01); M = G.am(); M2 = M.change_ring(GF(2)); M2
1000 x 1000 sparse matrix over Finite Field of size 2 (type 'print M2.str()' to see all of the entries)
sage: %timeit my_is_triangle_free(G)
625 loops, best of 3: 594 µs per loop
sage: %timeit not (M2*M2*M2).trace()
5 loops, best of 3: 3.41 s per loop
sage: MR = M.change_ring(RDF)
5 loops, best of 3: 8.08 s per loop
sage:
sage: G = graphs.RandomGNP(100,.01); M = G.am(); M2 = M.change_ring(GF(2)); M2
100 x 100 dense matrix over Finite Field of size 2 (type 'print M2.str()' to see all of the entries)
sage: %timeit my_is_triangle_free(G)
625 loops, best of 3: 1.22 ms per loop
sage: %timeit not (M2*M2*M2).trace()
625 loops, best of 3: 95.7 µs per loop
sage:
sage: G = graphs.RandomGNP(100,.1); M = G.am(); M2 = M.change_ring(GF(2)); M2
100 x 100 dense matrix over Finite Field of size 2 (type 'print M2.str()' to see all of the entries)
sage: %timeit my_is_triangle_free(G)
625 loops, best of 3: 71.4 µs per loop
sage: %timeit not (M2*M2*M2).trace()
625 loops, best of 3: 95.5 µs per loop

I don't know how to force a matrix to be dense. for "small" graphs, it is apparently always dense.

Clearly, the algorithm should use tradeoffs between size and density.

@jhpalmieri
Copy link
Member

comment:17

I don't know how to force a matrix to be dense.

mat.dense_matrix() should do it.

@sagetrac-azi
Copy link
Mannequin Author

sagetrac-azi mannequin commented Sep 30, 2012

comment:18

@dcoudert

While matrix multiplication over GF(2) is quite fast, it is wrong for the adjacency matrix. The trace of the cube of an adjacency matrix is always zero over GF(2).

@dcoudert
Copy link
Contributor

comment:19

While matrix multiplication over GF(2) is quite fast, it is wrong for the adjacency matrix. The trace of the cube of an adjacency matrix is always zero over GF(2).

That's a pity :(

Any interesting alternatives?
At least you can force matrices to be dense, but for small graphs (10 nodes), the extra cost might be non-negligeable.

@rbeezer
Copy link
Mannequin

rbeezer mannequin commented Sep 30, 2012

comment:20

"dense" in Sage is an implementation detail. Any matrix can be made dense or sparse, even if it is not a good idea for the algorithm at hand.

In the discussion here, the actual "density" of the adjacency matrix is relevant for which approach (matrix algebra, vertex neighborhoods) might be faster.

@dcoudert
Copy link
Contributor

Changed reviewer from David Coudert to none

@dcoudert
Copy link
Contributor

Changed author from Jernej Azarija to Jernej Azarija, David Coudert

@nathanncohen
Copy link
Mannequin

nathanncohen mannequin commented Oct 22, 2012

comment:25

Hellooooooooo David !!

First, you do not know how glad I am to hear things like "the difficulty with graph algorithms is that the efficiency depends on the data structure" from the mouth of researchers. "Adjacency test ? O(1) !" is their usual answer. :-D

I have been thinking of reimplementing another version using FastGraph, but that would be more trouble than necessary for the moment. And I could actually use the same technique to reimplement SubgraphSearch a bit better anyway, so I guess I will go there directly.

About the patch :

  • Where did you find that this "abs" was not needed ? I thought it was O_o
  • I'm not a big fan of having triangle_count in generic_graph, as I don't really see it used with DiGraphs... And the code reflects that indeed, but if you think they can, then why not ? Could you at least say that not all algorithms are available for Digraphs, and that the method looks for "directed C3" in this case, or something similar ? And in particular not C3 in the underlying graph.
  • "tests if the trace of the adjacency matrix is positive" would be counting the number of loops :-D Btw I guess it would be better to have "return (A * * 3 ).trace() == 0" instead of "return (A*A*A).trace() == 0", just in case they might implement some smart thing for powers of binary matrices eventually... The logarithmic power method changes nothing for k=3 :-)

Thanks for that patch !!

nathann

@dcoudert
Copy link
Contributor

comment:26
  • Where did you find that this "abs" was not needed ? I thought it was O_o

It was in the original patch from Jernej Azarija ([trac_17334_triangle_free_speedup.patch]). I don't know if its true or not, and since the gain is very limited, we could put it back to be on the safe side.

  • I'm not a big fan of having triangle_count in generic_graph, as I don't really see it used with DiGraphs... And the code reflects that indeed, but if you think they can, then why not ? Could you at least say that not all algorithms are available for Digraphs, and that the method looks for "directed C3" in this case, or something similar ? And in particular not C3 in the underlying graph.

Jernej should answer that remark.

  • "tests if the trace of the adjacency matrix is positive" would be counting the number of loops :-D Btw I guess it would be better to have "return (A * * 3 ).trace() == 0" instead of "return (A*A*A).trace() == 0", just in case they might implement some smart thing for powers of binary matrices eventually... The logarithmic power method changes nothing for k=3 :-)

OK.

I have implemented some of the modifications.

Thanks for that patch !!

I was also thinking of some nice and fast methods from Alon et al., or from Latapy (see the code in c and the survey at http://www-rp.lip6.fr/~latapy/Triangles/). We could add them (in particular the compact-forward method) at the cost of adding an interface and a method for converting the graph into the tricky data structure used by Latapy.

@sagetrac-azi
Copy link
Mannequin Author

sagetrac-azi mannequin commented Oct 28, 2012

comment:27

Replying to @dcoudert:

  • Where did you find that this "abs" was not needed ? I thought it was O_o

It was in the original patch from Jernej Azarija ([trac_17334_triangle_free_speedup.patch]). I don't know if its true or not, and since the gain is very limited, we could put it back to be on the safe side.

abs is not needed. Perhaps we can still remove it from this patch and I add it later in a generic patch fixing various small details in graph.py. The reason it is not needed is that Kirchhoff matrix tree theorem states that the number of spanning trees equals

det(L')*(-1)^(i+j}+)

where L' is the matrix that is obtained from the Laplacian after removing row i and column j.

  • I'm not a big fan of having triangle_count in generic_graph, as I don't really see it used with DiGraphs... And the code reflects that indeed, but if you think they can, then why not ? Could you at least say that not all algorithms are available for Digraphs, and that the method looks for "directed C3" in this case, or something similar ? And in particular not C3 in the underlying graph.

Jernej should answer that remark.

I am not really sure what to do here.

  • "tests if the trace of the adjacency matrix is positive" would be counting the number of loops :-D Btw I guess it would be better to have "return (A * * 3 ).trace() == 0" instead of "return (A*A*A).trace() == 0", just in case they might implement some smart thing for powers of binary matrices eventually... The logarithmic power method changes nothing for k=3 :-)

OK.

I have implemented some of the modifications.

Thanks for that patch !!

I was also thinking of some nice and fast methods from Alon et al., or from Latapy (see the code in c and the survey at http://www-rp.lip6.fr/~latapy/Triangles/). We could add them (in particular the compact-forward method) at the cost of adding an interface and a method for converting the graph into the tricky data structure used by Latapy.

Haven't looked at the site yet, but is it hard to implement the algorithm directly into sage?

@dcoudert
Copy link
Contributor

comment:28

Replying to @sagetrac-azi:

Replying to @dcoudert:

  • Where did you find that this "abs" was not needed ? I thought it was O_o

It was in the original patch from Jernej Azarija ([trac_17334_triangle_free_speedup.patch]). I don't know if its true or not, and since the gain is very limited, we could put it back to be on the safe side.

abs is not needed. Perhaps we can still remove it from this patch and I add it later in a generic patch fixing various small details in graph.py. The reason it is not needed is that Kirchhoff matrix tree theorem states that the number of spanning trees equals

det(L')*(-1)^(i+j}+)

where L' is the matrix that is obtained from the Laplacian after removing row i and column j.

yes, it is better to remove the abs in a patch dedicated to small details ;-)

I was also thinking of some nice and fast methods from Alon et al., or from Latapy (see the code in c and the survey at http://www-rp.lip6.fr/~latapy/Triangles/). We could add them (in particular the compact-forward method) at the cost of adding an interface and a method for converting the graph into the tricky data structure used by Latapy.

Haven't looked at the site yet, but is it hard to implement the algorithm directly into sage?

It's difficult in python. The method is fast because it uses optimized data structure. It can be done in cython, but then it is certainly better to only implement the interfaces with the original c code of Latapy. We can easily obtained agreement from him for integrating is code in sage.
This should be done in a dedicated patch to ease reviewing process.

@sagetrac-azi
Copy link
Mannequin Author

sagetrac-azi mannequin commented Nov 19, 2012

comment:30

Can someone review this patch?

@nathanncohen
Copy link
Mannequin

nathanncohen mannequin commented Nov 20, 2012

comment:31

Can someone review this patch?

Well. Why wouldn't you ? ^^;

It looks like David updated the patch last, so you can review his changes.. I mean, for as long as two persons look at the file together and agree that it is all good, we usually accept this as a review and put both names as Authors and Reviewers :-)

Nathann

@sagetrac-azi
Copy link
Mannequin Author

sagetrac-azi mannequin commented Nov 20, 2012

comment:32

Oh, didn't know I can do that too. Here we go...

I have tested the code briefly and it looks good to me as far as correction goes. I have two additional remarks though.

  1. Can we avoid using graphx? Why exactly are we using it? I know its because of performance but what exactly is making the algorithm tick that faster? It doesn’t seem the right thing (to me) to call a graphx object for this purpose but rather see where the problem in Sage is. Its weird that its that faster even when you take the wrapping overhead into account.

  2. The code in is_triangle_free currently looks as follows

            BB = {} 
            for u in self.vertex_iterator():
                BB[u] = Bitset(capacity=N)
                for v in self.vertex_iterator():
                    if B[u]&B[v]:
                        BB[u].add(map[v])

            # search for triangles
            return not any(B[u]&BB[u] for u in self.vertex_iterator())

Am I missing something or this should be equivalent (to the slightly more optimal):

            BB = {} 
            for u in self.vertex_iterator():
                BB[u] = Bitset(capacity=N)
                for v in self.vertex_iterator():
                    if B[u]&B[v]:
                        BB[u].add(map[v])
                if B[u] & BB[u] == 1:
                   return False

            return True

@dcoudert
Copy link
Contributor

comment:34

Attachment: trac_13503_triangles-v2.patch.gz

  1. networkx uses dictionaries to store edges. If G is a networkx graph, it is also a dictionary indexed by the vertices, and G[u] is a dictionary indexed by neighbors and containing edge data. This way, iterations are fast. I don't know exactly how are implemented sage graphs in the backends, and so I don't know how to speedup the basic iterations over the edges, the neighbors, etc. This is certainly needed.

  2. I have implemented the proposed modification in is_triangle_free.

@sagetrac-azi
Copy link
Mannequin Author

sagetrac-azi mannequin commented Nov 20, 2012

comment:35

Hello!

  1. I would suggest we make this into a new ticket or something - how to make graph structures more efficient , and leave the code as is for now?

What do you guys think?

@dcoudert
Copy link
Contributor

comment:36

I agree.

@sagetrac-azi
Copy link
Mannequin Author

sagetrac-azi mannequin commented Nov 20, 2012

comment:37

Okay. Could you (dcoudert) pleaase make another ticket that basically explains what you said under 1? We can then discuss this there.

I'll mark this ticket with a positive review now.

@jdemeyer
Copy link

Reviewer: Jernej Azarija

@jdemeyer
Copy link

Merged: sage-5.6.beta1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants