Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug in Gomory-Hu tree algorithm #16475

Closed
sagetrac-foosterhof mannequin opened this issue Jun 12, 2014 · 29 comments
Closed

Bug in Gomory-Hu tree algorithm #16475

sagetrac-foosterhof mannequin opened this issue Jun 12, 2014 · 29 comments

Comments

@sagetrac-foosterhof
Copy link
Mannequin

sagetrac-foosterhof mannequin commented Jun 12, 2014

When trying to come up with a doctest to verify ticket #16404, I stumbled across this error:

sage: G = graphs.PetersenGraph()
sage: for u, v in [(0, 1), (0, 4), (0, 5), (1, 2), (1, 6), (3, 4), (5, 7), (5, 8)]:
....:     G.set_edge_label(u, v, 2)
....: T = G.gomory_hu_tree(method="FF")
....: f = G.flow(1, 9, use_edge_labels = True)
....: f2 = T.edge_label(1, 9)
....: if f != f2:
....:     print 'Proper Tree Edge Error:', f, f2
....: 
Proper Tree Edge Error: 3 6

This is a violation of the property that defines a Gomory-Hu tree.

Furthermore, the derived property that the flow between two vertices in a graph is the minimum of the edge_labels in the constructed Gomory-Hu tree can be violated, for instance in the above graph:

sage: f = G.flow(5, 1, use_edge_labels = True)
sage: P = T.shortest_path(5, 1)
sage: E = zip(P, P[1::])
sage: f2 = min(map(lambda x: T.edge_label(x[0], x[1]), E))
sage: if f != f2:
....:     print 'Tree Path Error:', f, f2, P
....: 
Tree Path Error: 6 3 [5, 9, 1]

Depends on #12797

CC: @sagetrac-Rudi

Component: graph theory

Keywords: gomory hu tree gomory-hu gomory_hu_tree

Author: Nathann Cohen

Branch/Commit: b0bbcd4

Reviewer: Michele Borassi

Issue created by migration from https://trac.sagemath.org/ticket/16475

@sagetrac-foosterhof sagetrac-foosterhof mannequin added this to the sage-6.3 milestone Jun 12, 2014
@sagetrac-foosterhof
Copy link
Mannequin Author

sagetrac-foosterhof mannequin commented Jun 12, 2014

Branch: u/foosterhof/ticket/16475

@sagetrac-foosterhof
Copy link
Mannequin Author

sagetrac-foosterhof mannequin commented Jun 13, 2014

Commit: 27025d1

@sagetrac-foosterhof
Copy link
Mannequin Author

sagetrac-foosterhof mannequin commented Jun 13, 2014

comment:2

The problem with the algorithm was that after the recursion, an edge was added. This edge was added to just any 2 vertices before, but there is are 2 specific vertices to which this edge should be added. These vertices are now calculated correctly (hopefully)


New commits:

27025d1Fixed Gomory-Hu tree algorithm by adding the partitions

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Jun 13, 2014

Changed commit from 27025d1 to 9326d15

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Jun 13, 2014

Branch pushed to git repo; I updated commit sha1. New commits:

9326d15Added doctest

@sagetrac-foosterhof
Copy link
Mannequin Author

sagetrac-foosterhof mannequin commented Jun 13, 2014

comment:4

This fix works correctly, as far as I tested, when ticket #12797 is applied as well.

Florian

@sagetrac-foosterhof
Copy link
Mannequin Author

sagetrac-foosterhof mannequin commented Jun 13, 2014

Dependencies: 12797

@nathanncohen
Copy link
Mannequin

nathanncohen mannequin commented Jun 17, 2014

comment:5

Hello !

Just some cosmetic remarks :

  • Set is MUCH slower than set or frozenset, so if you don't need Set for a specific reason don't use it:
sage: %timeit Set(range(50))
10000 loops, best of 3: 29.1 µs per loop
sage: %timeit frozenset(range(50))
100000 loops, best of 3: 2.05 µs per loop
  • It would be cool if you could add some comments about what the following block of code does. Just a line or 5 words, i.e. something to say with words what is happening there
+        C = {}
...
+            C[r] = C2[r]

Thanks for fixing that, btw :-)

Nathann

@fchapoton
Copy link
Contributor

Changed dependencies from 12797 to #12797

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Jun 17, 2014

Changed commit from 9326d15 to e4604fa

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Jun 17, 2014

Branch pushed to git repo; I updated commit sha1. New commits:

e4604faAdded comment on goal and use of Partition code

@sagetrac-foosterhof
Copy link
Mannequin Author

sagetrac-foosterhof mannequin commented Jun 17, 2014

comment:8

Hi

I am trying to convert to use your suggestion, using frozenset, but the call g1.vertices() (and thus also g2.vertices()) complains that it cannot compare a (frozen)set to an int, as it will sort them. I have been trying to give it a nice key lambda function to circumvent this problem, but no luck so far. Any suggestions? Or does this mean that I can better just leave it as using Set?

I have added a comment, including a reference to a piece of a book by A. Schrijver, which explains the use of the partitions. (A Gomory-Hu tree can be defined for a specific subset R of V, in which case these partitions play a big role)


New commits:

e4604faAdded comment on goal and use of Partition code

@nathanncohen
Copy link
Mannequin

nathanncohen mannequin commented Jun 17, 2014

comment:9

Hello !

I am trying to convert to use your suggestion, using frozenset, but the call g1.vertices() (and thus also g2.vertices()) complains that it cannot compare a (frozen)set to an int, as it will sort them.

HMmmmmmmmm O_o

I do not understand what you mean. Could you give me an example of code that produces it ? It looks that you cannot use "&" on a frozenset, but it has an "intersection" method

sage: range(5) & frozenset(range(8))
...
TypeError: unsupported operand type(s) for &: 'list' and 'frozenset'
sage: frozenset(range(8)).intersection(range(5))
frozenset({0, 1, 2, 3, 4})

I have added a comment, including a reference to a piece of a book by A. Schrijver, which explains the use of the partitions. (A Gomory-Hu tree can be defined for a specific subset R of V, in which case these partitions play a big role)

Thanks ! :-)

Nathann

@sagetrac-foosterhof
Copy link
Mannequin Author

sagetrac-foosterhof mannequin commented Jun 17, 2014

comment:10

In Graph.vertices():

if not boundary_first:
    return sorted(list(self.vertex_iterator()), key=key)

Apparently, it is possible to compare Set and int (or sage.blabla.Integer) objects, but not (frozen)set and int.

sage: Set([252,35]) < 1
True
sage: set([252,35]) < 1
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-16-ea79432632b4> in <module>()
----> 1 set([Integer(2352),Integer(2352)]) < Integer(1)

TypeError: can only compare to a set

sage: frozenset([2352,2352]) < 1
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-17-55b270cfd9be> in <module>()
----> 1 frozenset([Integer(2352),Integer(2352)]) < Integer(1)

TypeError: can only compare to a set

And thus, when doing:

R1 = vertices & frozenset(g1.vertices())

it gives a TypeError, as g1 contains a vertex that is a frozenset, namely g1_v:

g1_v = frozenset(set2)
g2_v = frozenset(set1)
g1.add_vertex(g1_v)
g2.add_vertex(g2_v)

Florian

EDIT:

This can be fixed very easily, but also more efficiently.
First of all, we can pass g1.vertex_iterator() to frozenset(), which will probably solve it all together.

Second of all, we know exactly which vertices g1 has, namely all vertices in set1, and the extra vertex g1_v. None more. Since g1_v is not in the variable vertices, we can just pass set1 to frozenset(), and analogously set2 for g2.

EDIT2:

Ok, so apparently Graph._backend.iterator_edges() also compares vertices before yielding them, to make sure the 'lesser' one is reported as the first vertex. And _ford_fulkerson uses this function through Graph.edge_iterator().

I guess there is no way around using a Set?

@nathanncohen
Copy link
Mannequin

nathanncohen mannequin commented Jun 17, 2014

comment:11

Oh. I see ! Sorry, I had not realized that the problem came from frozensets inside of the graph !

This can be fixed very easily, but also more efficiently.
First of all, we can pass g1.vertex_iterator() to frozenset(), which will probably solve it all together.

Right, but that's going too far I guess.

Ok, so apparently Graph._backend.iterator_edges() also compares vertices before yielding them, to make sure the 'lesser' one is reported as the first vertex. And _ford_fulkerson uses this function through Graph.edge_iterator().

Yepyep. And g.vertices() sorts the vertices, which is not a good idea in itself but has some advantages, still ... :-/

I guess there is no way around using a Set?

Ahahaah. As you said there are many ways around, but the best is indeed to use Set. I don't think that it is a performance issue in this context, I was just saying aloud that it's better to avoid Set in general. Let's keep the code as it is :-)

Nathann

@sagetrac-vbraun-spam sagetrac-vbraun-spam mannequin modified the milestones: sage-6.3, sage-6.4 Aug 10, 2014
@fchapoton
Copy link
Contributor

comment:13

branch does not apply

@nathanncohen
Copy link
Mannequin

nathanncohen mannequin commented Mar 15, 2015

comment:14

Hello,

I had totally forgotten this branch, and this is very bad (wrong results) (EDIT:see below). I looked at the code this week-end and fixed it (the fix actually takes one line ...), and also simplified the function a bit.

As the fix actually avoids adding more code to the function, I propose it instead, and change the branch (I hope that nobody minds). Needs review again.

Nathann

(EDIT 16/05/2015) This sentence is misleading: I meant that the bug is very bad, and that forgetting it is dangerous since Sage will return wrong results for as long as it is not fixed.


New commits:

df60f70trac #16475: Move connectivity test to the main function
005e77ctrac #16475: A useless case
ebc6504trac #16475: simplify handling of edge labels (capacities)
70ca162trac #16475: slightly Simplify update of edge labels (capacities)
18c6493trac #16475: Set->frozenset
f8b054etrac #16475: actual bugfix
be53d23trac #16475: Documentation and renamed variables
5732579trac #16475: Bibliographical reference

@nathanncohen
Copy link
Mannequin

nathanncohen mannequin commented Mar 15, 2015

Changed commit from e4604fa to 5732579

@nathanncohen
Copy link
Mannequin

nathanncohen mannequin commented Mar 15, 2015

Changed branch from u/foosterhof/ticket/16475 to public/16475

@nathanncohen
Copy link
Mannequin

nathanncohen mannequin commented Mar 15, 2015

Author: Nathann Cohen

@sagetrac-foosterhof
Copy link
Mannequin Author

sagetrac-foosterhof mannequin commented May 15, 2015

comment:15

Hello,

First of all, you say that my code was very bad, because it gave wrong results. Could you tell me why? I see no counter-example or anything to back up your claim.

Of course I dont mind any refactoring, and actually I think it is indeed much more readable, and even more efficient, but there is something bugging me:

In the bibliographical reference you give (which is actually what I used as well), the proof to show that a Gomory-Hu tree exists uses a Partition { C_r | r \in R }, which dictates how the two induction-obtained trees should be merged. This is what I tried to add to the code with my patch, but you have completely removed it. Why? You simply put the edge between the two trees to be {u, v} where u and v are the picked vertices. However, why would Schrijver put such rigor in proving that this tree obtained using the partition is correct, and not instead simply use 2 arbitrary vertices.

You say you have fixed the algorithm by a single line of code, however I still think that it might give incorrect results in some cases, for reasons above. Would you agree?

Florian

@nathanncohen
Copy link
Mannequin

nathanncohen mannequin commented May 16, 2015

comment:16

Helloooooooooooo,

First of all, you say that my code was very bad, because it gave wrong results. Could you tell me why? I see no counter-example or anything to back up your claim.

Oh no sorry, that's a misunderstanding! I just re-read my comment and indeed I see how you could have understood it this way. I will edit it in a second. What I meant to say is "I had forgotten about this bug, and it is very bad (that I forgot it) because meanwhile Sage is returning wrong results".

I have found nothing wrong with your code. Actually, as I began reading the proof that you followed, I merely noticed that it could be done in a simpler way. Which is why I proposed this alternative implementation, hoping that you would not mind.

In the bibliographical reference you give (which is actually what I used as well), the proof to show that a Gomory-Hu tree exists uses a Partition { C_r | r \in R }, which dictates how the two induction-obtained trees should be merged. This is what I tried to add to the code with my patch, but you have completely removed it. Why? You simply put the edge between the two trees to be {u, v} where u and v are the picked vertices. However, why would Schrijver put such rigor in proving that this tree obtained using the partition is correct, and not instead simply use 2 arbitrary vertices.

What the current branch does is apply Lemma 15.14alpha (p249 in my edition of it). While I obviously cannot tell you why Schrijver wrote the proof the way he did, I suspect that it is because he wants to prove the algorithm while I only want to implement it.

Right now, the code applies Lemma 15.14alpha while assuming that every graph has a gomory-hu Tree. Schrijver cannot do this for the very obvious reasons that he wants to prove this existence result.

Assuming the existence of a gomory-hu tree, I apply Lemma 15.14alpha to two points to obtain a bipartition U,V of the graph, then successively contract U and V to obtain two 'reduced' graphs. In each graph you will find 'two kinds' of vertices, i.e.:

  • "real vertices from the original graphs" and
  • "fake vertices that only appear because of the merge operation"

With Lemma 15.14alpha, what we know is that the max flow between any two "real points" u,v of a specific "new" graph is equal to their max flow in the original graph. Consequently, the three that is being built must indeed be the gomory-hu tree (that we know exists).

You say you have fixed the algorithm by a single line of code, however I still think that it might give incorrect results in some cases, for reasons above. Would you agree?

Do you believe that the way it is done is wrong? Or do you believe that it is suspect of being wrong because it is a simpler algorithm than the one given by Schrijver's proofs?

In my experience, writing the proof of an algorithm requires much more theoretical machinery than what is actually needed when one implements it. Sometimes, it is even easier to make the (theoretical) algorithm a bit slower in order to make the explanation simpler. Very recently, I had to give a formal proof of an algorithm needed for some research problem: while it was easy to explain on a table, it was so complicated to explain it on paper that I started trying to change the algorithm itself to make the proof easier.

Sorry for the misunderstanding, and thank you for your attention with respect to this issue.

Nathann

@sagetrac-borassi
Copy link
Mannequin

sagetrac-borassi mannequin commented May 18, 2015

comment:17

Hello!

I have analyzed the proofs in Schrijver: foosterhof is right, because you cannot take two arbitrary vertices, but the code is still correct, because the vertices taken are exactly the vertices u,v used in computing the initial u-v cut (these vertices have no name in Schrijver). We could then modify the proof of Theorem 15.14 in Schrijver by using u as r' and v as r!* (r' and r!* are the extremal points of the "new" arc added). In the last sentence of the proof, Lemma 15.14alpha yields the same conclusion for arcs e \neq r'r!, while for the arc r'r!, the two components of the tree obtained by removing r'r!* correspond to a minimum r'-r!* cut because we have chosen u=r' and v=r!''.

I have no clue on why Schrijver did not use this proof, which looks much simpler (at least to me). If you have any doubts, I can write the whole proof more formally.

Other (small) issues:

  • in the comments, I'd use "maximum flow" instead of "maximal flow": in my opinion, "maximal" means that there might be other elements that are not comparable (for instance, C is a maximal clique if there is no clique C' \supset C, while it is a maximum clique if it is a clique of maximum cardinality).
  • reading the code, I could not find Schrijver's reference: I think that adding this reference might be a good idea.
  • do you think we should provide a proof that our "simplification" of Schrijver's algorithm works?

Michele

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented May 18, 2015

Changed commit from 5732579 to b0bbcd4

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented May 18, 2015

Branch pushed to git repo; I updated commit sha1. New commits:

7fadf1ftrac #16475: Merged with 6.7
b0bbcd4trac #16475: maximal->maximum

@nathanncohen
Copy link
Mannequin

nathanncohen mannequin commented May 18, 2015

comment:19

Hello !

  • in the comments, I'd use "maximum flow" instead of "maximal flow"

I think that there is no confusion possible. I changed it anyway.

  • reading the code, I could not find Schrijver's reference: I think that adding this reference might be a good idea.

It appears in the documentation of Graph.gomory_hu_tree, right before the 'INPUT' block.

  • do you think we should provide a proof that our "simplification" of Schrijver's algorithm works?

As you please. Just remember that for as long as this ticket is not reviewed, Sage returns wrong answers. Also, Wikipedia contains a description of some algorithm to compute a Gomory-Hu tree (I did not read it). If it is the same then that's cool, and if ours is simpler then it is probably where the proof should be written.

http://en.wikipedia.org/wiki/Gomory%E2%80%93Hu_tree

Nathann

@sagetrac-borassi
Copy link
Mannequin

sagetrac-borassi mannequin commented May 19, 2015

Reviewer: Michele Borassi

@sagetrac-borassi
Copy link
Mannequin

sagetrac-borassi mannequin commented May 19, 2015

comment:21

Hello!

  • in the comments, I'd use "maximum flow" instead of "maximal flow"

I think that there is no confusion possible. I changed it anyway.

Ok :-)

  • reading the code, I could not find Schrijver's reference: I think that adding this reference might be a good idea.

It appears in the documentation of Graph.gomory_hu_tree, right before the 'INPUT' block.

Sorry, I didn't find it!

  • do you think we should provide a proof that our "simplification" of Schrijver's algorithm works?

As you please. Just remember that for as long as this ticket is not reviewed, Sage returns wrong answers. Also, Wikipedia contains a description of some algorithm to compute a Gomory-Hu tree (I did not read it). If it is the same then that's cool, and if ours is simpler then it is probably where the proof should be written.

I agree: this is not the place where the proof should be written. I think the code is fine as it is!

@vbraun
Copy link
Member

vbraun commented May 19, 2015

Changed branch from public/16475 to b0bbcd4

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants