New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Revised clique related code #365
Conversation
- Refactored and added timeout in BronKerbosch clique enumeration - Extracted base implementation - Added interface for maximal clique enumeration - Added pivot version of Bron-Kerbosch - Added Coreness vertex scorer - Added Degeneracy graph iterator - Added Bron-Kerbosch variant with pivoting and degeneracy ordering - Added performance test between different maximal clique enumeration algorithms - Moved CliqueMinimalSeparatorDecomposition into org.jgrapht.alg.clique - Added tests for all new code
This pull request also has implications for pull request #249 If I recall correctly, there were still quite a few issues with that pull request. Nevertheless it's probably a good idea to see what can be used from that pull request, e.g. perhaps some of the test code? |
This one touches the maximal-clique enumeration algorithm and adds 2 more variants. Pull-request #249 tries to solve the maximum clique by (a) reducing to independent set, (b) then reducing to vertex cover and (c) calling the exact VC algorithm. There are definitely some useful stuff in there. If the original author does not find any time to finish it, we could sync it with the rest of the library. |
} | ||
} | ||
|
||
Set<V> R = new HashSet<>(Arrays.asList(vi)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably marginally more efficient to split this into 2:
Set R = new HashSet<>();
R.add(vi);
/** | ||
* Timeout in nanoseconds | ||
*/ | ||
protected final long nanos; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's good to have a time limit, but currently there's no way of telling whether the algorithm terminated due to a time limit, or whether it simply finished its computations. I would propose to add a method which returns the termination status of the algorithm. This can be simply a boolean function, e.g. boolean timeLimReached()
or it can be a more fancy 'getStatus()' function which returns a TerminationStatus (enum), such as:
TerminationStatus.Optimal, TerminationStatus.TimeLimReached, TerminationStatus.MaxIterationsReached.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a boolean returning method to test whether the timeout has been reached.
buckets[i] = new HashSet<>(); | ||
} | ||
|
||
int minDegree = n + 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minDegree = n + 1;
This raises some questions:
- do you allow graphs with self loops? If so, then the max degree of any vertex in a complete graph + self loop = n
- if you do not allow self loops, then the max degree of any vertex = n-1
- what about pseudo graphs where the degree of a vertex can be arbitrarily large? If this class doesn't support pseudo graphs, then we should probably make a remark about this in the class description?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Restricted to simple graphs.
g.addEdge(V5, V7); | ||
g.addEdge(V6, V7); | ||
|
||
// for fun, add an overlapping clique { V3, V4, V5 } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
finder.iterator().forEachRemaining(cliques::add); | ||
|
||
int found = cliques.size(); | ||
assertTrue(found == 0 || found == 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure how valuable this test is? This test works on the assumption that the machine which executes the clique finder finds 0 or 1 cliques within the time limit. The faster our test machines become, the more often it will find 1. I would just remove this test. Or, if you really want to test this, put a time limit on the computations, and use an external timer to measure how long the computations take (and whether the time limit was respected).
Personally I wouldn't bother. If this breaks, then I'm sure we'll get complaints. This issue would be easy to troubleshoot.
/** | ||
* . | ||
* | ||
* @author John V. Sichi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just in case: Are you or John the author of this class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just copy-pasted most of the code in there, so kept the same header.
public void testBadParameters() | ||
{ | ||
try { | ||
new PageRank<>(null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should probably be:
new Coreness<>(null);
* | ||
* @author Dimitrios Michail | ||
*/ | ||
public class MaximalCliqueEnumerationPerformanceTest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the results of this class, can you add some recommendations to the javadoc in the various maximal clique enumeration classes? Something like:
"On average, the best performance is achieved with MaxCliqueEnumerator X. If you have a sparse graph, it may be better to use MaxCliqueEnumerator Y. The best runtime complexity is achieved by MaxCliqueEnumerator Z."
I don't know whether something like that is possible. It may quite well be the case that neither of the implementations dominates all the other implementations. So this is just a suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure whether we can add a valid enough statement there.
} | ||
|
||
@Test | ||
public void testGraph2WithListener() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test seems pretty much the same as the testGraph2
test, except that you added a listener. I would recommend combining the two.
* | ||
* @return an iterator over all the maximum cliques | ||
*/ | ||
Iterator<Set<V>> maximumIterator(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just from the description, I wouldn't understand the difference between the maximumIterator() function and the iterator() function. Could you clarify/rephrase this a bit? If the only difference is the size, then I would rename this method to:
descCliqueSizeIterator();
and update the description to:
Returns an iterator which iterators over all maximal cliques, in descending order (clique size).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed it from the interface, as it was indeed confusing. I left it in the implementation with improved documentation.
Another nice addition! Thanks! |
Main changes: