Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The lca package #597

Merged
merged 39 commits into from
Aug 18, 2018
Merged

The lca package #597

merged 39 commits into from
Aug 18, 2018

Conversation

AlexandruValeanu
Copy link
Collaborator

@AlexandruValeanu AlexandruValeanu commented Jun 12, 2018

I have finished the lca package. The jgrapth-dev topic relevant to this PR is here.

  • Add small comments to explain harder parts
  • Write better documentation for all implementations

This should close #453.

# Conflicts:
#	jgrapht-core/src/main/java/org/jgrapht/GraphTests.java
#	jgrapht-core/src/main/java/org/jgrapht/generate/BarabasiAlbertForestGenerator.java
#	jgrapht-core/src/test/java/org/jgrapht/generate/BarabasiAlbertForestGeneratorTest.java
# Conflicts:
#	jgrapht-core/src/main/java/org/jgrapht/GraphTests.java
#	jgrapht-core/src/main/java/org/jgrapht/generate/BarabasiAlbertForestGenerator.java
#	jgrapht-core/src/test/java/org/jgrapht/generate/BarabasiAlbertForestGeneratorTest.java
# Conflicts:
#	jgrapht-core/src/main/java/org/jgrapht/alg/decomposition/HeavyPathDecomposition.java
#	jgrapht-core/src/main/java/org/jgrapht/alg/decomposition/package-info.java
#	jgrapht-core/src/test/java/org/jgrapht/alg/decomposition/HeavyPathDecompositionTest.java
@AlexandruValeanu AlexandruValeanu changed the title The lca package + Heavy-path decomposition The lca package Jul 21, 2018
@d-michail d-michail self-requested a review July 24, 2018 10:33
Copy link
Member

@d-michail d-michail left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice contribution. A few mostly stylistic and documentation comments.

*/
public class NaiveLcaFinder<V, E>
@Deprecated public class NaiveLcaFinder<V, E>

This comment was marked as resolved.

* @return internal parent array
*/
public int[] getParentArray(){
return parent.clone();

This comment was marked as resolved.

@SuppressWarnings("unchecked")
public Set<V> getAllLCAs(V a, V b)
{
Set<V>[] visitedSets = new Set[2];

This comment was marked as resolved.

* has been visited from both a and b, it is no longer expanded in our search (we know that its
* ancestors won't be part of the SLCA(x, y) set).
*/
@SuppressWarnings("unchecked")

This comment was marked as resolved.

* </p>
*
* Preprocessing Time complexity: $O(1)$
* Preprocessing Memory complexity: $O(1)$

This comment was marked as resolved.

public class EulerTourRMQLCAFinder<V, E> implements LCAAlgorithm<V> {
private final Graph<V, E> graph;
private final Set<V> roots;
private final int MAX_LEVEL;

This comment was marked as resolved.

}
}

private void dfs(int u, int parent, int lvl) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you convert this to iterative for better scalability?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

for (int i = 0; i < sizeTour; i++)
rmq[0][i] = i;

for (int i = 1; (1 << i) <= sizeTour; i++)

This comment was marked as resolved.

* Algorithm for computing lowest common ancestors in rooted trees and forests based on {@link HeavyPathDecomposition}.
*
* Preprocessing Time complexity: $O(|V|)$
* Preprocessing Memory complexity: $O(|V|)$

This comment was marked as resolved.


/**
* Algorithm for computing lowest common ancestors in rooted trees and forests based on {@link HeavyPathDecomposition}.
*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add some reference to appropriate paper which introduced the technique. If this is not explicit to a particular paper, add a reference to the dynamic trees paper of Sleator and Tarjan which made all these possible.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added on in the HeavyPathDecomposition. Thank you for the remainder.

Copy link
Collaborator

@jkinable jkinable left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you post a link to the place where you put the performance tests results? Can't seem to find it in my mailbox.

This PR introduces a variety of ways to compute LCAs.
Is there a clear winner? Is there a clear looser? Are there specific trade-offs the user should consider when using a specific LCA algorithm? Are there certain types of trees/forests which are solved better by a specific LCA implementation? Your typical LCA user might just want the "best" algorithm, without wasting time on performing a performance comparison.
I would add a paragraph of javadoc to each of the LCAs where you discuss the above questions, and crossreference to other LCAs to make it as simple as possible for the user to select the best LCA algorithm for his application.

Are there LCA algorithms or related problems that are not included in this PR but which we might want to consider including somewhere in the future?

import static org.jgrapht.util.MathUtil.log2;

/**
* Algorithm for computing lowest common ancestors in rooted trees and forests using the binary lifting method.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a reference for this technique?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure a formal reference (i.e. academic paper) exists. I've added a link to the original article on Topcoder. That's the best I can think of.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It actually looks like a "brute-force" approach, using the lookup table.

This reference might be related, but it needs further investigation.

Bender, Michael A., and Martın Farach-Colton. "The level ancestor problem simplified." Theoretical Computer Science 321.1 (2004): 5-12.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is indeed a technique from that paper (or some older one). See Demaine's notes page 7, Algorithm B.

"The basic idea is to use jump pointers. These are pointers at a node which reference one of the
node’s ancestors. For each node create jump pointers to ancestors at levels 1, 2, 4, . . . , 2k. Queries are answered by repeatedly jumping from node to node, each time jumping more than half of the remaining levels between the current ancestor and goal ancestor. So worst-case number of jumps is O(log n). Preprocessing is done by filling in jump pointers using dynamic programming."

No one is actually using the term "binary lifting". I think this was coined up by people in topcoder or some other programming competition.

While keeping the name is fine, I think the Javadocs need a reference to an academic paper using this technique. Moreover, they need at least a paragraph explaining the lookup table with the jump pointers, and how you answer the LCA query in O(logn) time.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we need more than the current explanation?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it looks fine. @jkinable ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine if we add the reference (+quoted text) pointed out by @d-michail

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it already added? See

* The method appears in <i>Bender, Michael A., and Martın Farach-Colton. "The level ancestor problem

* and b. Of course we may have to wait longer if the path to a is of length n, but the path to
* b&gt;n. at the first loop we have a path of 0 length from the nodes we are considering as LCA to
* their respective children which we wish to find the LCA for.
*

This comment was marked as resolved.

*
* <p>
* See the article on
* <a href="https://en.wikipedia.org/wiki/Tarjan%27s_off-line_lowest_common_ancestors_algorithm">wikipedia</a> for more

This comment was marked as resolved.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment was marked as resolved but I dont see any changes?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are. See

* The original algorithm can be found in <i>Gabow, H. N.; Tarjan, R. E. (1983),

Am I the only one that sees them? :)

@AlexandruValeanu
Copy link
Collaborator Author

@jkinable

Can you post a link to the place where you put the performance tests results? Can't seem to find it in my mailbox.

You can get some performance results by running perf.lca.LowestCommonAncestorAlgorithmPerformanceTest

This PR introduces a variety of ways to compute LCAs.
Is there a clear winner? Is there a clear looser? Are there specific trade-offs the user should consider when using a specific LCA algorithm? Are there certain types of trees/forests which are solved better by a specific LCA implementation? Your typical LCA user might just want the "best" algorithm, without wasting time on performing a performance comparison.
I would add a paragraph of javadoc to each of the LCAs where you discuss the above questions, and crossreference to other LCAs to make it as simple as possible for the user to select the best LCA algorithm for his application.

Tarjan's algorithm appears to be a clear looser but it uses the smaller amount of extra memory to perform a query and it doesn't need to preprocess the graph. The other three implementations are quite close and I guess it will depend on the input to see which is better. There doesn't appear to be a clear winner. I'll add some notes to each doc to try to guide the user.

Are there LCA algorithms or related problems that are not included in this PR but which we might want to consider including somewhere in the future?

I do not think there exist popular algorithms (for computing LCAs in trees) that are not included in this package. The RMQ part in EulerTourRMQLCAFinder can be done in linear time but it is more complicated and I am not sure how well it behaves in practice. Another LCA-related problem will be to compute LCAs in dynamic trees (trees where you can add/remove a vertex). Tarjan and Sleator have introduced link-cut trees that support these operations.

*
* @author Alexandru Valeanu
*/
public interface LCAAlgorithm<V> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the main interface name, let's spell it out as LowestCommonAncestorAlgorithm. Elsewhere (e.g. in the method names and in the implementation classes) we can keep the acronym.

return null;

/*
* Idea: Get a anb b on the same vertex path by 'jumping' from one path to another
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and

* @return the LCA of a and b, or null if there is no LCA.
*/
V getLCA(V a, V b);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we expose getAllLCAs at the interface level as well? Algorithms are free to throw UnsupportedOperationException if they don't support it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. That method is only added to classes that can implement the function more efficiently than the default one (only Tarjan so far) and to make it easier for the user to get LCAs for a batch of queries.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm referring to the getAllLCAs in NaiveLCAFinder, which finds all the lowest common ancestors (in the case where there are multiple) instead of just returning one chosen arbitrarily.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I completely misread it. We can expose it and make all implementations except NaiveLCAFinder return the result of getLCA. What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That works, but then it shouldn't have "all" in the name since it's not guaranteed to return a comprehensive set (and as with isomorphism inspectors, each alg doc should specify its own behavior). How about

getLCA(V, V) : returns one
getLCASet(V, V) : returns many
getBatchLCA(List): batch version, each query returns one
getBatchLCASet(List): batch version, each query returns many

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I assume that the batch version is implemented in terms of the non-batch one. And in this case, getLCASet is not supported for all tree-based implementations.

private int[] component;

/**
* Construct a new instance of the algorithm..<br>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why two dots? Also, use p instead of br.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea where the two dots come from. Will change to <p>

}

V vertexU = indexList.get(u);
for (E edge: graph.edgesOf(vertexU)){

This comment was marked as resolved.

private int[] representative;

private int[][] rmq;
private int[] log2;

This comment was marked as resolved.

@@ -555,34 +556,6 @@
Graphs.addEdgeWithVertices(graph, 1, 3);
assertTrue(GraphTests.isWeaklyChordal(graph));
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these tests being eliminated?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are not. My editor auto-converted @Test public void testIsWeaklyChordal() to

@Test
public void testIsWeaklyChordal()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm referring to failRequireIsWeightedOnUnweightedGraph and failRequireIsWeightedOnNull

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea what happened. I've put the three tests back.

*
* @author Alexandru Valeanu
*/
public abstract class LCATestBase {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this class only provides non-DAG cases, it should be named LCATreeTestBase

* @author Barak Naveh
* @author Alexandru Valeanu
*/
public class NaiveLCAFinderTest {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then it's probably worth splitting to distinguish the small cases from the large ones; either that or use an explicit suite to select them.

Assert.assertEquals("b", createSolver(g, Collections.singleton("b")).getLCA("h", "b"));
}

// TODO: keep?

This comment was marked as resolved.

}

private void dfsIterative(int u, int startLevel){
// set of vertices for which the the part of the if has been performed

This comment was marked as resolved.

@jkinable
Copy link
Collaborator

jkinable commented Aug 15, 2018

@AlexandruValeanu can you add reference given by @d-michail to BinaryLiftingLCAFinder.java? Also, for TarjanLCAFinder: do you have a better reference than the wiki url?

@d-michail if you and John's comments have been addressed, can you go ahead and merge this one?

@AlexandruValeanu
Copy link
Collaborator Author

AlexandruValeanu commented Aug 15, 2018

@jkinable I have added both references (see

* The method appears in <i>Bender, Michael A., and Martın Farach-Colton. "The level ancestor problem
for Binary... and
* The original algorithm can be found in <i>Gabow, H. N.; Tarjan, R. E. (1983),
for Tarjan...)

@d-michail d-michail merged commit 048bf34 into jgrapht:master Aug 18, 2018
@d-michail
Copy link
Member

Thanks!

@syoon2 syoon2 mentioned this pull request Apr 12, 2024
7 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature Request: lowest common ancestor algorithms for trees
4 participants