Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Naive lowest common ancestor implementation #5736

Merged
merged 14 commits into from Jul 14, 2022

Conversation

dtekinoglu
Copy link
Contributor

No description provided.

@dschult
Copy link
Member

dschult commented Jun 15, 2022

Looks like nx.ancestors has two arguments: nx.ancestors(G, v)

Copy link
Member

@dschult dschult left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went through the code and made some suggestions. Looks good!

networkx/algorithms/lowest_common_ancestors.py Outdated Show resolved Hide resolved
networkx/algorithms/lowest_common_ancestors.py Outdated Show resolved Hide resolved
networkx/algorithms/lowest_common_ancestors.py Outdated Show resolved Hide resolved
@@ -133,6 +136,7 @@ def test_tree_all_pairs_lowest_common_ancestor12(self):
G = nx.DiGraph([(3, 4), (5, 4)])
pytest.raises(nx.NetworkXError, list, tree_all_pairs_lca(G))

# HOW TO PARAMETRIZE THIS ONE?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rossbar do you have any magic to handle parametrizing this test method?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say that for this test it makes more sense to parametrize on the inputs since some of the tested objects are generators (i.e. some require next in order to trigger there behavior) and some are functions. For example, I'd probably write this test like so:

@pytest.mark.parametrize("graph_type", (nx.Graph, nx.MultiGraph, nx.MultiDiGraph))
def test_not_implemented(self, graph_type):
    G = graph_type([(0, 1)])
    with pytest.raises(nx.NetworkxNotImplemented):
        next(tree_all_pairs_lca(G))
    with pytest.raises(nx.NetworkxNotImplemented):
        next(all_pairs_lca(G))
    with pytest.raises(nx.NetworkxNotImplemented):
        nx.lowest_common_ancestor(G, 0, 1)

... and if you wanted to add more generators/functions, you could add more with statements (e.g.

with pytest.raises(nx.NetworkxNotImplemented):
    nx.naive_lowest_common_ancestor(G, 0, 1)

The other thing you could do is split this up into two separate tests, one for generators and one for functions, that would allow you to stacked parametrizations (see the end of this section). That might look something like:

@pytest.mark.parametrize("graph_type", (nx.Graph, nx.MultiGraph, nx.MultiDiGraph))
@pytest.mark.parametrize("lca_gen", (tree_all_pairs_lca, all_pairs_lca))
def test_not_implemented_generator(self, graph_type, lca_gen):
    G = graph_type([(0, 1)])
    with pytest.raises(nx.NetworkxNotImplemented):
        next(lca_gen(G))

@pytest.mark.parametrize("graph_type", (nx.Graph, nx.MultiGraph, nx.MultiDiGraph))
@pytest.mark.parametrize("lca_fn", (tree_all_pairs_lca, all_pairs_lca))
def test_not_implemented_function(self, graph_type, lca_fn):
    G = graph_type([(0, 1)])
    with pytest.raises(nx.NetworkxNotImplemented):
        lca_fn(G, 0, 1)

The trick with parametrizing tests (IMO) is to strike a balance between utility and readability - of course, where that balance lies is usually pretty subjective :)

@dtekinoglu dtekinoglu marked this pull request as ready for review July 4, 2022 20:48
@dtekinoglu
Copy link
Contributor Author

Ready for review @MridulS @dschult @rossbar

Copy link
Member

@MridulS MridulS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great @dtekinoglu! added some little nit picks :)

pairs.add((u, v))
else:
if type(pairs) != list:
pairs = list(pairs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not too sure if this check is required. Even if pairs isn't a list (and just an iterator), the for loop below should work just fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I don't include this check test_naive_all_pairs_lowest_common_ancestor3 fails and I cannot figure out what else to do.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahhh yes, this is happening because the iterator is being consumed in this checking loop. So the next loop over pairs ends up just being a loop over an empty iterator. Maybe we shouldn't even check this at all here. If nx.ancestors is used with a node which is not in the graph it gives a nice clean error, something like

....
....

NetworkXError: The node 1100 is not in the digraph.

What do you think about removing this else condition here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

networkx/algorithms/lowest_common_ancestors.py Outdated Show resolved Hide resolved
networkx/algorithms/lowest_common_ancestors.py Outdated Show resolved Hide resolved
networkx/algorithms/lowest_common_ancestors.py Outdated Show resolved Hide resolved
Copy link
Member

@MridulS MridulS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! Thanks @dtekinoglu :)

Copy link
Member

@dschult dschult left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me! I have some comments below about possibly different ways of arranging loops in python. See what you think. None are critical. If you'd like the current code just reply with that and we'll merge it. :}

networkx/algorithms/lowest_common_ancestors.py Outdated Show resolved Hide resolved
networkx/algorithms/lowest_common_ancestors.py Outdated Show resolved Hide resolved
Copy link
Contributor

@rossbar rossbar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent, thanks @dtekinoglu ! This LGTM, just a stray comment about the tests.

Comment on lines +345 to +360
def assert_lca_dicts_same(self, d1, d2, G=None):
"""Checks if d1 and d2 contain the same pairs and
have a node at the same distance from root for each.
If G is None use self.DG."""
if G is None:
G = self.DG
root_distance = self.root_distance
else:
roots = [n for n, deg in G.in_degree if deg == 0]
assert len(roots) == 1
root_distance = nx.shortest_path_length(G, source=roots[0])

for a, b in ((min(pair), max(pair)) for pair in chain(d1, d2)):
assert (
root_distance[get_pair(d1, a, b)] == root_distance[get_pair(d2, a, b)]
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm not missing anything, it looks like the G=None arg isn't used in any of the following tests. If so, then this check could be made easier to understand by removing the unused kwarg and associated scaffolding, e.g.

Suggested change
def assert_lca_dicts_same(self, d1, d2, G=None):
"""Checks if d1 and d2 contain the same pairs and
have a node at the same distance from root for each.
If G is None use self.DG."""
if G is None:
G = self.DG
root_distance = self.root_distance
else:
roots = [n for n, deg in G.in_degree if deg == 0]
assert len(roots) == 1
root_distance = nx.shortest_path_length(G, source=roots[0])
for a, b in ((min(pair), max(pair)) for pair in chain(d1, d2)):
assert (
root_distance[get_pair(d1, a, b)] == root_distance[get_pair(d2, a, b)]
)
def assert_lca_dicts_same(self, d1, d2):
"""Checks if d1 and d2 contain the same pairs and
have a node at the same distance from root for each.
"""
for a, b in ((min(pair), max(pair)) for pair in chain(d1, d2)):
assert (
self.root_distance[get_pair(d1, a, b)] ==
self.root_distance[get_pair(d2, a, b)]
)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah - so I just looked at the rest of the file and realized that this method (along with most of the other patterns, e.g. the test method names) were taken from the existing test classes, which makes total sense. I think there's actually a lot to clean up in the existing tests, so let's not worry about this here... I will create a followup issue so that we can handle this separately and not block this PR.

I'm going to switch my previous comment to an approval, sorry for the noise @dtekinoglu !

This was referenced Jul 14, 2022
@dschult dschult merged commit b2f91c3 into networkx:main Jul 14, 2022
@jarrodmillman jarrodmillman added this to the networkx-2.8.6 milestone Aug 21, 2022
jarrodmillman pushed a commit that referenced this pull request Aug 21, 2022
* Add naive lca methods

* Naive algorithm implementation for LCA

* Modify naive lca functions

* Correct parameters of nx.ancestors

* Update lowest_common_ancestors.py

* Parametrize tests

* Apply suggestions from code review

Co-authored-by: Dan Schult <dschult@colgate.edu>

* Yield instead of append

* Tests for naive lca

* Correct test cases for naive lca algorithms

* Apply suggestions from code review

Co-authored-by: Mridul Seth <mail@mriduls.com>

* Fix function name -when calling

* Make requested changes

* Inlining _get_a_lowest_common_ancestor

Co-authored-by: dtuncturk <dilaramemis@sabanciuniv.edu>
Co-authored-by: Dan Schult <dschult@colgate.edu>
Co-authored-by: Mridul Seth <mail@mriduls.com>
MridulS added a commit to MridulS/networkx that referenced this pull request Feb 4, 2023
* Add naive lca methods

* Naive algorithm implementation for LCA

* Modify naive lca functions

* Correct parameters of nx.ancestors

* Update lowest_common_ancestors.py

* Parametrize tests

* Apply suggestions from code review

Co-authored-by: Dan Schult <dschult@colgate.edu>

* Yield instead of append

* Tests for naive lca

* Correct test cases for naive lca algorithms

* Apply suggestions from code review

Co-authored-by: Mridul Seth <mail@mriduls.com>

* Fix function name -when calling

* Make requested changes

* Inlining _get_a_lowest_common_ancestor

Co-authored-by: dtuncturk <dilaramemis@sabanciuniv.edu>
Co-authored-by: Dan Schult <dschult@colgate.edu>
Co-authored-by: Mridul Seth <mail@mriduls.com>
cvanelteren pushed a commit to cvanelteren/networkx that referenced this pull request Apr 22, 2024
* Add naive lca methods

* Naive algorithm implementation for LCA

* Modify naive lca functions

* Correct parameters of nx.ancestors

* Update lowest_common_ancestors.py

* Parametrize tests

* Apply suggestions from code review

Co-authored-by: Dan Schult <dschult@colgate.edu>

* Yield instead of append

* Tests for naive lca

* Correct test cases for naive lca algorithms

* Apply suggestions from code review

Co-authored-by: Mridul Seth <mail@mriduls.com>

* Fix function name -when calling

* Make requested changes

* Inlining _get_a_lowest_common_ancestor

Co-authored-by: dtuncturk <dilaramemis@sabanciuniv.edu>
Co-authored-by: Dan Schult <dschult@colgate.edu>
Co-authored-by: Mridul Seth <mail@mriduls.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants