Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

transitive_closure loses self loops #3187

Closed
alan-isaac opened this issue Oct 9, 2018 · 5 comments · Fixed by #3613
Closed

transitive_closure loses self loops #3187

alan-isaac opened this issue Oct 9, 2018 · 5 comments · Fixed by #3613

Comments

@alan-isaac
Copy link

Consider the following example:

import numpy as np
import networkx as nx
a = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])
G = nx.from_numpy_matrix(a, create_using=nx.MultiDiGraph())
T = nx.transitive_closure(G)
print(nx.to_numpy_matrix(T))

The transitive closure lacks the expected self loops. Why? (The documentation link does not work, and the docstring implies that self loops should be kept.) By "expected", I mean "according to standard definitions", such as the Wikipedia definition. I anticipate that a different definition is being used, but what is it?

The current answer at StackExchange claims this is an implementation error.

@dschult
Copy link
Member

dschult commented Oct 10, 2018 via email

@alan-isaac
Copy link
Author

alan-isaac commented Oct 10, 2018

Yes, the current documentation link is http://www.ics.uci.edu/~eppstein/PADS/PartialOrder.py, which fails. [Note: This link does work as of September 2019.]

I have not been able to find a definition of the transitive closure of a directed graph that does not imply that self loops should be included. Absence of evidence may not be evidence of absence, but my belief is that a closure must embed the original relation and all its reachable points.

@dschult dschult added this to the networkx-2.3 milestone Oct 12, 2018
@dschult
Copy link
Member

dschult commented Oct 12, 2018

We should make this clear at the very least -- and probably change the current default behavior. But as the link you reference above suggests, some people do not expect self-loops in the closure. Just because the mathematical meaning of a word like closure seems clear doesn't mean the word will be clear. Besides, your definition of "reachable points" may be different from others.

It seems to me there are two types of self-loops that arise and we need to be clear about which is treated which way. Is a node reachable from itself in zero steps? If so, we need self-loops on every node. I believe this is called "reflexive transitive closure". If we enforce that paths must have positive length, then we only get self-loops when there exists a cycle that includes that node.

I gather that you are saying that we should include self-loops due to cycles in the graph, but exclude self-loops if no such cycle exists. If this is what you are suggesting, then I agree.

@alan-isaac
Copy link
Author

Yes, you understand me. But I would understand an option (say, forceReflexive) for those who want the reflexive transitive closure to get it.

@dschult
Copy link
Member

dschult commented Oct 13, 2018

I agree.. An option reflexive=True could include all selfloops, reflexive=False could be the default with self-loops only for nodes on cycles. And we could add an option: reflexive=None which does not include any self-loops -- if that doesn't make the code to convoluted.

In any case, it means we'll have to separate this from using dfs_preorder_nodes because we need to see when cycles are formed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging a pull request may close this issue.

2 participants