-
-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ENH] MixedEdgeGraph for enablement of causal graphs #5947
base: main
Are you sure you want to change the base?
Conversation
Kay I got the CI passing. I think this is ready for discussion @dschult . Lmk if perhaps a call would be easier. |
Hi @dschult just wanted to follow up to see if we could converse on some of the high level design choices made here and if they're in line w/ what you think would work and be acceptable from maintainers end. Lmk if you need more time tho. I totally understand since this is a pretty large PR. We are actively building up causal algorithms in |
Thanks for the gentle ping. I still am time-crunched but took a quick look. Would it reduce maintenance burden to make MixedEdgeGraph a subclass of Graph? Then much of the node functionality would already be there. The general approach looks good to me. |
Sure completely understand! Re subclassing Graph: The issue I have there is how to handle the factory methods. My understanding is with #5850 we are at least improving this pattern. E.g. mainly how to handle:
Should I override the Or for now... should I just make the inheritance pretty ugly for the sake of getting it to work and at least subclassing the node functionality? |
Well -- I've looked through your code a little more carefully. And I have to say that I don't see that subclassing will help very much. So many of the methods would have to be changed, it would probably be harder to read than the current version. I'll go through in more detail. But I think this is a good framework and a good design choice. |
@adam2392 Thanks for this! I was revisiting this PR and I was also skimming through https://github.com/py-why/pywhy-graphs (it looks nice!). Do you think given |
@MridulS thanks for following up! Yes we would be interested in getting some form of this into networkX! Right now tho there's various class functions that don't work "exactly the way" I think networkx does but perhaps this can be improved in some iterations with networkx dev team input. I think if the API or internals change and it affects pywhy-graphs that's fine. ESP rn it's in early stages of experimentation. What's the best way to proceed here? |
I think first step would be either cleaning up this PR or creating a new one. I also see that pywhy-graphs uses type annotations everywhere, we aren't there yet. Would that be a blocker to get |
No it wouldn't be a blocker. We can remove the type annotations. I can adjust the PR here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can remove this
Kay @MridulS lmk wdyt? This contains our WIP example, documentation and related algorithms for the new class Re MixedEdgeGraph usage in pywhy-graphs: Just for transparency, we are planning on refactoring the
This is more inline w/ networkx, which puts responsibility of checking if a graph is of a certain type (e.g. acyclic) on a function instead. We would then implement these specific functions. However, as far as I can tell, this won't require any changes to |
Signed-off-by: Adam Li <adam2392@gmail.com>
* improve m-separation property runtime and efficiency Signed-off-by: Jaron Lee <jaron2005@gmail.com> * Add unit tests and coverage of error statements Co-authored-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
I tried to rebase this PR on the current main so it can pass the tests and github, workflows, etc. It looks like there might still be some errors/updates where the code tries to import from pywhy, etc. |
Signed-off-by: Adam Li <adam2392@gmail.com>
Thanks @dschult ! I fixed the doc-test. I think this works now at least locally for me. I'll take a pause here to allow you all to take a look? |
Yes @adam2392. I'll try to go over this PR this week. Thanks again for the updates :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like the right direction for moving forward. That is, it looks good!
I think you can go ahead and make other changes and improvements. It is a pretty big change in terms of lines of code, so it's hard to review, but I think it is generally fine. I have added a few minor suggestions below.
# Using the ``MixedEdgeGraph``, we can represent a causal graph | ||
# with two different kinds of edges. To create the graph, we | ||
# use networkx ``nx.DiGraph`` class to represent directed edges, | ||
# and ``nx.Graph`` class to represent edges without directions (i.e. | ||
# bidirected edges). The edge types are then specified, so the mixed edge | ||
# graph object knows which graphs are associated with which types of edges. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these comment lines repeat what is described above in the doc_string. This should probably be removed and any new ideas from here should be transferred to the doc_string. (maybe? push back if you think otherwise)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. I re-wrote this part to just have a short summary.
bidirected_edge_name="bidirected", | ||
): | ||
"""Return the moral graph from an ancestral graph in :math:`O(|V|^2)`. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a paragraph reminding people what the moral graph is and what an ancestral graph is?
Also, you should move the complexity statement out of the first line (that is used for short text near any link in the docs to this function. That first line should have a blank line after it. Then the paragraph.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done lmk wdyt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yay -- progress!
Some comments on the paragraph:
How can the undirected moral graph have a v-structure (which is directed) (it sounds like it is from the same graph)?
Is it correct that the v-structure appears in the ancestral graph? You don't actually say what the ancestral graph is -- just that it contains many types of edges. How are those edges determined (from ancestors?)
Also, the moral and ancestral graph must be related in some way, but the paragraph doesn't say how they are related. :}
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay tried to address this in a better way. Lmk if this still could be improved!
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
Apologies @dschult and @MridulS for the delay. I've been backlogged on stuff to do. I just addressed some of Dan's comments. I also updated the PR description to note some nuances that I forgot to describe when comparing networkx graphs and mixed-edge-graph. Btw totally happy to break up this PR, but just wanted clarity on how to best move forward first, slash if this is even something networkx wants (i.e. I don't want to put in the time to refactor code for PR sake if the PR is not desired). |
Signed-off-by: Adam Li <adam2392@gmail.com>
I am wondering if it would be better for If we return a dictionary of nodes with always the edge type, this is more in-line w/ how networkx currently works. The only change is there is essentially always an extra "node attribute" that is returned that indicates the edge type. On the other hand, if we return the nested dictionary, it would always preserve the edge type semantics. |
Closes: #5811
Summary of changes
MixedEdgeGraph
is a graph that accepts arbitrary combinations of "base" networkx graphs (Graph, DiGraph). These internal graphs can be used to then represent any arbitrary types of edges. Each internal graph is linked to a str of characters that semantically represent the type of edge the user wants. E.g.'bidirected': nx.Graph
links the word 'bidirected' with anx.Graph
object to represent the set of bidirected edges.nx.algorithms.causal
submodule is a set of general-purpose causal algorithms that operate over mixed edge graphs. The most canonical example is the generalization of d-separation to that of "m-separation". It's pretty much the same thing, except now there are bidirected edges.In general, I realize this is a big PR, so once the high-level changes are deemed desirable by the networkx team, I can just simply break up the PR into smaller chunks to make reviewing easier at the "detailed" level.
Relevant Examples of MixedEdgeGraph Usage
The most prominent explicit example demonstrating how
MixedEdgeGraph
is used is inPywhy-graphs
:
Here is an example spun up in the pywhy-graphs docs.
Note (11/20/23): We will most likely rename these graphs to something like
DiBiUnGraph
andDiBiUnCiGraph
to reflect the same design philosophy networkx took.Some notes on how mixed-edge graph API differs from regular networkx graphs. Iterating through the graph is not exactly the same, and I don't have a great way of consolidating. Iterating through a NetworkX graph iterates through the edges. However, when we iterate through a Mixed-edge-graph, what do we then return? Edge types and then edges, or edges and their edge-types simultaneously?
Misc.
See: https://github.com/py-why/dowhy/wiki/Networkx-Proposals-For-MixedEdgeGraph-Class-To-Enable-Causal-Graphs for a high level API design overview.
See: #5811 for discussion on the inclusion of graphs with mixed edges for the sake of causality based procedures.
cc: @dschult
cc folks invested in py-why: the result of this thread will affect future API and operations in pywhy. @amit-sharma @robertness @darthtrevino @petergtz @bloebp @emrekiciman @jaron-lee