[ENH] MixedEdgeGraph for enablement of causal graphs #5947

adam2392 · 2022-08-27T15:56:03Z

Closes: #5811

Summary of changes

MixedEdgeGraph is a graph that accepts arbitrary combinations of "base" networkx graphs (Graph, DiGraph). These internal graphs can be used to then represent any arbitrary types of edges. Each internal graph is linked to a str of characters that semantically represent the type of edge the user wants. E.g. 'bidirected': nx.Graph links the word 'bidirected' with a nx.Graph object to represent the set of bidirected edges.
nx.algorithms.causal submodule is a set of general-purpose causal algorithms that operate over mixed edge graphs. The most canonical example is the generalization of d-separation to that of "m-separation". It's pretty much the same thing, except now there are bidirected edges.
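To make the "one internal graph per edge type" idea concrete, here is a minimal sketch that uses only plain networkx objects. It is not the PR's `MixedEdgeGraph` implementation: the `edge_graphs` dict and the `edge_types_between` helper are hypothetical names introduced purely for illustration.

```python
import networkx as nx

# Hypothetical stand-in for the container described above: a plain dict that
# maps an edge-type name to the networkx graph holding edges of that type.
edge_graphs = {
    "directed": nx.DiGraph(),
    "bidirected": nx.Graph(),
}

# Every internal graph shares the same node set.
nodes = ["x", "y", "z"]
for graph in edge_graphs.values():
    graph.add_nodes_from(nodes)

edge_graphs["directed"].add_edge("x", "y")    # x -> y
edge_graphs["bidirected"].add_edge("y", "z")  # y <-> z


def edge_types_between(u, v):
    """Return the sorted edge-type names with an edge between u and v."""
    return sorted(
        name
        for name, graph in edge_graphs.items()
        if graph.has_edge(u, v) or graph.has_edge(v, u)
    )
```

With this layout, `edge_types_between("y", "x")` reports only the directed edge, while `edge_types_between("x", "z")` is empty.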

In general, I realize this is a big PR, so once the high-level changes are deemed desirable by the networkx team, I can just simply break up the PR into smaller chunks to make reviewing easier at the "detailed" level.

Relevant Examples of MixedEdgeGraph Usage

The most prominent explicit example demonstrating how MixedEdgeGraph is used is in Pywhy-graphs:

The ADMG class (bidirected, directed edges and sometimes undirected): link
The PAG class (directed, bidirected, undirected and directed edges with circle endpoints): link

Here is an example spun up in the pywhy-graphs docs.

Note (11/20/23): We will most likely rename these graphs to something like DiBiUnGraph and DiBiUnCiGraph to reflect the same design philosophy networkx took.

Some notes on how the mixed-edge graph API differs from regular networkx graphs: iterating through the graph is not exactly the same, and I don't have a great way of consolidating the two. Iterating over the edges of a NetworkX graph (e.g. via G.edges) yields the edges. However, when we iterate through a mixed-edge graph, what do we then return? Edge types and then edges, or edges and their edge types simultaneously?
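The two iteration styles in question can be sketched as follows. This is an illustration under assumptions, not the PR's API: the graph is modelled as a plain dict of networkx graphs, and `iter_by_type` and `iter_flat` are hypothetical helper names.

```python
import networkx as nx

# Hypothetical mixed-edge container: edge-type name -> networkx graph.
edge_graphs = {
    "directed": nx.DiGraph([("x", "y")]),
    "bidirected": nx.Graph([("y", "z")]),
}


def iter_by_type(graphs):
    """Option 1: yield each edge type together with its list of edges."""
    for edge_type, graph in graphs.items():
        yield edge_type, list(graph.edges())


def iter_flat(graphs):
    """Option 2: yield (u, v, edge_type) triples, one per edge."""
    for edge_type, graph in graphs.items():
        for u, v in graph.edges():
            yield u, v, edge_type
```

Option 1 keeps the edge types grouped, while option 2 looks more like iterating `G.edges(data=...)` on a regular graph, with the edge type carried alongside each edge.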

Misc.

See: https://github.com/py-why/dowhy/wiki/Networkx-Proposals-For-MixedEdgeGraph-Class-To-Enable-Causal-Graphs for a high level API design overview.

See: #5811 for discussion on the inclusion of graphs with mixed edges for the sake of causality based procedures.

cc: @dschult

cc folks invested in py-why: the result of this thread will affect future API and operations in pywhy. @amit-sharma @robertness @darthtrevino @petergtz @bloebp @emrekiciman @jaron-lee

adam2392 · 2022-08-29T19:08:29Z

Kay I got the CI passing. I think this is ready for discussion @dschult . Lmk if perhaps a call would be easier.

adam2392 · 2022-09-07T13:51:32Z

Hi @dschult just wanted to follow up to see if we could converse on some of the high level design choices made here and if they're in line w/ what you think would work and be acceptable from maintainers end. Lmk if you need more time tho. I totally understand since this is a pretty large PR.

We are actively building up causal algorithms in py-why/dodiscover and also specific causal graphs in py-why/pywhy-graphs, which all rely on the underlying MixedEdgeGraph presented here. So preferably we can converge on the API now, so that I'm refactoring fewer LOC later :p

dschult · 2022-10-12T03:07:05Z

Thanks for the gentle ping. I still am time-crunched but took a quick look. Would it reduce maintenance burden to make MixedEdgeGraph a subclass of Graph? Then much of the node functionality would already be there.

The general approach looks good to me.

adam2392 · 2022-10-13T19:36:54Z

Sure completely understand!

Re subclassing Graph: The issue I have there is how to handle the factory methods. My understanding is with #5850 we are at least improving this pattern. E.g. mainly how to handle:

adjlist_outer_dict_factory
adjlist_inner_dict_factory

Should I override the dict-of-dict-of-dict data structure that the Graph class uses with a dict-of-dict-of-dict of edge_type dict data structure?

Or for now... should I just make the inheritance pretty ugly for the sake of getting it to work and at least subclassing the node functionality?
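For context, this is roughly how the factory attributes are overridden in a `Graph` subclass. It is a minimal sketch: `TrackedDict` and `CustomAdjGraph` are hypothetical names, and the private `_adj` attribute is inspected only to show where the factories take effect.

```python
import networkx as nx


class TrackedDict(dict):
    """Hypothetical dict subclass, used only to make the factories visible."""


class CustomAdjGraph(nx.Graph):
    # networkx calls these factories whenever it creates adjacency dicts.
    adjlist_outer_dict_factory = TrackedDict  # node -> neighbors mapping
    adjlist_inner_dict_factory = TrackedDict  # neighbor -> edge-data mapping


G = CustomAdjGraph()
G.add_edge("x", "y", weight=1.0)

outer = G._adj       # built by adjlist_outer_dict_factory
inner = G._adj["x"]  # built by adjlist_inner_dict_factory
```

The factories swap the dict type at each nesting level but do not add a level, which is why an extra edge-type layer does not fit this pattern directly.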

dschult · 2022-10-17T00:28:25Z

Well -- I've looked through your code a little more carefully. And I have to say that I don't see that subclassing will help very much. So many of the methods would have to be changed, it would probably be harder to read than the current version.

I'll go through in more detail. But I think this is a good framework and a good design choice.
Thanks!

MridulS · 2023-09-22T15:28:58Z

@adam2392 Thanks for this! I was revisiting this PR and I was also skimming through https://github.com/py-why/pywhy-graphs (it looks nice!). Given that pywhy-graphs exists, do you think we should still try to merge MixedEdgeGraph upstream into NetworkX? Personally I would love to see this get into NetworkX, but I also don't want to slow down the pace of development if merging this inside NetworkX means pywhy-graphs wouldn't be able to quickly experiment with new stuff.

adam2392 · 2023-09-22T16:02:30Z

@MridulS thanks for following up!

Yes we would be interested in getting some form of this into networkX! Right now tho there's various class functions that don't work "exactly the way" I think networkx does but perhaps this can be improved in some iterations with networkx dev team input.

I think if the API or internals change and it affects pywhy-graphs that's fine. ESP rn it's in early stages of experimentation.

What's the best way to proceed here?

MridulS · 2023-09-24T19:43:13Z

> What's the best way to proceed here?

I think first step would be either cleaning up this PR or creating a new one. I also see that pywhy-graphs uses type annotations everywhere, we aren't there yet. Would that be a blocker to get MixedEdgeGraph into networkx?

adam2392 · 2023-09-25T16:27:26Z

No it wouldn't be a blocker. We can remove the type annotations. I can adjust the PR here.

adam2392 · 2023-09-25T16:50:41Z

networkx/algorithms/causal/convert.py

We can remove this

adam2392 · 2023-09-25T16:54:28Z

Kay @MridulS lmk wdyt? This contains our WIP example, documentation and related algorithms for the new class MixedEdgeGraph, which we would utilize in pywhy-graphs.

Re MixedEdgeGraph usage in pywhy-graphs: Just for transparency, we are planning on refactoring pywhy-graphs so that, instead of classes like ADMG and CPDAG, it has the following graph classes:

DiBiUnGraph: directed, bidirected and undirected edge graph (e.g. ADMG, CPDAG, MAG)
DiBiUnCiGraph: the above, with also circular endpoint edges (i.e. PAG)

This is more in line w/ networkx, which puts the responsibility of checking whether a graph is of a certain type (e.g. acyclic) on a function instead. We would then implement these specific functions. However, as far as I can tell, this won't require any changes to nx.MixedEdgeGraph. But any significant changes to MixedEdgeGraph might impact our plan downstream.
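A function-based check of this kind might look like the sketch below. It assumes the mixed-edge graph can be modelled as a dict of networkx graphs. The helper `directed_part_is_acyclic` and its `directed_name` parameter are hypothetical, and only `nx.is_directed_acyclic_graph` is an existing networkx call.

```python
import networkx as nx


def directed_part_is_acyclic(edge_graphs, directed_name="directed"):
    """Return True if the directed-edge subgraph contains no directed cycle.

    ``edge_graphs`` is a hypothetical mapping of edge-type name to graph.
    """
    return nx.is_directed_acyclic_graph(edge_graphs[directed_name])


acyclic = {
    "directed": nx.DiGraph([("x", "y"), ("y", "z")]),
    "bidirected": nx.Graph([("x", "z")]),
}
cyclic = {
    "directed": nx.DiGraph([("x", "y"), ("y", "x")]),
    "bidirected": nx.Graph(),
}
```

The graph class itself stays permissive, and the property is verified only by callers that need it, in the same way `nx.is_directed_acyclic_graph` works for `nx.DiGraph`.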

dschult · 2023-09-25T18:56:54Z

I tried to rebase this PR on the current main so it can pass the tests, GitHub workflows, etc.
The pytest test-finding routine was finding duplicates in causal and mixed_edge and balking. So I ended up removing the causal directory because it looked like it was an early version of the tools in mixed_edge directory. If that isn't correct, either open a new PR or let me know which files go where.

It looks like there might still be some errors/updates where the code tries to import from pywhy, etc.
Can you try to pull this down to your local repo and then work with it? I thought it would be easier for me to rebase than for you to go through that. Hopefully I didn't cause more trouble than it was worth.

adam2392 · 2023-09-25T19:42:07Z

Thanks @dschult !

I fixed the doc-test. I think this works now at least locally for me. I'll take a pause here to allow you all to take a look?

adam2392 · 2023-09-28T13:58:56Z

@MridulS and @dschult sorry just to clarify, I should leave this as is until you both have a chance to look things over, correct? Just wanted to make sure I am not missing anything. Thanks!

MridulS · 2023-09-28T14:07:33Z

Yes @adam2392. I'll try to go over this PR this week. Thanks again for the updates :)

dschult

This looks like the right direction for moving forward. That is, it looks good!
I think you can go ahead and make other changes and improvements. It is a pretty big change in terms of lines of code, so it's hard to review, but I think it is generally fine. I have added a few minor suggestions below.

dschult · 2023-09-27T19:31:39Z

examples/mixededge/plot_mixed_edge_graph.py

+# Using the ``MixedEdgeGraph``, we can represent a causal graph
+# with two different kinds of edges. To create the graph, we
+# use networkx ``nx.DiGraph`` class to represent directed edges,
+# and ``nx.Graph`` class to represent edges without directions (i.e.
+# bidirected edges). The edge types are then specified, so the mixed edge
+# graph object knows which graphs are associated with which types of edges.


I think these comment lines repeat what is described above in the doc_string. This should probably be removed and any new ideas from here should be transferred to the doc_string. (maybe? push back if you think otherwise)

I agree. I re-wrote this part to just have a short summary.

dschult · 2023-10-11T19:28:47Z

networkx/algorithms/mixed_edge/mixed_edge_moral.py

+    bidirected_edge_name="bidirected",
+):
+    """Return the moral graph from an ancestral graph  in :math:`O(|V|^2)`.
+


Can you add a paragraph reminding people what the moral graph is and what an ancestral graph is?
Also, you should move the complexity statement out of the first line (that line is used as the short text near any link in the docs to this function). The first line should have a blank line after it, and then the paragraph.
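The requested docstring layout can be sketched as follows. To keep the example self-contained it moralizes an ordinary DAG rather than an ancestral graph, so it is not the PR's function: `moral_graph_of_dag` and its `parents` argument are hypothetical.

```python
from itertools import combinations


def moral_graph_of_dag(parents):
    """Return the moral graph of a DAG.

    The moral graph of a DAG is the undirected graph obtained by
    connecting every pair of parents that share a child and then
    dropping the direction of all edges.

    Parameters
    ----------
    parents : dict
        Maps each node to an iterable of its parents.

    Returns
    -------
    set of frozenset
        The undirected edges of the moral graph.

    Notes
    -----
    Complexity statements belong here rather than in the summary line.
    """
    edges = set()
    for child, node_parents in parents.items():
        node_parents = list(node_parents)
        # Keep every original edge, without its direction.
        for parent in node_parents:
            edges.add(frozenset((parent, child)))
        # "Marry" each pair of parents that share this child.
        for a, b in combinations(node_parents, 2):
            edges.add(frozenset((a, b)))
    return edges
```

For the v-structure `a -> c <- b`, the result contains the extra undirected edge between `a` and `b`.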

Done lmk wdyt.

Yay -- progress!

Some comments on the paragraph:
How can the undirected moral graph have a v-structure (which is directed) (it sounds like it is from the same graph)?
Is it correct that the v-structure appears in the ancestral graph? You don't actually say what the ancestral graph is -- just that it contains many types of edges. How are those edges determined (from ancestors?)
Also, the moral and ancestral graph must be related in some way, but the paragraph doesn't say how they are related. :}

Thanks!

Okay tried to address this in a better way. Lmk if this still could be improved!

adam2392 · 2023-11-20T22:23:28Z

Apologies @dschult and @MridulS for the delay. I've been backlogged on stuff to do. I just addressed some of Dan's comments.

I also updated the PR description to note some nuances that I forgot to describe when comparing networkx graphs and mixed-edge-graph.

Btw totally happy to break up this PR, but just wanted clarity on how to best move forward first, slash if this is even something networkx wants (i.e. I don't want to put in the time to refactor code for PR sake if the PR is not desired).

adam2392 · 2023-11-21T04:01:49Z

I am wondering if it would be better for G.adj[node] to return a dictionary of nodes with edge_type being the data, rather than returning a nested dictionary of the form {edge_type: G.get_graph(edge_type).adj}?

If we return a dictionary of nodes with always the edge type, this is more in-line w/ how networkx currently works. The only change is there is essentially always an extra "node attribute" that is returned that indicates the edge type. On the other hand, if we return the nested dictionary, it would always preserve the edge type semantics.
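The two candidate return shapes can be compared with plain dictionaries. This is only a sketch: `edges_by_type`, `adj_nested` and `adj_flat` are hypothetical, and storing a list under an `"edge_types"` key is an assumption made here so that a neighbor reachable through several edge types is not overwritten.

```python
# Hypothetical storage: edge type -> node -> set of neighbors.
edges_by_type = {
    "directed": {"x": {"y"}, "y": set(), "z": set()},
    "bidirected": {"x": set(), "y": {"z"}, "z": {"y"}},
}


def adj_nested(node):
    """Nested shape: {edge_type: {neighbor: edge_data}}."""
    return {
        edge_type: {nbr: {} for nbr in adj[node]}
        for edge_type, adj in edges_by_type.items()
    }


def adj_flat(node):
    """Flat shape: {neighbor: {"edge_types": [...]}}."""
    flat = {}
    for edge_type, adj in edges_by_type.items():
        for nbr in adj[node]:
            flat.setdefault(nbr, {"edge_types": []})["edge_types"].append(edge_type)
    return flat
```

The nested shape always keeps the edge-type level, even when it is empty, while the flat shape resembles the neighbor-keyed mapping that `G.adj[node]` returns for a regular graph.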

adam2392 mentioned this pull request Aug 30, 2022

implement more efficient m-separation property with tests py-why/graphs#5

Closed

5 tasks

adam2392 mentioned this pull request Sep 22, 2022

[DOC] Questions about the internal data structure for DynDiGraph and DynGraph GiulioRossetti/dynetx#144

Open

adam2392 mentioned this pull request Dec 21, 2022

[Networkx] Add MixedEdgeGraph class into pywhy-graphs py-why/pywhy-graphs#28

Closed

jaron-lee mentioned this pull request Feb 23, 2023

Fixes minimal d-separator function failing to handle cases where no d-separators exist #6438

Closed

adam2392 and others added 12 commits September 25, 2023 14:14

Basic initial commit

8905a09

Add warning

e9e3952

Add warning

244fbae

Add causal updates

045f925

Fix imports

b87225d

Fix ci

9c902f7

Signed-off-by: Adam Li <adam2392@gmail.com>

Fix ci

5770780

Signed-off-by: Adam Li <adam2392@gmail.com>

Fix ci

cce4fb6

Signed-off-by: Adam Li <adam2392@gmail.com>

fix import

c81d6ff

Signed-off-by: Adam Li <adam2392@gmail.com>

Fix error unnecessary

78cc122

Improved m-separation property (#1)

9efe8b7

* improve m-separation property runtime and efficiency Signed-off-by: Jaron Lee <jaron2005@gmail.com> * Add unit tests and coverage of error statements Co-authored-by: Adam Li <adam2392@gmail.com>

Ran precommit

1c70bfc

Signed-off-by: Adam Li <adam2392@gmail.com>

dschult force-pushed the mixededge branch from 25e79a3 to 1c70bfc Compare September 25, 2023 18:18

dschult added 2 commits September 25, 2023 14:30

try to clean up duplicate functions

6de96ba

fix linters

af63ea9

fix errors

97d321d

Fix doctest

39a0326

Signed-off-by: Adam Li <adam2392@gmail.com>

dschult reviewed Oct 11, 2023

adam2392 added 2 commits November 20, 2023 17:16

Merge main

51c2be2

Signed-off-by: Adam Li <adam2392@gmail.com>

Address dan's comments' -s

946f9f2

Signed-off-by: Adam Li <adam2392@gmail.com>

Add better description

e6f5308

Signed-off-by: Adam Li <adam2392@gmail.com>
