Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shouldn't OrderedGraph.subgraph() preserve edge order? #2048

Closed
baharev opened this issue Mar 28, 2016 · 12 comments
Closed

Shouldn't OrderedGraph.subgraph() preserve edge order? #2048

baharev opened this issue Mar 28, 2016 · 12 comments
Assignees
Milestone

Comments

@baharev
Copy link

baharev commented Mar 28, 2016

My expectation was that when running:

from __future__ import print_function
from collections import OrderedDict
from networkx import DiGraph

class OrderedGraph(DiGraph):
    node_dict_factory = OrderedDict
    adjlist_dict_factory = OrderedDict

g = OrderedGraph()
g.add_nodes_from([1, 2, 3])
# Note the relative order: 2 and 1 below!
g.add_edges_from([(2, 3), (1, 3)])
print(list(g.pred[3]))
g_sub = g.subgraph([1, 2, 3])
print(list(g_sub.pred[3]))

to see:

[2, 1]
[2, 1]

that is, the relative order of the predecessors will be the same in the subgraph but this is clearly not the case, as I get

[2, 1]
[1, 2]

In graph g, the relative order of nodes 1 and 2 is 1, 2 in the nodes, but 2, 1 in the predecessors; most likely this is the reason why I see this. I checked with b6b362a.

What is the intended behavior? Or am I using the OrderedGraph wrong?

I am currently implementing reverse-mode automatic differentiation where the argument order of an operation, such as subtraction or division, has to be kept (z=x-y but in the subgraph I getz=y-x which is a disaster for me). Maybe I misunderstood what the OrderedGraph is meant to be used for.

@hagberg
Copy link
Member

hagberg commented Mar 28, 2016

I'd say that is unintended. The DiGraph.subgraph function intentionally uses a shortcut in assigning the successors and predecessors which can permute the order of predecessors (as you have found). This implementation of subgraph is slower but should preserve the order.

    def subgraph(self, nbunch):
        """Return the subgraph induced on nodes in nbunch.

        The induced subgraph of the graph contains the nodes in nbunch
        and the edges between those nodes.

        Parameters
        ----------
        nbunch : list, iterable
            A container of nodes which will be iterated through once.

        Returns
        -------
        G : Graph
            A subgraph of the graph with the same edge attributes.

        Notes
        -----
        The graph, edge or node attributes just point to the original graph.
        So changes to the node or edge structure will not be reflected in
        the original graph while changes to the attributes will.

        To create a subgraph with its own copy of the edge/node attributes use:
        nx.Graph(G.subgraph(nbunch))

        If edge attributes are containers, a deep copy can be obtained using:
        G.subgraph(nbunch).copy()

        For an inplace reduction of a graph to a subgraph you can remove nodes:
        G.remove_nodes_from([ n in G if n not in set(nbunch)])

        Examples
        --------
        >>> G = nx.Graph()   # or DiGraph, MultiGraph, MultiDiGraph, etc
        >>> G.add_path([0,1,2,3])
        >>> H = G.subgraph([0,1,2])
        >>> list(H.edges())
        [(0, 1), (1, 2)]
        """
        bunch = self.nbunch_iter(nbunch)
        # create new graph and copy subgraph into it
        H = self.__class__()
        # copy node and attribute dictionaries
        for n in bunch:
            H.node[n]=self.node[n]
        # namespace shortcuts for speed
        H_succ=H.succ
        H_pred=H.pred
        self_succ=self.succ
        self_pred=self.pred
        # add nodes
        for n in H:
            H_succ[n]=H.adjlist_dict_factory()
            H_pred[n]=H.adjlist_dict_factory()
        # add successors
        for u in H_succ:
            Hnbrs=H_succ[u]
            for v,datadict in self_succ[u].items():
                if v in H_succ:
                    Hnbrs[v]=datadict
        # add predecessors
        for u in H_pred:
            Hnbrs=H_pred[u]
            for v,datadict in self_pred[u].items():
                if v in H_pred:
                    Hnbrs[v]=datadict
        H.graph=self.graph
        return H

@hagberg
Copy link
Member

hagberg commented Mar 28, 2016

Comments on adding this as a fix or a more efficient solution?

@baharev
Copy link
Author

baharev commented Mar 28, 2016

I haven't tested it yet but your suggested solution is fine with me. Should I just add this method to my OrderedGraph class? In that case you might want to add this code snippet to the documentation

https://networkx.github.io/documentation/latest/reference/classes.digraph.html

as well. As it is now in the documentation, I see this issue as a banana skin.

@hagberg
Copy link
Member

hagberg commented Mar 28, 2016

Try adding it and see if it works for you. We'll figure out how to proceed with a fix with the code and/or docs.

@baharev
Copy link
Author

baharev commented Mar 28, 2016

OK. At the moment we are in the middle of releasing another, unrelated open source software here. I will get back to you soon.

@baharev
Copy link
Author

baharev commented Mar 29, 2016

I did some sanity checks and some profiling, and everything seems sane and the performance is also OK. The code is here:

https://github.com/baharev/nx-ordered-graph/blob/master/main.py

I certainly did not go overboard with the testing: I just checked the simplest case. As for the performance, the subgraph() call takes as long as adding the nodes and the edges from g_rnd to g_orig, so I don't see how it could be faster than that.

So everything looks fine to me.

The only catch that I see is that the nodes to the subgraph() call have to be enumerated in the same relative order as they are in the source graph, otherwise things are still messed up silently. It is not a problem for me but I do see it as a pitfall.

One solution would be to provide a keyword argument to subgraph() which, by default, orders the nodes in nbunch in the same relative order as they are in the original graph. The user then can disable it if he/she knows for sure that the order of nbunch is correct and there is no need to order it.

What do you think? Did I miss anything?

@dschult
Copy link
Member

dschult commented Mar 29, 2016

Thanks @baharev for tracking this down, creating a very nice example of the problem, and putting together the code to check @hagberg's fix.

I agree that there doesn't seem to be a better way to create the subgraph while matching the order of both successors and predecessors in the subgraph to the order in the graph.

As far as how best to implement this, it looks like this version of subgraph takes twice as long as the existing version for simple (unordered, no attributes, etc) example graphs (which makes sense since we are essentially traversing the edges twice). So I guess we shouldn't put it into the standard DiGraph class.

We definitely should add it to the new OrderedDiGraph class provided in the classes/ordered.py module. We should probably also rewrite the docs about classes to include pointers to the new OrderedGraph. As it is, the docs suggest building your own--but evidently that is more tricky than we thought. I wonder if there are other pitfalls lurking for ordered graphs in e.g. G.reversed or G.to_undirected, G.copy etc. Probably good to go through those to at least check what the expected result should be.

@baharev
Copy link
Author

baharev commented Mar 29, 2016

@dschult Tough questions. I am working with expression graphs for automatic differentiation. I don't exactly have uses cases for G.reversed or G.to_undirected, G.copy. I asked this questions because I did not know what the G.subgraph was supposed to do either. It seems to me we all agree that, at least in my use case, the right thing seems to be keeping the relative order in subgraph.

As for reversed, I would either keep the original order, or add a keyword argument to indicate wether the node and edge order should be reversed too. As I said: I don't have any use case for this, I am just guessing here.

The copy should make a logically equivalent copy, meaning that both the node and the edge order should be kept. I do not see any other sensible way to implement copy.

The edge order in the to_undirected call seems to be quite tricky. I would need use cases for this to tell. I see multiple possibilities and I do not have any use case, so I will just pass on this one.

@baharev
Copy link
Author

baharev commented Apr 1, 2016

@dschult @hagberg 3 days have passed and now I am just pinging: Is there anything else I can help you with?

@hagberg
Copy link
Member

hagberg commented Apr 1, 2016

If you want to help you could check out the behavior of copy, to_undirected and reversed. Those might be out of scope for your interest but we'll need to figure that out before we decide on how to fix this.

@hagberg hagberg added this to the networkx-2.0 milestone Apr 11, 2016
@dschult
Copy link
Member

dschult commented Jul 10, 2017

I did some checking on whether OrderedDiGraph adjdict order is preserved with methods.

G.copy works OK with OrderedDiGraph
G.reverse does not work- The new order is that of G.edges() not G.succ or G.pred
G.subgraph does not work (as described above). Order of G.succ is Ok but G.pred changes.
G.to_undirected does not work - Returns a Graph instead of an OrderedGraph

I also re-noticed that the edges are reported in a different order than the order they are added. They are reported in the order of the adjacency dict-of-dict. Even if the dicts are ordered, the edges may lead to differences between edge-add-order and edge-report-order.

G.add_nodes_from([3, 2, 1])
G.add_edges_from([(1, 3), (2, 3)])
G.edges()    # -> [(2, 3), (1, 3)]     because the node order dominates the edge reporting order.

Even if we don't use add_nodes_from to specify the order, earlier edges can create a node order that misaligns add_order from report_order. So OrderedGraph and friends don't preserve edge-add-order. They only preserve node-add-order but at least they preserve SOME order for the edges. But it is NOT the add-order. Let's make this clear in the docs.

@dschult
Copy link
Member

dschult commented Jul 24, 2017

The ordering works as you expect for the subgraph filters proposed in #2523

@dschult dschult self-assigned this Jul 28, 2017
dschult added a commit to dschult/networkx that referenced this issue Jul 30, 2017
The bug in networkx#2048 would be fixed by the subgraph view machinery.
dschult added a commit that referenced this issue Jul 31, 2017
* Add reversed/to_directed/to_undirected views

Refactor views. Add tests.

* correct treatment of slots in coreviews

* Add tests for subgraph filters and to show #2048 reported bug

The bug in #2048 would be fixed by the subgraph view machinery.

* Fix to_(un)directed when already that direction

* Rename subgraph module to avoid conflict

* Simplify subgraphviews and tests

* refactor subgraphviews to fit coreviews structure

* Move subgraphviews into graphviews

* Rename views.py to reportviews.py

* Move functions out of views modules into function.py and simplify

* separate contextmanager from views for now.

* Fix induced_subgraph in function.py

* get reshuffle of modules right
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

3 participants