Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

To implement graph union and graph difference by vertex names in Python, in the same way it is already implemented in R #269

Closed
cndesantana opened this issue Sep 23, 2015 · 8 comments
Labels
todo Triaged for implementation in some unspecified future version

Comments

@cndesantana
Copy link

The native C core of igraph supports graph union only by IDs and not
by vertex names. The R interface contains an implementation of "graph
union by vertex names", and it is selected automatically if your graph
has vertex names. However, There is no equivalent function in the Python
interface.

It would be great if you could implement graph union and graph difference by vertex names in Python, in the same way it is already implemented in R.

This issue was also discussed here: https://lists.nongnu.org/archive/html/igraph-help/2015-09/msg00083.html

Thanks

@cndesantana cndesantana changed the title To implement graph union and graph difference by vertex names in Python To implement graph union and graph difference by vertex names in Python, in the same way it is already implemented in R Sep 23, 2015
@hakeemo
Copy link

hakeemo commented Nov 29, 2016

I needed a similar functionality. I have written a proof of concept for such a function below, however this isn't extensively tested. I might polish it and make a merge request if I find time. For now it only maintains vertex attributes.
What other functionality would be needed. Also would it be acceptable to use numpy for some operations?

import igraph as ig

def union(g_1, g_2, v_attr_name):
    assert len(set(g_1.vs[v_attr_name])) == len(g_1.vs[v_attr_name]), "Merging attribute must be unique"
    assert len(set(g_2.vs[v_attr_name])) == len(g_2.vs[v_attr_name]), "Merging attribute must be unique"
    
    v_attr_1 = g_1.vs[v_attr_name]
    v_attr_2 = g_2.vs[v_attr_name]

    edge_list_by_attribute_1 = set([ tuple(sorted([v_attr_1[u], v_attr_1[v]])) for u,v in g_1.get_edgelist()])
    edge_list_by_attribute_2 = set([ tuple(sorted([v_attr_2[u], v_attr_2[v]])) for u,v in g_2.get_edgelist()])
    
    edge_list_by_attribute_merged  = edge_list_by_attribute_1.union(edge_list_by_attribute_2)
        
    v_attr_merged = sorted(list(set(g2.vs[v_attr_name]).union(set(g1.vs[v_attr_name]))))
    
    attribute_to_ind = {v_attr_merged:i for i, v_attr_merged in enumerate(v_attr_merged)}
    edge_list_merged = [ (attribute_to_ind[i], attribute_to_ind[j]) for i, j in edge_list_by_attribute_merged]
    
    graph_merged = ig.Graph(edge_list_merged)
    graph_merged.vs[v_attr_name] = v_attr_merged

    # Include attributes that are in both g_1 and g_2. If different attribute values are present in a vertex, 
    # then the one of g_1 is used
    for attr_name_other in set(g_2.vs.attributes()).intersection(set(g_1.vs.attributes())).difference([v_attr_name]):
        attr_other = dict()
        for v in g_2.vs():
            attr_other[ attribute_to_ind[v[v_attr_name]]] =  v[attr_name_other]

        for v in g_1.vs():
            attr_other[ attribute_to_ind[v[v_attr_name]]] =  v[attr_name_other]
        graph_merged.vs[attr_name_other] = [attr_other[i] for i in xrange(graph_merged.vcount())]
        
    return graph_merged

@ntamas
Copy link
Member

ntamas commented Nov 30, 2016

Thanks a lot for your work! I'll be happy to review it if you send a pull request. A few comments for the current version:

  1. Instead of len(g_1.vs[v_attr_name]) in the assertion, you can simply write g_1.vcount().
  2. Assertions get stripped by the Python interpreter when you run it in optimized mode (python -O), so I would use a standard if clause here to ensure that these checks do not get stripped.
  3. v_attr_1 and v_attr_2 could be created before the assertions so you don't need to query the attributes once again for the assertions themselves.
  4. Does it also work for directed graphs or only for undirected ones? A final version that would be merged into python-igraph would need to handle both directed and undirected graphs as well as graphs with multiple edges. (That's one of the reasons why I haven't started implementing this yet - it is more complicated to get it right than it first seems).
  5. We need to take care of all the other vertex and edge attributes in g_1 and g_2; in particular, we must keep the edge attributes from g_1 and g_2, and we must also decide what to do when there is a vertex in g_1 and g_2 with the same value for the v_attr_name vertex attribute but different values.

@hakeemo
Copy link

hakeemo commented Nov 30, 2016

Thanks for the feedback.

  1. I only wrote it with undirected in mind. I didn't work with directed graphs before, but I assume that for undirected graphs, the get_edgelist() function returns the edge tuples with the vertex indices sorted and in the directed case they are not sorted:
>>> g = ig.Graph(directed=True)
>>> g.add_vertices(3)
>>> g.add_edges([(0, 1), (1,0), (1,2)])
>>> g.get_edgelist()
[(0, 1), (1, 0), (1, 2)]
>>>
>>> g = ig.Graph()
>>> g.add_vertices(3)
>>> g.add_edges([(0, 1), (1,0), (1,2)])
>>> g.get_edgelist()
[(0, 1), (0, 1), (1, 2)]

Is it safe to assume this behaviour?
So this would mean the call to sorted in edge_list_by_attribute_1 = set([ tuple(sorted([v_attr_1[u], v_attr_1[v]])) for u,v in g_1.get_edgelist()]) is pointless.

  1. For multigraphs, I suppose that the user has to provide an edge attribute name as well to distinguish between the edges, as there would be no way to determine the intent of the user without this information. Would that be a suitable requirement of the function?

@stale
Copy link

stale bot commented Jan 22, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@cndesantana
Copy link
Author

Hi people. I think the issue is solved, right? See https://github.com/igraph/igraph/milestone/8

Btw: thanks you all for your hard work and efforts!

@vtraag vtraag transferred this issue from igraph/igraph Jan 22, 2020
@ntamas
Copy link
Member

ntamas commented Jan 22, 2020

No, it's not solved yet; there are too many corner cases to consider (see my earlier comment and I haven't had time to implement it properly.

@stale
Copy link

stale bot commented Mar 22, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Mar 22, 2020
@stale stale bot closed this as completed Apr 5, 2020
@vtraag vtraag added the todo Triaged for implementation in some unspecified future version label Apr 6, 2020
@vtraag vtraag reopened this Apr 6, 2020
@stale stale bot removed the stale label Apr 6, 2020
@ntamas
Copy link
Member

ntamas commented Mar 9, 2021

Implemented in #310 so closing.

@ntamas ntamas closed this as completed Mar 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
todo Triaged for implementation in some unspecified future version
Projects
None yet
Development

No branches or pull requests

4 participants