Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Layouts fail for directed graphs #655

Closed
bdpedigo opened this issue Feb 7, 2021 · 11 comments · Fixed by #807
Closed

[BUG] Layouts fail for directed graphs #655

bdpedigo opened this issue Feb 7, 2021 · 11 comments · Fixed by #807
Labels
bug Something isn't working discussion graspologic team discussion

Comments

@bdpedigo
Copy link
Collaborator

bdpedigo commented Feb 7, 2021

Expected Behavior

Layouts should work with directed graphs (IMO)
Perhaps not a bug in the sense that I'm not sure we promise this works for directed, but still. If we decided to only support undirected maybe a clearer error is warranted.

Actual Behavior

NetworkXNotImplemented: not implemented for directed type

Example Code

import networkx as nx
from graspologic.layouts import layout_umap

g = nx.erdos_renyi_graph(10, 0.7, directed=True)
layout_umap(g)

Full Traceback

---------------------------------------------------------------------------
NetworkXNotImplemented                    Traceback (most recent call last)
~/JHU_code/maggot/maggot_connectome/scripts/what_is_flow_rank.py in 
      3 
      4 g = nx.erdos_renyi_graph(10, 0.7, directed=True)
----> 5 layout_umap(g)

~/miniconda3/envs/maggot-revamp/lib/python3.8/site-packages/graspologic/layouts/auto.py in layout_umap(graph, min_dist, n_neighbors, max_edges, random_seed)
    173     """
    174 
--> 175     lcc_graph, tensors, labels = _node2vec_for_layout(graph, max_edges, random_seed)
    176     points = umap.UMAP(
    177         min_dist=min_dist, n_neighbors=n_neighbors, random_state=random_seed

~/miniconda3/envs/maggot-revamp/lib/python3.8/site-packages/graspologic/layouts/auto.py in _node2vec_for_layout(graph, max_edges, random_seed)
    214 ) -> Tuple[nx.Graph, np.ndarray, np.ndarray]:
    215     graph = _approximate_prune(graph, max_edges)
--> 216     graph = _largest_connected_component(graph)
    217 
    218     start = time.time()

~/miniconda3/envs/maggot-revamp/lib/python3.8/site-packages/graspologic/layouts/auto.py in _largest_connected_component(graph)
    182 
    183 def _largest_connected_component(graph: nx.Graph) -> nx.Graph:
--> 184     largest_component = max(nx.connected_components(graph), key=len)
    185     return graph.subgraph(largest_component).copy()
    186 

 in connected_components(G)

~/miniconda3/envs/maggot-revamp/lib/python3.8/site-packages/networkx/utils/decorators.py in _not_implemented_for(not_implement_for_func, *args, **kwargs)
     74         if match:
     75             msg = f"not implemented for {' '.join(graph_types)} type"
---> 76             raise nx.NetworkXNotImplemented(msg)
     77         else:
     78             return not_implement_for_func(*args, **kwargs)

NetworkXNotImplemented: not implemented for directed type

Your Environment

  • Python version: 3.8
  • graspologic version: dev

Additional Details

183 def _largest_connected_component(graph: nx.Graph) -> nx.Graph:
--> 184     largest_component = max(nx.connected_components(graph), key=len)

Just from these 2 lines on the traceback, I suspect this just needs to be adapted to check for weakly connected components. Or, use graspologic.utils.is_fully_connected from our own codebase. However in #112 we are talking about replacing the networkx-based check with a faster one from scipy. Not sure if the layouts code will always assume networkx input going forward. AFAIK there is nothing that inherently needs to be undirected for the layouts process, or am I missing something?

@bdpedigo bdpedigo added the bug Something isn't working label Feb 7, 2021
@bdpedigo bdpedigo changed the title [BUG] [BUG] Layouts fail for directed graphs Feb 7, 2021
@bdpedigo bdpedigo added the discussion graspologic team discussion label Feb 7, 2021
@bdpedigo
Copy link
Collaborator Author

bdpedigo commented Feb 7, 2021

should discuss to what extent this should support directed graphs

@daxpryce
Copy link
Contributor

daxpryce commented Feb 8, 2021

@bryantower - we should talk about this. I would imagine we should be able to make layouts work for a directed graph, but maybe I'm missing something. I know we had concerns over node2vec with directed but I'm still not convinced we actually have a problem.

Of the phases of auto layouts, we have:

  • create node2vec embedding of LCC
  • down project into 2d using umap or tsne
  • no overlap
  • leiden (must be undirected)
  • render

This specific ticket is primarily talking about how we don't have support for the right way to get a directed LCC (we're calling the undirected LCC from networkx every time regardless of type). But thinking more broadly - should we be able to conceivably do a directed layout? I would think yes, but I know you were talking about it before and I don't remember what exactly you told me about it.

@bryantower
Copy link

I think the only place where undirected is really required is the Leiden portion. We should definitely be able to support a layout for directed graphs. We use the Leiden partitioning to generate the colors.

In the path specified in this bug, we should find the weakly connected component if the graph is directed and then do the appropriate transformation to make Leiden work.

@bdpedigo
Copy link
Collaborator Author

bdpedigo commented Feb 8, 2021

if leiden is used for auto-layouts coloring I do see how that could be an issue, but dont know that it should be a breaking one - i could imagine a world where embeddings happen on the directed graph but leiden on the undirected version of that graph, cause modularity makes sense in undirected land

@bdpedigo
Copy link
Collaborator Author

bdpedigo commented Feb 8, 2021

I think the only place where undirected is really required is the Leiden portion. We should definitely be able to support a layout for directed graphs. We use the Leiden partitioning to generate the colors.

In the path specified in this bug, we should find the weakly connected component if the graph is directed and then do the appropriate transformation to make Leiden work.

Seems like we're on the same page!

@daxpryce
Copy link
Contributor

daxpryce commented Feb 8, 2021

Exactly, we just need to be able to call the right LCC function based on type. As long as we're confident the rest makes sense we should be able to make this happen easily enough.

@bryantower
Copy link

@dwaynepryce I agree.

@bdpedigo
Copy link
Collaborator Author

bdpedigo commented Feb 8, 2021

https://github.com/microsoft/graspologic/blob/e6b72994f9b71ba2cfaaa2dcf0da33ebd2483ff9/graspologic/utils/utils.py#L417

https://github.com/microsoft/graspologic/blob/e6b72994f9b71ba2cfaaa2dcf0da33ebd2483ff9/graspologic/utils/utils.py#L463

these both are probably relevant - but i was also in the middle of having someone work on #112 - the idea being that scipy's LCC check is much faster than networkx. for an embedding (say a sparse matrix is being passed to ASE) i think making that switch makes more sense. but for your use case it is probably easier to stay in networkx. perhaps we should talk about what those functions should do?

@daxpryce
Copy link
Contributor

daxpryce commented Feb 8, 2021

Would be interesting to see if the time-to-convert an nx graph into a csr matrix and run that lcc check is faster than doing it all in networkx. I'll keep you guys posted after I crunch some numbers :)

@bdpedigo
Copy link
Collaborator Author

bdpedigo commented Feb 8, 2021

thought about that also - i bet for a big graph it might be worth it

@bdpedigo
Copy link
Collaborator Author

bdpedigo commented Feb 8, 2021

FYI @kareef928 has been working on the conversion stuff so we'll keep him in the loop too!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working discussion graspologic team discussion
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants