# Find Subgraphs of NDEx Networks with Neighborhood Queries

You can extract subgraphs of NDEx networks by querying a network with a list of strings. The result is a CX network containing the nodes whose names match one of the strings, plus the nodes that are "nearby" in the network: the "neighborhood" subgraph. A common use for these queries is to find subgraphs of an interactome based on a list of gene names.

This tutorial requires Python 3.6+ and the ndex2 module. 

See [the NDEx2 Client](https://github.com/ndexbio/ndex2-client) for installation instructions.

## Modules Required for this Tutorial

In [1]:
import ndex2.client as nc
import ndex2
import io
import json
from IPython.display import HTML
from time import sleep

## Configure an Anonymous NDEx Client
We create an NDEx client object to access the NDEx public server anonymously, then test the client by getting the current server status.

In [2]:
anon_ndex=nc.Ndex2("http://public.ndexbio.org")
anon_ndex.update_status()
networks = anon_ndex.status.get("networkCount")
users = anon_ndex.status.get("userCount")
groups = anon_ndex.status.get("groupCount")
print("anon client: %s networks, %s users, %s groups" % (networks, users, groups))

anon client: 191553 networks, 2781 users, 190 groups


### Load a Utility to Display Networks in the Notebook
The cytoscape-jupyter-widget can display the query result networks.
See [Cytoscape-jupyter-widget](https://github.com/cytoscape/cytoscape-jupyter-widget/blob/develop/examples/WidgetDemo1.ipynb) for installation instructions.

In [13]:
from cyjupyter import Cytoscape

In [37]:
# A convenience function to summarize the CX returned by queries
def print_cx_summary(cx):
    number_of_nodes=0
    number_of_edges=0
    for aspect in cx:
        if 'nodes' in aspect:
            number_of_nodes = len(aspect['nodes'])
        if 'edges' in aspect:
            number_of_edges = len(aspect['edges'])

    print("the network contains %s nodes and %s edges." % (number_of_nodes, number_of_edges))
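
The summary function above treats CX as a list of aspect fragments, each a dict keyed by the aspect name. The following is a minimal sketch of that structure using a hand-built toy CX list. The data and the `count_cx_elements` helper are hypothetical, and the sketch assumes an aspect may be split across several fragments, so it sums the counts rather than keeping only the last one seen.

```python
# Hypothetical toy CX: a list of aspect fragments, each a dict keyed by aspect name.
toy_cx = [
    {"nodes": [{"@id": 1, "n": "XRN1"}, {"@id": 2, "n": "TP53"}]},
    {"edges": [{"@id": 3, "s": 1, "t": 2, "i": "interacts-with"}]},
    {"nodes": [{"@id": 4, "n": "MDM2"}]},  # a second 'nodes' fragment
]


def count_cx_elements(cx):
    """Return (node_count, edge_count), summed over every fragment in the CX list."""
    node_count = 0
    edge_count = 0
    for aspect in cx:
        node_count += len(aspect.get("nodes", []))
        edge_count += len(aspect.get("edges", []))
    return node_count, edge_count


print("the network contains %s nodes and %s edges." % count_cx_elements(toy_cx))
```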

## Query Types
The NDEx query service supports five types of query:
![Query Types](images/query-types.png)

### Find a Reference Network to Query
Search NDEx for networks from the STRING database.

We will use [STRING - Human Protein Links - High Confidence (Score >= 0.7)](http://public.ndexbio.org/#/network/275bd84e-3d18-11e8-a935-0ac135e8bacf) as our example network.


In [3]:
result_networks=anon_ndex.search_networks(search_string='STRING AND owner:"string"', size=10)
print("%s networks found" % (len(result_networks.get('networks'))))
print("\nNetworks:\n")
for ns in result_networks.get('networks'):
    print("  %s \t %s \t %s" % (ns.get('externalId'), ns.get('name'), ns.get('owner')))

2 networks found

Networks:

  d14db454-3d18-11e8-a935-0ac135e8bacf 	 STRING - Human Protein Links 	 string
  275bd84e-3d18-11e8-a935-0ac135e8bacf 	 STRING - Human Protein Links - High Confidence (Score >= 0.7) 	 string
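
Rather than copying a UUID by hand, the network summaries in the search result can be filtered by name. This is a rough sketch that uses a hand-built stand-in for the returned dict, with values copied from the output above, so it runs without a server connection. The `find_network_id` helper is hypothetical and not part of the ndex2 client.

```python
# Stand-in for the dict returned by search_networks; values copied from the output above.
result_networks = {
    "networks": [
        {"externalId": "d14db454-3d18-11e8-a935-0ac135e8bacf",
         "name": "STRING - Human Protein Links",
         "owner": "string"},
        {"externalId": "275bd84e-3d18-11e8-a935-0ac135e8bacf",
         "name": "STRING - Human Protein Links - High Confidence (Score >= 0.7)",
         "owner": "string"},
    ]
}


def find_network_id(search_result, name_fragment):
    """Return the externalId of the first network whose name contains name_fragment, else None."""
    for summary in search_result.get("networks", []):
        if name_fragment in (summary.get("name") or ""):
            return summary.get("externalId")
    return None


network_id = find_network_id(result_networks, "High Confidence")
print(network_id)
```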


### Neighborhood Query

You can retrieve a ‘neighborhood’ subnetwork of a network as a CX object. The query first identifies the nodes associated with identifiers in the search string, then traverses a specified number of edges outward from those nodes. The **search_depth** parameter sets how many edges are traversed; it defaults to 1 and can be no more than 3.

In [20]:
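# Arguments: network UUID, search string, search depth (1), and 6000, assumed here to be the edge limit.
neighborhood_cx=anon_ndex.get_neighborhood('275bd84e-3d18-11e8-a935-0ac135e8bacf', 'XRN1',1, 6000)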
print_cx_summary(neighborhood_cx)

the network contains 76 nodes and 621 edges.
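
To make the traversal concrete, here is a minimal local sketch of a neighborhood query on a toy edge list. It illustrates the idea only and is not the NDEx server's implementation. The `neighborhood` function and the node labels are hypothetical, and keeping every edge between reached nodes is an assumption (one that is consistent with the result above having far more edges than nodes).

```python
from collections import deque


def neighborhood(edges, seeds, search_depth=1):
    """Return (nodes, edges) of the subgraph within search_depth edges of any seed."""
    # Build an undirected adjacency map from (source, target) pairs.
    adjacency = {}
    for source, target in edges:
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set()).add(source)

    # Breadth-first traversal, recording each node's distance from the nearest seed.
    distance = {seed: 0 for seed in seeds if seed in adjacency}
    queue = deque(distance)
    while queue:
        node = queue.popleft()
        if distance[node] == search_depth:
            continue  # do not expand beyond the requested depth
        for neighbor in adjacency[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)

    # Keep every edge whose two endpoints were both reached.
    kept_edges = [(s, t) for s, t in edges if s in distance and t in distance]
    return set(distance), kept_edges


# Toy graph with generic labels; "S" plays the role of the query node.
toy_edges = [("S", "a"), ("S", "b"), ("a", "b"), ("b", "c"), ("c", "d")]
nodes, kept = neighborhood(toy_edges, ["S"], search_depth=1)
print("the network contains %s nodes and %s edges." % (len(nodes), len(kept)))
```

Raising `search_depth` to 2 in this toy graph also pulls in node `c`, which shows why deeper searches on a dense interactome grow quickly.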


In [21]:
Cytoscape(data=neighborhood_cx, format='cx')

Cytoscape(data=[{'numberVerification': [{'longNumber': 281474976710655}]}, {'metaData': [{'name': 'cyHiddenAtt…

### Interconnect Query
The interconnect query is only available in NDEx2 client version 3.2.0 and above.

In [39]:
interconnect_cx=anon_ndex.get_interconnectquery('275bd84e-3d18-11e8-a935-0ac135e8bacf',
                                                'XRN1 MDM2 CDK2 CDK6 HIF1A RAD51 BRCA1 TP53',1, 6000)

print_cx_summary(interconnect_cx)

the network contains 8 nodes and 13 edges.
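
The search string in the call above is a single space-separated string of gene names. When the names start out as a Python list, a small helper can build that string. This is a sketch: `build_search_string` is a hypothetical helper, and dropping blanks and duplicates is a convenience choice rather than something the query requires.

```python
def build_search_string(gene_names):
    """Join gene names into one space-separated string, dropping blanks and duplicates."""
    seen = set()
    cleaned = []
    for name in gene_names:
        symbol = name.strip()
        if symbol and symbol not in seen:
            seen.add(symbol)
            cleaned.append(symbol)
    return " ".join(cleaned)


# The gene list from the query above, with a duplicate and a blank entry added for illustration.
genes = ["XRN1", "MDM2", "CDK2", "CDK6", "HIF1A", "RAD51", "BRCA1", "TP53", "TP53", " "]
search_string = build_search_string(genes)
print(search_string)
```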


In [35]:
Cytoscape(data=interconnect_cx, format='cx')

Cytoscape(data=[{'numberVerification': [{'longNumber': 281474976710655}]}, {'metaData': [{'name': 'cyHiddenAtt…

## Examples of Using Query Result Networks
We start by creating a NiceCX object from the CX.
NiceCX supports basic operations on the networks and conversion to other formats.

In [28]:
niceCX=ndex2.create_nice_cx_from_raw_cx(interconnect_cx)
niceCX.print_summary()

Name: Direct query result on network - STRING - Human Protein Links - High Confidence (Score >= 0.7)
Nodes: 22
Edges: 42
Node Attributes: 66
Edge Attributes: 630



## Convert the Network to a Pandas DataFrame

In [29]:
niceCX.to_pandas_dataframe()

| | source | interaction | target |
|---|---|---|---|
| 0 | ARFGEF2 | interacts-with | ARF1 |
| 1 | ARFGEF1 | interacts-with | ARF1 |
| 2 | FOXO4 | interacts-with | FOXO3 |
| 3 | FOXO4 | interacts-with | FOXO1 |
| 4 | BRCA2 | interacts-with | BRCA1 |
| 5 | TP53 | interacts-with | FOXO4 |
| 6 | TP53 | interacts-with | BRCA1 |
| 7 | TP53 | interacts-with | FOXO3 |
| 8 | TP53 | interacts-with | BRCA2 |
| 9 | ARF6 | interacts-with | ARFIP1 |
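
Once the edges are in tabular form, simple summaries become easy. As a small sketch, the snippet below counts how many of the ten rows shown above each node appears in (its degree within those rows). It uses only the standard library and re-enters the rows by hand as tuples, so it does not depend on pandas or on the `niceCX` object.

```python
from collections import Counter

# The ten rows shown above, as (source, interaction, target) tuples.
rows = [
    ("ARFGEF2", "interacts-with", "ARF1"),
    ("ARFGEF1", "interacts-with", "ARF1"),
    ("FOXO4", "interacts-with", "FOXO3"),
    ("FOXO4", "interacts-with", "FOXO1"),
    ("BRCA2", "interacts-with", "BRCA1"),
    ("TP53", "interacts-with", "FOXO4"),
    ("TP53", "interacts-with", "BRCA1"),
    ("TP53", "interacts-with", "FOXO3"),
    ("TP53", "interacts-with", "BRCA2"),
    ("ARF6", "interacts-with", "ARFIP1"),
]

# Each edge contributes one to the degree of both of its endpoints.
degree = Counter()
for source, _interaction, target in rows:
    degree[source] += 1
    degree[target] += 1

for name, count in degree.most_common(2):
    print("%s appears in %s of the listed edges" % (name, count))
```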
