<a href="https://colab.research.google.com/github/cytoscape/cytoscape-automation/blob/master/for-scripters/Python/loading-networks.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Loading Networks
## Yihang Xin and Alex Pico
## 2020-11-14

In Cytoscape, network data can be loaded from a variety of sources, and in several different formats. Where you get your network data depends on your biological question and analysis plan. This tutorial outlines how to load network data from several popular sources and formats.

1. Public databases
    * NDEx
    * PSICQUIC
    * STRING/STITCH
    * WikiPathways
2. Local and remote files
3. Cytoscape apps (Biopax, KEGG and other formats)


# Installation
The following chunk of code installs the `py4cytoscape` module along with several supporting Python packages.

In [None]:
%%capture
!python3 -m pip install python-igraph requests pandas networkx ndex2 wget lxml
!python3 -m pip install py4cytoscape

If you are using a remote notebook environment such as Google Colab, run the cell below. (If you are running the notebook locally, you can skip it.)



In [None]:
import requests
exec(requests.get("https://raw.githubusercontent.com/cytoscape/jupyter-bridge/master/client/p4c_init.py").text)
IPython.display.Javascript(_PY4CYTOSCAPE_BROWSER_CLIENT_JS) # Start browser client
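
As a rough sketch, the decision to run the cell above could be automated by checking whether the notebook is running in Google Colab. The `running_in_colab` helper is hypothetical, and testing for the `google.colab` module is a common heuristic rather than anything this tutorial prescribes; it does not detect other remote environments.

```python
import sys


def running_in_colab():
    """Heuristic (an assumption): Colab preloads the google.colab module."""
    return "google.colab" in sys.modules


if running_in_colab():
    print("Remote (Colab) session: run the Jupyter-Bridge cell above.")
else:
    print("Not Colab: the Jupyter-Bridge cell is only needed for remote notebooks.")
```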

# Prerequisites
In addition to this package (py4cytoscape, latest version 0.0.9), you will need to:
* Download the latest version of Cytoscape from https://cytoscape.org/download.html.
* Complete the installation wizard, following the on-screen instructions.
* Launch Cytoscape.

For this tutorial, you will also need several Cytoscape apps, including the WikiPathways app to access the WikiPathways database from within Cytoscape.

Install the WikiPathways app from http://apps.cytoscape.org/apps/wikipathways

Install the STRING app from https://apps.cytoscape.org/apps/stringapp

Install the filetransfer app from https://apps.cytoscape.org/apps/filetransfer

You can also install an app from within the Python notebook by running `py4cytoscape.install_app('Your App')`.
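
A minimal sketch of working out which of the required apps are still missing. It assumes the list-of-dicts shape (with an `appName` key) that `p4c.apps.get_installed_apps()` returns later in this tutorial. The `missing_apps` helper and the sample data are hypothetical, and the case-insensitive comparison is an assumption about how app names may differ between listings.

```python
def missing_apps(installed, required):
    """Return the names in `required` that do not appear in `installed`.

    `installed` is assumed to be a list of dicts with an 'appName' key.
    Names are compared case-insensitively (an assumption).
    """
    have = {app["appName"].lower() for app in installed}
    return [name for name in required if name.lower() not in have]


# Hypothetical sample in the assumed shape
installed = [
    {"appName": "stringApp", "version": "1.6.0", "status": "Installed"},
    {"appName": "FileTransfer", "version": "1.0", "status": "Installed"},
]
to_install = missing_apps(installed, ["stringApp", "WikiPathways", "filetransfer"])
print(to_install)  # ['WikiPathways']
```

Each remaining name could then be passed to `py4cytoscape.install_app`.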

## Import the required packages


In [None]:
import os
import sys
import requests
import pandas as pd
import py4cytoscape as p4c
import ndex2.client as nc
import wget
from lxml import etree as ET

In [None]:
# Check Version
p4c.cytoscape_version_info()

{'apiVersion': 'v1',
 'cytoscapeVersion': '3.8.2',
 'automationAPIVersion': '1.0.0',
 'py4cytoscapeVersion': '0.0.6'}

# Networks from Public Data
Cytoscape includes a Network Search tool for easy import of public network data. In addition to core apps that are included with your Cytoscape installation (NDEx and PSICQUIC), the resources listed here will depend on which apps you have installed.

In [None]:
p4c.apps.get_installed_apps()

[{'appName': 'PSI-MI Reader',
  'version': '3.4.0',
  'description': 'null',
  'status': 'Installed'},
 {'appName': 'cyREST',
  'version': '3.11.1',
  'description': 'null',
  'status': 'Installed'},
 {'appName': 'stringApp',
  'version': '1.6.0',
  'description': 'null',
  'status': 'Installed'},
 {'appName': 'NetworkAnalyzer',
  'version': '4.4.6',
  'description': 'null',
  'status': 'Installed'},
 {'appName': 'Core Apps',
  'version': '3.7.0',
  'description': 'null',
  'status': 'Installed'},
 {'appName': 'CyCL',
  'version': '3.6.0',
  'description': 'null',
  'status': 'Installed'},
 {'appName': 'CyNDEx-2',
  'version': '3.3.1',
  'description': 'null',
  'status': 'Installed'},
 {'appName': 'FileTransfer',
  'version': '1.0',
  'description': 'null',
  'status': 'Installed'},
 {'appName': 'cyChart',
  'version': '0.3.0',
  'description': 'null',
  'status': 'Installed'},
 {'appName': 'JSON Support',
  'version': '3.7.0',
  'description': 'null',
  'status': 'Installed'},
 {'app

# NDEx
The NDEx Project provides an open-source framework where scientists and organizations can share, store, manipulate, and publish biological network knowledge.
* To search NDEx, run the following code chunks. Here, we use “TP53 AND BARD1” as our search terms.

In [None]:
anon_ndex=nc.Ndex2("http://public.ndexbio.org")
anon_ndex.update_status()

In [None]:
networks = anon_ndex.search_networks(search_string='TP53 AND BARD1')
df_dict = networks["networks"]

In [None]:
ownerUUID_list = []
externalId_list = []
nodeCount_list = []
edgeCount_list = []

In [None]:
for d in df_dict:
    ownerUUID_list.append(d["ownerUUID"])
    externalId_list.append(d["externalId"])
    nodeCount_list.append(d["nodeCount"])
    edgeCount_list.append(d["edgeCount"])

In [None]:
df = pd.DataFrame(list(zip(ownerUUID_list,externalId_list,nodeCount_list,edgeCount_list)), columns =['ownerUUID', 'externalId','nodeCount','edgeCount'])
df.head()

| | ownerUUID | externalId | nodeCount | edgeCount |
|---|---|---|---|---|
| 0 | 301a91c6-a37b-11e4-bda0-000c29202374 | 5a1fcfb9-78c3-11e8-a4bf-0ac135e8bacf | 30 | 101 |
| 1 | 363f49e0-4cf0-11e9-9f06-0ac135e8bacf | 0d4f26c3-f912-11ea-99da-0ac135e8bacf | 255 | 403 |
| 2 | 363f49e0-4cf0-11e9-9f06-0ac135e8bacf | 7f6602f1-f916-11ea-99da-0ac135e8bacf | 213 | 198 |
| 3 | 363f49e0-4cf0-11e9-9f06-0ac135e8bacf | c8a2cdf5-204b-11ea-bb65-0ac135e8bacf | 213 | 198 |
| 4 | 363f49e0-4cf0-11e9-9f06-0ac135e8bacf | fdfc44e6-f911-11ea-99da-0ac135e8bacf | 59 | 51 |
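
The four parallel lists used above can be collapsed into a single pass over the search results. This is a sketch only: `summarize_networks` is a hypothetical helper, and the sample records are made up to mimic the shape of `networks["networks"]`.

```python
def summarize_networks(networks, fields=("ownerUUID", "externalId", "nodeCount", "edgeCount")):
    """Keep only the requested fields from each network summary dict."""
    return [{field: net.get(field) for field in fields} for net in networks]


# Hypothetical records shaped like the entries of networks["networks"]
sample = [
    {"ownerUUID": "owner-1", "externalId": "net-1", "nodeCount": 30, "edgeCount": 101, "name": "demo"},
    {"ownerUUID": "owner-2", "externalId": "net-2", "nodeCount": 255, "edgeCount": 403, "name": "demo 2"},
]
rows = summarize_networks(sample)
print(rows[0]["externalId"])  # net-1
# With pandas available, pd.DataFrame(rows) gives a table with the same four columns.
```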


In [None]:
networkId = df["externalId"][0]

To import the network into Cytoscape, run the following code chunk.


In [None]:
p4c.cy_ndex.import_network_from_ndex(networkId)

51

# STRING/STITCH
STRING is a database of known and predicted protein-protein interactions, and STITCH stores known and predicted interactions between chemicals and proteins. Data types include:

* Genomic Context Predictions
* High-throughput Lab Experiments
* (Conserved) Co-Expression
* Automated Textmining
* Previous Knowledge in Databases

To search STRING with the disease keyword “ovarian cancer”, run the following code chunk. (The resulting network will load automatically.)

In [None]:
string_cmd_list = ['string disease query','disease="ovarian cancer"']
string_cmd = " ".join(string_cmd_list)
p4c.commands.commands_run(string_cmd)

["Loaded network 'String Network - ovarian cancer' with 100 nodes and 2238 edges"]
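
Command strings like the one above follow a `command name=value` pattern. The sketch below assembles them with a hypothetical `build_command` helper that double-quotes any value containing a space. It only builds the string; passing it to `p4c.commands.commands_run` is left to the cells in this tutorial.

```python
def build_command(command, **args):
    """Join a command with name=value arguments.

    Values containing spaces are wrapped in double quotes.
    """
    parts = [command]
    for name, value in args.items():
        text = str(value)
        if " " in text:
            text = f'"{text}"'
        parts.append(f"{name}={text}")
    return " ".join(parts)


cmd = build_command("string disease query", disease="ovarian cancer")
print(cmd)  # string disease query disease="ovarian cancer"
```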

* Networks load with a STRING-specific style, which includes 3D protein structure diagrams.


In [None]:
p4c.network_views.export_image('ovarian_cancer', type='PNG')

{'file': 'C:\\Users\\YihangXin\\CytoscapeConfiguration\\filetransfer\\default_sandbox\\ovarian_cancer.png'}

* STRING networks also include data as node and interaction attributes, which can be used to create a Style.

In [None]:
column_names = p4c.tables.get_table_column_names()
column_names.remove('stringdb::structures')

In [None]:
df = p4c.tables.get_table_columns(columns=column_names)
df.head()

| | SUID | shared name | name | selected | stringdb::canonical name | display name | stringdb::full name | stringdb::database identifier | stringdb::description | @id | ... | tissue::muscle | tissue::nervous system | tissue::pancreas | tissue::saliva | tissue::skin | tissue::spleen | tissue::stomach | tissue::thyroid gland | tissue::urine | stringdb::disease score |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 350 | 350 | 9606.ENSP00000377284 | 9606.ENSP00000377284 | False | P15328 | FOLR1 | | 9606.ENSP00000377284 | Ovarian tumor-associated antigen MOv18; Binds ... | stringdb:9606.ENSP00000377284 | ... | 1.6817 | 2.64464 | 1.81592 | 2.57943 | 1.98873 | 2.32688 | 1.93777 | 2.17475 | 1.24586 | 2.64105 |
| 351 | 351 | 9606.ENSP00000466834 | 9606.ENSP00000466834 | False | O76085 | ENSP00000466834 | | 9606.ENSP00000466834 | DNA repair protein RAD51 homolog 4; Involved i... | stringdb:9606.ENSP00000466834 | ... | 0.572677 | 4.20561 | 1.14334 | 0.964483 | 4.29394 | 0.53719 | 0.902816 | 0.913659 | | 2.73832 |
| 352 | 352 | 9606.ENSP00000331327 | 9606.ENSP00000331327 | False | Q8IYZ5 | WT1 | | 9606.ENSP00000331327 | Wilms tumor protein; Transcription factor that... | stringdb:9606.ENSP00000331327 | ... | 2.42609 | 2.44515 | 1.86355 | 1.55037 | 2.24419 | 2.65519 | 1.63816 | 1.79773 | 2.02184 | 2.60789 |
| 353 | 353 | 9606.ENSP00000011653 | 9606.ENSP00000011653 | False | P01730 | CD4 | | 9606.ENSP00000011653 | T-cell surface antigen T4/Leu-3; Integral memb... | stringdb:9606.ENSP00000011653 | ... | 3.08424 | 4.85061 | 4.56421 | 2.7666 | 3.47661 | 4.45149 | 3.00139 | 2.79453 | 2.56101 | 2.71615 |
| 354 | 354 | 9606.ENSP00000296511 | 9606.ENSP00000296511 | False | P08758 | ANXA5 | | 9606.ENSP00000296511 | Placental anticoagulant protein 4; This protei... | stringdb:9606.ENSP00000296511 | ... | 4.70337 | 4.81533 | 3.75453 | 2.95452 | 4.93929 | 3.72014 | 3.55828 | 3.1331 | 2.91597 | 2.82882 |


* The STRING app includes options to change the interaction confidence level, expand the network, and more.

In [None]:
p4c.networks.get_edge_count() #Before changing interaction confidence level

2238

In [None]:
string_cmd_list = ['string change confidence confidence=0.9 network=CURRENT']
string_cmd = " ".join(string_cmd_list)
p4c.commands.commands_run(string_cmd)

['']

In [None]:
p4c.networks.get_edge_count() #After changing interaction confidence level

443
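
Raising the confidence cutoff to 0.9 reduced the network from 2238 to 443 edges, roughly a fifth of the original. As a conceptual illustration only (this is not the stringApp's implementation), the sketch below filters hypothetical `(source, target, score)` triples by a cutoff. It assumes each edge carries a single combined score.

```python
def filter_by_confidence(edges, cutoff):
    """Keep edges whose score (third element) is at least `cutoff`."""
    return [edge for edge in edges if edge[2] >= cutoff]


# Hypothetical (source, target, score) triples
edges = [("A", "B", 0.95), ("A", "C", 0.42), ("B", "C", 0.90), ("C", "D", 0.71)]
kept = filter_by_confidence(edges, 0.9)
print(len(edges), "->", len(kept))  # 4 -> 2
```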

In [None]:
p4c.network_views.export_image('before_expand', type='PNG')

{'file': 'C:\\Users\\YihangXin\\CytoscapeConfiguration\\filetransfer\\default_sandbox\\before_expand.png'}

In [None]:
string_cmd_list = ['string expand network=CURRENT']
string_cmd = " ".join(string_cmd_list)
p4c.commands.commands_run(string_cmd)

["Loaded network 'String Network - ovarian cancer' with 110 nodes and 613 edges"]

In [None]:
p4c.network_views.export_image('after_expand', type='PNG')

{'file': 'C:\\Users\\YihangXin\\CytoscapeConfiguration\\filetransfer\\default_sandbox\\after_expand.png'}

# WikiPathways

WikiPathways is a collaborative wiki platform with manually curated pathway models. It currently covers over 2,600 pathways in 25 species-specific collections.

* To search WikiPathways, call the `find_pathways_by_text` function defined below with your search terms (here we use ‘statin’ as the term).

In [None]:
def find_pathways_by_text(query, species):
    base_iri = 'http://webservice.wikipathways.org/'
    request_params = {'query':query, 'species':species}
    response = requests.get(base_iri + 'findPathwaysByText', params=request_params)
    return response

In [None]:
response = find_pathways_by_text("statin", "Homo sapiens") # restrict the results to Homo sapiens
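
To see what the request above looks like on the wire, the following sketch builds the query URL with the standard library instead of sending it. `find_pathways_url` is a hypothetical helper; `requests` is expected, though not verified here, to encode the parameters in an equivalent way.

```python
from urllib.parse import urlencode

BASE_IRI = "http://webservice.wikipathways.org/"


def find_pathways_url(query, species):
    """Build the findPathwaysByText URL without sending a request."""
    params = urlencode({"query": query, "species": species})
    return BASE_IRI + "findPathwaysByText?" + params


url = find_pathways_url("statin", "Homo sapiens")
print(url)
# http://webservice.wikipathways.org/findPathwaysByText?query=statin&species=Homo+sapiens
```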

In [None]:
def find_pathway_dataframe(response):
    data = response.text
    dom = ET.fromstring(data)
    pathways = []
    NAMESPACES = {'ns1':'http://www.wso2.org/php/xsd','ns2':'http://www.wikipathways.org/webservice/'}
    for node in dom.findall('ns1:result', NAMESPACES):
        pathway_using_api_terms = {}
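        for child in node:
            pathway_using_api_terms[ET.QName(child).localname] = child.text
            # Note: this append runs once per child element, so each pathway is
            # added several times; the duplicates are dropped in a later cell.
            pathways.append(pathway_using_api_terms)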
    id_list = []
    score_list = []
    url_list = []
    name_list = []
    species_list = []
    revision_list = []
    for p in pathways:
        id_list.append(p["id"])
        score_list.append(p["score"])
        url_list.append(p["url"])
        name_list.append(p["name"])
        species_list.append(p["species"])
        revision_list.append(p["revision"])
    df = pd.DataFrame(list(zip(id_list,score_list,url_list,name_list,species_list,revision_list)), columns =['id', 'score','url','name','species','revision'])
    return df

In [None]:
df = find_pathway_dataframe(response)
df.head(10)

| | id | score | url | name | species | revision |
|---|---|---|---|---|---|---|
| 0 | WP430 | 4.985504 | https://www.wikipathways.org/index.php/Pathway... | Statin Pathway | Homo sapiens | 108375 |
| 1 | WP430 | 4.985504 | https://www.wikipathways.org/index.php/Pathway... | Statin Pathway | Homo sapiens | 108375 |
| 2 | WP430 | 4.985504 | https://www.wikipathways.org/index.php/Pathway... | Statin Pathway | Homo sapiens | 108375 |
| 3 | WP430 | 4.985504 | https://www.wikipathways.org/index.php/Pathway... | Statin Pathway | Homo sapiens | 108375 |
| 4 | WP430 | 4.985504 | https://www.wikipathways.org/index.php/Pathway... | Statin Pathway | Homo sapiens | 108375 |
| 5 | WP430 | 4.985504 | https://www.wikipathways.org/index.php/Pathway... | Statin Pathway | Homo sapiens | 108375 |
| 6 | WP3590 | 3.376647 | https://www.wikipathways.org/index.php/Pathway... | Demo | Homo sapiens | 106743 |
| 7 | WP3590 | 3.376647 | https://www.wikipathways.org/index.php/Pathway... | Demo | Homo sapiens | 106743 |
| 8 | WP3590 | 3.376647 | https://www.wikipathways.org/index.php/Pathway... | Demo | Homo sapiens | 106743 |
| 9 | WP3590 | 3.376647 | https://www.wikipathways.org/index.php/Pathway... | Demo | Homo sapiens | 106743 |


In [None]:
df = df.drop_duplicates()
df = df.reset_index(drop=True)
df

| | id | score | url | name | species | revision |
|---|---|---|---|---|---|---|
| 0 | WP430 | 4.985504 | https://www.wikipathways.org/index.php/Pathway... | Statin Pathway | Homo sapiens | 108375 |
| 1 | WP3590 | 3.376647 | https://www.wikipathways.org/index.php/Pathway... | Demo | Homo sapiens | 106743 |
| 2 | WP3539 | 3.2785838 | https://www.wikipathways.org/index.php/Pathway... | WikiPathways Tutorial: demo_step3 | Homo sapiens | 106739 |
| 3 | WP3418 | 3.26224 | https://www.wikipathways.org/index.php/Pathway... | Demo_complete | Homo sapiens | 106736 |
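
The duplicate rows dropped above arise because `find_pathway_dataframe` appends to `pathways` inside its inner loop, once per child element. As a sketch, the variant below appends once per result, so no de-duplication step is needed. It uses the standard library's `xml.etree.ElementTree` rather than `lxml`, and the XML sample is a hypothetical, trimmed-down response for illustration only.

```python
import xml.etree.ElementTree as ET

NAMESPACES = {"ns1": "http://www.wso2.org/php/xsd"}

# Hypothetical, trimmed-down response for illustration only
SAMPLE_XML = """
<ns1:findPathwaysByTextResponse
    xmlns:ns1="http://www.wso2.org/php/xsd"
    xmlns:ns2="http://www.wikipathways.org/webservice/">
  <ns1:result><ns2:id>WP430</ns2:id><ns2:name>Statin Pathway</ns2:name></ns1:result>
  <ns1:result><ns2:id>WP3590</ns2:id><ns2:name>Demo</ns2:name></ns1:result>
</ns1:findPathwaysByTextResponse>
"""


def parse_results(xml_text):
    """Return one dict per <ns1:result> element, keyed by local tag name."""
    root = ET.fromstring(xml_text.strip())
    pathways = []
    for result in root.findall("ns1:result", NAMESPACES):
        # Strip the '{namespace}' prefix to get the local tag name
        record = {child.tag.split("}", 1)[-1]: child.text for child in result}
        pathways.append(record)  # appended once per result
    return pathways


pathways = parse_results(SAMPLE_XML)
print([p["id"] for p in pathways])  # ['WP430', 'WP3590']
```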


In [None]:
cmd_list = ['wikipathways','import-as-pathway','id="',df["id"][0],'"']
cmd = " ".join(cmd_list)
p4c.commands.commands_get(cmd) 

[]

To open the pathway as a network, run the following chunk.



In [None]:
cmd_list = ['wikipathways','import-as-network','id="',df["id"][0],'"']
cmd = " ".join(cmd_list)
p4c.commands.commands_get(cmd) 

[]
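
Joining the list elements with spaces, as in the cells above, leaves spaces inside the quotes (`id=" WP430 "`). The sketch below builds the same two commands without the stray spaces. `wikipathways_import_command` is a hypothetical helper, and whether Cytoscape trims such spaces on its own is not verified here.

```python
def wikipathways_import_command(pathway_id, as_network=False):
    """Build a WikiPathways import command string for the given pathway ID."""
    mode = "import-as-network" if as_network else "import-as-pathway"
    return f'wikipathways {mode} id="{pathway_id}"'


print(wikipathways_import_command("WP430"))
# wikipathways import-as-pathway id="WP430"
print(wikipathways_import_command("WP430", as_network=True))
# wikipathways import-as-network id="WP430"
```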

# Local and Remote Files
Cytoscape can load locally and remotely stored network data files in a variety of file formats:

- SIF: Simple interaction format
- NNF: Nested network format
- GML and XGMML formats
- CYS: Cytoscape session file
- Delimited text and Excel format

## Loading SIF files
SIF is a simple interaction format consisting of three columns of data: source, interaction and target. To learn more about the SIF format, see the Cytoscape manual.
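
To make the three-column layout concrete, here is a minimal sketch of a SIF reader. Beyond the source/interaction/target description above, it assumes that a line may list several targets, that a single-token line declares an isolated node, and that lines are tab-delimited when a tab is present. `parse_sif` and the sample text are hypothetical.

```python
def parse_sif(text):
    """Return (nodes, edges) parsed from SIF text.

    Each line reads 'source interaction target [target ...]'.
    A line with a single token is treated as an isolated node.
    """
    nodes, edges = set(), []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        tokens = line.split("\t") if "\t" in line else line.split()
        tokens = [token.strip() for token in tokens if token.strip()]
        source = tokens[0]
        nodes.add(source)
        if len(tokens) >= 3:
            interaction = tokens[1]
            for target in tokens[2:]:
                nodes.add(target)
                edges.append((source, interaction, target))
    return nodes, edges


# Hypothetical sample
sample = "A pp B\nB pd C D\nE\n"
nodes, edges = parse_sif(sample)
print(sorted(nodes))  # ['A', 'B', 'C', 'D', 'E']
print(edges)  # [('A', 'pp', 'B'), ('B', 'pd', 'C'), ('B', 'pd', 'D')]
```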

Download galFiltered.sif and load the network with the following cells:

In [None]:
sif_url = "https://cytoscape.github.io/cytoscape-tutorials/protocols/data/galFiltered.sif"
file_name = wget.download(sif_url)
file_name

'galFiltered.sif'
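
If the `wget` package is not available, the download step can be sketched with the standard library alone. `filename_from_url` and `download` are hypothetical helpers; only the file-name derivation runs here, and `download` is defined but not called.

```python
import posixpath
from urllib.parse import urlparse
from urllib.request import urlretrieve


def filename_from_url(url):
    """Return the last path component of a URL."""
    return posixpath.basename(urlparse(url).path)


def download(url):
    """Download `url` into the working directory and return the local file name."""
    file_name = filename_from_url(url)
    urlretrieve(url, file_name)
    return file_name


sif_url = "https://cytoscape.github.io/cytoscape-tutorials/protocols/data/galFiltered.sif"
print(filename_from_url(sif_url))  # galFiltered.sif
```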

In [None]:
p4c.sandbox.sandbox_send_to(file_name)

{'filePath': 'C:\\Users\\YihangXin\\CytoscapeConfiguration\\filetransfer\\default_sandbox\\galFiltered.sif'}

In [None]:
p4c.networks.import_network_from_file(file_name)

{'networks': [6083], 'views': [6786]}

- To see the whole network, run:

In [None]:
p4c.network_views.fit_content()

{}

## Loading XGMML files
XGMML is an XML format and can include node and edge attributes as well as visual style properties. To learn more about the XGMML format, see the Cytoscape manual.
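
Because XGMML is plain XML, its nodes and edges can be inspected with the standard library before loading a file into Cytoscape. The sketch below is illustrative only. The sample document is hypothetical, and the element names and the `http://www.cs.rpi.edu/XGMML` namespace are assumptions about the typical layout of such files.

```python
import xml.etree.ElementTree as ET

XGMML_NS = {"x": "http://www.cs.rpi.edu/XGMML"}

# Hypothetical minimal XGMML document
SAMPLE = """
<graph label="demo" xmlns="http://www.cs.rpi.edu/XGMML">
  <node id="1" label="A"><att name="score" type="real" value="0.5"/></node>
  <node id="2" label="B"/>
  <edge source="1" target="2" label="A (pp) B"/>
</graph>
"""


def summarize_xgmml(xml_text):
    """Return ({node id: label}, [(source id, target id), ...])."""
    root = ET.fromstring(xml_text.strip())
    nodes = {n.get("id"): n.get("label") for n in root.findall("x:node", XGMML_NS)}
    edges = [(e.get("source"), e.get("target")) for e in root.findall("x:edge", XGMML_NS)]
    return nodes, edges


nodes, edges = summarize_xgmml(SAMPLE)
print(nodes)  # {'1': 'A', '2': 'B'}
print(edges)  # [('1', '2')]
```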

Download https://raw.githubusercontent.com/cytoscape/cytoscape-tutorials/gh-pages/protocols/data/BasicDataVizDemo.xgmml and load the network with the following cells:

In [None]:
xgmml_url = "https://raw.githubusercontent.com/cytoscape/cytoscape-tutorials/gh-pages/protocols/data/BasicDataVizDemo.xgmml"
file_name = wget.download(xgmml_url)
file_name

'BasicDataVizDemo.xgmml'

In [None]:
p4c.sandbox.sandbox_send_to(file_name)

{'filePath': 'C:\\Users\\YihangXin\\CytoscapeConfiguration\\filetransfer\\default_sandbox\\BasicDataVizDemo.xgmml'}

In [None]:
p4c.networks.import_network_from_file(file_name)

{'networks': [7480], 'views': [7852]}