<a href="https://colab.research.google.com/github/cytoscape/cytoscape-automation/blob/master/for-scripters/Python/loading-networks.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Loading Networks
## Yihang Xin and Alex Pico
## 2025-01-13

In Cytoscape, network data can be loaded from a variety of sources, and in several different formats. Where you get your network data depends on your biological question and analysis plan. This tutorial outlines how to load network data from several popular sources and formats.

1. Public databases
    * NDEx
    * PSICQUIC
    * STRING/STITCH
    * WikiPathways
2. Local and remote files
3. Cytoscape apps (BioPAX, KEGG and other formats)


# Installation
The following chunk of code installs the `py4cytoscape` module and the other Python packages used in this tutorial.

In [1]:
%%capture
!python3 -m pip install python-igraph requests pandas networkx
!python3 -m pip install py4cytoscape
!python3 -m pip install ndex2 wget lxml

If you are using a remote notebook environment such as Google Colab, please execute the cell below. (If you're running on your local notebook, you don't need to do that.)



In [2]:
#_PY4CYTOSCAPE = 'git+https://github.com/cytoscape/py4cytoscape@1.11.0' # optional
import requests
exec(requests.get("https://raw.githubusercontent.com/cytoscape/jupyter-bridge/master/client/p4c_init.py").text)
IPython.display.Javascript(_PY4CYTOSCAPE_BROWSER_CLIENT_JS) # Start browser client





Loading Javascript client ... 7f352594-04a1-4cbb-9f58-8c3510ce7c15 on https://jupyter-bridge.cytoscape.org


<IPython.core.display.Javascript object>

# Prerequisites
## In addition to this package (py4cytoscape, latest version 1.11.0), you will need:
* The latest version of Cytoscape, which can be downloaded from https://cytoscape.org/download.html. Follow the on-screen installation instructions.
* Complete the installation wizard.
* Launch Cytoscape.

For this tutorial, you’ll also need the WikiPathways app to access the WikiPathways database from within Cytoscape, along with the STRING and filetransfer apps:

* Install the WikiPathways app from http://apps.cytoscape.org/apps/wikipathways
* Install the STRING app from https://apps.cytoscape.org/apps/stringapp
* Install the filetransfer app from https://apps.cytoscape.org/apps/filetransfer

You can also install an app from inside a Python notebook by running `py4cytoscape.install_app('Your App')`.

## Import the required packages


In [3]:
import os
import sys
import requests
import pandas as pd
import py4cytoscape as p4c
import ndex2.client as nc
import wget
from lxml import etree as ET

In [4]:
# Check Version
p4c.cytoscape_version_info()

{'apiVersion': 'v1',
 'cytoscapeVersion': '3.9.0',
 'automationAPIVersion': '1.2.0',
 'py4cytoscapeVersion': '0.0.10'}

# Networks from Public Data
Cytoscape includes a Network Search tool for easy import of public network data. In addition to core apps that are included with your Cytoscape installation (NDEx and PSICQUIC), the resources listed here will depend on which apps you have installed.

In [5]:
p4c.apps.get_installed_apps()

[{'appName': 'PSI-MI Reader',
  'version': '3.4.0',
  'description': 'Core App: Provides support for reading PSI-MI files in Cytoscape',
  'status': 'Installed'},
 {'appName': 'Network Merge',
  'version': '3.9.3',
  'description': "Core App: Provides Cytoscape's Network Merge tool",
  'status': 'Installed'},
 {'appName': 'Core Apps',
  'version': '3.8.0',
  'description': None,
  'status': 'Installed'},
 {'appName': 'Biomart Web Service Client',
  'version': '3.4.0',
  'description': 'Core App: Provides support for Biomart web service in Cytoscape',
  'status': 'Installed'},
 {'appName': 'SBML Reader',
  'version': '3.4.0',
  'description': 'Core App: Provides support for reading SBML files in Cytoscape',
  'status': 'Installed'},
 {'appName': 'cyREST',
  'version': '3.12.2',
  'description': None,
  'status': 'Installed'},
 {'appName': 'aMatReader',
  'version': '1.2.0',
  'description': 'App to read adjacency matrix (.mat, .adj) files',
  'status': 'Installed'},
 {'appName': 'CX Sup
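Since the available resources depend on the installed apps, it can help to compare the list above against the apps this tutorial needs. The following is a minimal sketch, not part of py4cytoscape: `missing_apps` is a hypothetical helper, it assumes the list-of-dicts shape with an `appName` key shown in the output above, and the app names in `required` are assumptions that should be checked against the App Store pages.

```python
def missing_apps(installed, required):
    """Return the names in `required` that are absent from `installed`.

    `installed` is a list of dicts with an 'appName' key, the shape shown in
    the output above. Names are compared case-insensitively.
    """
    have = {app["appName"].lower() for app in installed}
    return [name for name in required if name.lower() not in have]


# Hypothetical sample in the same shape as the output above.
installed = [
    {"appName": "PSI-MI Reader", "version": "3.4.0", "status": "Installed"},
    {"appName": "cyREST", "version": "3.12.2", "status": "Installed"},
]
# These app names are assumptions; check the App Store pages for exact names.
required = ["stringApp", "WikiPathways", "FileTransfer"]
todo = missing_apps(installed, required)
print(todo)

# With Cytoscape running, each missing app could then be installed:
# for name in todo:
#     p4c.install_app(name)
```

In a live session, `installed` would come from `p4c.apps.get_installed_apps()` instead of the sample list.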

# NDEx
The NDEx Project provides an open-source framework where scientists and organizations can share, store, manipulate, and publish biological network knowledge.
* To search NDEx, run the following code chunks. Here, we use “TP53 AND BARD1” as our search terms.

In [6]:
anon_ndex=nc.Ndex2("http://public.ndexbio.org")
anon_ndex.update_status()

In [7]:
networks = anon_ndex.search_networks(search_string='TP53 AND BARD1')
df_dict = networks["networks"]

In [8]:
ownerUUID_list = []
externalId_list = []
nodeCount_list = []
edgeCount_list = []

In [9]:
for d in df_dict:
    ownerUUID_list.append(d["ownerUUID"])
    externalId_list.append(d["externalId"])
    nodeCount_list.append(d["nodeCount"])
    edgeCount_list.append(d["edgeCount"])

In [10]:
df = pd.DataFrame(list(zip(ownerUUID_list,externalId_list,nodeCount_list,edgeCount_list)), columns =['ownerUUID', 'externalId','nodeCount','edgeCount'])
df.head()

| | ownerUUID | externalId | nodeCount | edgeCount |
|---|---|---|---|---|
| 0 | 301a91c6-a37b-11e4-bda0-000c29202374 | 5a1fcfb9-78c3-11e8-a4bf-0ac135e8bacf | 30 | 101 |
| 1 | 363f49e0-4cf0-11e9-9f06-0ac135e8bacf | 8089594b-8b63-11eb-9e72-0ac135e8bacf | 255 | 403 |
| 2 | 363f49e0-4cf0-11e9-9f06-0ac135e8bacf | 1cb1c04b-8b69-11eb-9e72-0ac135e8bacf | 213 | 198 |
| 3 | 363f49e0-4cf0-11e9-9f06-0ac135e8bacf | 6dac182e-8b63-11eb-9e72-0ac135e8bacf | 59 | 51 |
| 4 | 363f49e0-4cf0-11e9-9f06-0ac135e8bacf | dfe8d348-8b64-11eb-9e72-0ac135e8bacf | 240 | 64 |
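The four parallel lists used to build the table can be collapsed into a single pass over the search results. The sketch below is one possible alternative, not the authors' method: `summarize_networks` is a hypothetical helper, and the sample records are made up to mimic the shape found under the `"networks"` key.

```python
def summarize_networks(results, fields):
    """Build one row (a dict) per search result, keeping only `fields`."""
    return [{field: result.get(field) for field in fields} for result in results]


FIELDS = ["ownerUUID", "externalId", "nodeCount", "edgeCount"]

# Hypothetical records in the shape returned under the "networks" key.
sample = [
    {"ownerUUID": "owner-1", "externalId": "net-1", "nodeCount": 30,
     "edgeCount": 101, "name": "example A"},
    {"ownerUUID": "owner-2", "externalId": "net-2", "nodeCount": 255,
     "edgeCount": 403, "name": "example B"},
]
rows = summarize_networks(sample, FIELDS)
print(rows[0])
```

Passing `rows` to `pd.DataFrame` should then give a table with the same four columns as the list-based construction.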


In [11]:
networkId = df["externalId"][0]

To import the network into Cytoscape, run the following code chunk.


In [12]:
p4c.cy_ndex.import_network_from_ndex(networkId)

25729

# STRING/STITCH
STRING is a database of known and predicted protein-protein interactions, and STITCH stores known and predicted interactions between chemicals and proteins. Data types include:

* Genomic Context Predictions
* High-throughput Lab Experiments
* (Conserved) Co-Expression
* Automated Textmining
* Previous Knowledge in Databases

To search STRING with the disease keyword “ovarian cancer”, run the following code chunk. (The resulting network will load automatically.)

In [13]:
string_cmd_list = ['string disease query','disease="ovarian cancer"']
string_cmd = " ".join(string_cmd_list)
p4c.commands.commands_run(string_cmd)

["Loaded network 'STRING network - ovarian cancer' with 100 nodes and 2410 edges"]
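The command string passed to `commands_run` is a command name followed by `name=value` arguments, with values that contain spaces wrapped in double quotes. The helper below is a small sketch for assembling such strings; `build_command` is a hypothetical name and not part of py4cytoscape.

```python
def build_command(command, **args):
    """Join a Cytoscape command with name=value arguments.

    Values containing whitespace are wrapped in double quotes, as in
    disease="ovarian cancer".
    """
    parts = [command]
    for name, value in args.items():
        text = str(value)
        if any(ch.isspace() for ch in text):
            text = f'"{text}"'
        parts.append(f"{name}={text}")
    return " ".join(parts)


disease_cmd = build_command("string disease query", disease="ovarian cancer")
confidence_cmd = build_command("string change confidence",
                               confidence=0.9, network="CURRENT")
print(disease_cmd)
print(confidence_cmd)
```

Either string could then be passed to `p4c.commands.commands_run` in place of the hand-joined list.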

* Networks load with a STRING-specific style, which includes 3D protein structure diagrams.


In [14]:
p4c.network_views.export_image('ovarian_cancer', type='PNG')

{'file': '/Users/yxin/CytoscapeConfiguration/filetransfer/default_sandbox/ovarian_cancer.png'}

* STRING networks also include data as node/interaction attributes that can be used to create a Style.

In [15]:
column_names = p4c.tables.get_table_column_names()
# Exclude the 'stringdb::structures' column from the table request below
column_names.remove('stringdb::structures')

In [16]:
df = p4c.tables.get_table_columns(columns=column_names)
df.head()

| | SUID | shared name | name | selected | stringdb::canonical name | display name | stringdb::full name | stringdb::database identifier | stringdb::description | @id | ... | tissue::muscle | tissue::nervous system | tissue::pancreas | tissue::saliva | tissue::skin | tissue::spleen | tissue::stomach | tissue::thyroid gland | tissue::urine | stringdb::disease score |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 26880 | 26880 | 9606.ENSP00000434045 | 9606.ENSP00000434045 | False | Q15743 | GPR68 | | 9606.ENSP00000434045 | Ovarian cancer G-protein coupled receptor 1; P... | stringdb:9606.ENSP00000434045 | ... | 1.45426 | 4.35801 | 0.876775 | | 1.10445 | 1.27206 | 1.30433 | 0.581355 | 0.467361 | 2.58124 |
| 27138 | 27138 | 9606.ENSP00000263334 | 9606.ENSP00000263334 | False | Q06710 | PAX8 | | 9606.ENSP00000263334 | Paired box protein Pax-8; Transcription factor... | stringdb:9606.ENSP00000263334 | ... | 1.82151 | 2.55518 | 1.69049 | 1.7475 | 1.73851 | 1.36759 | 1.9104 | 4.66762 | 1.99565 | 3.18596 |
| 26883 | 26883 | 9606.ENSP00000252137 | 9606.ENSP00000252137 | False | Q96DF8 | DGCR14 | | 9606.ENSP00000252137 | DiGeorge syndrome critical region gene 14; May... | stringdb:9606.ENSP00000252137 | ... | 2.19006 | 3.79092 | 1.6211 | 1.47221 | 1.92894 | 2.00767 | 1.62299 | 1.34571 | 0.930054 | 3.03386 |
| 27141 | 27141 | 9606.hsa-miR-21-5p | 9606.hsa-miR-21-5p | False | | hsa-miR-21-5p | | 9606.hsa-miR-21-5p | | stringdb:9606.hsa-miR-21-5p | ... | 2.53044 | 2.41097 | 2.21449 | 2.20916 | 2.07721 | 2.07254 | 2.02119 | 1.83569 | 2.30949 | 2.62748 |
| 26886 | 26886 | 9606.ENSP00000265171 | 9606.ENSP00000265171 | False | P01133 | EGF | | 9606.ENSP00000265171 | Pro-epidermal growth factor; EGF stimulates th... | stringdb:9606.ENSP00000265171 | ... | 3.36985 | 3.5284 | 3.29138 | 2.77513 | 3.01317 | 2.45774 | 2.66789 | 2.47213 | 2.15156 | 2.90074 |


* The STRING app includes options to change the interaction confidence level, expand the network, and so on.

In [17]:
p4c.networks.get_edge_count() #Before changing interaction confidence level

2410

In [18]:
# Keep only interactions with a confidence score of at least 0.9
string_cmd = 'string change confidence confidence=0.9 network=CURRENT'
p4c.commands.commands_run(string_cmd)

['']

In [19]:
p4c.networks.get_edge_count() #After changing interaction confidence level

435
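The effect of the stricter cutoff can be put in numbers using the two edge counts reported in this run. This is plain arithmetic on those counts, shown only as a quick check; the variable names are illustrative.

```python
edges_before = 2410  # edge count reported before the change
edges_after = 435    # edge count reported after confidence=0.9

removed = edges_before - edges_after
retained_fraction = edges_after / edges_before
print(f"{removed} edges removed; {retained_fraction:.1%} retained")
```

For these counts, roughly 18% of the original edges pass the 0.9 threshold.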

In [20]:
p4c.network_views.export_image('before_expand', type='PNG')

{'file': '/Users/yxin/CytoscapeConfiguration/filetransfer/default_sandbox/before_expand.png'}

In [21]:
# Expand the current STRING network with additional interactors
string_cmd = 'string expand network=CURRENT'
p4c.commands.commands_run(string_cmd)

["Loaded network 'STRING network - ovarian cancer' with 110 nodes and 595 edges"]

In [22]:
p4c.network_views.export_image('after_expand', type='PNG')

{'file': '/Users/yxin/CytoscapeConfiguration/filetransfer/default_sandbox/after_expand.png'}

# WikiPathways

WikiPathways is a collaborative wiki platform with manually curated pathway models. It currently covers over 2,600 pathways in 25 species-specific collections.

* To search WikiPathways, call the `find_pathways_by_text` function defined below with your search terms (here we use ‘statin’ as the term).

In [23]:
def find_pathways_by_text(query, species):
    base_iri = 'http://webservice.wikipathways.org/'
    request_params = {'query':query, 'species':species}
    response = requests.get(base_iri + 'findPathwaysByText', params=request_params)
    return response

In [24]:
response = find_pathways_by_text("statin", "Homo sapiens") # restrict the results to Homo sapiens
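For reference, the request issued by `find_pathways_by_text` corresponds to a URL with the search term and species in the query string. The sketch below rebuilds such a URL with the standard library's `urlencode`; it assumes `requests` encodes the parameters in an equivalent way and does not contact the web service.

```python
from urllib.parse import urlencode

base_iri = "http://webservice.wikipathways.org/"
params = {"query": "statin", "species": "Homo sapiens"}

# urlencode escapes the space in the species name as '+'
url = base_iri + "findPathwaysByText?" + urlencode(params)
print(url)
```

When running the real request, calling `response.raise_for_status()` before parsing is a simple way to surface HTTP errors early.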

In [25]:
def find_pathway_dataframe(response):
    data = response.text
    dom = ET.fromstring(data)
    pathways = []
    NAMESPACES = {'ns1':'http://www.wso2.org/php/xsd','ns2':'http://www.wikipathways.org/webservice/'}
    for node in dom.findall('ns1:result', NAMESPACES):
        pathway_using_api_terms = {}
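        for child in node:
            pathway_using_api_terms[ET.QName(child).localname] = child.text
            # Appended once per child field, so each pathway appears several times;
            # the duplicate rows are dropped after the DataFrame is built.
            pathways.append(pathway_using_api_terms)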
    id_list = []
    score_list = []
    url_list = []
    name_list = []
    species_list = []
    revision_list = []
    for p in pathways:
        id_list.append(p["id"])
        score_list.append(p["score"])
        url_list.append(p["url"])
        name_list.append(p["name"])
        species_list.append(p["species"])
        revision_list.append(p["revision"])
    df = pd.DataFrame(list(zip(id_list,score_list,url_list,name_list,species_list,revision_list)), columns =['id', 'score','url','name','species','revision'])
    return df

In [26]:
df = find_pathway_dataframe(response)
df.head(10)

| | id | score | url | name | species | revision |
|---|---|---|---|---|---|---|
| 0 | WP430 | 4.65254 | https://www.wikipathways.org/index.php/Pathway... | Statin inhibition of cholesterol production | Homo sapiens | 119069 |
| 1 | WP430 | 4.65254 | https://www.wikipathways.org/index.php/Pathway... | Statin inhibition of cholesterol production | Homo sapiens | 119069 |
| 2 | WP430 | 4.65254 | https://www.wikipathways.org/index.php/Pathway... | Statin inhibition of cholesterol production | Homo sapiens | 119069 |
| 3 | WP430 | 4.65254 | https://www.wikipathways.org/index.php/Pathway... | Statin inhibition of cholesterol production | Homo sapiens | 119069 |
| 4 | WP430 | 4.65254 | https://www.wikipathways.org/index.php/Pathway... | Statin inhibition of cholesterol production | Homo sapiens | 119069 |
| 5 | WP430 | 4.65254 | https://www.wikipathways.org/index.php/Pathway... | Statin inhibition of cholesterol production | Homo sapiens | 119069 |
| 6 | WP3590 | 3.3518085 | https://www.wikipathways.org/index.php/Pathway... | Demo | Homo sapiens | 106743 |
| 7 | WP3590 | 3.3518085 | https://www.wikipathways.org/index.php/Pathway... | Demo | Homo sapiens | 106743 |
| 8 | WP3590 | 3.3518085 | https://www.wikipathways.org/index.php/Pathway... | Demo | Homo sapiens | 106743 |
| 9 | WP3590 | 3.3518085 | https://www.wikipathways.org/index.php/Pathway... | Demo | Homo sapiens | 106743 |


In [27]:
df = df.drop_duplicates()
df = df.reset_index(drop=True)
df

| | id | score | url | name | species | revision |
|---|---|---|---|---|---|---|
| 0 | WP430 | 4.65254 | https://www.wikipathways.org/index.php/Pathway... | Statin inhibition of cholesterol production | Homo sapiens | 119069 |
| 1 | WP3590 | 3.3518085 | https://www.wikipathways.org/index.php/Pathway... | Demo | Homo sapiens | 106743 |
| 2 | WP3539 | 3.2534084 | https://www.wikipathways.org/index.php/Pathway... | WikiPathways Tutorial: demo_step3 | Homo sapiens | 106739 |
| 3 | WP3418 | 3.2370086 | https://www.wikipathways.org/index.php/Pathway... | Demo_complete | Homo sapiens | 106736 |
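The repeated rows in the first table arise because `find_pathway_dataframe` appends the record inside the loop over child elements, once per field. A variant that appends once per result would not need `drop_duplicates`. The sketch below illustrates this with the standard library's `xml.etree.ElementTree` rather than `lxml`; `parse_pathways` is a hypothetical name, and the sample XML is a made-up, abbreviated response in the shape the original parsing code assumes.

```python
import xml.etree.ElementTree as ET

NAMESPACES = {"ns1": "http://www.wso2.org/php/xsd"}


def parse_pathways(xml_text):
    """Return one dict per <ns1:result> element, keyed by local tag name."""
    root = ET.fromstring(xml_text)
    pathways = []
    for result in root.findall("ns1:result", NAMESPACES):
        record = {}
        for child in result:
            # Strip the "{namespace}" prefix that ElementTree puts on tags.
            record[child.tag.split("}", 1)[-1]] = child.text
        pathways.append(record)  # once per result, outside the inner loop
    return pathways


# Hypothetical, abbreviated response in the assumed shape.
sample = """<ns1:findPathwaysByTextResponse xmlns:ns1="http://www.wso2.org/php/xsd">
  <ns1:result xmlns:ns2="http://www.wikipathways.org/webservice/">
    <ns2:id>WP430</ns2:id>
    <ns2:name>Statin inhibition of cholesterol production</ns2:name>
    <ns2:species>Homo sapiens</ns2:species>
  </ns1:result>
  <ns1:result xmlns:ns2="http://www.wikipathways.org/webservice/">
    <ns2:id>WP3590</ns2:id>
    <ns2:name>Demo</ns2:name>
    <ns2:species>Homo sapiens</ns2:species>
  </ns1:result>
</ns1:findPathwaysByTextResponse>"""

records = parse_pathways(sample)
print([r["id"] for r in records])
```

Passing `records` to `pd.DataFrame` should give one row per pathway directly.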


In [28]:
# Import the top-ranked hit as a pathway; the f-string avoids stray spaces inside the quoted id
cmd = f'wikipathways import-as-pathway id="{df["id"][0]}"'
p4c.commands.commands_get(cmd)

[]

To open the pathway as a network, run the following chunk.



In [29]:
# Import the same hit as a network; the f-string avoids stray spaces inside the quoted id
cmd = f'wikipathways import-as-network id="{df["id"][0]}"'
p4c.commands.commands_get(cmd)

[]

# Local and Remote Files
Cytoscape can load locally and remotely stored network data files in a variety of file formats:

- SIF: Simple interaction format
- NNF: Nested network format
- GML and XGMML formats
- CYS: Cytoscape session file
- Delimited text and Excel format

## Loading SIF files
SIF is a simple interaction format consisting of three columns of data: source, interaction and target. To learn more about the SIF format, see the Cytoscape manual.
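To make the three-column layout concrete, the sketch below parses a few SIF lines into (source, interaction, target) tuples. It is a simplified reader for illustration only, not Cytoscape's importer: `parse_sif` is a hypothetical helper, the node names are made up, and the delimiter is chosen per line (tabs if present, otherwise whitespace).

```python
def parse_sif(text):
    """Return (source, interaction, target) tuples from SIF text.

    Fields are split on tabs if any are present, otherwise on whitespace.
    A line may list several targets; a line with fewer than three fields
    (for example an isolated node) yields no edge.
    """
    edges = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        fields = line.split("\t") if "\t" in line else line.split()
        if len(fields) < 3:
            continue
        source, interaction, *targets = fields
        edges.extend((source, interaction, target) for target in targets)
    return edges


# Hypothetical sample: one simple edge, one line with two targets, one isolated node.
sample = "nodeA pp nodeB\nnodeA pd nodeC nodeD\nnodeE\n"
edges = parse_sif(sample)
print(edges)
```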

Download galFiltered.sif and load the network as follows:

In [30]:
sif_url = "https://cytoscape.github.io/cytoscape-tutorials/protocols/data/galFiltered.sif"
file_name = wget.download(sif_url)
file_name

'galFiltered.sif'

In [31]:
p4c.sandbox.sandbox_send_to(file_name)

{'filePath': '/Users/yxin/CytoscapeConfiguration/filetransfer/default_sandbox/galFiltered.sif'}

In [32]:
p4c.networks.import_network_from_file(file_name)

{'networks': [48334], 'views': [50464]}

- To see the whole network, run

In [33]:
p4c.network_views.fit_content()

{}

## Loading XGMML files
XGMML is an XML format and can include node and edge attributes as well as visual style properties. To learn more about the XGMML format, see the Cytoscape manual.
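As a rough illustration of the structure, the sketch below parses a tiny hand-written XGMML document with the standard library and lists its nodes, edges, and one node attribute. The sample network is hypothetical and far smaller than the demo file; it omits visual style properties.

```python
import xml.etree.ElementTree as ET

XGMML_NS = {"x": "http://www.cs.rpi.edu/XGMML"}

# Hypothetical two-node network with one node attribute and one edge.
sample = """<graph label="demo" xmlns="http://www.cs.rpi.edu/XGMML">
  <node id="1" label="A">
    <att name="score" type="real" value="0.5"/>
  </node>
  <node id="2" label="B"/>
  <edge source="1" target="2" label="A (pp) B"/>
</graph>"""

root = ET.fromstring(sample)
# Map node id -> label
nodes = {n.get("id"): n.get("label") for n in root.findall("x:node", XGMML_NS)}
# Collect (source id, target id) pairs
edges = [(e.get("source"), e.get("target")) for e in root.findall("x:edge", XGMML_NS)]
# Attributes attached to nodes, as name -> value strings
attrs = {a.get("name"): a.get("value") for a in root.findall("x:node/x:att", XGMML_NS)}
print(nodes, edges, attrs)
```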

Download https://raw.githubusercontent.com/cytoscape/cytoscape-tutorials/gh-pages/protocols/data/BasicDataVizDemo.xgmml and load the network as follows:

In [34]:
xgmml_url = "https://raw.githubusercontent.com/cytoscape/cytoscape-tutorials/gh-pages/protocols/data/BasicDataVizDemo.xgmml"
file_name = wget.download(xgmml_url)
file_name

'BasicDataVizDemo.xgmml'

In [35]:
p4c.sandbox.sandbox_send_to(file_name)

{'filePath': '/Users/yxin/CytoscapeConfiguration/filetransfer/default_sandbox/BasicDataVizDemo.xgmml'}

In [36]:
p4c.networks.import_network_from_file(file_name)

{'networks': [52212], 'views': [54088]}