<a href="https://colab.research.google.com/github/cytoscape/cytoscape-automation/blob/master/for-scripters/Python/importing-network-from-table.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Importing Network From Table


## Yihang Xin and Alex Pico
## 2020-12-16

In addition to network file formats such as SIF and XGMML, Cytoscape also supports importing networks from tabular data. In this vignette, the data table represents protein-protein interaction data from a mass-spectrometry experiment.



# Installation
The following chunk of code installs the `py4cytoscape` module along with the supporting packages used in this notebook.

In [None]:
%%capture
!python3 -m pip install python-igraph requests pandas networkx
!python3 -m pip install py4cytoscape

If you are using a remote notebook environment such as Google Colab, run the cell below. (This step is not needed when running the notebook locally.)



In [None]:
import requests
exec(requests.get("https://raw.githubusercontent.com/cytoscape/jupyter-bridge/master/client/p4c_init.py").text)
IPython.display.Javascript(_PY4CYTOSCAPE_BROWSER_CLIENT_JS) # Start browser client

# Prerequisites
In addition to this package (py4cytoscape version 0.0.9), you will need:

* The latest version of Cytoscape, which can be downloaded from https://cytoscape.org/download.html. Follow the on-screen installation instructions.

* Complete the installation wizard.

* Launch Cytoscape.

You can also install Cytoscape apps from within a Python notebook by running `py4cytoscape.install_app('Your App')`.

# Import the required packages


In [None]:
import os
import sys
import pandas as pd
import py4cytoscape as p4c

# Setup Cytoscape


In [None]:
p4c.cytoscape_version_info()

{'apiVersion': 'v1',
 'cytoscapeVersion': '3.8.2',
 'automationAPIVersion': '1.0.0',
 'py4cytoscapeVersion': '0.0.6'}

# Background
The data used for this protocol represents interactions between human and HIV proteins reported by Jäger et al. (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3310911/). In this quantitative AP-MS experiment, a relatively small number of bait proteins were used to pull down a larger set of prey proteins.



# Import Network


First we need to read in the example data file:



In [None]:
apms_data = pd.read_csv("https://raw.githubusercontent.com/cytoscape/cytoscape-automation/master/for-scripters/R/notebooks/AP-MS/ap-ms-demodata_simple.csv")

In [None]:
apms_data.head()

Unnamed: 0,Bait,Prey,HEKScore,AP-MS Score
0,GAG,THRAP3,0.807,0.563
1,GAG,SEPSECS,0.814,0.507
2,GAG,IVNS1ABP,0.753,0.506
3,GAG,DDX49,0.824,0.412
4,GAG,PRMT1,0.758,0.397


Now we can create a data frame for the network edges (interactions) using the imported data. We can also add the AP-MS score from the data as an edge attribute:

In [None]:
edge_data = {'source':apms_data["Bait"],
             'target':apms_data["Prey"],
             'AP-MS Score':apms_data["AP-MS Score"]
            }
edges = pd.DataFrame(data=edge_data, columns=['source', 'target','AP-MS Score'])
edges.head()

,source,target,AP-MS Score
0,GAG,THRAP3,0.563
1,GAG,SEPSECS,0.507
2,GAG,IVNS1ABP,0.506
3,GAG,DDX49,0.412
4,GAG,PRMT1,0.397
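
The same edge table can also be built by renaming and selecting columns rather than assembling a dictionary. The sketch below is only an illustration of that alternative: it uses a two-row stand-in for `apms_data` copied from the preview above, and the name `edges_alt` is hypothetical.

```python
import pandas as pd

# Two-row stand-in for apms_data, copied from the head() preview above.
apms_data = pd.DataFrame({
    "Bait": ["GAG", "GAG"],
    "Prey": ["THRAP3", "SEPSECS"],
    "HEKScore": [0.807, 0.814],
    "AP-MS Score": [0.563, 0.507],
})

# Rename Bait/Prey to the source/target column names used for edges,
# then keep only the columns needed for the edge table.
edges_alt = apms_data.rename(columns={"Bait": "source", "Prey": "target"})[
    ["source", "target", "AP-MS Score"]
]
print(edges_alt)
```

Either approach yields a data frame with `source`, `target`, and `AP-MS Score` columns; which one reads better is a matter of taste.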


Finally, we use the edge data frame to create the network. Note that we don’t need to define a data frame for nodes, as all nodes in this case are represented in the edge data frame.



In [None]:
p4c.create_network_from_data_frames(edges=edges, title='apms network', collection="apms collection")

Applying default style...
Applying preferred layout


{'networkSUID': 13412}

The imported network consists of multiple smaller subnetworks, each representing a bait node and its associated prey nodes.
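
A node data frame was not needed here, but one can be derived from the edge table if node attributes are wanted at creation time. The following is a minimal sketch under stated assumptions: the toy edge rows (including the `POL` bait and its score) are invented for illustration, the `role` column is a hypothetical attribute, and the commented-out call assumes the node table is keyed by an `id` column.

```python
import pandas as pd

# Toy edge table using the same column names as the edges data frame.
# The POL row and its score are invented for illustration.
edges = pd.DataFrame({
    "source": ["GAG", "GAG", "POL"],
    "target": ["THRAP3", "SEPSECS", "THRAP3"],
    "AP-MS Score": [0.563, 0.507, 0.41],
})

# Every distinct name appearing as a source or target becomes one node.
baits = set(edges["source"])
node_ids = pd.concat([edges["source"], edges["target"]]).drop_duplicates().tolist()

# Hypothetical "role" attribute: bait if the node ever appears as a source.
nodes = pd.DataFrame({
    "id": node_ids,
    "role": ["bait" if name in baits else "prey" for name in node_ids],
})
print(nodes)

# With Cytoscape running, the node table could then be passed alongside
# the edges (assumption: nodes are keyed by the "id" column):
# p4c.create_network_from_data_frames(nodes=nodes, edges=edges,
#                                     title="apms network",
#                                     collection="apms collection")
```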



# Loading Data


There is one other column of data for the “Prey” proteins that we want to load into this network, the “HEKScore”.

In this data, each Prey node is repeated once for every interaction with a Bait node, so the data can contain several different values of the same attribute (for example, HEKScore) for a single Prey node. During import, the last value imported overwrites any prior values, so visualizations using this attribute show only the last value.

In [None]:
p4c.load_table_data(apms_data[["Prey","HEKScore"]], data_key_column='Prey')

'Success: Data loaded in defaultnode table'
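
If relying on the last-imported value is not desirable, one option is to collapse the repeated Prey rows to a single value before loading. The sketch below is an assumption-laden illustration rather than part of the original protocol: the toy rows are mostly invented, the name `prey_scores` is hypothetical, and taking the maximum is an arbitrary choice (a mean or median may suit the analysis better).

```python
import pandas as pd

# Toy stand-in for apms_data; THRAP3 appears with two different HEKScores.
# Values other than the first row are invented for illustration.
apms_data = pd.DataFrame({
    "Bait": ["GAG", "POL", "GAG"],
    "Prey": ["THRAP3", "THRAP3", "DDX49"],
    "HEKScore": [0.807, 0.650, 0.824],
})

# Count Prey rows that repeat an earlier Prey name.
n_repeats = int(apms_data["Prey"].duplicated().sum())
print(f"Repeated Prey rows: {n_repeats}")

# Keep one row per Prey, here using the maximum HEKScore.
prey_scores = apms_data.groupby("Prey", as_index=False)["HEKScore"].max()
print(prey_scores)

# With Cytoscape running, the collapsed table could then be loaded:
# p4c.load_table_data(prey_scores, data_key_column="Prey")
```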