<a href="https://colab.research.google.com/github/bdemchak/cytoscape-jupyter/blob/main/gangsu/basic%20protocol%201.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This is a work-in-progress reproduction of the [Biological Network Exploration with Cytoscape 3](https://pubmed.ncbi.nlm.nih.gov/25199793/) Basic Protocol 1, which loads an S. cerevisiae network, filters out unneeded nodes, lays out the resulting network, and then creates a dendrogram display.

While much of it works, there are compromises, mainly due to Cytoscape features that aren't at full strength yet.

---
# Set up data files, py4cytoscape, and the Cytoscape connection
---
**NOTE: To run this notebook, you must manually start Cytoscape first -- don't proceed until you have started Cytoscape.**

---
## Setup: Import source data files

For now, the network file is pre-positioned in a sandbox on the workstation. 

It really should be in a web location that gets loaded into the sandbox at run-time. If we could find a way to have a URL directly to this file, we could get Cytoscape to load it directly via a CyREST POST v1/networks call (assuming Cytoscape itself can load directly from a URL).

Failing that, there's no reason why this Python script can't load the network into its own file system, then transfer it to the Cytoscape sandbox so Cytoscape can then import it.
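
One way to sketch that fallback, assuming py4cytoscape's `sandbox_send_to()` is available in the installed version: download the file with the standard library, then transfer it to the current sandbox. The helper names `fetch_to_local` and `stage_in_sandbox` are hypothetical, and the `send` parameter exists only so the transfer step can be swapped out when Cytoscape isn't running.

```python
import shutil
import urllib.request
from pathlib import Path


def fetch_to_local(url, dest):
    """Download `url` into the local file `dest` and return its path."""
    dest = Path(dest)
    with urllib.request.urlopen(url) as response, open(dest, 'wb') as out:
        shutil.copyfileobj(response, out)
    return dest


def stage_in_sandbox(url, file_name, send=None):
    """Fetch a file locally, then hand it to Cytoscape's current sandbox."""
    local_path = fetch_to_local(url, file_name)
    if send is None:
        # Imported lazily: this call needs a running Cytoscape to succeed.
        # Assumption: sandbox_send_to(source_file, dest_file) exists in the
        # installed py4cytoscape.
        import py4cytoscape as p4c
        send = p4c.sandbox_send_to
    # Use only the bare file name as the destination inside the sandbox.
    return send(str(local_path), Path(file_name).name)
```

With Cytoscape running, a call such as `stage_in_sandbox(network_url, 'BIOGRID-ORGANISM-Saccharomyces_cerevisiae-3.2.105.mitab')` would come before the network import; `network_url` is a placeholder for wherever the file ends up being hosted.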

The code in this cell actually does load the table data from a URL. This Python script parses the table into a dataframe and then loads the dataframe into Cytoscape. We're going to this trouble because we can't figure out how to get Cytoscape to import a table file using commands_post() ... which means we can't figure out how to create a Command that does this. Once we figure this out, we can pre-position the table data file in the same sandbox as we're using for the network file ... or transfer it to the sandbox early in this Python script.

In [1]:
!rm GDS112_full.soft BIOGRID-ORGANISM-Saccharomyces_cerevisiae-3.2.105.mitab
# !wget -q --no-check-certificate https://www.dropbox.com/s/8wc8o897tsxewt1/BIOGRID-ORGANISM-Saccharomyces_cerevisiae-3.2.105.mitab?dl=0
# !mv BIOGRID-ORGANISM-Saccharomyces_cerevisiae-3.2.105.mitab?dl=0 BIOGRID-ORGANISM-Saccharomyces_cerevisiae-3.2.105.mitab
!wget -q --no-check-certificate https://www.dropbox.com/s/r15azh0xb53smu1/GDS112_full.soft?dl=0
!mv GDS112_full.soft?dl=0 GDS112_full.soft
!rm -r output/
!ls -l 
OUTPUT_DIR = 'output/'

rm: cannot remove 'BIOGRID-ORGANISM-Saccharomyces_cerevisiae-3.2.105.mitab': No such file or directory
rm: cannot remove 'output/': No such file or directory
total 5416
-rw-r--r-- 1 root root 5536880 Jan 19 00:11 GDS112_full.soft
drwxr-xr-x 2 root root    4096 Jan 18 22:35 logs
drwxr-xr-x 1 root root    4096 Jan  6 18:10 sample_data


---
## Setup: Fetch latest py4cytoscape




**Note that you can fetch from a specific GitHub branch by appending `@<branch>` to the "py4cytoscape" at the end of the GitHub URL.**

For example, to get branch 0.0.5: git+https://github.com/cytoscape/py4cytoscape@0.0.5
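
As a small illustration of that URL convention, the helper below builds the pip requirement string for a given branch. The function name `py4cytoscape_requirement` is hypothetical and not part of py4cytoscape; only the URL pattern comes from the note above.

```python
def py4cytoscape_requirement(branch=None):
    """Return the pip requirement for py4cytoscape, optionally pinned to a branch."""
    base = 'git+https://github.com/cytoscape/py4cytoscape'
    # Appending "@<branch>" selects that branch; without it pip uses the default branch.
    return f'{base}@{branch}' if branch else base


print(py4cytoscape_requirement('0.0.5'))
print(py4cytoscape_requirement())
```

In a notebook cell the result can be passed to pip through IPython's variable expansion, for example `!pip install {py4cytoscape_requirement('0.0.8')}`.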

In [2]:
!pip uninstall -y py4cytoscape

#!pip install py4cytoscape
!pip install git+https://github.com/cytoscape/py4cytoscape@0.0.8
#!pip install git+https://github.com/cytoscape/py4cytoscape

Uninstalling py4cytoscape-0.0.7:
  Successfully uninstalled py4cytoscape-0.0.7
Collecting git+https://github.com/cytoscape/py4cytoscape@0.0.8
  Cloning https://github.com/cytoscape/py4cytoscape (to revision 0.0.8) to /tmp/pip-req-build-up1zbab9
  Running command git clone -q https://github.com/cytoscape/py4cytoscape /tmp/pip-req-build-up1zbab9
  Running command git checkout -b 0.0.8 --track origin/0.0.8
  Switched to a new branch '0.0.8'
  Branch '0.0.8' set up to track remote branch '0.0.8' from 'origin'.
Building wheels for collected packages: py4cytoscape
  Building wheel for py4cytoscape (setup.py) ... done
  Created wheel for py4cytoscape: filename=py4cytoscape-0.0.7-cp36-none-any.whl size=143002 sha256=8c51112fc344ac8c3fa37ee6360242d724fd3848d7d21228d06d9ce679e152e3
  Stored in directory: /tmp/pip-ephem-wheel-cache-iqcm_v8r/wheels/50/fb/ad/2ef86b83249494e3b5793a114c7b3640f4c5f926fbfc9c23c8
Successfully built py4cytoscape
Installing collected packages: py4cytoscape
Suc

---
## Setup: Set up Cytoscape connection


In [3]:
import IPython
import py4cytoscape as p4c
print(f'Loading Javascript client ... {p4c.get_browser_client_channel()} on {p4c.get_jupyter_bridge_url()}')
browser_client_js = p4c.get_browser_client_js(True)
IPython.display.Javascript(browser_client_js) # Start browser client


Loading Javascript client ... 56872ce1-520f-422f-b0e5-82e8e8f41fff on https://jupyter-bridge.cytoscape.org


<IPython.core.display.Javascript object>

---
# Sanity test to verify Cytoscape connection


In [4]:
p4c.cytoscape_version_info()


{'apiVersion': 'v1',
 'automationAPIVersion': '1.0.0',
 'cytoscapeVersion': '3.9.0-SNAPSHOT',
 'jupyterBridgeVersion': '0.0.2',
 'py4cytoscapeVersion': '0.0.7'}
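
The version dictionary can also be checked programmatically instead of by eye. The sketch below works on a copy of the output shown above and compares the leading numeric part of each version string against a minimum; `parse_version` and `meets_minimum` are hypothetical helper names. Note that in this run py4cytoscape reports 0.0.7 even though branch 0.0.8 was installed.

```python
import re


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple."""
    match = re.match(r'\d+(?:\.\d+)*', text)
    if match is None:
        raise ValueError(f'No version number found in {text!r}')
    return tuple(int(part) for part in match.group().split('.'))


def meets_minimum(info, key, minimum):
    """True if the version stored under `key` is at least `minimum` (a tuple)."""
    return parse_version(info[key]) >= minimum


# Copied from the cell output above; in the notebook this would be
# the return value of p4c.cytoscape_version_info().
info = {'apiVersion': 'v1',
        'automationAPIVersion': '1.0.0',
        'cytoscapeVersion': '3.9.0-SNAPSHOT',
        'jupyterBridgeVersion': '0.0.2',
        'py4cytoscapeVersion': '0.0.7'}

print(meets_minimum(info, 'cytoscapeVersion', (3, 9)))
print(meets_minimum(info, 'py4cytoscapeVersion', (0, 0, 8)))
```

The suffix `-SNAPSHOT` is simply ignored here, which is an assumption about how pre-release builds should be treated.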

---
# Set pre-filled sandbox as Cytoscape's current sandbox

This Python script uses just the network (.MITAB) in that sandbox ... the data table is loaded by this script into a dataframe and transferred to Cytoscape directly.

**Sandboxing is explained in https://py4cytoscape.readthedocs.io/en/latest/concepts.html#sandboxing**

In [5]:
gangsu_sandbox = p4c.sandbox_set('GangSu_sandbox', copy_samples=False, reinitialize=False)
gangsu_sandbox

'C:\\Users\\CyDeveloper\\CytoscapeConfiguration\\filetransfer\\GangSu_sandbox'

# Load the S. cerevisiae MITAB network into Cytoscape

Note that the import_network_from_file function (incorrectly) throws an exception, so we explicitly ignore the exception.

**Note:** Once CYTOSCAPE-12782 is fixed, this should be converted to directly load this network using a cloud URL.

**Note**: Once CYTOSCAPE-12772 is fixed, we can remove the try-block in this cell.
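
The workaround amounts to "run the import, ignore its error, then verify the outcome independently". A minimal sketch of that pattern, with the action and the counter passed in as callables so it can be tried without Cytoscape; `run_and_verify` is a hypothetical helper, and comparing counts before and after is a slight generalization of the fixed `!= 1` check used in the cell below.

```python
def run_and_verify(action, count, expected_increase=1):
    """Run `action`, tolerate its exception, and verify success via `count`."""
    before = count()
    try:
        action()
    except Exception as error:
        # The call is known to raise even when it succeeds, so only report it.
        print(f'Ignoring error: {error}')
    after = count()
    if after - before != expected_increase:
        raise RuntimeError(
            f'Expected {expected_increase} new network(s), found {after - before}')
    return after


# Stand-ins for the real calls; in the notebook these would be
# p4c.import_network_from_file(...) and p4c.get_network_count.
networks = []

def fake_import():
    networks.append('net')
    raise ValueError('Task returned invalid json.')

print(run_and_verify(fake_import, lambda: len(networks)))
```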

In [6]:
from requests import HTTPError
p4c.close_session(False)

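try:
  p4c.import_network_from_file('BIOGRID-ORGANISM-Saccharomyces_cerevisiae-3.2.105.mitab')
except Exception:
  # CYTOSCAPE-12772: the import raises even when it succeeds, so ignore the error
  # and verify the result with the network count check below.
  pass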
if p4c.get_network_count() != 1:
  raise Exception('Failed to load network')
net_suid = p4c.get_network_suid()
net_suid



In commands_post(): {'status': 500, 'type': 'urn:cytoscape:ci:cyrest-core:v1:handle-json-command:errors:3', 'message': 'Task returned invalid json.', 'link': 'file:/C:/Users/CyDeveloper/CytoscapeConfiguration/3/framework-cytoscape.log'}


10445349

# Merge the gene expression data into the node table

For Cytoscape 3.9.0 and later, call Cytoscape to merge the gene expression data into the node attribute table. 

For pre-Cytoscape 3.9.0, do most of the work in Pandas and then import the dataframe into the node attribute table. Explicitly set the Gene ID as a string even though it's originally parsed as a number. To Cytoscape, the string will be compatible with the 'name' column already in the BIOGRID network. The Gene ID column in the dataframe is matched to the network's name column.
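
To make the key normalization concrete, here is a standard-library sketch (deliberately not using Pandas) of the same idea: skip the preamble, drop rows without a Gene ID, and turn numeric-looking IDs into plain integer strings so they can match a string-typed name column. The sample rows are made up for illustration and `read_keyed_rows` is a hypothetical helper.

```python
import csv

# Made-up, SOFT-like sample: one preamble line, a header row, three data rows.
SAMPLE = (
    "!dataset_table_begin\n"
    "ID_REF\tIDENTIFIER\tGSM1029\tGene ID\n"
    "1\tGENE_A\t0.52\t851262\n"
    "2\tEMPTY\t0.10\t\n"
    "3\tGENE_B\t-0.31\t851261.0\n"
)


def read_keyed_rows(text, skip_rows, key_column):
    """Parse tab-separated text and index rows by a normalized string key."""
    lines = text.splitlines()[skip_rows:]
    rows = {}
    for row in csv.DictReader(lines, delimiter='\t'):
        raw = (row.get(key_column) or '').strip()
        if not raw:
            continue  # no Gene ID, so nothing to match against the network
        # '851261.0' and '851261' both become the string '851261'.
        rows[str(int(float(raw)))] = row
    return rows


rows = read_keyed_rows(SAMPLE, skip_rows=1, key_column='Gene ID')
print(sorted(rows))
```

The float-then-int step mirrors why the notebook converts the column to an integer before casting to string: a key like `'851261.0'` would not match a node named `'851261'`.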



**Note:** ... add a table import function in Cytoscape Automation so the sandbox can be used.

Even better ... fix Table Import so it can download the table directly ... that requires CYTOSCAPE-12782 being addressed.

In [7]:
try:
  p4c.verify_supported_versions(cytoscape='3.9')
  soft_file_path = p4c.sandbox_get_file_info('GDS112_full.soft')['filePath']
  res = p4c.commands_post(f'table import file startLoadRow="83" keyColumnIndex="10" file="{soft_file_path}"')
  print(res)
except Exception:
  # Fallback for pre-3.9 Cytoscape: parse the table with pandas and load it into the node table
  import pandas as pd
  GDS112_full = pd.read_csv('GDS112_full.soft', skiprows=82, sep='\t')
  GDS112_full.dropna(subset=['Gene ID'], inplace=True)
  # Gene ID parses as a number; make it an integer, then a string, so it matches the 'name' column
  GDS112_full['Gene ID'] = pd.to_numeric(GDS112_full['Gene ID'], downcast='integer')
  GDS112_full = GDS112_full.astype({'Gene ID': 'string'})
  print(GDS112_full.dtypes)
  p4c.load_table_data(GDS112_full, data_key_column='Gene ID')

{'mappedTables': [10445320, 10445358]}


# Create a filter to remove nodes having no Gene Symbol

**Note:** To work properly, CYTOSCAPE-12776 must be fixed. The fix decouples the creation of the filter from its execution. Currently, the filter may or may not execute, depending on the size of the network. After CYTOSCAPE-12776 is fixed, the caller or py4cytoscape will need to explicitly apply the filter.
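
It can help to check what the filter's pattern accepts before running it on the full network. The sketch below uses Python's `re` module as a stand-in; Cytoscape evaluates the pattern with its own regex engine, so whole-string matching and the treatment of empty cells are assumptions here. `STRICT` is a hypothetical variant, not something the notebook uses.

```python
import re

PATTERN = '[A-Z0-9]*'  # the pattern passed to create_column_filter in the next cell
STRICT = '[A-Z0-9]+'   # hypothetical variant that requires at least one character


def whole_match(pattern, value):
    """True if `value` is a string that `pattern` matches in its entirety."""
    return value is not None and re.fullmatch(pattern, value) is not None


for symbol in ['ACT1', 'YAL001C', '', 'act1', None]:
    print(repr(symbol), whole_match(PATTERN, symbol), whole_match(STRICT, symbol))
```

Under these assumptions the `*` form accepts an empty string while the `+` form does not, which matters only if empty (rather than missing) Gene symbol values occur in the table.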

In [8]:
p4c.create_column_filter('SymbolOK', 'Gene symbol', '[A-Z0-9]*', 'REGEX')

No edges selected.


{'edges': None,
 'nodes': ['851938',
  '851655',
  '856448',
  '856849',
  '855675',
  '855003',
  '850646',
  '851420',
  '852688',
  '856633',
  '851492',
  '852722',
  '854453',
  '854908',
  '856321',
  '855375',
  '851040',
  '853057',
  '856825',
  '853062',
  '850917',
  '855381',
  '856562',
  '851663',
  '854774',
  '853295',
  '854465',
  '853232',
  '851664',
  '851658',
  '854259',
  '852173',
  '853827',
  '856917',
  '852991',
  '851514',
  '854853',
  '855742',
  '850662',
  '850417',
  '850618',
  '855784',
  '856727',
  '854867',
  '854764',
  '852383',
  '850993',
  '856628',
  '854059',
  '856243',
  '850819',
  '855202',
  '855426',
  '850322',
  '852643',
  '856305',
  '853357',
  '854011',
  '852212',
  '855102',
  '854003',
  '852391',
  '852470',
  '856315',
  '852366',
  '856093',
  '852695',
  '853966',
  '850539',
  '853126',
  '853017',
  '850768',
  '856369',
  '852598',
  '856451',
  '852566',
  '855081',
  '851814',
  '850443',
  '854424',
  '852579',
  '

# Execute the filter to select all named nodes

The filter should have been applied at creation time, but there appears to be a bug in Cytoscape where the "Apply" checkbox is turned off when a network is imported. So, we do this explicitly here instead.
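
Because the filter may or may not have run, it can be worth checking the returned selection before building a subnetwork from it. A minimal sketch, assuming the result has the `{'edges': ..., 'nodes': ...}` shape shown in the outputs of this notebook; both helper names are hypothetical.

```python
def selection_counts(result):
    """Return (node_count, edge_count) for a filter result dictionary."""
    nodes = result.get('nodes') or []  # None means nothing was selected
    edges = result.get('edges') or []
    return len(nodes), len(edges)


def require_selected_nodes(result):
    """Raise if the filter selected no nodes; otherwise return the node count."""
    node_count, _ = selection_counts(result)
    if node_count == 0:
        raise RuntimeError('Filter selected no nodes; was it actually applied?')
    return node_count


# Abbreviated stand-in for the value returned by p4c.apply_filter('SymbolOK').
sample = {'edges': None, 'nodes': ['851938', '851655', '856448']}
print(selection_counts(sample))
```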

In [9]:
p4c.apply_filter('SymbolOK')

No edges selected.


{'edges': None,
 'nodes': ['851938',
  '851655',
  '856448',
  '856849',
  '855675',
  '855003',
  '850646',
  '851420',
  '852688',
  '856633',
  '851492',
  '852722',
  '854453',
  '854908',
  '856321',
  '855375',
  '851040',
  '853057',
  '856825',
  '853062',
  '850917',
  '855381',
  '856562',
  '851663',
  '854774',
  '853295',
  '854465',
  '853232',
  '851664',
  '851658',
  '854259',
  '852173',
  '853827',
  '856917',
  '852991',
  '851514',
  '854853',
  '855742',
  '850662',
  '850417',
  '850618',
  '855784',
  '856727',
  '854867',
  '854764',
  '852383',
  '850993',
  '856628',
  '854059',
  '856243',
  '850819',
  '855202',
  '855426',
  '850322',
  '852643',
  '856305',
  '853357',
  '854011',
  '852212',
  '855102',
  '854003',
  '852391',
  '852470',
  '856315',
  '852366',
  '856093',
  '852695',
  '853966',
  '850539',
  '853126',
  '853017',
  '850768',
  '856369',
  '852598',
  '856451',
  '852566',
  '855081',
  '851814',
  '850443',
  '854424',
  '852579',
  '

# Create a subnetwork containing only named nodes

This could take several minutes.

In [10]:
new_suid = p4c.create_subnetwork()
new_suid

11833247

# Get rid of the original network, which isn't needed anymore

In [11]:
p4c.delete_network(net_suid)
net_suid = new_suid

# Lay out the subnetwork in case it wasn't already laid out

In [12]:
p4c.layout_network('force-directed')


{}

# Install clusterMaker2 if it hasn't already been installed

In [13]:
p4c.install_app('clusterMaker2')

{}


{}

# Create the hierarchical clustering and dendrogram

This returns a large data structure that describes the dendrogram.

It also creates a dendrogram window that's designed for GUI manipulation. It's unclear whether this window can be controlled or used by automation calls.

**Note:** Having the dendrogram is important, and so is having the data that created it. When CSD-420 is addressed, it will be possible to snapshot the dendrogram and perform other operations with it.
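
The command string in the next cell is easier to maintain when the sample columns are kept in a list. The sketch below only assembles the string; the helper name `hierarchical_cluster_command` is hypothetical, and the argument names are copied from the command used in this notebook rather than from clusterMaker2 documentation.

```python
def hierarchical_cluster_command(attributes, show_ui=True):
    """Build the clusterMaker2 hierarchical clustering command string."""
    attribute_list = ','.join(attributes)
    ui_flag = 'true' if show_ui else 'false'
    return (f'cluster hierarchical showUI={ui_flag} clusterAttributes=false '
            f'nodeAttributeList="{attribute_list}"')


samples = ['GSM1029', 'GSM1030', 'GSM1032', 'GSM1033', 'GSM1034']
command = hierarchical_cluster_command(samples)
print(command)
```

The resulting string can then be passed to `p4c.commands_post(command)` exactly as the literal is below.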

In [14]:
p4c.commands_post('cluster hierarchical showUI=true clusterAttributes=false nodeAttributeList="GSM1029,GSM1030,GSM1032,GSM1033,GSM1034"')

[{'nodeOrder': [{'nodeName': '850532', 'suid': 10467847},
   {'nodeName': '851759', 'suid': 10460167},
   {'nodeName': '850377', 'suid': 10885684},
   {'nodeName': '854203', 'suid': 10470670},
   {'nodeName': '854229', 'suid': 10449483},
   {'nodeName': '851317', 'suid': 10488607},
   {'nodeName': '852514', 'suid': 10537096},
   {'nodeName': '854092', 'suid': 10451455},
   {'nodeName': '854105', 'suid': 10570912},
   {'nodeName': '854061', 'suid': 10643353},
   {'nodeName': '851822', 'suid': 10486291},
   {'nodeName': '851616', 'suid': 10542469},
   {'nodeName': '853875', 'suid': 10456402},
   {'nodeName': '850890', 'suid': 10883104},
   {'nodeName': '851906', 'suid': 10454056},
   {'nodeName': '856738', 'suid': 10477639},
   {'nodeName': '853145', 'suid': 10540345},
   {'nodeName': '853968', 'suid': 10550206},
   {'nodeName': '851808', 'suid': 10660774},
   {'nodeName': '854186', 'suid': 10601230},
   {'nodeName': '853507', 'suid': 10557349},
   {'nodeName': '853765', 'suid': 10598203

# Use BiNGO for enrichment analysis

The BiNGO app doesn't have automation entrypoints, so this analysis isn't possible right now. Is there a different app that can do this?

**NOTE:** We need CSD-421 fixed: enrichment analysis is a very important part of this protocol, and right now we don't have any.