---
# Set up data files, py4cytoscape, and the Cytoscape connection
---
**NOTE: Start Cytoscape manually before running this notebook. Do not proceed until Cytoscape is running.**

---
## Setup: Import source data files

In [6]:
!rm GDS112_full.soft BIOGRID-ORGANISM-Saccharomyces_cerevisiae_S288c-3.4.129.mitab
!wget -q --no-check-certificate -O BIOGRID-ORGANISM-Saccharomyces_cerevisiae_S288c-3.4.129.mitab 'https://www.dropbox.com/s/9g2nenijehidy0g/BIOGRID-ORGANISM-Saccharomyces_cerevisiae_S288c-3.4.129.mitab?dl=0'
!wget -q --no-check-certificate -O GDS112_full.soft 'https://www.dropbox.com/s/r15azh0xb53smu1/GDS112_full.soft?dl=0'
!rm -r output/
!ls -l 
OUTPUT_DIR = 'output/'

rm: cannot remove 'GDS112_full.soft': No such file or directory
total 229480
-rw-r--r-- 1 root root 229445088 Nov 10 23:27 BIOGRID-ORGANISM-Saccharomyces_cerevisiae_S288c-3.4.129.mitab
-rw-r--r-- 1 root root   5536880 Nov 10 23:27 GDS112_full.soft
drwxr-xr-x 1 root root      4096 Nov  6 17:30 sample_data
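
The `rm: cannot remove` message in the output above is harmless: the file did not exist yet when the cell ran. As a minimal sketch, the same cleanup could be done in Python so that missing files are silently skipped. The helper name `clean_workspace` and the throwaway directory are hypothetical and used only for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def clean_workspace(files, output_dir, base="."):
    """Delete the given files and the output directory, skipping any that are missing."""
    base = Path(base)
    for name in files:
        try:
            (base / name).unlink()
        except FileNotFoundError:
            pass  # nothing to remove, e.g. on a first run
    # ignore_errors=True also covers the case where the directory does not exist
    shutil.rmtree(base / output_dir, ignore_errors=True)


# Demonstration in a throwaway directory: only one of the two files exists.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "GDS112_full.soft").write_text("placeholder")
    clean_workspace(["GDS112_full.soft", "missing.mitab"], "output/", base=tmp)
    remaining = sorted(p.name for p in Path(tmp).iterdir())

print(remaining)
```

In this notebook the call would take the two downloaded file names and `'output/'`, with `base` left at its default.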


---
## Setup: Fetch latest py4cytoscape




**Note that you can fetch from a specific GitHub branch by adding "@<branch>" to the "py4cytoscape" at the end of the GitHub URL.**

For example, to get branch 0.0.5: git+https://github.com/cytoscape/py4cytoscape@0.0.5

In [8]:
!pip uninstall -y py4cytoscape

!pip install py4cytoscape
#!pip install git+https://github.com/cytoscape/py4cytoscape@0.0.5
#!pip install git+https://github.com/cytoscape/py4cytoscape

Collecting py4cytoscape
  Downloading https://files.pythonhosted.org/packages/36/89/e3bf0ba869f99c5695a53fd540f259738aa8bd806c8741ab628fb64471ba/py4cytoscape-0.0.6-py3-none-any.whl (139kB)
     |████████████████████████████████| 143kB 2.8MB/s 
Collecting python-igraph
  Downloading https://files.pythonhosted.org/packages/20/6e/3ac2fc339051f652d4a01570d133e4d15321aaec929ffb5f49a67852f8d9/python_igraph-0.8.3-cp36-cp36m-manylinux2010_x86_64.whl (3.2MB)
     |████████████████████████████████| 3.2MB 8.8MB/s 
Collecting texttable>=1.6.2
  Downloading https://files.pythonhosted.org/packages/06/f5/46201c428aebe0eecfa83df66bf3e6caa29659dbac5a56ddfd83cae0d4a4/texttable-1.6.3-py2.py3-none-any.whl
Installing collected packages: texttable, python-igraph, py4cytoscape
Successfully installed py4cytoscape-0.0.6 python-igraph-0.8.3 texttable-1.6.3


---
## Setup: Set up Cytoscape connection


In [9]:
import IPython
import py4cytoscape as p4c
print(f'Loading Javascript client ... {p4c.get_browser_client_channel()} on {p4c.get_jupyter_bridge_url()}')
browser_client_js = p4c.get_browser_client_js(False)
IPython.display.Javascript(browser_client_js) # Start browser client


Loading Javascript client ... 96356a35-814f-48e5-9e82-f2f8b02ab8ae on https://jupyter-bridge.cytoscape.org


<IPython.core.display.Javascript object>

---
# Sanity tests to verify Cytoscape connection


---
## Sanity test: Cytoscape version


In [10]:
p4c.cytoscape_version_info()


{'apiVersion': 'v1',
 'automationAPIVersion': '1.0.0',
 'cytoscapeVersion': '3.8.2',
 'jupyterBridgeVersion': '0.0.2',
 'py4cytoscapeVersion': '0.0.6'}
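
If the sanity test should fail loudly rather than be read by eye, the returned dictionary can be checked in code. The sketch below is only an illustration: the dictionary is copied from the output above, and the minimum version `3.8.0` is a hypothetical threshold, not a documented requirement.

```python
def version_tuple(version):
    """Turn a dotted version such as '3.8.2' into (3, 8, 2) for comparison."""
    return tuple(int(part) for part in version.split("."))


# Copied from the p4c.cytoscape_version_info() output shown above.
info = {
    "apiVersion": "v1",
    "automationAPIVersion": "1.0.0",
    "cytoscapeVersion": "3.8.2",
    "jupyterBridgeVersion": "0.0.2",
    "py4cytoscapeVersion": "0.0.6",
}

# Hypothetical minimum version, chosen only for illustration.
MIN_CYTOSCAPE = "3.8.0"

# Comparing tuples of integers avoids the pitfall of comparing strings,
# where '3.10.0' would sort before '3.9.0'.
ok = version_tuple(info["cytoscapeVersion"]) >= version_tuple(MIN_CYTOSCAPE)
print("Cytoscape version OK" if ok else "Cytoscape is older than expected")
```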

---
## Sanity test: Cytoscape's sandbox path

**Sandboxing is explained in https://py4cytoscape.readthedocs.io/en/latest/concepts.html#sandboxing**

In [11]:
p4c.sandbox_get_file_info('.')

{'filePath': 'C:\\Users\\CyDeveloper\\CytoscapeConfiguration\\filetransfer\\default_sandbox',
 'isFile': False,
 'modifiedTime': '2020-11-10 15:32:21.0721'}

In [13]:
p4c.sandbox_send_to('GDS112_full.soft')

{'filePath': 'C:\\Users\\CyDeveloper\\CytoscapeConfiguration\\filetransfer\\default_sandbox\\GDS112_full.soft'}

Load the gene expression data into a data frame whose Gene ID column is a string, so that it matches the 'name' column already in the BIOGRID network.

ToDo:
1) Consider using the `dtype` parameter of read_csv to force Gene ID to string ... see https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html
2) *Do* drop rows with null Gene IDs
3) Consider how to use read_csv to avoid having to run wget in the first place
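
ToDo items 1 and 2 can be tried out on a small in-memory table before touching the real file. The sketch below assumes a tab-separated layout like the one in `GDS112_full.soft`; the sample rows (including the `EMPTY` row) and the variable names `sample_tsv` and `expr` are made up for illustration.

```python
import io

import pandas as pd

# Tiny stand-in for the data table in GDS112_full.soft (tab-separated).
# The middle row has no Gene ID, like the rows the notebook drops.
sample_tsv = (
    "ID_REF\tIDENTIFIER\tGSM1029\tGene ID\n"
    "25\tTFC3\t-0.663\t851262\n"
    "30\tEMPTY\t0.100\t\n"
    "26\tEFB1\t0.678\t851260\n"
)

# dtype= makes pandas parse 'Gene ID' as text from the start, so the IDs
# never become floats and no numeric round trip is needed afterwards.
expr = pd.read_csv(io.StringIO(sample_tsv), sep="\t", dtype={"Gene ID": "string"})

# Drop rows without a Gene ID (ToDo item 2).
expr = expr.dropna(subset=["Gene ID"])

print(expr["Gene ID"].tolist())
print(expr.dtypes)
```

With this approach the `to_numeric` and `astype` steps would no longer be needed, though that has not been verified against the full data file here.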

In [47]:
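import pandas as pd

# Skip the SOFT header lines that precede the tab-separated data table
GDS112_full = pd.read_csv('GDS112_full.soft', skiprows=82, sep='\t')
# Drop rows that have no Gene ID
GDS112_full.dropna(subset=['Gene ID'], inplace=True)
# Gene ID is read as float; convert to integer first so the string has no trailing '.0'
GDS112_full['Gene ID'] = pd.to_numeric(GDS112_full['Gene ID'], downcast='integer')
GDS112_full = GDS112_full.astype({'Gene ID': 'string'})
print(GDS112_full.dtypes)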


GDS112_full



ID_REF                    object
IDENTIFIER                object
GSM1029                  float64
GSM1030                  float64
GSM1032                  float64
GSM1033                  float64
GSM1034                  float64
Gene title                object
Gene symbol               object
Gene ID                   string
UniGene title            float64
UniGene symbol           float64
UniGene ID               float64
Nucleotide Title         float64
GI                       float64
GenBank Accession        float64
Platform_CLONEID         float64
Platform_ORF              object
Platform_SPOTID           object
Chromosome location      float64
Chromosome annotation     object
GO:Function               object
GO:Process                object
GO:Component              object
GO:Function ID            object
GO:Process ID             object
GO:Component ID           object
dtype: object


Unnamed: 0,ID_REF,IDENTIFIER,GSM1029,GSM1030,GSM1032,GSM1033,GSM1034,Gene title,Gene symbol,Gene ID,UniGene title,UniGene symbol,UniGene ID,Nucleotide Title,GI,GenBank Accession,Platform_CLONEID,Platform_ORF,Platform_SPOTID,Chromosome location,Chromosome annotation,GO:Function,GO:Process,GO:Component,GO:Function ID,GO:Process ID,GO:Component ID
24,25,TFC3,-0.663,0.144,0.605,0.696,0.659,transcription factor TFIIIC subunit TFC3,TFC3,851262,,,,,,,,YAL001C,,,"Chromosome I, NC_001133.9 (147594..151166, com...","DNA binding///contributes_to DNA binding, bend...",5S class rRNA transcription from RNA polymeras...,mitochondrion///mitochondrion///colocalizes_wi...,GO:0003677///contributes_to GO:0008301///contr...,GO:0042791///GO:0042791///GO:0071168///GO:0006...,GO:0005739///GO:0005739///colocalizes_with GO:...
25,26,EFB1,0.678,0.343,0.844,-0.072,-0.084,translation elongation factor 1 subunit beta,EFB1,851260,,,,,,,,YAL003W,,,"Chromosome I, NC_001133.9 (142174..143160)",guanyl-nucleotide exchange factor activity///t...,maintenance of translational fidelity///negati...,eukaryotic translation elongation factor 1 com...,GO:0005085///GO:0003746,GO:1990145///GO:0032232///GO:0006449///GO:0006...,GO:0005853///GO:0005853
26,27,SSA1,-0.956,-0.026,1.441,0.854,0.025,Hsp70 family ATPase SSA1,SSA1,851259,,,,,,,,YAL005C,,,"Chromosome I, NC_001133.9 (139503..141431, com...",ATP binding///ATPase activity///nucleotide bin...,SRP-dependent cotranslational protein targetin...,cell wall///colocalizes_with chaperonin-contai...,GO:0005524///GO:0016887///GO:0000166///GO:0000...,GO:0006616///GO:0072318///GO:0002181///GO:0043...,GO:0005618///colocalizes_with GO:0005832///GO:...
27,28,FUN14,-0.435,-0.247,0.662,0.688,0.192,Fun14p,FUN14,851225,,,,,,,,YAL008W,,,"Chromosome I, NC_001133.9 (136914..137510)",molecular_function,mitochondrion organization///phospholipid home...,integral component of membrane///integral comp...,GO:0003674,GO:0007005///GO:0055091,GO:0016021///GO:0031307///GO:0016020///GO:0005...
28,29,MDM10,-0.505,0.169,0.823,0.457,0.208,Mdm10p,MDM10,851223,,,,,,,,YAL010C,,,"Chromosome I, NC_001133.9 (134184..135665, com...",molecular_function,establishment of mitochondrion localization///...,ERMES complex///ERMES complex///integral compo...,GO:0003674,GO:0051654///GO:0000002///GO:0070096///GO:0070...,GO:0032865///GO:0032865///GO:0016021///GO:0031...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
9201,9202,VMA13,0.162,-0.202,0.649,0.132,0.042,H(+)-transporting V1 sector ATPase subunit H,VMA13,856148,,,,,,,,YPR036W,,,"Chromosome XVI, NC_001148.4 (643836..645272)","hydrolase activity, acting on acid anhydrides,...",ATP hydrolysis coupled proton transport///ion ...,fungal-type vacuole membrane///integral compon...,GO:0016820///GO:0046961///GO:0046961,GO:0015991///GO:0006811///GO:0015992///GO:0006...,GO:0000329///GO:0016021///GO:0016020///GO:0005...
9203,9204,TIP41,0.084,-0.088,0.262,0.118,0.039,Tip41p,TIP41,856153,,,,,,,,YPR040W,,,"Chromosome XVI, NC_001148.4 (647305..648375)",molecular_function,negative regulation of signal transduction///s...,cytoplasm///cytoplasm///nucleus///nucleus,GO:0003674,GO:0009968///GO:0007165///GO:0007165,GO:0005737///GO:0005737///GO:0005634///GO:0005634
9205,9206,ANT1,0.163,-0.314,0.011,0.350,0.472,Ant1p,ANT1,856246,,,,,,,,YPR128C,,,"Chromosome XVI, NC_001148.4 (791218..792204, c...",adenine nucleotide transmembrane transporter a...,ATP transport///fatty acid beta-oxidation///fa...,cytoplasm///integral component of membrane///i...,GO:0000295///GO:0000295,GO:0015867///GO:0006635///GO:0006635///GO:0006...,GO:0005737///GO:0016021///GO:0016021///GO:0005...
9207,9208,RPS23B,0.849,0.124,-0.872,-1.023,-0.432,ribosomal 40S subunit protein S23B,RPS23B,856250,,,,,,,,YPR132W,,,"Chromosome XVI, NC_001148.4 (794965..795767)",structural constituent of ribosome///structura...,maturation of SSU-rRNA from tricistronic rRNA ...,cytoplasm///cytosolic small ribosomal subunit/...,GO:0003735///GO:0003735,GO:0000462///GO:0006450///GO:0006450///GO:0006412,GO:0005737///GO:0022627///GO:0005622///GO:0030...


In [48]:
p4c.load_table_data(GDS112_full, data_key_column='Gene ID')

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  data_subset[col] = col_val


'Success: Data loaded in defaultnode table'