# Automating Cytoscape network creation using the RCy3 package
We are going to use the RCy3 package in R to automate the workflow for network creation.
We are using multiple gene expression datasets acquired from different studies. These datasets contain gene expression profiles of patients with various diseases.
The diseases used in this notebook are: lung cancer (LC), breast cancer (BC), metabolically unhealthy obesity (MUO), rheumatoid arthritis (RA), dilated cardiomyopathy (DCM), ischemic cardiomyopathy (ICM) and systemic lupus erythematosus (SLE).

The network will consist of pathways, pathway clusters and the genes that occur in these pathways. The pathways were selected based on genes associated with inflammation, which were retrieved from [DisGeNET](http://www.disgenet.org/home/).

In [38]:
# check wd
getwd()

In [39]:
# load library
library(RCy3)

### First, open Cytoscape and check that we are connected

In [40]:
# check if cytoscape is open and check version
cytoscapePing()
cytoscapeVersionInfo()
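
The connection check above stops with an error when Cytoscape is not running. A minimal sketch of a guarded check that returns `TRUE` or `FALSE` instead is shown below. The helper name `cytoscape_is_reachable` is hypothetical and not part of RCy3; the sketch assumes that `cytoscapePing()` signals an error when no connection can be made.

```r
# Hypothetical helper (not part of RCy3): returns TRUE only if RCy3 is
# installed and cytoscapePing() completes without an error.
cytoscape_is_reachable <- function() {
  if (!requireNamespace("RCy3", quietly = TRUE)) {
    return(FALSE)
  }
  tryCatch({
    RCy3::cytoscapePing()
    TRUE
  }, error = function(e) FALSE)
}

reachable <- cytoscape_is_reachable()
if (!reachable) {
  message("Cytoscape is not reachable; start Cytoscape before continuing.")
}
```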

### Network creation
To begin with, we will load in our node and edge files to create our network.

We have to clean these files so they are usable for network creation.

In [41]:
# load in data of network
# read.table already returns a data frame, so no extra conversion is needed
nodes <- read.table(file.path(getwd(), "data-output", "node_table_final.txt"), header = TRUE, sep = "\t", stringsAsFactors = FALSE)
edges <- read.table(file.path(getwd(), "data-output", "edge_table_final.txt"), header = TRUE, sep = "\t", stringsAsFactors = FALSE)

# clean up data
colnames(nodes)[1] <- "id"
nodes$id <- as.character(nodes$id)
colnames(edges)[c(1,2)] <- c("source", "target") 
edges$interaction <- "interacts"
edges$target <- as.character(edges$target)

head(nodes)
head(edges)

id,type
Cytokines,Process
Inflammation,Process
NFkB,Process
Angiogenesis,Process
Metabolism,Process
Complement,Process


source,target,interaction
Cytokines,GNB1,interacts
Cytokines,PTPN11,interacts
Cytokines,SHC1,interacts
Cytokines,PIK3R1,interacts
Cytokines,UBC,interacts
Cytokines,UBA52,interacts
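
Before sending the tables to Cytoscape, it can help to confirm that every edge endpoint also appears in the node table, since a mismatch may cause network creation to fail or give an unexpected result. The sketch below uses small toy data frames with the same column names as above; the toy rows and the helper name `find_orphan_endpoints` are assumptions made for illustration.

```r
# Toy node and edge tables with the same column layout as the cleaned files
nodes <- data.frame(id = c("Cytokines", "GNB1", "PTPN11"),
                    type = c("Process", "Gene", "Gene"),
                    stringsAsFactors = FALSE)
edges <- data.frame(source = c("Cytokines", "Cytokines", "Cytokines"),
                    target = c("GNB1", "PTPN11", "SHC1"),
                    interaction = "interacts",
                    stringsAsFactors = FALSE)

# Hypothetical helper: returns edge endpoints that are missing from nodes$id
find_orphan_endpoints <- function(nodes, edges) {
  endpoints <- unique(c(edges$source, edges$target))
  setdiff(endpoints, nodes$id)
}

orphans <- find_orphan_endpoints(nodes, edges)
print(orphans)  # SHC1 is used in an edge but has no row in the node table
```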


Now we can create our network in Cytoscape

In [42]:
# create network in Cytoscape from the node and edge data frames
createNetworkFromDataFrames(nodes, edges, title = "MyNetwork", collection = "MyCollection")

Loading data...
Applying default style...
Applying preferred layout...


Next, we load our gene expression data file and merge it with the node table in Cytoscape.

In [43]:
# load data set with gene expression values (logFC, p.value)
expr_data <- read.table(file.path(getwd(), "data-output", "merged_data_final.txt"), header = T, sep ="\t")

# load data into Cytoscape, matching hgnc_symbol against the node "shared name"
loadTableData(expr_data, data.key.column = "hgnc_symbol", table.key.column = "shared name")

# check if tables are well merged
nodeTable <- getTableColumns(table = "node")
head(nodeTable)

Unnamed: 0,SUID,shared name,name,selected,id,type,entrezgene,hgnc_symbol,logFC_BC,PValue_BC,logFC_MUO,PValue_MUO,logFC_RA,PValue_RA,logFC_DCM,PValue_DCM
62,62,Cytokines,Cytokines,False,Cytokines,Process,,,,,,,,,,
63,63,Inflammation,Inflammation,False,Inflammation,Process,,,,,,,,,,
64,64,NFkB,NFkB,False,NFkB,Process,,,,,,,,,,
65,65,Angiogenesis,Angiogenesis,False,Angiogenesis,Process,,,,,,,,,,
66,66,Metabolism,Metabolism,False,Metabolism,Process,,,,,,,,,,
67,67,Complement,Complement,False,Complement,Process,,,,,,,,,,
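
The process nodes in the output above have empty expression columns because only gene symbols have rows in the expression table. A quick way to see how well the two key columns overlap is sketched below in base R. The toy symbols and values are made up for illustration and do not come from the datasets.

```r
# Toy stand-ins: node names as they would appear in "shared name",
# and an expression table keyed by hgnc_symbol (values are invented)
node_names <- c("Cytokines", "GNB1", "PTPN11", "SHC1")
expr_data <- data.frame(hgnc_symbol = c("GNB1", "SHC1", "TP53"),
                        logFC_BC = c(0.41, -0.30, 1.20),
                        PValue_BC = c(0.01, 0.20, 0.001),
                        stringsAsFactors = FALSE)

# Rows of the expression table that have a matching node
matched <- expr_data$hgnc_symbol %in% node_names

# Fraction of nodes that will receive expression values
coverage <- mean(node_names %in% expr_data$hgnc_symbol)

# Symbols in the expression table without a node in the network
unmatched_symbols <- expr_data$hgnc_symbol[!matched]

print(coverage)
print(unmatched_symbols)
```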


### Manual visualization
Before we start to visualize our network, a few steps have to be performed manually in Cytoscape.

1. Open Cytoscape and go to Tools, then NetworkAnalyzer, then Network Analysis, then Analyze Network.
2. Treat the network as undirected and click OK.
3. Now that the network has been analyzed, change its layout: go to Layout, then yFiles Organic Layout.
4. Close the message box when finished, and return to R.

### Automated visualization
Now that everything is loaded into Cytoscape and the manual steps are done, we can start automating the visual style.

For every dataset we would like to create its own network. We will do this using the clone network function.

Once a network is available for every dataset, we can set a visual style that maps the gene expression (logFC) and significance (p-value) onto every gene node in the network, per dataset. The `Automation` function takes three cutoffs, `y`, `z` and `p`, where `y` is the negative logFC cutoff, `z` is the positive logFC cutoff and `p` is the p-value cutoff.

In [44]:
# Set the cutoffs that are passed to the Automation function as y, z and p:
# negLogFC is the negative logFC cutoff (y), posLogFC is the positive logFC cutoff (z) and pValue is the p-value cutoff (p).
pValue <- 0.05
negLogFC <- -0.26
posLogFC <- 0.26
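
One way to read these three cutoffs: a gene counts as significantly differentially expressed when its logFC lies outside the range from `negLogFC` to `posLogFC` and its p-value is below `pValue`. The sketch below applies that rule in base R to a made-up table. The gene values are invented, and treating the cutoff boundaries as exclusive is an assumption; the Cytoscape filters used later in this notebook may handle values exactly on a boundary differently.

```r
pValue <- 0.05
negLogFC <- -0.26
posLogFC <- 0.26

# Invented example values for one disease (BC)
toy <- data.frame(hgnc_symbol = c("GNB1", "PTPN11", "SHC1", "UBC"),
                  logFC_BC = c(0.41, -0.10, -0.35, 0.50),
                  PValue_BC = c(0.01, 0.03, 0.20, 0.04),
                  stringsAsFactors = FALSE)

# logFC below the negative cutoff or above the positive cutoff
outside_cutoffs <- toy$logFC_BC < negLogFC | toy$logFC_BC > posLogFC

# p-value below the significance cutoff
significant <- toy$PValue_BC < pValue

# Genes that satisfy both conditions
selected <- toy$hgnc_symbol[outside_cutoffs & significant]
print(selected)
```

Here PTPN11 is dropped because its logFC lies inside the cutoffs, and SHC1 is dropped because its p-value is too high.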

In [49]:
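# y: negative logFC cutoff, z: positive logFC cutoff, p: p-value cutoff,
# disease: dataset abbreviation used in the logFC_<disease> and PValue_<disease> columns
Automation <- function(y,z,p,disease) {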
    
# clone network (the clone is expected to be named "MyNetwork_1") and rename it to the disease
cloneNetwork(network = "MyNetwork")
setCurrentNetwork(network="MyNetwork_1")
renameNetwork(disease)
 
# lock width and height of nodes
lockNodeDimensions(TRUE)
    
# Map logFC and pValue per disease for every respective network and add visual style to every network
setCurrentNetwork(network = disease)
style.name = paste("style_", disease, sep = "")
  
# map logFC of gene expression data per disease on respective networks
mappings <- list(nodeFill <- mapVisualProperty("node fill color", table.column = paste("logFC_", disease , sep = ""), 
                                    mapping.type = "continuous",
                                    c(y,0,z),
                                    c("#0000FF", "#FFFFFF", "#FF0000")),
  
# map significance per disease on respective networks
nodeBorder <- mapVisualProperty("node border paint", table.column = paste("PValue_", disease, sep = ""),
                                    mapping.type = "continuous",
                                    table.column.values = c(0.00, p, 1.00), 
                                    c("#00FF00", "#FFFFFF", "#FFFFFF")),
  
# map node shapes
nodeShape <- mapVisualProperty("Node Shape", table.column = "type", 
                                    mapping.type = "discrete", 
                                    table.column.values = c("Gene", "Process", "InflGene"),
                                    c("DIAMOND", "ELLIPSE", "VEE")),
  
label <- mapVisualProperty("Node Label", table.column = "shared name",
                             mapping.type = "passthrough"))

defaults <- list(NODE_FILL_COLOR = "#999999",
                NODE_BORDER_PAINT = "#999999",
                NODE_BORDER_WIDTH = 7,
                NODE_LABEL_FONT_SIZE = 18,
                NETWORK_TITLE = disease)
  
# create and set unique visual style per disease dataset
createVisualStyle(style.name, defaults, mappings = mappings)
setVisualStyle(style.name)

# save disease networks
setCurrentNetwork(network = disease)
fitContent()
  
png.file <- file.path(getwd(), "data-output", "images", paste0(z, "_gene_expr_network_", disease, ".png"))
exportImage(png.file, type = "png", resolution=600, zoom=500)

# create subnetwork of significant differentially expressed genes
setCurrentNetwork(network = disease)
  
createColumnFilter(filter.name = "expr genes",     column = paste("logFC_", disease, sep = "") , c(y,z), "IS_NOT_BETWEEN")
createColumnFilter(filter.name = "sig genes",      column = paste("PValue_", disease, sep = ""), p, "LESS_THAN")
sigexpr <- createCompositeFilter('combined filter', filter.list = c("sig genes", "expr genes"), type = "ALL")
  
process <- createColumnFilter(filter.name = "Pathway filter", column = "type", "Process", "IS")
  
selectNodes(nodes = c(process$nodes, sigexpr$nodes), by.col = "shared name")
  
createSubnetwork(nodes = "selected", subnetwork.name = paste0("subnetwork ", disease, " genes"))
    
setCurrentNetwork(network = paste0("subnetwork ", disease, " genes"))
fitContent()
  
png.file <- file.path(getwd(), "data-output", "images", paste0(z, "_subgene_expr_network_", disease, ".png"))
exportImage(png.file, type = "png", resolution=600, zoom=500)

# create subnetwork of significant differentially expressed inflammation genes
setCurrentNetwork(network = disease)
  
createColumnFilter(filter.name = "Infl filter",    column = "type", "InflGene", "IS")
createColumnFilter(filter.name = "expr genes",     column = paste("logFC_", disease, sep = "") , c(y,z), "IS_NOT_BETWEEN")
createColumnFilter(filter.name = "sig genes",      column = paste("PValue_", disease, sep = ""), p, "LESS_THAN")
sigexpr2 <- createCompositeFilter('combined filter', filter.list = c("Infl filter", "sig genes", "expr genes"), type = "ALL")
  
process <- createColumnFilter(filter.name = "Pathway filter", column = "type", "Process", "IS")
  
selectNodes(nodes = c(process$nodes, sigexpr2$nodes), by.col = "shared name")
  
createSubnetwork(nodes = "selected", subnetwork.name = paste0("subnetwork ", disease, " inflGenes"))
    
setCurrentNetwork(network = paste0("subnetwork ", disease, " inflGenes"))
fitContent()
  
png.file <- file.path(getwd(), "data-output", "images", paste0(z, "_subinflgene_expr_network_", disease, ".png"))
exportImage(png.file, type = "png", resolution=600, zoom=500)

}
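
To get a feeling for the continuous fill-color mapping in the function above, the sketch below reproduces a blue-white-red gradient in base R. It is only an approximation of what Cytoscape does: it assumes linear interpolation between the three anchor points and assumes that values beyond the cutoffs take the end colors. The helper name `logfc_to_hex` is hypothetical, and missing values are not handled.

```r
# Hypothetical helper: approximate the node fill color for a logFC value,
# using the anchors (neg_cutoff, 0, pos_cutoff) -> (blue, white, red)
logfc_to_hex <- function(logfc, neg_cutoff = -0.26, pos_cutoff = 0.26) {
  # assumption: values beyond the cutoffs are drawn in the end colors
  clamped <- pmin(pmax(logfc, neg_cutoff), pos_cutoff)

  # rescale [neg_cutoff, 0] to [0, 0.5] and [0, pos_cutoff] to [0.5, 1]
  position <- ifelse(clamped < 0,
                     0.5 * (1 - clamped / neg_cutoff),
                     0.5 + 0.5 * clamped / pos_cutoff)

  ramp <- colorRamp(c("#0000FF", "#FFFFFF", "#FF0000"))
  rgb_values <- ramp(position)
  rgb(rgb_values[, 1], rgb_values[, 2], rgb_values[, 3], maxColorValue = 255)
}

example_colors <- logfc_to_hex(c(-1, 0, 1))
print(example_colors)
```

A strongly downregulated gene comes out blue, an unchanged gene white and a strongly upregulated gene red, which matches the color order passed to `mapVisualProperty`.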

In [50]:
# fill in abbreviations used for disease gene expression datasets; the same abbreviations are used in the logFC_x and PValue_x column names
diseases <- c("BC","MUO","RA","DCM")

# run the Automation function for every disease dataset
for (disease in diseases) {
    print(disease)
    Automation(y = negLogFC, z = posLogFC, p = pValue, disease = disease)
}

# save Cytoscape session
session.file <- file.path(getwd(), "data-output", "networks", paste0(posLogFC, "_gene_expr_networks.cys"))
saveSession(session.file)

[1] "BC"
[1] "MUO"
[1] "RA"
[1] "DCM"
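
The images and the session file are written to the `data-output/images` and `data-output/networks` folders, and the export may fail if these folders do not exist yet. A minimal sketch for creating them up front is shown below. The helper name `ensure_output_dirs` is hypothetical, and the demonstration runs in a temporary folder so that it does not touch the working directory.

```r
# Hypothetical helper: create the output folders used in this notebook
# below base_dir, if they do not exist yet
ensure_output_dirs <- function(base_dir = getwd()) {
  dirs <- file.path(base_dir, "data-output", c("images", "networks"))
  for (d in dirs) {
    dir.create(d, recursive = TRUE, showWarnings = FALSE)
  }
  invisible(dirs)
}

# Demonstration in a temporary location
created <- ensure_output_dirs(tempfile("demo_"))
print(dir.exists(created))
```

In the notebook itself, calling `ensure_output_dirs()` without arguments before running the loop would create the folders relative to the working directory.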


In [51]:
# information about session
sessionInfo()

R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17134)

Matrix products: default

locale:
[1] LC_COLLATE=Dutch_Netherlands.1252  LC_CTYPE=Dutch_Netherlands.1252   
[3] LC_MONETARY=Dutch_Netherlands.1252 LC_NUMERIC=C                      
[5] LC_TIME=Dutch_Netherlands.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] RCy3_2.2.6           RevoUtils_11.0.1     RevoUtilsMath_11.0.0

loaded via a namespace (and not attached):
 [1] igraph_1.2.2        graph_1.60.0        Rcpp_1.0.0         
 [4] magrittr_1.5        BiocGenerics_0.28.0 uuid_0.1-2         
 [7] R6_2.3.0            httr_1.4.0          tools_3.5.1        
[10] parallel_3.5.1      R.oo_1.22.0         htmltools_0.3.6    
[13] digest_0.6.18       crayon_1.3.4        RJSONIO_1.3-1.1    
[16] IRdisplay_0.7.0     repr_0.19.1         base64enc_0.1-3    
[19] R.utils_2.7.0       curl_3.3    