# Automating Cytoscape network creation using the RCy3 package
We use the RCy3 package in R to automate the network creation workflow in Cytoscape.
We use multiple gene expression datasets acquired from different studies. These datasets contain gene expression profiles of patients with various diseases.
The diseases used in this notebook are: lung cancer (LC), breast cancer (BC), metabolically unhealthy obesity (MUO), rheumatoid arthritis (RA), Rett syndrome frontal cortex (RETT_FC), Rett syndrome temporal cortex (RETT_TC) and systemic lupus erythematosus (SLE).

The network consists of pathways, pathway clusters and the genes that occur in these pathways. The pathways were selected based on inflammation-associated genes retrieved from [DisGeNET](http://www.disgenet.org/home/) and [GeneCards](https://www.genecards.org/).

### The following step only works in RStudio. If you work in another environment, set the working directory manually and check that it is correct.

In [None]:
# set wd to where script file is saved
setwd(dirname(rstudioapi::callFun("getActiveDocumentContext")$path))

In [1]:
# check wd
getwd()
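
Outside RStudio, the working directory has to be set another way. A minimal sketch, assuming the project folder is passed in through a hypothetical environment variable `PROJECT_DIR` (not part of the original workflow); when the variable is not set, the current directory is kept:

```r
# Hypothetical helper: pick a project directory without relying on RStudio.
# PROJECT_DIR is an assumed environment variable, not part of the original notebook.
resolve_project_dir <- function() {
  dir <- Sys.getenv("PROJECT_DIR", unset = getwd())
  if (!dir.exists(dir)) {
    stop("Project directory does not exist: ", dir)
  }
  normalizePath(dir)
}

project_dir <- resolve_project_dir()
setwd(project_dir)

# the folders read later in this notebook should be visible from here
file.exists(file.path(project_dir, c("Network", "expr_data")))
```

If the last line returns `FALSE` for either folder, the working directory does not point at the project root.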

In [3]:
# load library
library(RCy3)

### We start by opening Cytoscape and checking that we are connected

In [4]:
# check if Cytoscape is open and check version
cytoscapePing()
cytoscapeVersionInfo()

### Network creation
To begin with, we will load in our node and edge files to create our network.

In [7]:
# load in data of network (read.table already returns a data frame)
nodes <- read.table(file.path(getwd(), "Network", "nodes_network_final.txt"), header = TRUE, sep = "\t", stringsAsFactors = FALSE)
edges <- read.table(file.path(getwd(), "Network", "edges_network_final.txt"), header = TRUE, sep = "\t", stringsAsFactors = FALSE)

We have to clean these files so they are usable for network creation.

In [8]:
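# clean up data
# nodes: the first column holds the node identifier and must be named "id"
colnames(nodes)[1] <- "id"
nodes$id <- as.character(nodes$id)
# edges: drop columns 3 and 4, then name the first two columns "source" and "target"
edges <- edges[,c(-3,-4)]
colnames(edges)[c(1,2)] <- c("source", "target")
# give every edge the same interaction type
edges$interaction <- "interacts"
# identifiers must be character so they match the node ids
edges$target <- as.character(edges$target)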
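
To see what the clean-up does, here is a small sketch on a made-up edge table. The column names `from`, `to`, `weight` and `note` and all values are assumptions for illustration only; the real column names in `edges_network_final.txt` are not shown in this notebook.

```r
# toy edge table (hypothetical column names and values)
toy_edges <- data.frame(from = c("WP1", "WP1"),
                        to = c(7432, 335),
                        weight = c(1, 1),
                        note = c("a", "b"),
                        stringsAsFactors = FALSE)

toy_edges <- toy_edges[, c(-3, -4)]                    # drop columns 3 and 4
colnames(toy_edges)[c(1, 2)] <- c("source", "target")  # column names used for the edge table
toy_edges$interaction <- "interacts"                   # one interaction type for all edges
toy_edges$target <- as.character(toy_edges$target)     # ids as character

str(toy_edges)
```

After these steps the table has exactly the columns `source`, `target` and `interaction`, which is the shape passed to `createNetworkFromDataFrames()` in the next cell.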

Now we can create our network in Cytoscape

In [9]:
# create network from the data frames
createNetworkFromDataFrames(nodes, edges, title = "MyNetwork", collection = "MyCollection")

Loading data...
Applying default style...
Applying preferred layout...


Next, we load our gene expression data file and merge it with the node table in Cytoscape.

In [5]:
# load data set with gene expression values (logFC, p.value)
expr_data <- read.table(file.path(getwd(), "expr_data", "merged_data_final.txt"), header = TRUE, sep = "\t")

# load data into Cytoscape, matching "entrezgene" against the "shared name" column of the node table
loadTableData(expr_data, data.key.column = "entrezgene", table.key.column = "shared name")

# check if the tables are merged correctly
nodeTable <- getTableColumns(table = "node")
head(nodeTable)

Unnamed: 0,SUID,shared name,id,Type,entrezgene,hgnc_symbol,logFC_BC,PValue_BC,logFC_LC,PValue_LC,...,logFC_RA,PValue_RA,logFC_RETT_FC,PValue_RETT_FC,logFC_RETT_TC,PValue_RETT_TC,logFC_SLE,PValue_SLE,name,selected
22035,22035,7432,7432,InflGene,7432,VIP,-3.91755,1.94e-05,-0.348927,0.6848322,...,-0.158545,0.1114853,-0.3023721,0.5062073,-1.39849,0.006897598,1.219865,2.08e-14,7432,False
22109,22109,335,335,InflGene,335,APOA1,1.737478,0.05827234,-2.883187,0.002499149,...,-0.03512625,0.678017,0.07255186,0.4617053,0.1747566,0.08947839,0.8043408,0.006304733,335,False
21870,21870,3557,3557,InflGene,3557,IL1RN,0.2388055,0.7944324,-0.4649065,0.4449938,...,0.2861675,0.01075267,0.3152021,0.01112743,0.1361192,0.197996,-0.7500728,1.46e-08,3557,False
22382,22382,3106,3106,InflGene,3106,HLA-B,0.7111358,0.1971166,-0.3765966,0.5696242,...,0.6345852,0.000859463,0.0,0.0,0.0,0.0,-0.8482147,0.0242846,3106,False
22405,22405,721,721,InflGene,721,C4B,4.491728,0.007113589,-1.689584,0.006411302,...,0.0,0.0,0.120021,0.181651,0.01946219,0.8231107,1.045949,0.001240478,721,False
21916,21916,4312,4312,InflGene,4312,MMP1,4.000859,0.01006965,4.081474,0.008611911,...,7.151851,2.62e-07,0.1109809,0.3236545,0.080401,0.4709361,0.8922701,0.01458923,4312,False
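
The merge can also be checked from the R side by comparing the two key columns before loading. The sketch below uses made-up toy values; `toy_nodes` and `toy_expr` are hypothetical stand-ins for `nodes` and `expr_data`.

```r
# hypothetical stand-ins for the real node table and expression table
toy_nodes <- data.frame(id = c("7432", "335", "WP1"), stringsAsFactors = FALSE)
toy_expr <- data.frame(entrezgene = c(7432, 335, 3557))

# compare as character, because the node ids were converted to character earlier
missing_expr <- setdiff(toy_nodes$id, as.character(toy_expr$entrezgene))
unused_expr <- setdiff(as.character(toy_expr$entrezgene), toy_nodes$id)

missing_expr  # nodes without expression data (pathway and cluster nodes are expected here)
unused_expr   # expression rows that match no node
```

Gene nodes showing up in `missing_expr` would be worth a closer look, since they will have empty logFC and p-value columns in Cytoscape.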


### Visualization 
Before we visualize the network, a few steps have to be performed manually in Cytoscape.

In [10]:
# first we take care of the manual steps; this opens a message box with instructions (Windows only)
system('CMD /C "ECHO To analyze the network we have to go to Cytoscape. First open Cytoscape and go to Tools then NetworkAnalyzer then Network Analysis then Analyze Network. Treat the network as undirected and click OK. After analyzing the network we want to change its layout. Go to Layout then yFiles Organic Layout. Close this message box when finished, and return to R. && PAUSE"', 
       invisible=FALSE, wait=FALSE)
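
The `CMD /C` call above only works on Windows. A portable alternative, sketched here with a hypothetical helper `pause_for_manual_steps()` (not part of RCy3), prints the instructions in the R console and waits for Enter when the session is interactive:

```r
# Hypothetical helper, not part of RCy3: show instructions and wait for the user.
pause_for_manual_steps <- function(steps) {
  # print the steps as a numbered list
  message(paste0(seq_along(steps), ". ", steps, collapse = "\n"))
  # only block when a user is present to press Enter
  if (interactive()) {
    readline("Press Enter when you are finished in Cytoscape: ")
  }
  invisible(length(steps))
}

n_steps <- pause_for_manual_steps(c(
  "Tools > NetworkAnalyzer > Network Analysis > Analyze Network (treat the network as undirected)",
  "Layout > yFiles Organic Layout"
))
```

In a non-interactive run (for example with `Rscript`) the helper only prints the steps and continues, so the manual work must already be done at that point.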

Now that everything is loaded into Cytoscape and the manual steps are done, we can start automating the visual style.

In [11]:
# lock width and height of nodes
lockNodeDimensions(TRUE)

For every dataset we would like to create its own network. We do this using the clone network function.

In [12]:
# we have 7 gene expression sets, so we clone the network 7 times using a repeat loop
diseases <- c("BC","LC","MUO","RA","RETT_FC","RETT_TC","SLE")

x <- 1
repeat {
  cloneNetwork(network = "MyNetwork")
  setCurrentNetwork(network="MyNetwork_1")
  renameNetwork(diseases[x])
  x = x + 1
  if (x == length(diseases)+1){
  print("Done!")
  break}
}
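
The same cloning step can be written as a `for` loop without a manual counter. This is a sketch, not the notebook's method: it assumes that `cloneNetwork()` returns the SUID of the new network (as the RCy3 documentation describes) and that `renameNetwork()` accepts that SUID through its `network` argument, so the loop does not depend on the clone being called `MyNetwork_1`. The wrapper name `clone_per_disease()` is hypothetical.

```r
# Sketch: clone the base network once per disease.
# Assumption: cloneNetwork() returns the SUID of the newly created network.
clone_per_disease <- function(diseases, base.network = "MyNetwork") {
  for (disease in diseases) {
    suid <- RCy3::cloneNetwork(network = base.network)
    RCy3::renameNetwork(disease, network = suid)
  }
  invisible(diseases)
}

# requires the RCy3 package and a running Cytoscape session:
# clone_per_disease(c("BC", "LC", "MUO", "RA", "RETT_FC", "RETT_TC", "SLE"))
```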

Now that we have a network for every dataset, we can set a visual style that maps the gene expression (logFC) and significance (p-value) of every gene node in the network, per dataset.

In [None]:
# Map logFC and pValue per disease for every respective network and add visual style to every network
x <- 1
repeat {
  setCurrentNetwork(network = diseases[x])
  
  style.name = paste("style_", diseases[x], sep = "")
  
  # collect the mappings for this disease
  mappings <- list(
    # map logFC of gene expression data per disease on respective networks
    mapVisualProperty("node fill color", table.column = paste("logFC_", diseases[x], sep = ""),
                      mapping.type = "continuous",
                      table.column.values = c(-0.58, 0, 0.58),
                      visual.prop.values = c("#0000FF", "#FFFFFF", "#FF0000")),

    # map significance per disease on respective networks
    mapVisualProperty("node border paint", table.column = paste("PValue_", diseases[x], sep = ""),
                      mapping.type = "continuous",
                      table.column.values = c(0.00, 0.05, 1.00),
                      visual.prop.values = c("#00FF00", "#FFFFFF", "#FFFFFF")),

    # map node shapes
    mapVisualProperty("Node Shape", table.column = "Type",
                      mapping.type = "discrete",
                      table.column.values = c("Gene", "Pathway", "Cluster", "InflGene"),
                      visual.prop.values = c("DIAMOND", "ELLIPSE", "ROUND_RECTANGLE", "VEE")),

    # show the shared name as node label
    mapVisualProperty("Node Label", table.column = "shared name",
                      mapping.type = "passthrough")
  )

  defaults <- list(NODE_FILL_COLOR = "#999999",
                   NODE_BORDER_PAINT = "#999999",
                   NODE_BORDER_WIDTH = 7,
                   NODE_LABEL_FONT_SIZE = 18,
                   NETWORK_TITLE = diseases[x])
  
  # create and set unique visual style per disease dataset
  createVisualStyle(style.name, defaults, mappings = mappings)
  setVisualStyle(style.name)

  x = x + 1
  if (x == length(diseases)+1){
    print("Done!")
    break}
}
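
The colour breakpoints of -0.58 and 0.58 correspond to roughly a 1.5-fold change down or up. This reading assumes that the logFC values are log2 fold changes; the notebook does not state the base, so treat it as an assumption.

```r
# assumption: logFC is a log2 fold change
logfc_cutoff <- 0.58
fold_change <- 2^logfc_cutoff
round(fold_change, 2)

# the other direction: the logFC cutoff for an exact 1.5-fold change
exact_cutoff <- log2(1.5)
round(exact_cutoff, 3)
```

Nodes with a logFC beyond these breakpoints get the full blue or red colour, and values in between are interpolated towards white.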

Before we select any nodes (the selection is not cleared afterwards), we save images of the main networks. We do this first so that the images do not contain the yellow highlighting of selected nodes. Before running this step, make sure that no images with the same names already exist in the output folder; otherwise the export will not work!

In [None]:
# save disease networks
x <- 1
repeat {
  setCurrentNetwork(network = diseases[x])
  fitContent()
  
  png.file <- file.path(getwd(), "Images", "networks", paste("gene_expr_network", diseases[x], ".png", sep = ""))
  exportImage(png.file, type = "png", resolution=600, zoom=500)
  
  x = x + 1
  if (x == length(diseases)+1){
    print("Done!")
    break}
}
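
Checking for leftover images can be done from R before exporting. The helper below is a hypothetical sketch (`prepare_image_path()` is not part of RCy3): it creates the output folder if needed and removes an existing file with the same name. It is demonstrated in a temporary folder so that it does not touch real images.

```r
# Hypothetical helper: make sure the output folder exists and no old image is in the way.
prepare_image_path <- function(dir, file.name) {
  dir.create(dir, recursive = TRUE, showWarnings = FALSE)
  path <- file.path(dir, file.name)
  if (file.exists(path)) {
    file.remove(path)
  }
  path
}

# demonstration in a temporary folder; in the notebook, dir would be
# file.path(getwd(), "Images", "networks")
demo_dir <- file.path(tempdir(), "Images", "networks")
png.file <- prepare_image_path(demo_dir, "gene_expr_networkBC.png")
file.exists(png.file)
```

The returned path can then be passed to `exportImage()` as in the loop above. Note that this deletes the old image, so keep a copy elsewhere if it is still needed.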

### Subnetwork creation
We are also interested in the significantly differentially expressed genes and the inflammation-associated genes. So let's create subnetworks based on these properties!

In [None]:
# create subnetwork of significant differentially expressed genes
x <- 1
repeat {
  setCurrentNetwork(network = diseases[x])
  
  createColumnFilter(filter.name = "expr genes",     column = paste("logFC_", diseases[x], sep = "") , c(-0.58,0.58), "IS_NOT_BETWEEN")
  createColumnFilter(filter.name = "sig genes",      column = paste("PValue_", diseases[x], sep = ""), 0.05, "LESS_THAN")
  sigexpr <- createCompositeFilter('combined filter', filter.list = c("sig genes", "expr genes"), type = "ALL")
  
  pathways <- createColumnFilter(filter.name = "Pathway filter", column = "Type", "Pathway", "IS")
  clusters <- createColumnFilter(filter.name = "Cluster filter", column = "Type", "Cluster", "IS")
  
  selectNodes(nodes = c(pathways$nodes, clusters$nodes, sigexpr$nodes), by.col = "shared name")
  
  createSubnetwork(nodes = "selected", subnetwork.name = paste("subnetwork ", diseases[x], " genes", sep = ""))
  
  x = x + 1
  if (x == length(diseases)+1){
    print("Done!")
    break}
}
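
The logic of the combined filter can be reproduced in plain R on a few rows. The values below are copied from the breast cancer columns of the node table shown earlier; the data frame `toy` itself is only an illustration. How Cytoscape treats values lying exactly on a boundary is not covered by this sketch.

```r
# values copied from the node table shown earlier (breast cancer columns)
toy <- data.frame(
  hgnc_symbol = c("VIP", "APOA1", "IL1RN", "HLA-B", "C4B", "MMP1"),
  logFC_BC = c(-3.91755, 1.737478, 0.2388055, 0.7111358, 4.491728, 4.000859),
  PValue_BC = c(1.94e-05, 0.05827234, 0.7944324, 0.1971166, 0.007113589, 0.01006965),
  stringsAsFactors = FALSE
)

is_expressed <- toy$logFC_BC < -0.58 | toy$logFC_BC > 0.58   # "expr genes"
is_significant <- toy$PValue_BC < 0.05                       # "sig genes"

# type = "ALL" in the composite filter means both conditions must hold
selected_genes <- toy$hgnc_symbol[is_expressed & is_significant]
selected_genes
```

APOA1 and HLA-B pass the logFC cutoff but not the p-value cutoff, so they are left out of the selection.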

In [None]:
# create subnetwork of significant differentially expressed inflammation genes
x <- 1
repeat {
  setCurrentNetwork(network = diseases[x])
  
  createColumnFilter(filter.name = "Infl filter",    column = "Type", "InflGene", "IS")
  createColumnFilter(filter.name = "expr genes",     column = paste("logFC_", diseases[x], sep = "") , c(-0.58,0.58), "IS_NOT_BETWEEN")
  createColumnFilter(filter.name = "sig genes",      column = paste("PValue_", diseases[x], sep = ""), 0.05, "LESS_THAN")
  sigexpr <- createCompositeFilter('combined filter', filter.list = c("Infl filter", "sig genes", "expr genes"), type = "ALL")
  
  pathways <- createColumnFilter(filter.name = "Pathway filter", column = "Type", "Pathway", "IS")
  clusters <- createColumnFilter(filter.name = "Cluster filter", column = "Type", "Cluster", "IS")
  
  selectNodes(nodes = c(pathways$nodes, clusters$nodes, sigexpr$nodes), by.col = "shared name")
  
  createSubnetwork(nodes = "selected", subnetwork.name = paste("subnetwork ", diseases[x], " inflGenes",sep = ""))
  
  x = x + 1
  if (x == length(diseases)+1){
    print("Done!")
    break}
}

### Saving everything
We are almost finished! We already saved the images of the main networks for every disease, but we still want to save the images of the subnetworks. So let's do that, and afterwards we will save the Cytoscape session so nothing is lost.

In [None]:
# save subnetworks genes
subgenes <- c("subnetwork BC genes","subnetwork LC genes","subnetwork MUO genes",
              "subnetwork RA genes","subnetwork RETT_FC genes",
              "subnetwork RETT_TC genes","subnetwork SLE genes")
x <- 1
repeat {
  setCurrentNetwork(network = subgenes[x])
  fitContent()
  
  png.file <- file.path(getwd(), "Images", "subGeneNetworks", paste("subgene_expr_network", diseases[x], ".png", sep = ""))
  exportImage(png.file, type = "png", resolution=600, zoom=500)
  
  x = x + 1
  if (x == length(diseases)+1){
    print("Done!")
    break}
}

In [None]:
# save subnetworks inflammation genes
subInflGenes <- c("subnetwork BC inflGenes","subnetwork LC inflGenes","subnetwork MUO inflGenes",
              "subnetwork RA inflGenes","subnetwork RETT_FC inflGenes",
              "subnetwork RETT_TC inflGenes","subnetwork SLE inflGenes")
x <- 1
repeat {
  setCurrentNetwork(network = subInflGenes[x])
  fitContent()
  
  png.file <- file.path(getwd(), "Images", "subInflGeneNetworks", paste("subinflgene_expr_network", diseases[x], ".png", sep = ""))
  exportImage(png.file, type = "png", resolution=600, zoom=500)
  
  x = x + 1
  if (x == length(diseases)+1){
    print("Done!")
    break}
}
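
The subnetwork names typed out by hand in the two cells above follow the pattern used when the subnetworks were created, so they can also be derived from `diseases`. A small sketch:

```r
diseases <- c("BC", "LC", "MUO", "RA", "RETT_FC", "RETT_TC", "SLE")

# paste0() is vectorised, so one call builds all seven names
subgenes <- paste0("subnetwork ", diseases, " genes")
subInflGenes <- paste0("subnetwork ", diseases, " inflGenes")

subgenes
subInflGenes
```

Building the names this way keeps them in step with `diseases` if a dataset is added or removed.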

In [None]:
# save Cytoscape session file
session.file <- file.path(getwd(), "Network", "gene_expr_networks.cys")
saveSession(session.file)