# Automating Cytoscape network creation using the RCy3 package
We use the RCy3 package in R to automate the workflow for creating a STRING network. We use multiple gene expression datasets acquired from different studies. These datasets contain gene expression profiles of patients with various diseases. The diseases used in this notebook are: lung cancer (LC), breast cancer (BC), metabolically unhealthy obesity (MUO), rheumatoid arthritis (RA), dilated cardiomyopathy (DCM), ischemic cardiomyopathy (ICM) and systemic lupus erythematosus (SLE).

The network will consist of pathways, pathway clusters and the genes that occur in these pathways. The pathways were selected based on inflammation-associated genes retrieved from DisGeNET and GeneCards.

In [1]:
# check working directory
getwd()

In [2]:
# load libraries
library(RCy3)

### We first start by opening Cytoscape and checking that we are connected.
Then we load all the necessary files.

In [3]:
# check connection with Cytoscape
cytoscapePing()

In [4]:
# load data
inflgenes <- read.table(file.path(getwd(), "merged_infl_genes.txt"), header = TRUE, sep = "\t")
pwgenes <- read.table(file.path(getwd(), "pwgenes.txt"), header = TRUE, sep = "\t")
expr_data <- read.table(file.path(getwd(), "merged_data_final.txt"), header = TRUE, sep = "\t")

# merge data files
inflgenes <- inflgenes[c(-1,-3,-4,-5)]
pwgenes <- pwgenes[c(-1,-3)]
colnames(pwgenes)[1] <- "entrezgene"
genes <- unique(rbind(inflgenes, pwgenes))

# label every gene as either an inflammation gene ("InflGene") or a regular gene ("Gene")
genes$type <- "Gene"
genes$type[genes$entrezgene %in% inflgenes$entrezgene] <- "InflGene"
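
The merge-and-label step above can be tried out on a small example without the input files. The following is a minimal sketch with made-up identifiers; `infl_toy`, `pw_toy` and `genes_toy` are hypothetical stand-ins for `inflgenes`, `pwgenes` and `genes`, and it assumes both tables have already been reduced to a single `entrezgene` column.

```r
# Toy stand-ins for the two gene tables (made-up identifiers, not the real files)
infl_toy <- data.frame(entrezgene = c("3569", "7124"), stringsAsFactors = FALSE)
pw_toy   <- data.frame(entrezgene = c("7124", "1956"), stringsAsFactors = FALSE)

# Stack both tables and drop duplicated rows
genes_toy <- unique(rbind(infl_toy, pw_toy))

# Default label is "Gene"; genes found in the inflammation table become "InflGene"
genes_toy$type <- "Gene"
genes_toy$type[genes_toy$entrezgene %in% infl_toy$entrezgene] <- "InflGene"

print(genes_toy)
```

A gene that occurs in both tables ends up once in the result and is labelled as an inflammation gene.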

Let's install the STRING app in Cytoscape before we move on.

In [5]:
# install string
installApp('STRINGapp')
if ("string" %in% commandsHelp("")) {
  print("Success: the STRING app is installed")
} else {
  print("Warning: STRING app is not installed. Please install the STRING app before proceeding.")
}

[1] "Available namespaces:"
[1] "Success: the STRING app is installed"


### Network creation
Let's create the network using the STRING command API.

In [6]:
# create string network from genes and set visual style to default
string_cmd <- paste('string protein query taxonID=9606 cutoff=0.9 query="',paste(genes$entrezgene, collapse=","),'"',sep="")
commandsGET(string_cmd)

setVisualStyle("default")
setNodeLabelMapping(table.column = "display name")
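
It can help to inspect the command string before sending it to Cytoscape. The sketch below rebuilds the same string from a few made-up identifiers; `toy_ids` and `toy_cmd` are hypothetical names used only for illustration, and no Cytoscape connection is needed to run it.

```r
# Made-up identifiers used only for illustration
toy_ids <- c("3569", "7124", "1956")

# Same construction as in the cell above: the IDs are joined with commas and
# wrapped in double quotes inside the command string
toy_cmd <- paste0('string protein query taxonID=9606 cutoff=0.9 query="',
                  paste(toy_ids, collapse = ","), '"')

print(toy_cmd)
```

Printing the string makes it easy to spot problems such as `NA` entries or stray whitespace in the identifier column before the query is run.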

In [7]:
# delete unnecessary columns
general_columns <- c("description", "full name", "database identifier", "name space",
                     "@id", "node type", "enhancedLabel Passthrough", "STRING style",
                     "species", "sequence", "namespace", "canonical", "canonical name",
                     "image", "interactor score", "target development level", "target family")
compartments <- c("cytosol", "chloroplast", "cytoskeleton", "extracellular", "endosome",
                  "endoplasmic reticulum", "golgi apparatus", "lysosome", "mitochondrion",
                  "nucleus", "vacuole", "peroxisome", "plasma membrane")
tissues <- c("adrenal gland", "atrium", "blood", "bone", "bone marrow", "eye", "gall bladder",
             "heart", "intestine", "kidney", "left atrium", "left ventricle", "liver", "lung",
             "lymph node", "muscle", "nervous system", "pancreas", "right atrium",
             "right ventricle", "saliva", "skin", "spleen", "stomach", "thyroid gland",
             "urine", "ventricle")

# compartment and tissue columns are named "compartment <name>" and "tissue <name>"
columns_to_delete <- c(general_columns,
                       paste("compartment", compartments),
                       paste("tissue", tissues))

for (column in columns_to_delete) {
  deleteTableColumn(column = column, table = "node")
}

## One gene did not get mapped properly to an HGNC symbol, namely TRPV1 (ID: ENSP00000459962). Let's fix this manually in the node table, column 'display name'!

We have to load our gene expression data file and merge it with the node table in Cytoscape.

In [8]:
# add gene expression data
loadTableData(expr_data, data.key.column = "hgnc_symbol", table.key.column = "display name")

# add gene type
loadTableData(genes, data.key.column = "entrezgene", table.key.column = "query term")
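
The key-based matching above is roughly analogous to a left join in base R. The sketch below illustrates that idea with made-up tables; `node_toy` and `expr_toy` are hypothetical, and `merge()` is used here only as an analogy, not as a description of how RCy3 works internally.

```r
# Toy node table and toy expression table (made-up values)
node_toy <- data.frame(`display name` = c("IL6", "TNF", "EGFR"),
                       check.names = FALSE, stringsAsFactors = FALSE)
expr_toy <- data.frame(hgnc_symbol = c("TNF", "IL6", "MYC"),
                       logFC_BC = c(1.4, -0.3, 2.1),
                       stringsAsFactors = FALSE)

# Left join: keep every node and attach expression values where the keys match
merged_toy <- merge(node_toy, expr_toy,
                    by.x = "display name", by.y = "hgnc_symbol",
                    all.x = TRUE)
print(merged_toy)

# Keys present in the expression data but absent from the network
unmatched <- setdiff(expr_toy$hgnc_symbol, node_toy[["display name"]])
print(unmatched)
```

Checking for unmatched keys in this way is a quick way to find genes, such as the manually corrected TRPV1 node, whose symbols do not line up between the data file and the network.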

Now we will cluster the network using community clusters (gLay).

In [9]:
# cluster network using gLay clustering
glay_cmd <- 'cluster glay clusterAttribute="__glayCluster" createGroups=false restoreEdges=false selectedOnly=false showUI=true undirectedEdges=true'
commandsGET(glay_cmd)

### Visualization 
Before we start to visualize our network, one manual step is needed in Cytoscape:
Go to 'Tools' -> 'Network Analyzer' -> 'Network Analysis' -> 'Analyze Network...' -> 'OK'

Now let's clone the network so that every disease dataset has its own network. Then we will map the gene expression onto the respective networks and visualize it.

Once a network is available for every dataset, we can set a visual style which maps the gene expression (logFC) and significance (p-value) for every gene node in the network per dataset.

In [10]:
# create network for every disease and map gene expression data
# lock width and height of nodes
lockNodeDimensions(TRUE)

# create a separate network for every gene expression set, using the clone network function
# we have 7 gene expression sets, so we repeat this function 7 times
diseases <- c("BC","LC","MUO","RA","DCM","ICM","SLE")

x <- 1
repeat {
  cloneNetwork(network = "String Network--clustered")
  setCurrentNetwork(network="String Network--clustered_1")
  renameNetwork(diseases[x])
  x = x + 1
  if (x == length(diseases)+1){
    print("Done!")
    break}
}

# Map logFC and pValue per disease for every respective network and add visual style to every network
x <- 1
repeat {
  setCurrentNetwork(network = diseases[x])
  
  style.name = paste("style_", diseases[x], sep = "")
  
  # map logFC of gene expression data per disease on respective networks
  mappings <- list(nodeFill <- mapVisualProperty("node fill color", table.column = paste("logFC_", diseases[x] , sep = ""), 
                                                 mapping.type = "continuous",
                                                 c(-1.00,0 ,1.00),
                                                 c("#0000FF", "#FFFFFF", "#FF0000")),
                   
                   # map significance per disease on respective networks
                   nodeBorder <- mapVisualProperty("node border paint", table.column = paste("PValue_", diseases[x], sep = ""),
                                                   mapping.type = "continuous",
                                                   table.column.values = c(0.00, 0.05, 1.00), 
                                                   c("#00FF00", "#FFFFFF", "#FFFFFF")),
                   
                   # map node shapes
                   nodeShape <- mapVisualProperty("Node Shape", table.column = "type", 
                                                  mapping.type = "discrete", 
                                                  table.column.values = c("Gene", "InflGene"),
                                                  c("DIAMOND", "VEE")),
                   
                   label <- mapVisualProperty("Node Label", table.column = "display name",
                                              mapping.type = "passthrough"))
  
  defaults <- list(NODE_FILL_COLOR = "#999999",
                   NODE_BORDER_PAINT = "#999999",
                   NODE_BORDER_WIDTH = 7,
                   NODE_LABEL_FONT_SIZE = 18,
                   NETWORK_TITLE = diseases[x])
  
  # create and set unique visual style per disease dataset
  createVisualStyle(style.name, defaults, mappings = mappings)
  setVisualStyle(style.name)
  
  x = x + 1
  if (x == length(diseases)+1){
    print("Done!")
    break}
}

[1] "Done!"
[1] "Done!"
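
To get a feel for what the continuous node fill mapping does, the colour gradient can be approximated in base R. The sketch below is only an approximation under stated assumptions: `logfc_to_hex` is a hypothetical helper, the interpolation is linear in RGB, and values outside the breakpoints are assumed to take the colour of the nearest breakpoint. Cytoscape's own rendering may differ in detail.

```r
# Breakpoints and colours taken from the node fill mapping above
breaks  <- c(-1, 0, 1)
colours <- c("#0000FF", "#FFFFFF", "#FF0000")

# colorRamp() returns a function that interpolates linearly over [0, 1];
# this matches the mapping only because the breakpoints are evenly spaced
ramp <- grDevices::colorRamp(colours)

# Hypothetical helper: clamp logFC to the breakpoint range, rescale, interpolate
logfc_to_hex <- function(logfc) {
  clamped <- pmin(pmax(logfc, min(breaks)), max(breaks))
  scaled  <- (clamped - min(breaks)) / (max(breaks) - min(breaks))
  grDevices::rgb(ramp(scaled), maxColorValue = 255)
}

print(logfc_to_hex(c(-2, -1, 0, 1, 3)))
```

Under these assumptions, strongly downregulated genes appear fully blue, unchanged genes white and strongly upregulated genes fully red, with intermediate shades in between.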


Before we select any nodes, we save images of the main networks, so that the images do not show selected nodes highlighted in yellow. (A selection can also be cleared afterwards with `clearSelection()` from RCy3.) Before running this step, make sure that no images with the same names already exist; otherwise this part will not work!

In [11]:
# save disease networks
x <- 1
repeat {
  setCurrentNetwork(network = diseases[x])
  fitContent()
  
  png.file <- file.path(getwd(), "images", "networks", paste("gene_expr_network", diseases[x], ".png", sep = ""))
  exportImage(png.file, type = "png", resolution=600, zoom=500)
  
  x = x + 1
  if (x == length(diseases)+1){
    print("Done!")
    break}
}

[1] "Done!"
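
The export loop assumes that the target folder exists and that no image with the same name is already present. A minimal sketch of how both conditions could be checked beforehand is shown below. It writes to a temporary directory rather than `getwd()`, and `base_dir`, `diseases_toy` and `png_files` are hypothetical names used only for this illustration.

```r
# Hypothetical base directory; the notebook itself writes under getwd()
base_dir <- tempdir()
diseases_toy <- c("BC", "LC")

# Create the target folder if it does not exist yet
out_dir <- file.path(base_dir, "images", "networks")
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)

# Build the same file names as the export loop and report any that already exist
png_files <- file.path(out_dir, paste0("gene_expr_network", diseases_toy, ".png"))
existing <- png_files[file.exists(png_files)]
if (length(existing) > 0) {
  message("Already present: ", paste(basename(existing), collapse = ", "))
}

print(basename(png_files))
```

Whether existing files should be removed or renamed is left to the user; the sketch only reports them.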


### Subnetwork creation
We are also interested in the significantly differentially expressed genes and the inflammation-associated genes. So let's create subnetworks based on these properties!

In [12]:
# create subnetwork of significant differentially expressed genes
x <- 1
repeat {
  setCurrentNetwork(network = diseases[x])
  
  createColumnFilter(filter.name = "expr genes",     column = paste("logFC_", diseases[x], sep = "") , c(-1.00,1.00), "IS_NOT_BETWEEN")
  createColumnFilter(filter.name = "sig genes",      column = paste("PValue_", diseases[x], sep = ""), 0.05, "LESS_THAN")
  sigexpr <- createCompositeFilter('combined filter', filter.list = c("sig genes", "expr genes"), type = "ALL")
  
  selectNodes(nodes = sigexpr$nodes, by.col = "display name")
  
  createSubnetwork(nodes = "selected", subnetwork.name = paste("subnetwork ", diseases[x], " genes", sep = ""))
  
  x = x + 1
  if (x == length(diseases)+1){
    print("Done!")
    break}
}

# create subnetwork of significant differentially expressed inflammation genes
x <- 1
repeat {
  setCurrentNetwork(network = diseases[x])
  
  createColumnFilter(filter.name = "Infl filter",    column = "type", "InflGene", "IS")
  createColumnFilter(filter.name = "expr genes",     column = paste("logFC_", diseases[x], sep = "") , c(-1.00,1.00), "IS_NOT_BETWEEN")
  createColumnFilter(filter.name = "sig genes",      column = paste("PValue_", diseases[x], sep = ""), 0.05, "LESS_THAN")
  sigexpr <- createCompositeFilter('combined filter', filter.list = c("Infl filter", "sig genes", "expr genes"), type = "ALL")
  
  
  selectNodes(nodes = sigexpr$nodes, by.col = "display name")
  
  createSubnetwork(nodes = "selected", subnetwork.name = paste("subnetwork ", diseases[x], " inflGenes",sep = ""))
  
  x = x + 1
  if (x == length(diseases)+1){
    print("Done!")
    break}
}

[1] "Done!"
[1] "Done!"
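
The filter logic used for the subnetworks can be checked on a small table in plain R. The sketch below uses made-up values for one disease; `nodes_toy` and its column `display_name` are hypothetical. How the Cytoscape filter treats values lying exactly on -1 or 1 has not been verified here, so the toy data avoids those boundary values.

```r
# Toy node table for one disease (made-up values)
nodes_toy <- data.frame(
  display_name = c("IL6", "TNF", "EGFR", "MYC"),
  type         = c("InflGene", "InflGene", "Gene", "Gene"),
  logFC_BC     = c(1.8, 0.4, -1.5, 2.2),
  PValue_BC    = c(0.01, 0.001, 0.03, 0.20),
  stringsAsFactors = FALSE
)

# "expr genes": logFC outside the interval from -1 to 1
is_expr <- abs(nodes_toy$logFC_BC) > 1
# "sig genes": p-value below 0.05
is_sig <- nodes_toy$PValue_BC < 0.05
# "Infl filter": node is labelled as an inflammation gene
is_infl <- nodes_toy$type == "InflGene"

# type = "ALL" corresponds to a logical AND of the individual filters
sig_expr_genes <- nodes_toy$display_name[is_expr & is_sig]
sig_expr_infl  <- nodes_toy$display_name[is_expr & is_sig & is_infl]

print(sig_expr_genes)
print(sig_expr_infl)
```

In this toy table, TNF is significant but not strongly regulated, and MYC is strongly regulated but not significant, so neither passes the combined filter.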


### Saving everything
We are almost finished! We already saved the images of the main networks for every disease, but we still want to save the images of the subnetworks. So let's do that, and afterwards we will save the Cytoscape session so nothing is lost.

In [13]:
# save subnetworks genes
subgenes <- c("subnetwork BC genes","subnetwork LC genes","subnetwork MUO genes",
              "subnetwork RA genes","subnetwork DCM genes",
              "subnetwork ICM genes","subnetwork SLE genes")
x <- 1
repeat {
  setCurrentNetwork(network = subgenes[x])
  fitContent()
  
  png.file <- file.path(getwd(), "images", "subGeneNetworks", paste("subgene_expr_network", diseases[x], ".png", sep = ""))
  exportImage(png.file, type = "png", resolution=600, zoom=500)
  
  x = x + 1
  if (x == length(diseases)+1){
    print("Done!")
    break}
}

# save subnetworks inflammation genes
subInflGenes <- c("subnetwork BC inflGenes","subnetwork LC inflGenes","subnetwork MUO inflGenes",
                  "subnetwork RA inflGenes","subnetwork DCM inflGenes",
                  "subnetwork ICM inflGenes","subnetwork SLE inflGenes")
x <- 1
repeat {
  setCurrentNetwork(network = subInflGenes[x])
  fitContent()
  
  png.file <- file.path(getwd(), "images", "subInflGeneNetworks", paste("subinflgene_expr_network", diseases[x], ".png", sep = ""))
  exportImage(png.file, type = "png", resolution=600, zoom=500)
  
  x = x + 1
  if (x == length(diseases)+1){
    print("Done!")
    break}
}

# save Cytoscape session file
session.file <- file.path(getwd(), "network", "string_networks.cys")
saveSession(session.file)

[1] "Done!"
[1] "Done!"
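
The subnetwork names listed by hand above follow a fixed pattern, so they could also be generated from the disease abbreviations. This is a minimal sketch of that alternative; `disease_codes`, `sub_genes_toy` and `sub_infl_toy` are hypothetical names, and the pattern is the one used when the subnetworks were created.

```r
# Disease abbreviations in the same order as in the notebook
disease_codes <- c("BC", "LC", "MUO", "RA", "DCM", "ICM", "SLE")

# paste0() is vectorised, so one call yields all seven names
sub_genes_toy <- paste0("subnetwork ", disease_codes, " genes")
sub_infl_toy  <- paste0("subnetwork ", disease_codes, " inflGenes")

print(sub_genes_toy)
print(sub_infl_toy)
```

Generating the names keeps them in step with the list of diseases if a dataset is added or removed later.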
