CoReg is a computational tool for identifying co-regulatory genes in large-scale networks. We provide CoReg as an R package, which can be applied to any organism with genome-scale regulatory network data.
To install CoReg from GitHub, first install the 'devtools' package for R:
install.packages("devtools")
Load the devtools package:
library(devtools)
Install the CoReg package from GitHub:
install_github("LiLabAtVT/CoReg")
This tutorial gives an example of identifying co-regulatory modules and performing a rewiring simulation in the Arabidopsis transcription network using the CoReg package.
Load the CoReg package in R:
library(CoReg)
Load the Arabidopsis example network (use the networkFromFile() function if you want to load your own network for analysis):
data(athNet)
Below is an example of using networkFromFile() to load another network. Note that the network file should be formatted as a two-column edge list: the first column holds the IDs of transcription factors and the second column holds the IDs of their target genes.
athNet<-networkFromFile("araNet.csv",",")
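To make the expected file layout concrete, the sketch below writes a small two-column edge list to a temporary CSV file using only base R. The gene IDs are made up for illustration, and whether networkFromFile() expects a header row is not stated here, so this sketch assumes none.

```r
# Hypothetical TF -> target pairs; these IDs are placeholders, not real annotations.
edges <- data.frame(
  tf     = c("TF_A", "TF_A", "TF_B"),
  target = c("GENE_1", "GENE_2", "GENE_2"),
  stringsAsFactors = FALSE
)

# Write a comma-separated file with no header, row names, or quoting (assumed layout).
csv_path <- tempfile(fileext = ".csv")
write.table(edges, csv_path, sep = ",", row.names = FALSE,
            col.names = FALSE, quote = FALSE)

# Inspect the raw lines to confirm the layout.
file_lines <- readLines(csv_path)
print(file_lines)
```

The resulting path could then be passed as the first argument of networkFromFile(), with "," as the separator, as in the call above.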
Alternatively, you can convert a two-column data frame into an igraph graph object using networkFromEdgeList(). The first column of the data frame should contain the IDs of transcription factors and the second column their corresponding target genes.
networkFromEdgeList(edgelist)
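As a minimal sketch, an edge list data frame of the shape described above can be built in base R as follows. The column names and gene IDs are illustrative assumptions; the text only specifies the column order. The CoReg call is guarded so the snippet also runs where the package is not installed.

```r
# Illustrative edge list: column 1 = transcription factor, column 2 = target gene.
edgelist <- data.frame(
  tf     = c("TF_A", "TF_A", "TF_B", "TF_B"),
  target = c("GENE_1", "GENE_2", "GENE_2", "GENE_3"),
  stringsAsFactors = FALSE
)

# Drop duplicated edges before conversion
# (an optional precaution, not a documented requirement).
edgelist <- unique(edgelist)

# Only call CoReg if it is available in this R session.
if (requireNamespace("CoReg", quietly = TRUE)) {
  net <- CoReg::networkFromEdgeList(edgelist)
}
```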
Identify co-regulatory modules in the Arabidopsis network:
athRes<-CoReg(athNet)
Show the module-finding result:
athRes$module
Show the similarity matrix:
athRes$similarity_matrix
The rewSim() function performs a rewiring simulation on the Arabidopsis network and reports the rewiring recall score. CoReg is compared with three other module-finding methods: label propagation, edge betweenness, and walktrap. In the example below, 50 nodes are duplicated and the rewiring probabilities are 0.3 and 0.5.
simRes<-rewSim(athNet,nDup = 50, dDup = 10, c(0.3,0.5),c("lp","wt","eb"),2)
Show the simulation result:
print(simRes)
simRes$evalResult
Plot the rewiring recall score curves
plot(simRes)
The computeAuROC() function computes simulation-based AUC values and ROC curves. The network simulation process is the same as networkSim(). The AUC and ROC for CoReg combined with each of three similarity indices are computed:
auROCres <- computeAuROC(athNet,nDup=50,dDup=10,rewProb=0.5,simMethods=c("jaccard","geometric","invlogweighted","wt"))
Show AUC values
auROCres$AUC
Plot ROC curves
plot(auROCres)
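For intuition about the "jaccard" index named above, the Jaccard similarity of two sets is the size of their intersection divided by the size of their union. The base R sketch below applies that textbook definition to the target sets of three made-up regulators. It is not CoReg's implementation, which may define neighbor sets differently.

```r
# Textbook Jaccard similarity between two sets of gene IDs.
jaccard <- function(a, b) {
  u <- union(a, b)
  if (length(u) == 0) return(0)  # convention chosen here for two empty sets
  length(intersect(a, b)) / length(u)
}

# Hypothetical target sets for three regulators.
targets <- list(
  TF_A = c("G1", "G2", "G3"),
  TF_B = c("G2", "G3", "G4"),
  TF_C = c("G5")
)

# Pairwise similarity matrix over all regulators.
tfs <- names(targets)
sim <- outer(tfs, tfs,
             Vectorize(function(i, j) jaccard(targets[[i]], targets[[j]])))
dimnames(sim) <- list(tfs, tfs)
print(sim)
```

Regulators that share most of their targets, such as TF_A and TF_B here, get a high score, while regulators with disjoint targets score zero.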
netSimAndEval() calls the function generateSimNet() to generate simulated network(s) with a pre-specified modular structure, and then runs different module-finding algorithms to identify modules. The agreement between the pre-specified modules and the algorithm-identified modules is measured with the NMI (normalized mutual information) score. The following example generates a simulated network with 5 pre-specified modules. Each module has 10 regulators and each regulator has 20 targets. There are 100 other nodes in the network which do not have outgoing edges (auxiliary nodes). The co-regulation probability for the simulated network is 0.5. After the network is constructed, four clustering methods (as specified by testMethods) are run to identify modules in the simulated network.
re<-netSimAndEval(10,5,20,100,0.5,testMethods=c("coregJac","lp","wt","eb"))
See the summary of simulated network(s)
summary(re)
Plot the evaluation result
plot(re)
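The NMI score used above compares two partitions of the same nodes. The sketch below computes one common variant, 2 * I(X;Y) / (H(X) + H(Y)), in base R on toy labels. The normalization CoReg uses is not stated here, so treat this as an illustration of the idea rather than the package's exact calculation.

```r
# Shannon entropy (natural log) of a probability vector, ignoring zero entries.
entropy <- function(p) {
  p <- p[p > 0]
  -sum(p * log(p))
}

# Normalized mutual information between two label vectors of equal length.
nmi <- function(x, y) {
  stopifnot(length(x) == length(y))
  joint <- table(x, y) / length(x)   # joint distribution of the two labelings
  hx <- entropy(rowSums(joint))
  hy <- entropy(colSums(joint))
  if (hx + hy == 0) return(1)        # both partitions consist of a single cluster
  mi <- hx + hy - entropy(as.vector(joint))
  2 * mi / (hx + hy)
}

truth     <- c(1, 1, 1, 2, 2, 2)
relabeled <- c("b", "b", "b", "a", "a", "a")  # same grouping, different labels
shuffled  <- c(1, 2, 1, 2, 1, 2)              # grouping mostly unrelated to truth

print(nmi(truth, relabeled))
print(nmi(truth, shuffled))
```

Because the score depends only on how nodes are grouped, renaming the cluster labels leaves it unchanged, which is why it suits comparing algorithm output against a ground-truth partition.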
CoReg can also generate a simulated network for other uses. The simulated network is returned as an edge list, represented by a two-column matrix in R. In the example below, we use the same parameters as in the previous example to generate a simulated network:
re<-generateSimNet(10,5,20,100,0.5)
# This is the edge list
re$el
# This is the ground-truth module partition
re$modulePartition
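An edge list of this shape is easy to post-process in base R. The sketch below uses a small hand-made two-column matrix as a stand-in for re$el (the real object requires CoReg, and the node names here are made up) and separates regulators, meaning nodes with outgoing edges, from nodes that only appear as targets.

```r
# Hand-made stand-in for an edge list matrix: column 1 = source, column 2 = target.
el <- matrix(
  c("R1", "T1",
    "R1", "T2",
    "R2", "T2",
    "R2", "R1"),
  ncol = 2, byrow = TRUE
)

regulators   <- unique(el[, 1])                 # nodes with at least one outgoing edge
all_nodes    <- unique(as.vector(el))           # every node mentioned in the edge list
targets_only <- setdiff(all_nodes, regulators)  # nodes with no outgoing edges

# Out-degree per regulator.
out_degree <- table(el[, 1])
print(out_degree)
```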
Please cite the following paper if CoReg is used in your publication:
Song Q, Grene R, Heath LS, Li S: Identification of regulatory modules in genome scale transcription regulatory networks. BMC Syst Biol 2017, 11:140.
https://bmcsystbiol.biomedcentral.com/articles/10.1186/s12918-017-0493-2