TGPred v.0.1.0

TGPred is an R package that provides six efficient methods for predicting target genes of a transcription factor by integrating statistics, machine learning, and optimization:

  • HuberNet: Huber loss function along with Network-based penalty function;
  • HuberLasso: Huber loss function along with Lasso penalty function;
  • HuberENET: Huber loss function along with Elastic Net penalty function;
  • MSEENET: Mean square error loss function along with Elastic Net penalty function;
  • MSELasso: Mean square error loss function along with Lasso penalty function;
  • MSENet: Mean square error loss function along with Network-based penalty function;
  • APGD: the Accelerated Proximal Gradient Descent algorithm used to solve all six penalized regression models above (a minimal illustrative sketch follows this list).
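
For intuition, here is a minimal sketch of APGD for the HuberLasso case (Huber loss plus Lasso penalty). It is illustrative only, not the package's implementation: the step-size rule, the Huber threshold delta, and the soft-thresholding proximal operator are standard textbook choices rather than TGPred internals.

# Minimal APGD sketch for Huber loss + Lasso penalty (illustrative only;
# not the code inside TGPred).
huber_grad <- function(r, delta = 1) {
  # derivative of the Huber loss with respect to the residual r
  ifelse(abs(r) <= delta, r, delta * sign(r))
}
soft_threshold <- function(z, thresh) {
  # proximal operator of the L1 (Lasso) penalty
  sign(z) * pmax(abs(z) - thresh, 0)
}
apgd_huber_lasso <- function(X, y, lambda, niter = 2000, delta = 1) {
  t <- 1 / norm(X, "2")^2                              # conservative step size
  beta <- beta_old <- rep(0, ncol(X))
  for (k in seq_len(niter)) {
    v <- beta + (k - 1) / (k + 2) * (beta - beta_old)  # Nesterov momentum
    grad <- -crossprod(X, huber_grad(y - X %*% v, delta)) / nrow(X)
    beta_old <- beta
    beta <- soft_threshold(v - t * grad, t * lambda)   # gradient + prox step
  }
  drop(beta)
}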

A Python version is also available; see https://github.com/tobefuture/TGPred for its guideline.

Installation

You can install the released version of TGPred from GitHub with:

# install.packages("devtools")  # if devtools is not yet installed
devtools::install_github("xueweic/TGPred")

Reference

Xuewei Cao+, Ling Zhang+, Mingxia Zhao, Cheng He, Kui Zhang, Sanzhen Liu, Qiuying Sha*, Hairong Wei*. TGPred: Efficient methods for predicting target genes of a transcription factor by integrating statistics, machine learning, and optimization.

+ These authors contributed equally to this work.

Any questions? xueweic_AT_mtu_DOT_edu, lingzhan_AT_mtu_DOT_edu

Guideline

1. Simulated data

Step 1: Construct the network structure from either a Hierarchical Network ("HN") or a Barabasi-Albert Network ("BAN") for simulation studies.

  • In a Hierarchical Network, the number of genes must be a multiple of 100.
  • In a Barabasi-Albert Network, the number of genes must be a multiple of 10.
library(TGPred)
N_genes <- 200

# Hierarchical Network
Adj <- ConstructNetwork(N_genes, "HN")
# or: Barabasi-Albert Network
Adj <- ConstructNetwork(N_genes, "BAN")

Step 2: Calculate the Laplacian matrix and the symmetric normalized Laplacian matrix from the adjacency matrix, along with the covariance structure used to simulate X in Step 3.

# covariance structure implied by the network (used by SimulationData)
Sigma1 <- GraphicalModel(Adj)

# Laplacian matrix and symmetric normalized Laplacian matrix
res <- CalculateLaplacian(Adj)
L <- res$L
L_norm <- res$L_norm
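
For reference, the standard definitions are $L = D - A$ and $L_{norm} = D^{-1/2} L D^{-1/2}$, where $D$ is the diagonal degree matrix. A minimal hand-rolled sketch (illustrative; the zero-degree convention here is our assumption, not necessarily the package's):

# standard definitions, sketched by hand (CalculateLaplacian is the
# package's actual implementation)
deg <- rowSums(Adj)                              # node degrees
L_check <- diag(deg) - Adj                       # L = D - A
d_inv_sqrt <- ifelse(deg > 0, 1 / sqrt(deg), 0)  # guard isolated nodes
L_norm_check <- diag(d_inv_sqrt) %*% L_check %*% diag(d_inv_sqrt)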

Step 3: Simulate y and X from the given network structure (the adjacency matrix Adj and the covariance matrix Sigma1).

N_sample <- 300

# Barabasi-Albert Network
res <- SimulationData(N_sample, N_genes, Adj, Sigma1, "BAN", beta0 = 1)
y <- res$y
X <- res$X

# or, for a Hierarchical Network
res <- SimulationData(N_sample, N_genes, Adj, Sigma1, "HN", beta0 = 1)

2. Estimate Regression Coefficients by APGD or CVX

Calculate the estimated regression coefficients $\hat{\beta}$ using any of the six methods, solved by either APGD or CVX, for a given pair of $\alpha_0$ and $\lambda_0$.

  • HuberNet: Huber loss function along with Network-based penalty function.
lambda0 = 200
alpha0 = 0.5
beta_hat_APGD <- HuberNet_Beta(X, y, Adj, lambda0, alpha0, method="APGD", if.scale=TRUE)
plot(beta_hat_APGD)

# install.packages("CVXR")  # if CVXR is not yet installed
library("CVXR")
beta_hat_CVX <- HuberNet_Beta(X, y, Adj, lambda0, alpha0, method="CVX", if.scale=TRUE)
plot(beta_hat_CVX)
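
Both solvers minimize the same penalized objective, so the two estimates should agree up to numerical tolerance. A quick illustrative check (assuming both calls return comparable numeric coefficient vectors):

# maximum discrepancy between the APGD and CVX solutions; should be small
max(abs(beta_hat_APGD - beta_hat_CVX))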

  • HuberENET: Huber loss function along with Elastic Net penalty function.
lambda0 = 200
alpha0 = 0.5
beta_hat_APGD <- HuberENET_Beta(X, y, lambda0, alpha0, method="APGD", if.scale=TRUE)
library("CVXR")
beta_hat_CVX <- HuberENET_Beta(X, y, lambda0, alpha0, method="CVX", if.scale=TRUE)
  • HuberLasso: Huber loss function along with Lasso penalty function.
lambda0 = 200
beta_hat_APGD <- HuberLasso_Beta(X, y, lambda0, method="APGD", if.scale=TRUE)
library("CVXR")
beta_hat_CVX <- HuberLasso_Beta(X, y, lambda0, method="CVX", if.scale=TRUE)
  • MSEENET: Mean square error loss function along with Elastic Net penalty function.
lambda0 = 200
alpha0 = 0.5
beta_hat_APGD <- MSEENET_Beta(X, y, lambda0, alpha0, method="APGD", if.scale=TRUE)
library("CVXR")
beta_hat_CVX <- MSEENET_Beta(X, y, lambda0, alpha0, method="CVX", if.scale=TRUE)
  • MSELasso: Mean square error loss function along with Lasso penalty function.
lambda0 = 200
beta_hat_APGD <- MSELasso_Beta(X, y, lambda0, method="APGD", if.scale=TRUE)
library("CVXR")
beta_hat_CVX <- MSELasso_Beta(X, y, lambda0, method="CVX", if.scale=TRUE)
  • MSENet: Mean square error loss function along with Network-based penalty function.
lambda0 = 200
alpha0 = 0.5
beta_hat_APGD <- MSENet_Beta(X, y, Adj, lambda0, alpha0, method="APGD", if.scale=TRUE)
library("CVXR")
beta_hat_CVX <- MSENet_Beta(X, y, Adj, lambda0, alpha0, method="CVX", if.scale=TRUE)

3. Calculate Selection Probabilities by APGD

To avoid having to select a single optimal pair of tuning parameters $\lambda$ and $\alpha$, we can use a half-sample resampling method to calculate the selection probability of each predictor. The grid of $\lambda$ values for a given $\alpha$ under either the Huber or the MSE loss function can be calculated by

alpha <- 0.5
n_lambda <- 10
ratio <- 0.01

# Huber loss
lambda_set <- Lambda_grid(X, y, n_lambda, alpha, loss_func = "Huber", ratio)
# or: MSE loss
lambda_set <- Lambda_grid(X, y, n_lambda, alpha, loss_func = "MSE", ratio)

Selection probabilities can then be calculated with each method; a conceptual sketch of the underlying resampling scheme follows these examples.

## HuberNet
alphas <- seq(0.1,0.9,0.1)
n_lambda <- 10
B0 <- 100
ratio <- 0.01
SP_HuberNet = HuberNet_SP(X, y, Adj, alphas, n_lambda, ratio, B=B0, gamma=1000, niter=2000, timer=FALSE)

## HuberENET
SP_HuberENET = HuberENET_SP(X, y, alphas, n_lambda, ratio, B=B0, gamma=1000, niter=2000, timer=FALSE)

## MSENet
SP_Net = MSENet_SP(X, y, Adj, alphas, n_lambda, ratio, B=B0, gamma=1000, niter=2000, timer=FALSE)

## MSEENET
SP_ENET = MSEENET_SP(X, y, alphas, n_lambda, ratio, B=B0, gamma=1000, niter=2000, timer=FALSE)

## HuberLasso
n_lambda <- 50
SP_HuberLasso = HuberLasso_SP(X, y, n_lambda, ratio, B=B0, gamma=1000, niter=2000, timer=FALSE)

## MSELasso
SP_Lasso = MSELasso_SP(X, y, n_lambda, ratio, B=B0, gamma=1000, niter=2000, timer=FALSE)
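
Conceptually, each *_SP function above repeats a half-sample scheme over the $(\alpha, \lambda)$ grid: fit the penalized model on a random half of the samples, record which predictors receive nonzero coefficients, and average over B replicates. The sketch below illustrates the idea for a single $\lambda$; fit_and_select() is a hypothetical stand-in for one penalized fit returning the indices of the selected predictors.

# illustrative half-sample resampling for selection probabilities
# (fit_and_select is a hypothetical helper, not part of TGPred)
selection_probability <- function(X, y, lambda, B = 100, fit_and_select) {
  counts <- numeric(ncol(X))
  for (b in seq_len(B)) {
    half <- sample(nrow(X), floor(nrow(X) / 2))   # random half-sample
    picked <- fit_and_select(X[half, , drop = FALSE], y[half], lambda)
    counts[picked] <- counts[picked] + 1
  }
  counts / B   # proportion of half-samples selecting each predictor
}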

4. Real Data Analysis

Example datasets

  • Sample datasets (TF, PWG, Annotation) for analysis using HuberNet and MSENet;
  • Sample datasets (TF, PWG) for analysis using HuberLasso, MSELasso, HuberENET, and MSEENET.

Example code for HuberNet penalized regression.

library(TGPred)
Sample_data <- TGPred::Sample_data
X <- Sample_data$PWG
y <- Sample_data$TF
Annotation <- Sample_data$Annotation

## obtain adjacency matrix from Annotation file
Annotation <- data.matrix(Annotation)
Adj <- CalculateAdj(Annotation)
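
As an aside, if Annotation were a simple gene-by-pathway 0/1 indicator matrix, one plausible construction would connect genes sharing at least one pathway; this is only an illustration of the idea, and CalculateAdj above is the package's actual construction.

# illustrative only; assumes Annotation is a gene-by-pathway 0/1 matrix
Adj_sketch <- 1 * (Annotation %*% t(Annotation) > 0)
diag(Adj_sketch) <- 0    # remove self-loops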

## Estimate Regression Coefficients by APGD or CVX
lambda0 = 10
alpha0 = 0.5
beta_hat_APGD <- HuberNet_Beta(X, y, Adj, lambda0, alpha0, method="APGD", gamma=1000, niter=2000, crit_beta=1e-4, crit_obj=1e-8)
library("CVXR")
beta_hat_CVX <- HuberNet_Beta(X, y, Adj, lambda0, alpha0, method="CVX")

## Calculate Selection Probabilities by APGD (90 alpha-lambda pairs, 100 resamplings per pair)
alphas <- seq(0.1,0.9,0.1)
n_lambda <- 10
B0 <- 100
ratio <- 0.01
SP_HuberNet = HuberNet_SP(X, y, Adj, alphas, n_lambda, ratio, B=B0, gamma=1000, niter=2000, timer=FALSE)
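
Assuming SP_HuberNet is a numeric vector with one selection probability per candidate gene (one per column of X), a natural final step is to rank the genes and keep those above a chosen cutoff; both the vector layout and the 0.8 cutoff below are illustrative assumptions.

# rank candidate target genes by selection probability (layout and
# 0.8 cutoff are assumptions for illustration)
names(SP_HuberNet) <- colnames(X)
candidates <- sort(SP_HuberNet[SP_HuberNet > 0.8], decreasing = TRUE)
head(candidates)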

Your Own Datasets

To analyze your own datasets, prepare y (the expression of the transcription factor), X (the expression of candidate target genes), and, for the network-based penalties, an adjacency matrix, then follow the same steps as above.
