This package provide functions that solve the following multi-response, group-sparse regularized Cox regression:
Where is the coefficient vector of the kth (out of K) responses, and are the coefficients of the jth variable on the K responses.
For genetics data in PLINK2 format, we provide a screening procedure similar to the one in this paper.
Currently mrcox only supports linux and 64-bit intel processors. It requires Intel's Math Kernel Library (MKL). I suspect you can still get it to run on AMD processors but performance could be significantly worse. To install MKL:
- Register and download MKL from https://software.intel.com/content/www/us/en/develop/tools/math-kernel-library/choose-download/linux.html
- Under Choose Product to Download, select Intel Math Kernel Library for linux
- Untar the downloaded file, run
./install.sh
. After installation is done,source intel/mkl/bin/mklvars.sh intel64
.
You will also need to install the R dependencies of mrcox (Rcpp, RcppEigen).
- zstd(>=1.4.4). It can be built from source or simply available from conda, pip or brew
- PLINK2
library(devtools)
install_github("chrchang/plink-ng", subdir="/2.0/cindex")
install_github("chrchang/plink-ng", subdir="/2.0/pgenlibr")
Once dependencies are installed, run the following in R
devtools::install_github("rivas-lab/multisnpnet-Cox")
library(mrcox)
# Simulate some data
n = 1000
p = 5000
X = matrix(rnorm(n*p), n, p)
y1 = rexp(n) * exp(X %*% rbinom(p, 1, 0.1))
y2 = rexp(n) * exp(X %*% rbinom(p, 1, 0.1))
s1 = rbinom(n, 1, 0.3)
s2 = rbinom(n, 1, 0.3)
y_list = list(y1, y2)
s_list = list(s1, s2)
# Initialize coefficient matrix at 0
B = matrix(0.0, p, 2)
# Compute residual at B
res = get_residual(X, y_list, s_list, B)
# Compute the gradient of B
g = t(X) %*% res
# Get the dual norm and lambda sequence
alpha = sqrt(2)
dnorm = get_dual_norm(g, alpha)
lambda_max = max(dnorm)
lambda_min = 0.001 * lambda_max
lambda_seq = exp(seq(from = log(lambda_max), to = log(lambda_min), length.out = 100))
# Fit a model
fit = solve_aligned(X, y_list, s_list, lambda_seq, lambda_seq * alpha)
fit = fit[['result']] # fit also contains the residuals at each solution