Can combine multiple covariates and load as single covariates file #70

Open
shameem356 opened this issue Nov 17, 2021 · 4 comments

@shameem356

shameem356 commented Nov 17, 2021

Hello Team rMVP,

First of all thank you so much for your wonderful software.
I would like to clarify a doubt about multiple covariates. I have five PCs in a PC.txt file (pc1, pc2, pc3, pc4, pc5) and three scaling factor values in an scf.txt file (scf1, scf2, scf3). Can I combine these two files into a single file (pc_scf.txt) and load it as the covariates file using the commands below? If not, how can I use PC.txt and scf.txt as covariates files?

Note: pc_scf.txt has 8 columns (pc1, pc2, pc3, pc4, pc5, scf1, scf2, scf3).
MVP.Data.PC("pc_scf.txt", out="mvp.pc_scf",sep='\t')
Covariates_PC <- bigmemory::as.matrix(attach.big.matrix("mvp.pc_scf"))

@hyacz
Collaborator

hyacz commented Nov 17, 2021

Hello,
You can simply read pc_scf.txt with read.table and then encode the columns with the model.matrix function. You can refer to the following code:

cv <- model.matrix(~as.numeric(pc1)+as.numeric(pc2)+as.numeric(pc3)+as.numeric(pc4)+as.numeric(pc5)+as.factor(scf1)+as.factor(scf2)+as.factor(scf3), data=pc_scf)

MVP(..., CV.GLM=cv, CV.MLM=cv, CV.FarmCPU=cv, nPC.GLM=0, nPC.MLM=0, nPC.FarmCPU=0, ...)

Once you have calculated the PCs yourself and passed them through the CV.<model> parameter, please set nPC.<model> to 0 so that MVP does not add PCs automatically. MVP.Data.PC performs principal component analysis: its role is to obtain PCs from genotypes.
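As a rough illustration of the read.table and model.matrix steps described above, the sketch below writes a small toy file and reads it back. The toy values, the temporary file, and the tab-separated layout with a header row are assumptions for the example, not details of the real pc_scf.txt. The formula `~ .` treats every numeric column as a numeric covariate, which should match the as.numeric(...) terms when the columns are already numeric.

```r
# Toy stand-in for pc_scf.txt (assumption: tab-separated, with a header row).
set.seed(1)
toy <- as.data.frame(matrix(rnorm(6 * 8), nrow = 6))
names(toy) <- c(paste0("pc", 1:5), paste0("scf", 1:3))
path <- tempfile(fileext = ".txt")
write.table(toy, path, sep = "\t", quote = FALSE, row.names = FALSE)

# Read the combined file and encode all eight columns as numeric covariates.
pc_scf <- read.table(path, header = TRUE, sep = "\t")
cv <- model.matrix(~ ., data = pc_scf)

# One row per individual; columns are the intercept plus the 8 covariates.
dim(cv)
colnames(cv)
```

The resulting cv is an ordinary numeric matrix, so it can be passed to the CV.<model> arguments in the same way as in the call shown above.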

@shameem356
Author

Hello @hyacz,
Thank you so much for your quick reply and the code. I have updated my code as below based on your suggestion. I look forward to your feedback.

Converting the PLINK files to rMVP format

library(rMVP)
MVP.Data(fileBed="199sample_HF",
         filePhe=NULL,
         fileKin=TRUE,
         filePC=FALSE,
         #priority="speed",
         #maxLine=10000,
         out="mvp.199sample_HF"
)

Running the FarmCPU GWAS

genotype <- attach.big.matrix("mvp.199sample_HF.geno.desc")
phenotype <- read.table("179s_pheno.csv",head=TRUE)
map <- read.table("mvp.199sample_HF.geno.map" , head = TRUE)
Kinship <- attach.big.matrix("mvp.199sample_HF.kin.desc")
pc_scf<- read.table("179s_PC5_scf_for_mvp.csv",head=TRUE)
cv <- model.matrix(~as.numeric(PC1)+as.numeric(PC2)+as.numeric(PC3)+as.numeric(PC4)+as.numeric(PC5)+as.factor(SCF_Red)+as.factor(SCF_Green)+as.factor(SCF_Blue), data=pc_scf)

for(i in 2:ncol(phenotype)){
  imMVP <- MVP(
    phe=phenotype[, c(1, i)],
    geno=genotype,
    map=map,
    K=Kinship,
    CV.FarmCPU=cv,
    nPC.FarmCPU=0,
    priority="speed",
    ncpus=16,
    vc.method="BRENT",
    maxLoop=10,
    method.bin="FaST-LMM",
    #permutation.threshold=TRUE,
    #permutation.rep=100,
    threshold=0.05,
    method=c("FarmCPU")
  )
  gc()
}

@shameem356
Author

@hyacz ,

After running the above code, the log file shows 'Number of provided covariates of FarmCPU: 540'. 179s_PC5_scf_for_mvp.csv has 179 samples (5 PCs + 3 scaling factor values, 179 * 8 = 1432 values). Why does 'Number of provided covariates of FarmCPU' show 540?

@hyacz
Collaborator

hyacz commented Nov 18, 2021

The number of covariates reported in the log is the number of columns of the variable cv. There are 3 factors (SCF_Red, SCF_Green, SCF_Blue). model.matrix expands each factor into one indicator column per non-reference level, so because these factors have many levels, cv ends up with 540 columns.

I'm not sure if I understand your data correctly. If SCF is a categorical variable, this is ok. If SCF is a quantitative variable, then as.numeric(SCF) should be used instead of as.factor(SCF) in model.matrix.
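To see how the two encodings differ, here is a small sketch with made-up data for 179 individuals. It assumes, as an inference from the numbers in this thread rather than anything stated in it, that every SCF value is distinct. In that case each as.factor(SCF) term contributes 178 indicator columns, and 1 intercept + 5 PCs + 3 * 178 gives 540 columns.

```r
# Made-up data: 179 individuals, 5 PCs, 3 quantitative scaling factors.
# Assumption: every SCF value is distinct, as is typical for measurements.
set.seed(1)
n <- 179
pc_scf <- data.frame(
  PC1 = rnorm(n), PC2 = rnorm(n), PC3 = rnorm(n),
  PC4 = rnorm(n), PC5 = rnorm(n),
  SCF_Red   = seq(0.5, 1.5, length.out = n),
  SCF_Green = seq(1.5, 0.5, length.out = n),
  SCF_Blue  = seq(0.8, 1.2, length.out = n)
)

# Factor encoding: each SCF expands into (number of levels - 1) columns.
cv_factor <- model.matrix(
  ~ PC1 + PC2 + PC3 + PC4 + PC5 +
    as.factor(SCF_Red) + as.factor(SCF_Green) + as.factor(SCF_Blue),
  data = pc_scf
)

# Numeric encoding: each SCF stays a single column.
cv_numeric <- model.matrix(~ ., data = pc_scf)

ncol(cv_factor)   # 1 + 5 + 3 * 178 = 540
ncol(cv_numeric)  # 1 + 5 + 3 = 9
```

If the real SCF columns are measurements rather than group labels, the numeric form is probably the intended one.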

In addition, note that the order of individuals in cv must be consistent with the phenotype and genotype.
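One way to enforce that ordering, sketched under the assumption that both tables carry a sample ID column: the column name Taxa, the IDs, and all values below are hypothetical, and the thread does not say whether the real covariate file includes IDs. The covariate rows are reordered to follow the phenotype before model.matrix is called.

```r
# Hypothetical tables; "Taxa" is an assumed ID column name.
phenotype <- data.frame(
  Taxa  = c("s1", "s2", "s3", "s4"),
  trait = c(1.2, 0.7, 1.9, 1.1)
)
pc_scf <- data.frame(
  Taxa    = c("s3", "s1", "s4", "s2"),
  PC1     = c(0.3, -0.1, 0.5, -0.7),
  SCF_Red = c(1.03, 0.98, 1.10, 0.95)
)

# Look up each phenotype individual's row in the covariate table.
idx <- match(phenotype$Taxa, pc_scf$Taxa)
stopifnot(!anyNA(idx))  # stop if any individual has no covariates

# Reorder the covariates, then build the design matrix.
pc_scf <- pc_scf[idx, ]
cv <- model.matrix(~ PC1 + SCF_Red, data = pc_scf)

# Confirm the covariate rows now follow the phenotype order.
stopifnot(identical(pc_scf$Taxa, phenotype$Taxa))
```

The same check would also need to hold against the order of individuals in the genotype data.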
