Error in running MVP: Error in SetAll.bm(x, value) #11

Closed
qgao2 opened this issue Nov 21, 2017 · 22 comments
Labels
input error Wrong input data type or format

Comments

@qgao2

qgao2 commented Nov 21, 2017

Hi Xiaolei,
I prepared the genotype, phenotype, and map data with the MVP.Data function, but when I start the run I get the following error:

[1] "Input data has 198 individuals, 109914 markers"
[1] "Principal Component Analysis Start ..."
means for first 10 snps:
[1] 0 0 0 0 0 0 0 0 0 0
Error in SetAll.bm(x, value) :
Matrix dimensions do not agree with big.matrix instance set size.

Any idea why this error occurred? Thanks!

@qgao2
Author

qgao2 commented Nov 21, 2017

pheno data: 198 x 2
geno data : 109914 x 198
map file: 109914 x 3
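
The three shapes listed above can be cross-checked before calling MVP. The following is only a minimal sketch, assuming the genotype is stored as markers x individuals (as the reported sizes suggest). `check_mvp_dims` is a hypothetical helper, not part of MVP, and the toy matrix, data frames, and column names are illustrative stand-ins for the real big.matrix and files.

```r
# Hypothetical helper (not part of MVP): cross-check input shapes.
# Assumes geno is markers x individuals, as in the sizes reported above.
check_mvp_dims <- function(phe, geno, map) {
  problems <- character(0)
  if (ncol(geno) != nrow(phe)) {
    problems <- c(problems, sprintf(
      "geno has %d individuals (columns) but phe has %d rows",
      ncol(geno), nrow(phe)))
  }
  if (nrow(geno) != nrow(map)) {
    problems <- c(problems, sprintf(
      "geno has %d markers (rows) but map has %d rows",
      nrow(geno), nrow(map)))
  }
  problems
}

# Toy data with the same layout, scaled down: 6 markers, 4 individuals.
geno <- matrix(0, nrow = 6, ncol = 4)
phe  <- data.frame(id = paste0("ind", 1:4), trait = c(1.2, 0.7, 1.9, 1.1))
map  <- data.frame(SNP = paste0("snp", 1:6), Chr = rep(1:2, each = 3), Pos = 1:6)

ok_result  <- check_mvp_dims(phe, geno, map)        # consistent inputs
bad_result <- check_mvp_dims(phe[-1, ], geno, map)  # one individual dropped
print(ok_result)
print(bad_result)
```

An empty result means the shapes agree. This does not by itself explain the SetAll.bm error, it only rules out one class of input mismatch.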

@qgao2
Author

qgao2 commented Nov 21, 2017

genotype3 <- attach.big.matrix('test_hapmap_final.geno.desc')
phenotype3 <- read.table('test_hapmap_final.phe', header = TRUE)
map5 <- read.table('test_hapmap_final.map', header = TRUE)

imMVP3 <- MVP(
phe=phenotype3,
geno=genotype3,
map=map5,
#K=Kinship,
#CV.GLM=Covariates,
#CV.MLM=Covariates,
#CV.FarmCPU=Covariates,
nPC.GLM=5,
nPC.MLM=3,
nPC.FarmCPU=3,
perc=1,
priority="speed",
ncpus=5,
vc.method="EMMA",
maxLoop=10,
method.bin="FaST-LMM",#"FaST-LMM","EMMA", "static"
#permutation.threshold=TRUE,
#permutation.rep=100,
threshold=0.05,
method=c("GLM", "MLM", "FarmCPU")
)

@XiaoleiLiuBio
Collaborator

Could you please try

imMVP3 <- MVP(
phe=phenotype3,
geno=genotype3,
map=map5,
priority="speed",
ncpus=5,
method=c("GLM")
)

Please let me know if there is any error message.

@qgao2
Author

qgao2 commented Nov 24, 2017

Hi Xiaolei,
I tried the script you suggested and didn't see any errors.
'GLM ACCOMPLISHED'

@qgao2
Author

qgao2 commented Nov 24, 2017

Also, if I use the Kinship and Covariates generated by MVP.Data, pass them to K, CV.GLM, CV.MLM, and CV.FarmCPU, and skip the nPC.GLM and nPC.MLM arguments, there are no error messages.

@qgao2
Author

qgao2 commented Nov 24, 2017

OK, I was wrong. I do get some warning messages from the previous run. I am not sure what went wrong there.
#----------------------------------MVP ACCOMPLISHED----------------------------------#
Warning messages:
1: In MVP.Specify(GI = GM, GP = GP, bin.size = b, inclosure.size = s) :
NAs introduced by coercion
2: In ID.GP/bin.size :
longer object length is not a multiple of shorter object length
3: In MVP.Specify(GI = GM, GP = GP, bin.size = b, inclosure.size = s) :
NAs introduced by coercion
4: In max(numeric.chr, na.rm = TRUE) :
no non-missing arguments to max; returning -Inf
5: In max(numeric.chr, na.rm = TRUE) :
no non-missing arguments to max; returning -Inf

@XiaoleiLiuBio
Collaborator

Hi,

Thank you very much for your kind help. Could you please help me find where the problem is: the PC calculation or the kinship calculation?

imMVP3 <- MVP(
phe=phenotype3,
geno=genotype3,
map=map5,
CV.MLM=Covariates,
priority="speed",
ncpus=5,
vc.method="EMMA",
method=c("MLM")
)

Best,
Xiaolei

@qgao2
Author

qgao2 commented Nov 25, 2017

Thank you, Xiaolei. I tried several runs. It seems to be a FarmCPU-related error; please see No. 4.
By the way, is there any limit on the total number of SNPs in MVP?

  1. GWAS_MVP_mydata_test_GLM <- MVP(
    phe=phenotype3,
    geno=genotype3,
    map=map5,
    priority="speed",
    ncpus=5,
    method=c("GLM")
    )
    [1] "Input data has 198 individuals, 109914 markers"
    [1] "GWAS Start..."
    [1] "General Linear Model (GLM) Start ..."
    [1] "GLM ACCOMPLISHED"
    [1] "Visualization Start..."
    [1] "Phenotype distribution Plotting..."
    [1] "SNP_Density Plotting..."
    [1] "Circular_Manhattan Plotting SW.GLM..."
    [1] "Rectangular_Manhattan Plotting SW.GLM..."
    [1] "Q_Q Plotting SW.GLM..."
    #----------------------------------MVP ACCOMPLISHED----------------------------------#

  2. GWAS_MVP_mydata_test_MLM_PC <- MVP(
    phe=phenotype3,
    geno=genotype3,
    map=map5,
    CV.MLM=Covariates_mydata,
    priority="speed",
    ncpus=5,
    vc.method="EMMA",
    method=c("MLM")
    )
    [1] "Input data has 198 individuals, 109914 markers"
    [1] "GWAS Start..."
    [1] "Mixed Linear Model (MLM) Start ..."
    [1] "Calculating Kinship..."
    [1] "Variance components..."
    [1] "Variance Components Estimation is Done!"
    [1] "Eigen-Decomposition..."
    [1] "Eigen-Decomposition is Done!"
    [1] "MLM ACCOMPLISHED"
    [1] "Visualization Start..."
    [1] "Phenotype distribution Plotting..."
    [1] "SNP_Density Plotting..."
    [1] "Circular_Manhattan Plotting SW.MLM..."
    [1] "Rectangular_Manhattan Plotting SW.MLM..."
    [1] "Q_Q Plotting SW.MLM..."
    #----------------------------------MVP ACCOMPLISHED----------------------------------#

  3. GWAS_MVP_mydata_test_MLM_kinship <- MVP(
    phe=phenotype3,
    geno=genotype3,
    map=map5,
    K=Kinship_mydata,
    CV.MLM=Covariates_mydata,
    priority="speed",
    ncpus=5,
    vc.method="EMMA",
    method=c("MLM")
    )
    #------------------------------------------------------------------------------------------#
    [1] "Input data has 198 individuals, 109914 markers"
    [1] "GWAS Start..."
    [1] "Mixed Linear Model (MLM) Start ..."
    [1] "Variance components..."
    [1] "Variance Components Estimation is Done!"
    [1] "Eigen-Decomposition..."
    [1] "Eigen-Decomposition is Done!"
    [1] "MLM ACCOMPLISHED"
    [1] "Visualization Start..."
    [1] "Phenotype distribution Plotting..."
    [1] "SNP_Density Plotting..."
    [1] "Circular_Manhattan Plotting SW.MLM..."
    [1] "Rectangular_Manhattan Plotting SW.MLM..."
    [1] "Q_Q Plotting SW.MLM..."
    #----------------------------------MVP ACCOMPLISHED----------------------------------#

  4. GWAS_MVP_mydata_test_farmCPU <- MVP(
    phe=phenotype3,
    geno=genotype3,
    map=map5,
    K=Kinship_mydata,
    CV.FarmCPU=Covariates_mydata,
    priority="speed",
    ncpus=6,
    vc.method="EMMA",
    maxLoop=10,
    method.bin="FaST-LMM",#"FaST-LMM","EMMA", "static"
    threshold=0.05,
    method=c("FarmCPU")
    )
    [1] "FarmCPU ACCOMPLISHED"
    [1] "Visualization Start..."
    [1] "Phenotype distribution Plotting..."
    [1] "SNP_Density Plotting..."
    [1] "Circular_Manhattan Plotting SW.FarmCPU..."
    [1] "Rectangular_Manhattan Plotting SW.FarmCPU..."
    [1] "Q_Q Plotting SW.FarmCPU..."
    #----------------------------------MVP ACCOMPLISHED----------------------------------#
    Warning messages:
    1: In if (seqQTN.save != 0 & seqQTN.save != -1 & !is.null(seqQTN)) seqQTN = union(seqQTN, :
    the condition has length > 1 and only the first element will be used
    2: In if (seqQTN.save != 0 & seqQTN.save != -1 & !is.null(seqQTN)) seqQTN = union(seqQTN, :
    the condition has length > 1 and only the first element will be used
    3: In if (seqQTN.save != 0 & seqQTN.save != -1 & !is.null(seqQTN)) seqQTN = union(seqQTN, :
    the condition has length > 1 and only the first element will be used
    4: In if (seqQTN.save != 0 & seqQTN.save != -1 & !is.null(seqQTN)) seqQTN = union(seqQTN, :
    the condition has length > 1 and only the first element will be used

@XiaoleiLiuBio
Collaborator

Thank you very much for the information. Please try the following code and send me the error message.

imMVP3 <- MVP(
phe=phenotype3,
geno=genotype3,
map=map5,
nPC.GLM=3,
perc=1,
priority="speed",
ncpus=5,
method=c("GLM")
)

Best,
Xiaolei

@qgao2
Author

qgao2 commented Nov 27, 2017

Sure, Xiaolei. I ran the code you suggested and don't see any errors.

GWAS_MVP_mydata_test_GLM_PC <- MVP(
phe=phenotype_mydata,
geno=genotype_mydata,
map=map_mydata,
#K=Kinship_mydata,
#CV.FarmCPU=Covariates_mydata,
nPC.GLM=3,
#nPC.MLM=3,
#nPC.FarmCPU=3,
perc=1,
priority="speed",
ncpus=5,
#vc.method="EMMA",
#maxLoop=10,
#method.bin="FaST-LMM",#"FaST-LMM","EMMA", "static"
#permutation.threshold=TRUE,
#permutation.rep=100,
#threshold=0.05,
method=c("GLM")
)

[1] "Input data has 198 individuals, 109914 markers"
[1] "Principal Component Analysis Start ..."
means for first 10 snps:
[1] 0 0 0 0 0 0 0 0 0 0
[1] "GWAS Start..."
[1] "General Linear Model (GLM) Start ..."

[1] "GLM ACCOMPLISHED"
[1] "Visualization Start..."
[1] "Phenotype distribution Plotting..."
[1] "PCA plot2d..."
[1] "SNP_Density Plotting..."
[1] "Circular_Manhattan Plotting SW.GLM..."
[1] "Rectangular_Manhattan Plotting SW.GLM..."
[1] "Q_Q Plotting SW.GLM..."
#----------------------------------MVP ACCOMPLISHED----------------------------------#

@XiaoleiLiuBio
Collaborator

Thank you for your response. There is no problem when calculating PCs. Could you please try

imMVP3 <- MVP(
phe=phenotype3,
geno=genotype3,
map=map5,
#K=Kinship,
#CV.GLM=Covariates,
#CV.MLM=Covariates,
#CV.FarmCPU=Covariates,
nPC.GLM=5,
nPC.MLM=3,
#nPC.FarmCPU=3,
perc=1,
priority="speed",
ncpus=5,
vc.method="EMMA",
maxLoop=10,
#method.bin="FaST-LMM",#"FaST-LMM","EMMA", "static"
#permutation.threshold=TRUE,
#permutation.rep=100,
threshold=0.05,
method=c("GLM", "MLM")
)

and

imMVP3 <- MVP(
phe=phenotype3,
geno=genotype3,
map=map5,
#K=Kinship,
#CV.GLM=Covariates,
#CV.MLM=Covariates,
#CV.FarmCPU=Covariates,
nPC.GLM=5,
nPC.MLM=3,
nPC.FarmCPU=3,
perc=1,
priority="speed",
ncpus=5,
vc.method="EMMA",
maxLoop=10,
method.bin="FaST-LMM",#"FaST-LMM","EMMA", "static"
#permutation.threshold=TRUE,
#permutation.rep=100,
threshold=0.05,
method=c("GLM", "MLM", "FarmCPU")
)

@qgao2
Author

qgao2 commented Nov 28, 2017

Xiaolei, thanks for your help.
Yes, the PC calculation looks fine. I got the same error when I ran both of the code blocks above.

================================================================

[1] "Input data has 198 individuals, 109914 markers"
[1] "Principal Component Analysis Start ..."
 means for first 10 snps:
 [1] 0 0 0 0 0 0 0 0 0 0
Error in SetAll.bm(x, value) : 
  Matrix dimensions do not agree with big.matrix instance set size.

===============

imMVP3 <- MVP(
phe=phenotype3,
geno=genotype3,
map=map5,
#K=Kinship,
#CV.GLM=Covariates,
#CV.MLM=Covariates,
#CV.FarmCPU=Covariates,
nPC.GLM=5,
nPC.MLM=3,
#nPC.FarmCPU=3,
perc=1,
priority="speed",
ncpus=5,
vc.method="EMMA",
maxLoop=10,
#method.bin="FaST-LMM",#"FaST-LMM","EMMA", "static"
#permutation.threshold=TRUE,
#permutation.rep=100,
threshold=0.05,
method=c("GLM", "MLM")
)

@XiaoleiLiuBio
Collaborator

Thank you for helping me by running so many tests. Could you please select a subset of markers and put together demo code with which I can reproduce the error? Then my team will debug it. Thanks again for your help.

@qgao2
Author

qgao2 commented Dec 1, 2017 via email

@XiaoleiLiuBio
Collaborator

Please send the data to xiaoleiliu@mail.hzau.edu.cn. Thank you.

@XiaoleiLiuBio
Collaborator

Thank you for sharing the data with me. MVP uses genotype data in 'big.matrix' format, which is a data image, so I cannot use the data to reproduce the error. But according to the messages in "MVP_run_qgao2.txt", there are only warning messages and no error message.

@qgao2
Author

qgao2 commented Dec 15, 2017

Thank you, Xiaolei.
I just ran the pipeline with my whole SNP data set (609914 markers in total). As before, I made the kinship and PC files with the MVP.Data function and then ran the following script:

for(i in 2:ncol(phenotype_allsnps)){
imMVP_allsnps <- MVP(
phe=phenotype_allsnps[, c(1, i)],
geno=genotype_allsnps,
map=map_allsnps,
K=kinship_allsnps,
CV.GLM=covariates_allsnps,
CV.MLM=covariates_allsnps,
CV.FarmCPU=covariates_allsnps,
#nPC.GLM=5,
#nPC.MLM=3,
#nPC.FarmCPU=3,
#perc=1,
priority="speed",
ncpus=5,
vc.method="EMMA",
maxLoop=10,
method.bin="FaST-LMM",#"FaST-LMM","EMMA", "static"
#permutation.threshold=TRUE,
#permutation.rep=100,
threshold=0.05,
method=c("GLM", "MLM", "FarmCPU")
)
}

GLM and MLM completed, but FarmCPU did not:

[1] "Input data has 198 individuals, 609914 markers"
[1] "GWAS Start..."
[1] "General Linear Model (GLM) Start ..."

[1] "GLM ACCOMPLISHED"
[1] "Mixed Linear Model (MLM) Start ..."
[1] "Variance components..."
[1] "Variance Components Estimation is Done!"
[1] "Eigen-Decomposition..."
[1] "Eigen-Decomposition is Done!"

[1] "MLM ACCOMPLISHED"
[1] "FarmCPU Start ..."
[1] "Current loop: 1 out of maximum of 10"
[1] "seqQTN"
NULL
[1] "scanning..."
[1] "number of covariates in current loop is:"
[1] 5

[1] "Current loop: 2 out of maximum of 10"
[1] "Optimizing Pseudo QTNs..."
Error in if (!is.null(snp.pool) && var(snp.pool) == 0) { :
missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In MVP.Specify(GI = GM, GP = GP, bin.size = b, inclosure.size = s) :
NAs introduced by coercion
2: In ID.GP/bin.size :
longer object length is not a multiple of shorter object length
3: In MVP.Specify(GI = GM, GP = GP, bin.size = b, inclosure.size = s) :
NAs introduced by coercion
4: In MVP.Specify(GI = GM, GP = GP, bin.size = bin, inclosure.size = inc) :
NAs introduced by coercion
5: In MVP.Specify(GI = GM, GP = GP, bin.size = bin, inclosure.size = inc) :
NAs introduced by coercion

@qgao2
Author

qgao2 commented Dec 15, 2017

Error in if (!is.null(snp.pool) && var(snp.pool) == 0) { :
missing value where TRUE/FALSE needed

What does this error mean?

@XiaoleiLiuBio
Collaborator

snp.pool represents the SNPs selected to build a K matrix, which is used for maximum likelihood estimation with the FaST-LMM method. The error means that '!is.null(snp.pool) && var(snp.pool) == 0' evaluated to a missing value (NA), but 'if' needs TRUE or FALSE. Please send me data that reproduces the problem, and I will test and debug it. Please send the genotype, phenotype, and covariates data together.
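
For readers hitting the same message, here is a small standalone sketch of how such a condition can become NA. It assumes, purely for illustration, that the pool contains a missing value; the thread does not show what snp.pool actually held, and the toy vectors below are made up.

```r
# A constant pool has zero variance, so the comparison is TRUE.
snp.pool <- c(1, 1, 1)
cond_constant <- !is.null(snp.pool) && var(snp.pool) == 0

# If the pool contains NA, var() returns NA and so does the comparison.
snp.pool <- c(0, 1, NA)
cond_na <- !is.null(snp.pool) && var(snp.pool) == 0

# A bare if (cond_na) stops with an error; tryCatch captures its message.
err <- tryCatch(if (cond_na) "constant" else "variable",
                error = function(e) conditionMessage(e))
print(err)

# isTRUE() maps NA to FALSE, so the branch is skipped instead of failing.
if (isTRUE(cond_na)) {
  message("pool is constant")
} else {
  message("pool is not known to be constant")
}
```

Guarding with `isTRUE()` only hides the symptom; the NA still has to be traced back to the input data.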

@XiaoleiLiuBio
Collaborator

Thank you for sharing the data with me and pointing out the issue. I finally found the cause: it is the map file. In your hapmap data, there is an "Ha" prefix before the chromosome number in the third column. Please remove the "Ha" and then mvp.farmcpu will work.
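
A rough sketch of that cleanup in base R, for anyone with a similar map file. The data frame, its column names, and its values are illustrative assumptions rather than the real file; only the "Ha" prefix comes from this thread.

```r
# Toy map; column names and values are illustrative, not from the real file.
map <- data.frame(
  SNP = c("s1", "s2", "s3"),
  Chr = c("Ha1", "Ha2", "Ha17"),
  Pos = c(1050, 22010, 3300),
  stringsAsFactors = FALSE
)

# Converting the prefixed labels directly gives NA ("NAs introduced by
# coercion"); the warning is suppressed here to keep the output short.
chr_raw <- suppressWarnings(as.numeric(map$Chr))
print(chr_raw)

# Remove a leading "Ha" and convert again.
map$Chr <- as.numeric(sub("^Ha", "", map$Chr))
print(map)
```

With every chromosome label turning into NA, a call such as `max(x, na.rm = TRUE)` has nothing left to work on, which is consistent with the "returning -Inf" warnings reported earlier.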

@qgao2
Author

qgao2 commented Dec 26, 2017

Thank you so much, Xiaolei!
I reformatted the chromosome column, and it works!

I also got the following warning messages.
Warning messages:
1: In ID.GP/bin.size :
longer object length is not a multiple of shorter object length
2: In ID.GP/bin.size :
longer object length is not a multiple of shorter object length
3: In if (seqQTN.save != 0 & seqQTN.save != -1 & !is.null(seqQTN)) seqQTN = union(seqQTN, :
the condition has length > 1 and only the first element will be used
4: In ID.GP/bin.size :
longer object length is not a multiple of shorter object length
5: In if (seqQTN.save != 0 & seqQTN.save != -1 & !is.null(seqQTN)) seqQTN = union(seqQTN, :
the condition has length > 1 and only the first element will be used
6: In ID.GP/bin.size :
longer object length is not a multiple of shorter object length
7: In if (seqQTN.save != 0 & seqQTN.save != -1 & !is.null(seqQTN)) seqQTN = union(seqQTN, :
the condition has length > 1 and only the first element will be used
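
As a side note on the "condition has length > 1" warning: it appears when a vector-valued comparison built with `&` is handed to `if`. The sketch below uses a made-up vector `x` as a stand-in. Which reduction (`all()` or `any()`) matches the intent of the FarmCPU code is not something this thread settles.

```r
# Stand-in for a vector-valued quantity; the values are made up.
x <- c(3, 0, 15)

# Elementwise & returns one logical per element, not a single TRUE/FALSE.
cond <- x != 0 & x != -1
n_cond <- length(cond)
print(cond)

# An explicit reduction yields the single value that if() expects.
all_ok <- all(cond)
any_ok <- any(cond)
if (all_ok) message("every element passes") else message("some element fails")
```

Older R versions used only the first element and warned, as in the log above; R 4.2.0 and later treat a condition of length greater than one as an error.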

@hyacz
Collaborator

hyacz commented Sep 12, 2018

Closed because there has been no activity for a long time. If you have any new related information, feel free to reopen it.

@hyacz hyacz closed this as completed Sep 12, 2018
@hyacz hyacz added the input error Wrong input data type or format label Sep 12, 2018