Error in svd(x, nu = 0, nv = k) : infinite or missing values in 'x' #34
Comments
I guess the error is caused by missing values remaining in the genotype file. If the 'allele' column in the HapMap file is 'NA', the error will occur. You can check whether the HapMap file contains 'NA' in that column; we will update the 'MVP.Data' function to skip markers with an 'NA' allele.
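The check suggested above can be sketched in base R. This is only an illustration: `count_na_alleles` is a hypothetical helper, not an rMVP function, and it assumes the file is tab-separated with a standard HapMap header whose second column is named `alleles`. The toy file is made up for the example.

```r
# Hypothetical helper (not part of rMVP): count HapMap markers whose
# 'alleles' field is missing, i.e. the literal string "NA" or empty.
count_na_alleles <- function(hmp_path) {
  # check.names = FALSE keeps the "rs#" header intact; with the default
  # na.strings = "NA", a literal "NA" field is read as a missing value.
  hmp <- read.delim(hmp_path, header = TRUE, check.names = FALSE,
                    quote = "", colClasses = "character")
  alleles <- hmp[["alleles"]]
  if (is.null(alleles)) stop("no 'alleles' column found in the header")
  sum(is.na(alleles) | !nzchar(alleles))
}

# Toy HapMap file with three markers; marker m2 has 'NA' alleles.
header <- c("rs#", "alleles", "chrom", "pos", "strand", "assembly#",
            "center", "protLSID", "assayLSID", "panelLSID", "QCcode",
            "S1", "S2")
rows <- list(
  c("m1", "A/G", "1", "100", "+", rep("NA", 6), "AA", "AG"),
  c("m2", "NA",  "1", "200", "+", rep("NA", 6), "NN", "NN"),
  c("m3", "C/T", "1", "300", "+", rep("NA", 6), "CC", "CT")
)
tmp <- tempfile(fileext = ".hmp.txt")
writeLines(c(paste(header, collapse = "\t"),
             vapply(rows, paste, character(1), collapse = "\t")), tmp)

n_bad <- count_na_alleles(tmp)
print(n_bad)
```

For a file with millions of markers, reading only the first two columns (for example by setting the remaining entries of `colClasses` to `"NULL"`) would likely be kinder to memory.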
Thank you for your fast reply! I appreciate all of your work on this amazing package. What I've tried now:
Still the same error.
    0.0148802036697822 -0.174287100840609 -0.0481350609384221
    -0.0224252021750267 -0.114742327963779 -0.0139827875349924
    0.0288828570311105 -0.160927486070197 -0.031365668593198
    0.0326738853578806 -0.112547120658613 -0.0256821848442263
    0.0220493250426905 -0.0524834505823422 -0.0261171667960449
    ....

and changed the code:

    library(rMVP)
    MVP.Data(fileHMP="Input.hmp.hmp.txt",
             filePhe="Phenotype_no_Pheno_Indo.csv",
             fileKin=FALSE,
             filePC="SnPrelate_PCAs.tsv",
             out="mvp.vcf",
             priority="memory")
    genotype <- attach.big.matrix("mvp.vcf.geno.desc")
    phenotype <- read.table("mvp.vcf.phe", head=TRUE)
    map <- read.table("mvp.vcf.map", head=TRUE)
    Covariates <- attach.big.matrix("mvp.vcf.imp.pc.desc")
    for(i in 2:ncol(phenotype)){
        imMVP <- MVP(
            phe=phenotype[, c(1, i)],
            geno=genotype,
            map=map,
            #K=Kinship,
            CV.GLM=Covariates,
            CV.MLM=Covariates,
            CV.FarmCPU=Covariates,
            nPC.GLM=0,
            nPC.MLM=0,
            nPC.FarmCPU=0,
            perc=1,
            file='tiff',
            priority="memory",
            ncpus=12,
            vc.method="EMMA",
            maxLoop=10,
            method.bin="FaST-LMM", #"FaST-LMM","EMMA", "static"
            #permutation.threshold=TRUE,
            #permutation.rep=100,
            threshold=0.05,
            method=c("FarmCPU", 'MLM')
        )
    }

That gave me a different error:

    Error in if (nrow < 1 | ncol < 1) stop("A big.matrix must have at least one row and one column") :
      argument is of length zero
    Calls: MVP -> big.matrix -> filebacked.big.matrix
    Execution halted

I checked all *.desc files, and the inputs all have more than one row and column?
    new("big.matrix.descriptor", description = list(sharedType = "FileBacked",
        filename = "mvp.vcf.geno.bin", dirname = "./",
        totalRows = 3714063L, totalCols = 83L,
        rowOffset = c(0, 3714063), colOffset = c(0, 83),
        nrow = 3714063, ncol = 83, rowNames = NULL, colNames = NULL,
        type = "char", separated = FALSE))

    new("big.matrix.descriptor", description = list(sharedType = "FileBacked",
        filename = "mvp.vcf.imp.geno.bin", dirname = "./",
        totalRows = 3714063L, totalCols = 83L,
        rowOffset = c(0, 3714063), colOffset = c(0, 83),
        nrow = 3714063, ncol = 83, rowNames = NULL, colNames = NULL,
        type = "char", separated = FALSE))

    new("big.matrix.descriptor", description = list(sharedType = "FileBacked",
        filename = "mvp.vcf.imp.pc.bin", dirname = "./",
        totalRows = 83L, totalCols = 3L,
        rowOffset = c(0, 83), colOffset = c(0, 3),
        nrow = 83, ncol = 3, rowNames = NULL, colNames = NULL,
        type = "double", separated = FALSE))
    MVP.Data(fileHMP="Input.hmp.hmp.txt",
             filePhe="Phenotype_no_Pheno_Indo.csv",
             fileKin=FALSE,
             filePC=FALSE,
             SNP.impute=NULL,
             out="mvp.vcf",
             priority="memory")

Back to the old error:

    Error in svd(x, nu = 0, nv = k) : infinite or missing values in 'x'
    Calls: MVP -> MVP.PCA -> prcomp -> prcomp.default -> svd
    Execution halted

And now I'm out of ideas :)

Edit: some more things I tried:
and again I'm out of ideas.

Edit2: I should say that I got the same errors using the VCF file. Converting to HapMap was my first idea, and now I keep on playing with the HapMap file; this isn't a HapMap bug.

Edit3: Is it possible that the imputation doesn't work 'right'? I added this piece to the MVP.PCA method just before the prcomp function is called:

    for(col in 1:ncol(big.geno)) {
        thissum <- sum(big.geno[1:83, col], na.rm=TRUE)
        if ( thissum == 0 ) {
            print(col)
        }
    }

Before I added the na.rm=T, lots of NAs popped up, and the NAs are what causes the error in prcomp(). The version on GitHub, which is the one I have installed, doesn't seem to use big.PCA for the PCA?

    PCs <- prcomp(big.geno)$x[, 1:pcs.keep]
    return(list(PCs = PCs))

big.PCA seems to work with NA values. If I replace the function call in MVP.PCA:

    #PCs <- prcomp(big.geno)$x[, 1:pcs.keep]
    PCs <- big.PCA(big.geno)$PCs[, 1:pcs.keep]

This happens in the STDOUT:

    [1] "Input data has 83 individuals, 3713722 markers"
    [1] "Principal Component Analysis Start..."
    means for first 10 snps:
    [1] 0 0 0 0 0 0 0 0 0 0
    [1] "GWAS Start..."
    [1] "Mixed Linear Model (MLM) Start ..."
    [1] "Calculating Kinship..."
    [1] "Z assignment..."

Hooray it works now :)
Hi, Thank you for using our software and sharing the details of the issue!
Ah, I understand about bigpca. Would you like me to send you my data so you can reproduce the issue more easily? I can put it onto cloudstor and send you a link via email.
Yes, that will be very helpful! My email is
In the meantime, my 'fixed' MVP run also crashed:

    [1] "Input data has 83 individuals, 3713722 markers"
    [1] "Principal Component Analysis Start..."
    means for first 10 snps:
    [1] 0 0 0 0 0 0 0 0 0 0
    [1] "GWAS Start..."
    [1] "Mixed Linear Model (MLM) Start ..."
    [1] "Calculating Kinship..."
    [1] "Z assignment..."
    [1] "Assignment done!"
    Error in cbind(matrix(1, n), CV) : number of rows of matrices must match (see arg 2)
    Calls: MVP -> MVP.MLM -> cbind
    Execution halted

I used 0.99.13, the latest version on the releases page: https://github.com/XiaoleiLiuBio/rMVP/releases
Is there a chance 0.99.14 fixed this? It's not on the releases page.

I sent you a link to the data via email.
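The `cbind` error above means that the covariate matrix does not have one row per analysed individual. The thread does not say why the counts differed in this run, so the following is only a base R sketch of a pre-flight check. `check_covariate_rows` is a hypothetical helper, not an rMVP function, and the 83 x 3 covariate matrix is random toy data shaped like the PC file described earlier.

```r
# Hypothetical pre-flight check (not part of rMVP): make sure the covariates
# have one row per individual before building the design matrix.
check_covariate_rows <- function(n_individuals, CV) {
  CV <- as.matrix(CV)
  if (nrow(CV) != n_individuals) {
    stop(sprintf("covariates have %d rows but %d individuals are analysed",
                 nrow(CV), n_individuals))
  }
  # Same construction as in the error message: intercept column plus covariates.
  cbind(matrix(1, n_individuals), CV)
}

set.seed(1)
CV <- matrix(rnorm(83 * 3), nrow = 83, ncol = 3)  # toy covariates: 83 x 3

X_ok <- check_covariate_rows(83, CV)
print(dim(X_ok))

# With a different individual count, base R raises the error seen in the log.
res <- tryCatch(cbind(matrix(1, 80), CV),
                error = function(e) conditionMessage(e))
print(res)
```

Comparing `nrow()` of the covariates with the number of rows in the phenotype table that is actually passed to the analysis is a quick way to see which side is off.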
Thanks a lot! I have received your email.
After I fix this bug, I will send you a development version.
Thank you :)
I updated the code and now MVP.Data will only keep a set of
Package: rMVP_0.99.15.tar.gz
Sample Code:
Thank you very much!!

    [1] "Principal Component Analysis Start..."
    [1] "GWAS Start..."
    [1] "Mixed Linear Model (MLM) Start ..."
    [1] "Calculating Kinship..."
    [1] "Z assignment..."
    [1] "Assignment done!"
    [1] "Variance components..."
    [1] "Variance Components Estimation is Done!"
    [1] "Variance components is Done!"
    [1] "Eigen Decomposition..."
    [1] "Eigen-Decomposition is Done!"
    [...]

It works now :)
I'm getting this same error, even though I'm using MVP 0.99.15. Do you know why? Thanks for taking a look.
Here is the output:
Here is my Session Info showing I'm running the 0.99.15:
Hi naglemi,

    M <- attach.big.matrix(paste0(outname,
I ran MVP.Data with rMVP 0.99.15 and do not see the imp.geno.desc file among my outputs. How can I obtain this? By the way, is there a good way for me to use MVP to obtain PCs without imputation? I think that NAs in my dataset are more often due to indels than missing data and such.
rMVP determines whether your data contains missing values and then decides whether to impute it. If there is no missing data in your data, the
You can use the following code to check whether your genotype file contains missing values:

    > genotype <- attach.big.matrix(genoPath)
    > rMVP:::hasNA(genotype)
    [1] FALSE

If the result is
rMVP does not allow NAs in the genotype when computing PCs, so you can delete these SNPs. Generally, I would not mix SNPs and InDels together for GWAS. If you are interested in InDels, you can analyze them separately; noInDel and InDel can be encoded as 0 and 1 respectively.
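The advice to delete SNPs containing NAs before computing PCs can be sketched in base R on a small in-memory matrix. This is a toy illustration, not rMVP's own code path. It assumes markers in rows and individuals in columns (as in the descriptors shown earlier in the thread) and uses an ordinary matrix rather than a file-backed `big.matrix`.

```r
set.seed(1)
# Toy genotype: 6 markers (rows) x 5 individuals (columns), coded 0/1/2.
geno <- matrix(sample(0:2, 30, replace = TRUE), nrow = 6, ncol = 5)
geno[2, 3] <- NA  # introduce missing calls in two markers
geno[5, 1] <- NA

# prcomp() on data with NAs fails inside svd(), as in the reported error.
err <- tryCatch(prcomp(t(geno)), error = function(e) conditionMessage(e))
print(err)

# Keep only markers without any missing call, then compute the PCs
# with individuals as observations.
keep <- rowSums(is.na(geno)) == 0
geno_complete <- geno[keep, , drop = FALSE]
pcs <- prcomp(t(geno_complete))$x[, 1:2]
print(dim(pcs))
```

On a genotype matrix with millions of markers, the same filter would probably need to run in chunks of rows, since `is.na()` on the full matrix would materialise it in memory.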
I receive the following error from hasNA. Do you know the meaning of this? Thanks.
Sorry, I made a mistake in the example code: you should pass a pointer to the hasNA function instead of the S4 object.

    rMVP:::hasNA(M@address)
Interestingly, there appear to be no NA
In this case, what might be causing the error during PCA?
I tried this code:

    MVP.Data(
    genotype <- attach.big.matrix("mvp.geno.desc") # Note: read the ".imp" genotype
    for(i in 2:ncol(phenotype)){

I can't see any imp file in my output folder. Can someone please help :) Regards
Hi,
I'm running an analysis with rMVP with a HapMap formatted SNP file, and it crashes in the PCA step.
The file was converted to HapMap from VCF via Tassel's export function. The same file works fine in GAPIT.
My analysis script:
The STDOUT:
The STDERR:
Have you seen this before?