Error in if (any(i < 0L)) { : missing value where TRUE/FALSE needed #66

Closed
vkp3 opened this issue May 24, 2021 · 7 comments · Fixed by #68

@vkp3

vkp3 commented May 24, 2021

Hello,

I have repeatedly hit the error below when trying to run GENESIS on a particular chromosome. It points to a problem with an if statement, and an accompanying warning says that some NAs were introduced.

Everything else seems to run fine and I have R compiled with Intel MKL libraries for matrix calculations, but for certain chromosomes / tests, I get the following error. It has been difficult to pinpoint the root of the error, but I am wondering if the authors or others could shed light on this error and how it may be avoided. I also wonder if the warning and the error are related.

...
Iteration 704 of 1086 completed
Iteration 715 of 1086 completed

Error in if (any(i < 0L)) { : missing value where TRUE/FALSE needed
Calls: assocTestAggregate ... .local -> .meanImpute -> [<- -> [<- -> [<- -> [<- -> int2i

In addition: Warning message:
In int2i(as.integer(i), n) : NAs introduced by coercion to integer range
Execution halted

Thank you,
Vamsee

@smgogarten
Collaborator

I had not heard of this error before, but it's been reported elsewhere and diagnosed as a problem with the matrix being too large. Perhaps one of your aggregate units contains too many variants; can you try splitting up the largest ones and see if that fixes the problem?
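As a hedged illustration of how the warning and the error in the original report could be related, here is a minimal base-R sketch. It does not call GENESIS or the `int2i` function named in the traceback; it only assumes that an index larger than `.Machine$integer.max` (2^31 - 1) is coerced to integer and then tested in an `if` condition. The variable names are illustrative.

```r
# An index one past the largest representable R integer (2^31 - 1).
big_index <- 2^31

# Coercing it to integer yields NA; capture the warning text for inspection.
warn_msg <- tryCatch(as.integer(big_index),
                     warning = function(w) conditionMessage(w))
idx <- suppressWarnings(as.integer(big_index))

# Comparing NA with 0L gives NA, so any() is NA as well.
cond <- any(idx < 0L)

# if() cannot branch on NA; capture the error text instead of halting.
err_msg <- tryCatch(if (cond) "negative" else "non-negative",
                    error = function(e) conditionMessage(e))

print(warn_msg)
print(err_msg)
```

Under that assumption, the warning (the overflow during coercion) is the cause and the error (the `NA` reaching `if`) is the consequence.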

@vkp3
Author

vkp3 commented May 27, 2021

Thank you for your response and for the link. Your suggestion that the number of variants might be too large seems accurate. My aggregate unit is a gene, and for each gene-based aggregate test I am testing all coding sequence variants with MAF < 0.01 (no additional filters) from UK Biobank exome sequencing data (N=~180,000). So I suppose the [variants x samples] matrix for one of the genes might be too large for the matrix calculations.

Could you elaborate on how I would split the largest ones? Do you mean splitting the gene into two aggregate units and running them separately? If so, 1) is there a way to combine the units after running each split, and 2) is there some way the code could skip over any units that fail so that the other genes can finish? I also wonder whether these issues came up with the TOPMed data, if its sample sizes are comparable to the UK Biobank data.

GENESIS has been a joy to use otherwise, and thank you for the quick response.
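One way to let the remaining genes finish when a unit fails is to wrap each unit in `tryCatch`. The sketch below uses only base R. `run_unit` is a hypothetical stand-in for the real per-unit call, and the gene names and variant counts are made up for illustration; it is not part of GENESIS.

```r
# Hypothetical per-unit test: fails when the variants-by-samples matrix
# would exceed R's integer limit, otherwise returns a one-row result.
run_unit <- function(unit) {
  if (unit$n_variants * unit$n_samples > .Machine$integer.max) {
    stop("matrix too large for unit ", unit$name)
  }
  data.frame(unit = unit$name, n_variants = unit$n_variants)
}

# Made-up units; the sample size roughly matches the N quoted above.
units <- list(
  list(name = "geneA", n_variants = 120,   n_samples = 180000),
  list(name = "geneB", n_variants = 15000, n_samples = 180000),
  list(name = "geneC", n_variants = 45,    n_samples = 180000)
)

results <- list()
failed  <- character(0)
for (u in units) {
  # On error, log the unit and return NULL so the loop continues.
  out <- tryCatch(run_unit(u), error = function(e) {
    message("Skipping ", u$name, ": ", conditionMessage(e))
    NULL
  })
  if (is.null(out)) {
    failed <- c(failed, u$name)
  } else {
    results[[u$name]] <- out
  }
}

# Stack the successful results; 'failed' lists units to revisit.
combined <- do.call(rbind, results)
print(combined)
print(failed)
```

Recording the failed unit names makes it possible to rerun only those units later, for example after splitting them.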

@smgogarten
Collaborator

We were able to reproduce this error for a matrix with >2^31 elements, and added a fix in version 2.23.3 of GENESIS.

@vkp3
Author

vkp3 commented Jul 6, 2021

Thanks for the bug fix. I am unfortunately still experiencing an error that seems to be due to the same issue.

For a given gene (TTN), I am attempting to include 13,349 variants across 180,256 individuals.
I created a single-range SeqVarRangeIterator defined by:

> iterator <- SeqVarRangeIterator(seqData, variantRanges=GRanges(seqnames=c(2),ranges=IRanges(start=178525989, end=178830802), strand='-'), verbose=T)
# of selected variants: 13,349
> assoc <- assocTestAggregate(iterator, nullmod.fe, test='SMMAT',genome.build='hg38',weight.beta=weight.beta,verbose=T)
# of selected samples: 180,256
Error in .local(x, y, ...) : negative length vectors are not allowed

As you can see, I get the following error:

Error in .local(x, y, ...) : negative length vectors are not allowed

which, based on reports from others who have run into similar errors, would indicate that the data frame size exceeds 2^31 - 1 (https://stackoverflow.com/questions/42479854/merge-error-negative-length-vectors-are-not-allowed)
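The arithmetic behind that limit can be checked in base R with the counts reported above. This is a sketch that assumes the object in question has one element per variant-sample pair; the thread does not confirm that this is the exact object that overflows.

```r
# Counts taken from the session output above.
n_variants <- 13349
n_samples  <- 180256

# Use doubles so the product itself does not overflow.
n_elements <- n_variants * n_samples
int_max    <- .Machine$integer.max   # 2^31 - 1

print(format(n_elements, big.mark = ","))
print(n_elements > int_max)

# Largest variant count for which variants x samples stays within the limit.
max_variants <- floor(int_max / n_samples)
print(max_variants)
```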

Could you please confirm you can reproduce this error?

Thanks again

@smgogarten
Collaborator

I was able to reproduce this, still looking into the cause and possible fixes.

@smgogarten smgogarten reopened this Jul 20, 2021
@smgogarten
Collaborator

@mconomos the error is coming from this line: https://github.com/UW-GAC/GENESIS/blob/master/R/nullModelTestPrep.R#L53

tcrossprod(x, y) where x and y are both dgeMatrix objects with 1 column each, and nrow(x) * nrow(y) > 2^31
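To illustrate why this call grows so quickly, here is a small base-R sketch. It uses ordinary matrices rather than `dgeMatrix` objects so that it needs no packages, and the large sizes passed to the helper are hypothetical; `outer_fits` is an illustrative name, not a GENESIS function.

```r
# Two single-column matrices, as in the description above (tiny sizes).
x <- matrix(c(1, 2, 3, 4, 5), ncol = 1)
y <- matrix(c(1, 2, 3), ncol = 1)

# tcrossprod(x, y) computes x %*% t(y): a dense nrow(x) by nrow(y) matrix.
xy <- tcrossprod(x, y)
print(dim(xy))
print(isTRUE(all.equal(xy, x %*% t(y))))

# Hypothetical helper: would the dense result fit within 2^31 - 1 elements?
outer_fits <- function(nx, ny) {
  as.numeric(nx) * as.numeric(ny) <= .Machine$integer.max
}

print(outer_fits(nrow(x), nrow(y)))
print(outer_fits(50000, 50000))   # hypothetical sizes, 2.5e9 elements
```

Because the result has `nrow(x) * nrow(y)` elements, two long column vectors can exceed the limit even though each input is small.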

@smgogarten
Collaborator

Fixed in 5d6a7ef and 49a2215
