Crash on generating descriptive stats #105

Closed
mbassalbioinformatics opened this issue Mar 6, 2019 · 6 comments

@mbassalbioinformatics

Hi,

When running the pipeline, I get a crash at the step that generates the descriptive stats, as follows:

Descriptive statistics...
[1] "I am loading useful packages for plotting..."
[1] "2019-03-05 23:05:59 EST"
Error in if (any(i < 0L)) { : missing value where TRUE/FALSE needed
Calls: <Anonymous> ... as -> .class1 -> .TM.repl.i.mat -> [<- -> [<- -> int2i
In addition: Warning message:
In int2i(as.integer(i), n) : NAs introduced by coercion to integer range
Execution halted

Any thoughts or suggestions?

@cziegenhain
Collaborator

Hi,

We need a bit more information to troubleshoot this.
For instance: what kind of data you are processing, the YAML file, and the full verbose output of the run.

@mbassalbioinformatics
Author

So this was a ddseq3 run. I've attached the terminal output and the YAML files (including postmap). I can share the RDS with you in confidence to help sort this out; I just need an email address to send the download link.

N706-PBMC-CD34CD45-1-5-chip2.postmap.yaml.txt
N706-PBMC-CD34CD45-1-5-chip2.yaml.txt
analysis_dump.txt

Thanks!

@gokceneraslan
Contributor

I have the same issue. I tracked it down to the countGenes function, and it only happens with the inex matrix:

image

That's because the inex matrix is too large, so indexing leads to integer overflow. You can reproduce it with:

image
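
The screenshot is not preserved here, so as a rough stand-in, this is the arithmetic behind the overflow, sketched in Python rather than R. R's integer type is 32-bit signed, so its largest value is 2^31 - 1 = 2147483647. A linear index into a matrix with more cells than that cannot be held as an integer, which is consistent with the "NAs introduced by coercion to integer range" warning in the log above. The function name and the matrix dimensions are made up for illustration and are not taken from zUMIs or from the attached data.

```python
# R's integer type is 32-bit signed; its largest value is 2**31 - 1.
R_INTEGER_MAX = 2**31 - 1


def linear_index_overflows(n_rows: int, n_cols: int) -> bool:
    """Return True if the matrix has more cells than R's integer range can index."""
    return n_rows * n_cols > R_INTEGER_MAX


# Hypothetical dimensions (genes x cell barcodes), not taken from the issue.
print(linear_index_overflows(30_000, 50_000))   # 1.5e9 cells: still fits
print(linear_index_overflows(30_000, 100_000))  # 3.0e9 cells: too many
```

With a gene count in the tens of thousands, a barcode count around a hundred thousand is already enough to cross the limit.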

This is the same error as in https://bitbucket.org/hrue/r-inla/issues/1/logical-indexing-for-large-matrices-fails, but I have no idea how to solve it without breaking up the matrix into smaller pieces and counting genes separately.
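
To make the "smaller pieces" idea concrete, here is a minimal sketch in Python (not R, and not the fix that was eventually merged). It assumes that counting genes means counting the nonzero entries in each cell barcode's column. It splits the columns into blocks whose cell count stays under the 32-bit limit and counts each block separately. `column_blocks` and `count_genes_per_cell` are hypothetical names, and the matrix is a plain list of columns purely for illustration.

```python
from typing import Iterator, List, Tuple

R_INTEGER_MAX = 2**31 - 1  # largest 32-bit signed integer


def column_blocks(n_rows: int, n_cols: int,
                  max_cells: int = R_INTEGER_MAX) -> Iterator[Tuple[int, int]]:
    """Yield half-open column ranges [start, stop) of roughly max_cells cells.

    A single column is the smallest possible block, so a block can only
    exceed max_cells when one column alone is larger than the limit.
    """
    cols_per_block = max(1, max_cells // max(1, n_rows))
    for start in range(0, n_cols, cols_per_block):
        yield start, min(start + cols_per_block, n_cols)


def count_genes_per_cell(columns: List[List[int]],
                         max_cells: int = R_INTEGER_MAX) -> List[int]:
    """Count nonzero entries per column, processing one block at a time."""
    n_rows = len(columns[0]) if columns else 0
    counts: List[int] = []
    for start, stop in column_blocks(n_rows, len(columns), max_cells):
        for col in columns[start:stop]:
            counts.append(sum(1 for value in col if value > 0))
    return counts


# Toy data: 3 genes x 4 cell barcodes, with a tiny limit to force two blocks.
toy = [[0, 2, 1], [0, 0, 0], [5, 0, 0], [1, 1, 1]]
print(list(column_blocks(3, 4, max_cells=6)))
print(count_genes_per_cell(toy, max_cells=6))
```

Because each cell barcode is counted independently, the per-block results can simply be concatenated in column order.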

@cziegenhain
Collaborator

cziegenhain commented Mar 6, 2019

Hi @gokceneraslan - thanks for tracking the error down so quickly, and for the fix.

@mbassalbioinformatics: in addition to updating zUMIs with this fix, you should double-check your ddseq settings. I don't think it's reasonable to expect that many cell barcodes. If I remember correctly, ddseq should be run with the frameshift correction in the read1 settings:
correct_frameshift: TAGCCATCGCATTGC

Feel free to reopen the issue if further things arise!

@gokceneraslan
Contributor

@cziegenhain thanks for merging the PR. Could you please check that everything still works with the ExampleData dataset? I think the fix is correct, but it's better to be on the safe side.

Actually, the proper way would be to add unit tests using the https://github.com/r-lib/testthat package; otherwise the whole codebase becomes fragile.

@cziegenhain
Collaborator

Thanks again for the PR. I double-checked, and the example data runs as expected.
I appreciate your input!


3 participants