ArrayOutOfBoundsException in vcf2adam (spark2_2.11-0.22.0) on UK10K VCFs (VCFv4.1) #1494

Closed
p-yang opened this Issue Apr 17, 2017 · 7 comments

p-yang commented Apr 17, 2017

Running vcf2adam on the UK10K VCF files fails with an ArrayOutOfBoundsException midway through the run. Here's the command I used:

./bin/adam-submit --master yarn-client --num-executors 100 --executor-cores 4 \
--executor-memory 32g --driver-memory 32g -- \
vcf2adam /data/UK10K/EGAD00001000740 /data/UK10K/bigADAMTest/results

and the resultant exception here

Running with -stringency LENIENT allowed the run to continue without the exception appearing.

Member

fnothaft commented Apr 17, 2017

Hi @Veryku! Thanks for posting the exception. What version/commit of ADAM did this occur on?

p-yang commented Apr 17, 2017

This was runnable version adam-distribution-spark2_2.11-0.22.0/

p-yang commented Apr 17, 2017

I also just remembered that when running ADAM on an older version (adam-core_2.10-0.19.0) through Gnocchi, vcf2adam with the same call works. Taner has a hunch it may have something to do with the updated annotations logic.

Member

heuermh commented Apr 19, 2017

It is hard to follow the stack trace; maybe a Number=G genotype field (GL, for example) had the wrong number of values?
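One way to test this hypothesis is to scan the input records directly. The sketch below is not ADAM code. It assumes diploid samples and plain tab-delimited VCF record lines, and the function names `expected_g_count` and `find_bad_g_fields` are hypothetical. For a `Number=G` field, the VCF specification expects one value per possible genotype, which for diploid samples with n ALT alleles is (n+1)(n+2)/2.

```python
from math import comb


def expected_g_count(num_alt: int, ploidy: int = 2) -> int:
    """Number of possible genotypes at a site with `num_alt` ALT alleles."""
    n_alleles = num_alt + 1  # REF plus every ALT allele
    return comb(n_alleles + ploidy - 1, ploidy)


def find_bad_g_fields(vcf_line: str, field: str = "GL"):
    """Return (sample_index, found, expected) for each sample whose
    `field` has the wrong number of comma-separated values.

    Assumption: every sample is diploid; missing values are skipped.
    """
    cols = vcf_line.rstrip("\n").split("\t")
    alt = cols[4]
    num_alt = 0 if alt == "." else len(alt.split(","))
    fmt_keys = cols[8].split(":")
    if field not in fmt_keys:
        return []
    idx = fmt_keys.index(field)
    expected = expected_g_count(num_alt)
    bad = []
    for i, sample in enumerate(cols[9:]):
        parts = sample.split(":")
        # Trailing FORMAT fields may be omitted, and "." marks a missing value.
        if idx >= len(parts) or parts[idx] == ".":
            continue
        found = len(parts[idx].split(","))
        if found != expected:
            bad.append((i, found, expected))
    return bad


# Made-up record with two ALT alleles, so each GL should carry 6 values.
record = "1\t100\t.\tA\tC,T\t50\tPASS\t.\tGT:GL\t0/1:-1,-2,-3,-4,-5,-6\t1/2:-1,-2,-3"
print(find_bad_g_fields(record))  # [(1, 3, 6)]
```

Running a check like this over the non-header lines of each input VCF would show whether any record carries a short `GL` array, which is the kind of mismatch that could surface as an out-of-bounds index during conversion.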

Member

fnothaft commented May 12, 2017

Pinging this. @Veryku do you have any updates?

p-yang commented May 13, 2017

Unfortunately no. I haven't been working with vcf2adam in recent weeks, since processing the UK10K data with lenient stringency gave me the results I needed at the time.

Member

fnothaft commented May 13, 2017

Sounds good. I am going to close this for now, but please reopen if it reappears.

fnothaft closed this May 13, 2017

heuermh modified the milestone: 0.23.0 Jul 22, 2017

heuermh added this to Completed in Release 0.23.0 Jan 4, 2018
