liftovervcf error for large vcf files with genotypes #1041

Closed
chenwenan opened this issue Dec 27, 2017 · 4 comments

@chenwenan

Bug Report

Affected tool(s)

picard

Affected version(s)

2.17.0

Description

When running Picard LiftoverVcf on a 45 GB VCF file with genotypes, I get the following error:

Runtime.totalMemory()=21361459200
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:3332)
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:137)
    at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:121)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:421)
    at java.lang.StringBuilder.append(StringBuilder.java:136)
    at htsjdk.tribble.util.ParsingUtils.split(ParsingUtils.java:276)
    at htsjdk.variant.vcf.AbstractVCFCodec.decodeLine(AbstractVCFCodec.java:275)
    at htsjdk.variant.vcf.AbstractVCFCodec.decode(AbstractVCFCodec.java:262)
    at htsjdk.variant.vcf.AbstractVCFCodec.decode(AbstractVCFCodec.java:64)
    at htsjdk.tribble.TabixFeatureReader$FeatureIterator.readNextRecord(TabixFeatureReader.java:180)
    at htsjdk.tribble.TabixFeatureReader$FeatureIterator.next(TabixFeatureReader.java:215)
    at htsjdk.tribble.TabixFeatureReader$FeatureIterator.next(TabixFeatureReader.java:155)
    at picard.vcf.LiftoverVcf.doWork(LiftoverVcf.java:271)
    at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:268)
    at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:98)
    at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:108)

Steps to reproduce

Expected behavior

I would expect large VCF files to be handled by keeping only part of the variants in memory at any one time.

Actual behavior

The run fails with java.lang.OutOfMemoryError: Java heap space.

@yfarjoun
Contributor

I think the problem here is an unsuitable value for MAX_RECORDS_IN_RAM. Am I correct?

@chenwenan
Author

I think I simply used the default without specifying this parameter. Will this parameter help?

@yfarjoun
Contributor

Yes, for VCF sorting it's best to use a smaller value. Try 10000.
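
For reference, a minimal sketch of how that suggestion might look on the command line, using the legacy KEY=value argument style of Picard 2.17. The file names, the picard.jar path, and the heap size are placeholders and assumptions, not taken from the report; only LiftoverVcf, MAX_RECORDS_IN_RAM, and the value 10000 come from this thread. The snippet prints the command rather than running it.

```shell
# Dry-run sketch: prints the command instead of running it.
# All file names and the picard.jar path are placeholders (assumptions).
MAX_RECORDS=10000   # value suggested in this thread
HEAP=20g            # assumption: roughly the heap size implied by the log above

echo java -Xmx"$HEAP" -jar picard.jar LiftoverVcf \
    I=input.vcf.gz \
    O=lifted.vcf.gz \
    CHAIN=source_to_target.chain \
    REJECT=rejected.vcf.gz \
    R=target_reference.fasta \
    MAX_RECORDS_IN_RAM="$MAX_RECORDS"
```

Drop the leading echo to actually launch the job. A lower MAX_RECORDS_IN_RAM should reduce how many records are buffered in memory before being written to temporary files, at the cost of more disk I/O; whether that is enough for a given file depends on how large each record is.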

@yfarjoun reopened this Feb 21, 2018
@merceneryinbox

In my case, explicitly reducing this value was not effective with a large VCF.


3 participants