VariantQC fails with VCF including IUPAC ambiguous DNA bases #63

Closed
rspreafico opened this issue Apr 25, 2020 · 4 comments

Comments

@rspreafico

Hi there

The flu reference genome uses IUPAC ambiguous bases (e.g. R). Because the variant caller does not support IUPAC ambiguous bases, it calls variants such as R -> A. These variants can be ignored, and they do not cause the variant caller to fail. However, VariantQC terminates with an error when it encounters one of them. Would it be possible to handle such variants so that VariantQC does not terminate?

Thanks
Roberto
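
To see which records are affected before running VariantQC, something like the following could be used. This is only a minimal sketch, not part of VariantQC or HTSJDK: the function name `ambiguous_alleles` and the example record (contig `segment4`, position 100) are made up for illustration, and the check assumes that plain REF/ALT sequences may only contain A, C, G, T and N, as the VCF 4.2 spec states. Symbolic alleles such as `<DEL>`, breakends, `.` and `*` are deliberately left alone.

```python
import re

# Bases permitted in plain REF/ALT sequences (assumption based on the VCF 4.2 spec).
ALLOWED_BASES = set("ACGTN")
# Only alleles made up purely of letters are checked; symbolic alleles,
# breakends, "." and "*" contain other characters and are skipped.
PLAIN_SEQUENCE = re.compile(r"^[A-Za-z]+$")


def ambiguous_alleles(vcf_line):
    """Return the REF/ALT alleles of one VCF data line that contain letters other than ACGTN."""
    fields = vcf_line.rstrip("\n").split("\t")
    ref, alts = fields[3], fields[4].split(",")
    return [
        allele
        for allele in [ref, *alts]
        if PLAIN_SEQUENCE.match(allele) and not set(allele.upper()) <= ALLOWED_BASES
    ]


# Hypothetical record with an IUPAC "R" as the reference allele.
example = "segment4\t100\t.\tR\tA\t50\tPASS\t."
print(ambiguous_alleles(example))  # prints ['R']
```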

@rspreafico
Author

Output:

***********************************************************************

UNEXPECTED ERROR: The provided VCF file is malformed at approximately line number 151: unparsable vcf record with allele R, for input source: sample_name.vcf.gz

***********************************************************************
Please, search for this error in our issue tracker or post a new one:
htsjdk.tribble.TribbleException: The provided VCF file is malformed at approximately line number 151: unparsable vcf record with allele R, for input source: sample_name.vcf.gz
	at htsjdk.variant.vcf.AbstractVCFCodec.generateException(AbstractVCFCodec.java:887)
	at htsjdk.variant.vcf.AbstractVCFCodec.checkAllele(AbstractVCFCodec.java:678)
	at htsjdk.variant.vcf.AbstractVCFCodec.parseAlleles(AbstractVCFCodec.java:640)
	at htsjdk.variant.vcf.AbstractVCFCodec.parseVCFLine(AbstractVCFCodec.java:443)
	at htsjdk.variant.vcf.AbstractVCFCodec.decodeLine(AbstractVCFCodec.java:384)
	at htsjdk.variant.vcf.AbstractVCFCodec.decode(AbstractVCFCodec.java:328)
	at htsjdk.variant.vcf.AbstractVCFCodec.decode(AbstractVCFCodec.java:48)
	at htsjdk.tribble.TabixFeatureReader$FeatureIterator.readNextRecord(TabixFeatureReader.java:173)
	at htsjdk.tribble.TabixFeatureReader$FeatureIterator.next(TabixFeatureReader.java:205)
	at htsjdk.tribble.TabixFeatureReader$FeatureIterator.next(TabixFeatureReader.java:149)
	at java.util.Iterator.forEachRemaining(Iterator.java:116)
	at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
	at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
	at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
	at org.broadinstitute.hellbender.engine.VariantWalker.traverse(VariantWalker.java:102)
	at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1048)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:139)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
	at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:163)
	at org.broadinstitute.hellbender.Main.mainEntry(Main.java:206)
	at com.github.discvrseq.Main.main(Main.java:50)

@bbimber
Contributor

bbimber commented Apr 28, 2020

Thanks for reporting this. The failure occurs fairly deep in HTSJDK (the library that reads and parses the VCF), but I will look into what's possible.

@snower2010

Hi @bbimber, we ran into the same problem. I was just wondering if there is any way to avoid this error?
Many thanks!

@bbimber
Contributor

bbimber commented Aug 15, 2020

@snower2010 @rspreafico Sorry it took so long to look into this. While I agree IUPAC bases would be useful, the HTSJDK API just doesn't support them. They are also not part of the VCF 4.2 spec (https://samtools.github.io/hts-specs/VCFv4.2.pdf).
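
Since the original report notes that these variants can be ignored, one possible workaround is to drop them from the VCF before passing it to VariantQC. The sketch below is an assumption about how that could be done and is not something the maintainers proposed: `filter_vcf_lines` and `filter_vcf` are hypothetical helper names, and the file paths in the usage comment are placeholders. Header lines are copied unchanged, and data lines are kept only if every plain REF/ALT sequence consists of A, C, G, T or N.

```python
import gzip
import re

ALLOWED_BASES = set("ACGTN")
PLAIN_SEQUENCE = re.compile(r"^[A-Za-z]+$")


def is_parsable(allele):
    """True unless the allele is a plain base string containing non-ACGTN letters."""
    if not PLAIN_SEQUENCE.match(allele):
        return True  # symbolic alleles, breakends, "." and "*" are left untouched
    return set(allele.upper()) <= ALLOWED_BASES


def filter_vcf_lines(lines):
    """Yield header lines and those data lines whose REF/ALT alleles are all parsable."""
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        alleles = [fields[3], *fields[4].split(",")]
        if all(is_parsable(allele) for allele in alleles):
            yield line


def filter_vcf(in_path, out_path):
    """Write a copy of in_path (plain or gzip/bgzip-compressed) without ambiguous-base records."""
    opener = gzip.open if in_path.endswith(".gz") else open
    with opener(in_path, "rt") as src, open(out_path, "w") as dst:
        dst.writelines(filter_vcf_lines(src))


# Example usage (placeholder paths):
# filter_vcf("sample_name.vcf.gz", "sample_name.filtered.vcf")
```

The output is written uncompressed. Because the stack trace shows the input being read through a Tabix index, the filtered file would likely need to be compressed with `bgzip` and indexed with `tabix -p vcf` again before use.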

bbimber closed this as completed Aug 15, 2020