You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
On git head after merging #1346 and #1348, the next error is
$ time ./bin/adam-submit \
vcf2adam -only_variants \
ExAC.0.3.GRCh38.vcf.gz ExAC.0.3.GRCh38.variants.adam
Using ADAM_MAIN=org.bdgenomics.adam.cli.ADAMMain
Using SPARK_SUBMIT=/usr/local/bin/spark-submit
...
2017-01-16 17:18:06 ERROR Utils:91 - Aborting task
htsjdk.tribble.TribbleException: The provided VCF file is malformed at approximately
line number 15364: Duplicate allele added to VariantContext: C
at htsjdk.variant.vcf.AbstractVCFCodec.generateException(AbstractVCFCodec.java:779)
at htsjdk.variant.vcf.AbstractVCFCodec.parseVCFLine(AbstractVCFCodec.java:356)
at htsjdk.variant.vcf.AbstractVCFCodec.decodeLine(AbstractVCFCodec.java:279)
at htsjdk.variant.vcf.AbstractVCFCodec.decode(AbstractVCFCodec.java:257)
at org.seqdoop.hadoop_bam.VCFRecordReader.nextKeyValue(VCFRecordReader.java:144)
at org.apache.spark.rdd.NewHadoopRDD$$anon$1.hasNext(NewHadoopRDD.scala:199)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12$$anonfun$apply$4.apply$mcV$sp(PairRDDFunctions.scala:1123)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12$$anonfun$apply$4.apply(PairRDDFunctions.scala:1123)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12$$anonfun$apply$4.apply(PairRDDFunctions.scala:1123)
at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1341)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12.apply(PairRDDFunctions.scala:1131)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12.apply(PairRDDFunctions.scala:1102)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:99)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
The text was updated successfully, but these errors were encountered:
I need to write something to go through the file and update the CSQ variant annotations to the ANN specification, so when I do that I'll filter out the invalid data as well.
On git head after merging #1346 and #1348, the next error is
The text was updated successfully, but these errors were encountered: