BaseRecalibratorSpark crash: too many open files #6578
Comments
Huh. It sounds like there must be a file handle leak we didn't know about. I wonder why you're seeing it now for the first time...
...I didn't run the spark version for quite some time, and it might also depend on the size of the BAM file being processed.
I meant more "why anyone is seeing it now for the first time", not you specifically :) Are you running a particularly giant bam file?
Not really, it is "just" 50697M (about 50 GB).
That's big but not crazy.
It's from WGS.
Not sure if/how related, but I've long suspected that we have a file handle leak somewhere, since I encounter it when running tests locally, but I have never been able to track it down.
4.1.4.1 didn't have the problem, at least I don't remember seeing it. HTH
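A quick way to confirm a leak like this while the job runs is to watch the JVM's descriptor count. This is a minimal sketch assuming a Linux host; <pid> is a placeholder for the GATK JVM's process id (e.g. from pgrep -f BaseRecalibratorSpark):

# Watch the JVM's open-descriptor count; a steady climb as tasks complete points to a leak.
watch -n 5 'ls /proc/<pid>/fd | wc -l'
# lsof works too, though on some systems it repeats entries once per thread:
lsof -p <pid> | wc -l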
Hi,
I'm trying to run BaseRecalibratorSpark (gatk-4.1.7.0) but I'm hitting a problem where the process crashes with a "too many open files" error.
Here is my command:
ulimit -n 4096
/usr/local/bioinf/gatk/gatk-4.1.7.0/gatk BaseRecalibratorSpark \
  --java-options '-Xmx64G' \
  --tmp-dir /local/scratch/rieder \
  -I Normal_fixed.bam \
  -R GRCh38.d1.vd1.fa \
  -L S07604514_Regions_merged_padded.interval_list \
  -O P45507_normal_bqsr.table \
  --known-sites Homo_sapiens_assembly38.dbsnp138.vcf \
  --known-sites Homo_sapiens_assembly38.known_indels.vcf \
  --known-sites Mills_and_1000G_gold_standard.indels.hg38.vcf \
  --spark-master local[8] \
  --conf 'spark.executor.cores=8' \
  --conf 'spark.local.dir=/local/scratch/rieder'
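Worth noting: ulimit -n 4096 only raises the soft limit of the submitting shell, and it is capped by the hard limit; the Spark driver (and its executor threads, since this is local mode) inherit it. A minimal sketch for checking what the JVM actually got, again with <pid> as a placeholder:

ulimit -Sn                               # soft limit child processes will inherit
ulimit -Hn                               # hard ceiling an unprivileged user can reach
grep 'open files' /proc/<pid>/limits     # what the running GATK JVM actually has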
Here is the (hopefully) relevant log extract:
20/04/29 01:51:51 INFO TaskSetManager: Finished task 578.0 in stage 0.0 (TID 578) in 2720 ms on localhost (executor driver) (576/1585)
20/04/29 01:51:51 INFO NewHadoopRDD: Input split: file:/data/projects/2019/NeoAG/VCF-phasing/work/b8/d8dc550b7ba5f57d935d04e27b756a/Normal_fixed.bam:19562233856+33554432
01:51:51.374 INFO FeatureManager - Using codec VCFCodec to read file file:///local/scratch/rieder/spark-bb59423b-0368-4de5-85e0-e6641fb25380/userFiles-a91d5958-33f5-4685-bf9d-c8fc0924f7c6/Homo_sapiens_assembly38.dbsnp138.vcf
01:51:51.431 INFO FeatureManager - Using codec VCFCodec to read file file:///local/scratch/rieder/spark-bb59423b-0368-4de5-85e0-e6641fb25380/userFiles-a91d5958-33f5-4685-bf9d-c8fc0924f7c6/Homo_sapiens_assembly38.known_indels.vcf
01:51:51.451 INFO FeatureManager - Using codec VCFCodec to read file file:///local/scratch/rieder/spark-bb59423b-0368-4de5-85e0-e6641fb25380/userFiles-a91d5958-33f5-4685-bf9d-c8fc0924f7c6/Mills_and_1000G_gold_standard.indels.hg38.vcf
01:51:51.457 INFO FeatureManager - Using codec VCFCodec to read file file:///local/scratch/rieder/spark-bb59423b-0368-4de5-85e0-e6641fb25380/userFiles-a91d5958-33f5-4685-bf9d-c8fc0924f7c6/Homo_sapiens_assembly38.dbsnp138.vcf
01:51:51.507 INFO BaseRecalibrationEngine - The covariates being used here:
01:51:51.507 INFO BaseRecalibrationEngine - ReadGroupCovariate
01:51:51.507 INFO BaseRecalibrationEngine - QualityScoreCovariate
01:51:51.507 INFO BaseRecalibrationEngine - ContextCovariate
01:51:51.507 INFO BaseRecalibrationEngine - CycleCovariate
01:51:51.517 INFO FeatureManager - Using codec VCFCodec to read file file:///local/scratch/rieder/spark-bb59423b-0368-4de5-85e0-e6641fb25380/userFiles-a91d5958-33f5-4685-bf9d-c8fc0924f7c6/Homo_sapiens_assembly38.known_indels.vcf
20/04/29 01:51:51 ERROR Executor: Exception in task 581.0 in stage 0.0 (TID 581)
org.broadinstitute.hellbender.exceptions.GATKException: Error initializing feature reader for path /local/scratch/rieder/spark-bb59423b-0368-4de5-85e0-e6641fb25380/userFiles-a91d5958-33f5-4685-bf9d-c8fc0924f7c6/Homo_sapiens_assembly38.known_indels.vcf
at org.broadinstitute.hellbender.engine.FeatureDataSource.getTribbleFeatureReader(FeatureDataSource.java:383)
at org.broadinstitute.hellbender.engine.FeatureDataSource.getFeatureReader(FeatureDataSource.java:335)
at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:282)
at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:238)
at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:222)
at org.broadinstitute.hellbender.utils.spark.JoinReadsWithVariants.openFeatureSource(JoinReadsWithVariants.java:63)
at org.broadinstitute.hellbender.utils.spark.JoinReadsWithVariants.lambda$null$0(JoinReadsWithVariants.java:44)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
at org.broadinstitute.hellbender.utils.spark.JoinReadsWithVariants.lambda$join$60e5b476$1(JoinReadsWithVariants.java:44)
at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:186)
at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:186)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:801)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:801)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
at org.apache.spark.scheduler.Task.run(Task.scala:121)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: htsjdk.tribble.TribbleException$MalformedFeatureFile: Unable to parse header with error: /local/scratch/rieder/spark-bb59423b-0368-4de5-85e0-e6641fb25380/userFiles-a91d5958-33f5-4685-bf9d-c8fc0924f7c6/Homo_sapiens_assembly38.known_indels.vcf: Too many open files, for input source: /local/scratch/rieder/spark-bb59423b-0368-4de5-85e0-e6641fb25380/userFiles-a91d5958-33f5-4685-bf9d-c8fc0924f7c6/Homo_sapiens_assembly38.known_indels.vcf
at htsjdk.tribble.TribbleIndexedFeatureReader.readHeader(TribbleIndexedFeatureReader.java:263)
at htsjdk.tribble.TribbleIndexedFeatureReader.<init>(TribbleIndexedFeatureReader.java:102)
at htsjdk.tribble.TribbleIndexedFeatureReader.<init>(TribbleIndexedFeatureReader.java:127)
at htsjdk.tribble.AbstractFeatureReader.getFeatureReader(AbstractFeatureReader.java:121)
at org.broadinstitute.hellbender.engine.FeatureDataSource.getTribbleFeatureReader(FeatureDataSource.java:380)
... 39 more
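The trace shows JoinReadsWithVariants.openFeatureSource constructing a fresh FeatureDataSource (and with it a TribbleIndexedFeatureReader) for each known-sites VCF in every partition; if those readers or their index streams are not closed when a task finishes, descriptors accumulate until the 4096 limit is exhausted. A minimal sketch for testing that hypothesis while the job runs, assuming Linux /proc and a placeholder <pid>:

# Total descriptors held by the JVM at this moment
ls /proc/<pid>/fd | wc -l
# How many of them point at the Spark-staged known-sites files (VCFs or their indices)
ls -l /proc/<pid>/fd | grep -c 'userFiles-'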
How many files does it need to open? As a user I cannot raise the ulimit above 4096 files.
Best
Dietmar
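For reference, the 4096 ceiling described here is the hard limit, which only an administrator can raise, for example with an entry in /etc/security/limits.conf; the username and value below are purely illustrative:

# /etc/security/limits.conf — requires root; takes effect on the next login session
rieder  soft  nofile  65536
rieder  hard  nofile  65536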