Maybe it is related to #6578 and #5316? However, I tested versions 4.1.7, 4.1.6, 4.1.5, and 4.1.4.1, and the problem persists with Spark 2.4.3 and Spark 2.4.5.
mwiewior pushed a commit to mwiewior/gatk that referenced this issue on Jun 7, 2020
The proposed PR doesn't fix the problem completely. When running with local[1] it fails with the same error (too many open files) after finishing ~400/555 partitions; without the fix it failed after ~25. When running in parallel (local[n], n > 1) it starts hanging after processing around ~40 partitions.
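Since the failure correlates with the number of partitions (each partition appears to re-open the reference index), one possible mitigation is to create fewer, larger input partitions so fewer tasks open the `.fai` concurrently. This is a sketch, not a confirmed fix: `--bam-partition-size` is an assumption here — check `gatk PileupSpark --help` in your build before relying on it.

```shell
# Larger input partitions -> fewer tasks -> fewer concurrent .fai opens.
# --bam-partition-size (bytes) is hedged: verify your GATK version exposes it.
gatk PileupSpark \
  --spark-runner SPARK --spark-master 'local[4]' \
  --bam-partition-size 134217728 \
  -I "$DATA/NA12878.proper.wes.md.bam" \
  -R "$DATA/Homo_sapiens_assembly18.fasta" \
  -O /tmp/gatk4s_4.pileup
```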
Hello,
I'm trying to run PileupSpark from the most recent GATK 4.1.7 on a WES sample like this:
gatk PileupSpark --spark-runner SPARK --spark-master local[{threads}] --conf "spark.driver.memory=22g" -I $DATA/NA12878.proper.wes.md.bam -R $DATA/Homo_sapiens_assembly18.fasta -O /tmp/gatk4s_{threads}.pileup
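While the job runs, one way to confirm that descriptors are actually accumulating is to sample the open-file count of the Spark process. A diagnostic sketch — `$$` below samples the current shell as a stand-in; substitute the driver/executor PID (e.g. from `pgrep -f PileupSpark`):

```shell
# Sample the number of open file descriptors for a given PID.
# $$ (this shell) is only a stand-in; replace it with the Spark process PID.
count_fds() {
  lsof -p "$1" 2>/dev/null | wc -l
}
count_fds $$
```

Sampling this in a loop while partitions complete would show whether the count grows monotonically (a leak) or plateaus.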
mwiewior@Mareks-MacBook-Pro ~ % spark-submit --version
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.4.5
      /_/
Using Scala version 2.11.12, OpenJDK 64-Bit Server VM, 1.8.0_252
Branch HEAD
No matter how high I set the max file descriptors (even to 1M):
mwiewior@Mareks-MacBook-Pro ~ % ulimit -a
-t: cpu time (seconds) unlimited
-f: file size (blocks) unlimited
-d: data seg size (kbytes) unlimited
-s: stack size (kbytes) 8192
-c: core file size (blocks) 0
-v: address space (kbytes) unlimited
-l: locked-in-memory size (kbytes) unlimited
-u: processes 2048
-n: file descriptors 1000000
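Note that `ulimit -a` reports the limits of the shell that prints them; what `spark-submit` inherits is whatever is in effect in the launching shell at launch time. A quick portable check of both the soft limit (what actually binds child processes) and the hard limit (the ceiling it can be raised to):

```shell
# Soft limit: what this shell and its children are actually bound by.
ulimit -Sn
# Hard limit: the ceiling up to which the soft limit can be raised.
ulimit -Hn
```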
I keep getting the following error:
20/06/06 14:56:35 ERROR Utils: Aborting task
org.broadinstitute.hellbender.exceptions.UserException$CouldNotReadInputFile: Couldn't read file file:///private/var/folders/5s/v5t08tmd42z_2m2c30vqf6kc0000gn/T/spark-556aa7a2-4d88-4bae-ad16-36d5af920fa9/userFiles-aeb68992-3215-4897-8f8a-040396296185/Homo_sapiens_assembly18.fasta. Error was: Fasta index file could not be opened: /private/var/folders/5s/v5t08tmd42z_2m2c30vqf6kc0000gn/T/spark-556aa7a2-4d88-4bae-ad16-36d5af920fa9/userFiles-aeb68992-3215-4897-8f8a-040396296185/Homo_sapiens_assembly18.fasta.fai
at org.broadinstitute.hellbender.utils.fasta.CachingIndexedFastaSequenceFile.&lt;init&gt;(CachingIndexedFastaSequenceFile.java:159)
at org.broadinstitute.hellbender.utils.fasta.CachingIndexedFastaSequenceFile.&lt;init&gt;(CachingIndexedFastaSequenceFile.java:125)
at org.broadinstitute.hellbender.utils.fasta.CachingIndexedFastaSequenceFile.&lt;init&gt;(CachingIndexedFastaSequenceFile.java:110)
at org.broadinstitute.hellbender.engine.ReferenceFileSource.&lt;init&gt;(ReferenceFileSource.java:35)
at org.broadinstitute.hellbender.engine.spark.LocusWalkerSpark.lambda$getAlignmentsFunction$a99dbf6a$1(LocusWalkerSpark.java:113)
at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$1$1.apply(JavaRDDLike.scala:125)
at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$1$1.apply(JavaRDDLike.scala:125)
at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:435)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:441)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:130)
at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:129)
at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1394)
at org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:141)
at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:83)
at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:78)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:123)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: htsjdk.samtools.SAMException: Fasta index file could not be opened: /private/var/folders/5s/v5t08tmd42z_2m2c30vqf6kc0000gn/T/spark-556aa7a2-4d88-4bae-ad16-36d5af920fa9/userFiles-aeb68992-3215-4897-8f8a-040396296185/Homo_sapiens_assembly18.fasta.fai
at htsjdk.samtools.reference.FastaSequenceIndex.&lt;init&gt;(FastaSequenceIndex.java:74)
at htsjdk.samtools.reference.IndexedFastaSequenceFile.&lt;init&gt;(IndexedFastaSequenceFile.java:98)
at htsjdk.samtools.reference.ReferenceSequenceFileFactory.getReferenceSequenceFile(ReferenceSequenceFileFactory.java:139)
at org.broadinstitute.hellbender.utils.fasta.CachingIndexedFastaSequenceFile.&lt;init&gt;(CachingIndexedFastaSequenceFile.java:148)
... 24 more
Caused by: java.nio.file.FileSystemException: /private/var/folders/5s/v5t08tmd42z_2m2c30vqf6kc0000gn/T/spark-556aa7a2-4d88-4bae-ad16-36d5af920fa9/userFiles-aeb68992-3215-4897-8f8a-040396296185/Homo_sapiens_assembly18.fasta.fai: Too many open files
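The cause chain bottoms out in the OS-level "Too many open files" (EMFILE), and the trace shows the `.fai` index being opened inside each task. To confirm which path is being opened repeatedly, the open files of the process can be grouped by name — a diagnostic sketch; `$$` samples the current shell, so substitute the executor PID:

```shell
# Group a process's open files by path and count duplicates; a leaked
# index file shows up with a large count. Replace $$ with the Spark PID.
lsof -p $$ 2>/dev/null | awk '{print $NF}' | sort | uniq -c | sort -rn | head
```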
Is there any additional parameter I can set to overcome this issue?
Any ideas welcome.
Thanks,
Marek