User Report:
Hi,
I'm trying to run gatk MarkDuplicatesSpark (v4.1.4.1) locally, i.e. not on a Spark cluster, and provided the option --conf 'spark.executor.cores=4' to tell MarkDuplicatesSpark to use only 4 cores on the machine. However, when I check the system load with e.g. top, I see that all 44 cores of the system are used by MarkDuplicatesSpark. What am I doing wrong?
command:
gatk MarkDuplicatesSpark \
    --tmp-dir /local/scratch/tmp \
    -I Control_aligned.bam \
    -O Control_aligned_sort_mkdp.bam \
    -M Control_aligned_sort_mkdp.txt \
    --create-output-bam-index true \
    --read-validation-stringency LENIENT \
    --conf 'spark.executor.cores=4'
The solution is to use this argument instead:
--spark-master local[2] -> "Run on the local machine using two cores"
More details in this doc: https://software.broadinstitute.org/gatk/documentation/article?id=11245

This Issue was generated from your [forums]
[forums]: https://gatkforums.broadinstitute.org/gatk/discussion/24671/markduplicatesspark-not-respecting-conf-spark-executor-cores-4-option/p1
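Applying the fix to the reported command, an invocation limiting MarkDuplicatesSpark to four local cores might look like the sketch below (file names and paths are taken from the user's report; the choice of local[4] is an assumption matching the user's stated intent of four cores):

```shell
# Sketch of the corrected command: replace the ineffective
# --conf 'spark.executor.cores=4' with --spark-master local[4].
# local[N] sets the thread count of the embedded local Spark master;
# spark.executor.cores configures executors launched on a cluster
# and is ignored when running in local mode.
gatk MarkDuplicatesSpark \
    --tmp-dir /local/scratch/tmp \
    -I Control_aligned.bam \
    -O Control_aligned_sort_mkdp.bam \
    -M Control_aligned_sort_mkdp.txt \
    --create-output-bam-index true \
    --read-validation-stringency LENIENT \
    --spark-master local[4]
```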