
Add to README: MarkDuplicatesSpark syntax for specifying the number of cores on a local machine #6324

Closed
bhanugandham opened this issue Dec 16, 2019 · 0 comments · Fixed by #6682
bhanugandham commented Dec 16, 2019

User Report:

Hi,

I'm trying to run gatk MarkDuplicatesSpark (v4.1.4.1) locally, i.e. not on a Spark cluster, and provided the option --conf 'spark.executor.cores=4' to tell MarkDuplicatesSpark to use only 4 cores on the machine. However, when I check the system load with e.g. top, I see that all 44 cores of the system are used by MarkDuplicatesSpark. What am I doing wrong?

command:
gatk MarkDuplicatesSpark \
  --tmp-dir /local/scratch/tmp \
  -I Control_aligned.bam \
  -O Control_aligned_sort_mkdp.bam \
  -M Control_aligned_sort_mkdp.txt \
  --create-output-bam-index true \
  --read-validation-stringency LENIENT \
  --conf 'spark.executor.cores=4'


The solution is to use the --spark-master argument instead: --spark-master local[2] means "run on the local machine using two cores". The spark.executor.cores property configures executors on a Spark cluster and is ignored when the tool runs in local mode, which is why it did not limit core usage here. A corrected invocation is sketched below. More details in this doc: https://software.broadinstitute.org/gatk/documentation/article?id=11245
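For reference, a minimal sketch of the corrected command from the report above, assuming four local cores are desired:

gatk MarkDuplicatesSpark \
  --tmp-dir /local/scratch/tmp \
  -I Control_aligned.bam \
  -O Control_aligned_sort_mkdp.bam \
  -M Control_aligned_sort_mkdp.txt \
  --create-output-bam-index true \
  --read-validation-stringency LENIENT \
  --spark-master 'local[4]'   # limit Spark to 4 cores in local mode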

This issue was generated from a user report on the GATK forums: https://gatkforums.broadinstitute.org/gatk/discussion/24671/markduplicatesspark-not-respecting-conf-spark-executor-cores-4-option/p1
