I don't know whether this is a bug or I am doing something wrong, but I am reporting my experience in case it is useful to the GATK developers:
I ran a performance analysis of the tool BQSRPipelineSpark when GATK 4.0 was still in beta (for context, gatk-launch was still the command used to run the tool). The input was a whole exome sequencing dataset of about 14 GB (after applying FastqToSam and BwaAndMarkDuplicatesPipelineSpark), and the run took about 70 minutes.
I then ran the same tool in the same VM with the same input data, and it now takes 626.95 minutes, roughly nine times as long. Is this normal, or am I doing something wrong?
To double-check, I re-ran the old version of the tool with gatk-launch, and it took 65 minutes.
Possibly you are running into the Spark performance regression described in #4376. This was patched in the latest release (4.0.2.0) -- could you try running with that release and see if the issue is resolved?
@droazen great! I ran into this problem because I am using a self-deployed Docker Swarm, so I was running a GATK version from a few weeks ago. Sorry for opening a useless issue.
For completeness, I used a VM with double the resources and it took 36.92 minutes, as expected.