Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Look into adaptive pruning in GATK 4.2.0.0 #7232

Open
GATKSupportTeam opened this issue Apr 26, 2021 · 1 comment
Open

Look into adaptive pruning in GATK 4.2.0.0 #7232

GATKSupportTeam opened this issue Apr 26, 2021 · 1 comment
Assignees

Comments

@GATKSupportTeam
Copy link
Collaborator

GATKSupportTeam commented Apr 26, 2021

Mutect2 Adaptive Pruning issue as discussed in GATK OH meeting.
Here is the original post:

This request was created from a contribution made by Nabeel Ahmed on April 07, 2021 09:13 UTC.

Link: https://gatk.broadinstitute.org/hc/en-us/community/posts/360077647812-Why-do-a-clear-expected-variant-not-show-up-in-the-Mutect2-vcf-file

--

I am running Mutect2 on a sample in tumor-only mode. This sample has mutations introduced and known to be true positive calls. However, I am unable to detect some of these calls in the vcf file after Mutect2 is run that have very clear read support as seen in IGV. I have used the –bam-output option to show the output bam and in IGV, it shows that there is no assembly in this region and no mutation event was detected. I am showing the IGV screenshot for one of such calls (chr12:25398285).

I am using the latest version GATK 4.2.0.0 and the following is the full Mutect2 command from the log file

java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /gatk/gatk-package-4.2.0.0-local.jar Mutect2 -R ../resources/hg19.fa -L ../resources/coding_regions.bed -I bam_files/sample1.bam --pon ../resources/pon.vcf.gz --germline-resource ../resources/af-only-gnomad.raw.sites.hg19.vcf.gz --bam-output sample1.mutect2_out.bam --recover-all-dangling-branches true -min-pruning 1 --min-dangling-branch-length 2 --debug --max-reads-per-alignment-start 0 --genotype-pon-sites True --f1r2-tar-gz vcf_files/f1r2.sample1.tar.gz -O vcf_files/unfiltered.sample1.vcf  

In the debug mode, the following log messages are generated for this region

08:01:26.086 INFO  Mutect2Engine - Assembling chr12:25398242-25398320 with 14298 reads:    (with overlap region = chr12:25398142-25398420)

I have another call with similar VAF that is detected in the vcf output(chr12:25380275).

chr12 25380275   .    T    G    .    .     AS_SB_TABLE=3911,5343|26,21;DP=9485;ECNT=1;MBQ=36,36;MFRL=0,0;MMQ=42,42;MPOS=18;POPAF=7.30;TLOD=53.53     GT:AD:AF:DP:F1R2:F2R1:SB   0/1:9254,47:4.970e-03:9301:5321,21:3867,26:3911,5343,26,21

The input and the output BAMs show this call with the variant.

In the logs, it shows the detection of an active region here:

08:01:23.642 INFO  Mutect2Engine - Assembling chr12:25380238-25380327 with 19912 reads:    (with overlap region = chr12:25380138-25380427)

08:01:24.119 INFO  EventMap - >> Events = EventMap{chr12:25380275-25380275 [T*, G],}

08:01:24.154 INFO  AssemblyResultSet - Trimming active region AssemblyRegion chr12:25380238-25380327 active?=true nReads=19912 with 2 haplotypes

08:01:24.154 INFO  AssemblyResultSet - Trimmed region to chr12:25380255-25380295 and reduced number of haplotypes from 2 to only 2

08:01:25.383 INFO  EventMap - >> Events = EventMap{chr12:25380275-25380275 [T*, G],}

I have tried troubleshooting with the steps stated in this [blog](/hc/en-us/articles/360043491652-When-HaplotypeCaller-and-Mutect2-do-not-call-an-expected-variant). However, it does not change the output vcf. I used the force-calling mode by giving the above call in an input vcf and the call did appear in the vcf file.

chr12 25398285   .    C    A    .    .     AS_SB_TABLE=4312,3630|14,8;DP=8096;ECNT=1;MBQ=36,36;MFRL=0,0;MMQ=42,42;MPOS=22;POPAF=7.30;TLOD=14.69     GT:AD:AF:DP:F1R2:F2R1:SB   0/1:7942,22:2.576e-03:7964:3609,8:4268,13:4312,3630,14,8

However, I cannot rely on force-calling mutations on a set of calls. I am unsure if I am missing out more calls that are not showing up. Are there any parameters I need to tune so that I do not miss calls like above?

(created from Zendesk ticket #136765)
gz#136765

@nabeel-bioinfo
Copy link

Thank you for taking a look into this. I followed recommendation of @gbrandt6 and reduced the --pruning-lod-threshold but this call is still unable to make it to the output of Mutect2. I tried different thresholds from 1.3, 0.7, 0.5 and even 0.1 but it did not lead to any difference in detecting this call.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants