Questions Regarding Heterozygous Variants, Somatic Mutations, and Phasing in ClairS Usage #18
Hi @sloth-eat-pudding, thank you for your interest in ClairS.
We look forward to a new LongPhase version for somatic variant calling! Zhenxian
I apologize for not being clear earlier.
Q1. I searched for the confident heterozygous germline variants identified by ClairS in the normal.bam file, but found that they actually contain homozygous variants (as shown in Figure 2).
Q3. This is our development goal. Additionally, I noticed that Clair3's Makefile uses LongPhase v1.3. Our current version is v1.6, and I suggest upgrading to it for improved accuracy and processing time.
For Q1, there seem to be no homozygous variants in the normal BAM in Figure 2. Are you referring to homozygous reference sites (that is, the same allele as the reference base)? Thanks for reporting this; we will check the details. For Q3, thanks for the suggestion. We will update LongPhase to v1.6 in our next release.
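To make the homozygous-reference versus homozygous-variant distinction concrete, here is a minimal sketch that classifies a VCF `GT` string. It assumes only the standard VCF genotype encoding (`0` for the reference allele, `/` or `|` as separators, `.` for missing). The function name `classify_genotype` is hypothetical and is not part of ClairS, Clair3, or LongPhase.

```python
import re


def classify_genotype(gt: str) -> str:
    """Classify a VCF GT string such as '0/1' or '1|1'.

    Hypothetical helper for illustration only.
    """
    # Split on the unphased ('/') or phased ('|') separator.
    alleles = re.split(r"[/|]", gt)
    if any(a == "." for a in alleles):
        return "missing"
    # Two different allele indices means a heterozygous call.
    if len(set(alleles)) > 1:
        return "heterozygous"
    # All alleles identical: index 0 is the reference allele.
    return "hom_ref" if alleles[0] == "0" else "hom_alt"
```

Under this encoding, a site where the normal sample shows only the reference base would be `0/0` (`hom_ref`), not a homozygous variant (`1/1`, `hom_alt`).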
Q1. Your understanding is correct. Thank you for your confirmation.
Dear Clair Team,
I am a member of the longphase development team. Recently, while using ClairS and studying the related literature, I observed some results that raised a few questions.
Q1. While running ont_quick_demo.sh, I used IGV to inspect the VCFs produced in the process shown in Figure 1. After organizing the data, I noticed that only the tumor samples were heterozygous (as shown in Figure 2). My question is: during the germline variant calling step, are only the tumor cells identified as confident heterozygous?
Q2. From Figure 2, I observed that the SNPs used for longphase actually include some somatic mutations. I would like to confirm whether the SNPs used for phasing are indeed a mix of somatic mutations and germline variants.
Q3. In Figure 3, the number of SNPs used for phasing seems insufficient to cover the entire range. Trying other variant calling software, I found that Clair3 (v1.0.5) results could cover a broader range. After phasing and haplotagging with Clair3 and longphase (v1.6), I obtained the results shown in Figure 4 (B) (displaying only the SNPs that were phased). I noticed that different haplotypes could still be distinguished in ClairS's output (as seen in Figure 5). Have you considered using this method?
Q4. For ClairS, tagging appears to be a crucial step. The ideal output would include H1/H2 and H2 (carrying somatic mutations), or H1, H2, H3 (tumor-specific), etc. Would such an approach be beneficial for the training or detection of the ClairS model? If you think it would be helpful, we are considering developing a new version focused on somatic mutations.
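The question above about whether the SNPs used for phasing mix somatic mutations and germline variants can also be checked outside IGV. The following is a minimal sketch, assuming two uncompressed VCFs are available as lists of text lines: the SNPs passed to the phasing step and the somatic calls. The helper names `read_sites` and `split_phasing_sites` are hypothetical and not part of ClairS or longphase, and the actual intermediate file names in the ClairS output are not assumed here.

```python
def read_sites(vcf_lines):
    """Collect (chrom, pos, ref, alt) keys from VCF text lines.

    Hypothetical helper: header lines (starting with '#') and blank
    lines are skipped; only the first five VCF columns are used.
    """
    sites = set()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        sites.add((chrom, int(pos), ref, alt))
    return sites


def split_phasing_sites(phasing_lines, somatic_lines):
    """Return (sites also called somatic, sites not called somatic)."""
    phasing = read_sites(phasing_lines)
    somatic = read_sites(somatic_lines)
    return phasing & somatic, phasing - somatic
```

An open file handle can be passed directly in place of a list of lines. A non-empty first set would indicate that some SNPs used for phasing were also reported as somatic.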
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Thank you for your time and assistance!