I noticed that although the default --similarity cutoff is documented as 0.95, the tool actually uses a cutoff of 0.80 when merging and clustering results from samples. I feel 0.80 is a bit conservative. What do you suggest?
Additionally, I am strain profiling a species that is present at very high coverage in my metagenomes (it makes up > 80% based on Kraken2 and Metaphlan3). In that case, do you recommend depth (DP) or mapping quality (MAPQ) based filtering to remove low-quality SNPs?
Also, please let me know if you have any other recommendations.
Thanks,
Shriram
Thank you for your careful observation. The default cutoff I set is actually 0.80. I will revise it.
In practice, please use 'python3 merge.py ' to cluster the strains with different cutoff values such as 0.85, 0.90, 0.95, and 0.97. The script saves the results for each cutoff value, so you can choose the value that is most meaningful for your analysis.
In the script, --qual filters SNPs by SNP quality, and --snp_ratio filters SNPs by comparing the depth at the locus with the species' mean depth. You can use these two parameters to filter the SNPs, so I do not think it is necessary to filter on DP and MAPQ as well.
You can also use 'samtools tview' or other tools to inspect the reads mapped at a SNP locus. If a SNP does not look reliable, filter with stricter criteria. It is useful to test several criteria.
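To get a feel for how the cutoff changes the outcome, here is a minimal sketch that clusters a small, made-up similarity matrix at several cutoffs. It assumes single-linkage clustering (two strains are joined whenever their similarity is at or above the cutoff), which may differ from what merge.py does internally. The function name `cluster_at_cutoff` and the matrix values are hypothetical and only for illustration.

```python
def cluster_at_cutoff(sim, cutoff):
    """Single-linkage clustering: join strains i and j when sim[i][j] >= cutoff."""
    n = len(sim)
    parent = list(range(n))  # union-find forest, one node per strain

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] >= cutoff:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical pairwise similarities between four strains (not real data).
similarity = [
    [1.00, 0.96, 0.88, 0.82],
    [0.96, 1.00, 0.86, 0.81],
    [0.88, 0.86, 1.00, 0.91],
    [0.82, 0.81, 0.91, 1.00],
]

cluster_counts = {}
for cutoff in (0.80, 0.85, 0.90, 0.95, 0.97):
    clusters = cluster_at_cutoff(similarity, cutoff)
    cluster_counts[cutoff] = len(clusters)
    print(cutoff, clusters)
```

With this toy matrix, a low cutoff merges all four strains into one cluster, while higher cutoffs split them progressively, which is why comparing several values is worthwhile.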
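As a rough illustration of the idea behind these two filters, the sketch below keeps a SNP only if its quality is high enough and its locus depth is not too low relative to the species' mean depth. This is an assumption about how such filters could work, not the script's actual implementation: the function `filter_snps`, the thresholds `min_qual` and `min_ratio`, the direction of the ratio test, and the SNP records are all hypothetical.

```python
def filter_snps(snps, mean_depth, min_qual=20.0, min_ratio=0.5):
    """Keep SNPs that pass a quality threshold and a depth-ratio threshold.

    The ratio is locus depth divided by the species' mean depth; both
    thresholds are illustrative defaults, not values taken from the tool.
    """
    kept = []
    for snp in snps:
        ratio = snp["depth"] / mean_depth
        if snp["qual"] >= min_qual and ratio >= min_ratio:
            kept.append(snp)
    return kept


# Hypothetical SNP records (position, quality, locus depth).
snps = [
    {"pos": 101, "qual": 50.0, "depth": 180},  # passes both filters
    {"pos": 250, "qual": 12.0, "depth": 210},  # fails the quality filter
    {"pos": 377, "qual": 45.0, "depth": 40},   # fails the depth-ratio filter
]

kept = filter_snps(snps, mean_depth=200.0)
print([s["pos"] for s in kept])
```

Because the ratio is relative to the species' own mean depth, the same threshold can be reused for species at very different coverage levels.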