--similarity cutoff default #9

Open
ShriramHPatel opened this issue Jul 20, 2021 · 1 comment

Comments

@ShriramHPatel

Hi,

I noticed that although the default for --similarity cutoff is documented as 0.95, a cutoff of 0.80 is actually used when merging and clustering results from samples. I feel 0.80 is a bit conservative. What do you suggest?

Additionally, I am strain-profiling a species that is present at very high coverage in my metagenomes (it makes up > 80% based on Kraken2 and Metaphlan3). In that case, do you recommend filtering on depth (DP) or mapping quality (MAPQ) to remove low-quality SNPs?

Also, please let me know if you have any other recommendations.

Thanks,
Shriram

@wshuai294
Owner

Hi Shriram,

Thank you for your careful observation. The default cutoff I set is actually 0.80. I will revise it.

In practice, please use 'python3 merge.py' to cluster the strains with different cutoff values, such as 0.85, 0.90, 0.95, and 0.97. The script will save the results for the different cutoff values, so you can choose a value that is meaningful for your analysis.
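To illustrate why it helps to compare several cutoffs, here is a minimal sketch of threshold-based clustering. It is not the code in merge.py: the function `cluster_at_cutoff`, the single-linkage rule, the strain names, and the similarity values are all hypothetical and only show how the number of clusters can change with the cutoff.

```python
from itertools import combinations


def cluster_at_cutoff(similarity, names, cutoff):
    """Single-linkage clustering: link two strains when similarity >= cutoff.

    Hypothetical helper for illustration; merge.py may cluster differently.
    """
    parent = {name: name for name in names}

    def find(x):
        # Follow parent pointers to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(names, 2):
        if similarity[frozenset((a, b))] >= cutoff:
            parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Made-up pairwise similarities between four strains from different samples.
names = ["s1", "s2", "s3", "s4"]
similarity = {
    frozenset(("s1", "s2")): 0.98,
    frozenset(("s1", "s3")): 0.88,
    frozenset(("s1", "s4")): 0.60,
    frozenset(("s2", "s3")): 0.86,
    frozenset(("s2", "s4")): 0.62,
    frozenset(("s3", "s4")): 0.55,
}

results = {
    cutoff: cluster_at_cutoff(similarity, names, cutoff)
    for cutoff in (0.80, 0.85, 0.90, 0.95, 0.97)
}
for cutoff, clusters in results.items():
    print(f"cutoff={cutoff:.2f}: {len(clusters)} clusters {clusters}")
```

With these made-up values, s3 joins the s1/s2 cluster at cutoffs of 0.80 and 0.85 but is split off at 0.90 and above, which is the kind of difference worth checking against what you know about your samples.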

The script provides --qual to filter SNPs by SNP quality, and --snp_ratio to filter SNPs by comparing the depth at the locus with the species' mean depth. You can use these two parameters to filter the SNPs, so I don't think it is really necessary to filter on DP and MAPQ.

You can also use 'samtools tview' or a similar tool to inspect the reads mapped at a SNP locus. If the SNPs do not look reliable, you can filter them with stricter criteria. It is useful to test several criteria.
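As a rough sketch of the idea behind quality and depth-ratio filtering, the snippet below keeps a SNP only if its quality is high enough and its depth is not too low relative to the species' mean depth. This is an assumption about how such a filter could work, not the script's actual logic: the `Snp` class, `filter_snps`, the parameter names `min_qual` and `min_depth_ratio`, the direction of the ratio test, and all values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Snp:
    pos: int      # position on the reference
    qual: float   # SNP quality score
    depth: int    # read depth at the locus


def filter_snps(snps, species_mean_depth, min_qual, min_depth_ratio):
    """Keep SNPs that pass both a quality and a relative-depth threshold.

    Assumed rule (hypothetical): depth / species_mean_depth >= min_depth_ratio.
    """
    if species_mean_depth <= 0:
        raise ValueError("species_mean_depth must be positive")
    return [
        snp
        for snp in snps
        if snp.qual >= min_qual
        and snp.depth / species_mean_depth >= min_depth_ratio
    ]


# Made-up SNP records for a species with a mean depth of 100x.
snps = [
    Snp(pos=101, qual=50.0, depth=95),   # passes both thresholds
    Snp(pos=202, qual=10.0, depth=110),  # fails the quality threshold
    Snp(pos=303, qual=60.0, depth=20),   # fails the depth-ratio threshold
]

kept = filter_snps(snps, species_mean_depth=100.0, min_qual=20.0, min_depth_ratio=0.5)
print([snp.pos for snp in kept])
```

Because the depth threshold is relative to the species' mean depth, it scales with coverage, which may matter for a species that dominates the metagenome.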

Best,
Shuai
