way too many clusters? #166
Comments
I am not sure, but I think UPARSE is quite strict and eliminates clusters that have a low abundance or low-quality sequences. It also removes chimeras as far as I know. That may be a reason why it ends up with fewer clusters.
If I am not mistaken, the ITS sequences of fungi (if that's what you are studying) often have gaps of highly variable length when aligned. That may cause Swarm to split groups more than other algorithms.
@torognes is right. Swarm does only one thing: it makes clusters of sequences; whereas UPARSE also applies aggressive filters to remove rare sequences, low quality sequences, and chimeras. In my own analyses I use Other filters that can efficiently reduce the number of clusters:
Thank you @torognes and @frederic-mahe. I think that is possible: I do eliminate singletons in UPARSE, and it is true that it removes chimeras automatically. I can try removing chimeras before using Swarm and removing singletons after generating the clusters. What if I increase
Gian
I suggest eliminating chimeras after clustering and after removing singletons. Increasing the In my own projects, using the high resolution
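The singleton-removal step mentioned above can be sketched in a few lines. This is only a minimal Python sketch, not part of Swarm itself. It assumes Swarm's default output format (one cluster per line, space-separated amplicon identifiers) and that abundance is written as a trailing `_<count>` suffix on each identifier. The helper names `cluster_abundance` and `drop_singletons` are hypothetical, and chimera detection is not covered here.

```python
import re

# Assumption: abundance is annotated as a trailing "_<count>" on each identifier.
ABUNDANCE = re.compile(r"_(\d+)$")


def cluster_abundance(line):
    """Sum the abundances of all amplicons listed on one cluster line."""
    total = 0
    for amplicon in line.split():
        match = ABUNDANCE.search(amplicon)
        if match is None:
            raise ValueError(f"no abundance annotation in {amplicon!r}")
        total += int(match.group(1))
    return total


def drop_singletons(lines, min_abundance=2):
    """Keep only clusters whose total abundance reaches min_abundance."""
    return [
        line
        for line in lines
        if line.strip() and cluster_abundance(line) >= min_abundance
    ]


# Made-up cluster lines for illustration only.
example = [
    "seqA_120 seqB_3 seqC_1",
    "seqD_1",
    "seqE_1 seqF_1",
]
kept = drop_singletons(example)
print(len(kept), "of", len(example), "clusters kept")
```

Here a singleton is taken to mean a cluster whose total abundance is 1; raising `min_abundance` would give a stricter filter, at the cost of discarding more rare clusters.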
Thanks a lot for the explanation. I will follow your advice and let you know what I get.
Gian
I am going to close this issue. Please feel free to re-open it if need be.
Hello,
I am trying to figure out what the best value for --differences would be. I have 155 samples in my library, ITS fragments averaging 700 bp, and medium diversity, since it is a mix of soil, roots and leaves. I tested --differences 1 with --fastidious and got ~65 thousand clusters. Then I tried --differences 3 and got ~31 thousand. I think that is still a little too many; what do you think? As a note, using UPARSE I got ~6000 97% OTUs. Here's my code
Thanks a lot,
G.