Cluster.fit command #415
Comments
Required: model list, fasta and count file, as well as the fasta file to fit.
I'm sooooo excited for this option!!!!
shhhh! don't tell anyone, it's ultra-top secret 😄
The criteria parameter lets you indicate which metric influences the fitting. Options are fit, combo and both; default=both. With fit, a sequence is fitted to an OTU if the fit improves the metric for the fitted sequences (only the metric value generated by the fit seqs is considered). With combo, a sequence is fitted to an OTU if the fit improves the metric for the fitted and reference sequences together (the metric value generated by all reference and fit sequences is considered). With both, a sequence is fitted to an OTU if it improves either the metric for the fitted sequences (fit) or the metric for the combo (combo). #415
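To make the three criteria options concrete, here is a minimal Python sketch of the decision rule as described above. It is not mothur's implementation: the function name `should_fit` and its arguments are hypothetical, and it assumes a metric where a higher value is better.

```python
def should_fit(criteria, fit_before, fit_after, combo_before, combo_after):
    """Decide whether a sequence is fitted to an OTU.

    fit_*   : metric computed on the fitted sequences only
    combo_* : metric computed on reference + fitted sequences
    Assumption (not from mothur): a higher metric value is better.
    """
    fit_better = fit_after > fit_before
    combo_better = combo_after > combo_before

    if criteria == "fit":
        return fit_better
    if criteria == "combo":
        return combo_better
    if criteria == "both":
        # "both" accepts the fit if either metric improves
        return fit_better or combo_better
    raise ValueError(f"unknown criteria: {criteria!r}")
```

For example, a fit that improves the fit-only metric but lowers the combined metric would be accepted under criteria=fit and criteria=both, and rejected under criteria=combo.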
The printref parameter allows you to indicate whether you want the reference seqs printed with the fit seqs. For example, if you are trying to see how a new patient's data changes the clustering, you want to set printref=t so the old patient and new patient OTUs are printed together. If you want to see how your data would fit with a reference like silva, setting printref=f would output only your sequences to the list file. By default printref=t for denovo clustering and printref=f when using a reference. #415
Add accnos parameter to assign reference sequences.
The accnos parameter allows you to assign reference sequences by name. This can save time: you can provide a single distance matrix containing all the sequence distances, rather than providing a sample matrix and a reference matrix and having mothur calculate the distances between the sample and reference sequences. #415
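As a rough illustration of what assigning references by name buys you, the Python sketch below partitions one combined set of pairwise distances using a list of reference names. This only shows the idea and is not how mothur handles it internally; `split_distances` and the dictionary layout are hypothetical.

```python
def split_distances(distances, reference_names):
    """Partition pairwise distances using a list of reference names.

    distances       : dict mapping (name_a, name_b) -> distance
    reference_names : names that an accnos-style file marks as references
    Hypothetical helper for illustration only.
    """
    refs = set(reference_names)
    groups = {"ref_ref": {}, "ref_sample": {}, "sample_sample": {}}
    for (a, b), dist in distances.items():
        n_refs = (a in refs) + (b in refs)  # 0, 1 or 2 references in the pair
        key = ("sample_sample", "ref_sample", "ref_ref")[n_refs]
        groups[key][(a, b)] = dist
    return groups


# Made-up names and distances, purely for demonstration
all_dists = {("A", "B"): 0.01, ("A", "N"): 0.02, ("N", "O"): 0.03}
parts = split_distances(all_dists, reference_names=["A", "B"])
```

Because every pair is already in the combined matrix, the reference-to-sample distances only need to be looked up, not recalculated.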
Fit sequences to existing dataset model.
Simple Example:
Old Data:
otu1       otu2     otu3  otu4  otu5
A,B,C,D,E  F,G,H,I  J,K   L     M
New Data:
N O P Q R
New List:
otu2   otu4  otu5
O,Q,R  N     P
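The simple example can be replayed in a few lines of Python to show how the printref setting described in the comments would change the list output. This is a sketch under stated assumptions: `build_list` is a hypothetical helper, the OTU assignments are copied from the example rather than computed, and the printref=t result is inferred from the description of that parameter.

```python
def build_list(ref_otus, fit_assignments, printref):
    """Merge fitted sequences into reference OTUs for output.

    ref_otus        : dict of OTU label -> reference sequence names
    fit_assignments : dict of OTU label -> newly fitted sequence names
    printref        : include the reference sequences in the output?
    OTUs with no members to print are dropped.
    """
    merged = {}
    for otu, ref_seqs in ref_otus.items():
        fitted = fit_assignments.get(otu, [])
        members = (list(ref_seqs) if printref else []) + list(fitted)
        if members:
            merged[otu] = members
    return merged


# Old Data from the example
ref_otus = {
    "otu1": ["A", "B", "C", "D", "E"],
    "otu2": ["F", "G", "H", "I"],
    "otu3": ["J", "K"],
    "otu4": ["L"],
    "otu5": ["M"],
}
# Where the New Data (N, O, P, Q, R) ended up in the example's New List
fit_assignments = {"otu2": ["O", "Q", "R"], "otu4": ["N"], "otu5": ["P"]}

only_fit = build_list(ref_otus, fit_assignments, printref=False)
with_refs = build_list(ref_otus, fit_assignments, printref=True)
```

Here `only_fit` reproduces the New List shown above (otu2, otu4 and otu5 only), while `with_refs` keeps all five OTUs with the old and new sequences printed together.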