Distance matrix from classification and class2tree #849
thx for your question @morgan-sparks - @trvinh can you give some details? It'd be worth adding a more detailed description to the function documentation too as it's pretty thin on what is being done within the function.
Hi @morgan-sparks, for example, these are the rank vectors for 3 species:
Then our aligned rank matrix will look like:
It will be converted into this ID matrix (note: any missing ranks will have a pseudo ID from the previous rank):
That aligned ID matrix will be used for clustering. As you can see, our tree will be:
So, the classification clusters the taxa based not only on the ranks (e.g. species, family, or phylum), but also on the actual taxonomic clade (specified by the IDs). This means it can cluster Arabidopsis within the plant clade and snakes within reptiles. The more information you have in the aligned ID matrix (taxa with detailed taxonomy strings, and enough taxa to cover all possible ranks), the higher the resolution you can get for your taxonomy tree. For example, these are the first few lines of a real aligned ID matrix I am working with (not exactly the same as the one
Have I answered your question, Morgan? I hope that it can help! :-)
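For readers who want to see the idea end to end, here is a minimal sketch in Python of the steps described above: align the ranks, fill any missing rank with a pseudo ID taken from the previous rank, and compare the resulting ID rows. Everything in it is an assumption made for illustration: the taxa, the IDs, the `_pseudo` suffix, and the simple mismatch-fraction distance are hypothetical, and this is not the code or the distance metric that `class2tree` actually uses.

```python
from itertools import combinations

# Hypothetical taxa (made-up IDs): ordered (rank, ID) pairs, highest rank first.
TAXA = {
    "sp_A": [("kingdom", "k1"), ("phylum", "p1"), ("family", "f1"), ("species", "s1")],
    "sp_B": [("kingdom", "k1"), ("phylum", "p1"), ("species", "s2")],  # no family rank
    "sp_C": [("kingdom", "k1"), ("phylum", "p2"), ("family", "f3"), ("species", "s3")],
}

# Hypothetical set of aligned ranks, highest first.
RANKS = ["kingdom", "phylum", "family", "species"]


def align_ids(lineage, ranks):
    """Return one ID per rank; a missing rank gets a pseudo ID from the previous rank."""
    by_rank = dict(lineage)
    row, previous = [], "root"
    for rank in ranks:
        if rank in by_rank:
            previous = by_rank[rank]
            row.append(previous)
        else:
            # Assumed naming scheme for pseudo IDs; the real one may differ.
            row.append(f"{previous}_pseudo")
    return row


def mismatch_distance(row_a, row_b):
    """Fraction of ranks at which two aligned ID rows differ (an assumed metric)."""
    differing = sum(a != b for a, b in zip(row_a, row_b))
    return differing / len(row_a)


# Aligned ID matrix: one row per taxon, one column per rank.
id_matrix = {name: align_ids(lineage, RANKS) for name, lineage in TAXA.items()}

# Pairwise distances between all taxa.
distances = {
    (a, b): mismatch_distance(id_matrix[a], id_matrix[b])
    for a, b in combinations(sorted(id_matrix), 2)
}

# The closest pair is the one a hierarchical clustering would join first.
closest_pair = min(distances, key=distances.get)

if __name__ == "__main__":
    for name, row in id_matrix.items():
        print(name, row)
    for pair, d in distances.items():
        print(pair, d)
    print("joined first:", closest_pair)
```

In this toy input, sp_A and sp_B share a kingdom and a phylum ID, so they end up closer to each other than either is to sp_C, even though sp_B has no family rank of its own. That is the point made above: the clustering follows shared clade IDs, not just which ranks are present.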
thanks @trvinh - can you add some of that to the function documentation?
I was thinking in the documentation for the class2tree function https://github.com/ropensci/taxize/blob/master/R/class2tree.R#L1-L52 - perhaps in the you can also add docs to specific internal methods if you want, and then use
I will try :-)
thank you!