Unclassified vs "not classified" #41

Closed
mihinduk opened this issue Jun 15, 2020 · 3 comments

Comments

@mihinduk

Hi,
I am trying to use CAT to annotate contigs derived from Illumina sequencing and assembled with Megahit. My understanding is that "not classified" means that something prevented the classification of the contig, such as "no hits to database", "hits not found in taxonomy files" or "no ORFs found". However, I have a number of "classified" contigs with hits in the out.CAT.alignment.diamond file that have alignment scores of > 0.5 at one or two levels to an NCBI protein ID. Although that protein ID has taxonomy, these contigs are labeled "not classified" at all levels.

Why are these in two separate bins, rather than the second case being a subset of the first, i.e. "unable to be confidently classified"? Am I misunderstanding these results?

Thank you for your help,
Kathie Mihindukulasuriya

@Finesim97

Finesim97 commented Jun 16, 2020

Hi,
I also had some contigs like this. If you look at the numeric taxonomic lineage, you should see that those contigs are classified as 131567 (cellular organisms) or even the root (1). This can be caused by classifications as environmental samples.
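A quick way to check this is to look at the last taxid in the numeric lineage. The sketch below is only an illustration, not part of CAT: it assumes the lineage is available as a semicolon-separated string of NCBI taxids (for example "1;131567;2"), and the function names are made up for this example.

```python
# Taxids that sit above superkingdom in the NCBI taxonomy:
# 1 is the root, 131567 is "cellular organisms".
ABOVE_SUPERKINGDOM = {"1", "131567"}


def deepest_taxid(lineage: str) -> str:
    """Return the last taxid of a semicolon-separated numeric lineage.

    Assumption: the lineage runs from the root towards the most specific
    taxon, e.g. "1;131567;2". An empty lineage yields an empty string.
    """
    taxids = [t.strip() for t in lineage.split(";") if t.strip()]
    return taxids[-1] if taxids else ""


def stops_above_superkingdom(lineage: str) -> bool:
    """True if the lineage ends at the root or at cellular organisms."""
    return deepest_taxid(lineage) in ABOVE_SUPERKINGDOM


if __name__ == "__main__":
    for lineage in ["1", "1;131567", "1;131567;2;1224"]:
        print(lineage, stops_above_superkingdom(lineage))
```

Contigs for which this returns True would have a lineage but nothing to report at superkingdom or below.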
Just to check this is the case, would you mind sharing the affected contigs or the CAT output?
Have a nice day,
Lukas Jansen

@bastiaanvonmeijenfeldt
Collaborator

Hi @mihinduk and @Finesim97,

Yes, you are right! Sometimes a contig is classified at a level higher than superkingdom, in which case CAT considers it classified, but you won't find any classification at the official ranks.

The logic for this behaviour is that naming the lineage at official ranks is a post-classification step, and CAT is agnostic about the level at which a classification must be made before a contig is called unclassified.
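To illustrate the separation between the two steps, here is a minimal sketch of a naming step. It is not CAT's actual code: the toy taxonomy table, the rank list and the function name are assumptions for illustration, and a real run would read ranks and names from the NCBI taxonomy files.

```python
OFFICIAL_RANKS = [
    "superkingdom", "phylum", "class", "order", "family", "genus", "species",
]

# Toy taxonomy table for illustration only: taxid -> (rank, name).
TOY_TAXONOMY = {
    "1": ("no rank", "root"),
    "131567": ("no rank", "cellular organisms"),
    "2": ("superkingdom", "Bacteria"),
    "1224": ("phylum", "Proteobacteria"),
}


def name_official_ranks(lineage, taxonomy=TOY_TAXONOMY):
    """Map a semicolon-separated taxid lineage onto the official ranks.

    Every official rank starts out as "not classified" and is filled in
    only if some taxid in the lineage carries that rank.
    """
    names = {rank: "not classified" for rank in OFFICIAL_RANKS}
    for taxid in lineage.split(";"):
        rank, name = taxonomy.get(taxid.strip(), ("no rank", ""))
        if rank in names:
            names[rank] = name
    return names


if __name__ == "__main__":
    # Lineage was assigned, but it stops above superkingdom.
    print(name_official_ranks("1;131567"))
    # Lineage reaches phylum, so two official ranks get names.
    print(name_official_ranks("1;131567;2;1224"))
```

In this sketch the lineage "1;131567" is a valid classification, yet every official rank stays "not classified", which matches the output described in this issue.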

Hope this helps!

Best wishes,

Bastiaan

@mihinduk
Author

This makes sense. Thank you so much for your quick reply.

Kathie
