not all annotated features are allocated to the clusters #359

Closed
yanadya opened this issue Oct 28, 2017 · 1 comment

Comments

yanadya commented Oct 28, 2017

Andrew,
Thank you very much for the brilliant software that really advanced our work!

While running a few tests, I noticed that not all annotated features are assigned to clusters.

I started with a simple case: running Roary on two identical PacBio sequences, which are copies of one another and fully assembled into one chromosome.
The total number of annotated features in each GFF file is 5593.

roary -p 8 -s *gff
Core genes (99% <= strains <= 100%) 5090
Soft core genes (95% <= strains < 99%) 0
Shell genes (15% <= strains < 95%) 0
Cloud genes (0% <= strains < 15%) 0
Total genes (0% <= strains <= 100%) 5090

The result above is stable: after multiple runs it always finishes with the same numbers.

The total number of proteins in the clustered_proteins file is 10486, so only 5243 features from each GFF file were clustered.
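For anyone who wants to reproduce this count, here is a minimal sketch in Python. It assumes each line of clustered_proteins has the form `<cluster name>: <id> <id> ...` with whitespace-separated sequence IDs; the function name `count_clustered_ids` and the toy IDs are made up for illustration and are not part of Roary.

```python
def count_clustered_ids(lines):
    """Return (number of clusters, total sequence IDs) for clustered_proteins-style lines."""
    clusters = 0
    total_ids = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Assumed layout: "<cluster name>: <id> <id> ..."; split at the first colon.
        _name, _, members = line.partition(":")
        clusters += 1
        total_ids += len(members.split())
    return clusters, total_ids


# Toy input with hypothetical IDs, standing in for a real clustered_proteins file.
toy = [
    "dnaA: A_00001\tB_00001",
    "group_2: A_00002\tB_00002",
    "group_3: A_00003",
]
print(count_clustered_ids(toy))

# The arithmetic behind the numbers reported in this issue.
per_genome = 10486 // 2      # clustered IDs per GFF file (two identical inputs)
missing = 5593 - per_genome  # annotated features that never reached a cluster
print(per_genome, missing)
```

On a real run, passing an open file handle instead of the toy list (for example `with open("clustered_proteins") as fh: count_clustered_ids(fh)`) should give the totals directly, provided the file follows the assumed layout.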

It is not clear to me why the remaining 350 features per GFF file were not assigned to any cluster.

I checked the features that weren't clustered and couldn't come up with a simple explanation. They are a mix of
39 CDS
202 misc_RNA
1 tmRNA
108 tRNA
and are distributed across the genome. While half of them are quite short (< 100 nt), the other half are of comparable length to those that were clustered.
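A breakdown like the one above can be produced with a short script. The sketch below is only one way to do it and rests on a few assumptions: the GFF files are tab-separated GFF3 with the feature type in column 3 and an `ID=` attribute in column 9, any embedded sequence section starts with `##FASTA`, and those IDs are the same ones that appear in clustered_proteins. The function name `unclustered_by_type` and the toy rows are hypothetical.

```python
from collections import Counter


def unclustered_by_type(gff_lines, clustered_ids):
    """Count GFF features, per feature type, whose ID is not in clustered_ids."""
    counts = Counter()
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # sequence section follows; no more feature rows
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete feature row
        # Column 9 holds "key=value" pairs separated by semicolons.
        attrs = dict(
            field.split("=", 1) for field in cols[8].split(";") if "=" in field
        )
        feature_id = attrs.get("ID")
        if feature_id is not None and feature_id not in clustered_ids:
            counts[cols[2]] += 1  # column 3 is the feature type
    return counts


# Toy GFF rows with hypothetical IDs; only A_00001 is treated as clustered.
toy_gff = [
    "##gff-version 3",
    "chr1\texample\tCDS\t1\t300\t.\t+\t0\tID=A_00001;product=x",
    "chr1\texample\ttRNA\t400\t475\t.\t+\t.\tID=A_00002;product=tRNA-Ala",
    "chr1\texample\tCDS\t500\t590\t.\t-\t0\tID=A_00003",
    "##FASTA",
    ">chr1",
]
result = unclustered_by_type(toy_gff, {"A_00001"})
print(dict(result))
```

The set of clustered IDs would come from parsing clustered_proteins first; feature length (end minus start plus one, from columns 4 and 5) could be tallied in the same loop to check the short-versus-long split.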

I would be very grateful if you could give me some insight into why this happens.
I can share the test sequence.

Thank you,
Nadejda

yanadya commented Oct 31, 2017

Closing the issue, as I already know the answer: blastp is the key )

yanadya closed this as completed Oct 31, 2017