Andrew,
Thank you very much for the brilliant software that really advanced our work!
While running a few tests, I noticed that not all annotated features are assigned to clusters.
I started with a simple case: running Roary on two identical PacBio sequences, exact copies of one another, each fully assembled into one chromosome.
The total number of annotated features in each GFF file is 5593.
roary -p 8 -s *gff
Core genes (99% <= strains <= 100%) 5090
Soft core genes (95% <= strains < 99%) 0
Shell genes (15% <= strains < 95%) 0
Cloud genes (0% <= strains < 15%) 0
Total genes (0% <= strains <= 100%) 5090
The result above is stable: after multiple runs it always finishes with the same numbers.
The total number of proteins in the clustered_proteins file is 10486, so only 5243 of the 5593 features from each GFF were clustered, leaving 350 per genome unassigned.
It is not clear to me why some genetic features were not assigned to any cluster.
I checked the ones that weren't clustered and couldn't come up with a simple explanation. They are a mix of:
39 CDS
202 misc_RNA
1 tmRNA
108 tRNA
and they are distributed across the genome. While half of them are quite short (< 100 nt), the other half are of comparable length to those that were clustered.
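For anyone who wants to reproduce this breakdown, here is a minimal sketch of how such a tally could be produced. It assumes that each line of clustered_proteins has the form `cluster_name: id1<TAB>id2...`, and that the GFF is GFF3 with an `ID=` attribute in column 9 and an optional trailing `##FASTA` section. The function names are only illustrative and are not part of Roary.

```python
from collections import Counter


def gff_features(lines):
    """Yield (feature_id, feature_type) for each annotated row of a GFF3 file."""
    for line in lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence section follows; no more annotation rows
        if line.startswith("#") or not line.strip():
            continue  # skip pragmas, comments, and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a well-formed feature row
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        if "ID" in attrs:
            yield attrs["ID"], cols[2]


def clustered_ids(lines):
    """Collect every member ID listed in a clustered_proteins-style file."""
    ids = set()
    for line in lines:
        # Assumed layout: "cluster_name: id1<TAB>id2<TAB>..."
        _, _, members = line.partition(": ")
        ids.update(members.split())
    return ids


def unclustered_by_type(gff_lines, cluster_lines):
    """Count, per feature type, the GFF features absent from all clusters."""
    clustered = clustered_ids(cluster_lines)
    return Counter(
        ftype for fid, ftype in gff_features(gff_lines) if fid not in clustered
    )
```

With hypothetical file names, it could be called as `unclustered_by_type(open("sample1.gff"), open("clustered_proteins"))`. The counts in this post are at least self-consistent: 5593 - 10486 / 2 = 350 = 39 + 202 + 1 + 108.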
I would be very grateful if you could give me some insight into why this could happen.
I can share the test sequence.
Thank you,
Nadejda