Maybe it is a silly question, but wouldn't it be rather inefficient to include these steps if you are using a set of 600 genomes, for example? OK, it is not a lot, but... Are there any statistics (just out of curiosity)?
You can use annotation from GenBank (or RAST) if you wish, and there are instructions on the Roary webpage. The important thing is that all annotation and ORF prediction is performed using the same method; otherwise you will just get lots of noise and false signals. GenBank is not ideal, since the submitters of genomes can supply their own annotation, so you can end up with a big mixture of different annotation methods. RefSeq is much better because it uses PGAP to ensure consistent annotation (with some exceptions to watch out for).
For an accurate study, I prefer to use RefSeq .gbff genomes because they share the same annotation process. Thank you Andrew. I will try both files and methods.
@felipelira I think the issue with RefSeq .gff files is that they do not have the FASTA sequence appended to them, and sometimes the GFF "ID" does not match the FASTA "ID".
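To illustrate the two problems mentioned above, here is a minimal sketch in Python that checks whether the sequence IDs in a GFF match the FASTA record IDs and, if they do, appends the sequences after a `##FASTA` line. The function names (`gff_seqids`, `fasta_ids`, `combine_gff_fasta`) are made up for this example and are not part of Roary; it also assumes the "ID" in question is the first GFF column, which may not cover every mismatch you run into.

```python
def gff_seqids(gff_text):
    """Collect the sequence IDs from column 1 of the GFF feature lines."""
    ids = set()
    for line in gff_text.splitlines():
        if line.startswith("##FASTA"):
            break  # everything after this directive is sequence data
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        ids.add(line.split("\t", 1)[0])
    return ids


def fasta_ids(fasta_text):
    """Collect the record IDs (first word after '>') from FASTA headers."""
    ids = set()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            words = line[1:].split()
            if words:
                ids.add(words[0])
    return ids


def combine_gff_fasta(gff_text, fasta_text):
    """Return the GFF with the FASTA appended after a ##FASTA line.

    Raises ValueError if a GFF sequence ID has no matching FASTA record.
    A GFF that already contains a ##FASTA section is returned unchanged.
    """
    if any(l.startswith("##FASTA") for l in gff_text.splitlines()):
        return gff_text
    missing = gff_seqids(gff_text) - fasta_ids(fasta_text)
    if missing:
        raise ValueError(f"GFF IDs not found in FASTA: {sorted(missing)}")
    return gff_text.rstrip("\n") + "\n##FASTA\n" + fasta_text.rstrip("\n") + "\n"
```

In practice you would read each genome's .gff and .fna into strings, call `combine_gff_fasta` on the pair, and write the result to a new .gff file. Genomes that raise `ValueError` are the ones whose IDs need fixing first.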