The databases need to include "Plasmids" too #12

Open
tseemann opened this issue Feb 4, 2015 · 7 comments

@tseemann
tseemann commented Feb 4, 2015

Derrick,

I notice that my pure bacterial samples often still have, say, 5% of reads unclassified. When I assemble these reads, they turn out to be bacterial plasmids.
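
One rough way to quantify this is to count the unclassified reads in Kraken's per-read output and collect their IDs for assembly. This is a minimal sketch, assuming the standard tab-delimited output in which the first column is `C` (classified) or `U` (unclassified) and the second column is the read ID. The function name and the sample lines are hypothetical, made up for illustration.

```python
def unclassified_stats(lines):
    """Return (fraction_unclassified, unclassified_read_ids) for Kraken output lines."""
    total = 0
    unclassified_ids = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Skip blank or malformed lines rather than guessing at their meaning.
        if len(fields) < 2 or fields[0] not in ("C", "U"):
            continue
        total += 1
        if fields[0] == "U":
            unclassified_ids.append(fields[1])
    fraction = len(unclassified_ids) / total if total else 0.0
    return fraction, unclassified_ids


# Made-up example lines (not real data).
example = [
    "C\tread1\t562\t100\t562:70",
    "U\tread2\t0\t100\t0:70",
    "C\tread3\t562\t100\t562:70",
    "C\tread4\t1280\t100\t1280:70",
]
fraction, ids = unclassified_stats(example)
print(f"{fraction:.0%} unclassified: {ids}")
```

The returned IDs could then be used to pull the matching reads out of the original FASTQ before assembling them.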

The problem is that some of the Bacteria folders have chromosomes and plasmids, but there are also many separately submitted plasmids which are in a different Plasmids folder at NCBI:

ftp://ftp.ncbi.nih.gov/genomes/Plasmids/

It would be great to add support for this in the download tools, and in MiniKraken.

@DerrickWood
Owner

Would the plasmids.all.fna.tar.gz file in that directory be sufficient for what you need? That would be easy enough to add to the download list.
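
For anyone who wants to check what such an archive contains before bringing it in, here is a minimal sketch using only Python's standard library. It assumes the file is a gzipped tar of FASTA files that has already been downloaded locally. The function name is hypothetical, the demo archive is built in memory with made-up records, and none of this reflects how Kraken's own download scripts work.

```python
import io
import tarfile


def fasta_headers_in_tar(fileobj):
    """Yield (member_name, header) for every FASTA record in a .tar.gz archive."""
    with tarfile.open(fileobj=fileobj, mode="r:gz") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories and other non-file entries
            handle = archive.extractfile(member)
            for raw in handle:
                if raw.startswith(b">"):
                    yield member.name, raw[1:].decode("utf-8", "replace").strip()


def _demo_archive():
    """Build a tiny in-memory .tar.gz with made-up FASTA records."""
    data = b">seq1 hypothetical plasmid pX\nACGT\n>seq2 hypothetical plasmid pY\nGGCC\n"
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as archive:
        info = tarfile.TarInfo(name="demo.fna")
        info.size = len(data)
        archive.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf


headers = list(fasta_headers_in_tar(_demo_archive()))
print(len(headers), "records")
```

For a real file, the same function can be given a handle opened with `open("plasmids.all.fna.tar.gz", "rb")`.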

@tseemann
Author

Yes, that would be sufficient. This will help explain the unaccounted-for 10-20% of reads in some samples where the plasmids have high copy number. Thanks!

@DerrickWood
Owner

OK, I've added the plasmids as an option. I've got to monitor a few things over the next few months to make sure that file is consistently available before I can make it part of the standard installation, but at least people can more easily bring it in. I'll also likely add it to at least some version of the MiniKraken database in the near future as well.

@tseemann
Author

Thank you.

The Plasmids folder has been there for many years, though not as long as Bacteria and Viruses, and NCBI do change things without notice. The annoying part is that Bacteria/ does contain some plasmids, just not all of them.
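
A quick way to see which records in a given FASTA file are plasmids is to scan the headers. This is a heuristic sketch only: it assumes plasmid records carry the word "plasmid" in their description line, which is common in NCBI headers but not guaranteed. The function name and the example headers are hypothetical.

```python
def split_plasmid_headers(fasta_lines):
    """Split FASTA header descriptions into (plasmids, others) by keyword."""
    plasmids, others = [], []
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # sequence lines are ignored
        header = line[1:].strip()
        if "plasmid" in header.lower():
            plasmids.append(header)
        else:
            others.append(header)
    return plasmids, others


# Made-up records for illustration.
records = [
    ">acc1 Hypothetical bacterium strain A chromosome, complete genome",
    "ACGTACGT",
    ">acc2 Hypothetical bacterium strain A plasmid pA1, complete sequence",
    "GGCCGGCC",
]
plasmids, others = split_plasmid_headers(records)
print(len(plasmids), "plasmid record(s),", len(others), "other record(s)")
```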

@nickp60
nickp60 commented Nov 27, 2017

Hi, sorry to revive an old issue, but just to clarify, did the plasmids end up in either of the MiniKraken databases?

@jenniferlu717
Collaborator

The plasmids in the Plasmids folder did not end up in the MiniKraken databases, but I'll work to include them in another version ASAP. It might take a couple of days, though.

@jenniferlu717 jenniferlu717 reopened this Nov 27, 2017
@tseemann
Author

tseemann commented Apr 7, 2018

@jenniferlu717 what ended up happening here?


4 participants