Modifying the host genome for Kraken2 #113
Comments
Hi @taltman! Hope you and loved ones are keeping well. Thanks for getting in touch. Yep, @ababaian and I have introduced ourselves on the #viralrecon channel on nf-core Slack. Serratus looks really cool and we look forward to helping in any way we can to get involved with the downstream analysis!
The default Kraken2 database just contains the human genome. This will be downloaded from here and used by default unless you overwrite the
At present, the implementation to build the Kraken 2 database is quite simple. You can provide your own
I have intentionally avoided trying to parameterise all of the options that can be used to build the Kraken 2 database because of the issues I have seen in the past when downloading and creating the database. Ideally, I would prefer that the build is monitored and run outside of the pipeline, as opposed to having silent failures where some files were not downloaded properly and yet you still end up with a database you can use. Having said that, I was thinking of adding the parameters below to the pipeline to make things more flexible but didn't get around to it:
To summarise, you can build the database however you choose by using custom FASTA files mixed with standard ones and provide that to the pipeline with
More than happy to listen to ideas and for any contributions that you may have 🙂
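For anyone following along, building a database outside the pipeline from a mix of standard and custom sequences could look roughly like the sketch below. It only assembles the `kraken2-build` command lines (`--download-taxonomy`, `--download-library`, `--add-to-library`, `--build`) so they can be reviewed before anything is downloaded. The function name, the `host_db` directory and `my_host_genome.fa` are hypothetical placeholders, not something the pipeline provides.

```python
import shlex
from typing import List, Sequence


def kraken2_build_commands(
    db_dir: str,
    custom_fastas: Sequence[str] = (),
    standard_libraries: Sequence[str] = ("human",),
    threads: int = 4,
) -> List[List[str]]:
    """Return the kraken2-build invocations for a custom host database.

    Hypothetical helper: it only builds argument lists, it does not run them.
    """
    # Taxonomy has to be present before library sequences can be added.
    cmds = [["kraken2-build", "--download-taxonomy", "--db", db_dir]]
    # Standard reference libraries fetched by kraken2-build itself.
    for library in standard_libraries:
        cmds.append(["kraken2-build", "--download-library", library, "--db", db_dir])
    # User-supplied FASTA files (e.g. a non-human host genome).
    for fasta in custom_fastas:
        cmds.append(["kraken2-build", "--add-to-library", fasta, "--db", db_dir])
    # Final step: build the database from everything in the library.
    cmds.append(["kraken2-build", "--build", "--threads", str(threads), "--db", db_dir])
    return cmds


if __name__ == "__main__":
    # Placeholder paths, for illustration only.
    for cmd in kraken2_build_commands("host_db", ["my_host_genome.fa"]):
        print(shlex.join(cmd))
```

Each command can then be run by hand (or via `subprocess.run(cmd, check=True)`) so that a failed download stops the build instead of passing silently.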
The current outputs from the Kraken 2 process in the pipeline are listed here. You can save the raw FASTQ with the
Given the issues with creating Kraken2 databases when downloading files, I think we should either encourage users to use the default human one shipped with the pipeline (now hosted on AWS for more consistent and better download speeds) or to build their own and provide it to the pipeline with
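If you do build your own, a quick sanity check before handing the directory to the pipeline may help catch the silent failures mentioned above. This is a minimal sketch, assuming a finished Kraken 2 database directory contains non-empty `hash.k2d`, `opts.k2d` and `taxo.k2d` files; the function name is hypothetical and the check says nothing about whether the contents are complete.

```python
import tempfile
from pathlib import Path
from typing import List

# Files a built Kraken 2 database directory is expected to contain.
REQUIRED_FILES = ("hash.k2d", "opts.k2d", "taxo.k2d")


def missing_kraken2_files(db_dir: str) -> List[str]:
    """Return the required database files that are absent or empty."""
    db = Path(db_dir)
    problems = []
    for name in REQUIRED_FILES:
        path = db / name
        # An empty file usually points at an interrupted download or build.
        if not path.is_file() or path.stat().st_size == 0:
            problems.append(name)
    return problems


if __name__ == "__main__":
    # Demo on a throwaway directory holding only one of the three files.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "hash.k2d").write_bytes(b"placeholder")
        print(missing_kraken2_files(tmp))
```

An empty list means all three files are present and non-empty; anything else is worth investigating before running the pipeline.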
Hello and thanks for your support. Is there any robust way to prepare the Kraken2 database for other hosts? As we are working on plants, I thought we might download the human database and then add our plant genome to it. Also, if the Kraken2 database here is just used to filter host reads, why not use other tools that have easier ways to prepare indexes, like bwa? Cheers,
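One detail that tends to matter when adding a non-human host: as far as I understand the Kraken 2 manual, sequences passed to `kraken2-build --add-to-library` need either an NCBI accession or an explicit `kraken:taxid|<id>` tag in the sequence ID so they can be placed in the taxonomy. The sketch below tags FASTA headers that lack one. The function name is hypothetical, and taxid 3702 (Arabidopsis thaliana) is used purely as an illustration; substitute the taxid of your own host.

```python
from typing import Iterable, Iterator


def tag_fasta_headers(lines: Iterable[str], taxid: int) -> Iterator[str]:
    """Yield FASTA lines with 'kraken:taxid|<taxid>' appended to each sequence ID.

    Headers that already carry a kraken:taxid tag are passed through unchanged.
    """
    for line in lines:
        if line.startswith(">") and "kraken:taxid|" not in line:
            # Split '>id description' into the ID and the optional description.
            seq_id, _, description = line[1:].rstrip("\n").partition(" ")
            tagged = f">{seq_id}|kraken:taxid|{taxid}"
            if description:
                tagged += f" {description}"
            yield tagged + "\n"
        else:
            # Sequence lines and already-tagged headers are left as they are.
            yield line


if __name__ == "__main__":
    example = [">chr1 Arabidopsis chromosome 1\n", "ACGTACGT\n"]
    for out_line in tag_fasta_headers(example, taxid=3702):
        print(out_line, end="")
```

The tagged file can then be added to the library and built into the same database as the human sequences.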
Hi there,
A hearty hello from the Serratus project!
https://github.com/ababaian/serratus/
We are keen to work with you guys to integrate viralrecon into our effort to isolate as many distantly-related Coronaviruses as possible.
I'm the resident Kraken2 enthusiast there. Two questions:
Thanks!