Merge pull request #183 from ctmrbio/add-krakenuniq
Add KrakenUniq
boulund committed Sep 27, 2022
2 parents 05994d5 + d84711c commit 20b1361
Showing 14 changed files with 662 additions and 220 deletions.
5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -18,6 +18,8 @@ situations.
### Added
- Produce Snakemake report in zip format as well as HTML due to the HTML report being
broken in the later versions of Snakemake.
- Added KrakenUniq as an alternative taxonomic profiler with a lower false
positive rate than Kraken2.
- Added samplesheet as alternative input file selection method.
- Added `run_krona` setting for taxonomic profilers to make it possible to disable Krona
table and plot creation.
@@ -35,6 +37,9 @@ situations.
- Modified area and MetaPhlAn heatmap plotting scripts to better deal
with MetaPhlAn 4 output formats.
- Updated the documentation to reflect recent changes in StaG.
- Updated KrakenTools to v1.2.
- Updated `scripts/join_tables.py` to v1.1, which includes support for skipping lines
before the header.

### Removed

1 change: 1 addition & 0 deletions Snakefile
@@ -80,6 +80,7 @@ include: "rules/naive/bbcountunique.smk"
#############################
include: "rules/taxonomic_profiling/kaiju.smk"
include: "rules/taxonomic_profiling/kraken2.smk"
include: "rules/taxonomic_profiling/krakenuniq.smk"
include: "rules/taxonomic_profiling/metaphlan.smk"

#############################
7 changes: 7 additions & 0 deletions config.yaml
@@ -41,6 +41,7 @@ naive:
taxonomic_profile:
    kaiju: False
    kraken2: False
    krakenuniq: False
    metaphlan: False
strain_level_profiling:
    strainphlan: False # Will also run metaphlan. Please make sure you've added bt2_db_dir and bt2_index under metaphlan settings.
@@ -110,6 +111,12 @@ kraken2:
include: ""
exclude: "--exclude 9605 9606" # Taxid 9605 and 9606 are (G) Homo and (S) Homo sapiens

krakenuniq:
    db: "" # [Required] Path to KrakenUniq DB folder
    extra: "" # Extra command line arguments for krakenuniq (do not add/change output files)
    keep_kraken: False # Keep the kraken output files
    keep_kreport: True # Keep the kreport output files

metaphlan:
    bt2_db_dir: "" # [Required] Path to MetaPhlAn database dir
    bt2_index: "" # [Required] Name of MetaPhlAn database index
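
The `krakenuniq` settings in this hunk can be sanity-checked before a run. The following is a minimal sketch, not part of the commit: the function name `check_krakenuniq_config` and its messages are hypothetical, and it checks for a folder with `Path.is_dir()`, which is stricter than the `Path.exists()` test used in `rules/taxonomic_profiling/krakenuniq.smk`.

```python
from pathlib import Path


def check_krakenuniq_config(config: dict) -> list[str]:
    """Return a list of problems found in the krakenuniq settings (empty if none).

    Hypothetical helper for illustration; the workflow itself raises a
    WorkflowError from the Snakemake rule file instead.
    """
    problems = []
    # Nothing to check when the profiler is switched off.
    if not config.get("taxonomic_profile", {}).get("krakenuniq", False):
        return problems

    settings = config.get("krakenuniq", {})
    db = settings.get("db", "")
    if not db:
        problems.append("krakenuniq.db is empty; set it to the KrakenUniq DB folder")
    elif not Path(db).is_dir():  # assumption: the DB path should be a folder
        problems.append(f"krakenuniq.db does not point to a folder: {db}")

    # The keep_* switches are expected to be plain booleans.
    for key in ("keep_kraken", "keep_kreport"):
        if not isinstance(settings.get(key), bool):
            problems.append(f"krakenuniq.{key} should be True or False")
    return problems


# Settings as they appear in config.yaml, but with the profiler enabled.
example = {
    "taxonomic_profile": {"krakenuniq": True},
    "krakenuniq": {"db": "", "extra": "", "keep_kraken": False, "keep_kreport": True},
}
print(check_krakenuniq_config(example))
```

With the default empty `db`, enabling `krakenuniq: True` reports a single problem, which mirrors the reason the workflow refuses to start in that situation.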
12 changes: 12 additions & 0 deletions docs/source/modules.rst
@@ -4,6 +4,7 @@
.. _FastQC: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/
.. _Kaiju: http://kaiju.binf.ku.dk/
.. _Kraken2: https://ccb.jhu.edu/software/kraken2/
.. _KrakenUniq: https://github.com/fbreitwieser/krakenuniq
.. _Bracken: https://ccb.jhu.edu/software/bracken/
.. _groot: https://groot-documentation.readthedocs.io
.. _MetaPhlAn: https://github.com/biobakery/MetaPhlAn/wiki/MetaPhlAn-4
@@ -168,6 +169,17 @@ each sample::
    all_samples.<taxonomic_level>.filtered.bracken.txt
    all_samples.bracken.mpa_style.txt

KrakenUniq
----------
:Tool: `KrakenUniq`_
:Output folder: ``krakenuniq``

Run `KrakenUniq`_ on the trimmed and filtered reads to produce a taxonomic profile.
The KrakenUniq module produces the following output files::

    <sample>.kraken.gz
    <sample>.kreport
    all_samples.krakenuniq.txt

MetaPhlAn
----------
7 changes: 7 additions & 0 deletions envs/krakenuniq.yaml
@@ -0,0 +1,7 @@
name: stag-krakenuniq
channels:
- bioconda
- conda-forge
- defaults
dependencies:
- krakenuniq =1.0.0
1 change: 1 addition & 0 deletions rules/publications.py
@@ -10,6 +10,7 @@
"HUMAnN": "Franzosa EA*, McIver LJ*, et al. (2018). Species-level functional profiling of metagenomes and metatranscriptomes. Nat Methods 15: 962-968. https://doi.org/10.1038/s41592-018-0176-y",
"Kaiju": "Menzel, P., Ng, K. L., & Krogh, A. (2016). Fast and sensitive taxonomic classification for metagenomics with Kaiju. Nature communications, 7, 11257. Available online at: https://github.com/bioinformatics-centre/kaiju",
"Kraken2": "Wood, D.E., Lu, J., & Langmead, B. (2019). Improved metagenomic analysis with Kraken 2. Genome biology, 20, 257. https://doi.org/10.1186/s13059-019-1891-0",
"KrakenUniq": "Breitwieser FP, Baker DN, Salzberg SL. KrakenUniq: confident and fast metagenomics classification using unique k-mer counts. Genome Biology, Dec 2018. https://doi.org/10.1186/s13059-018-1568-0",
"Krona": "Ondov BD, Bergman NH, and Phillippy AM. Interactive metagenomic visualization in a Web browser. BMC Bioinformatics. 2011 Sep 30; 12(1):385. https://doi.org/10.1186/1471-2105-12-385",
"MEGAHIT": "Li, D., Liu, C-M., Luo, R., Sadakane, K., and Lam, T-W., (2015). MEGAHIT: An ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics, https://doi.org/10.1093/bioinformatics/btv033 [PMID: 25609793].",
"MaxBin2": "Yu-Wei Wu, Blake A. Simmons, Steven W. Singer (2016). MaxBin 2.0: an automated binning algorithm to recover genomes from multiple metagenomic datasets. Bioinformatics, Volume 32, Issue 4, 15 February 2016, Pages 605–607, https://doi.org/10.1093/bioinformatics/btv638",
90 changes: 90 additions & 0 deletions rules/taxonomic_profiling/krakenuniq.smk
@@ -0,0 +1,90 @@
# vim: syntax=python expandtab
# Taxonomic classification of metagenomic reads using KrakenUniq
from pathlib import Path

from snakemake.exceptions import WorkflowError

localrules:
    combine_krakenuniq_reports


krakenuniq_config = config["krakenuniq"]
if config["taxonomic_profile"]["krakenuniq"]:
    # Fail early if the database path is unset or does not exist.
    if not (krakenuniq_config["db"] and Path(krakenuniq_config["db"]).exists()):
        err_message = "No KrakenUniq database folder at: '{}'!\n".format(krakenuniq_config["db"])
        err_message += "Specify the path in the krakenuniq section of config.yaml.\n"
        err_message += "If you do not want to run krakenuniq for taxonomic profiling, set 'krakenuniq: False' in config.yaml"
        raise WorkflowError(err_message)

    # Add KrakenUniq output files to 'all_outputs' from the main Snakefile scope.
    # SAMPLES is also from the main Snakefile scope.
    krakens = expand(OUTDIR/"krakenuniq/{sample}.kraken.gz", sample=SAMPLES)
    kreports = expand(OUTDIR/"krakenuniq/{sample}.kreport", sample=SAMPLES)
    # The combined report has no {sample} wildcard, so it is a single path rather than an expand() list.
    combined_kreport = str(OUTDIR/"krakenuniq/all_samples.krakenuniq.txt")
    all_outputs.extend(krakens)
    all_outputs.extend(kreports)
    all_outputs.append(combined_kreport)

    citations.add(publications["KrakenUniq"])
    citations.add(publications["Krona"])


rule krakenuniq:
    input:
        read1=OUTDIR/"host_removal/{sample}_1.fq.gz",
        read2=OUTDIR/"host_removal/{sample}_2.fq.gz",
    output:
        kraken=OUTDIR/"krakenuniq/{sample}.kraken.gz" if krakenuniq_config["keep_kraken"] else temp(OUTDIR/"krakenuniq/{sample}.kraken.gz"),
        kreport=OUTDIR/"krakenuniq/{sample}.kreport" if krakenuniq_config["keep_kreport"] else temp(OUTDIR/"krakenuniq/{sample}.kreport"),
    log:
        LOGDIR/"krakenuniq/{sample}.krakenuniq.log"
    shadow:
        "shallow"
    threads:
        cluster_config["krakenuniq"]["n"] if "krakenuniq" in cluster_config else 4
    conda:
        "../../envs/krakenuniq.yaml"
    container:
        "docker://quay.io/biocontainers/krakenuniq:1.0.0--pl5321h19e8d03_0"
    params:
        db=krakenuniq_config["db"],
        extra=krakenuniq_config["extra"],
    shell:
        """
        krakenuniq \
            --db {params.db} \
            --threads {threads} \
            --output {output.kraken} \
            --report-file {output.kreport} \
            --paired \
            {input.read1} {input.read2} \
            {params.extra} \
            2> {log}
        """


rule combine_krakenuniq_reports:
    input:
        kreports=expand(OUTDIR/"krakenuniq/{sample}.kreport", sample=SAMPLES)
    output:
        combined=OUTDIR/"krakenuniq/all_samples.krakenuniq.txt"
    log:
        LOGDIR/"krakenuniq/all_samples.krakenuniq.log"
    shadow:
        "shallow"
    threads:
        1
    conda:
        "../../envs/stag-mwc.yaml"
    container:
        "oras://ghcr.io/ctmrbio/stag-mwc:stag-mwc"+singularity_branch_tag
    shell:
        """
        scripts/join_tables.py \
            --feature-column rank,taxName \
            --value-column taxReads \
            --outfile {output.combined} \
            --skiplines 3 \
            {input.kreports} \
            2> {log}
        """
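
To illustrate what the `combine_krakenuniq_reports` rule asks of `scripts/join_tables.py`, here is a minimal sketch of joining per-sample reports on `rank,taxName` with `taxReads` as the value, after skipping three lines before the header. It is not the actual script: the function names, the zero fill for taxa missing from a sample, the placeholder preamble lines, and the example rows are all assumptions made for illustration.

```python
import csv


def read_kreport(text, skiplines=3):
    """Map (rank, taxName) -> taxReads for one report passed in as a string."""
    # Drop the lines that precede the tab-separated column header.
    lines = text.splitlines()[skiplines:]
    reader = csv.DictReader(lines, delimiter="\t")
    return {(row["rank"], row["taxName"]): int(row["taxReads"]) for row in reader}


def join_reports(reports, skiplines=3):
    """Outer-join several reports into rows of (rank, taxName, value per sample).

    Assumption: a taxon absent from a sample is reported as 0.
    """
    per_sample = {name: read_kreport(text, skiplines) for name, text in reports.items()}
    features = sorted({key for table in per_sample.values() for key in table})
    return [
        (rank, tax_name, *(per_sample[name].get((rank, tax_name), 0) for name in per_sample))
        for rank, tax_name in features
    ]


# Made-up reports: three placeholder preamble lines, then the column header.
PREAMBLE = "# placeholder line 1\n# placeholder line 2\n\n"
COLUMNS = "%\treads\ttaxReads\tkmers\tdup\tcov\ttaxID\trank\ttaxName\n"
sample_a = (
    PREAMBLE + COLUMNS
    + "60.0\t6\t2\t10\t1.0\t0.1\t2\tsuperkingdom\tBacteria\n"
    + "40.0\t4\t4\t8\t1.0\t0.1\t562\tspecies\tEscherichia coli\n"
)
sample_b = PREAMBLE + COLUMNS + "100.0\t5\t5\t9\t1.0\t0.1\t2\tsuperkingdom\tBacteria\n"

for row in join_reports({"A": sample_a, "B": sample_b}):
    print(*row, sep="\t")
```

Keying on both `rank` and `taxName` keeps taxa apart when the same name occurs at more than one rank, which is presumably why the rule passes two feature columns.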
1 change: 0 additions & 1 deletion scripts/KrakenTools/LICENSE
@@ -17,4 +17,3 @@ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
