The format is based on Keep a Changelog and this project adheres to Semantic Versioning.
- #350 - Added support for CheckM as an alternative bin completeness and QC tool (added by @jfy133 and @skrakau)
- #353 - Added the busco_clean parameter to optionally clean each BUSCO directory after a successful run (by @prototaxites)
- #361 - Added the skip_clipping parameter to skip read preprocessing with fastp or adapterremoval. Running the pipeline with skip_clipping, keep_phix and without specifying a host genome or fasta file skips the FASTQC_TRIMMED process (by @prototaxites)
- #365 - Added CONCOCT as an additional (optional) binning tool (by @jfy133)
- #366 - Added CAT_SUMMARISE process and cat_official_taxonomy parameter (by @prototaxites)
- #372 - Allow CAT_DB to take an extracted database as well as a tar.gz file (by @prototaxites).
- #380 - Added support for saving processed reads (clipped, host removed etc.) to results directory (by @jfy133)
- #394 - Added GUNC for additional chimeric bin/contamination QC (added by @jfy133)
- #340, #368, #373 - Update to nf-core 2.7.2 `TEMPLATE` (by @jfy133, @d4straub, @skrakau)
- #373 - Removed parameter `--enable_conda`. Updated local modules to new conda syntax and updated nf-core modules (by @skrakau)
- #385 - CAT also now runs on unbinned contigs as well as binned contigs (added by @jfy133)
- #399 - Removed undocumented BUSCO_PLOT process (previously generated `*.busco_figure.png` plots unsuitable for metagenomics) (by @skrakau).
- #345 - Bowtie2 mode changed to global alignment for ancient DNA mode (`--very-sensitive` mode) to prevent soft clipping at the end of reads when running in local mode. (by @maxibor)
- #349 - Add a warning that the pipeline will reset the minimum contig size to 1500 specifically for the MetaBAT2 process if a user supplies a value below this threshold. (by @jfy133)
- #352 - Handle the case in the BUSCO module where BUSCO can detect only a root lineage but is not able to find any marker genes (by @alexhbnr)
- #355 - Include error code 21 for retrying with higher memory for SPAdes and hybridSPAdes (by @mglubber)
| Tool | Previous version | New version |
| --- | --- | --- |
| BUSCO | 5.1.0 | 5.4.3 |
| BCFtools | 1.14 | 1.16 |
| Freebayes | 1.3.5 | 1.3.6 |
| SAMtools | 1.15 | 1.16.1 |
- #328 - Fix too many symbolic links issue in local convert_depths module (reported by @ChristophKnapp and fixed by @apeltzer, @jfy133)
- #329 - Each sample now gets its own result directory for PyDamage analysis and filter (reported and fixed by @maxibor)
- #263 - Restructure binning subworkflow in preparation for aDNA workflow and extended binning
- #247 - Add ancient DNA subworkflow
- #263 - Add MaxBin2 as second contig binning tool
- #285 - Add AdapterRemoval2 as an alternative read trimmer
- #291 - Add DAS Tool for bin refinement
- #319 - Activate pipeline-specific institutional nf-core/configs
- #269, #283, #289, #302 - Update to nf-core 2.4 `TEMPLATE`
- #286 - Cite our publication instead of the preprint
- #291, #299 - Add extra results folder `GenomeBinning/depths/contigs` for `[assembler]-[sample/group]-depth.txt.gz`, and `GenomeBinning/depths/bins` for `bin_depths_summary.tsv` and `[assembler]-[binner]-[sample/group]-binDepths.heatmap.png`
- #315 - Replace base container for standard shell tools to fix problems with running on Google Cloud
- #290 - Fix caching of binning input
- #305 - Add missing Bowtie2 version for process `BOWTIE2_PHIX_REMOVAL_ALIGN` to `software_versions.yml`
- #307 - Fix retrieval of GTDB-Tk version (note about newer version caused error in `CUSTOM_DUMPSOFTWAREVERSIONS`)
- #309 - Fix publishing of BUSCO `busco_downloads/` folder, i.e. publish only when `--save_busco_reference` is specified
- #321 - Fix parameter processing in `BOWTIE2_REMOVAL_ALIGN` (which was erroneously for `BOWTIE2_PHIX_REMOVAL_ALIGN`)
| Tool | Previous version | New version |
| --- | --- | --- |
| fastp | 0.20.1 | 0.23.2 |
| MultiQC | 1.11 | 1.12 |
- #240 - Add prodigal to predict protein-coding genes for assemblies.
- #241 - Add parameter `--skip_prodigal`.
- #244 - Add pipeline preprint information.
- #245 - Add Prokka to annotate binned genomes.
- #249 - Update workflow overview figure.
- #258 - Updated MultiQC 1.9 to 1.11.
- #260 - Updated SPAdes 3.13.1 -> 3.15.3, MEGAHIT 1.2.7 -> 1.2.7
- #256 - Fix `--skip_busco`.
- #236 - Fix large assemblies (> 4 billion nucleotides in length).
- #254 - Fix MetaBAT2 error with nextflow version 21.10.x (21.04.03 is the latest functional version for nf-core/mag 2.1.0).
- #255 - Update gtdbtk conda channel.
- #258 - FastP results are now in MultiQC.
- #212, #214 - Add bin abundance estimation based on median sequencing depths of corresponding contigs (results are written to `results/GenomeBinning/bin_depths_summary.tsv` and `results/GenomeBinning/bin_summary.tsv`) #197.
- #214 - Add generation of (clustered) heat maps with bin abundances across samples (using centered log-ratios)
- #217 - Publish genes predicted with Prodigal within BUSCO run (written to `results/GenomeBinning/QC/BUSCO/[assembler]-[bin]_prodigal.gff`).
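The two abundance-related additions above (bin abundance taken as the median sequencing depth of a bin's contigs, and centered log-ratios computed before heat map generation) can be illustrated with a minimal Python sketch. The function names, the toy depth values, and the pseudocount that keeps the logarithm defined for zero depths are assumptions made for this illustration; this is not the pipeline's actual implementation.

```python
import math
from statistics import median


def bin_abundance(contig_depths):
    """Median sequencing depth over the contigs assigned to one bin."""
    return median(contig_depths)


def centered_log_ratio(abundances, pseudocount=1e-6):
    """CLR-transform the bin abundances of one sample.

    Each value becomes log(x) minus the mean of log(x) over all bins,
    so the transformed values sum to zero. The pseudocount is an
    assumption of this sketch to keep log() defined for zero depths.
    """
    logs = [math.log(x + pseudocount) for x in abundances]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Hypothetical per-contig depths for three bins in a single sample.
bins = {
    "bin.1": [10.0, 12.0, 11.0],
    "bin.2": [3.0, 5.0],
    "bin.3": [40.0, 38.0, 45.0, 41.0],
}
abundances = {name: bin_abundance(depths) for name, depths in bins.items()}
clr_values = centered_log_ratio(list(abundances.values()))
```

Using the median rather than the mean keeps a single unusually deep contig from dominating a bin's abundance, and the log-ratio centering makes values comparable across samples with different sequencing depths.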
- #218 - Update to nf-core 2.0.1 `TEMPLATE` (DSL2)
- #226 - Fix handling of `BUSCO` output when run in auto lineage selection mode and selected specific lineage is the same as the generic one.
- #179 - Add BUSCO automated lineage selection functionality (new default). The parameter `--busco_auto_lineage_prok` can be used to only consider prokaryotes and the parameter `--busco_download_path` to run BUSCO in `offline` mode.
- #178 - Add taxonomic bin classification with `GTDB-Tk` `v1.5.0` (for bins filtered based on `BUSCO` QC metrics).
QC metrics). - #196 - Add process for CAT database creation as an alternative to using pre-built databases.
- #162 - Switch to DSL2
- #162 - Changed `--input` file format from `TSV` to `CSV` format, requires header now
- #162 - Update `README.md`, `docs/usage.md` and `docs/output.md`
- #162 - Update `FastP` from version `0.20.0` to `0.20.1`
- #162 - Update `Bowtie2` from version `2.3.5` to `2.4.2`
- #162 - Update `FastQC` from version `0.11.8` to `0.11.9`
- #172 - Compressed discarded MetaBAT2 output files
- #176 - Update CAT DB link
- #179 - Update `BUSCO` from version `4.1.4` to `5.1.0`
- #179 - By default BUSCO now performs automated lineage selection instead of using the bacteria_odb10 lineage as reference. Specific lineage datasets can still be provided via `--busco_reference`.
- #178 - Change output file: `results/GenomeBinning/QC/quast_and_busco_summary.tsv` -> `results/GenomeBinning/bin_summary.tsv`, contains GTDB-Tk results as well.
- #191 - Update to nf-core 1.14 `TEMPLATE`
- #193 - Compress CAT output files #180
- #198 - Requires nextflow version `>= 21.04.0`
- #200 - Small changes in GitHub Actions tests
- #203 - Renamed `fastp` params and improved description in documentation: `--mean_quality` -> `--fastp_qualified_quality`, `--trimming_quality` -> `--fastp_cut_mean_quality`
- #175 - Fix bug in retrieving the `--max_unbinned_contigs` longest unbinned sequences that are longer than `--min_length_unbinned_contigs` (`split_fasta.py`)
- #175 - Improved runtime of `split_fasta.py` in `METABAT2` process (important for large assemblies, e.g. when computing co-assemblies)
- #194 - Allow different folder structures for Kraken2 databases containing `*.k2d` files #187
- #195 - Fix documentation regarding required compression of input FastQ files #160
- #196 - Add process for CAT database creation as solution for problem caused by incompatible `DIAMOND` version used for pre-built `CAT database` and `CAT classification` #90, #188
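The selection rule described in the first #175 entry above (keep at most `--max_unbinned_contigs` of the longest unbinned sequences that pass `--min_length_unbinned_contigs`) could look roughly like the following Python sketch. The function name, the argument names, and the choice of an inclusive length threshold are assumptions for illustration; this is not the code of `split_fasta.py`.

```python
def select_longest_unbinned(contigs, max_sequences, min_length):
    """Return up to max_sequences contigs, longest first.

    contigs is a list of (name, sequence) pairs. Treating min_length as
    an inclusive threshold is an assumption of this sketch.
    """
    # Filter first, so short contigs never take up one of the slots.
    long_enough = [(name, seq) for name, seq in contigs if len(seq) >= min_length]
    # Sort by sequence length, longest first (stable for equal lengths).
    long_enough.sort(key=lambda item: len(item[1]), reverse=True)
    return long_enough[:max_sequences]


# Hypothetical unbinned contigs of lengths 5, 20, 15 and 30.
unbinned = [
    ("contig_1", "A" * 5),
    ("contig_2", "A" * 20),
    ("contig_3", "A" * 15),
    ("contig_4", "A" * 30),
]
selected = select_longest_unbinned(unbinned, max_sequences=2, min_length=10)
selected_names = [name for name, _ in selected]
```

Filtering by length before truncating to the requested count is the design point: applying the count limit first could return fewer sequences than are actually eligible.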
- #146 - Add `--coassemble_group` parameter to allow group-wise co-assembly
- #146 - Add `--binning_map_mode` parameter allowing different mapping strategies to compute co-abundances used for binning (`all`, `group`, `own`)
- #149 - Add two new parameters to allow custom SPAdes and MEGAHIT options (`--spades_options` and `--megahit_options`)
- #141 - Update to nf-core 1.12.1 `TEMPLATE`
- #143 - Manifest file has to be handed over via `--input` parameter now
- #143 - Changed format of manifest input file: requires a '.tsv' suffix and additionally contains group ID
- #143 - TSV `--input` file allows now also entries containing only short reads
- #145 - When using TSV input files, uses sample IDs now for `FastQC` instead of basenames of original read files. Allows non-unique file basenames.
- #143 - Change parameter: `--manifest` -> `--input`
- #135 - Update to nf-core 1.12 `TEMPLATE`
- #123 - Update to new nf-core 1.11 `TEMPLATE`
- #118 - Fix `seaborn` to `v0.10.1` to avoid `nanoplot` error
- #120 - Fix link to CAT database in help message
- #124 - Fix description of `CAT` process in `output.md`
- #35 - Add social preview image
- #49 - Add host read removal with `Bowtie 2` and according custom section to `MultiQC`
- #49 - Add separate `MultiQC` section for `FastQC` after preprocessing
- #65 - Add `MetaBAT2` RNG seed parameter `--metabat_rng_seed` and set the default to 1 which ensures reproducible binning results
- #65 - Add parameters `--megahit_fix_cpu_1`, `--spades_fix_cpus` and `--spadeshybrid_fix_cpus` to ensure reproducible results from assembly tools
- #66 - Export `depth.txt.gz` into result folder
- #67 - Compress assembly files
- #82 - Add `nextflow_schema.json`
- #104 - Add parameter `--save_busco_reference`
- #56 - Update `MetaBAT2` from `v2.13` to `v2.15`
- #46 - Update `MultiQC` from `v1.7` to `v1.9`
- #88 - Update to new nf-core 1.10.2 `TEMPLATE`
- #88 - `--reads` is now removed, use `--input` instead
- #101 - Prevented PhiX alignments from being stored in work directory #97
- #104, #111 - Update `BUSCO` from `v3.0.2` to `v4.1.4`
- #29 - Fix `MetaBAT2` binning discards unbinned contigs #27
- #31, #36, #76, #107 - Fix links in README
- #47 - Fix missing `MultiQC` when `--skip_quast` or `--skip_busco` was specified
- #49, #89 - Added missing parameters to summary
- #50 - Fix missing channels when `--keep_phix` is specified
- #54 - Updated links to `minikraken db`
- #54 - Fixed `Kraken2` db preparation: allow different names for compressed archive file and contained folder as for some minikraken dbs
- #55 - Fixed channel joining for multiple samples causing `MetaBAT2` error #32
- #57 - Fix number of threads used by `MetaBAT2` program `jgi_summarize_bam_contig_depths`
- #70 - Fix `SPAdes` memory conversion issue #61
- #71 - No more ignoring errors in `SPAdes` assembly
- #72 - No more ignoring of `BUSCO` errors
- #73, #75 - Improved output documentation
- #96 - Fix missing bin names in `MultiQC` BUSCO section #78
- #104 - Fix `BUSCO` errors causing missing summary output #77
- #29 - Change deprecated parameters: `--singleEnd` -> `--single_end`, `--igenomesIgnore` -> `--igenomes_ignore`
Initial release of nf-core/mag, created with the nf-core template.
- short and long reads QC (fastp, porechop, filtlong, fastqc)
- Lambda and PhiX detection and filtering (bowtie2, nanolyse)
- Taxonomic classification of reads (centrifuge, kraken2)
- Short read and hybrid assembly (megahit, metaspades)
- metagenome binning (metabat2)
- QC of bins (busco, quast)
- annotation (cat/bat)