CFIA-NCFAD/nf-flu - Influenza A Virus Genome Assembly Nextflow Workflow

Introduction

nf-flu is a bioinformatics analysis pipeline for assembly and H/N subtyping of Influenza A virus. The pipeline supports both Illumina and Nanopore platforms. Because the Influenza A genome consists of 8 gene segments, and each segment may have one or more suitable references to align against, the pipeline automatically selects the top-matching reference for each segment. To do this, the pipeline downloads the Influenza database from NCBI; users can also provide their own reference database. The pipeline then performs read mapping against each reference segment, variant calling and genome assembly.

The pipeline is implemented in Nextflow.

Pipeline summary

1. Download the latest NCBI Influenza DB sequences and metadata (or use user-specified files).
2. Merge reads of re-sequenced samples (cat), if needed.
3. Assemble Influenza gene segments with IRMA using the built-in FLU module.
4. Run a nucleotide BLAST search against the NCBI Influenza DB.
5. Automatically pull the top-matching references for each segment.
6. Predict the H/N subtype and generate an Excel XLSX report based on the BLAST results.
7. Perform variant calling and genome assembly for all segments.
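
The reference selection step can be illustrated with a minimal Python sketch. It assumes BLAST tabular output (`-outfmt 6`, where the 12th column is the bit score) and keeps the highest-scoring subject for each segment. The function name `top_hits_by_segment`, the `segment_of` lookup and the reference IDs are hypothetical and exist only for illustration; the criteria nf-flu actually uses to rank matches may differ.

```python
import csv
import io
from typing import Dict, Tuple


def top_hits_by_segment(
    blast_tsv: str, segment_of: Dict[str, int]
) -> Dict[int, Tuple[str, float]]:
    """Return {segment: (subject_id, bitscore)} for the best hit per segment.

    blast_tsv  : BLAST tabular text (outfmt 6, 12 default columns).
    segment_of : hypothetical lookup from subject ID to segment number (1-8).
    """
    best: Dict[int, Tuple[str, float]] = {}
    for row in csv.reader(io.StringIO(blast_tsv), delimiter="\t"):
        if len(row) < 12 or row[0].startswith("#"):
            continue  # skip blank, comment, or truncated lines
        subject, bitscore = row[1], float(row[11])
        segment = segment_of.get(subject)
        if segment is None:
            continue  # subject is not in the metadata lookup
        if segment not in best or bitscore > best[segment][1]:
            best[segment] = (subject, bitscore)
    return best


def _hit(query: str, subject: str, bitscore: str) -> str:
    """Build one made-up outfmt 6 line; only sseqid and bitscore matter here."""
    return "\t".join(
        [query, subject, "98.5", "1700", "25", "0", "1", "1700", "1", "1700", "0.0", bitscore]
    )


# Made-up example data: two candidate references for segment 4, one for segment 6.
EXAMPLE_BLAST = "\n".join(
    [
        _hit("contig_4", "REF_H5_A", "3000"),
        _hit("contig_4", "REF_H5_B", "3100"),
        _hit("contig_6", "REF_N1_A", "2500"),
        _hit("contig_x", "REF_UNKNOWN", "9999"),
    ]
)
EXAMPLE_SEGMENTS = {"REF_H5_A": 4, "REF_H5_B": 4, "REF_N1_A": 6}

if __name__ == "__main__":
    for segment, (subject, score) in sorted(
        top_hits_by_segment(EXAMPLE_BLAST, EXAMPLE_SEGMENTS).items()
    ):
        print(f"segment {segment}: {subject} (bitscore {score})")
```

Ranking by bit score is only one plausible choice; a real selection could also weigh alignment length or percent identity.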

Quick Start

1. Install Nextflow (>=21.04.0).
2. Install any of Docker, Singularity, Podman, Shifter or Charliecloud for full pipeline reproducibility (please only use Conda as a last resort).
3. Download the pipeline and test it on a minimal dataset with a single command:
```
nextflow run CFIA-NCFAD/nf-flu -profile test,<docker/singularity/podman/shifter/charliecloud/conda>
```
- If you are using Singularity, the pipeline will auto-detect this and attempt to download the Singularity images directly rather than converting them from Docker images. If you persistently encounter timeouts or network issues when downloading Singularity images directly, use the --singularity_pull_docker_container parameter to pull and convert the Docker image instead. Alternatively, it is highly recommended to pre-download all required containers with the nf-core download command before running the pipeline, and to set the NXF_SINGULARITY_CACHEDIR or singularity.cacheDir Nextflow options so that images are stored in, and re-used from, a central location for future pipeline runs.
- If you are using Conda, it is highly recommended to use the NXF_CONDA_CACHEDIR or conda.cacheDir settings to store the environments in a central location for future pipeline runs.

Run your own analysis

[Optional] Before running the pipeline, generate an input samplesheet from a directory containing Illumina FASTQ files (e.g. /path/to/illumina_run/Data/Intensities/Basecalls/) with the included Python script fastq_dir_to_samplesheet.py. This requires Python 3 installed locally. For example:
```
python ~/.nextflow/assets/CFIA-NCFAD/nf-flu/bin/fastq_dir_to_samplesheet.py \
  -i /path/to/illumina_run/Data/Intensities/Basecalls/ \
  -o samplesheet.csv
```
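
To give a rough idea of what building a samplesheet involves, here is a minimal Python sketch that pairs Illumina R1/R2 files by name and writes CSV rows. It is not the bundled fastq_dir_to_samplesheet.py script. The file-name pattern, the function names `pair_fastqs` and `to_csv`, and the column names `sample,fastq_1,fastq_2` are assumptions made for illustration, so check the usage documentation for the format the pipeline expects.

```python
import csv
import io
import re
from typing import Dict, Iterable, List, Tuple

# Assumed Illumina naming: <sample>_S<n>[_L<lane>]_R<1|2>_001.fastq.gz
FASTQ_RE = re.compile(
    r"^(?P<sample>.+?)_S\d+(?:_L(?P<lane>\d{3}))?_R(?P<read>[12])_001\.fastq\.gz$"
)


def pair_fastqs(paths: Iterable[str]) -> List[Tuple[str, str, str]]:
    """Group FASTQ paths into (sample, fastq_1, fastq_2) rows.

    Files that do not match the assumed naming pattern are ignored.
    Single-end samples get an empty fastq_2 field.
    """
    pairs: Dict[Tuple[str, str], Dict[str, str]] = {}
    for path in sorted(paths):
        name = path.rsplit("/", 1)[-1]
        match = FASTQ_RE.match(name)
        if match is None:
            continue
        key = (match["sample"], match["lane"] or "")
        pairs.setdefault(key, {})[match["read"]] = path

    rows = []
    for (sample, _lane), reads in sorted(pairs.items()):
        if "1" not in reads:
            continue  # an R2 file without its R1 mate cannot form a row
        rows.append((sample, reads["1"], reads.get("2", "")))
    return rows


def to_csv(rows: List[Tuple[str, str, str]]) -> str:
    """Render rows as CSV text with an assumed three-column header."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["sample", "fastq_1", "fastq_2"])
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    example = [
        "/run/fluA_S1_L001_R1_001.fastq.gz",
        "/run/fluA_S1_L001_R2_001.fastq.gz",
    ]
    print(to_csv(pair_fastqs(example)), end="")
```

The sketch takes a list of paths rather than scanning a directory so that the pairing logic can be checked without touching the filesystem.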

Typical command for Illumina Platform

```
nextflow run CFIA-NCFAD/nf-flu \
  --input samplesheet.csv \
  --platform illumina \
  -profile <docker/singularity/podman/shifter/charliecloud/conda>
```

Typical command for Nanopore Platform

```
nextflow run CFIA-NCFAD/nf-flu \
  --input samplesheet.csv \
  --platform nanopore \
  -profile <docker/singularity/conda>
```

Documentation

The nf-flu pipeline comes with usage and output documentation.

Resources

NCBI Influenza FTP site
IRMA Iterative Refinement Meta-Assembler
- IRMA Publication

Credits

The nf-flu pipeline was originally developed by Peter Kruczkiewicz from CFIA-NCFAD. Hai Nguyen extended the pipeline for Nanopore data analysis.

- nf-core project for establishing Nextflow workflow development best practices, nf-core tools and nf-core modules.
- nf-core/viralrecon for inspiration and for setting a high standard for viral sequence data analysis pipelines.
- Conda and the Bioconda project for making it easy to install, distribute and use bioinformatics software.
- Biocontainers for automatic creation of Docker and Singularity containers for bioinformatics software in Bioconda.
