Influenza genome analysis Nextflow workflow

ayooluwaB/nf-iav-illumina

 
 

Influenza Genome Analysis Nextflow Workflow

Introduction

nf-iav-illumina is a bioinformatics analysis pipeline for assembly and H/N subtyping of Influenza A virus Illumina sequence data.

The pipeline is implemented in Nextflow. It uses IRMA for assembly and performs H/N subtyping by nucleotide BLAST against the NCBI Influenza DB.

Pipeline summary

  1. Download latest NCBI Influenza DB sequences and metadata (or use user-specified files)
  2. Merge reads of re-sequenced samples with cat (if needed)
  3. Assembly of Influenza gene segments with IRMA using the built-in FLU module
  4. Nucleotide BLAST search against NCBI Influenza DB
  5. H/N subtype prediction and Excel XLSX report generation based on BLAST results
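
Step 2 works because concatenated gzip files form a valid gzip stream, so FASTQ files from re-sequenced runs of the same sample can be merged with plain cat. The following is a minimal sketch of that idea only. The file names (sample1_run1_R1.fastq.gz and so on) and the temporary directory are hypothetical, and the pipeline's own merge step may differ in detail.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical working directory and file names, for illustration only.
workdir="$(mktemp -d)"

# Create two tiny gzipped FASTQ files standing in for two sequencing runs.
printf '@read1\nACGT\n+\nIIII\n' | gzip -c > "$workdir/sample1_run1_R1.fastq.gz"
printf '@read2\nTTGA\n+\nIIII\n' | gzip -c > "$workdir/sample1_run2_R1.fastq.gz"

# Concatenated gzip members decompress as a single stream, so no
# decompress/recompress round trip is needed.
cat "$workdir"/sample1_run*_R1.fastq.gz > "$workdir/sample1_merged_R1.fastq.gz"

# Count reads in the merged file (4 lines per FASTQ record).
merged_reads=$(( $(gzip -dc "$workdir/sample1_merged_R1.fastq.gz" | wc -l) / 4 ))
echo "merged reads: $merged_reads"
```

For paired-end data the same cat step would be repeated for the R2 files, keeping the run order identical so that read pairs stay in sync.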

Quick Start

  1. Install Nextflow (>=21.04.0).

  2. Install any of Docker, Singularity, Podman, Shifter or Charliecloud for full pipeline reproducibility. Use Conda only as a last resort.

  3. Download the pipeline and test it on a minimal dataset with a single command:

    nextflow run peterk87/nf-iav-illumina -profile test,<docker/singularity/podman/shifter/charliecloud/conda>
    • If you are using singularity then the pipeline will auto-detect this and attempt to download the Singularity images directly as opposed to performing a conversion from Docker images. If you are persistently observing issues downloading Singularity images directly due to timeout or network issues then please use the --singularity_pull_docker_container parameter to pull and convert the Docker image instead. Alternatively, it is highly recommended to use the nf-core download command to pre-download all of the required containers before running the pipeline and to set the NXF_SINGULARITY_CACHEDIR or singularity.cacheDir Nextflow options to be able to store and re-use the images from a central location for future pipeline runs.
    • If you are using conda, it is highly recommended to use the NXF_CONDA_CACHEDIR or conda.cacheDir settings to store the environments in a central location for future pipeline runs.
  4. Run your own analysis

    • [Optional] Generate an input samplesheet from a directory containing Illumina FASTQ files (e.g. /path/to/illumina_run/Data/Intensities/Basecalls/) with the included Python script fastq_dir_to_samplesheet.py before you run the pipeline (requires Python 3 installed locally) e.g.

      python ~/.nextflow/assets/peterk87/nf-iav-illumina/bin/fastq_dir_to_samplesheet.py \
        -i /path/to/illumina_run/Data/Intensities/Basecalls/ \
        -o samplesheet.csv
    • Typical command

      nextflow run peterk87/nf-iav-illumina \
        --input samplesheet.csv \
        -profile <docker/singularity/podman/shifter/charliecloud/conda>
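
The cache settings mentioned in step 3 can be set as environment variables before launching the pipeline, so that Singularity images and Conda environments are stored once and re-used across runs. A minimal sketch, assuming a hypothetical cache directory under $HOME/nxf-cache; CACHE_ROOT is an illustrative variable and not a Nextflow setting:

```shell
# Hypothetical central cache location; override CACHE_ROOT to suit your system.
cache_root="${CACHE_ROOT:-$HOME/nxf-cache}"

# Nextflow reads these environment variables to decide where Singularity
# images and Conda environments are stored and re-used across runs.
export NXF_SINGULARITY_CACHEDIR="${cache_root}/singularity"
export NXF_CONDA_CACHEDIR="${cache_root}/conda"

# Make sure both directories exist before the first run.
mkdir -p "$NXF_SINGULARITY_CACHEDIR" "$NXF_CONDA_CACHEDIR"
echo "Singularity cache: $NXF_SINGULARITY_CACHEDIR"
echo "Conda cache: $NXF_CONDA_CACHEDIR"
```

Adding the two export lines to a shell profile makes them persistent. The equivalent singularity.cacheDir and conda.cacheDir options can instead be set in a Nextflow config file.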

Documentation

The nf-iav-illumina pipeline comes with:

Resources

Credits
