lmtani/wf-human-mito

Human Mitochondrial Workflow

This repository contains a Nextflow workflow for mitochondrial analysis. It is heavily inspired by gatk-workflows/gatk4-mitochondria-pipeline and by the nf-core community.

🔧 Setup

The workflow uses Nextflow to manage compute and software resources, so Nextflow must be installed before you run the workflow.

The workflow can currently be run with Docker, which isolates the required software. This is automated out of the box, provided Docker is installed.

👷 I'm still working on making Conda and Singularity profiles available.

📥 Inputs

  • Pairs of FASTQ files, one pair per sample, or
  • Alignment files, one per sample, in BAM or CRAM format.
  • Human Genome Reference - choose the one that best suits your needs.

Notes

  • The workflow accepts paired FASTQs, alignments, or both.
  • Some human genome references, hosted by Google, can be downloaded here. Example: GRCh38. The workflow needs access to the .fasta, .dict, .fai, 64.ann, 64.amb, 64.sa, 64.pac, and 64.alt files.
  • The --reference parameter must point to the same reference that was used to produce the input alignments.
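
Before launching a run, it can help to confirm that the companion files listed above sit next to the reference FASTA. The sketch below is not part of the workflow. The function name check_reference_files is hypothetical, and the file layout (ref.fasta.fai, ref.dict, ref.fasta.64.*) is an assumption based on how the Google-hosted GRCh38 bundle is commonly named, so adjust the list if your files are named differently.

```shell
# Sketch: verify that a reference FASTA has its companion files.
# Assumed layout (not confirmed by the workflow itself):
#   ref.fasta, ref.fasta.fai, ref.dict, ref.fasta.64.{ann,amb,sa,pac,alt}
check_reference_files() {
    ref_fasta="$1"
    ref_missing=0
    for ref_file in \
        "$ref_fasta" \
        "$ref_fasta.fai" \
        "${ref_fasta%.fasta}.dict" \
        "$ref_fasta.64.ann" \
        "$ref_fasta.64.amb" \
        "$ref_fasta.64.sa" \
        "$ref_fasta.64.pac" \
        "$ref_fasta.64.alt"
    do
        # Report every missing file on stderr instead of stopping at the first.
        if [ ! -e "$ref_file" ]; then
            echo "missing: $ref_file" >&2
            ref_missing=1
        fi
    done
    # Exit status 0 when everything is present, 1 otherwise.
    return "$ref_missing"
}
```

For example, check_reference_files /refs/Homo_sapiens_assembly38.fasta prints one line per missing file and returns a non-zero status if any are absent.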

⚙ Running

# For help:
nextflow run lmtani/wf-human-mito -r main --help

# Example:
# --fastq matches pairs such as sample_R1.fq.gz and sample_R2.fq.gz
# --alignments matches files such as sample.cram and sample.crai
nextflow run lmtani/wf-human-mito -r main \
    --fastq 'fastqs/*_R{1,2}.fq.gz' \
    --alignments 'bams/*.cra{m,i}' \
    --reference /refs/Homo_sapiens_assembly38.fasta \
    --outdir outdir \
    -profile docker
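
The --fastq pattern in the example expects each sample to have both an R1 and an R2 file. As a rough pre-flight check, the sketch below lists R1 files that have no R2 mate. The function name find_unpaired_fastqs is hypothetical, and the <sample>_R1.fq.gz / <sample>_R2.fq.gz naming is taken from the example rather than required by the workflow.

```shell
# Sketch: report R1 FASTQ files without a matching R2 file.
# Assumes the <sample>_R1.fq.gz / <sample>_R2.fq.gz naming used in the example.
find_unpaired_fastqs() {
    fq_dir="$1"
    fq_unpaired=0
    for fq_r1 in "$fq_dir"/*_R1.fq.gz; do
        # If the glob matched nothing, the literal pattern is returned; skip it.
        [ -e "$fq_r1" ] || continue
        fq_r2="${fq_r1%_R1.fq.gz}_R2.fq.gz"
        if [ ! -e "$fq_r2" ]; then
            echo "no R2 mate for: $fq_r1" >&2
            fq_unpaired=1
        fi
    done
    # Exit status 0 when every R1 has a mate, 1 otherwise.
    return "$fq_unpaired"
}
```

Running find_unpaired_fastqs fastqs before the workflow prints one line per unpaired R1 file and returns a non-zero status if any are found.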

📤 Outputs

  • Alignment in BAM format (outdir/alignments/)
  • Variants in VCF format (outdir/variants/)
  • CSV file with information for all samples, e.g. haplotype groups (major and minor), coverage, etc.
  • All intermediate outputs (e.g. outdir/workspace/align_raw_reads/ contains all whole-genome alignments)

Useful links