FHI's SARS-CoV-2 Illumina Pipeline

Bioinformatic pipeline for SARS-CoV-2 sequence analysis used at Folkehelseinstituttet (FHI, the Norwegian Institute of Public Health).

Description

Docker-based solution for sequence analysis of SARS-CoV-2 Illumina samples

Primer schemes supported

Installation

git clone https://github.com/garcia-nacho/FHI_SC2_Pipeline_Illumina
cd FHI_SC2_Pipeline_Illumina
docker build -t garcianacho/fhisc2:Illumina .

Note that building the image for the first time can take up to two hours.

Alternatively, it is possible to pull updated builds from Docker Hub:

docker pull garcianacho/fhisc2:Illumina
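
The two installation routes above can be combined in a small helper. This is a minimal sketch rather than part of the pipeline: the function name ensure_image is hypothetical, and it assumes it is called from the root of the cloned repository so that the fallback build can find the Dockerfile.

```shell
#!/bin/sh
# Image name taken from the commands above.
IMAGE="garcianacho/fhisc2:Illumina"

# ensure_image (hypothetical helper): try to pull the prebuilt image first,
# and fall back to a local build only if the pull fails.
ensure_image() {
  if docker pull "$IMAGE"; then
    return 0
  fi
  echo "Pull failed; building locally (the first build can take up to two hours)." >&2
  # Assumes the current directory is the cloned FHI_SC2_Pipeline_Illumina repository.
  docker build -t "$IMAGE" .
}

# Usage:
#   cd FHI_SC2_Pipeline_Illumina && ensure_image
```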

Running the pipeline

ArticV4:
docker run -it --rm -v $(pwd):/home/docker/Fastq garcianacho/fhisc2:Illumina SARS-CoV-2_Illumina_Docker_V12.sh ArticV4

ArticV3:
docker run -it --rm -v $(pwd):/home/docker/Fastq garcianacho/fhisc2:Illumina SARS-CoV-2_Illumina_Docker_V12.sh ArticV3

Note that older versions of Docker might require the flag --privileged, and multi-user systems might require the flag -u 1000.
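
The commands above differ only in the primer scheme and in the optional flags, so they can be wrapped in one function. The sketch below is illustrative only: run_pipeline and the EXTRA_FLAGS variable are hypothetical names, and the only schemes it accepts are the two shown above.

```shell
#!/bin/sh
# run_pipeline SCHEME (hypothetical helper): run the pipeline on the current
# directory with the given primer scheme.
# EXTRA_FLAGS is an assumed, optional variable for flags such as
# "--privileged" or "-u 1000".
run_pipeline() {
  scheme="$1"
  case "$scheme" in
    ArticV3|ArticV4) ;;
    *)
      echo "Unsupported primer scheme: $scheme (expected ArticV3 or ArticV4)" >&2
      return 1
      ;;
  esac
  # EXTRA_FLAGS is deliberately unquoted so that "-u 1000" splits into two arguments.
  docker run -it --rm ${EXTRA_FLAGS:-} -v "$(pwd)":/home/docker/Fastq \
    garcianacho/fhisc2:Illumina SARS-CoV-2_Illumina_Docker_V12.sh "$scheme"
}

# Usage, from inside the experiment folder:
#   run_pipeline ArticV4
#   EXTRA_FLAGS="-u 1000" run_pipeline ArticV3
```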

The script expects the following folder structure, in which the fastq.gz files of each sample are placed in their own folder:

./ExpXX    
  |-ExperimentXX.xlsx      
  |-Sample1     
      |-Sample1_SX_LXXXX_R1.fastq.gz       
      |-Sample1_SX_LXXXX_R2.fastq.gz      
  |-Sample2      
      |-Sample2_SX_LXXXX_R1.fastq.gz   
      |-Sample2_SX_LXXXX_R2.fastq.gz   
  |-Sample3
      |-Sample3_SX_LXXXX_R1.fastq.gz
      |-Sample3_SX_LXXXX_R2.fastq.gz
  |-...
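
Before starting a run, it can help to confirm that every sample folder contains both read files. The following is a minimal sketch, not part of the pipeline: check_layout is a hypothetical helper, and it assumes that paired files end in _R1.fastq.gz and _R2.fastq.gz as in the tree above.

```shell
#!/bin/sh
# check_layout DIR (hypothetical helper): report every sample folder under DIR
# that lacks an R1 or R2 fastq.gz file. Returns 1 if anything is missing.
check_layout() {
  exp_dir="${1:-.}"
  status=0
  for sample in "$exp_dir"/*/; do
    [ -d "$sample" ] || continue
    name=$(basename "$sample")
    for mate in R1 R2; do
      # ls succeeds only if at least one matching file exists.
      if ! ls "$sample"*_"$mate".fastq.gz >/dev/null 2>&1; then
        echo "Missing $mate file in $name"
        status=1
      fi
    done
  done
  return $status
}

# Usage:
#   check_layout ./ExpXX
```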

The script also expects an .xlsx file that contains the position of each sample on a 96-well plate and its DNA concentration (alternatively, this column can be used for Ct values). If the file is not properly formatted, the script will run without errors, but the quality-control plot will either not be generated or contain errors. Note that the script takes the name of the experiment from the name of the xlsx file; if the file is not found, the names of the output files might be incorrect. A template of the xlsx file (Template_FHISC2_Illumina.xlsx) is included in the repository.
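
Because a missing xlsx file does not stop the run, a quick check beforehand can catch the problem early. The sketch below is an illustration only: experiment_name is a hypothetical helper, and it assumes that the experiment name is the file name without the .xlsx extension and that the folder holds exactly one such file. How the pipeline itself derives the name is not shown here.

```shell
#!/bin/sh
# experiment_name DIR (hypothetical helper): print the assumed experiment name,
# i.e. the name of the single .xlsx file in DIR without its extension.
experiment_name() {
  exp_dir="${1:-.}"
  count=0
  xlsx=""
  for f in "$exp_dir"/*.xlsx; do
    [ -f "$f" ] || continue
    count=$((count + 1))
    xlsx="$f"
  done
  if [ "$count" -ne 1 ]; then
    echo "Expected exactly one .xlsx file in $exp_dir, found $count" >&2
    return 1
  fi
  basename "$xlsx" .xlsx
}

# Usage:
#   experiment_name ./ExpXX
```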

Outputs

-Summary including mutations found, pangolin lineage, number of reads, coverage, depth, etc.
-Bam files
-Consensus sequences
-Aligned consensus sequences
-Consensus nucleotide sequence for gene S
-Indels and frameshift identification
-Quality-control plot for the plate to detect possible contamination
-Phylogenetic-tree plot of the samples
-Noise during variant calling across the genome
-Quality-control for contaminations/low-quality samples
-Amplicon efficacy of the selected primer-set for all the samples
