Configuration file

Pierre-Edouard Guerin edited this page Mar 3, 2020 · 7 revisions

Snakemake directly supports workflow configuration. The configuration is provided as a YAML file, e.g. 01_infos/config.yaml.

To configure your own workflow, fill in or replace the fields of the configuration file described below.

Singularity

Absolute path to the Docker container .sif file built with Singularity (see the installation section of README.md).

Trimmomatic

This section lists the required parameters for the trimmomatic command.

type

A string value, PE (Illumina paired-end) or SE (Illumina single-end), describing the type of raw .fastq sequencing data provided as input.

threads

The number of cores used by the trimmomatic command.

illuminaclip

A list of four values: fastaWithAdaptersEtc, seedMismatches, palindromeClipThreshold, simpleClipThreshold

  • fastaWithAdaptersEtc: specifies the path to a fasta file containing all the adapters, PCR sequences, etc. that you want to trim.
  • seedMismatches: specifies the maximum mismatch count which will still allow a full match to be performed.
  • palindromeClipThreshold: specifies how accurate the match between the two 'adapter ligated' reads must be for PE palindrome read alignment.
  • simpleClipThreshold: specifies how accurate the match between any adapter etc. sequence must be against a read.
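As an illustration of how the four values fit together, here is a minimal Python sketch that joins them into the colon-separated ILLUMINACLIP step string that Trimmomatic accepts on its command line. The function name illuminaclip_step, the file name adapters.fa and the numeric values are hypothetical and chosen only for the example; the workflow itself may build this string differently.

```python
def illuminaclip_step(values):
    """Format a four-item illuminaclip list as a Trimmomatic step string."""
    # Unpacking fails loudly if the list does not hold exactly four items.
    fasta, seed_mismatches, palindrome_threshold, simple_threshold = values
    return (
        f"ILLUMINACLIP:{fasta}:{seed_mismatches}:"
        f"{palindrome_threshold}:{simple_threshold}"
    )


# Hypothetical values, for illustration only.
print(illuminaclip_step(["adapters.fa", 2, 30, 10]))
```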

leading

Specifies the minimum Phred quality score required to keep a base. Removes low-quality bases from the beginning of the read: as long as a base has a value below this threshold, it is removed and the next base is investigated.

trailing

Specifies the minimum Phred quality score required to keep a base. Removes low-quality bases from the end of the read: as long as a base has a value below this threshold, it is removed and the next base is investigated (since trimmomatic starts from the 3' end, this is the base preceding the one just removed).
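The behaviour of leading and trailing can be sketched on a plain list of per-base quality scores. This is a simplified Python illustration of the rule described above, not Trimmomatic's own code; the function names and the example scores are hypothetical.

```python
def trim_leading(quals, threshold):
    """Drop bases from the start while their quality is below threshold."""
    start = 0
    while start < len(quals) and quals[start] < threshold:
        start += 1
    return quals[start:]


def trim_trailing(quals, threshold):
    """Drop bases from the end while their quality is below threshold."""
    end = len(quals)
    while end > 0 and quals[end - 1] < threshold:
        end -= 1
    return quals[:end]


# Hypothetical per-base Phred scores for one read.
quals = [2, 2, 30, 35, 1, 32, 3, 2]
trimmed = trim_trailing(trim_leading(quals, 5), 5)
print(trimmed)  # [30, 35, 1, 32]
```

Note that the low-quality base in the middle of the read (score 1) is kept: both steps stop at the first base that meets the threshold.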

slidingwindow

A list of two numbers windowSize, requiredQuality:

  • windowSize: specifies the number of bases to average across
  • requiredQuality: specifies the average quality required

Performs sliding-window trimming, cutting the read once the average quality within the window falls below the threshold. Because multiple bases are considered, a single poor-quality base will not cause the removal of high-quality data later in the read.
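A simplified Python sketch of the idea follows. It scans windows from the start of the read and cuts at the start of the first window whose average falls below requiredQuality. The exact cut position and the handling of reads shorter than the window are assumptions of this sketch and may differ from Trimmomatic's actual implementation; the function name and scores are hypothetical.

```python
def sliding_window_trim(quals, window_size, required_quality):
    """Cut the read at the first window whose mean quality is too low."""
    # Assumption: reads shorter than one window are returned unchanged.
    for start in range(len(quals) - window_size + 1):
        window = quals[start:start + window_size]
        if sum(window) / window_size < required_quality:
            return quals[:start]
    return quals


# Hypothetical scores: one isolated poor base, then a degrading tail.
quals = [30, 30, 2, 30, 30, 30, 10, 8, 6, 4]
print(sliding_window_trim(quals, 4, 15))  # [30, 30, 2, 30, 30]
```

The isolated base with score 2 survives because every window containing it still averages above 15; the read is only cut where the tail drags the window average down.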

minlen

Specifies the minimum length of reads to be kept.
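To show how the fields above relate to a Trimmomatic invocation, here is a hedged Python sketch that turns a configuration dictionary into the mode/option arguments and the ordered list of trimming steps. The dictionary layout, the function name trimmomatic_args, the step order and all values are assumptions made for illustration; check 01_infos/config.yaml and the workflow rules for the actual structure. Input and output file arguments are deliberately left out.

```python
def trimmomatic_args(cfg):
    """Return (options, steps) built from a trimmomatic config mapping."""
    if cfg["type"] not in ("PE", "SE"):
        raise ValueError("type must be 'PE' or 'SE'")
    options = [cfg["type"], "-threads", str(cfg["threads"])]
    # List-valued fields become colon-separated step arguments.
    clip = ":".join(str(v) for v in cfg["illuminaclip"])
    window = ":".join(str(v) for v in cfg["slidingwindow"])
    steps = [
        f"ILLUMINACLIP:{clip}",
        f"LEADING:{cfg['leading']}",
        f"TRAILING:{cfg['trailing']}",
        f"SLIDINGWINDOW:{window}",
        f"MINLEN:{cfg['minlen']}",
    ]
    return options, steps


# Hypothetical configuration values, for illustration only.
cfg = {
    "type": "PE",
    "threads": 4,
    "illuminaclip": ["adapters.fa", 2, 30, 10],
    "leading": 3,
    "trailing": 3,
    "slidingwindow": [4, 15],
    "minlen": 36,
}
options, steps = trimmomatic_args(cfg)
print(" ".join(options + steps))
```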