Configuration file
Snakemake directly supports workflow configuration. The configuration is provided as a YAML file, e.g. 01_infos/config.yaml.
To configure your own workflow, fill in or replace the fields of the configuration file.
Absolute path of the Docker container .sif file built with Singularity (see the installation section of README.md).
This section lists the required parameters for the trimmomatic command.
A string value, PE (Illumina paired-end) or SE (Illumina single-end), describing the type of raw .fastq sequencing data provided as input.
The number of cores used by the trimmomatic command.
A list of four values, fastaWithAdaptersEtc, seedMismatches, palindromeClipThreshold and simpleClipThreshold:
- fastaWithAdaptersEtc: specifies the path to a FASTA file containing all the adapters, PCR sequences, etc. that you want to trim.
- seedMismatches: specifies the maximum mismatch count which will still allow a full match to be performed.
- palindromeClipThreshold: specifies how accurate the match between the two 'adapter ligated' reads must be for PE palindrome read alignment.
- simpleClipThreshold: specifies how accurate the match between any adapter etc. sequence must be against a read.
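As an illustration of how these four values fit together, the sketch below joins them into a single Trimmomatic `ILLUMINACLIP` step using the colon-separated form the tool expects. The function name and the sample values (including the adapter file name) are hypothetical and are not taken from this workflow.

```python
def illuminaclip_step(values):
    """Join the four list entries into one ILLUMINACLIP step string."""
    fasta, seed_mismatches, palindrome_threshold, simple_threshold = values
    return (
        f"ILLUMINACLIP:{fasta}:{seed_mismatches}:"
        f"{palindrome_threshold}:{simple_threshold}"
    )


# Hypothetical values; "adapters.fasta" is a placeholder file name.
step = illuminaclip_step(["adapters.fasta", 2, 30, 10])
print(step)  # ILLUMINACLIP:adapters.fasta:2:30:10
```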
Specifies the minimum Phred quality score required to keep a base. Low-quality bases are removed from the beginning of the read: as long as a base has a value below this threshold, it is removed and the next base is examined.
Specifies the minimum Phred quality score required to keep a base. Low-quality bases are removed from the end of the read: as long as a base has a value below this threshold, it is removed and the next base is examined (since trimmomatic starts from the 3' end, this is the base preceding the one just removed).
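The two end-trimming behaviours described above can be sketched on a plain list of Phred scores. This is a simplified illustration of the logic, not Trimmomatic's own code; the function names and the sample scores are made up for the example.

```python
def trim_leading(quals, threshold):
    """Drop bases from the start while their quality is below threshold."""
    start = 0
    while start < len(quals) and quals[start] < threshold:
        start += 1
    return quals[start:]


def trim_trailing(quals, threshold):
    """Drop bases from the end while their quality is below threshold."""
    end = len(quals)
    while end > 0 and quals[end - 1] < threshold:
        end -= 1
    return quals[:end]


# Hypothetical per-base Phred scores for one read.
quals = [2, 2, 20, 30, 2, 35, 2, 2]
trimmed = trim_trailing(trim_leading(quals, 3), 3)
print(trimmed)  # [20, 30, 2, 35]
```

Note that the low-quality base in the middle of the read is kept: both steps stop at the first base that meets the threshold.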
A list of two numbers, windowSize and requiredQuality:
- windowSize: specifies the number of bases to average across.
- requiredQuality: specifies the average quality required.
Perform a sliding window trimming, cutting once the average quality within the window falls below a threshold. By considering multiple bases, a single poor quality base will not cause the removal of high quality data later in the read.
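A minimal sketch of the sliding-window idea, assuming the read is represented as a list of Phred scores and that the cut is made at the start of the first window whose average falls below the threshold. Trimmomatic's exact cut position may differ, so treat this as an illustration only; the function name and sample scores are hypothetical.

```python
def sliding_window_trim(quals, window_size, required_quality):
    """Cut the read at the first window whose mean quality is too low.

    Simplifying assumptions: the cut is placed at the start of the failing
    window, and reads shorter than the window are returned unchanged.
    """
    for start in range(len(quals) - window_size + 1):
        window = quals[start:start + window_size]
        if sum(window) / window_size < required_quality:
            return quals[:start]
    return quals


# Hypothetical scores: one isolated poor base, then a degrading 3' end.
quals = [30, 30, 2, 30, 30, 30, 10, 8, 5, 3]
print(sliding_window_trim(quals, 4, 15))  # [30, 30, 2, 30, 30]
```

The isolated poor base at position 3 does not trigger a cut because the windows containing it still average above 15; the cut only happens once the quality drops across several consecutive bases.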
Specifies the minimum length of reads to be kept.
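To show how the parameters above might come together, here is a sketch that turns a configuration dictionary into a Trimmomatic argument list. The dictionary keys (`mode`, `threads`, `illuminaclip`, `leading`, `trailing`, `slidingwindow`, `minlen`), the file names, and the `trimmomatic` executable name are assumptions made for this example; check the workflow's config.yaml and rules for the actual field names.

```python
def trimmomatic_args(cfg, files):
    """Build a Trimmomatic command line from a config dict (hypothetical keys)."""
    mode = cfg["mode"]
    if mode not in ("PE", "SE"):
        raise ValueError(f"mode must be 'PE' or 'SE', got {mode!r}")
    # Trimming steps are applied in the order they appear on the command line.
    steps = [
        "ILLUMINACLIP:" + ":".join(str(v) for v in cfg["illuminaclip"]),
        f"LEADING:{cfg['leading']}",
        f"TRAILING:{cfg['trailing']}",
        "SLIDINGWINDOW:" + ":".join(str(v) for v in cfg["slidingwindow"]),
        f"MINLEN:{cfg['minlen']}",
    ]
    return ["trimmomatic", mode, "-threads", str(cfg["threads"]), *files, *steps]


# Hypothetical single-end configuration and placeholder file names.
cfg = {
    "mode": "SE",
    "threads": 4,
    "illuminaclip": ["adapters.fasta", 2, 30, 10],
    "leading": 3,
    "trailing": 3,
    "slidingwindow": [4, 15],
    "minlen": 36,
}
args = trimmomatic_args(cfg, ["reads.fastq", "reads.trimmed.fastq"])
print(" ".join(args))
```

Single-end mode takes one input and one output file; paired-end mode takes two inputs and four outputs (paired and unpaired for each mate), which would be passed in `files` the same way.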