Quick Start
SafeSeqS accepts the run-time parameters listed below. It also requires (1) a set of input files in the Study Directory and (2) a settings JSON file, located in the Study Data Directory, that identifies the study parameters.
| Argument | Required/Optional | Description |
| --- | --- | --- |
| -d | Required | Directory containing the Study input files. |
| -r | Required | Run directory. Will be created under the Study directory. |
| -sf | Required | Settings file that identifies the parameters for this run. The settings file must exist in the SafeSeqS Data Directory. |
| -w | Optional | Number of concurrent worker sub-processes to run. Default: 1 |
| -s | Optional | Start Stepname. Used when restarting a partially completed run. Processing will begin at the named step. |
| -e | Optional | End Stepname. Used when a partial run is desired. Processing will end before the named step. |
Example:
python -m safeseqs_controller -d C:\labName\studyName -r runName -sf SettingsTemplate.json
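For scripted or repeated runs, the command line above can be assembled programmatically. The following is a minimal sketch, not part of the package: `build_safeseqs_command` is a hypothetical helper, and it uses only the arguments documented in the table above. Valid step names for `-s` and `-e` are not listed in this document, so they are passed through unchecked.

```python
import sys


def build_safeseqs_command(study_dir, run_name, settings_file,
                           workers=1, start_step=None, end_step=None):
    """Assemble the argument list for one SafeSeqS run (hypothetical helper)."""
    if workers < 1:
        raise ValueError("workers must be at least 1")

    # Required arguments: -d, -r and -sf.
    cmd = [
        sys.executable, "-m", "safeseqs_controller",
        "-d", str(study_dir),
        "-r", run_name,
        "-sf", settings_file,
    ]

    # Optional arguments are added only when they differ from the defaults.
    if workers != 1:
        cmd += ["-w", str(workers)]
    if start_step is not None:
        cmd += ["-s", start_step]   # processing begins at this step
    if end_step is not None:
        cmd += ["-e", end_step]     # processing ends before this step
    return cmd
```

The returned list can be handed to `subprocess.run(cmd, check=True)`; passing a list rather than a single string avoids quoting problems with Windows paths such as `C:\labName\studyName`.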
An example of each input file described below is provided in the \example sub-directory.
- safeseqs.json - Contains information about the study data. It identifies the FASTQ input files, the well barcode association filename, the UID length used in the study data, and the ASCII adjustment used in the study data.
- Study Data FASTQ Files - One or more sets of read and barcode files, which must be in the Study Directory. Both FASTQ and FASTQ.gz are supported.
- barcodemap.txt - Tab delimited file containing mapping of samples to barcodes and primer sets.
- primers.txt - Tab delimited file containing primers for the study.
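Before starting a run, it can help to confirm that the input files listed above are present. The sketch below is illustrative only: `missing_study_inputs` is a hypothetical helper, it assumes all of the listed files sit directly in the Study Directory under the names shown, and it assumes read files end in `.fastq` or `.fastq.gz`. Because safeseqs.json identifies the barcode association filename, the actual name may differ from barcodemap.txt in a given study.

```python
from pathlib import Path

# File names as listed above (assumed to live directly in the Study Directory).
EXPECTED_FILES = ("safeseqs.json", "barcodemap.txt", "primers.txt")
# Assumed extensions for plain and gzip-compressed FASTQ files.
FASTQ_SUFFIXES = (".fastq", ".fastq.gz")


def missing_study_inputs(study_dir):
    """Return a list describing expected inputs that were not found."""
    study_dir = Path(study_dir)
    if not study_dir.is_dir():
        raise NotADirectoryError(f"Study Directory not found: {study_dir}")

    missing = [name for name in EXPECTED_FILES
               if not (study_dir / name).is_file()]

    # At least one FASTQ or FASTQ.gz file is needed.
    has_fastq = any(
        path.is_file() and path.name.lower().endswith(FASTQ_SUFFIXES)
        for path in study_dir.iterdir()
    )
    if not has_fastq:
        missing.append("FASTQ or FASTQ.gz read files")
    return missing
```

An empty list means every expected input was found; otherwise the list names what to add before launching the controller.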
A settings file must be identified at run time with the -sf argument. Parameters in the settings file control decision making during processing runs. A SettingsTemplate.json is delivered with the package in the Study Data Directory. It can be copied and modified to create different run scenarios.
| Parameter | Required/Optional | Description |
| --- | --- | --- |
| max_mismatches_for_used_reads | Required | Integer. Example: 3 |
| max_indels_for_used_reads | Required | Integer. Example: 1 |
| max_amp_per_UID_family | Required | Integer. Example: 1 |
| min_good_reads_usable_family | Required | Integer. Example: 2 |
| min_perc_good_reads_per_UID_family | Required | Integer. Example: 95 = 95% |
| super_mut_perc_homegeneity | Required | Integer. Example: 90 = 90% |
| default_indel_rate | Required | Float. Example: .001 |
| default_sbs_rate | Required | Float. Example: .001 |
| mark_UIDs_with_Ns_UnUsable | Optional | Valid values: Yes or No. Default: No |
| perform_opt_dup_removal | Optional | Valid values: Yes or No. Default: No |
| opt_dup_distance | Required with perform_opt_dup_removal | Integer. Example: 5000 |
| fh_limit | Required | File handle limit for system. Example: 2048 |
| load_bad_bc | Optional | Valid values: Yes or No. Default: No |
| load_not_used_bc | Optional | Valid values: Yes or No. Default: No |
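The rules in the table above can be checked before a run. The following is a rough sketch under several assumptions that this document does not confirm: that the settings file is a flat JSON object keyed by the parameter names, that numeric values are stored as JSON numbers rather than strings, and that fh_limit is an integer. `check_settings` and `load_settings` are hypothetical helpers, not part of the package; compare against the delivered SettingsTemplate.json before relying on them.

```python
import json

REQUIRED_INT = (
    "max_mismatches_for_used_reads", "max_indels_for_used_reads",
    "max_amp_per_UID_family", "min_good_reads_usable_family",
    "min_perc_good_reads_per_UID_family", "super_mut_perc_homegeneity",
    "fh_limit",  # assumed to be an integer (example value: 2048)
)
REQUIRED_FLOAT = ("default_indel_rate", "default_sbs_rate")
YES_NO = (
    "mark_UIDs_with_Ns_UnUsable", "perform_opt_dup_removal",
    "load_bad_bc", "load_not_used_bc",
)


def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def check_settings(settings):
    """Return a list of problems found in a settings dict (empty if none)."""
    problems = []
    for key in REQUIRED_INT:
        if not _is_int(settings.get(key)):
            problems.append(f"{key}: expected an integer")
    for key in REQUIRED_FLOAT:
        value = settings.get(key)
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{key}: expected a number")
    for key in YES_NO:
        # Optional parameters default to "No" when absent.
        if settings.get(key, "No") not in ("Yes", "No"):
            problems.append(f"{key}: expected Yes or No")
    # opt_dup_distance is required only when duplicate removal is enabled.
    if (settings.get("perform_opt_dup_removal", "No") == "Yes"
            and not _is_int(settings.get("opt_dup_distance"))):
        problems.append("opt_dup_distance: required with perform_opt_dup_removal")
    return problems


def load_settings(path):
    """Read a settings JSON file into a dict."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

Note that JSON itself does not allow a bare leading decimal point, so a rate written as .001 in the table has to appear as 0.001 if it is stored as a JSON number.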
- COSMIC.txt - Tab delimited file containing COSMIC values for known single base pair mutations. An empty file is delivered with the package.
- dbSNP.txt - Tab delimited file containing dbSNP values for known single base pair mutations. An empty file is delivered with the package.