SafeSeqS Installation

Use pip to install the safeseqs package:

$ pip install safeseqs

SafeSeqS depends on several other packages; pip installs them automatically if they are not already installed.

Required packages:

scipy
pywin32 (on Windows machines)

Quick Start

Running SafeSeqS requires (1) a set of input files in the Study Directory and (2) a settings JSON file identifying the study parameters.

Runtime parameter descriptions


-d	Required	Directory containing the Study input files.
-r	Required	Run directory. It will be created under the Study directory.
-sf	Required	Settings file that identifies the parameters for this run. The settings file must exist in the SafeSeqS Data Directory.
-w	Optional	Number of concurrent worker sub-processes to run. Default: 1
-s	Optional	Start Stepname. Used when restarting a partially completed run. Processing will begin at the named step.
-e	Optional	End Stepname. Used when a partial run is desired. Processing will end before the named step.

Example: the input files are in a directory called C:\SAFESEQS\input, and the results are to be stored in the sub-directory \Nov09 under the C:\SAFESEQS\input directory. The file SettingsTemplate.json must exist in the SafeSeqS Project Data directory and is used to determine the run-time settings. Up to 6 sub-processes may run concurrently.

python -m safeseqs_controller -d C:\SAFESEQS\input -r Nov09 -sf SettingsTemplate.json -w 6
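
When a run is launched from another Python script, the command line above can be assembled programmatically. The following is a minimal sketch only: build_command is a hypothetical helper, not part of the safeseqs package, and it assumes the safeseqs_controller module is importable by the interpreter that runs the script. The option names are the ones listed in the runtime parameter descriptions.

```python
import sys


def build_command(study_dir, run_dir, settings_file,
                  workers=1, start_step=None, end_step=None):
    """Assemble the documented safeseqs_controller command line as a list.

    Hypothetical helper for illustration; it only builds the argument list
    and does not start a run.
    """
    cmd = [
        sys.executable, "-m", "safeseqs_controller",
        "-d", study_dir,       # Study directory with the input files
        "-r", run_dir,         # run directory, created under the Study directory
        "-sf", settings_file,  # settings file name
        "-w", str(workers),    # number of concurrent worker sub-processes
    ]
    # The start and end step names are optional and only added when given.
    if start_step:
        cmd += ["-s", start_step]
    if end_step:
        cmd += ["-e", end_step]
    return cmd


if __name__ == "__main__":
    # Same values as the example command above.
    print(build_command(r"C:\SAFESEQS\input", "Nov09",
                        "SettingsTemplate.json", workers=6))
```

The resulting list can be passed to subprocess.run(cmd, check=True) to start the run; passing a list rather than a single string avoids shell quoting problems with Windows paths.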

Required Input Files in the Study Directory

safeseqs.json - File identifying the Fastq input files, the barcodemap filename, and the ASCII adjustment used in the study data. The format of the file is:

{"reads_pattern" : "", "barcodes_pattern" : "", "reads_files" : ["fastq file 1 (just the file name - must be in the Study Directory)", "fastq file 2", ... ], "barcodes_files" : ["barcode file 1", "barcode file 2", ...], "barcodemap" : "well barcode association file", "ascii_adj" : 33}
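
A file in this format can be generated with Python's json module instead of being typed by hand. The sketch below is illustrative only: build_study_config is a hypothetical helper, and the Fastq and barcodemap file names in the usage section are made-up placeholders. Only the key names and the ascii_adj value of 33 come from the format shown above.

```python
import json


def build_study_config(reads_files, barcodes_files, barcodemap,
                       ascii_adj=33, reads_pattern="", barcodes_pattern=""):
    """Return a dict with the keys shown in the safeseqs.json format.

    Hypothetical helper for illustration. File names should be bare names
    of files that sit in the Study Directory.
    """
    return {
        "reads_pattern": reads_pattern,
        "barcodes_pattern": barcodes_pattern,
        "reads_files": list(reads_files),
        "barcodes_files": list(barcodes_files),
        "barcodemap": barcodemap,
        "ascii_adj": ascii_adj,
    }


if __name__ == "__main__":
    # Placeholder file names, assumed for this example only.
    config = build_study_config(
        reads_files=["reads_1.fastq", "reads_2.fastq"],
        barcodes_files=["barcodes_1.fastq", "barcodes_2.fastq"],
        barcodemap="barcodemap.txt",
    )
    # Print the JSON text; redirect or write it to safeseqs.json as needed.
    print(json.dumps(config, indent=2))
```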

Study Data Fastq Files - One or more sets of reads and barcode files.
barcodemap.txt - Tab-delimited file containing the mapping of samples to barcodes and primer sets. BarcodeMapRecord format: 'barcodeNumber', 'barcode', 'wbcPlateNumber', 'template', 'purpose', 'gEsWellOrTotalULUsed', 'mutOrTotalGEsWell', 'ampMatchName', 'row', 'col'.
primers.txt - Tab-delimited file containing the primers for the study. PrimerRecord format: 'ampMatchName', 'gene', 'read1', 'read2', 'ampSeq', 'target_len', 'chrom', 'readStrand', 'hg19_start', 'hg19_end'.
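
Before starting a run it can be useful to check that both tab-delimited files have the expected number of columns. The sketch below is not the package's own parser. It assumes that the files have no header row, that each line holds exactly one record, and that the columns appear in the order listed above. read_records is a hypothetical helper, and the sample row in the usage section contains made-up values.

```python
import csv
import io
from collections import namedtuple

# Field names are taken from the record formats listed above.
BarcodeMapRecord = namedtuple("BarcodeMapRecord", [
    "barcodeNumber", "barcode", "wbcPlateNumber", "template", "purpose",
    "gEsWellOrTotalULUsed", "mutOrTotalGEsWell", "ampMatchName", "row", "col",
])
PrimerRecord = namedtuple("PrimerRecord", [
    "ampMatchName", "gene", "read1", "read2", "ampSeq",
    "target_len", "chrom", "readStrand", "hg19_start", "hg19_end",
])


def read_records(handle, record_type):
    """Parse tab-delimited lines from an open text handle into named tuples.

    Assumes no header row. Raises ValueError when a line does not have
    one value per field of record_type. All values are kept as strings.
    """
    expected = len(record_type._fields)
    records = []
    for line_number, row in enumerate(csv.reader(handle, delimiter="\t"), start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != expected:
            raise ValueError(
                f"line {line_number}: expected {expected} columns, found {len(row)}")
        records.append(record_type(*row))
    return records


if __name__ == "__main__":
    # Made-up sample row, for illustration only.
    sample = "1\tACGTAC\t1\tsampleA\ttest\t10\t5\tAMP1\tA\t1\n"
    for record in read_records(io.StringIO(sample), BarcodeMapRecord):
        print(record.barcode, record.ampMatchName)
```

For real files, open them with open(path, newline="") and pass the handle to read_records together with the matching record type.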

Parameters in Settings File

Parameters in the settings file control decision making during processing runs. A SettingsTemplate.json file is delivered with the package and is located in the Study Data Directory. It can be copied and modified to create different run scenarios.


uidlength	Required	Length of the UID in the study data.
max_mismatches_for_used_reads	Required	Integer. Example 3
max_indels_for_used_reads	Required	Integer. Example 1
max_amp_per_UID_family	Required	Integer. Example 1
min_good_reads_usable_family	Required	Integer. Example 2
min_perc_good_reads_per_UID_family	Required	Integer. Example 95 = 95%
super_mut_perc_homegeneity	Required	Integer. Example 90 = 90%
default_indel_rate	Required	Float. Example 0.001
default_sbs_rate	Required	Float. Example 0.001
mark_UIDs_with_Ns_UnUsable	Optional	Valid values: Yes or No. Default: No
perform_opt_dup_removal	Optional	Valid values: Yes or No. Default: No
load_bad_bc	Optional	Valid values: Yes or No. Default: No
load_not_used_bc	Optional	Valid values: Yes or No. Default: No
save_merge	Optional	Valid values: Yes or No. Default: No
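
A modified copy of the settings file can be checked against the table above before it is used for a run. The sketch below is illustrative only and makes one important assumption that this page does not confirm: that the settings file is a flat JSON object whose keys are the parameter names listed above. check_settings is a hypothetical helper and not part of the safeseqs package. The uidlength value in the usage section is a made-up placeholder; the other values are the examples from the table.

```python
import json

# Parameter names copied from the table above.
REQUIRED_PARAMETERS = [
    "uidlength",
    "max_mismatches_for_used_reads",
    "max_indels_for_used_reads",
    "max_amp_per_UID_family",
    "min_good_reads_usable_family",
    "min_perc_good_reads_per_UID_family",
    "super_mut_perc_homegeneity",
    "default_indel_rate",
    "default_sbs_rate",
]
YES_NO_PARAMETERS = [
    "mark_UIDs_with_Ns_UnUsable",
    "perform_opt_dup_removal",
    "load_bad_bc",
    "load_not_used_bc",
    "save_merge",
]


def check_settings(settings):
    """Return a list of problems found in a settings dict (empty if none).

    Assumes a flat mapping of parameter name to value, which is an
    assumption about the file layout rather than a documented fact.
    """
    problems = []
    for name in REQUIRED_PARAMETERS:
        if name not in settings:
            problems.append(f"missing required parameter: {name}")
    for name in YES_NO_PARAMETERS:
        # Optional parameters default to "No" when absent.
        value = settings.get(name, "No")
        if value not in ("Yes", "No"):
            problems.append(f"{name} must be Yes or No, found {value!r}")
    return problems


if __name__ == "__main__":
    example = json.loads("""{
        "uidlength": 14,
        "max_mismatches_for_used_reads": 3,
        "max_indels_for_used_reads": 1,
        "max_amp_per_UID_family": 1,
        "min_good_reads_usable_family": 2,
        "min_perc_good_reads_per_UID_family": 95,
        "super_mut_perc_homegeneity": 90,
        "default_indel_rate": 0.001,
        "default_sbs_rate": 0.001,
        "save_merge": "Yes"
    }""")
    print(check_settings(example))
```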