SaLTy assigns a lineage to Staphylococcus aureus WGS data and is suitable for describing large-scale S. aureus genomic epidemiology.
SaLTy typing is highly accurate and can quickly analyse large volumes of S. aureus Illumina reads (fastq.gz) or genome assemblies (fasta).
Now Published in Microbial Genomics!
- python (v3.6 or greater)
- kma (v1.4.9 or greater)
- pandas (v1.5.0 or greater)
- mlst (v2.23.0 or greater)
conda install -c conda-forge -c bioconda salty
See below for Mac M1 installation.
salty.py -i <input_folder> -o <output_folder>
GENERAL:
-t THREADS, --threads THREADS
Number of threads (speeds up parsing raw reads).
-f, --force Overwrite existing output folder.
--report Only generate summary report from previous SALTy
outputs.
--check Checks that required dependencies are available.
-v, --version Returns current SaLTy version.
INPUT:
-i INPUT_FOLDER, --input_folder INPUT_FOLDER
Folder of genomes (*.fasta or *.fna) and/or paired-end
reads (each accession must have *_1.fastq.gz and
*_2.fastq.gz).
OUTPUT:
-o OUTPUT_FOLDER, --output_folder OUTPUT_FOLDER
Output Folder to save result.
-c, --csv_format Output file in csv format.
-s, --summary Concatenate all output assignments into single file.
DB & Program Paths:
-l LINEAGES, --lineages LINEAGES
Path to specific alleles for each lineage.
-k KMA_INDEX, --kma_index KMA_INDEX
Path to indexed KMA database.
-m, --mlstPrediction Explained in ReadMe. Used as backup when lineage is unable to be called
through SaLTy screening. Marked with *.
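Because each accession needs both read files, it can help to scan the input folder before starting a run. The sketch below is not part of SaLTy; `scan_input_folder` is a hypothetical helper, and it assumes reads are named `*_1.fastq.gz` / `*_2.fastq.gz` and assemblies end in `.fasta` or `.fna`, as described in the usage above.

```python
from pathlib import Path

# Suffixes taken from the SaLTy usage text; adjust if your files differ.
ASSEMBLY_SUFFIXES = (".fasta", ".fna")
READ1_SUFFIX = "_1.fastq.gz"
READ2_SUFFIX = "_2.fastq.gz"


def scan_input_folder(folder):
    """Return (assemblies, paired, unpaired) for the files in `folder`.

    assemblies: sorted assembly file names
    paired:     accessions with both read files present
    unpaired:   accessions with only one of the two read files
    """
    names = {p.name for p in Path(folder).iterdir() if p.is_file()}
    assemblies = sorted(n for n in names if n.endswith(ASSEMBLY_SUFFIXES))
    paired, unpaired = [], []
    for name in sorted(names):
        if name.endswith(READ1_SUFFIX):
            accession = name[: -len(READ1_SUFFIX)]
            if accession + READ2_SUFFIX in names:
                paired.append(accession)
            else:
                unpaired.append(accession)
        elif name.endswith(READ2_SUFFIX):
            accession = name[: -len(READ2_SUFFIX)]
            # A reverse read without its forward mate is also incomplete.
            if accession + READ1_SUFFIX not in names:
                unpaired.append(accession)
    return assemblies, paired, unpaired
```

Calling `scan_input_folder("/input/reads/folder/")` and inspecting the `unpaired` list shows which accessions are missing a mate file before any typing starts.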
Run SaLTy on a folder containing pairs of fastq files and/or assemblies.
salty --input_folder /input/reads/folder/ --output_folder /output/folder/name/ --threads INT
Output the assigned lineages in comma-separated (.csv) format.
salty --input_folder /input/reads/folder/ --output_folder /output/folder/name/ --csv_format
Genome Lineage SACOL0451 SACOL1908 SACOL2725
SRR992071338 15 24 20 24
SRR992044718 15 24 20 24
SRR10591328 15 24 20 69
Column | Description |
---|---|
Genome | Input strain ID (extracted from input files). |
Lineage | The lineage assigned by the SaLTy algorithm. |
SACOL0451 | The identified allele for the SACOL0451 locus. |
SACOL1908 | The identified allele for the SACOL1908 locus. |
SACOL2725 | The identified allele for the SACOL2725 locus. |
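
For downstream epidemiology it is often useful to tally how many genomes fall into each lineage. The following is a minimal sketch using only the Python standard library. It assumes a comma-separated output (as produced with `--csv_format`) whose header matches the columns described above; the exact header and delimiter of a real output file are assumptions, and the example rows are the ones shown earlier rewritten as CSV.

```python
import csv
import io
from collections import Counter

# Example rows from the output above, rewritten as CSV.
# Assumption: a real --csv_format file uses this header and a comma delimiter.
EXAMPLE = """Genome,Lineage,SACOL0451,SACOL1908,SACOL2725
SRR992071338,15,24,20,24
SRR992044718,15,24,20,24
SRR10591328,15,24,20,69
"""


def count_lineages(handle):
    """Count genomes per lineage from an open CSV file handle."""
    reader = csv.DictReader(handle)
    return Counter(row["Lineage"] for row in reader)


counts = count_lineages(io.StringIO(EXAMPLE))
for lineage, n in counts.most_common():
    print(f"lineage {lineage}: {n} genome(s)")
```

For a file on disk, pass `open(path, newline="")` instead of the `io.StringIO` wrapper.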
As of March 2023, not all required SaLTy dependencies have been built for both Intel and Mac M1 chipsets. Conda environments created on a Mac M1 can be configured to collect dependencies built only for Intel (osx-64). Additionally, Python v3.9.0 should be used to avoid compatibility issues with NumPy (required by pandas, which is required by SaLTy).
- Create a conda environment and configure the environment for Intel (osx-64).
conda create -n salty
conda activate salty
conda config --env --set subdir osx-64
- Install SaLTy with Python v3.9.0.
conda install -c bioconda salty python=3.9.0
It is possible to infer a SaLTy lineage from the sequence type assigned by multilocus sequence typing (MLST). During the development of SaLTy, a select number of MLST types were associated with SaLTy lineages. This list of MLST types is not exhaustive and will not be continually updated.
In some instances the MLST type can be used to infer the SaLTy lineage. This is referred to as mlstPrediction in the SaLTy usage. When SaLTy has attempted to predict a lineage from MLST, the result is marked with an asterisk (*).
The three cases below show how mlstPrediction affects the reported lineage.
#1 SaLTy predicted the lineage using only the three gene markers (no asterisk).
#2 SaLTy predicted the lineage from the MLST type. The three gene markers could not infer a lineage, so the associated MLST type was used instead (marked by asterisk).
#3 SaLTy was unable to predict a lineage. Both the three gene markers and MLST prediction were tried (marked by asterisk).
# Genome Lineage SACOL0451 SACOL1908 SACOL2725
1 SRR9920718 15 20 24 24
2 ERR109478 *4 13 - 16
3 ERR1213758 *No lin. - - -
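
When post-processing results, the asterisk convention can be decoded programmatically. The sketch below is one possible reading of the three cases above, not SaLTy's own code; `interpret_lineage` and its method labels are hypothetical, and it assumes an unassigned genome is reported as `*No lin.` as in case 3.

```python
def interpret_lineage(value):
    """Split a Lineage field into (lineage, method).

    Assumptions (based on the three example rows above):
      - no asterisk          -> called from the three gene markers
      - asterisk + lineage   -> inferred from the MLST type
      - asterisk + "No lin." -> neither approach gave a lineage
    """
    value = value.strip()
    used_mlst = value.startswith("*")
    lineage = value.lstrip("*").strip()
    if not used_mlst:
        return lineage, "three-gene markers"
    if lineage.lower().startswith("no lin"):
        return None, "unassigned"
    return lineage, "MLST-inferred"


# The Lineage values from the three example rows.
for field in ["15", "*4", "*No lin."]:
    print(repr(field), "->", interpret_lineage(field))
```

Keeping the method alongside the lineage makes it easy to report MLST-inferred assignments separately, since the MLST association list is not exhaustive.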