Synsor is a tool for identifying engineered DNA sequences. To accomplish this, Synsor leverages k-mer signature differences between natural and engineered DNA sequences and uses an artificial neural network to predict which sequences are likely to have been engineered. The model was trained on the 7-mer frequencies of natural plasmid and synthetic vector sequences from NCBI.
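To illustrate what a k-mer frequency signature looks like, here is a minimal Python sketch. It is not Synsor's own implementation: the function name `kmer_frequencies` is hypothetical, and it assumes that k-mers containing ambiguous bases (such as N) are skipped and that reverse complements are not merged. Synsor's `count` command may handle these cases differently.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k=7):
    """Return the relative frequency of every A/C/G/T k-mer in `sequence`.

    Illustrative only: a hypothetical helper, not part of Synsor.
    """
    sequence = sequence.upper()
    # Slide a window of width k across the sequence and tally each k-mer.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Enumerate all 4**k possible k-mers so the feature vector has a fixed length.
    all_kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Assumption: k-mers containing non-ACGT characters are ignored.
    total = sum(counts[kmer] for kmer in all_kmers)
    if total == 0:
        return {kmer: 0.0 for kmer in all_kmers}
    return {kmer: counts[kmer] / total for kmer in all_kmers}
```

With `k=7` the result has 4^7 = 16,384 entries per sequence, which is the kind of fixed-length vector a neural network can take as input.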
Create the conda environment: conda env create -f environment.yml
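Count the 7-mer frequencies of the sequences in a FASTA file:
python synsor.py \
count \
-i test.fa \
-k 7 \
-o freq_dir
Predict which sequences are likely to have been engineered, using the frequencies from the previous step:
python synsor.py \
predict \
-f freq_dir \
-m model \
-o output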
The source code is licensed under the GNU Lesser General Public License (LGPL). Users can contribute by commenting on the issue tracker or the wiki, or by contacting the authors directly by e-mail (see below).
For more information, please refer to the following article:
Aidan P. Tay, Kieran Didi, Anuradha Wickramarachchi, Denis C. Bauer, Laurence O. W. Wilson, Maciej Maselko.
Synsor: a tool for alignment-free detection of engineered DNA sequences.
Frontiers in Bioengineering and Biotechnology, 2024, 12:1375626.
PMID:39070163
Contact: Aidan Tay (a.tay@unswalumni.com)