SiftSeq

A state-of-the-art CNN+LSTM neural network model that predicts whether short DNA sequences are viral, human, or bacterial in origin.

Performance

SiftSeq significantly outperforms benchmarks set by the recent classifiers ViraMiner and VirNet, while being able to handle more classes and shorter sequences.

Installation

If you do not already have Docker, install it.
Pull the Docker image for SiftSeq from Docker Hub. On the command line, run:

docker pull elanstop/sift-seq:latest

Run a container from this image. On the command line, run:

docker run -it elanstop/sift-seq:latest

Usage

To generate predictions for an input file, for example all_raw_reads.fasta in the data folder, cd to the sift_seq directory within the container and run:

python make_prediction.py all_raw_reads.fasta output.csv

where output.csv is the name you choose for the output file that will hold the predictions. To run predictions on your own data, copy input files from your machine into the container using docker cp.
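Before copying your own reads into the container, it can help to confirm that the file parses as FASTA and to see how long the reads are. The following is a minimal sketch, not part of SiftSeq: the helper names read_fasta and summarize are hypothetical, and the allowed alphabet (A, C, G, T, N) is an assumption about what the input should contain.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Assumption: reads use the standard DNA alphabet plus N for unknown bases.
VALID_BASES = set("ACGTN")


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines may be wrapped, so collect them piecewise.
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def summarize(lines: Iterable[str]) -> Dict[str, int]:
    """Count reads, reads with unexpected characters, and min/max length."""
    lengths = []
    invalid = 0
    for _, seq in read_fasta(lines):
        lengths.append(len(seq))
        if set(seq) - VALID_BASES:
            invalid += 1
    return {
        "reads": len(lengths),
        "invalid": invalid,
        "min_len": min(lengths, default=0),
        "max_len": max(lengths, default=0),
    }


# Tiny in-memory example; read_2 contains characters outside the alphabet.
example = [">read_1", "ACGTAC", "GGT", ">read_2", "ACGTXX"]
print(summarize(example))
```

To check a real file, pass an open file handle instead of the list, for example summarize(open("all_raw_reads.fasta")).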

To train the model using the example data, cd to the sift_seq directory within the container and run:

python train.py

Trained models are saved in the saved_models folder after each epoch and labelled with their validation accuracy. To supply your own training data, simply use docker cp to transfer files to the container, and modify the paths within train.py to point to your chosen files.
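The docker cp transfers mentioned above can be sketched as follows. This is an illustration under stated assumptions, not a script shipped with SiftSeq: the container name, the file name, and the paths inside the container are all hypothetical placeholders, so substitute the values reported by docker ps and the actual locations in your container. The sketch only prints the commands; uncomment the subprocess line to execute them.

```python
import shlex
import subprocess  # only needed if the run line below is uncommented
from typing import List


def copy_into_container(local_path: str, container: str, container_path: str) -> List[str]:
    """Build: docker cp LOCAL_PATH CONTAINER:CONTAINER_PATH"""
    return ["docker", "cp", local_path, f"{container}:{container_path}"]


def copy_from_container(container: str, container_path: str, local_path: str) -> List[str]:
    """Build: docker cp CONTAINER:CONTAINER_PATH LOCAL_PATH"""
    return ["docker", "cp", f"{container}:{container_path}", local_path]


# Hypothetical values: replace with your container name or ID and real paths.
CONTAINER = "my_siftseq_container"
commands = [
    # Send your own training data into the container.
    copy_into_container("my_training_reads.fasta", CONTAINER, "/path/in/container/data/"),
    # Retrieve the models saved during training.
    copy_from_container(CONTAINER, "/path/in/container/saved_models", "./saved_models"),
]

for cmd in commands:
    print(shlex.join(cmd))
    # subprocess.run(cmd, check=True)  # uncomment to actually run the copy
```

After copying training data in, remember to update the paths in train.py so they point to the new files.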

Contact

Elan Stopnitzky, e.stopnitzky@gmail.com
