knotify

RNA secondary structure prediction engine.

This monorepo provides a grammar-based algorithm for predicting the pseudoknot patterns in the secondary structure of an RNA sequence.

Authors: Andrikos Christos, Makris Evaggelos, Pavlatos Christos, Rassias Georgios, Angelos Kolaitis

Run with Docker

knotify is written in a mix of Python and C code. The easiest way to run it is to build the Docker image and then run as follows. We do not currently provide pre-built Docker images, but that might change in the future.

# build docker image and tag as `knotify:dev`
docker build -t knotify:dev .

# run an example 'rna_analysis'
docker run --rm -it knotify:dev /knotify/bin/rna_analysis --sequence AAAAAACUAAUAGAGGGGGGACUUAGCGCCCCCCAAACCGUAACCCC

Development Environment

knotify has binary dependencies that require an Ubuntu 20.04 system. Building the code consists of two parts: setting up a Python 3 virtual environment and building the C parser libraries.

Note that the instructions below will probably not work on newer Ubuntu systems or on other Linux distributions (e.g. CentOS). This is because wheel-requirements.txt pins the ViennaRNA C library to the ubuntu2004 version, which requires a specific glibc version. In the future, we will build ViennaRNA from source to avoid this restriction.

$ make deps  # install package dependencies
$ make       # build the parser and setup the virtual environment at ./.venv

After installation, make sure to activate the virtual environment before running any of the commands below:

$ . ./.venv/bin/activate

Execute

rna_analysis

Analyze a single sequence. See --help for a complete list of options:

$ rna_analysis --sequence AAAAAACUAAUAGAGGGGGGACUUAGCGCCCCCCAAACCGUAACCCC

rna_benchmark

Run a benchmark over a set of cases defined in a YAML file and save the results in JSON format in result.json. See cases/cases.yaml for an example YAML file. See --help for a list of available options.

$ rna_benchmark --cases cases/cases.yaml --max-dd-size 2 --max-stem-allow-smaller 1 --allow-ug --prune-early --parser bruteforce > result.json
$ rna_benchmark --cases cases/cases.yaml --max-dd-size 2 --max-stem-allow-smaller 1 --allow-ug --prune-early --parser yaep > result.json

Calling directly from Python code

See scripts/00-example.py for using knotify directly from Python code.

Unit Tests

We use pytest for all unit tests. After enabling the virtual environment, you can run them with:

$ pytest

Implementation details

The algorithm consists of the following two steps:

1.  Parse multiple RNA subsequences to identify the potential pseudoknot structures.

2.  Choose the pseudoknot structure that is considered the most stable. This step is based on the concept of energy minimization.

The core algorithm was initially implemented in Python on top of the widely known NLTK package. Due to serious performance issues, we moved the parsing into C, using the yaep parser, which is able to parse ambiguous grammars.
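
As a rough illustration of step 2 above, the following sketch picks the lowest-energy candidate from a list. The `Candidate` type, the `most_stable` helper and the energy values are hypothetical placeholders, not knotify's actual data structures or energy model.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical container for one candidate structure (not knotify's API)."""
    dot_bracket: str
    energy: float  # free energy; lower is assumed to mean more stable


def most_stable(candidates: List[Candidate]) -> Candidate:
    """Return the candidate with the minimum energy."""
    if not candidates:
        raise ValueError("no candidate structures to choose from")
    return min(candidates, key=lambda c: c.energy)


# Made-up candidates and energies, for illustration only.
candidates = [
    Candidate("..((((..[[[..))))...]]]..", -7.2),
    Candidate("..((((.[[[[..))))..]]]]..", -9.5),
    Candidate("...(((..[[[..)))....]]]..", -4.1),
]
best = most_stable(candidates)
print(best.dot_bracket, best.energy)
```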

Scoring

Compare the predicted dot-bracket notation with the ground truth and build a confusion matrix:

Definition	Description
true positive	Ground truth has a stem here, and I have correctly found that stem (matching the pair, TODO: distance +- 1)
true negative	Ground truth does not have a stem, and I do not have a stem
false positive	I have predicted a stem here, but there is no stem in ground truth
false negative	I have predicted no stem, but ground truth has a stem here
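
The definitions above could be applied position by position, as in the minimal sketch below. It is not the scoring code shipped with knotify: the function names are hypothetical, only `()` and `[]` brackets are handled, matching is exact (the "distance +- 1" tolerance marked TODO above is not implemented), and a predicted pair that disagrees with the ground-truth pair is assumed to count as a false positive.

```python
from typing import Dict, List, Optional

# Closing bracket -> opening bracket, for the two bracket types handled here.
PAIRS = {")": "(", "]": "["}


def pair_table(dot_bracket: str) -> List[Optional[int]]:
    """Map each position to the index of its partner, or None if unpaired."""
    table: List[Optional[int]] = [None] * len(dot_bracket)
    stacks: Dict[str, List[int]] = {opener: [] for opener in PAIRS.values()}
    for i, symbol in enumerate(dot_bracket):
        if symbol in stacks:
            stacks[symbol].append(i)
        elif symbol in PAIRS:
            j = stacks[PAIRS[symbol]].pop()  # IndexError if brackets are unbalanced
            table[i], table[j] = j, i
    return table


def confusion_matrix(truth: str, prediction: str) -> Dict[str, int]:
    """Count per-position outcomes of a prediction against the ground truth."""
    if len(truth) != len(prediction):
        raise ValueError("structures must have the same length")
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for t, p in zip(pair_table(truth), pair_table(prediction)):
        if p is None:
            counts["tn" if t is None else "fn"] += 1
        elif p == t:
            counts["tp"] += 1
        else:
            # Assumption: a stem that is absent from, or paired differently in,
            # the ground truth is a false positive.
            counts["fp"] += 1
    return counts


print(confusion_matrix("((..[[..))..]]", "((..[...))...]"))
# {'tp': 6, 'tn': 6, 'fp': 0, 'fn': 2}
```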

Loop sizes and indices

Unused right loop:

inclusive_start_index = left_core_inner + left_loop_stems + 1
unused_right_loop_size = right_core_outer - inclusive_start_index
inclusive_end_index = right_core_outer - 1

Unused left loop:

unused_left_loop_size = left_loop_size - right_core_stems
inclusive_start_index = left_core_outer + 1
inclusive_end_index = inclusive_start_index + left_loop_size - right_core_stems - 1
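
The two sets of formulas above can be checked with a small worked example. This is a sketch only: the `Loop` type and the function names are hypothetical, and the input values are arbitrary numbers chosen to show that, in both cases, the size equals `end - start + 1`.

```python
from typing import NamedTuple


class Loop(NamedTuple):
    start: int  # inclusive start index
    end: int    # inclusive end index
    size: int


def unused_right_loop(left_core_inner: int, left_loop_stems: int,
                      right_core_outer: int) -> Loop:
    """Apply the 'unused right loop' formulas."""
    start = left_core_inner + left_loop_stems + 1
    return Loop(start, right_core_outer - 1, right_core_outer - start)


def unused_left_loop(left_core_outer: int, left_loop_size: int,
                     right_core_stems: int) -> Loop:
    """Apply the 'unused left loop' formulas."""
    size = left_loop_size - right_core_stems
    start = left_core_outer + 1
    return Loop(start, start + size - 1, size)


# Arbitrary, made-up index values for illustration.
right = unused_right_loop(left_core_inner=10, left_loop_stems=3, right_core_outer=20)
left = unused_left_loop(left_core_outer=2, left_loop_size=7, right_core_stems=3)
print(right)  # Loop(start=14, end=19, size=6)
print(left)   # Loop(start=3, end=6, size=4)
```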
