LuninaPolina/SecondaryStructureAnalyzer

Secondary structure analysis research: formal grammars + neural networks

Solution structure

Parsing tool

Requirements (Ubuntu)

  • Install Mono
  • Install .NET Framework 4.7.2 via Wine
  • Install dotnet-sdk-3.0
  • Add the FSharp.Core package: dotnet add package FSharp.Core --version 4.5.0

Run

./YaccConstructor/src/SecondaryStructureExtracter/bin/Debug/SecondaryStructureExtracter.exe -g 'grammar.txt' -i 'input_sequences.fasta' -o 'output_dir/'

Optional argument: -f sets the output file format, for example -f csv (default: bmp)

An example grammar is provided in the repository root folder
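
The Run command above can be scripted when several FASTA files need to be parsed. The following is a minimal sketch, not part of the repository: the helper names build_command and run_batch are hypothetical, and it assumes the executable can be launched directly from the repository root exactly as in the command above. Only the -g, -i, -o and -f flags shown in this README are used.

```python
import subprocess
from pathlib import Path

# Path copied from the Run command in this README.
EXTRACTER = (
    "./YaccConstructor/src/SecondaryStructureExtracter/"
    "bin/Debug/SecondaryStructureExtracter.exe"
)


def build_command(grammar, fasta, out_dir, fmt=None):
    """Return the argument list for one run of the parsing tool."""
    out = str(out_dir)
    # The README example passes the output directory with a trailing slash;
    # keep it, on the assumption that the tool may rely on it.
    if not out.endswith("/"):
        out += "/"
    cmd = [EXTRACTER, "-g", str(grammar), "-i", str(fasta), "-o", out]
    if fmt is not None:
        # -f selects the output format (csv; the default is bmp).
        cmd += ["-f", fmt]
    return cmd


def run_batch(grammar, fasta_files, out_root, fmt=None):
    """Parse each FASTA file into its own sub-directory of out_root."""
    for fasta in fasta_files:
        out_dir = Path(out_root) / Path(fasta).stem
        out_dir.mkdir(parents=True, exist_ok=True)
        # check=True raises CalledProcessError if the tool exits with an error.
        subprocess.run(build_command(grammar, fasta, out_dir, fmt), check=True)
```

For example, build_command('grammar.txt', 'input_sequences.fasta', 'output_dir/', fmt='csv') reproduces the command above with the csv output format appended.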

Neural networks

Requirements

  • Python 3
  • tensorflow-gpu 1.x
  • Keras, scikit-image

Repository description

Experiments

  • 16S detection: a binary classifier that separates true 16S rRNA sequences from random parts of the genome
  • Chimeras detection: unfinished research on searching for chimeric sequences in biological databases
  • Secondary structure prediction: a model that predicts the RNA secondary structure contact map from the matrix produced by parsing
  • tRNA classification: solutions for several tRNA classification tasks with a small number of classes
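
To illustrate what "random parts of genome" could look like as negative examples for the 16S detection experiment, here is a minimal sketch. It is an assumption about the general idea, not the repository's actual data preparation script: the function names, the fixed fragment length and the FASTA record names are all hypothetical, and the sketch does not exclude fragments that overlap real 16S regions.

```python
import random


def sample_negative_fragments(genome, fragment_len, count, seed=0):
    """Draw `count` random substrings of length `fragment_len` from `genome`."""
    if fragment_len > len(genome):
        raise ValueError("fragment_len exceeds genome length")
    rng = random.Random(seed)  # seeded so that the sample is reproducible
    last_start = len(genome) - fragment_len
    starts = [rng.randint(0, last_start) for _ in range(count)]
    return [genome[s:s + fragment_len] for s in starts]


def to_fasta(sequences, prefix="random_fragment"):
    """Format sequences as FASTA text with hypothetical record names."""
    lines = []
    for i, seq in enumerate(sequences):
        lines.append(f">{prefix}_{i}")
        lines.append(seq)
    return "\n".join(lines) + "\n"
```

The resulting FASTA text has the same form as the input expected by the parsing tool's -i argument, so negative and positive examples could be parsed with the same grammar.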

Content

  • Links to all required datasets and trained model weights are listed in the data/datalinks.txt files
  • Model code and logs are in the model/ folder
  • Data processing scripts are stored in the scripts/ folder

Papers


Contact: lunina_polina@mail.ru

About

A project for genomic sequence analysis that combines formal grammars and neural networks.
