ThomasBeder/CLEARER

CLassifier of Essentiality AcRoss EukaRyotes (CLEARER)


CLEARER is a machine learning approach for essential gene prediction across eukaryotes.

ATTENTION: The archived source code, data sets and trained models are available on Zenodo (https://doi.org/10.5281/zenodo.5557738).

The preprint describing CLEARER is available at https://www.biorxiv.org/content/10.1101/2021.04.15.439934v1. CLEARER uses many diverse features, which can be generated with the scripts in the "/Feature_generation" directory. Feature names and the tools used to generate them are listed in "Feature_description.csv". DNA and protein sequences for feature generation can be found in "/Sequences". The class labels for each organism can be found in "/Class_labels". For Drosophila melanogaster, Homo sapiens and Mus musculus, both cellular and organismal essential gene information is available.
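As a rough illustration of how "Feature_description.csv" might be used, the sketch below groups feature names by the tool that generates them. The column names `feature` and `tool`, and the feature names in the example table, are assumptions made for this example; check the actual file header before adapting it.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stand-in for Feature_description.csv.
# The column names and the feature names below are invented for illustration.
EXAMPLE_CSV = """feature,tool
codon_adaptation_index,CodonW
gc_content,CodonW
signal_peptide_score,SignalP
"""


def features_by_tool(handle, feature_col="feature", tool_col="tool"):
    """Group feature names by the tool listed for each of them."""
    groups = defaultdict(list)
    for row in csv.DictReader(handle):
        groups[row[tool_col]].append(row[feature_col])
    return dict(groups)


# With the real file, pass open("Feature_description.csv", newline="") instead.
groups = features_by_tool(io.StringIO(EXAMPLE_CSV))
for tool, names in sorted(groups.items()):
    print(f"{tool}: {', '.join(names)}")
```

If the real file uses different headers, pass them through `feature_col` and `tool_col`.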

Attention! Feature generation and machine learning are time-consuming. All features for the organisms are pre-built and can be found in "/Features". Feature selection and machine learning are exemplified for Saccharomyces cerevisiae in "Feature_selection_and_machine_learning.R".
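Before any feature selection or model training, the pre-built features have to be matched to the class labels for the same organism. The sketch below shows one way to do that join with the Python standard library. It is not the repository's R workflow. The file layout, the column names `gene_id` and `essential`, and the gene identifiers and values are all placeholders invented for this example.

```python
import csv
import io

# Hypothetical stand-ins for one file from "/Features" and one from "/Class_labels".
# Column names, gene identifiers and values are placeholders, not real data.
FEATURES_CSV = """gene_id,feature_1,feature_2
gene_A,0.12,3.4
gene_B,0.56,1.2
gene_C,0.90,2.2
"""

LABELS_CSV = """gene_id,essential
gene_A,1
gene_B,0
"""


def load_labelled_features(features_handle, labels_handle,
                           id_col="gene_id", label_col="essential"):
    """Return (ids, X, y) for genes that have both features and a class label."""
    labels = {row[id_col]: int(row[label_col])
              for row in csv.DictReader(labels_handle)}
    ids, X, y = [], [], []
    for row in csv.DictReader(features_handle):
        gene = row.pop(id_col)
        if gene not in labels:
            continue  # skip genes without a class label
        ids.append(gene)
        X.append([float(value) for value in row.values()])
        y.append(labels[gene])
    return ids, X, y


ids, X, y = load_labelled_features(io.StringIO(FEATURES_CSV),
                                   io.StringIO(LABELS_CSV))
print(ids, X, y)
```

Genes without a label are dropped rather than assigned a default class, so unlabelled genes cannot leak into training as false negatives.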

All trained models can be found in "/trained_models".

Please send general questions to thomasbeder7@gmail.com.

Please send specific questions about Cofactory, NetCGlyc, NetNGlyc, NetOGlyc, ProP, SignalP, TargetP, TMHMM, YinOYang, NetworkX and gene set features to phemmysmart@gmail.com. CodonW (http://codonw.sourceforge.net/) and DeepLoc (http://www.cbs.dtu.dk/services/) features were generated with the unmodified tools. Questions about CodonW or DeepLoc should also go to phemmysmart@gmail.com.
