CREWdb 1.0: Optimizing Chromatin Readers, Erasers, and Writers Database using Machine Learning-Based Approach

Abstract

Aberrations in heterochromatin and euchromatin states contribute to various disease phenotypes. Transcriptional regulation between these two states is largely governed by post-translational modifications made by three functional types of chromatin regulators: readers, writers, and erasers. Writers introduce chemical modifications to DNA and histone tails, readers use specialized domains to bind those modifications, and erasers remove the modifications introduced by writers. Altered regulation of these chromatin regulators plays a key role in embryonic development and in complex diseases such as cancer, neurodevelopmental diseases, and myocardial diseases. Because chromatin modifications are reversible, chromatin regulators are promising therapeutic targets. However, only a limited number of chromatin regulators have been identified, and a subset of those have been ambiguously assigned to more than one chromatin regulator type. We have therefore applied machine learning-based approaches to predict and classify the function of chromatin regulator proteins, optimizing the accuracy of CREWdb, the first comprehensive database of chromatin regulators.

How to use

Requirements: Python (v 3.9)

Packages:

NumPy (v 1.16.0)
pandas (v 1.1.5)
scikit-learn (v 0.24.2)
matplotlib (v 3.3.4)
imbalanced-learn (v 0.8.1)
seaborn (v 0.11.2)
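
To confirm that an environment matches the list above before running the scripts, a small check along the following lines may help. This is only a sketch and is not part of the repository: the helper names `installed_version` and `check_environment` are hypothetical, and the package names are assumed to be the usual PyPI distribution names. It uses only the standard library and reports each required version next to the installed one.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the requirements list above.
# Keys are assumed to be the PyPI distribution names.
REQUIRED = {
    "numpy": "1.16.0",
    "pandas": "1.1.5",
    "scikit-learn": "0.24.2",
    "matplotlib": "3.3.4",
    "imbalanced-learn": "0.8.1",
    "seaborn": "0.11.2",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def check_environment(required=REQUIRED):
    """Map each package name to a (required, installed) pair of versions."""
    return {name: (wanted, installed_version(name))
            for name, wanted in required.items()}


if __name__ == "__main__":
    for name, (wanted, found) in check_environment().items():
        if found is None:
            status = "MISSING"
        elif found == wanted:
            status = "ok"
        else:
            status = "differs"
        print(f"{name:18s} required {wanted:8s} "
              f"installed {found or '-':10s} {status}")
```

A status of "differs" does not necessarily mean the scripts will fail; it only flags a version other than the one listed here.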

After setting up the Python environment with the above packages, run a script with the command python filename.py (for example, python 5-Fold-CREWdb.py). We recommend using Anaconda or Miniconda to run the scripts and the IPython notebooks.

If you have any problems running our code, please feel free to contact us (smollah@wustl.edu, g.reetika@wustl.edu).

Note: This repository contains only the code for the machine learning models. The code for the web interface and database is not included.

Citation

Please cite the following paper if you are using CREWdb for your research:

Maya Natesan, Reetika Ghag, Mitchell Kong, Min Shi, Shamim Mollah. CREWdb 1.0: Optimizing Chromatin Readers, Erasers, and Writers Database using Machine Learning-Based Approach. bioRxiv 2022.06.02.494594; doi: https://doi.org/10.1101/2022.06.02.494594

