GBDT_KgluSite

In this paper, a new lysine glutarylation (Kglu) site prediction model, GBDT_KgluSite, is proposed. It adopts seven feature encoding methods to convert protein sequences into numerical information: BE, BLOSUM62, EAAC, CTDC, PSSM, CKSAAP, and secondary structure information. The NearMiss-3 method then deals with the imbalanced dataset, and Elastic Net is used to filter redundant information from the features. Finally, a GBDT-based prediction model for identifying Kglu sites is established.
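The last two stages of this workflow (Elastic Net feature filtering followed by a GBDT classifier) can be sketched with scikit-learn as below. This is a minimal illustration on synthetic data, not the authors' training script: the `alpha`, `l1_ratio`, and GBDT settings are assumed values, and the NearMiss-3 undersampling step is left out because it is not part of scikit-learn.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Synthetic stand-in for the fused feature matrix (NOT the Kglu data).
X, y = make_classification(
    n_samples=400, n_features=40, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

pipeline = Pipeline([
    # Keep only features whose Elastic Net coefficient is (numerically) non-zero.
    # alpha and l1_ratio are illustrative values, not the authors' settings.
    ("select", SelectFromModel(ElasticNet(alpha=0.02, l1_ratio=0.5), threshold=1e-5)),
    # Default GBDT hyperparameters; the published model may use others.
    ("gbdt", GradientBoostingClassifier(random_state=0)),
])
pipeline.fit(X_train, y_train)

n_selected = int(pipeline.named_steps["select"].get_support().sum())
proba = pipeline.predict_proba(X_test)[:, 1]  # probability of the positive class
print(f"kept {n_selected} of {X.shape[1]} features")
```

Wrapping both steps in one `Pipeline` means the feature filter is fitted on the training split only, which avoids leaking test information into the selection.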

Requirement

Backend = TensorFlow (1.14.0)
Keras (2.3.1)
NumPy (1.20.2)
scikit-learn (1.0.2)
pandas (1.3.5)
matplotlib (3.5.2)

Dataset

The data uploaded in Dataset is the original data before the dataset was divided, with 707 positive samples and 4369 negative samples. Every sample has a length of 33, where X stands for a virtual amino acid. Glutarylation.csv is the original dataset; Glutarylation208.csv is obtained by removing duplicate data with CD-HIT and contains a total of 208 proteins. The folder Train contains all training data, while Test contains all independent test data.
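As an illustration of how a 33-residue sample with X padding can arise, the sketch below cuts a window around every lysine in a sequence. It assumes the window is centered on the lysine with 16 residues on each side, which matches the length of 33 but is not stated explicitly above; `extract_windows` is a hypothetical helper, not a function from this repository.

```python
def extract_windows(sequence, half_window=16, pad="X"):
    """Return (1-based position, window) for every lysine in `sequence`."""
    padded = pad * half_window + sequence + pad * half_window
    windows = []
    for i, residue in enumerate(sequence):
        if residue == "K":
            # In `padded`, residue i sits at index i + half_window, so this
            # slice is centered on it and has length 2 * half_window + 1.
            windows.append((i + 1, padded[i:i + 2 * half_window + 1]))
    return windows

toy = "MKAAK"  # toy sequence, not taken from the dataset
windows = extract_windows(toy)
for position, window in windows:
    print(position, window)
```

Positions near either end of a protein do not have 16 real neighbors on both sides, so the missing positions are filled with X.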

Feature

Seven features are used in the GBDT_KgluSite model. Two of them are generated by one_hot.py and CKSAAP.py, the PSSM feature is generated by PSI-BLAST, and the rest are obtained with iLearnPlus.
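For readers unfamiliar with the two script-generated encodings, here is a minimal sketch of binary (one-hot) encoding and CKSAAP. It is not the code in one_hot.py or CKSAAP.py. Several details are assumptions made for illustration: the 20-letter alphabet order, encoding X as an all-zero vector, skipping pairs that contain X, and the default `k_max=5`.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
PAIR_INDEX = {a + b: i for i, (a, b) in enumerate(product(AMINO_ACIDS, repeat=2))}

def binary_encoding(window):
    """One-hot encode each residue; X (or any non-standard letter) stays all zeros."""
    vector = []
    for residue in window:
        row = [0] * len(AMINO_ACIDS)
        if residue in AA_INDEX:
            row[AA_INDEX[residue]] = 1
        vector.extend(row)
    return vector

def cksaap(window, k_max=5):
    """Frequencies of residue pairs separated by k = 0..k_max other residues."""
    vector = []
    for k in range(k_max + 1):
        counts = [0] * len(PAIR_INDEX)
        total = 0
        for i in range(len(window) - k - 1):
            pair = window[i] + window[i + k + 1]
            if pair in PAIR_INDEX:  # skip pairs that involve X
                counts[PAIR_INDEX[pair]] += 1
                total += 1
        # Normalize by the number of counted pairs for this gap size.
        vector.extend(c / total if total else 0.0 for c in counts)
    return vector

be = binary_encoding("AKX")
ck = cksaap("ACAC", k_max=1)
print(len(be), len(ck))  # 60 800
```

Under these assumptions a 33-residue window gives 33 × 20 = 660 binary features and (k_max + 1) × 400 CKSAAP features.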

Model

GBDT_KgluSite.py can be used directly to predict glutarylation modification sites once the pretrained model GBDT_KgluSite.pickle is loaded.
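Loading a pickled classifier and scoring feature vectors generally looks like the sketch below. It is not the contents of GBDT_KgluSite.py. To stay runnable without the repository files, it trains a small stand-in model on random data and saves it to a temporary file; `load_model`, `predict_sites`, and the 0.5 threshold are hypothetical. With the real model, `path` would point to GBDT_KgluSite.pickle and the features would have to match the ones the model was trained on.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def load_model(path):
    """Unpickle a fitted classifier from `path`."""
    with open(path, "rb") as handle:
        return pickle.load(handle)

def predict_sites(model, features, threshold=0.5):
    """Return positive-class probabilities and 0/1 calls at `threshold`."""
    proba = model.predict_proba(features)[:, 1]
    return proba, (proba >= threshold).astype(int)

# Stand-in model on random data so the sketch runs without the repository files.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 8))
y_demo = (X_demo[:, 0] > 0).astype(int)
stand_in = GradientBoostingClassifier(n_estimators=20, random_state=0).fit(X_demo, y_demo)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "stand_in.pickle")  # hypothetical file name
    with open(path, "wb") as handle:
        pickle.dump(stand_in, handle)
    model = load_model(path)

proba, calls = predict_sites(model, X_demo[:5])
print(proba.round(3), calls)
```

A pickle written by one scikit-learn version may not load cleanly under another, so installing the versions listed under Requirement is the safer route.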

Contact

Feel free to contact us if you need any help: flyinsky6@gmail.com
