khanhlee/KLF-XGB

KLF-XGB

A sequence-based prediction of Kruppel-like factors proteins using XGBoost and optimized features

Krüppel-like factors (KLFs) are a group of conserved zinc finger-containing transcription factors involved in various physiological and biological processes, including cell proliferation, differentiation, development, and apoptosis. Because of their divergent non-DNA-binding N-terminal sequences and their complex links to carcinogenesis and metabolic diseases, many attempts have been made over the last decade to broaden our knowledge of these proteins. Bioinformatics methods such as sequence similarity searches, multiple sequence alignment, phylogenetic reconstruction, and gene synteny analysis have also been proposed to identify KLF proteins. In this study, we propose a novel computational approach that applies machine learning to features calculated from primary sequences. Specifically, our XGBoost-based model identifies KLF proteins efficiently, with an accuracy of 96.4% and an MCC of 0.704. It also shows promising performance on an independent dataset. Our model could therefore serve as a useful tool for identifying new KLF proteins and provide useful information for biologists and other researchers working on KLF proteins.
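
As a rough illustration of the two ideas in the description above (features computed from the primary sequence, and evaluation by accuracy and MCC), here is a minimal sketch in plain Python. It rests on assumptions: this README does not say which features the model uses, so amino acid composition stands in as one common sequence-derived feature, and the function names, the toy sequence, and the confusion-matrix counts are all hypothetical and are not results from the study.

```python
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence):
    """Return a 20-dimensional vector of residue frequencies.

    Illustrative feature only; the feature set used by KLF-XGB
    is not specified in this README.
    """
    seq = sequence.upper()
    # Count only standard residues, so ambiguous symbols (X, B, Z) are ignored.
    counts = [seq.count(aa) for aa in AMINO_ACIDS]
    total = sum(counts)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [c / total for c in counts]


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Made-up toy sequence, not a real KLF protein.
features = amino_acid_composition("MKCHLAC")

# Hypothetical counts for an imbalanced test set (30 positives, 970 negatives).
acc = accuracy(tp=10, tn=950, fp=20, fn=20)
score = mcc(tp=10, tn=950, fp=20, fn=20)
print(f"accuracy={acc:.3f}, MCC={score:.3f}")
```

Vectors like `features` are the kind of input a gradient-boosted classifier such as XGBoost would be trained on. The hypothetical counts also show why both metrics are worth reporting: when negatives greatly outnumber positives, accuracy can stay high while MCC, which accounts for all four confusion-matrix cells, is much lower.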
