IonPred: Prediction of ion-ligand binding sites with ELECTRA

Introduction

Interactions between proteins and ions are essential for many biological functions, including structural stability, metabolism, and signal transport. Because more than half of all proteins bind ions, identifying ion-binding sites is important both for understanding protein function and for drug discovery. Several computational approaches have been proposed, but the problem remains difficult because metal and acid radical ions are small and highly versatile. In this study, we propose IonPred, a sequence-based approach using ELECTRA (Efficiently Learning an Encoder that Classifies Token Replacements Accurately), which is based on replaced token detection of amino acid residues in protein sequences. The model is designed to predict binding sites for 9 metal ions (Zn²⁺, Cu²⁺, Fe²⁺, Fe³⁺, Ca²⁺, Mg²⁺, Mn²⁺, Na⁺, and K⁺) and 4 acid radical ion ligands (CO₃²⁻, SO₃²⁻, PO₄³⁻, NO₂⁻).

Requirements

Set up

The input for this tool consists of raw protein sequences in FASTA format, and the output consists of probability scores for each candidate site.
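To illustrate the input format, the following minimal sketch reads FASTA-formatted lines into (header, sequence) pairs. It is not part of IonPred: the function name `read_fasta` is hypothetical, and predict.py may parse its input differently.

```python
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines.

    Hypothetical helper for illustration only; not taken from IonPred.
    """
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts; emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines may be wrapped, so collect and join them.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)
```

Because an open file is an iterable of lines, a file such as test/zn.fasta could be passed directly, for example `list(read_fasta(open("test/zn.fasta")))`.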

The threshold used is 0.5, so candidate residues with a probability >= 0.5 are considered to be ion-binding sites.
Data sets used to run predictions must be placed in the directory called test.
The results for each residue binding site are written to the directory called results, where the predictions are saved in a subdirectory labeled with the corresponding ion name.
A batch size of 128 was used while running predictions, but this parameter can be modified.
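The thresholding and batching described above can be sketched as follows. This is an illustration under stated assumptions, not IonPred's own code: the function names are hypothetical, residue positions are assumed to be 1-based, and only the values 0.5 and 128 come from this README.

```python
from typing import Iterator, List, Sequence

THRESHOLD = 0.5   # cutoff stated in this README
BATCH_SIZE = 128  # batch size stated in this README


def call_binding_sites(scores: Sequence[float],
                       threshold: float = THRESHOLD) -> List[int]:
    """Return positions (assumed 1-based) whose score is >= threshold."""
    return [pos for pos, score in enumerate(scores, start=1)
            if score >= threshold]


def batches(items: Sequence, size: int = BATCH_SIZE) -> Iterator[Sequence]:
    """Split items into consecutive chunks of at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

A score exactly equal to 0.5 counts as a binding site here, matching the >= comparison stated above.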

Run Prediction

For example, to predict zinc (Zn²⁺) binding sites, run the command:

python3 predict.py -input test/zn.fasta -ion-type ZN

For guidance on other parameters, run:

python3 predict.py -help
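When predictions need to be launched from a script, the command above can be assembled programmatically. The sketch below uses only the -input and -ion-type flags shown in this README. The helper names are hypothetical, and ZN is the only ion-type code confirmed here; codes for other ions should be checked with -help.

```python
import subprocess
from typing import List


def build_predict_command(fasta_path: str, ion_type: str) -> List[str]:
    """Assemble the predict.py command line shown in this README."""
    return ["python3", "predict.py", "-input", fasta_path,
            "-ion-type", ion_type]


def run_prediction(fasta_path: str, ion_type: str) -> int:
    """Run predict.py from the repository root and return its exit code."""
    completed = subprocess.run(build_predict_command(fasta_path, ion_type),
                               check=False)
    return completed.returncode
```

For instance, `run_prediction("test/zn.fasta", "ZN")` would issue the same command as the zinc example above.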

