This is the official source of paper "MocFormer: A Two-Stage Pre-training-Driven Transformer for Drug-Target Interactions Prediction"

DHCGroup/MocFormer

MocFormer: A Two-Stage Pre-training-Driven Transformer for Drug-Target Interactions Prediction

The official repository of the paper MocFormer: A Two-Stage Pre-training-Driven Transformer for Drug-Target Interactions Prediction

MocFormer is a two-stage pre-trained framework for drug-target interaction prediction. In the first stage, pre-trained molecule and protein models produce feature representations for drugs and proteins, which helps the framework handle their diversity and reduces bias, improving prediction accuracy. In the second stage, a transformer with bilinear pooling and a fully connected layer predicts interactions from these feature vectors.
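To make the second stage more concrete, here is a minimal, dependency-free sketch of bilinear pooling followed by a fully connected layer. It is an illustration only and not the code from this repository: the transformer is omitted, the weights are random rather than trained, and the names bilinear_pool and predict_interaction, the tiny vector sizes, and the sigmoid output are all assumptions made for the example.

```python
import math
import random


def bilinear_pool(drug_vec, protein_vec, weights):
    """Fuse two vectors: one output feature per matrix W, computed as drug^T W protein."""
    fused = []
    for w in weights:
        total = 0.0
        for i, d in enumerate(drug_vec):
            for j, p in enumerate(protein_vec):
                total += d * w[i][j] * p
        fused.append(total)
    return fused


def predict_interaction(fused, fc_weights, fc_bias):
    """Fully connected layer on the fused features, squashed to (0, 1) by a sigmoid."""
    logit = sum(f * w for f, w in zip(fused, fc_weights)) + fc_bias
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical toy sizes; real embeddings are far larger.
rng = random.Random(0)
drug_dim, protein_dim, fused_dim = 4, 6, 3

drug_vec = [rng.uniform(-1, 1) for _ in range(drug_dim)]
protein_vec = [rng.uniform(-1, 1) for _ in range(protein_dim)]

# Random stand-ins for parameters that training would normally learn.
weights = [
    [[rng.uniform(-0.5, 0.5) for _ in range(protein_dim)] for _ in range(drug_dim)]
    for _ in range(fused_dim)
]
fc_weights = [rng.uniform(-0.5, 0.5) for _ in range(fused_dim)]

fused = bilinear_pool(drug_vec, protein_vec, weights)
score = predict_interaction(fused, fc_weights, 0.0)
print(f"interaction probability: {score:.3f}")
```

The bilinear form lets every drug feature interact with every protein feature, which is the reason for using it over plain concatenation in this sketch.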

Installation

git clone https://github.com/rickwang28574/MocFormer.git
cd MocFormer

Install dependencies

conda create -n MocFormer python==3.8.1
conda activate MocFormer
pip install -r requirements.txt

Processed Datasets

The two datasets below were obtained by processing small molecules with Uni-Mol and proteins with ESM-2, both fine-tuned. Each dataset contains SMILES strings for the small molecules, amino acid sequences for the proteins, embeddings for the small molecules, embeddings for the proteins, and labels.
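As a rough picture of one dataset entry, the five components listed above could be modelled as follows. This is a sketch under assumptions: the on-disk file format is not described here, so the class InteractionRecord, its field names, the binary 0/1 label, and the toy values are hypothetical and do not reflect the actual column names in the downloaded files.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InteractionRecord:
    """One drug-protein pair; field names are illustrative, not the real columns."""
    smiles: str                     # small molecule as a SMILES string
    protein_sequence: str           # protein as an amino acid sequence
    drug_embedding: List[float]     # embedding of the small molecule
    protein_embedding: List[float]  # embedding of the protein
    label: int                      # assumed binary: 1 = interacts, 0 = does not


def validate(record: InteractionRecord) -> None:
    """Raise ValueError if the record is obviously malformed."""
    if not record.smiles or not record.protein_sequence:
        raise ValueError("SMILES string and protein sequence must be non-empty")
    if not record.drug_embedding or not record.protein_embedding:
        raise ValueError("both embeddings must be non-empty")
    if record.label not in (0, 1):
        raise ValueError("label is assumed to be 0 or 1")


# Toy values for illustration only; real embeddings have many more dimensions.
example = InteractionRecord(
    smiles="CCO",
    protein_sequence="MKTAYIAK",
    drug_embedding=[0.1, -0.2, 0.3],
    protein_embedding=[0.05, 0.4],
    label=1,
)
validate(example)
print(example.smiles, len(example.protein_sequence), example.label)
```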

For DrugBank dataset

https://drive.google.com/file/d/1PsFQusALcyp2NkjFs5xw-CHhSqZpzSVr/view?usp=sharing

For Epigenetic-regulators dataset

Train: https://drive.google.com/file/d/1_aJX3UBZMDsi32EZz25BAW3KRvdQHsy9/view?usp=sharing
Test: https://drive.google.com/file/d/1k-Y6fBAY8U8IukxaO9Dhh4ESzYRLtIGz/view?usp=sharing
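For scripted downloads, the file ID can be pulled out of a sharing link like the ones above. The sketch below only parses the URL and builds a commonly used direct-download address. The helper names are hypothetical, and the uc?export=download form is an assumption about Google Drive's behaviour: for large files Drive may show a confirmation page instead of the file, in which case a browser download is the safer route.

```python
import re

# Matches the ID segment of links shaped like .../file/d/<ID>/view?usp=sharing
_FILE_ID = re.compile(r"/file/d/([A-Za-z0-9_-]+)")


def drive_file_id(share_url: str) -> str:
    """Extract the file ID from a Google Drive sharing URL."""
    match = _FILE_ID.search(share_url)
    if match is None:
        raise ValueError(f"no Drive file ID found in: {share_url}")
    return match.group(1)


def direct_download_url(share_url: str) -> str:
    """Build a direct-download URL (assumed pattern; may hit a confirmation page)."""
    return f"https://drive.google.com/uc?export=download&id={drive_file_id(share_url)}"


drugbank_url = (
    "https://drive.google.com/file/d/1PsFQusALcyp2NkjFs5xw-CHhSqZpzSVr/view?usp=sharing"
)
print(drive_file_id(drugbank_url))
print(direct_download_url(drugbank_url))
```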

Train & Test

Run the cells of the billnear_DrugBank_uni_esm2_3B_trans copy.ipynb notebook in order; the results appear in the notebook's "Test" section. The notebook provides model training and testing code, using the DrugBank dataset as an example. The model for the Epigenetic-regulators dataset is very similar.

Citation

@article{Zhang2023.09.13.557595,
  title={MocFormer: A Two-Stage Pre-training-Driven Transformer for Drug-Target Interactions Prediction},
  author={Yi-Lun Zhang and Wen-Tao Wang and Jia-Hui Guan and Deepak Kumar Jain and Tian-Yang Wang and Swalpa Kumar Roy},
  journal={International Journal of Computational Intelligence Systems},
  year={2024}
}
