Decrypting Orphan GPCR Drug Discovery via Multitask Learning

Wei-Cheng Huang 1, Wei-Ting Lin 1, Ming-Shiu Hung 1, Jinq-Chyi Lee 1, Chun-Wei Tung 1*

1 Institute of Biotechnology and Pharmaceutical Research, National Health Research Institutes, Miaoli County, Taiwan

Correspondence: cwtung@nhri.edu.tw

Abstract

Computational drug discovery for the G protein-coupled receptor (GPCR) superfamily is often limited by the availability of protein three-dimensional (3D) structures and of chemicals with experimentally measured bioactivities. Orphan GPCRs without known ligands further complicate the process. To enable drug discovery for human orphan GPCRs, multitask models were proposed for predicting the half maximal effective concentrations (EC50) of chemical-GPCR pairs. Protein multiple sequence alignment features encoded the protein information, while physicochemical properties and fingerprints encoded the chemical information. The protein features enabled the transfer of knowledge from data-rich GPCRs to orphan receptors, along with an estimate of transferability based on the similarity of protein features. The final model was trained using both agonist and antagonist data from 200 GPCRs and showed an excellent mean squared error (MSE) of 0.24 on the validation dataset. An independent test using the orphan dataset, consisting of 16 receptors associated with fewer than 8 bioactivities, showed a reasonably good MSE of 1.51, which can be further improved to 0.53 by considering transferability based on protein features. The informative features were identified and mapped to the corresponding 3D structures to gain insight into the mechanism of GPCR-ligand interactions across the GPCR family. The proposed method provides a novel perspective on learning ligand bioactivity within the diverse human GPCR superfamily and can potentially accelerate the discovery of therapeutic agents for orphan GPCRs.
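The pairing of protein and chemical features described in the abstract, and the MSE used to score predictions, can be illustrated with a minimal sketch. It assumes the two feature vectors are simply concatenated into one input row; the README does not state how the repository combines them. The function names `encode_pair` and `mean_squared_error` and all numeric values are hypothetical toy examples, not data or code from the study.

```python
from typing import List, Sequence


def encode_pair(protein_features: Sequence[float],
                chemical_features: Sequence[float]) -> List[float]:
    """Build one input row for a chemical-GPCR pair.

    Assumption: plain concatenation of the protein vector and the
    chemical vector. The actual encoding in the notebooks may differ.
    """
    return list(protein_features) + list(chemical_features)


def mean_squared_error(y_true: Sequence[float],
                       y_pred: Sequence[float]) -> float:
    """Standard MSE: the mean of squared differences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy values only (hypothetical, not taken from the dataset).
protein = [0.1, 0.0, 0.7]   # e.g. alignment-derived protein features
chemical = [1, 0, 1, 1]     # e.g. fingerprint bits
row = encode_pair(protein, chemical)

measured = [6.0, 7.5, 5.0]   # toy activity values
predicted = [6.5, 7.0, 5.0]  # toy model outputs
mse = mean_squared_error(measured, predicted)
```

Because every row carries the receptor's own features, a single model can in principle be queried for a receptor that has few or no training examples, which is the idea behind transferring from data-rich GPCRs to orphan receptors.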

Keywords

Multitask Learning; G Protein-Coupled Receptors; GPCR; Feature Selection; Ligand-Based Virtual Screening

Python installation and environment setup

You may download the whole directory, which contains the script files and the compressed result files.

Use the python_env_setup.sh file to set up all dependencies.
Use the extract_files.sh file to extract all the result files for this task.

git clone git@github.com:drhuangwc/GPCR.git
cd GPCR
sh python_env_setup.sh
sh extract_files.sh > files.out &

Script files:

GPCR_MTL_dataprocess.ipynb: script for data processing, with the original dataset
GPCR_MTL_autogluontrain.ipynb: script for separate-protein training with AutoGluon AutoML on agonist data, antagonist data, and combined agonist and antagonist data
GPCR_MTL_Mol2vec.ipynb: script for replacing the chemical features with Mol2vec, with result data
GPCR_MTL_FeatureSelection.ipynb: script for mRMR feature selection, with FeatureSelection_GPCRtrain and result data
GPCR_MTL_TRAINevaluate.ipynb: script for evaluating the validation data and the orphan data, with result data
GPCR_MTL_similarity.ipynb: script for calculating Tanimoto similarity, with or without mRMR-selected features, in 7 parts of the GPCR
GPCR_MTL_colorPDB.ipynb: script for mapping the mRMR-selected features onto the protein structure files
GPCR_MTL_GPRC5A_pred.ipynb: script for predicting GPRC5A activities
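The similarity script listed above compares receptors by Tanimoto similarity, with or without the mRMR-selected features. A minimal sketch of that comparison follows. It assumes binary feature vectors and a list of selected feature indices; the function names `tanimoto` and `select`, the toy vectors, and the convention of returning 0.0 for two all-zero vectors are illustrative assumptions, not the notebook's code.

```python
from typing import List, Sequence


def tanimoto(a: Sequence[int], b: Sequence[int]) -> float:
    """Tanimoto coefficient of two equal-length binary vectors:
    (bits set in both) / (bits set in either)."""
    if len(a) != len(b):
        raise ValueError("vectors must have equal length")
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Assumption: define the similarity of two all-zero vectors as 0.0.
    return both / either if either else 0.0


def select(vector: Sequence[int], indices: Sequence[int]) -> List[int]:
    """Keep only the positions listed in `indices` (e.g. selected features)."""
    return [vector[i] for i in indices]


# Toy binary protein features for two receptors (hypothetical values).
receptor_a = [1, 1, 0, 1, 0, 0]
receptor_b = [1, 0, 0, 1, 1, 0]
selected = [0, 3, 5]  # hypothetical indices of selected features

# Similarity over all features, then over the selected subset only.
full_similarity = tanimoto(receptor_a, receptor_b)
selected_similarity = tanimoto(select(receptor_a, selected),
                               select(receptor_b, selected))
```

In this toy case the similarity is 0.5 over all features and 1.0 over the selected subset, which shows how restricting the comparison to selected features can change which receptors look alike.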

