Scripts for "Efficient Hyperparameter Optimization by Using Bayesian Optimization for Drug-Target Interaction Prediction"

Bayesian optimization shortens the hyperparameter search for complex prediction models that have many hyperparameters while maintaining prediction accuracy. Here, we apply Bayesian optimization to the drug-target interaction (DTI) prediction problem as a method for computational drug discovery. We target neighborhood regularized logistic matrix factorization (NRLMF) (Liu et al., 2016), a state-of-the-art DTI prediction method, and accelerate its parameter search with Gaussian process mutual information (GP-MI). Experimental results on four general benchmark datasets show that our GP-MI-based method reduced the computational time by a factor of 8.94 on average while achieving almost the same area under the curve (AUC) on all datasets as a grid parameter search, which is the approach generally used in DTI prediction. Moreover, if a slight accuracy reduction (approximately 0.002 in AUC) is allowed, the calculation can be sped up by a factor of 18 or more. Our results show for the first time that Bayesian optimization works effectively for the DTI prediction problem. By accelerating the time-consuming parameter search, the most advanced model can be used even as the number of drug candidates and target proteins to be predicted increases.

Requirements

Python

You need Python 3.x to run these scripts. We recommend using Anaconda 2.4.0 to set up the Python environment. The scripts were created with Python 3.5.2. For Python 3.5.2, please refer to the following URL.
https://www.python.org/downloads/release/python-352/

Python packages

In addition, we use the following Python packages: NumPy, scikit-learn (ver. 0.18.1 or later), SciPy, and pymatbridge (required only when using KBMF 2K). For each package, please refer to the following URLs.
− NumPy: http://www.numpy.org/
− scikit-learn: http://scikit-learn.org/stable/
− SciPy: http://www.scipy.org/
− pymatbridge: http://arokem.github.io/python-matlab-bridge/
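
Before running the scripts, it can help to confirm that the packages listed above are importable. The following is a minimal sketch rather than part of this repository: the helper name `missing_packages` is hypothetical, and the import names (`sklearn` for scikit-learn, for example) are the usual ones for these packages. It does not check version numbers.

```python
import importlib.util

# Import names for the packages listed above.
# pymatbridge is only needed when using KBMF 2K, so it is treated as optional.
REQUIRED = ["numpy", "sklearn", "scipy"]
OPTIONAL = ["pymatbridge"]


def missing_packages(names):
    """Return the names in `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    for name in missing_packages(REQUIRED):
        print("missing required package: {}".format(name))
    for name in missing_packages(OPTIONAL):
        print("missing optional package: {}".format(name))
```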

Datasets

To run the scripts, the drug-target interaction dataset created by Yamanishi et al. is required. The dataset can be downloaded from the following URL.
http://web.kuicr.kyoto-u.ac.jp/supp/yoshi/drugtarget/
− nr_admat_dgc.txt, nr_simmat_dc.txt, nr_simmat_dg.txt
− gpcr_admat_dgc.txt, gpcr_simmat_dc.txt, gpcr_simmat_dg.txt
− ic_admat_dgc.txt, ic_simmat_dc.txt, ic_simmat_dg.txt
− e_admat_dgc.txt, e_simmat_dc.txt, e_simmat_dg.txt

Installation

  1. Download the archive of BO-DTI-master from this repository.
  2. Extract the archive and cd into the extracted directory.
  3. Run make command.

Commands:

$ cd BO-DTI-master
$ mkdir dataset
$ cp ~/Downloads/*_admat_dgc.txt dataset
$ cp ~/Downloads/*_simmat_dc.txt dataset
$ cp ~/Downloads/*_simmat_dg.txt dataset
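
After copying, a quick check that all twelve files from the Datasets section are in place can save a failed run. This is a minimal sketch, not part of the repository: the helper names `expected_files` and `missing_files` are hypothetical, and it only assumes the `dataset` directory and the file names shown above.

```python
from pathlib import Path

# Dataset prefixes and file suffixes taken from the file list in the Datasets section.
PREFIXES = ["nr", "gpcr", "ic", "e"]
SUFFIXES = ["admat_dgc", "simmat_dc", "simmat_dg"]


def expected_files():
    """Return the twelve file names listed in the Datasets section."""
    return ["{}_{}.txt".format(p, s) for p in PREFIXES for s in SUFFIXES]


def missing_files(dataset_dir="dataset"):
    """Return the expected file names that are absent from dataset_dir."""
    root = Path(dataset_dir)
    return [name for name in expected_files() if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing dataset files: " + ", ".join(missing))
    else:
        print("All 12 dataset files are present.")
```

Run it from inside BO-DTI-master so that the relative path `dataset` resolves correctly.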

Usage

You can specify the following options:

  • gpmi ... Use the GP-MI algorithm instead of grid search
       - delta ... Adjusts the balance between exploration and exploitation: delta > 0
       - max_iter ... Maximum number of iterations (number of parameter combinations evaluated): max_iter > 0
       - n_init ... Number of initial samples: n_init > 0
  • seed ... Fixes the cross-validation split
  • job-id ... Specifies the job ID
  • workdir ... Specifies the directory in which log files are written

For other options, please refer to PyDTI.
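
To give some intuition for the delta option, the sketch below scores candidate parameter settings with the acquisition rule usually given for GP-MI: posterior mean plus sqrt(alpha) * (sqrt(variance + gamma) - sqrt(gamma)), where alpha = log(2 / delta) and gamma accumulates the variances of the points already chosen. This is an illustration under stated assumptions, not the code used in this repository: the function names are hypothetical, the posterior means and variances are made-up numbers standing in for a fitted Gaussian process, and max_iter and n_init are not modeled.

```python
import math


def gpmi_scores(means, variances, gamma, delta):
    """Score each candidate: mean + sqrt(alpha) * (sqrt(var + gamma) - sqrt(gamma))."""
    alpha = math.log(2.0 / delta)
    return [
        m + math.sqrt(alpha) * (math.sqrt(v + gamma) - math.sqrt(gamma))
        for m, v in zip(means, variances)
    ]


def select_next(means, variances, gamma, delta):
    """Return (index of the highest-scoring candidate, updated gamma)."""
    scores = gpmi_scores(means, variances, gamma, delta)
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, gamma + variances[best]


# Hypothetical posterior over three candidate parameter settings
# (illustrative values only, not results from the paper).
means = [0.90, 0.85, 0.80]
variances = [0.0001, 0.0100, 0.0400]

# With gamma = 0, uncertain candidates receive a large exploration bonus.
first, gamma = select_next(means, variances, gamma=0.0, delta=1e-100)
# With a large accumulated gamma, the bonus shrinks and the mean dominates.
later, _ = select_next(means, variances, gamma=100.0, delta=1e-100)
print(first, later)  # prints "2 0"
```

In this rule, a smaller delta gives a larger alpha and therefore a stronger preference for uncertain candidates, while the growing gamma gradually shifts the choice toward candidates with a high predicted mean.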

Example

  1. Command to execute grid search
$ python PyDTI.py --method="nrlmf" --dataset="nr" --cvs=1 --specify-arg=0 --predict-num=0 --seed="1" --job-id="1" --workdir="."
  2. Command to execute GPMI algorithm
$ python PyDTI.py --method="nrlmf" --dataset="nr" --cvs=1 --specify-arg=0 --predict-num=0 --gpmi="delta=1e-100 max_iter=2688 n_init=1" --seed="1" --job-id="1" --workdir="."
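
The value passed to --gpmi is a single string of space-separated key=value pairs. As a rough illustration of that format, the sketch below splits such a string into typed values and enforces the "> 0" constraints from the option list. The helper `parse_gpmi` is hypothetical and is not necessarily how PyDTI.py parses the option.

```python
def parse_gpmi(option):
    """Split 'key=value key=value ...' into a dict with typed values."""
    raw = dict(item.split("=", 1) for item in option.split())
    params = {
        "delta": float(raw["delta"]),
        "max_iter": int(raw["max_iter"]),
        "n_init": int(raw["n_init"]),
    }
    # Constraints stated in the option list: all three values must be positive.
    for key, value in params.items():
        if value <= 0:
            raise ValueError("{} must be > 0, got {}".format(key, value))
    return params


if __name__ == "__main__":
    print(parse_gpmi("delta=1e-100 max_iter=2688 n_init=1"))
```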

Acknowledgement

This script was created based on PyDTI developed by Liu et al. PyDTI can be accessed from the following URL.
https://github.com/stephenliu0423/PyDTI.git

Contact

These scripts were implemented by Tomohiro Ban.
E-mail: ban@bi.c.titech.ac.jp

Department of Computer Science, School of Computing, Tokyo Institute of Technology, Japan
http://www.bi.cs.titech.ac.jp/

If you have any questions, please feel free to contact the author.

References

Tomohiro Ban, Masahito Ohue, Yutaka Akiyama: Efficient Hyperparameter Optimization by Using Bayesian Optimization for Drug-Target Interaction Prediction, In Proceedings of the 7th IEEE International Conference on Computational Advances in Bio and Medical Sciences (ICCABS 2017), 6 pages, Orlando, FL, USA, October 19-21, doi:10.1109/ICCABS.2017.8114299, 2017. https://doi.org/10.1109/ICCABS.2017.8114299

(Conference Website) http://www.iccabs.org/


Copyright © 2017 Akiyama Laboratory, Tokyo Institute of Technology, All Rights Reserved.
