- This repository contains scripts, code, and data for the following study by Lee, Emani, and Gerstein: https://arxiv.org/abs/2310.03946
- The scripts and code are shared without optimization. The repository may include analyses and results that were not reported in the study above.
- Docking tools: The docking tools, SMINA and Vinardo, have to be run on complexes with 3D structures. The `empirical docking` folder provides Python-based tools to convert a file into the 3D SDF format required to run `smina`. It also provides code to run the docking tools and output a docked complex. The log files can then be parsed to obtain the lowest-energy predicted binding affinities, which are part of the meta-features for the meta-models.
- Deep learning models: Our deep learning models are based on the DeepPurpose library in Python. The input data are ligand SMILES and protein amino acid sequences. We developed 4 families of de novo-trained and fine-tuned models using BindingDB and PDBbind. The `deep learning` folder contains example code for selected models. Together with pre-trained models from DeepPurpose, we built up to 1,100 model variants from cross-validation. Predicted binding affinities are part of the meta-features for the meta-models, with or without dimensionality reduction by PCA.
- Molecular weight: The molecular weights of the ligands may be used as a meta-feature for the meta-models. One way to extract them programmatically is to use `OpenBabel`'s `obprop` function.
- Meta-models: The code for the meta-model prediction task is provided in the `meta-models` folder. It takes as inputs a directory containing the deep-learning predictions and a spreadsheet containing the docking tool scores and molecular weights.
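
The log-parsing step mentioned under Docking tools can be sketched as follows. This is a minimal illustration, not the code in this repository: it assumes the docking log contains an AutoDock Vina-style result table (mode index, affinity in kcal/mol, two RMSD columns), and the function names `parse_affinities` and `lowest_energy_affinity`, as well as the values in `EXAMPLE_LOG`, are hypothetical and made up for the example.

```python
import re
from typing import List, Optional

# Assumed row layout (Vina-style table): "<mode> <affinity> <rmsd l.b.> <rmsd u.b.>"
# e.g. "1       -7.5       0.000      0.000". Header and separator lines do not match.
_ROW = re.compile(
    r"^\s*(\d+)\s+(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*$"
)


def parse_affinities(log_text: str) -> List[float]:
    """Return the affinity column (kcal/mol) of every result row, in file order."""
    affinities = []
    for line in log_text.splitlines():
        match = _ROW.match(line)
        if match:
            affinities.append(float(match.group(2)))
    return affinities


def lowest_energy_affinity(log_text: str) -> Optional[float]:
    """Return the most negative affinity, or None if no result rows were found."""
    scores = parse_affinities(log_text)
    return min(scores) if scores else None


# Illustrative log fragment with invented numbers (not real docking output).
EXAMPLE_LOG = """\
mode |   affinity | dist from best mode
     | (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1       -7.5       0.000      0.000
2       -7.1       1.832      2.904
3       -6.8       2.115      4.377
"""

if __name__ == "__main__":
    print(lowest_energy_affinity(EXAMPLE_LOG))
```

Taking the minimum over all rows, rather than reading only the first row, keeps the sketch independent of whether the log lists poses in sorted order. The resulting value per complex could then be written to the spreadsheet of docking scores that the meta-models take as input.
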
- `README.md`: This current page
- `SupplementaryTables.revision4.xlsx`: Supplementary Tables S1 to S11 associated with our article (v4) above
- `data`: Data for model training and evaluation
- `docking`: Empirical docking tools
- `deep learning`: Deep learning models
- `meta-models`: Meta-models
- `LICENSE`: GNU General Public License v3.0
- Ho-Joon Lee, Ph.D.: ho-joon.lee[at]yale.edu
- Prashant Emani, Ph.D.: prashant.emani[at]yale.edu
Released under the GNU General Public License v3.0. See LICENSE.