EnOpt

Ensemble Optimizer (EnOpt) is a fast, accessible tool that streamlines ensemble-docking and consensus-score analysis. EnOpt takes as input a matrix of docking scores from an ensemble virtual screen, organized as compounds (rows) × protein conformations (columns). It uses simple, interpretable machine learning to identify the most predictive subensembles and to compute an ensemble composite score.

Setup

Before using EnOpt, ensure that you have a Python environment with all necessary packages installed (e.g., NumPy, pandas, and SciPy). We have provided a conda specification file to make setting up such an environment easier:

conda create --name [environment name] --file conda_spec_file.txt

To print a guide with all standard options and their usage:

python ensemble_optimizer.py --help

Simple usage

An example of the simplest use of EnOpt:

python ensemble_optimizer.py -f [input file matrix]

Options

Input options

The input CSV file containing the ensemble docking score matrix (required):

-f INPUT_FILE
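
As a rough illustration of the compounds (rows) × conformations (columns) layout, the sketch below writes and re-reads a small score matrix with Python's standard csv module. The header row, the "compound" column label, and all compound and conformation names are assumptions made for this example; check the files in demo_files for the exact layout EnOpt expects.

```python
import csv
import io

# Hypothetical names and toy scores, not real docking output.
conformations = ["conf_1", "conf_2", "conf_3"]
rows = [
    ("compound_A", [-9.1, -7.4, -8.0]),
    ("compound_B", [-6.2, -6.9, -5.5]),
]


def write_matrix(handle, conformations, rows):
    """Write one header row, then one row of scores per compound."""
    writer = csv.writer(handle)
    writer.writerow(["compound"] + conformations)  # assumed header layout
    for name, scores in rows:
        writer.writerow([name] + scores)


def read_matrix(handle):
    """Return (conformation names, {compound: [scores...]})."""
    reader = csv.reader(handle)
    header = next(reader)
    scores = {row[0]: [float(x) for x in row[1:]] for row in reader}
    return header[1:], scores


# Round-trip through an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
write_matrix(buffer, conformations, rows)
buffer.seek(0)
names, matrix = read_matrix(buffer)
```

Each value in matrix holds one compound's scores, in the same order as the conformation names.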

A file containing the names of known ligands, separated by commas:

-l KNOWN_LIGS, --knownLigs KNOWN_LIGS
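
A comma-separated ligand list can be parsed in a few lines. The sketch below is only an illustration: the ligand names are hypothetical, and stripping whitespace around each name is an assumption, since the README does not say whether EnOpt tolerates spaces or trailing newlines.

```python
def parse_known_ligands(text):
    """Split a comma-separated string into ligand names, dropping empty entries."""
    return [name.strip() for name in text.split(",") if name.strip()]


# Hypothetical file contents; the names should match the compound names
# used in the score matrix.
example = "ligand_1, ligand_2,ligand_3\n"
known = parse_known_ligands(example)
```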

A JSON file containing all user-specified EnOpt input parameters, as an alternative to the command line input:

--json_input JSON_INPUT

Output options

The prefix of the output file:

--outFile OUT_FILE

The number of known ligands to include in interactive output:

--top_known_out TOP_KNOWN_OUT

The number of unknowns (compounds that are not known ligands) to include in interactive output:

--top_unknown_out TOP_UNKNOWN_OUT

Scoring options

The scoring scheme to use for combining scores across conformations:

--scoringScheme SCORING_SCHEME

(One of "eA", "eB", "rA", or "rB". "eA" uses the average score across all conformations in the ensemble. "eB" uses the best score across all conformations. "rA" uses the average of the score rank for each conformation. "rB" uses the best-ranked score across all conformations. Default: eA.)
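
The four schemes can be sketched in plain Python as follows. This is an illustration of the descriptions above, not EnOpt's own code. It assumes the default sign convention (more negative is better), ranks compounds within each conformation starting from 1, and breaks ties by input order; the function names and toy scores are hypothetical.

```python
from statistics import mean


def rank_column(values):
    """Rank scores within one conformation: 1 = most negative (best).

    Ties are broken by input order (an assumption of this sketch).
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def combine(scores, scheme="eA"):
    """Combine per-conformation scores into one value per compound.

    scores maps compound name -> list of scores, one per conformation.
    """
    names = list(scores)
    if scheme == "eA":  # average score across conformations
        return {n: mean(scores[n]) for n in names}
    if scheme == "eB":  # best (most negative) score across conformations
        return {n: min(scores[n]) for n in names}

    # Rank-based schemes: rank compounds separately within each conformation.
    n_conf = len(scores[names[0]])
    columns = [rank_column([scores[n][j] for n in names]) for j in range(n_conf)]
    ranks = {n: [columns[j][i] for j in range(n_conf)] for i, n in enumerate(names)}
    if scheme == "rA":  # average rank across conformations
        return {n: mean(r) for n, r in ranks.items()}
    if scheme == "rB":  # best (lowest) rank across conformations
        return {n: min(r) for n, r in ranks.items()}
    raise ValueError(f"unknown scoring scheme: {scheme}")


# Toy scores for three compounds docked into three conformations.
toy = {
    "cpd_A": [-9.0, -7.0, -8.0],
    "cpd_B": [-6.0, -7.5, -5.0],
    "cpd_C": [-8.0, -6.0, -9.0],
}
best_scores = combine(toy, "eB")
best_ranks = combine(toy, "rB")
```

In this toy data every compound ranks first in one conformation, so "rB" cannot separate them, while "eA" and "rA" still can.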

Whether to compute weights optimized using tree models:

--weightedScore

(EnOpt optimizes the weights using known ligands if they are provided. Otherwise, it falls back on score rankings, which is not recommended.)

Whether higher (more positive) scores describe stronger binding:

--invertScoreSign

(The scheme depends on the docking program used; for example, smina uses more negative scores to represent stronger binding. Default: False, meaning that more negative scores represent stronger binding.)
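
One way to picture the effect of this flag is as a sign flip that brings every score matrix to a single "lower is better" convention before scores are combined. The sketch below assumes that interpretation; the README does not describe how EnOpt handles the flag internally, and the function name is hypothetical.

```python
def to_lower_is_better(scores, higher_is_better=False):
    """Return a copy of the scores in which more negative means stronger binding.

    scores maps compound name -> list of scores. When higher_is_better is
    True, every score is negated; otherwise the values are copied unchanged.
    """
    if not higher_is_better:
        return {name: list(values) for name, values in scores.items()}
    return {name: [-s for s in values] for name, values in scores.items()}


# Toy scores from a hypothetical program where larger values mean stronger binding.
raw = {"cpd_A": [7.2, 8.1], "cpd_B": [5.0, 6.3]}
normalized = to_lower_is_better(raw, higher_is_better=True)
```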

Optimization options

Method to determine weighted scores:

--optimizationMethod OPT_METHOD

(One of "RF" (random forest) or "XGB" (gradient-boosted trees). Default: RF.)

Number of top conformations to include in the "best subensemble":

--topConformations TOPN_CONFS

(Default: 3)
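
To make the idea of a "best subensemble" concrete, the sketch below picks the N highest-weighted conformations and scores a compound with a weighted average over them. It is a simplified illustration under stated assumptions: the weights are made-up values standing in for whatever a tree model would produce, and renormalizing the weights over the selected conformations is a choice of this sketch, not something the README specifies.

```python
def top_conformations(weights, n=3):
    """Return the n conformation names with the largest weights."""
    return sorted(weights, key=weights.get, reverse=True)[:n]


def weighted_score(scores_by_conf, weights, selected):
    """Weighted average of one compound's scores over the selected conformations."""
    total = sum(weights[c] for c in selected)
    return sum(weights[c] * scores_by_conf[c] for c in selected) / total


# Hypothetical per-conformation weights (for example, importances from a tree model).
weights = {"conf_1": 0.40, "conf_2": 0.10, "conf_3": 0.30, "conf_4": 0.20}
# Toy docking scores for a single compound.
compound_scores = {"conf_1": -9.0, "conf_2": -7.0, "conf_3": -8.0, "conf_4": -6.0}

subensemble = top_conformations(weights, n=3)
composite = weighted_score(compound_scores, weights, subensemble)
```

Here conf_2 has the lowest weight and is dropped, so it does not contribute to the composite score.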

Whether to perform hyperparameter optimization for tree models:

--hyperparam

(Default: False; default tree model parameters will be used.)

Optional JSON file containing user-provided parameters for optimization:

--tree_params TREE_PARAMS

(If not provided, default hyperparameter optimization options will be used.)

Paper/lab link

Find more tools for analysis of protein-ligand binding at https://durrantlab.pitt.edu/durrant-lab-software/.

Contact info

For questions, suggestions, or problems with the tool, contact Roshni Bhatt at rob108@pitt.edu.

Acknowledgements

This work was supported by the National Institutes of Health (1R01GM132353-01) and the University of Pittsburgh's Center for Research Computing, RRID:SCR_022735 (supported by NSF award number OAC-2117681). We would like to thank Yogindra Raghav for his contributions in generating initial proof-of-concept code. We also thank Darian Yang for assistance in collating and pruning ideas.
