arnabpune/SPOTLIGHT

About

This folder contains the distributable executables for the SPOTLIGHT program. A template input file is provided as input_template.dnvin. The source code is under ./dnv, and compilation instructions are given below.

Requirements

To compile SPOTLIGHT, you need the following packages/libraries:

  • cmake (version 3.0+)
  • make
  • OpenMPI (version 3.1+)
  • ZeroMQ libraries for C++ (and preferably Python): zeromq and cppzmq
  • libtorch for C++ (the specific version packaged with SPOTLIGHT; the latest version will NOT work because of changes to deprecated implementations). The packaged version is available at: [Dropbox Link]

Add libtorch/libs to the LIBRARY_PATH and LD_LIBRARY_PATH variables before running.
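If the programs are started from a launcher script, the two path variables can be set for the child process only. The following is a minimal Python sketch of that idea; the helper name with_libtorch_paths and the example directory are assumptions for illustration and are not part of SPOTLIGHT.

```python
import os


def with_libtorch_paths(lib_dir, env=None):
    """Return a copy of env with lib_dir prepended to both search-path variables."""
    new_env = dict(os.environ if env is None else env)
    for var in ("LIBRARY_PATH", "LD_LIBRARY_PATH"):
        existing = new_env.get(var, "")
        # Keep any existing entries, placed after the new directory.
        new_env[var] = lib_dir + (os.pathsep + existing if existing else "")
    return new_env
```

The returned dictionary can be passed as the env argument of subprocess.run, which leaves the environment of the calling shell untouched.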

Compiling

The folder spotlight_pt_port has an automatic build script. Ideally, you want to run cmake and make using the build files in this folder.
You can follow these steps to compile SPOTLIGHT:

  • Download libtorch.zip from the link above and unzip it. Remember the path to the libtorch folder extracted from the zip file
  • Enter the SPOTLIGHT folder on a terminal.
  • Enter the spotlight_pt_port folder and open the autobuild.sh script
  • Set LIBTORCH_LOC to the path of the extracted libtorch folder (e.g. LIBTORCH_LOC=/home/user/cpp/libtorch)
  • Save the file and run ./autobuild.sh
  • If all the libraries are installed, the build should complete without errors. Use Bash as your shell for best results.
  • If you are using Ubuntu, a template install script is provided, which should run with minimal debugging.
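For repeated or automated builds, the LIBTORCH_LOC edit can be scripted instead of done by hand. The sketch below assumes that autobuild.sh contains a line of the form LIBTORCH_LOC=..., as in the example above; the function name set_libtorch_loc is hypothetical, and the real script may be laid out differently.

```python
import re


def set_libtorch_loc(script_text, libtorch_dir):
    """Replace the value of every line-leading LIBTORCH_LOC=... assignment."""
    pattern = re.compile(r"^([ \t]*LIBTORCH_LOC=).*$", flags=re.MULTILINE)
    # A function replacement avoids backslash-escape surprises in the path.
    new_text, count = pattern.subn(lambda m: m.group(1) + libtorch_dir, script_text)
    if count == 0:
        raise ValueError("no LIBTORCH_LOC= assignment found")
    return new_text
```

Read the script text, pass it through this function, and write the result back before running ./autobuild.sh. The function raises an error rather than silently doing nothing if no assignment is found.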

Building with Docker

A Dockerfile is provided in the docker folder. Building it should produce a working Docker image. Executing SPOTLIGHT in the image follows the same instructions as outside Docker.
Note: SPOTLIGHT is installed to /usr/share/SPOTLIGHT in the image, and the executables are copied to /usr/bin. If running these executables directly produces errors, use the original copies in /usr/share/SPOTLIGHT/spotlight_pt_port/build.

Running the programs

Once compiled, each executable can be run directly from any Linux/UNIX terminal. Please ensure that libzmq is installed and up to date, or that the library (provided in the distributable zip file) is in the LIBRARY_PATH.
The input format varies for each executable.
Note: These programs can print a large amount of on-screen text while running. To avoid filling your screen, you can append 1> /dev/null to the command. Sample commands to run these programs are given below:

  • ./prodvacPT target_size num_mols variance # (See below for explanation of variance)
  • ./production_noPT input.dnvin
  • ./productionPT input.dnvin
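When the programs are driven from a script rather than typed at a terminal, the same output suppression can be done without a shell redirect. This is a minimal Python sketch; the helper name run_quiet is an assumption for illustration.

```python
import subprocess


def run_quiet(command, **kwargs):
    """Run command with stdout discarded; stderr still reaches the terminal."""
    # Same effect as appending `1> /dev/null` to the command in a shell.
    return subprocess.run(command, stdout=subprocess.DEVNULL, check=True, **kwargs)
```

For example, run_quiet(["./productionPT", "input.dnvin"]) would start a production run silently. Because check=True is set, a non-zero exit status raises subprocess.CalledProcessError instead of passing unnoticed.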

Programs that use RL require the files convsave.pt and decsave.pt to be present one directory above the current working directory (the directory from which the program is run). If you want to use our models, these files are available in spotlight_data/model.
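A quick check before launching can catch a misplaced model file early. The sketch below reads the requirement as "the parent of the directory the program is run from", which is an interpretation of the sentence above; the function name missing_model_files is hypothetical.

```python
from pathlib import Path

MODEL_FILES = ("convsave.pt", "decsave.pt")


def missing_model_files(run_dir="."):
    """List the model files that are absent from the parent of run_dir."""
    parent = Path(run_dir).resolve().parent
    return [name for name in MODEL_FILES if not (parent / name).is_file()]
```

An empty list means both files were found where the RL programs are expected to look for them.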

Preparing the protein

It is necessary to use a convention similar to the CHARMM27 forcefield protein atom naming in GROMACS.
If you have GROMACS 2018 or later installed, use the following command:
gmx pdb2gmx -f input_file -o output_file.gro -ff charmm27 -ignh
For protein targets, you must manually remove any capping groups, the protonated amine (NH3+ atoms) at the N-terminus, and the acid (COO- atoms) at the C-terminus. This is usually acceptable because active sites tend to lie far enough from the terminal residues.
If this is not the case, consider manually adding parameters to the respective RTP files in spotlight_data, or manually changing the terminal atom types so that they are treated like normal amino-acid atom types without charges. Note: protein.gro, wherever it is mentioned (including in the input DNVin template), refers to the GRO file for the target protein generated by this protocol.

Preparing the input file

Use spotlight_pt_port/generate_sample_input.sh to generate a sample input file.
Many of the available options are explained below. Please leave other options (if any are present) unchanged, as they might not yet be implemented, or may require other programs to be running.

  • Protein: Path to the protein.gro file as prepared above
  • ProteinFormat: Input format for the protein file. Only "pdb" and "gro" are supported. Using pdb does not waive the requirement that atom names match the CHARMM27 forcefield parameters.
  • Residues: A comma-separated list of residue numbers
  • RestrainDistance: Maximum distance of a newly placed heavy atom from the selected residue atoms. Hydrogen atoms from the protein are still considered.
  • ProgramName: Pick a name for the program. Most files generated by this program will start with or contain this phrase.
  • Sizes: Range of heavy atom sizes to target. Each size will be processed one after the other. This value can be comma-separated (10,12,14) or a range (20-30)
  • Step: Increment the target size by this step each time. Useful for targeting a larger size range quickly (10-20 with step 2 gives 10,12,14,16,18,20)
  • MolCount: Minimum number of accepted molecules for each size before termination.
  • Oscillations: Molecules receive final acceptance after this many oscillations in the Monte Carlo energy.
  • SeedCount: Number of seed positions to start with. Large values usually work better (between 250 and 1000)
  • Strategy_Restrain: Restrain the molecule? (Yes/No). Setting this to No disables restraining of the ligand growth and can produce trailing parts in the ligand that do not directly interact with the protein. Disabling it can still be useful when the active site is poorly defined.
  • SourceFolder: Path to the data folder provided as dnv/data under this project.
  • Optimize: Perform a quick gradient descent optimization after each ligand is generated (Yes/No)
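To make the interaction between Sizes and Step concrete, the sketch below expands the two option values into the list of target sizes, following the examples given above. It is an illustration only: SPOTLIGHT parses the input file itself, the function name expand_sizes is hypothetical, and the assumption that Step has no effect on a comma-separated list is not confirmed by the source.

```python
def expand_sizes(sizes, step=1):
    """Expand a Sizes value into the list of target heavy-atom counts."""
    sizes = sizes.strip()
    if "-" in sizes:
        low, high = (int(part) for part in sizes.split("-"))
        # The upper bound is inclusive, matching the 10-20 example.
        return list(range(low, high + 1, step))
    # Comma-separated values are taken as given; Step is assumed not to apply.
    return [int(part) for part in sizes.split(",")]
```

With these assumptions, expand_sizes("10-20", 2) gives 10, 12, 14, 16, 18, 20, and expand_sizes("10,12,14") gives the three listed sizes unchanged.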

A protein GRO file (used in our article) is provided in test_data.

Training your own model

SPOTLIGHT uses the PyTorch C++ API. You can find our current implementations of simple models in ./dnv/support/mytorch.h.
To implement your own model, extend this header file, edit spotlight_pt_port/prodvac_pytorch_trainable.cpp to load and run your model, and recompile. Remember to then include this implementation in all other CPP files for final execution.
Once your model is in place, run it with:

  • ./prodvacPT_trainable mol_size num_mols mol_count

About

SPOTLIGHT: Structure-based Prediction and Optimization Tool for LIgand Generation on Hard-to-drug Targets - Combining Deep Reinforcement Learning with Physics-based de novo drug design
