This folder contains the distributable executable files for the SPOTLIGHT program. A template input file is provided as input_template.dnvin.
The source code is provided under ./dnv, and compilation instructions are given below.
To compile SPOTLIGHT, you need the following packages and libraries:
- cmake (V 3.0+)
- make
- OpenMPI (V 3.1+)
- ZeroMQ libraries for C++ (and preferably python) - zeromq and cppzmq
- libtorch for C++ (the specific version packaged with SPOTLIGHT; the latest version will NOT work because implementations that SPOTLIGHT relies on have since been deprecated or changed. The version packaged with SPOTLIGHT is available at: [Dropbox Link])
Add libtorch/libs to the LIBRARY_PATH and LD_LIBRARY_PATH variables before running.
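As a minimal sketch of the step above, the following Bash helper prepends a library directory to both variables without leaving a stray colon when a variable is empty. The function name add_libtorch_paths and the example path are illustrative assumptions, not part of SPOTLIGHT.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with SPOTLIGHT): prepend a library
# directory to LIBRARY_PATH and LD_LIBRARY_PATH.
add_libtorch_paths() {
    local libdir="$1"
    # ${VAR:+:${VAR}} expands to ":<old value>" only when VAR is non-empty,
    # so an empty or unset variable does not produce a trailing colon.
    export LIBRARY_PATH="${libdir}${LIBRARY_PATH:+:${LIBRARY_PATH}}"
    export LD_LIBRARY_PATH="${libdir}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
}

# Example call (the path is an assumption; use your own extracted folder):
# add_libtorch_paths /home/user/cpp/libtorch/libs
```

The exports only affect the current shell session, so the call would have to be repeated (or placed in a shell startup file) for each new terminal.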
The spotlight_pt_port folder contains an automatic build script. Ideally, run cmake and make using the build files in this folder.
You can follow these steps to compile SPOTLIGHT:
- Download libtorch.zip from the link above and unzip it. Remember the path to the libtorch folder extracted from the zip file
- Enter the SPOTLIGHT folder on a terminal.
- Enter the spotlight_pt_port folder and open the autobuild.sh script
- Set LIBTORCH_LOC to the path of the extracted libtorch folder (e.g. LIBTORCH_LOC=/home/user/cpp/libtorch)
- Save the file and run ./autobuild.sh
- If all the libraries are installed, the build should complete without errors. Use Bash as your shell for best results.
- If you are using Ubuntu, a template install script is provided, which should run with minimal debugging.
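If you prefer not to edit autobuild.sh by hand, the LIBTORCH_LOC step could be scripted roughly as follows. This sketch assumes the script assigns the variable on a line that starts with LIBTORCH_LOC=, which has not been verified against the actual file; set_libtorch_loc is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rewrite the LIBTORCH_LOC assignment in a build script.
# Assumption: the script contains a line beginning with "LIBTORCH_LOC=".
set_libtorch_loc() {
    local script="$1" libtorch_dir="$2"
    # '|' is used as the sed delimiter so slashes in the path need no escaping.
    # -i.bak edits in place and keeps a backup copy next to the script.
    sed -i.bak "s|^LIBTORCH_LOC=.*|LIBTORCH_LOC=${libtorch_dir}|" "$script"
}

# Example use from the SPOTLIGHT folder (the path is an assumption):
# cd spotlight_pt_port
# set_libtorch_loc autobuild.sh /home/user/cpp/libtorch
# ./autobuild.sh
```

The substitution would misbehave for paths containing | or &; for such paths, editing the file manually is the safer route.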
A Dockerfile is provided in the docker folder. Building it should produce a working Docker image. Executing SPOTLIGHT in the image follows the same instructions as for a local build.
Note: SPOTLIGHT will be installed to /usr/share/SPOTLIGHT in the image, and executables will be copied to /usr/bin. If you see errors when running these executables directly, you can use the original copies at /usr/share/SPOTLIGHT/spotlight_pt_port/build.
After compiling, each executable can be run directly from any Linux/UNIX terminal. Please ensure that libzmq is installed and up to date, or that the library (provided in the distributable zip file) is in the LIBRARY_PATH.
The input format varies for each executable.
Note: These programs can print a large amount of text to the screen while running. To avoid filling your screen, you can append 1> /dev/null to the command.
Sample commands to run these programs are given below:
- ./prodvacPT target_size num_mols variance # (See below for explanation of variance)
- ./production_noPT input.dnvin
- ./productionPT input.dnvin
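The output redirection mentioned in the note can be wrapped in a small Bash function so that it does not have to be typed each time. This is only a convenience sketch; run_quiet is a hypothetical name and not part of SPOTLIGHT.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run any command with standard output discarded.
# Standard error is left untouched, so error messages still reach the terminal,
# and the exit status of the wrapped command is passed through.
run_quiet() {
    "$@" 1> /dev/null
}

# Example use with one of the sample commands:
# run_quiet ./productionPT input.dnvin
```

Only standard output is suppressed here; whether the programs write their progress text to standard output or standard error has not been checked, so some text may still appear.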
The programs that use RL require the files convsave.pt and decsave.pt to be present one folder behind the current directory (the directory from which the program is executed), i.e. in its parent directory.
If you want to use our models, these files are available in spotlight_data/model.
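Before launching a run, it can help to confirm that the two model files are where the programs expect them. The sketch below reads "one folder behind" as the parent of the run directory, which is an interpretation rather than something confirmed by the source; check_models is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: verify that convsave.pt and decsave.pt exist
# in the parent of the run directory (assumed meaning of "one folder behind").
# Returns 0 if both files are found, 1 otherwise.
check_models() {
    local run_dir="${1:-.}" missing=0 f
    for f in convsave.pt decsave.pt; do
        if [ ! -f "${run_dir}/../${f}" ]; then
            echo "missing: ${run_dir}/../${f}" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example: check from the directory where the executable will be started.
# check_models . || echo "copy the files from spotlight_data/model first"
```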
Protein input files must use atom names that follow the CHARMM27 forcefield naming convention used in GROMACS.
If you have GROMACS 2018 or later installed, use the following command:
gmx pdb2gmx -f input_file -o output_file.gro -ff charmm27 -ignh
For protein targets, you must manually remove any capping groups, the protonated amine (NH3+ atoms) from the N-terminus, and the acid (COO- atoms) from the C-terminus. This is usually acceptable, as active sites tend to lie far enough from the terminal residues.
If this is not the case, consider adding parameters to the respective RTP files in spotlight_data, or changing the terminal atom types so that they are treated like normal amino-acid atom types without charges (both must be done manually).
Note: Any reference to protein.gro (including in the DNVin input template) means the GRO file for the target protein generated by this protocol.
Use spotlight_pt_port/generate_sample_input.sh to generate a sample input file.
Many of the available options are explained below. Please leave other options (if any are present) unchanged, as they might not yet be implemented, or may require other programs to be running.
- Protein: Path to the protein.gro file as prepared above.
- ProteinFormat: Input format for the protein file. Only "pdb" and "gro" are supported. Using pdb does not remove the requirement that atom names match the CHARMM27 forcefield parameters.
- Residues: A comma-separated list of residue numbers.
- RestrainDistance: Maximum distance of a newly placed heavy atom from the selected residue atoms. Hydrogen atoms from the protein are still considered.
- ProgramName: Pick a name for the program. Most files generated by this program will start with or contain this phrase.
- Sizes: Range of heavy atom sizes to target. Each size will be processed one after the other. This value can be comma-separated (10,12,14) or a range (20-30).
- Step: Increment the target size by step each time. Useful for targeting a larger size range quickly (10-20 with step 2 gives 10,12,14,16,18,20).
- MolCount: Minimum number of accepted molecules for each size before termination.
- Oscillations: Molecules are finally accepted after this many oscillations in the Monte Carlo energy.
- SeedCount: Number of seed positions to start with. Large values usually work better (between 250 and 1000).
- Strategy_Restrain: Restrain the molecule? (Yes/No). Setting this to No disables restraining of the ligand growth and can cause trailing parts in the ligand that do not directly interact with the protein. Disabling it might still be useful when the active site is too vague.
- SourceFolder: Path to the data folder provided as dnv/data under this project.
- Optimize: Perform a quick gradient descent optimization after each ligand is generated (Yes/No).
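To make the interaction of Sizes and Step concrete, the Bash sketch below expands a size specification into the individual target sizes, following the examples in the option descriptions. It is an illustration only and not SPOTLIGHT's own parser: expand_sizes is a hypothetical name, a default step of 1 is assumed, and the step is applied only to ranges while comma-separated values are passed through unchanged.

```shell
#!/usr/bin/env bash
# Hypothetical illustration of the Sizes/Step semantics (not SPOTLIGHT code).
# Usage: expand_sizes SPEC [STEP]
#   SPEC is a comma-separated list (10,12,14) and/or a range (20-30).
#   STEP defaults to 1 (assumption) and applies to ranges only.
expand_sizes() {
    local spec="$1" step="${2:-1}" out="" part start end n
    local IFS=','              # split the specification on commas
    for part in $spec; do
        case "$part" in
            *-*)               # a range such as 10-20
                start="${part%-*}"
                end="${part#*-}"
                n="$start"
                while [ "$n" -le "$end" ]; do
                    out="${out:+$out }$n"
                    n=$((n + step))
                done
                ;;
            *)                 # a single size
                out="${out:+$out }$part"
                ;;
        esac
    done
    echo "$out"
}

expand_sizes 10-20 2    # prints: 10 12 14 16 18 20
expand_sizes 10,12,14   # prints: 10 12 14
```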
A protein GRO file (used in our article) is provided in test_data.
SPOTLIGHT uses the PyTorch C++ API. You can find our current implementations of simple models at ./dnv/support/mytorch.h
You can implement your own models by extending this header file and recompiling after editing the spotlight_pt_port/prodvac_pytorch_trainable.cpp file to load and run your model. Remember to then include this implementation in all other CPP files for final execution.
Once your model is in place, you can run it with:
./prodvacPT_trainable mol_size num_mols mol_count