This repository contains the code and scripts to reproduce the results reported in the paper "One class classification for the detection of β2 adrenergic receptor agonists using single-ligand dynamic interaction data" (J. Cheminform. 2022, 14, 74).
The scripts can be used to develop new models based on the same methodology.
The presented method uses protein-ligand interactions described as interaction pseudo-atoms (IPA) to rescore poses (J. Chem. Inf. Model. 2013, 53, 3, 623–637).
- Interaction detection requires a working installation of IChem 5.2.9
- A Linux/POSIX operating system
- Python 3 with the required dependencies and additional packages installed (see Install)
The installation package contains:
- The conda_environment.yml file containing the dependencies
- The additional python packages required for the scripts
- The python scripts necessary to reproduce the paper's results
- Download the install package
$ git clone https://github.com/LIT-CCM-lab/OCSVM-ADRB2.git
$ cd OCSVM-ADRB2/
- Indicate the location of IChem: open the file pyichem/pyichem/software_path.yml and indicate where the desired IChem executable is located. An alias can also be given.
- Create a Python virtual environment using Conda/Anaconda:
$ conda env create -n ocsvm_adrb2 -f conda_environment.yml
$ conda activate ocsvm_adrb2
- Install the additional python packages
(ocsvm_adrb2) $ pip install mol2_trajectory/
(ocsvm_adrb2) $ pip install pyichem/
(ocsvm_adrb2) $ pip install ocsvm_training/
- Test installation
(ocsvm_adrb2) $ python -c "import mol2_trajectory"
(ocsvm_adrb2) $ python -c "import pyichem"
(ocsvm_adrb2) $ python -c "import ocsvm_training"
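The three import checks above can also be run in one pass. The following is a minimal sketch, not part of the repository: the helper name `check_imports` is hypothetical, and it only assumes that the three packages are importable under the names used in the pip install steps.

```python
import importlib

# Package names taken from the pip install steps above.
REQUIRED_PACKAGES = ("mol2_trajectory", "pyichem", "ocsvm_training")


def check_imports(names=REQUIRED_PACKAGES):
    """Return a dict mapping each module name to True if it can be imported.

    Hypothetical helper for illustration; it is not shipped with the scripts.
    """
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status


if __name__ == "__main__":
    for name, ok in check_imports().items():
        print(f"{name}: {'OK' if ok else 'MISSING'}")
```

Run it inside the activated ocsvm_adrb2 environment; any package reported as MISSING should be reinstalled with the corresponding pip command.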
Before using the scripts for either testing or application it is necessary to activate the ocsvm_adrb2 conda environment.
Update the test.sh file with the locations of the topology, trajectory, and .pdb files to use for model building, and the location of the docking poses' interactions. The test.sh script can handle multiple trajectories that share the same topology file, and a single file containing the docking poses' interactions.
Outputs
- ichem_outputs/structures/ligand, folder containing the .mol2 file with the ligand structure
- ichem_outputs/structures/receptor, folder containing the .mol2 file with the receptor structure
- ichem_outputs/interactions, folder containing the .mol2 file with the IPA
- ichem_outputs/IFP, folder containing the raw ligands.ifp file generated by IChem
- ifp.csv, file containing the binary interaction fingerprints observed during the simulation
- ifp_map.csv, file containing the ligand and structure files corresponding to the different IFP
- interactions_map.csv, file containing the input and output files of the IPA. It is used as input for the training.py script
- interactions_map_newhyd.csv, file containing the input and output files of the IPA with the stricter definition of hydrophobic contacts. It is used as input for the training.py script
- mad_kernel_newhyd.sav, mad_kernel.sav, files containing the graph kernel obtained after MAD training
- qms2_kernel_newhyd.sav, qms2_kernel.sav, files containing the graph kernel obtained after QMS2 training
- mad_ocsvm_newhyd.sav, mad_ocsvm.sav, OCSVM models trained using the MAD heuristic
- qms2_ocsvm_newhyd.sav, qms2_ocsvm.sav, OCSVM models trained using the QMS2 heuristic
- MD_rescoring_0.csv, MD_rescoring_0_newhyd.csv, results of rescoring using the MAD trained models
- MD_rescoring_1.csv, MD_rescoring_1_newhyd.csv, results of rescoring using the QMS2 trained models
- rescoring_report.txt, rescoring_report_newhyd.txt, reports containing the number of molecules selected by each model
- training_report.txt, training_report_newhyd.txt, report containing information on the model training
- trajectory_conversion_report.txt, report containing information on the trajectories converted to .mol2 files and possible errors encountered during conversion
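After a run, it can be useful to confirm that the files listed above were produced. The sketch below is an illustration under stated assumptions: it assumes all outputs are written relative to a single output directory (the source does not state this), the helper name `missing_outputs` is hypothetical, and the `_newhyd` variants are left out because they depend on which definition of hydrophobic contacts was selected.

```python
from pathlib import Path

# Output names copied from the list above (default variants only).
EXPECTED_OUTPUTS = [
    "ichem_outputs/structures/ligand",
    "ichem_outputs/structures/receptor",
    "ichem_outputs/interactions",
    "ichem_outputs/IFP",
    "ifp.csv",
    "ifp_map.csv",
    "interactions_map.csv",
    "mad_kernel.sav",
    "qms2_kernel.sav",
    "mad_ocsvm.sav",
    "qms2_ocsvm.sav",
    "MD_rescoring_0.csv",
    "MD_rescoring_1.csv",
    "rescoring_report.txt",
    "training_report.txt",
    "trajectory_conversion_report.txt",
]


def missing_outputs(output_dir, expected=EXPECTED_OUTPUTS):
    """Return the expected files or folders that do not exist under output_dir.

    Assumes (not confirmed by the source) that every output lives under
    one common directory.
    """
    root = Path(output_dir)
    return [rel for rel in expected if not (root / rel).exists()]
```

For example, `missing_outputs(".")` called from the working directory returns the names that are still absent; an empty list means every default output was found.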
- Use the trajectory_converter.py script to convert the trajectory file into pairs of .mol2 files
- Use the compute_interactions.py script to compute the IPA. It is possible to select one or both definitions of hydrophobic contacts
- Use the training.py script to train the model. It is possible to select one or both of the training heuristics
- Use scoring.py to rescore the IPA files previously obtained from the docking poses
The scripts have been tested with the following input formats for the topology and coordinate files.
- Topology
- AMBER TOP: .prmtop, .top, .parm7
- CHARMM PSF: .psf
- Coordinate
- AMBER CRD: .inpcrd
- AMBER RST: .inprst
- AMBER TRJ: .trj
- AMBER NetCDF: .ncdf, .nc
- CHARMM DCD: .dcd
- GROMACS XTC: .xtc
- GROMACS TRR: .trr
The current implementation of the scripts accepts only topologies with atom types from the AMBER or CHARMM force fields; other force fields (including those using SYBYL atom types) are not supported. Some atom types from these force fields might not yet be implemented in the current version of the scripts.
This work was supported by the Agence Nationale de la Recherche (ANR) (grant number 2019 CE14 OCHRE to E.K.), Centre National de la Recherche Scientifique (CNRS), Institut du Médicament de Strasbourg (IMS), and Université de Strasbourg.