ctlee/bccgc4

Code for BioChemCoRe GC4 Team

The Drug Design Data Resource (D3R) hosts community drug discovery challenges whose goals include pose prediction, ligand affinity ranking, and free energy calculation. We participated in the ligand affinity ranking component of Grand Challenge 4, subchallenge 2, which focused on the Cathepsin S (CatS) system and 459 ligands.

This repository contains the analysis scripts and methods from our work before the challenge, along with much of the exploration we performed after the challenge.

Molecular Dynamics and Clustering

We first generated Molecular Dynamics (MD) trajectories of the CatS protein (PDB ID: 5QC4) without the ligand (apo MD) and with the ligand (holo MD).

We then clustered the MD trajectories using 3 different algorithms:

  • Time-lagged Independent Component Analysis followed by K-means (TICA) in PyEMMA
  • Principal Component Analysis followed by K-means (PCA) in PyEMMA
  • RMSD-based clustering in GROMACS (Gromos)

And by 2 different atom selections:

  • Backbone atom positions (TICA and PCA) or alpha carbons (Gromos)
  • Binding atoms (defined as atoms within 2 angstroms of the initial docked poses)

In total this gave 6 clustering methods. From each method we extracted 10 docking structures, taking the centroid of each of 10 discrete clusters.
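For the RMSD-based option, the neighbor-counting idea behind GROMOS-style clustering can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the GROMACS implementation and not the scripts in this repository: the function name `gromos_cluster`, the toy pairwise RMSD matrix, and the 1.0 angstrom cutoff are all hypothetical. Note that this scheme yields however many clusters the cutoff produces, so selecting a fixed number of centroids would need an extra step not shown here.

```python
def gromos_cluster(rmsd, cutoff):
    """Greedy neighbor-count clustering on a symmetric pairwise RMSD matrix.

    Repeatedly picks the frame with the most neighbors within `cutoff`,
    makes it a centroid, and removes it and its neighbors from the pool.
    Returns a list of (centroid_index, member_indices) tuples.
    """
    remaining = set(range(len(rmsd)))
    clusters = []
    while remaining:
        # Neighbors of each pooled frame (a frame is its own neighbor).
        neighbors = {
            i: [j for j in remaining if rmsd[i][j] <= cutoff]
            for i in remaining
        }
        # Most-connected frame becomes the centroid; ties go to the lowest index.
        centroid = max(sorted(remaining), key=lambda i: len(neighbors[i]))
        members = sorted(neighbors[centroid])
        clusters.append((centroid, members))
        remaining -= set(members)
    return clusters


# Hypothetical pairwise RMSD values (angstroms) for six trajectory frames.
rmsd = [
    [0.0, 0.5, 0.6, 2.0, 2.1, 3.0],
    [0.5, 0.0, 0.4, 2.2, 2.0, 3.1],
    [0.6, 0.4, 0.0, 2.1, 2.3, 3.2],
    [2.0, 2.2, 2.1, 0.0, 0.7, 2.9],
    [2.1, 2.0, 2.3, 0.7, 0.0, 2.8],
    [3.0, 3.1, 3.2, 2.9, 2.8, 0.0],
]

clusters = gromos_cluster(rmsd, cutoff=1.0)
for centroid, members in clusters:
    print(f"centroid frame {centroid}: members {members}")
```

The centroid frames are the ones that would be written out as receptor structures for docking.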

Docking

We initially docked with OpenEye's Fast Rigid Exhaustive Docking (FRED); however, the results did not rank the ligands effectively.

We then switched our docking software to Schrödinger's Glide and explored various modifications of the docking conditions.

To obtain a rank ordering, we took the minimum score of each ligand out of the 10 scores it received for each clustering method.
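The minimum-score ranking can be sketched as follows. The ligand names and scores are invented for illustration, only 3 receptor structures are shown instead of 10, and the example assumes the Glide-style convention that a more negative score is more favorable.

```python
# Hypothetical docking scores: one score per receptor structure in the ensemble.
# Assumption: more negative means more favorable.
scores = {
    "lig_A": [-6.1, -7.4, -5.9],
    "lig_B": [-8.2, -6.0, -7.7],
    "lig_C": [-5.5, -5.8, -6.9],
}


def rank_by_min_score(ensemble_scores):
    """Return ligands ordered best-first by their minimum ensemble score."""
    best = {lig: min(vals) for lig, vals in ensemble_scores.items()}
    ranking = sorted(best, key=best.get)  # ascending: most negative first
    return ranking, best


ranking, best = rank_by_min_score(scores)
for lig in ranking:
    print(lig, best[lig])
```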

Analysis

Centroid Analysis

We analyzed the structural variation among the centroids obtained from clustering the MD trajectories with each method. To do so, we computed the Root-Mean-Squared-Deviations (RMSD) and Root-Mean-Squared-Fluctuations (RMSF) of the centroids using MDTraj and AMBER18's cpptraj.
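For readers unfamiliar with the two quantities, here is a minimal plain-Python sketch of RMSD and per-atom RMSF. It does not use MDTraj or cpptraj, and it assumes the structures are already superposed and share the same atom ordering; the function names and the toy coordinates are hypothetical.

```python
import math


def rmsd(a, b):
    """RMSD between two structures given as lists of (x, y, z) tuples."""
    sq_dists = [
        sum((p - q) ** 2 for p, q in zip(atom_a, atom_b))
        for atom_a, atom_b in zip(a, b)
    ]
    return math.sqrt(sum(sq_dists) / len(sq_dists))


def rmsf(structures):
    """Per-atom RMSF: fluctuation of each atom around its mean position."""
    n = len(structures)
    fluctuations = []
    for k in range(len(structures[0])):
        mean = [sum(s[k][d] for s in structures) / n for d in range(3)]
        msd = sum(
            sum((s[k][d] - mean[d]) ** 2 for d in range(3)) for s in structures
        ) / n
        fluctuations.append(math.sqrt(msd))
    return fluctuations


# Two hypothetical two-atom "centroids": the second atom moves, the first does not.
centroid_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
centroid_2 = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)]

pair_rmsd = rmsd(centroid_1, centroid_2)
per_atom_rmsf = rmsf([centroid_1, centroid_2])
print(pair_rmsd, per_atom_rmsf)
```

RMSD summarizes how far apart two whole structures are, while RMSF shows which atoms vary most across the set of centroids.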

Kendall's Tau and Scoring Analysis

To assess how meaningful the Kendall's Tau values of our results were, we compared them against a distribution of Kendall's Tau values from random rank orderings.
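The comparison against random orderings can be sketched as below. This is an illustrative stand-in, not the repository's script: it implements Kendall's tau-a for data without ties in plain Python, and the rank lists, trial count, and seed are hypothetical.

```python
import random
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a for two equal-length sequences without ties."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


def random_tau_distribution(reference, n_trials, seed=0):
    """Tau values between the reference ranking and random shuffles of it."""
    rng = random.Random(seed)
    shuffled = list(reference)
    taus = []
    for _ in range(n_trials):
        rng.shuffle(shuffled)
        taus.append(kendall_tau(reference, shuffled))
    return taus


# Hypothetical experimental and predicted ranks for eight ligands.
experimental = [1, 2, 3, 4, 5, 6, 7, 8]
predicted = [2, 1, 3, 5, 4, 6, 8, 7]

tau = kendall_tau(experimental, predicted)
null_taus = random_tau_distribution(experimental, n_trials=2000)

# Fraction of random orderings that do at least as well as the prediction.
fraction_as_good = sum(t >= tau for t in null_taus) / len(null_taus)
print(tau, fraction_as_good)
```

A small `fraction_as_good` suggests the predicted ordering is unlikely to have arisen from a random ranking.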

In addition, we explored different scoring schemes where we took the average or weighted average ligand score instead of the minimum of the ensemble.
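A minimal sketch of how the choice of scoring scheme can change the ranking is given below. The scores are invented, and the weights are purely an assumption for illustration (for example, they could be relative cluster populations); the weighting actually explored is not specified here.

```python
# Hypothetical ensemble scores (more negative assumed more favorable).
scores = {
    "lig_A": [-9.0, -5.0, -5.0],
    "lig_B": [-8.0, -7.5, -7.0],
    "lig_C": [-6.0, -6.5, -8.5],
}
# Hypothetical per-structure weights, e.g. relative cluster populations.
weights = [0.5, 0.3, 0.2]


def mean_score(vals):
    return sum(vals) / len(vals)


def weighted_mean_score(vals, w):
    return sum(wi * vi for wi, vi in zip(w, vals)) / sum(w)


def rank(ensemble_scores, scheme):
    """Order ligands best-first under the given per-ligand scoring scheme."""
    combined = {lig: scheme(vals) for lig, vals in ensemble_scores.items()}
    return sorted(combined, key=combined.get)


rank_min = rank(scores, min)
rank_mean = rank(scores, mean_score)
rank_weighted = rank(scores, lambda vals: weighted_mean_score(vals, weights))

print("minimum:      ", rank_min)
print("average:      ", rank_mean)
print("weighted avg: ", rank_weighted)
```

In this toy data the three schemes give three different orderings, because the minimum rewards a single favorable pose while the averages reward consistent scores across the ensemble.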

Pose Analysis

To investigate the accuracy of our poses, we computed the RMSD of the common ligand core relative to the original co-crystal ligand in Schrödinger.

Ligand Analysis

To justify comparing pose accuracy against co-crystal poses, we measured the similarity of the given test set ligands to the co-crystal ligands currently available in the RCSB PDB using the Tanimoto coefficient.
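For reference, the Tanimoto coefficient between two binary fingerprints is the number of shared on-bits divided by the number of on-bits in either fingerprint. The sketch below represents fingerprints as sets of on-bit indices; the fingerprints and ligand labels are invented, and returning 0.0 for two empty fingerprints is a convention chosen for this example.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: |A and B| / |A or B| over sets of on-bit indices."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


# Hypothetical fingerprints as sets of on-bit indices.
test_ligand = {1, 4, 7, 9, 12}
cocrystal_ligands = {
    "xtal_1": {1, 4, 7, 9, 15},
    "xtal_2": {2, 3, 12},
}

similarities = {
    name: tanimoto(test_ligand, fp) for name, fp in cocrystal_ligands.items()
}
most_similar = max(similarities, key=similarities.get)
print(similarities, most_similar)
```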

Citing this Repository

If you use these scripts or analyses in your work, please cite this article:

Gan, J.L., Kumar, D., Chen, C. et al. Benchmarking ensemble docking methods in D3R Grand Challenge 4. J Comput Aided Mol Des 36, 87–99 (2022). https://doi.org/10.1007/s10822-021-00433-2
