<a href="https://colab.research.google.com/github/BallesterGroup/MolecularDocking/blob/main/PandaDock.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

🧬 **PandaDock: A Physics-Based Molecular Docking Software**


<div align="center">

🧪 **Santiago Bolívar A.**  
**Qco., M.Sc., Ph.D.**  
📍 *National University of Rosario, Argentina*  
🔗 [GitHub Profile](https://github.com/Sbolivar16)

</div>

---

PandaDock is a cutting-edge, Python-based molecular docking toolkit designed to support high-precision **drug discovery**, **computational chemistry**, and **bioinformatics** workflows. Developed by [**Dr. Pritam Kumar Panda**](https://github.com/pritampanda15), PandaDock combines traditional docking methodologies with **physics-based scoring functions**, delivering a powerful and flexible platform for scientific research.

---

### 🔍 Key Features

- **🧾 Flexible Input Parsing**  
  Supports multiple molecular formats (PDB, MOL, SDF), enabling smooth integration across datasets.

- **🎯 Binding Site Detection**  
  Offers both manual and automated detection using **CASTp-like algorithms** for accurate binding pocket identification.

- **🧠 Multiple Scoring Models**  
  - *Basic*: Van der Waals and hydrogen bonding.  
  - *Enhanced*: Adds electrostatics, desolvation, and hydrophobic interactions.  
  - *Physics-Based*: MM-GBSA-inspired decomposition for a deep energetic profile.

- **⚡ Hardware Acceleration**  
  Utilizes **GPU computing** via PyTorch/CuPy and **multi-core CPU parallelization** to significantly speed up computations.

- **🧬 Advanced Search Algorithms**  
  Includes Genetic Algorithm, Monte Carlo Simulated Annealing, PANDADOCK (Simulated Annealing + Final Minimization), Random Search, Gradient-based methods, and Replica-Exchange techniques.

- **🔄 Flexible Docking**  
  Allows **auto-flex** and **custom-flex** options to account for protein flexibility during docking.

- **📊 High-Throughput Screening**  
  Enables batch processing with automated scoring, filtering, and summarization for large compound libraries.

- **📈 Detailed Reporting**  
  Provides extensive output including:  
  - Energy component breakdown  
  - Interaction profiles  
  - Pose clustering  
  - RMSD validation  
  - Interactive HTML visualizations

- **🔧 Extensible Python API**  
  Offers modular integration into custom pipelines for experienced users.

---

### ⚖️ Advantages Over Traditional Docking Programs

Conventional docking tools often struggle with:  
- Limited accuracy in modeling **complex molecular interactions**  
- Inability to incorporate **protein flexibility** effectively  
- Slower performance without hardware acceleration  

**PandaDock** addresses these issues by:  
✅ Integrating **physics-based scoring**  
✅ Supporting **advanced conformational search algorithms**  
✅ Leveraging **GPU/CPU acceleration** for rapid computations

---

### ⚠️ Limitations

- **High Computational Demand**  
  Physics-based models and advanced algorithms require powerful hardware, ideally HPC or GPU-enabled systems.

- **Steep Learning Curve**  
  Due to its flexibility and depth, new users may need time to master the full functionality of the platform.

---

### 📚 Reference

PandaDock is developed and actively maintained by [**Dr. Pritam Kumar Panda**](https://github.com/pritampanda15).  
🔗 GitHub Repository: [https://github.com/pritampanda15/PandaDock](https://github.com/pritampanda15/PandaDock)  
📘 Documentation & Wiki: [https://github.com/pritampanda15/PandaDock/wiki](https://github.com/pritampanda15/PandaDock/wiki)

> *This overview provides an academic summary of PandaDock's architecture, features, and advantages, intended for integration in computational drug discovery and molecular modeling workflows.*

---


In [None]:
# 1. Downgrade NumPy before installing RDKit
!pip install numpy==1.24.4

# 2. Install RDKit compatible with NumPy 1.x
!pip install rdkit-pypi

# 3. Install PyTorch with GPU support (optional but recommended)
!pip install torch torchvision torchaudio

# 4. Install PandaDock
!pip install pandadock



Collecting nvidia-cuda-nvrtc-cu12==12.4.127 (from torch)
  Downloading nvidia_cuda_nvrtc_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12==12.4.127 (from torch)
  Downloading nvidia_cuda_runtime_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-cupti-cu12==12.4.127 (from torch)
  Downloading nvidia_cuda_cupti_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cudnn-cu12==9.1.0.70 (from torch)
  Downloading nvidia_cudnn_cu12-9.1.0.70-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cublas-cu12==12.4.5.8 (from torch)
  Downloading nvidia_cublas_cu12-12.4.5.8-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cufft-cu12==11.2.1.3 (from torch)
  Downloading nvidia_cufft_cu12-11.2.1.3-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-curand-cu12==10.3.5.147 (from torch)
  Downloading nvidia_curand_cu12-10.3.5

In [1]:
!pandadock --help


/bin/bash: line 1: pandadock: command not found

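The `command not found` message above means the shell could not locate a `pandadock` executable on the PATH. One way to narrow this down is to check, from Python, which pieces of the install step are actually visible to the runtime. The sketch below is a minimal check, not part of PandaDock: `check_environment` is a hypothetical helper, and the import name `pandadock` is assumed to match the pip package name.

```python
import importlib.util
import shutil
import sys


def check_environment(modules=("numpy", "rdkit", "torch", "pandadock")):
    """Report which modules are importable and whether the CLI is on PATH."""
    # find_spec locates a top-level module without importing it.
    report = {name: importlib.util.find_spec(name) is not None for name in modules}
    # shutil.which returns the executable path, or None if it is not on PATH.
    report["pandadock (CLI)"] = shutil.which("pandadock") is not None
    return report


print(f"Python {sys.version.split()[0]}")
for name, found in check_environment().items():
    print(f"{name:16s} {'found' if found else 'MISSING'}")
```

If the module is reported as found but the CLI is missing, the package is installed but its console script is not on the PATH of the current runtime.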

# Example

In [10]:
!git init
!pwd
!ls -a ..
#!pandadock -p prot.pdb -l lig.sdf -s -25.09 100.717 3.524 --grid-radius 10.0 --grid-spacing 0.375 -a random --verbose --log-file pandadock_debug.log --cpu-workers 2 --num-orientations 30 --md-steps 100


Reinitialized existing Git repository in /content/.git/
/content
.			    .dockerenv	media			  run
..			    etc		mnt			  sbin
bin			    home	NGC-DL-CONTAINER-LICENSE  srv
boot			    kaggle	opt			  sys
content			    lib		proc			  tmp
cuda-keyring_1.1-1_all.deb  lib32	python-apt		  tools
datalab			    lib64	python-apt.tar.xz	  usr
dev			    libx32	root			  var


<div align="center">

## 🔬 **Detailed Analysis of Energy Components in the Top Docking Poses**

</div>

The molecular docking simulation generated a set of ligand-protein poses, each evaluated and ranked based on the **estimated total binding energy**. This allowed for the identification of the most favorable conformations in terms of interaction stability and complementarity.

To further dissect the nature of these interactions, the **top 6 poses** were analyzed individually, breaking down their most relevant energetic contributions:

---

### 📊 **Energy Component Breakdown (Top 6 Poses)**

| 🧷 **Pose** | ⚡ **Clash** | 💧 **Desolvation** | 🔋 **Electrostatics** | ♻️ **Entropy** | 🧪 **H-Bonds** | 🌑 **Hydrophobic** | 🔬 **Van der Waals** |
|------------|-------------|--------------------|------------------------|----------------|----------------|--------------------|-----------------------|
| 1          | -0.46       | -0.00              | -0.09                  | -0.12          | -0.09          | -0.09              | -0.14                 |
| 2          | -0.93       | -0.00              | -0.19                  | -0.23          | -0.19          | -0.19              | -0.28                 |
| 3          | -1.39       | -0.01              | -0.28                  | -0.35          | -0.28          | -0.28              | -0.42                 |
| 4          | -1.86       | -0.01              | -0.37                  | -0.46          | -0.37          | -0.37              | -0.56                 |
| 5          | -2.32       | -0.01              | -0.46                  | -0.58          | -0.46          | -0.46              | -0.70                 |
| 6          | -2.78       | -0.01              | -0.56                  | -0.70          | -0.56          | -0.56              | -0.84                 |

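The poses are described as ranked by estimated total binding energy, but the table lists only the individual components. As a rough illustration, the sketch below adds up the tabulated components for each pose and reports the most negative term. Treating the total as a plain sum of these seven columns is an assumption made here for illustration; the document does not state how PandaDock combines them. The helper names are hypothetical.

```python
# Values copied from the table above, one row per pose.
COMPONENTS = ["clash", "desolvation", "electrostatics", "entropy",
              "h_bonds", "hydrophobic", "van_der_waals"]
POSES = {
    1: [-0.46, -0.00, -0.09, -0.12, -0.09, -0.09, -0.14],
    2: [-0.93, -0.00, -0.19, -0.23, -0.19, -0.19, -0.28],
    3: [-1.39, -0.01, -0.28, -0.35, -0.28, -0.28, -0.42],
    4: [-1.86, -0.01, -0.37, -0.46, -0.37, -0.37, -0.56],
    5: [-2.32, -0.01, -0.46, -0.58, -0.46, -0.46, -0.70],
    6: [-2.78, -0.01, -0.56, -0.70, -0.56, -0.56, -0.84],
}


def component_sum(values):
    """Sum of the tabulated components (assumed additive), to two decimals."""
    return round(sum(values), 2)


def largest_contributor(values):
    """Name of the component with the most negative value."""
    index = min(range(len(values)), key=lambda i: values[i])
    return COMPONENTS[index]


for pose, values in POSES.items():
    print(f"pose {pose}: sum = {component_sum(values):6.2f}, "
          f"largest term = {largest_contributor(values)}")
```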
---

### 🧠 **Component Interpretation**

- **🧱 Clash (Steric Conflicts)**  
  More negative values indicate reduced spatial interference, reflecting a **better geometrical fit** of the ligand into the protein's binding site. Pose 6 shows the least steric clash, pointing to optimized spatial accommodation.

- **💧 Desolvation**  
  Near-zero values across poses suggest that **solvent removal** upon binding is not a major differentiator, possibly due to consistent polarity profiles across poses.

- **🔋 Electrostatics**  
  Increasingly negative values suggest **stronger charge-dipole interactions**, enhancing the electrostatic stabilization of the complex.

- **♻️ Entropy**  
  A progressive reduction in entropy reflects **loss of ligand conformational freedom**, which is expected during binding and is often balanced by other favorable energies.

- **🧪 Hydrogen Bonds**  
  The trend toward more negative values indicates an **increase in hydrogen bonding strength and/or frequency**, favoring complex stability and specificity.

- **🌑 Hydrophobic Interactions**  
  The values become more negative from pose 1 to pose 6, suggesting improved **hydrophobic complementarity** between ligand and binding pocket.

- **🔬 Van der Waals**  
  These interactions are essential for fine molecular recognition, and their increasing contribution highlights a **tighter molecular fit**.

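The trends described in this list can be checked directly against the table. The sketch below tests whether each component column is non-increasing from pose 1 to pose 6 and reports the change between the first and last pose. The helper names are hypothetical and the values are copied from the table.

```python
# Component columns copied from the energy table (poses 1 to 6, in order).
TABLE = {
    "Clash":          [-0.46, -0.93, -1.39, -1.86, -2.32, -2.78],
    "Desolvation":    [-0.00, -0.00, -0.01, -0.01, -0.01, -0.01],
    "Electrostatics": [-0.09, -0.19, -0.28, -0.37, -0.46, -0.56],
    "Entropy":        [-0.12, -0.23, -0.35, -0.46, -0.58, -0.70],
    "H-Bonds":        [-0.09, -0.19, -0.28, -0.37, -0.46, -0.56],
    "Hydrophobic":    [-0.09, -0.19, -0.28, -0.37, -0.46, -0.56],
    "Van der Waals":  [-0.14, -0.28, -0.42, -0.56, -0.70, -0.84],
}


def is_non_increasing(values):
    """True if every value is less than or equal to the one before it."""
    return all(later <= earlier for earlier, later in zip(values, values[1:]))


def change_first_to_last(values):
    """Difference between the last and first pose, to two decimals."""
    return round(values[-1] - values[0], 2)


for name, column in TABLE.items():
    print(f"{name:15s} non-increasing: {is_non_increasing(column)}, "
          f"change pose 1 -> 6: {change_first_to_last(column):.2f}")
```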
---

### ✅ **Conclusion**

The energy decomposition highlights that the best docking poses are characterized by:

- ⚡ Reduced steric clashes  
- 🔋 Stronger electrostatic and hydrogen bonding interactions  
- 🌑 Enhanced hydrophobic effects  
- 🔬 Improved van der Waals contacts  

Despite the expected entropy loss, the **overall energetic profile supports a stable and specific ligand-protein interaction**. These findings validate the top-ranked poses as **promising candidates** for further refinement and validation through molecular dynamics simulations.

---


<div align="center">

## 🧬 **Examples of Commands in PandaDock**

</div>

The following examples illustrate how to run **PandaDock** in different execution modes depending on your objectives, from standard docking to virtual screening, with options for GPU acceleration and enhanced scoring.

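Most of the commands in this section pass three coordinates with `-s`. Assuming these are the x, y, z of the binding-site center in the coordinate frame of the receptor file, one common way to obtain them is the centroid of a co-crystallized ligand. The sketch below parses fixed-width PDB `HETATM` records with plain Python. `ligand_centroid` is a hypothetical helper, and the residue name `LIG` and the two sample atoms are made up for illustration.

```python
def ligand_centroid(pdb_text, resname):
    """Mean x, y, z of HETATM atoms whose residue name matches `resname`."""
    coords = []
    for line in pdb_text.splitlines():
        # Residue name occupies PDB columns 18-20 (0-based slice 17:20).
        if line.startswith("HETATM") and line[17:20].strip() == resname:
            # Fixed-width PDB columns: x = 31-38, y = 39-46, z = 47-54.
            coords.append((float(line[30:38]),
                           float(line[38:46]),
                           float(line[46:54])))
    if not coords:
        raise ValueError(f"no HETATM atoms with residue name {resname!r}")
    n = len(coords)
    return tuple(round(sum(axis) / n, 3) for axis in zip(*coords))


# Two made-up HETATM records for a hypothetical ligand named LIG.
atoms = [("C1", 0.0, 0.0, 0.0), ("C2", 2.0, 4.0, 6.0)]
sample = "\n".join(
    f"HETATM{i:5d} {name:<4s} LIG A   1    {x:8.3f}{y:8.3f}{z:8.3f}"
    for i, (name, x, y, z) in enumerate(atoms, start=1)
)
x, y, z = ligand_centroid(sample, "LIG")
print(f"-s {x} {y} {z}")
```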
---

### ⚙️ Basic Docking (Default Genetic Algorithm)

Runs a standard docking using the **genetic algorithm**, with defined grid coordinates and GPU acceleration:

```bash
!pandadock -p protein.pdb -l ligand.sdf -s -15.7 -17.7 8.1 --grid-radius 10.0 --grid-spacing 0.375 -i 10 -a genetic --use-gpu
```



### 🧠 PandaDock with MMGBSA Scoring

Uses the **pandadock** algorithm with conformer generation and MMGBSA rescoring for improved accuracy:

```bash
!pandadock -p protein.pdb -l ligand.sdf -a pandadock --physics-based
```

### ⚡ GPU Acceleration

Run docking with GPU support to speed up calculations, combined with physics-based options:

```bash
!pandadock -p protein.pdb -l ligand.sdf --use-gpu --physics-based -s -15.7 -17.7 8.1 --grid-radius 10.0 --grid-spacing 0.375
```
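
The `--grid-radius` and `--grid-spacing` values in the command above jointly determine how many grid points cover the search region. As a rough sketch, assuming a regular cubic lattice clipped to a sphere around the `-s` center, the point count can be estimated as follows. PandaDock's internal grid construction is not described in this notebook and may differ; `count_grid_points` is a hypothetical helper, not part of PandaDock.

```python
import math


def count_grid_points(radius, spacing):
    """Count cubic-lattice points lying within `radius` of the grid center."""
    steps = int(math.floor(radius / spacing))
    # Compare squared distances in lattice units to avoid repeated scaling.
    limit = (radius / spacing) ** 2
    count = 0
    for i in range(-steps, steps + 1):
        for j in range(-steps, steps + 1):
            for k in range(-steps, steps + 1):
                if i * i + j * j + k * k <= limit:
                    count += 1
    return count


for spacing in (0.375, 0.3):
    n = count_grid_points(10.0, spacing)
    print(f"radius 10.0, spacing {spacing}: {n} points")
```

Under the same assumption, going from a spacing of 0.375 to 0.3 at a fixed radius multiplies the point count by roughly (0.375 / 0.3)^3 ≈ 1.95, which is worth keeping in mind when choosing the spacing for large screens.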


### 🤖 Automatic Algorithm Selection

Uses the pandadock tool to automatically select the best algorithm for the docking task:

```bash
!pandadock -p protein.pdb -l ligand.sdf --auto-algorithm -s -15.7 -17.7 8.1 --grid-radius 10.0 --grid-
```

### 🧪 Virtual Screening

Performs high-throughput virtual screening using a ligand library, parallel processing, and GPU acceleration:

```bash
!pandadock --virtual-screening --ligand-library ligands/ --parallel-screening -p receptor.pdb -s -15.7 -17.7 8.1 --use-gpu -a random --num-modes 9 --vs-exhaustiveness 1 --screening-processes 36 --top-hits 10 --grid-spacing 0.3
```

### 🚀 Fast Mode

Enables quick docking with minimal enhancements for speed:

```bash
!pandadock -p protein.pdb -l ligand.sdf --enhanced-scoring --local-opt --exhaustiveness 5 --prepare-molecules --flex-residues -s -15.7 -17.7 8.1 --grid-radius 10.0
```

### 🎯 Enhanced Mode

Combines scoring enhancements, local optimization, and flexible residues:

```bash
!pandadock -p protein.pdb -l ligand.sdf --enhanced-scoring --local-opt --exhaustiveness 5 --prepare-molecules --flex-residues -s -15.7 -17.7 8.1 --grid-radius 10.0
```

### ⚛️ Physics Mode

Uses a physics-based algorithm for more accurate docking results:

```bash
!pandadock -p protein.pdb -l ligand.sdf --physics-based -s -15.7 -17.7 8.1 --grid-radius 10.0
```

### 🐼 PandaDock Mode

Runs docking using the pandadock algorithm with conformer generation:

```bash
!pandadock -p protein.pdb -l ligand.sdf -a pandadock --prepare-molecules -s -15.7 -17.7 8.1 --grid-radius 10.0
```

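The Enhanced, Physics, and PandaDock modes above share the same inputs and differ only in a handful of flags. A small helper can assemble the argument list from Python, which avoids retyping long commands. This is a minimal sketch: `build_command` and `MODE_FLAGS` are hypothetical names, and the flag sets are copied verbatim from the example commands in this section rather than from the PandaDock documentation.

```python
import shlex

# Flag sets copied from the example commands in this section.
MODE_FLAGS = {
    "enhanced": ["--enhanced-scoring", "--local-opt", "--exhaustiveness", "5",
                 "--prepare-molecules", "--flex-residues"],
    "physics": ["--physics-based"],
    "pandadock": ["-a", "pandadock", "--prepare-molecules"],
}


def build_command(protein, ligand, mode, center, grid_radius=10.0):
    """Return the pandadock argument list for one of the modes shown above."""
    if mode not in MODE_FLAGS:
        raise ValueError(f"unknown mode {mode!r}; choose from {sorted(MODE_FLAGS)}")
    cmd = ["pandadock", "-p", protein, "-l", ligand, *MODE_FLAGS[mode]]
    cmd += ["-s", *(str(value) for value in center)]
    cmd += ["--grid-radius", str(grid_radius)]
    return cmd


cmd = build_command("protein.pdb", "ligand.sdf", "physics", (-15.7, -17.7, 8.1))
# shlex.join renders the list as a single shell-safe string for inspection.
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would execute it, provided the `pandadock` executable is installed and on the PATH.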