PocketHotspot is a tool for structure-based drug design that extends MolSnapper by providing advanced methods for selecting initial atom placement and pharmacophore-guided molecule generation.
PocketHotspot provides enhanced sampling capabilities for generating drug-like molecules in protein pockets. It includes:
- Multiple initial atom selection methods: score-based, pharmacophore locator, H-bond predictor, and random selection
- Cavity detection: Multiple modes including pyKVFinder, ligand proximity, and ligand coordinates
PocketHotspot requires [MolSnapper] as a dependency. Both repositories should be cloned as sibling directories.
First, clone the MolSnapper repository (the original repository):
cd /path/to/your/workspace
git clone https://github.com/oxpig/MolSnapper.git MolSnapper
cd MolSnapper
Clone PocketHotspot as a sibling directory to MolSnapper:
cd /path/to/your/workspace
git clone https://github.com/doesm/PocketHotspot.git PocketHotspot
Your directory structure should look like this:
your_workspace/
├── MolSnapper/            # External dependency
│   ├── models/
│   ├── configs/
│   ├── utils/
│   ├── ckpt/
│   │   ├── MolDiff.pt
│   │   └── bond_predictor.pt
│   └── ...
└── PocketHotspot/         # This repository
    ├── utils/
    ├── methods/
    ├── data/
    ├── sample_pocket.py
    ├── sample_dataset.py
    └── ...
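Because the default config path (../MolSnapper/configs/sample/sample_MolDiff.yml) is relative to the PocketHotspot directory, a wrong layout tends to surface late as a missing-file error. The following is a minimal sketch of a pre-flight check, assuming only the paths shown in the tree above; the function name missing_layout_entries is hypothetical and not part of either repository.

```python
from pathlib import Path

# Entries copied from the directory tree above. The helper itself is
# illustrative and does not exist in MolSnapper or PocketHotspot.
REQUIRED_ENTRIES = [
    "MolSnapper/configs",
    "MolSnapper/ckpt/MolDiff.pt",
    "MolSnapper/ckpt/bond_predictor.pt",
    "PocketHotspot/sample_pocket.py",
    "PocketHotspot/sample_dataset.py",
]


def missing_layout_entries(workspace):
    """Return the expected files/directories that are absent under `workspace`."""
    root = Path(workspace)
    return [rel for rel in REQUIRED_ENTRIES if not (root / rel).exists()]
```

Calling missing_layout_entries("/path/to/your/workspace") returns an empty list when both repositories and the two checkpoints are in place, and the list of missing relative paths otherwise.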
The environment uses Python 3.9, PyTorch 2.0 and CUDA 11.7.
# Create and activate the conda environment (recommended)
conda env create -f PocketHotspot_env.yml
conda activate PocketHotspot
Use sample_pocket.py to generate molecules for a single protein pocket:
cd PocketHotspot
python sample_pocket.py \
--receptor path/to/receptor.pdb \
--ligand path/to/ligand.sdf \
--device cuda:0 \
--ref_atoms_method pharmacophore_locator \
--config ../MolSnapper/configs/sample/sample_MolDiff.yml
- --receptor: Path to receptor PDB/PDBQT file (required)
- --ligand: Path to ligand SDF/PDBQT/MOL2 file. Required for score_based and for the cavity modes ligand_coords, ligand_proximity, and kvfinder_with_ligand (and when auto uses them); optional otherwise.
- --config: Path to MolSnapper config file (default: ../MolSnapper/configs/sample/sample_MolDiff.yml)
- --device: Device (cuda:0 or cpu; default: cuda:0)
- --batch_size: Batch size for generation (default: 8)
- --mol_size: Target molecule size (default: 20)
- --clash_rate: Clash rate for pipeline (default: 0.1)
- --ref_atoms_method: Method to select reference atoms (default: hbond_predictor)
  - pharmacophore_locator: Pharmacophore features
  - score_based: Affinity scores (requires --ligand)
  - hbond_predictor: H-bond prediction model
  - random: Random atom placement
- --pocket_detection: Cavity detection mode (default: ligand_coords). Modes that require --ligand: ligand_coords, ligand_proximity, kvfinder_with_ligand.
  - kvfinder: pyKVFinder blind detection (no ligand)
  - kvfinder_interactive: pyKVFinder interactive (no ligand)
  - kvfinder_with_ligand: pyKVFinder with ligand
  - ligand_coords: Use ligand coordinates directly
  - ligand_proximity: Grid around ligand
  - auto: Try modes in order until one works
- --ligand_proximity_radius: Radius (Å) for ligand_proximity mode (default: 4.0)
Method-specific options (--atom_fraction, --cutoff, --top_k_per_type, --hbond_model_path, --random_num_atoms, etc.) are listed in Methods.
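The rule for when --ligand is needed depends on two separate options, which is easy to get wrong when scripting runs. The sketch below encodes the rule as stated in the argument list; the function and constant names are hypothetical, and the script's own validation may differ.

```python
# Rules transcribed from the argument descriptions above. The names below are
# illustrative; they are not taken from sample_pocket.py.
LIGAND_REF_METHODS = {"score_based"}
LIGAND_POCKET_MODES = {"ligand_coords", "ligand_proximity", "kvfinder_with_ligand"}


def ligand_required(ref_atoms_method="hbond_predictor",
                    pocket_detection="ligand_coords"):
    """Return True if the chosen options are documented as needing --ligand.

    Note: "auto" returns False here, but the README says auto may still need
    a ligand when it falls through to one of the ligand-based modes.
    """
    return (ref_atoms_method in LIGAND_REF_METHODS
            or pocket_detection in LIGAND_POCKET_MODES)
```

With the documented defaults (hbond_predictor and ligand_coords) the function returns True, so a ligand-free run needs an explicit --pocket_detection kvfinder or kvfinder_interactive.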
python sample_pocket.py \
--receptor data/5NGZ/5ngz_A_rec.pdb \
--ligand data/5NGZ/5ngz_A_rec_5ngz_2bg_lig_tt_min_0.sdf \
--device cuda:0 \
--ref_atoms_method hbond_predictor \
--pocket_detection ligand_proximity \
--ligand_proximity_radius 5.0
Use sample_dataset.py to generate molecules for multiple pockets from CrossDocked datasets:
cd PocketHotspot
python sample_dataset.py \
--config ../MolSnapper/configs/sample/sample_MolDiff.yml \
--dataset_dir ./data/crossdocked \
--device cuda:0 \
--ref_atoms_method pharmacophore_locator \
--batch_size 8 \
--clash_rate 0.1
- --config: Path to MolSnapper config file (default: ../MolSnapper/configs/sample/sample_MolDiff.yml or ./configs/sample/sample_MolDiff.yml)
- --dataset_dir: Directory with CrossDocked dataset files (default: ./data/crossdocked)
- --outdir: Output directory (default: ./outputs)
- --device: Device (cuda:0 or cpu; default: cuda:0)
- --batch_size: Batch size for generation (0 = use config default; default: 0)
- --clash_rate: Clash rate for pipeline (default: 0.1)
- --ref_atoms_method: Method to select reference atoms (default: pharmacophore_locator). Same choices as sample_pocket.py
- --pocket_detection: Cavity detection mode (default: ligand_proximity). Same choices as sample_pocket.py
- --ligand_proximity_radius: Radius (Å) for ligand_proximity mode (default: 4.0)
Method-specific options (--atom_fraction, --cutoff, --top_k_per_type, --hbond_model_path, --cavity_max_dist, --random_*) are listed in Methods.
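The ligand_proximity mode is described only as a grid around the ligand with a radius in Å. To make the effect of --ligand_proximity_radius concrete, here is one plausible reading as a sketch: keep the points of a regular grid that lie within the radius of at least one ligand atom. The grid spacing parameter, the function name, and the algorithm itself are assumptions; PocketHotspot's actual implementation may differ.

```python
import itertools
import math


def proximity_grid(ligand_coords, radius=4.0, spacing=1.0):
    """Grid points within `radius` (Å) of at least one ligand atom.

    Illustrative only: `spacing` is a hypothetical parameter, not a
    PocketHotspot option.
    """
    if not ligand_coords:
        return []
    # Bounding box of the ligand, padded by the radius on every side.
    lows = [min(c[i] for c in ligand_coords) - radius for i in range(3)]
    highs = [max(c[i] for c in ligand_coords) + radius for i in range(3)]
    axes = []
    for lo, hi in zip(lows, highs):
        n_steps = int(math.floor((hi - lo) / spacing)) + 1
        axes.append([lo + k * spacing for k in range(n_steps)])
    # Keep only grid points close enough to some ligand atom.
    return [
        point
        for point in itertools.product(*axes)
        if any(math.dist(point, atom) <= radius for atom in ligand_coords)
    ]
```

Under this reading, a larger radius grows the candidate region roughly with the cube of the radius, which is why the single-pocket example earlier can afford to raise it from 4.0 to 5.0 for a small ligand.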
- pharmacophore_locator: Selects atoms based on pharmacophore features
  - --cutoff: Maximum distance to sum contributions (default: 6.0)
  - --top_k_per_type: Maximum pharmacophores per type (default: 3)
- score_based: Selects atoms with best affinity scores
  - Requires the --ligand argument
  - --atom_fraction: Fraction of best atoms to select (default: 0.2)
- hbond_predictor: Uses trained EGNN model to predict H-bond sites
  - --hbond_model_path: Path to trained EGNN model (default: trained_hbond_predictor/best_model.pt)
  - --cavity_max_dist: Maximum distance to cavity in Å (default: 4.0; sample_dataset.py only)
- random: Random atom placement
  - --random_num_atoms: Number of atoms (default: 5)
  - --random_min_distance: Minimum distance between atoms in Å (default: 1.5)
  - --random_element_type: Element type (O, N, C, or random)
  - --random_seed: Seed for reproducibility (optional)
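The four --random_* options interact: a seed fixes the draw, and the minimum distance can prevent the requested number of atoms from fitting. A minimal sketch of how such a placement could work is shown below, assuming atoms are drawn from a list of candidate points (for example cavity points). The function name and the greedy rejection strategy are assumptions, not PocketHotspot's code.

```python
import math
import random


def place_random_atoms(candidates, num_atoms=5, min_distance=1.5,
                       element_type="random", seed=None):
    """Greedily pick up to `num_atoms` candidate points that are at least
    `min_distance` Å apart, assigning each an element.

    Illustrative sketch: may return fewer atoms than requested when the
    candidates are too crowded.
    """
    rng = random.Random(seed)  # a fixed seed makes the selection reproducible
    pool = list(candidates)
    rng.shuffle(pool)
    placed = []
    for point in pool:
        if len(placed) == num_atoms:
            break
        # Reject points that sit too close to an already placed atom.
        if all(math.dist(point, other) >= min_distance for _, other in placed):
            if element_type == "random":
                element = rng.choice(["O", "N", "C"])
            else:
                element = element_type
            placed.append((element, point))
    return placed
```

Each returned item is an (element, (x, y, z)) pair. Running the function twice with the same seed and candidates gives the same atoms, which mirrors the stated purpose of --random_seed.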
Generated molecules are saved in the output directory with:
- SMILES.txt: SMILES strings of generated molecules
- samples_all.pt: PyTorch file with all generated molecules
- *_SDF/: Directory with SDF files of generated molecules
- reference_atoms_initial.sdf: Initial reference atoms used
- pocket.pdb: Detected/generated pocket structure
- cavity_points.pdb: Cavity points (if detected)
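A quick first check on a finished run is how many molecules were produced and how many are distinct. The sketch below reads SMILES.txt from an output directory, assuming one SMILES string per non-empty line; that file layout and the function name summarize_smiles are assumptions, since the format is not specified above.

```python
from collections import Counter
from pathlib import Path


def summarize_smiles(outdir):
    """Count total and unique SMILES in `outdir`/SMILES.txt.

    Assumes one SMILES string per non-empty line (not confirmed by the README).
    Uniqueness is by exact string match, not by canonical structure.
    """
    path = Path(outdir) / "SMILES.txt"
    smiles = [line.strip() for line in path.read_text().splitlines() if line.strip()]
    counts = Counter(smiles)
    return {
        "total": len(smiles),
        "unique": len(counts),
        "most_common": counts.most_common(3),
    }
```

For a structure-aware comparison, the strings would need to be canonicalized first (for example with a cheminformatics toolkit), which this sketch deliberately leaves out.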