PocketHotspot

PocketHotspot is a tool for structure-based drug design. It extends MolSnapper with additional methods for selecting the initial (reference) atoms and for pharmacophore-guided molecule generation.

Overview

PocketHotspot generates drug-like molecules inside protein pockets. It includes:

Initial atom selection: four methods (score-based, pharmacophore locator, H-bond predictor, and random selection)
Cavity detection: several modes, including pyKVFinder, ligand proximity, and ligand coordinates

Prerequisites

PocketHotspot requires MolSnapper (https://github.com/oxpig/MolSnapper) as a dependency. Clone both repositories as sibling directories.

Installation

1. Clone MolSnapper

First, clone the original MolSnapper repository:

cd /path/to/your/workspace
git clone https://github.com/oxpig/MolSnapper.git MolSnapper
cd MolSnapper

2. Clone PocketHotspot

Clone PocketHotspot as a sibling directory to MolSnapper:

cd /path/to/your/workspace
git clone https://github.com/doesm/PocketHotspot.git PocketHotspot

3. Verify Directory Structure

Your directory structure should look like this:

your_workspace/
├── MolSnapper/          # External dependency
│   ├── models/
│   ├── configs/
│   ├── utils/
│   ├── ckpt/
│   │   ├── MolDiff.pt
│   │   └── bond_predictor.pt
│   └── ...
└── PocketHotspot/       # This repository
    ├── utils/
    ├── methods/
    ├── data/
    ├── sample_pocket.py
    ├── sample_dataset.py
    └── ...
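The layout above can be checked before running anything. The following is a minimal sketch, not part of either repository: the helper name `missing_paths` is hypothetical, and the paths it checks are only those shown in the tree.

```python
from pathlib import Path

# Paths copied from the directory tree above; the check itself is illustrative.
EXPECTED = {
    "MolSnapper": ["models", "configs", "utils", "ckpt/MolDiff.pt", "ckpt/bond_predictor.pt"],
    "PocketHotspot": ["utils", "methods", "data", "sample_pocket.py", "sample_dataset.py"],
}


def missing_paths(workspace):
    """Return the expected paths that do not exist under `workspace`."""
    root = Path(workspace)
    return [
        f"{repo}/{rel}"
        for repo, rels in EXPECTED.items()
        for rel in rels
        if not (root / repo / rel).exists()
    ]


if __name__ == "__main__":
    # Run from the workspace directory that holds both clones.
    missing = missing_paths(".")
    if missing:
        print("Missing:", *missing, sep="\n  ")
    else:
        print("Layout looks complete.")
```

An empty result means both clones sit side by side and the two checkpoint files are in MolSnapper/ckpt/.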

4. Install Dependencies

The environment uses Python 3.9, PyTorch 2.0 and CUDA 11.7.

# Create and activate the conda environment (recommended)
conda env create -f PocketHotspot_env.yml
conda activate PocketHotspot

Usage

Sampling for a Single Pocket

Use sample_pocket.py to generate molecules for a single protein pocket:

cd PocketHotspot
python sample_pocket.py \
    --receptor path/to/receptor.pdb \
    --ligand path/to/ligand.sdf \
    --device cuda:0 \
    --ref_atoms_method pharmacophore_locator \
    --config ../MolSnapper/configs/sample/sample_MolDiff.yml

Key Arguments

--receptor: Path to receptor PDB/PDBQT file (required)
--ligand: Path to ligand SDF/PDBQT/MOL2 file. Required when --ref_atoms_method is score_based, or when --pocket_detection is ligand_coords, ligand_proximity, or kvfinder_with_ligand (including when auto selects one of these modes); optional otherwise.
--config: Path to MolSnapper config file (default: ../MolSnapper/configs/sample/sample_MolDiff.yml)
--device: Device (cuda:0 or cpu; default: cuda:0)
--batch_size: Batch size for generation (default: 8)
--mol_size: Target molecule size (default: 20)
--clash_rate: Clash rate for pipeline (default: 0.1)
--ref_atoms_method: Method to select reference atoms (default: hbond_predictor)
- pharmacophore_locator: Pharmacophore features
- score_based: Affinity scores (requires --ligand)
- hbond_predictor: H-bond prediction model
- random: Random atom placement
--pocket_detection: Cavity detection mode (default: ligand_coords). Modes that require --ligand: ligand_coords, ligand_proximity, kvfinder_with_ligand.
- kvfinder: pyKVFinder blind detection (no ligand)
- kvfinder_interactive: pyKVFinder interactive (no ligand)
- kvfinder_with_ligand: pyKVFinder with ligand
- ligand_coords: Use ligand coordinates directly
- ligand_proximity: Grid around ligand
- auto: Try modes in order until one works
--ligand_proximity_radius: Radius (Å) for ligand_proximity mode (default: 4.0)

Method-specific options (--atom_fraction, --cutoff, --top_k_per_type, --hbond_model_path, --random_num_atoms, etc.) are listed in Methods.
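The rule for when --ligand is needed can be written down as a small check. This is a sketch based only on the argument descriptions above; the function name `ligand_required` is hypothetical and is not taken from sample_pocket.py.

```python
# Methods and cavity modes that need --ligand, as listed in the argument descriptions.
LIGAND_METHODS = {"score_based"}
LIGAND_CAVITY_MODES = {"ligand_coords", "ligand_proximity", "kvfinder_with_ligand"}


def ligand_required(ref_atoms_method, pocket_detection):
    """Return True if this combination of options needs a ligand file.

    The auto mode is not treated as requiring a ligand here, although it may
    still need one if it ends up using a ligand-based mode.
    """
    return ref_atoms_method in LIGAND_METHODS or pocket_detection in LIGAND_CAVITY_MODES
```

With the sample_pocket.py defaults (hbond_predictor and ligand_coords) the check returns True, so a ligand file is needed unless --pocket_detection is switched to a mode such as kvfinder.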

Example

python sample_pocket.py \
    --receptor data/5NGZ/5ngz_A_rec.pdb \
    --ligand data/5NGZ/5ngz_A_rec_5ngz_2bg_lig_tt_min_0.sdf \
    --device cuda:0 \
    --ref_atoms_method hbond_predictor \
    --pocket_detection ligand_proximity \
    --ligand_proximity_radius 5.0
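When the same call has to be repeated for several pockets, it can help to assemble the command from Python. The sketch below rebuilds the example command; `build_sample_pocket_cmd` is a hypothetical helper, and it only uses flags documented under Key Arguments.

```python
import shlex


def build_sample_pocket_cmd(receptor, ligand=None, **options):
    """Assemble an argument list for sample_pocket.py from keyword options."""
    cmd = ["python", "sample_pocket.py", "--receptor", str(receptor)]
    if ligand is not None:
        cmd += ["--ligand", str(ligand)]
    for name, value in options.items():
        # Each keyword becomes a "--name value" pair.
        cmd += [f"--{name}", str(value)]
    return cmd


cmd = build_sample_pocket_cmd(
    "data/5NGZ/5ngz_A_rec.pdb",
    ligand="data/5NGZ/5ngz_A_rec_5ngz_2bg_lig_tt_min_0.sdf",
    device="cuda:0",
    ref_atoms_method="hbond_predictor",
    pocket_detection="ligand_proximity",
    ligand_proximity_radius=5.0,
)
print(shlex.join(cmd))
# To launch it from the workspace: subprocess.run(cmd, cwd="PocketHotspot", check=True)
```

Passing the list to subprocess.run avoids shell quoting problems with paths that contain spaces.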

Sampling with Datasets

Use sample_dataset.py to generate molecules for multiple pockets from CrossDocked datasets:

cd PocketHotspot
python sample_dataset.py \
    --config ../MolSnapper/configs/sample/sample_MolDiff.yml \
    --dataset_dir ./data/crossdocked \
    --device cuda:0 \
    --ref_atoms_method pharmacophore_locator \
    --batch_size 8 \
    --clash_rate 0.1

Key Arguments

--config: Path to MolSnapper config file (default: ../MolSnapper/configs/sample/sample_MolDiff.yml or ./configs/sample/sample_MolDiff.yml)
--dataset_dir: Directory with CrossDocked dataset files (default: ./data/crossdocked)
--outdir: Output directory (default: ./outputs)
--device: Device (cuda:0 or cpu; default: cuda:0)
--batch_size: Batch size for generation (0 = use config default; default: 0)
--clash_rate: Clash rate for pipeline (default: 0.1)
--ref_atoms_method: Method to select reference atoms (default: pharmacophore_locator). Same choices as sample_pocket.py
--pocket_detection: Cavity detection mode (default: ligand_proximity). Same choices as sample_pocket.py
--ligand_proximity_radius: Radius (Å) for ligand_proximity mode (default: 4.0)

Method-specific options (--atom_fraction, --cutoff, --top_k_per_type, --hbond_model_path, --cavity_max_dist, --random_*) are listed in Methods.
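The ligand_proximity mode is described above only as a grid around the ligand, controlled by a radius in Å. One way to picture it is the sketch below, which keeps regular grid points that lie within the radius of any ligand atom. The function name, the grid spacing, and the exact selection rule are assumptions made for illustration; the repository's implementation may differ.

```python
import itertools
import math


def grid_near_ligand(ligand_coords, radius=4.0, spacing=1.0):
    """Return grid points within `radius` Å of any ligand atom.

    `ligand_coords` is a list of (x, y, z) tuples. The `spacing` parameter is
    an assumption of this sketch, not a documented PocketHotspot option.
    """
    # Bounding box of the ligand, padded by the radius on every side.
    lo = [min(c[i] for c in ligand_coords) - radius for i in range(3)]
    hi = [max(c[i] for c in ligand_coords) + radius for i in range(3)]
    axes = [
        [lo[i] + k * spacing for k in range(int((hi[i] - lo[i]) / spacing) + 1)]
        for i in range(3)
    ]
    return [
        point
        for point in itertools.product(*axes)
        if any(math.dist(point, atom) <= radius for atom in ligand_coords)
    ]
```

A larger --ligand_proximity_radius, such as the 5.0 used in the single-pocket example, would widen the region around the ligand that counts as cavity.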

Methods

Initial Atom Selection Methods

pharmacophore_locator: Selects atoms based on pharmacophore features
- --cutoff: Maximum distance to sum contributions (default: 6.0)
- --top_k_per_type: Maximum pharmacophores per type (default: 3)
score_based: Selects atoms with best affinity scores
- Requires --ligand argument
- --atom_fraction: Fraction of best atoms to select (default: 0.2)
hbond_predictor: Uses trained EGNN model to predict H-bond sites
- --hbond_model_path: Path to trained EGNN model (default: trained_hbond_predictor/best_model.pt)
- --cavity_max_dist: Maximum distance to cavity in Å (default: 4.0; sample_dataset.py only)
random: Random atom placement
- --random_num_atoms: Number of atoms (default: 5)
- --random_min_distance: Minimum distance between atoms in Å (default: 1.5)
- --random_element_type: Element type (O, N, C, or random)
- --random_seed: Seed for reproducibility (optional)
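To show how the options of the random method fit together, here is a minimal rejection-sampling sketch. It is not the repository's code: the function name, the spherical sampling region (`center`, `radius`), and `max_tries` are assumptions, while the remaining parameters mirror --random_num_atoms, --random_min_distance, --random_element_type, and --random_seed.

```python
import math
import random

ELEMENTS = ("O", "N", "C")


def place_random_atoms(center, radius, num_atoms=5, min_distance=1.5,
                       element_type="random", seed=None, max_tries=10000):
    """Sample atoms in a sphere so that no two are closer than `min_distance` Å."""
    rng = random.Random(seed)  # a fixed seed gives reproducible placements
    atoms = []
    for _ in range(max_tries):
        if len(atoms) == num_atoms:
            break
        # Draw a point in the enclosing cube and keep it only if it is inside the sphere.
        offset = [rng.uniform(-radius, radius) for _ in range(3)]
        if math.hypot(*offset) > radius:
            continue
        pos = tuple(c + o for c, o in zip(center, offset))
        # Reject candidates that sit too close to an already placed atom.
        if all(math.dist(pos, other) >= min_distance for _, other in atoms):
            element = rng.choice(ELEMENTS) if element_type == "random" else element_type
            atoms.append((element, pos))
    return atoms
```

For example, `place_random_atoms((0.0, 0.0, 0.0), 5.0, seed=0)` returns up to five (element, position) pairs. The list can be shorter than requested if the region is too small for the chosen minimum distance.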

Output

Results are written to the output directory, which contains:

SMILES.txt: SMILES strings of generated molecules
samples_all.pt: PyTorch file with all generated molecules
*_SDF/: Directory with SDF files of generated molecules
reference_atoms_initial.sdf: Initial reference atoms used
pocket.pdb: Detected/generated pocket structure
cavity_points.pdb: Cavity points (if detected)
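A quick way to inspect a finished run is to count the generated molecules. The sketch below assumes that SMILES.txt holds one SMILES string per line, which the description above does not state explicitly; the function names are hypothetical.

```python
from pathlib import Path


def read_smiles(outdir):
    """Return the non-empty lines of SMILES.txt (assumed: one SMILES per line)."""
    path = Path(outdir) / "SMILES.txt"
    return [line.strip() for line in path.read_text().splitlines() if line.strip()]


def summarize(outdir):
    """Count generated molecules and list the *_SDF directories in `outdir`."""
    smiles = read_smiles(outdir)
    sdf_dirs = sorted(p.name for p in Path(outdir).glob("*_SDF") if p.is_dir())
    return {
        "n_molecules": len(smiles),
        # Uniqueness is by exact string match; the SMILES are not canonicalized here.
        "n_unique": len(set(smiles)),
        "sdf_dirs": sdf_dirs,
    }
```

Calling `summarize("./outputs")` on the default output directory of sample_dataset.py would report how many molecules were written and how many distinct strings they cover.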
