This repository is the official implementation of DePPA, a structure-based molecule optimization method.
conda create -n deppa python=3.10
conda activate deppa
conda install -c conda-forge rdkit=2022.9.5 numpy=1.26.4 scipy biopython=1.79 openbabel imageio seaborn wandb debugpy matplotlib spyrmsd qvina=2.1.0
conda install -c conda-forge prolif datamol
conda install -c mx reduce
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu118
pip install torch-scatter -f https://data.pyg.org/whl/torch-2.1.0+cu118.html
pip install torch-geometric==2.6.1 pytorch-lightning==2.5.5
pip install posecheck
pip install hydride

- Download the dataset archive crossdocked_pocket10.tar.gz and the split file split_by_name.pt from this link.
- Extract the TAR archive using the command: tar -xzvf crossdocked_pocket10.tar.gz.
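
For scripted setups, the extraction step can also be done from Python. The following is a minimal sketch using only the standard library; the helper names `extract_archive` and `missing_inputs` are hypothetical, and it assumes the archive unpacks into a `crossdocked_pocket10` directory placed next to `split_by_name.pt`.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir="."):
    """Extract a .tar.gz archive into dest_dir and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Use the safer "data" extraction filter when this Python version provides it.
    extra = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(archive_path, "r:gz") as archive:
        names = archive.getnames()
        archive.extractall(dest, **extra)
    return names


def missing_inputs(root, expected=("crossdocked_pocket10", "split_by_name.pt")):
    """Return the expected entries that are absent under root.

    The expected names are an assumption about the layout after download
    and extraction; adjust them to match the actual files.
    """
    return [name for name in expected if not (Path(root) / name).exists()]
```

Calling `extract_archive("crossdocked_pocket10.tar.gz")` followed by `missing_inputs(".")` should return an empty list if both the extracted directory and the split file are in the current directory.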
Download the pretrained model to be optimized with RL from this link and save it to the checkpoints/ folder.
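
The sampling command that follows passes five `--w_*` weights (QED, SA, Vina score, distance, strain). How the script combines them is not described here. One common choice, shown purely as an assumption and not as the repository's implementation, is a weighted sum over per-objective terms that have been normalized so that higher is better. The function name and dictionary keys below are hypothetical.

```python
# Weights copied from the example command's --w_* flags.
EXAMPLE_WEIGHTS = {
    "qed": 0.2,
    "sa": 0.2,
    "vina_score": 0.5,
    "distance": 0.1,
    "strain": 0.0,
}


def weighted_reward(terms, weights=EXAMPLE_WEIGHTS):
    """Combine per-objective terms into one scalar via a weighted sum.

    Assumption: every entry of `terms` is already scaled to [0, 1] with
    higher meaning better (for example, a more negative Vina score maps
    to a larger term). The actual normalization is not specified here.
    """
    return sum(weight * terms[name] for name, weight in weights.items())
```

With these example values the weights sum to 1, so a ligand whose terms are all 1.0 would receive a reward of about 1.0, and a weight of 0.0 (here, strain) removes that objective from the reward under this assumption.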
python -u batch_generate_ligands_rl.py checkpoints/crossdocked_fullatom_cond.ckpt --dataset_dir datasets/processed_crossdock_noH_full_temp/test --sanitize --n_samples 32 --rollouts 100 --inference_interval 5 --wandb_mode online --w_qed 0.2 --w_sa 0.2 --w_vina_score 0.5 --w_distance 0.1 --w_strain 0.0

Evaluate final sampling results
This evaluates QuickVina2 metrics (Vina Score, Vina Min, Vina Dock, scRMSD), PoseCheck metrics (Strain Energy, Steric Clash), and diversity.
python post_metrics.py --results_dir {path/to/results} --csv_name raw_eval.csv --pocket_pdb_dir {path/to/testset}
python summarize_results.py --results_dir {path/to/results} --csv_name {path/to/post_filled_csv}

Evaluate the top-N ligands
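
Conceptually, top-N evaluation keeps only the N best-ranked ligands for each pocket before computing metrics. The ranking criterion used by the repository's reranking script is not stated here, so the sketch below assumes ranking by a single score where lower is better (as with Vina energies). The function name and the `pocket` and `vina_score` keys are hypothetical.

```python
from collections import defaultdict


def top_n_per_pocket(rows, n, score_key="vina_score"):
    """Group rows by pocket and keep the n rows with the lowest score.

    `rows` is a list of dicts, each with a "pocket" key and a numeric
    value under `score_key`. Lower scores are treated as better, which
    is an assumption about the ranking criterion.
    """
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["pocket"]].append(row)
    return {
        pocket: sorted(items, key=lambda row: row[score_key])[:n]
        for pocket, items in grouped.items()
    }
```

The commands below perform the actual reranking and evaluation; the sketch is only meant to make the selection step concrete.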
python rerank_summarize_results.py --results_dir {path/to/results} --top_n 10
python post_metrics.py --results_dir {path/to/results} --csv_name {top_n_csv}
python summarize_results.py --results_dir {path/to/results} --csv_name {path/to/post_filled_csv}

Compute high-affinity rate in two steps.
First, compute the binding affinity of the reference ligand molecules given test set directory.
python eval_high_affinity.py --compute_ref_affinity --pocket_pdb_dir {path/to/testset}
This writes the reference affinity CSV to affinity_ref.csv.
Second, compute the high-affinity rate for generated ligands using the sampling results directory, the per-pocket CSV file name, the reference affinity CSV, and the Vina metric to compare.
python eval_high_affinity.py --results_dir {path/to/results} --csv_name {csv_name} --ref_affinity_csv affinity_ref.csv --vina_mode vina_dock
Use --vina_mode vina_score or --vina_mode vina_dock depending on which metric should define high affinity.
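
To make the metric concrete, here is a minimal sketch of a high-affinity rate as it is commonly defined: the fraction of generated ligands for a pocket whose Vina value is lower (more negative, hence stronger predicted binding) than that of the reference ligand, averaged over pockets. Whether the repository's script uses a strict comparison or this exact averaging is an assumption, and the function names are hypothetical.

```python
def high_affinity_rate(generated_scores, reference_score):
    """Fraction of generated ligands that score strictly below the reference.

    Vina energies are in kcal/mol; lower values indicate stronger
    predicted binding.
    """
    if not generated_scores:
        return 0.0
    hits = sum(1 for score in generated_scores if score < reference_score)
    return hits / len(generated_scores)


def mean_high_affinity_rate(per_pocket_scores, reference_scores):
    """Average the per-pocket rate over pockets that have a reference value.

    `per_pocket_scores` maps pocket name -> list of generated-ligand scores;
    `reference_scores` maps pocket name -> reference-ligand score.
    """
    rates = [
        high_affinity_rate(scores, reference_scores[pocket])
        for pocket, scores in per_pocket_scores.items()
        if pocket in reference_scores
    ]
    return sum(rates) / len(rates) if rates else 0.0
```

For example, with generated scores `[-8.0, -6.0, -7.5, -5.0]` and a reference score of `-7.0`, two of the four ligands beat the reference, giving a rate of 0.5.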
