Skip to content


Switch branches/tags

Name already in use

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?

Latest commit


Git stats


Failed to load latest commit information.
Latest commit message
Commit time

Graph Neural Network Based Coarse-Grained Mapping Prediction

Open In Colab


This is a PyTorch implementation of Deep Supervised Graph Partitioning Model (DSGPM) described in paper Graph Neural Network Based Coarse-Grained Mapping Prediction.

DSGPM generates coarse-grained mapping (i.e., graph partitioning) for molecular graph. To achieve this, we collect Human-annotated Mappings (HAM) dataset to annotate mappings (i.e., graph partioning) of 1206 molecular graphs. DSGPM learns mapping from HAM dataset to predict mappings for unseen molecular graphs.


Here are the steps for environment setup:

  1. Install PyTorch 1.4

    conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch
  2. Install RDKit

    conda install -c conda-forge rdkit
  3. Install PyTorch Geometric

    Please follow steps for installing PyTorch Geometric at Please make sure that it's compatible with the installed CUDA version and PyTorch version.

  4. Install METIS

  5. Other packages

    pip install -r requirements.txt

HAM dataset

Download HAM Dataset via Then, follow the instruction to extract the tar ball.

More details of HAM dataset can be seen here:


To generate coarse-grained mapping (graph partitioning of molecular graph), please follow these steps:

  1. Download pretrained model from:

  2. run the following command:

    python \
      --pretrained_ckpt /path/to/DSGPM.pth \
      --data_root /path/to/dataset/ \  # folder containing json files
      --json_output_dir /path/to/folder/for/output/json_files \
      --num_cg_beads 5 6 7 \ # number of CG beads (graph partitions) of the generated mapping. If this option is not given, DSGPM will generate mapping in all possible number of CG beads (from 2 to number of atoms - 1).
      --no_automorphism # (optional) add this flag to turn off computing automorphism of the molecular graph, which is time-comsuming when it comes to large ones.


To predict CG mappings without downloading DSGPM Open In Colab


Please run this command to visualize coarse-grained mappings:

python \
	--vis_root /path/to/generated/images \
	--data_root /path/to/generated/json/files \ # folder containing json files
	--svg  # add this flag to generate svg images

Note: "smiles" field is required in json files.


To replicate our result, please run the following command:

python \
	--data_root /path/to/dataset \ # folder containing json files
	--batch_size 100 \
	--num_workers 20 \
	--ckpt /path/to/store/checkpoints \
	--tb_root /path/to/store/tensorboard/file

Generate input files

A user can convert molecules saved in PDB format or a SMILES to JSON format using This script currently reads only SMILES given in a text file.

python generate_input_files/ --pdb /path/to/pdb/containing/folder #for PDB to json conversion

python generate_input_files/ --smiles /path/to/SMILES/containing/textFile #for SMILES to json conversion


Please cite our paper if you use our code, pretrained model or dataset.

author ="Li, Zhiheng and Wellawatte, Geemi P. and Chakraborty, Maghesree and Gandhi, Heta A. and Xu, Chenliang and White, Andrew D.",
title  ="Graph neural network based coarse-grained mapping prediction",
journal  ="Chem. Sci.",
year  ="2020",
pages  ="-",
publisher  ="The Royal Society of Chemistry",
doi  ="10.1039/D0SC02458A",
url  ="",
abstract  ="The selection of coarse-grained (CG) mapping operators is a critical step for CG molecular dynamics (MD) simulation. It is still an open question about what is optimal for this choice and there is a need for theory. The current state-of-the art method is mapping operators manually selected by experts. In this work{,} we demonstrate an automated approach by viewing this problem as supervised learning where we seek to reproduce the mapping operators produced by experts. We present a graph neural network based CG mapping predictor called Deep Supervised Graph Partitioning Model (DSGPM) that treats mapping operators as a graph segmentation problem. DSGPM is trained on a novel dataset{,} Human-annotated Mappings (HAM){,} consisting of 1180 molecules with expert annotated mapping operators. HAM can be used to facilitate further research in this area. Our model uses a novel metric learning objective to produce high-quality atomic features that are used in spectral clustering. The results show that the DSGPM outperforms state-of-the-art methods in the field of graph segmentation. Finally{,} we find that predicted CG mapping operators indeed result in good CG MD models when used in simulation."}


Deep Supervised Graph Partitioning Model







No releases published


No packages published