# Deep Supervised Graph Partitioning Model (DSGPM)

This Colab notebook shows how to use the pre-trained DSGPM model to predict coarse-grained (CG) mappings. DSGPM is a graph neural network-based graph partitioning model that can predict CG mappings for molecules ranging from small to arbitrarily large.

## Citation


```
@Article{D0SC02458A,
author ="Li, Zhiheng and Wellawatte, Geemi P. and Chakraborty, Maghesree and Gandhi, Heta A. and Xu, Chenliang and White, Andrew D.",
title  ="Graph neural network based coarse-grained mapping prediction",
journal  ="Chem. Sci.",
year  ="2020",
pages  ="-",
publisher  ="The Royal Society of Chemistry",
doi  ="10.1039/D0SC02458A",
url  ="http://dx.doi.org/10.1039/D0SC02458A",
}

```




## Set up the conda environment and install dependencies

In [None]:
#@title
from IPython.utils import io
import os
import subprocess
import tqdm.notebook

TQDM_BAR_FORMAT = '{l_bar}{bar}| {n_fmt}/{total_fmt} [elapsed: {elapsed} remaining: {remaining}]'

try:
  with tqdm.notebook.tqdm(total=100, bar_format=TQDM_BAR_FORMAT) as pbar:
    with io.capture_output() as captured:

      %shell rm -rf /opt/conda
      %shell wget -q -P /tmp \
        https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh \
          && bash /tmp/Miniconda3-latest-Linux-x86_64.sh -b -p /opt/conda \
          && rm /tmp/Miniconda3-latest-Linux-x86_64.sh
      pbar.update(10)

      PATH=%env PATH
      %env PATH=/opt/conda/bin:{PATH}
      %shell conda update -qy conda && conda install -qy -c conda-forge python=3.7
      pbar.update(15)
      #%shell conda install -qy -c pytorch cudatoolkit=10.2 
      pbar.update(15)
      %shell conda install -qy -c conda-forge rdkit
      pbar.update(15)
      %shell git clone https://github.com/rochesterxugroup/DSGPM.git
      pbar.update(15)
      %shell python -m pip install torchvision 
      pbar.update(15)
      %shell python -m pip install scikit-learn~=0.21.3 numpy~=1.19.1 scipy~=1.3.1  networkx~=2.4 tqdm~=4.47.0
      pbar.update(15)
except subprocess.CalledProcessError:
  print(captured)
  raise





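To install PyTorch Geometric (`torch-geometric`), you must first install its companion packages built for your exact PyTorch and CUDA versions; version mismatches are a common cause of installation failures. Print the installed torch and CUDA versions before installation, then substitute them into the commands below.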

```
pip install torch-scatter -f https://pytorch-geometric.com/whl/torch-${TORCH}+${CUDA}.html
pip install torch-sparse -f https://pytorch-geometric.com/whl/torch-${TORCH}+${CUDA}.html
pip install torch-geometric
```
where `${CUDA}` and `${TORCH}` should be replaced by the specific CUDA version (cpu, cu92, cu101, cu102, cu110, cu111) and PyTorch version (1.4.0, 1.5.0, 1.6.0, 1.7.0, 1.7.1, 1.8.0, 1.8.1, 1.9.0), respectively.

For more details, see the PyTorch Geometric [installation documentation](https://pytorch-geometric.readthedocs.io/en/latest/notes/installation.html).
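
The version strings printed by the cell that follows can be turned into the wheel index URL programmatically. This is a minimal sketch and not part of DSGPM or PyTorch Geometric: the helper name `wheel_index_url` is hypothetical, and it assumes the URL pattern shown above and that the CUDA tag is the CUDA version with the dot removed (for example, `10.2` becomes `cu102`).

```python
from typing import Optional


def wheel_index_url(torch_version: str, cuda_version: Optional[str]) -> str:
    """Build the wheel index URL for a torch/CUDA pair (hypothetical helper)."""
    # torch.__version__ may carry a local build suffix such as "+cu102"; drop it.
    torch_tag = torch_version.split("+")[0]
    # torch.version.cuda is None on CPU-only builds.
    if cuda_version is None:
        cuda_tag = "cpu"
    else:
        cuda_tag = "cu" + cuda_version.replace(".", "")
    return f"https://pytorch-geometric.com/whl/torch-{torch_tag}+{cuda_tag}.html"


# Values as printed by torch.__version__ and torch.version.cuda in this notebook.
print(wheel_index_url("1.9.0+cu102", "10.2"))
```

Pass the resulting URL to `pip install ... -f <url>` for each companion package. If the combination is not listed in the supported versions above, the URL may not exist, so check the installation documentation first.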

In [None]:
#@title
%shell python -c "import torch; print(torch.__version__)"
%shell python -c "import torch; print(torch.version.cuda)"

1.9.0+cu102
10.2




In [None]:
#@title
%shell python -m pip install torch-scatter -f https://pytorch-geometric.com/whl/torch-1.9.0+cu102.html
%shell python -m pip install torch-sparse -f https://pytorch-geometric.com/whl/torch-1.9.0+cu102.html
%shell python -m pip install torch-cluster -f https://pytorch-geometric.com/whl/torch-1.9.0+cu102.html
%shell python -m pip install torch-spline-conv -f https://pytorch-geometric.com/whl/torch-1.9.0+cu102.html
%shell python -m pip install torch-geometric





# Using DSGPM to generate CG mappings

In this example, we use SMILES strings as input to DSGPM. Save one or more SMILES strings in a text file on your local computer and upload it here. To generate mappings for arbitrarily large molecules, you can use PDB input instead; see our [documentation](https://github.com/rochesterxugroup/DSGPM#readme) for how to work with PDB inputs.
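
If you do not yet have an input file, the sketch below writes one. It assumes the converter expects one SMILES string per line; that layout is inferred from the conversion output in this notebook and is not a documented requirement. The file name `SMILES.txt` matches the upload shown here, and the helper `write_smiles_file` is only illustrative. The two molecules are the ones used in this example.

```python
from pathlib import Path


def write_smiles_file(smiles_list, path="SMILES.txt"):
    """Write one SMILES string per line (assumed input layout) and return the path."""
    # Drop empty entries and surrounding whitespace before writing.
    cleaned = [s.strip() for s in smiles_list if s.strip()]
    target = Path(path)
    target.write_text("\n".join(cleaned) + "\n")
    return target


out_file = write_smiles_file(["CCC(=O)O", "C(CCN)CC(C(=O)O)N"])
print(out_file.read_text())
```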




In [None]:
#@title Upload file with SMILES strings
from google.colab import files

uploaded = files.upload()

# If several files are uploaded, the last file name is used as the input.
for fn in uploaded.keys():
  txt_input = fn

Saving SMILES.txt to SMILES.txt


In [None]:
#@title Set input and output paths
smiles_path = os.path.join(os.getcwd(),str(txt_input))
out_path = os.path.join('/content','dsgpm_out')


In [None]:
#@title Convert your SMILES string into a molecular graph
%shell python /content/DSGPM/generate_input_files/convert_to_json.py --smiles $smiles_path

CCC(=O)O

conversion complete
C(CCN)CC(C(=O)O)N

conversion complete




In [None]:
#@title Predict the CG mapping
%shell python /content/DSGPM/inference.py --pretrained_ckpt /content/DSGPM/model/DSGPM_trained.pth  --data_root /content/mol_graph/ --json_output_dir $out_path --num_cg_beads 3 --no_automorphism 

  0% 0/2 [00:00<?, ?it/s]100% 2/2 [00:00<00:00, 94.98it/s]




## To take a look at the output:


```
cd /content/dsgpm_out/dsgpm/
%shell cat <file name>
```

Our [documentation](https://github.com/rochesterxugroup/DSGPM#readme) provides a code snippet to generate SVG images of the outputs. Note: the JSON files must contain a `smiles` field for the images to be created.
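
The output JSON can also be inspected in Python. The sketch below groups atoms by their assigned bead, assuming each file has a `nodes` list whose entries carry `cg`, `element`, and `id` fields, as in the output printed in this notebook. The helper name `atoms_per_bead` is hypothetical, and the `example` input contains only the first four atom entries from that output, so it is not a complete mapping.

```python
import json
from collections import defaultdict


def atoms_per_bead(mapping):
    """Return {bead index: [(atom id, element), ...]} from a mapping dict."""
    beads = defaultdict(list)
    for node in mapping["nodes"]:
        beads[node["cg"]].append((node["id"], node["element"]))
    return dict(beads)


# Partial, illustrative input; in practice use json.load() on an output file.
example = json.loads("""
{"nodes": [{"cg": 0, "element": "C", "id": 0},
           {"cg": 0, "element": "C", "id": 1},
           {"cg": 1, "element": "C", "id": 2},
           {"cg": 1, "element": "O", "id": 3}]}
""")

for bead, atoms in sorted(atoms_per_bead(example).items()):
    print(bead, atoms)
```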


{
    "cgnodes": [
        [
            0,
            1
        ],
        [
            2,
            3
        ],
        [
            4
        ]
    ],
    "edges": [
        {
            "bondtype": 1.0,
            "source": 0,
            "target": 1
        },
        {
            "bondtype": 1.0,
            "source": 1,
            "target": 2
        },
        {
            "bondtype": 2.0,
            "source": 2,
            "target": 3
        },
        {
            "bondtype": 1.0,
            "source": 2,
            "target": 4
        }
    ],
    "nodes": [
        {
            "cg": 0,
            "element": "C",
            "id": 0
        },
        {
            "cg": 0,
            "element": "C",
            "id": 1
        },
        {
            "cg": 1,
            "element": "C",
            "id": 2
        },
        {
            "cg": 1,
            "element": "O",
            "id": 3
        },
        {
            "cg": 2,
            "elem

