<a href="https://colab.research.google.com/github/gcourtade/GeqShift/blob/main/GeqShift.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## GeqShift

Easy-to-use carbohydrate <sup>13</sup>C NMR chemical shift prediction using GeqShift, an E(3) equivariant graph neural network.

The original GeqShift code is available at https://github.com/mariabankestad/GeqShift.

The dataset of <sup>1</sup>H and <sup>13</sup>C NMR chemical shifts is available at https://github.com/mariabankestad/GeqShift.

Please read and cite the GeqShift paper:
[Bånkestad M., Dorst K. M., Widmalm G., Rönnols J. Carbohydrate NMR chemical shift prediction by GeqShift employing E(3) equivariant graph neural networks.
*RSC Advances*, 2024](https://doi.org/10.1039/D4RA03428G)

##### Disclaimer
I made this Google Colab notebook for my own use and have no connection with the authors of the GeqShift paper. This notebook is heavily inspired by, and uses code from, the [ColabFold](https://colab.research.google.com/github/sokrypton/ColabFold/blob/main/AlphaFold2.ipynb#scrollTo=mbaIO9pWjaN0) notebook. The model was trained for 20 epochs using 100 conformations of each carbohydrate in the training set. I cannot guarantee the correctness of the results generated using this code.

--[Gaston Courtade](https://folk.ntnu.no/courtade), 2024-08-24

In [33]:
#@title Input carbohydrate SMILES, then hit `Runtime` -> `Run all`


query_smiles = 'O=C([O-])[C@H]4O[C@@H](O[C@H]1[C@@H](O)[C@H](O)[C@@H](CO)O[C@@H]1O[C@H]3[C@H](O)[C@@H](CO)O[C@@H](O[C@H]2[C@H](O)[C@@H](O)[C@@H](O)O[C@@H]2CO)[C@@H]3O)[C@H](O)[C@@H](O)[C@@H]4O[C@@H]5O[C@H](CO)[C@@H](O)[C@H](O)[C@@H]5O' #@param {type:"string"}
#@markdown  - Tip: Use [SMILES generator/checker](https://www.cheminfo.org/flavor/malaria/Utilities/SMILES_generator___checker/index.html) to edit SMILES.
jobname = 'NS-xanthan' #@param {type:"string"}
# number of conformations to generate for the ensemble
num_conformations = 10 #@param {type: "integer"}
#@markdown - Specify how many conformations should be generated in the ensemble for chemical shift prediction
#@markdown - Tip: Best results are expected with around 100 conformations
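
The three form fields above are passed straight to the prediction script, and `jobname` also ends up in output file names. As a minimal sketch (not part of the original notebook), the inputs could be sanity-checked before running. The helper name `check_inputs` and the rule for file-name-safe characters are assumptions made for illustration. The check is purely textual and does not verify that the SMILES is chemically valid; that would need a cheminformatics toolkit such as RDKit.

```python
import re


def check_inputs(smiles: str, jobname: str, num_conformations: int):
    """Hypothetical helper: basic textual checks on the notebook's form fields."""
    smiles = smiles.strip()
    if not smiles:
        raise ValueError("query_smiles is empty")
    if any(ch.isspace() for ch in smiles):
        raise ValueError("query_smiles must not contain whitespace")

    # Assumption: keep letters, digits, "_", "." and "-" in the job name and
    # replace everything else with "_" so it is safe to use in a file name.
    safe_jobname = re.sub(r"[^A-Za-z0-9_.-]", "_", jobname.strip())
    if not safe_jobname:
        raise ValueError("jobname is empty")

    if not isinstance(num_conformations, int) or num_conformations < 1:
        raise ValueError("num_conformations must be a positive integer")

    return smiles, safe_jobname, num_conformations


# Example with made-up values: surrounding spaces are stripped and the space
# inside the job name is replaced.
print(check_inputs(" OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O ", "my glucose", 10))
```

In the notebook, the returned values would simply replace `query_smiles`, `jobname`, and `num_conformations` before the prediction cell runs.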


In [34]:
#@title Install dependencies

import torch
import os

torch_url = "https://pytorch-geometric.com/whl/torch-{}.html".format(torch.__version__).replace('+', '%2B')
!pip install torch-cluster torch-geometric torch-scatter -f $torch_url
!pip install e3nn
!pip install rdkit
!pip install mdanalysis
!pip install py3Dmol

if not os.path.exists("GeqShift"):
  !git clone https://github.com/gcourtade/GeqShift.git

checkpoint_file = "_checkpoint_epoch_6.pkl"
if not os.path.exists(checkpoint_file):
  checkpoint_url = "https://folk.ntnu.no/courtade/GeqShift_models/" + checkpoint_file
  !curl -O $checkpoint_url

!curl -O https://folk.ntnu.no/courtade/GeqShift_models/predict_gpu.py



Looking in links: https://pytorch-geometric.com/whl/torch-2.3.1%2Bcu121.html
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 12131  100 12131    0     0  10261      0  0:00:01  0:00:01 --:--:-- 10263
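
The install cell builds the wheel index URL from the installed PyTorch version, which is why the `Looking in links` line above contains `%2B`. The sketch below reproduces that string manipulation in isolation, without importing `torch`. The function name `wheel_index_url` is hypothetical, and the version string is the one shown in the cell output.

```python
def wheel_index_url(torch_version: str) -> str:
    """Hypothetical helper mirroring the URL construction in the install cell."""
    url = "https://pytorch-geometric.com/whl/torch-{}.html".format(torch_version)
    # The "+" before the local build tag (for example "+cu121") is
    # percent-encoded as "%2B" so that it appears literally in the URL.
    return url.replace("+", "%2B")


print(wheel_index_url("2.3.1+cu121"))
# prints https://pytorch-geometric.com/whl/torch-2.3.1%2Bcu121.html
```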


In [None]:
#@title Run prediction
!python predict_gpu.py --smiles_list "$query_smiles" --mol_name "$jobname" --checkpoint_path $checkpoint_file --nbr_confs $num_conformations

{1: 0.0, 3: 0.0, 5: 0.0, 7: 0.0, 8: 0.0, 10: 0.0, 12: 0.0, 13: 0.0, 16: 0.0, 18: 0.0, 19: 0.0, 21: 0.0, 22: 0.0, 25: 0.0, 27: 0.0, 28: 0.0, 30: 0.0, 32: 0.0, 35: 0.0, 36: 0.0, 38: 0.0, 40: 0.0, 42: 0.0, 44: 0.0, 46: 0.0, 48: 0.0, 49: 0.0, 51: 0.0, 53: 0.0, 55: 0.0, }


In [None]:
#@title Display 3D structure with <sup>13</sup>C chemical shifts

import py3Dmol

def parse_pdb(pdb_file):
    """Return (atom_name, x, y, z, bfactor) for every HETATM record."""
    atoms = []
    with open(pdb_file, 'r') as file:
        for line in file:
            if line.startswith("HETATM"):
                # PDB records use fixed columns: atom name, coordinates, B-factor
                atom_name = line[12:16].strip()
                x = float(line[30:38].strip())
                y = float(line[38:46].strip())
                z = float(line[46:54].strip())
                bfactor = float(line[60:66].strip())
                atoms.append((atom_name, x, y, z, bfactor))
    return atoms

# The B-factor column of this file is read as the chemical shift label
pdb_filename = f"/content/predict/{jobname}_shifts.pdb"
atoms = parse_pdb(pdb_filename)

view = py3Dmol.view(js='https://3dmol.org/build/3Dmol.js',)
with open(pdb_filename, 'r') as pdb_handle:
    view.addModel(pdb_handle.read(), 'pdb')

view.setStyle({'stick': {}})

for atom_name, x, y, z, bfactor in atoms:
    if bfactor != 0:
        label_content = f"{bfactor:.2f}"
        view.addLabel(label_content, {'position': {'x': x, 'y': y, 'z': z}, 'fontSize': 12})

view.zoomTo()

view.show()
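
The `parse_pdb` function above relies on the fixed-column layout of PDB records. As a small self-contained illustration, the sketch below writes one HETATM line with the standard column widths and reads it back with the same slices. The helpers `format_hetatm` and `parse_hetatm` and all atom values are made up for this example, and the atom name is left-aligned in its field for simplicity.

```python
def format_hetatm(serial, name, resname, chain, resseq,
                  x, y, z, occupancy, bfactor, element):
    """Hypothetical helper: write one HETATM record in fixed PDB columns."""
    return (
        f"HETATM{serial:>5} {name:<4} {resname:>3} {chain:1}{resseq:>4}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}{occupancy:6.2f}{bfactor:6.2f}"
        f"          {element:>2}"
    )


def parse_hetatm(line):
    """Read back the fields using the same slices as parse_pdb."""
    return (
        line[12:16].strip(),   # atom name, columns 13-16
        float(line[30:38]),    # x, columns 31-38
        float(line[38:46]),    # y, columns 39-46
        float(line[46:54]),    # z, columns 47-54
        float(line[60:66]),    # B-factor, columns 61-66
    )


# Made-up atom: the B-factor slot carries an illustrative value of 101.25
line = format_hetatm(1, "C1", "GLC", "A", 1, 1.234, -5.678, 9.0, 1.0, 101.25, "C")
print(line)
print(parse_hetatm(line))
```

Because the slices are positional, a record with shifted columns would give wrong numbers or raise a `ValueError` instead of being parsed correctly.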


In [None]:
#@title Download the results
# Zip the prediction folder under the job name, then download that same file
!zip -r "/content/{jobname}.zip" /content/predict
from google.colab import files
files.download(f"/content/{jobname}.zip")