Tutorial for Running Atomic Convolutions (ACNNs) #1179

rbharath · 2018-03-24T17:12:51Z

We currently lack a tutorial for running atomic convolutions. Here's some of the material it would be good to cover in such a tutorial:

Introduce the basic structure of the Atomic Conv model. The code for this model is here: https://github.com/deepchem/deepchem/blob/master/deepchem/models/tensorgraph/models/atomic_conv.py.
Create an IPython notebook in https://github.com/deepchem/deepchem/tree/master/examples/notebooks for the atomic conv tutorial. This notebook can be modeled on https://deepchem.io/docs/notebooks/protein_ligand_complex_notebook.html, but with ACNNs rather than grid fingerprints.
The tutorial should featurize a selection of the pdbbind dataset and train a simple ACNN model. You might find the code in https://github.com/deepchem/deepchem/tree/master/contrib/atomicconv useful.

nitinprakash96 · 2018-03-24T17:19:39Z

Can I work on this?

rbharath · 2018-03-24T17:25:06Z

@nitinprakash96 Yes, please go for it! Could you open a "WIP" pull request to track your ongoing progress?

nitinprakash96 · 2018-03-24T19:54:05Z

#1180

nitinprakash96 · 2018-05-19T13:49:06Z

Hi @rbharath, sorry, I've been caught up with exams and semester project reports, so I'm late getting to this. But that's over now and I can start writing this tutorial.
Just to clear up one doubt: I wrote up the structure of the ACNN as described in the paper (I thought it explained it best). As for the featurization, how exactly should the tutorial present that part?

rbharath · 2018-05-21T23:49:41Z

Great to hear you can work on this again!

For the featurization part, I think it can be presented as a method of processing 3D structures to extract the neighbor information.
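
To make "neighbor information" concrete, here is a minimal sketch of building a neighbor list from 3D coordinates. It is not DeepChem's featurizer: the function name `neighbor_list`, the cutoff, and the neighbor cap are illustrative assumptions, and the coordinates are made up.

```python
import math


def neighbor_list(coords, cutoff=4.0, max_neighbors=12):
    """Map each atom index to its closest neighbors within `cutoff`.

    Hypothetical helper for illustration only. `coords` is a sequence of
    (x, y, z) tuples; distances use the same unit as the coordinates.
    """
    neighbors = {}
    for i, ci in enumerate(coords):
        within = []
        for j, cj in enumerate(coords):
            if i == j:
                continue
            d = math.dist(ci, cj)
            if d <= cutoff:
                within.append((d, j))
        # Closest first; ties are broken by atom index.
        within.sort()
        neighbors[i] = [j for _, j in within[:max_neighbors]]
    return neighbors


# Four made-up atoms on a line; the last one is far from the rest.
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
nl = neighbor_list(coords, cutoff=2.0)
```

The brute-force double loop is quadratic in the number of atoms, which is fine for a tutorial-sized illustration; a cell list or k-d tree would be the usual choice for whole protein-ligand complexes.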

nitinprakash96 · 2018-05-30T04:53:29Z

I tried a few ways to do this. I tried constructing an Nx3 matrix of Cartesian coordinates and computing the neighbour list from that. But I keep running into ValueError: Bad Conformer Id whenever I try to convert the units from Angstrom to Bohr.

Is there a way to tackle this? I'm using https://github.com/deepchem/deepchem/blob/master/deepchem/feat/atomic_coordinates.py as my reference.

rbharath · 2018-05-30T22:44:27Z

@nitinprakash96 Can you post a minimal failing example? I'll take a look.

nitinprakash96 · 2018-06-01T17:14:08Z

import os

import numpy as np
import deepchem as dc
from rdkit import Chem

dataset_file = os.path.join(dc.utils.get_data_dir(), "pdbbind_core_df.csv.gz")
raw_dataset = dc.utils.save.load_from_disk(dataset_file)
smiles = raw_dataset['smiles'][:20]

for smi in smiles:
    mol = Chem.MolFromSmiles(smi)
    N = mol.GetNumAtoms()
    print(N)
    coords = np.zeros((N, 3))

    # This is the only conformer lookup, so the ValueError is raised here.
    coords_raw = [mol.GetConformer(0).GetAtomPosition(a) for a in range(N)]
    for atom in range(N):
        coords[atom, 0] = coords_raw[atom].x
        coords[atom, 1] = coords_raw[atom].y
        coords[atom, 2] = coords_raw[atom].z
    coords = [coords]

And the error is ValueError: Bad Conformer Id

rbharath · 2018-06-03T17:41:27Z

I think we deprecated pdbbind_core_df.csv.gz a while back. Try instead using featurize_pdbbind from dc.molnet.load_function.pdbbind_datasets to featurize an atomic conv dataset.
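
A note on the error itself: a molecule parsed from a SMILES string carries no 3D conformer, so `GetConformer(0)` has nothing to return and RDKit raises `ValueError: Bad Conformer Id`. The sketch below shows one way to get coordinates from a SMILES string by embedding a conformer first. The helper name `smiles_to_coords` and the seed are illustrative assumptions, and the embedded geometry is generated by RDKit rather than taken from a crystal structure, so it would not replace the PDBBind structure files for a protein-ligand tutorial.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

ANGSTROM_PER_BOHR = 0.52917721  # Bohr radius expressed in Angstrom


def smiles_to_coords(smiles, seed=42):
    """Return an (N, 3) array of embedded coordinates in Angstrom.

    Hypothetical helper for illustration only.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("could not parse SMILES: %r" % smiles)
    # At this point mol.GetNumConformers() == 0, which is why
    # mol.GetConformer(0) would raise "Bad Conformer Id".
    mol = Chem.AddHs(mol)
    conf_id = AllChem.EmbedMolecule(mol, randomSeed=seed)
    if conf_id < 0:
        raise RuntimeError("conformer embedding failed for %r" % smiles)
    return mol.GetConformer(conf_id).GetPositions()


coords_angstrom = smiles_to_coords("CCO")
# Angstrom -> Bohr is a division by the Bohr radius in Angstrom.
coords_bohr = coords_angstrom / ANGSTROM_PER_BOHR
```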

nitinprakash96 · 2018-06-11T03:57:16Z

Hi, the featurization part is okay now and I got it working. But I have a question about the same pdbbind_datasets module.
Suppose we run the following snippet:

split = "random"
subset = "full"
pdbbind_tasks, pdbbind_datasets, transformers = load_pdbbind_grid(
    split=split, subset=subset)
train_dataset, valid_dataset, test_dataset = pdbbind_datasets

It fails with "NoneType object is not iterable". Is this a bug, or am I missing something here?
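
For context, this kind of error typically shows up when a function returns None and the caller unpacks the result into several names. The sketch below reproduces the pattern with a hypothetical loader, `load_dataset`, whose branches do not all return a value. It is only an illustration of the failure mode; nothing in this thread confirms that load_pdbbind_grid is written this way.

```python
def load_dataset(featurizer="grid"):
    """Hypothetical loader in which only one branch returns a value."""
    if featurizer == "atomic":
        return ["task"], ("train", "valid", "test"), []
    # Any other featurizer falls through and implicitly returns None.


try:
    tasks, datasets, transformers = load_dataset(featurizer="grid")
except TypeError as err:
    # Unpacking None raises a TypeError that mentions NoneType.
    error_message = str(err)
```

Checking the return value before unpacking (for example, printing `load_dataset(featurizer="grid")` on its own line) is a quick way to tell this case apart from a genuine data problem.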

nitinprakash96 · 2018-06-11T05:18:00Z

Looking into it a little more: if we use the grid featurizer with load_pdbbind_grid, we hit the above error, while doing it manually works without a problem.

nitinprakash96 · 2018-06-14T03:01:39Z

@rbharath Is there a way to use TensorflowFragmentRegressor for ACNNs?

nitinprakash96 · 2018-06-26T16:03:44Z

@rbharath Can you please help me out here? The following example should train a simple ACNN model on the pdbbind dataset, but it throws the following error:

Exception in thread Thread-9:
Traceback (most recent call last):
  File "/home/nitin/anaconda3/envs/deepchem/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/home/nitin/anaconda3/envs/deepchem/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "/home/nitin/Documents/deepchem/deepchem/deepchem/models/tensorgraph/tensor_graph.py", line 1006, in _enqueue_batch
    for feed_dict in generator:
  File "/home/nitin/Documents/deepchem/deepchem/deepchem/models/tensorgraph/models/atomic_conv.py", line 238, in feed_dict_generator
    num_features = F_b[0][0].shape[1]
IndexError: tuple index out of range

Example code:

import deepchem as dc
from deepchem.molnet.load_function.pdbbind_datasets import featurize_pdbbind
from deepchem.models.tensorgraph.models.atomic_conv import atomic_conv_model

pdbbind_dataset, tasks = featurize_pdbbind(feat='grid', subset='core')

split = 'random'
splitters = {
    'index': dc.splits.IndexSplitter(),
    'random': dc.splits.RandomSplitter(),
    'time': dc.splits.TimeSplitterPDBbind(pdbbind_dataset.ids)
}
splitter = splitters[split]
train_dataset, valid_dataset, test_dataset = splitter.train_valid_test_split(pdbbind_dataset)
transformers = []

y_train = train_dataset.y

y_train *= -1 * 2.479 / 4.184
train_dataset = dc.data.DiskDataset.from_numpy(
    train_dataset.X,
    y_train,
    train_dataset.w,
    train_dataset.ids,
    tasks=tasks)

tg, feed_dict_generator, label = atomic_conv_model()
tg.fit_generator(feed_dict_generator(train_dataset, batch_size=24, epochs=10))
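
On the label scaling in the snippet above: the factor -1 * 2.479 / 4.184 looks like -RT expressed in kcal/mol, since RT is about 2.479 kJ/mol near room temperature and 4.184 kJ equals 1 kcal. That reading is an assumption; the thread does not state what units the raw labels are in. The arithmetic can be checked as follows, assuming T = 298.15 K.

```python
R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol*K)
T_KELVIN = 298.15                # assumed temperature
KJ_PER_KCAL = 4.184              # thermochemical calorie

rt_kj = R_KJ_PER_MOL_K * T_KELVIN   # RT in kJ/mol
rt_kcal = rt_kj / KJ_PER_KCAL       # RT in kcal/mol

# The scale factor exactly as written in the snippet above.
scale = -1 * 2.479 / 4.184
```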

rbharath · 2018-06-27T22:17:03Z

Not sure what's happening here. Will need to find some time to sit down and take a crack at this code. Hope to do so later this week.
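
One possibility worth checking in the meantime: the example featurizes with feat='grid', while the failing line, `F_b[0][0].shape[1]`, suggests that each sample is expected to be a sequence whose first element is a 2-D array. With a flat grid fingerprint per sample, `F_b[0][0]` would be a single number whose shape is the empty tuple, and indexing that shape at position 1 gives exactly this IndexError. The sketch below reproduces the pattern with made-up array sizes; it is a reading of the traceback, not a confirmed diagnosis.

```python
import numpy as np

# Made-up sizes: 4 samples, each a flat 16-element fingerprint.
X_grid = np.zeros((4, 16))
first = X_grid[0][0]  # a NumPy scalar; first.shape == ()

try:
    first.shape[1]
except IndexError as err:
    message = str(err)

# By contrast, a sample whose first element is a 2-D array indexes fine.
# The (5, 3) shape is an illustrative stand-in for per-atom coordinates.
sample = (np.zeros((5, 3)),)
num_features = sample[0].shape[1]
```

If this is the cause, featurizing with the atomic conv featurizer instead of the grid featurizer would be the thing to try before training.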

rbharath · 2023-04-06T21:39:20Z

Closing since implemented in #2431

rbharath added the Good First Contribution and Contribution Welcome labels Mar 24, 2018

rbharath mentioned this issue Mar 24, 2018

Problem loading Atomic conv (ACNN) datasets #856

Closed

ncfrey mentioned this issue Mar 9, 2021

[WIP] Tutorial 14 - Modeling Protein-Ligand Interactions with Atomic Convolutions #2431

Merged

1 task

rbharath closed this as completed Apr 6, 2023
