Tutorial for Running Atomic Convolutions (ACNNs) #1179

Closed
rbharath opened this issue Mar 24, 2018 · 15 comments

Comments

@rbharath
Member

We currently lack a tutorial for running atomic convolutions. Here's some of the material it would be good to cover in such a tutorial:

@nitinprakash96
Member

Can I work on this?

@rbharath
Member Author

@nitinprakash96 Yes please go for it! Could you open a "WIP" pull request to track your on-going progress?

@nitinprakash96
Member

#1180

@nitinprakash96
Member

Hi @rbharath, sorry, I've been caught up with exams and semester project reports, so I'm late getting to this. That's over now and I can start writing the tutorial.
One question first: I wrote up the structure of the ACNN as described in the paper (I thought it explained it best). For the featurization part, how exactly should the tutorial present it?

@rbharath
Member Author

Great to hear you can work on this again!

For the featurization part, I think it can be presented as a method of processing 3D structures to extract the neighbor information.
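As a rough illustration of what "extracting the neighbor information" means, here is a minimal NumPy sketch that builds a neighbor list from an N x 3 coordinate matrix. The function name `neighbor_list` and its `cutoff` and `max_neighbors` parameters are hypothetical and chosen for this example; this is not DeepChem's featurizer, just a brute-force version of the same idea.

```python
import numpy as np

def neighbor_list(coords, cutoff, max_neighbors):
    """For each atom, return the indices of up to `max_neighbors` closest
    atoms that lie within `cutoff` (same length unit as `coords`)."""
    coords = np.asarray(coords, dtype=float)
    n_atoms = coords.shape[0]
    # Pairwise distance matrix of shape (n_atoms, n_atoms).
    diffs = coords[:, None, :] - coords[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # An atom is never its own neighbor.
    np.fill_diagonal(dists, np.inf)
    neighbors = {}
    for i in range(n_atoms):
        order = np.argsort(dists[i])  # closest atoms first
        within = [int(j) for j in order if dists[i, j] <= cutoff]
        neighbors[i] = within[:max_neighbors]
    return neighbors

# Four atoms: three close together and one far away.
coords = np.array([[0.0, 0.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [0.0, 1.5, 0.0],
                   [10.0, 10.0, 10.0]])
nbrs = neighbor_list(coords, cutoff=2.0, max_neighbors=2)
print(nbrs)
```

The brute-force distance matrix scales quadratically with the number of atoms, which is fine for a tutorial-sized ligand but would likely need a cell list or similar structure for full protein-ligand complexes.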

@nitinprakash96
Member

I tried a few ways to do this. I tried constructing an Nx3 matrix of Cartesian coordinates and computing the neighbour list from it. But I keep running into ValueError: Bad Conformer Id whenever I try to convert the coordinates from Angstrom to Bohr.

Is there a way to tackle this? I'm using https://github.com/deepchem/deepchem/blob/master/deepchem/feat/atomic_coordinates.py as my reference.
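For reference, the unit conversion itself is a single scaling of the coordinate matrix. The sketch below assumes the coordinates are already available as an N x 3 NumPy array in Angstrom; the helper name `angstrom_to_bohr` is made up for this example and is not taken from the referenced module.

```python
import numpy as np

# 1 Bohr radius expressed in Angstrom (CODATA value, rounded).
ANGSTROM_PER_BOHR = 0.52917721

def angstrom_to_bohr(coords_angstrom):
    """Convert an (N, 3) array of Cartesian coordinates from Angstrom to Bohr."""
    return np.asarray(coords_angstrom, dtype=float) / ANGSTROM_PER_BOHR

coords_angstrom = np.array([[0.52917721, 0.0, 0.0],
                            [0.0, 1.05835442, 0.0]])
coords_bohr = angstrom_to_bohr(coords_angstrom)
print(coords_bohr)
```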

@rbharath
Member Author

@nitinprakash96 Can you post a minimal failing example? Will take a look

@nitinprakash96
Member

import numpy as np
import os
import tensorflow as tf
import deepchem as dc
from deepchem.feat import Featurizer
import rdkit

data_dir = os.path.join(dc.utils.get_data_dir())
dataset_file= os.path.join(dc.utils.get_data_dir(), "pdbbind_core_df.csv.gz")
raw_dataset = dc.utils.save.load_from_disk(dataset_file)
df = raw_dataset['smiles'][:20]

for i in range(20):
    #print(df[i])
    mol = rdkit.Chem.MolFromSmiles(df[i])
    #print(mol)
    N = mol.GetNumAtoms()
    print(N)
    coords = np.zeros((N, 3))
    #print(coords)
        
    coords_raw = [mol.GetConformer(0).GetAtomPosition(i) for i in range(N)]
    for atom in range(N):
        coords[atom, 0] = coords_raw[atom].x
        coords[atom, 1] = coords_raw[atom].y
        coords[atom, 2] = coords_raw[atom].z
    coords = [coords]

And the error is ValueError: Bad Conformer Id
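One likely cause, independent of the dataset file: a molecule built with MolFromSmiles carries only the 2D graph and has no conformers, so GetConformer(0) raises Bad Conformer Id. The sketch below shows one way to get coordinates from a SMILES string by embedding a 3D conformer with RDKit first. The helper name `smiles_to_coords` is hypothetical, and the embedded geometry is a generated conformer, not the bound pose from a PDBbind structure file.

```python
import numpy as np
from rdkit import Chem
from rdkit.Chem import AllChem

def smiles_to_coords(smiles, seed=0):
    """Return an (N, 3) array of coordinates in Angstrom for a SMILES string."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("Could not parse SMILES: %r" % smiles)
    # Add explicit hydrogens so the embedded geometry is chemically sensible.
    mol = Chem.AddHs(mol)
    # EmbedMolecule returns the new conformer id, or -1 if embedding failed.
    conf_id = AllChem.EmbedMolecule(mol, randomSeed=seed)
    if conf_id < 0:
        raise RuntimeError("3D embedding failed for %r" % smiles)
    return np.array(mol.GetConformer(conf_id).GetPositions())

# A freshly parsed molecule has no conformers, which triggers the error above.
n_conformers_before = Chem.MolFromSmiles("CCO").GetNumConformers()
coords = smiles_to_coords("CCO")
print(n_conformers_before, coords.shape)
```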

@rbharath
Member Author

rbharath commented Jun 3, 2018

I think we deprecated pdbbind_core_df.csv.gz a while back. Try using the function featurize_pdbbind from dc.molnet.load_function.pdbbind_datasets instead to featurize an atomic conv dataset.

@nitinprakash96
Member

nitinprakash96 commented Jun 11, 2018

Hi, the featurization part is okay and I got it working, but I have a question about the same pdbbind_datasets module.
Suppose we run the following snippet:

split = "random"
subset = "full"
pdbbind_tasks, pdbbind_datasets, transformers = load_pdbbind_grid(
    split=split, subset=subset)
train_dataset, valid_dataset, test_dataset = pdbbind_datasets

It fails with "NoneType object is not iterable". Is this a bug, or am I missing something here?

@nitinprakash96
Member

Looking into it a little more: if we use the grid featurizer through load_pdbbind_grid, we hit the above error, whereas doing the same steps manually works without problems.
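A "NoneType object is not iterable" error on a tuple-unpacking line usually means the called function returned None. The sketch below reproduces that pattern with a purely hypothetical loader, `load_dataset`, in which one branch forgets to return. Whether load_pdbbind_grid has this exact shape is an assumption that was not verified here.

```python
def load_dataset(featurizer):
    """Hypothetical loader: only the 'atomic' branch returns a result."""
    if featurizer == "atomic":
        return ["task"], ("train", "valid", "test"), []
    # Any other featurizer falls through, so Python returns None implicitly.

try:
    tasks, datasets, transformers = load_dataset("grid")
except TypeError as err:
    # The exact wording depends on the Python version, but it mentions NoneType.
    message = str(err)
    print("TypeError:", message)
```

Printing the raw return value of the loader before unpacking it is a quick way to confirm whether this is what is happening.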

@nitinprakash96
Member

@rbharath Is there a way to use TensorflowFragmentRegressor for ACNNs?

@nitinprakash96
Member

@rbharath can you please help me out here? The following example should train a simple ACNN model on the PDBbind dataset, but it throws the following error:

Exception in thread Thread-9:
Traceback (most recent call last):
  File "/home/nitin/anaconda3/envs/deepchem/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/home/nitin/anaconda3/envs/deepchem/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "/home/nitin/Documents/deepchem/deepchem/deepchem/models/tensorgraph/tensor_graph.py", line 1006, in _enqueue_batch
    for feed_dict in generator:
  File "/home/nitin/Documents/deepchem/deepchem/deepchem/models/tensorgraph/models/atomic_conv.py", line 238, in feed_dict_generator
    num_features = F_b[0][0].shape[1]
IndexError: tuple index out of range

Example code:

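# Imports assumed from context: the featurize_pdbbind path is the one given
# earlier in this thread; the atomic_conv module path is taken from the traceback.
import deepchem as dc
from deepchem.molnet.load_function.pdbbind_datasets import featurize_pdbbind
from deepchem.models.tensorgraph.models.atomic_conv import atomic_conv_model

# Featurize the PDBbind core set with the grid featurizer.
pdbbind_dataset, tasks = featurize_pdbbind(feat='grid', subset='core')

# Pick a splitter by name and split into train/valid/test.
split = 'random'
splitters = {
    'index': dc.splits.IndexSplitter(),
    'random': dc.splits.RandomSplitter(),
    'time': dc.splits.TimeSplitterPDBbind(pdbbind_dataset.ids)
}
splitter = splitters[split]
train_dataset, valid_dataset, test_dataset = splitter.train_valid_test_split(pdbbind_dataset)
transformers = []

# Rescale the training labels. The factor looks like -RT in kcal/mol
# (RT = 2.479 kJ/mol near 298 K, 4.184 kJ per kcal); that reading is an assumption.
y_train = train_dataset.y
y_train *= -1 * 2.479 / 4.184
train_dataset = dc.data.DiskDataset.from_numpy(
    train_dataset.X,
    y_train,
    train_dataset.w,
    train_dataset.ids,
    tasks=tasks)

# Build the atomic convolution model and train it from the generator.
tg, feed_dict_generator, label = atomic_conv_model()
tg.fit_generator(feed_dict_generator(train_dataset, batch_size=24, epochs=10))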
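The IndexError in the traceback comes from asking for `.shape[1]` on something with fewer than two axes. One way this can happen, assuming the grid featurizer yields one flat feature vector per complex while the atomic convolution generator expects per-atom arrays, is sketched below with plain NumPy. The array sizes are made up for illustration and this does not call DeepChem.

```python
import numpy as np

# Assumption: a grid-featurized batch is a 2-D array, one flat vector per complex.
# The sizes below (24 complexes, 2052 features) are invented for this example.
X_b = np.zeros((24, 2052))

# Same indexing pattern as F_b[0][0] in the traceback.
element = X_b[0][0]
print(element.shape)  # a NumPy scalar has an empty shape tuple

try:
    num_features = element.shape[1]
except IndexError as err:
    error_message = str(err)
    print("IndexError:", error_message)

# By contrast, an (n_atoms, n_features) array per fragment has a second axis.
per_atom = np.zeros((70, 3))
num_features = per_atom.shape[1]
print(num_features)
```

If that reading is right, the dataset would need to come from an atomic-convolution-style featurizer rather than `feat='grid'` before being passed to the generator.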

@rbharath
Member Author

Not sure what's happening here. Will need to find some time to sit down and take a crack at this code. Hope to do so later this week.

@rbharath
Member Author

rbharath commented Apr 6, 2023

Closing since implemented in #2431

rbharath closed this as completed on Apr 6, 2023