Tutorial for Running Atomic Convolutions (ACNNs) #1179
Can I work on this?
@nitinprakash96 Yes please go for it! Could you open a "WIP" pull request to track your ongoing progress?
Hi @rbharath, sorry, I've been caught up with exams and semester project reports, so I'm a little late getting to this. But that's over now and I can start writing this tutorial.
Great to hear you can work on this again! For the featurization part, I think it can be presented as a method of processing 3D structures to extract the neighbor information.
I tried a few ways to do this. I tried constructing an Nx3 matrix of Cartesian coordinates and computing the neighbour list from that. But I keep running into an error. Is there a way to tackle this? I'm using https://github.com/deepchem/deepchem/blob/master/deepchem/feat/atomic_coordinates.py as my reference.
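For readers following along, here is a minimal sketch of what "computing a neighbour list from an Nx3 coordinate matrix" can look like. It is a brute-force NumPy illustration, not DeepChem's implementation; the function name `neighbor_list`, the `cutoff` and `max_neighbors` parameters, and the toy coordinates are all assumptions made for this example.

```python
import numpy as np


def neighbor_list(coords, cutoff, max_neighbors=None):
    """Map each atom index to its neighbours within `cutoff`, closest first.

    Hypothetical helper for illustration only (not part of DeepChem).
    """
    coords = np.asarray(coords, dtype=float)
    n_atoms = len(coords)
    # Pairwise distance matrix of shape (N, N), built by broadcasting.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    neighbors = {}
    for i in range(n_atoms):
        # Atoms inside the cutoff, excluding atom i itself.
        mask = (dist[i] <= cutoff) & (np.arange(n_atoms) != i)
        idx = np.where(mask)[0]
        # Order the neighbours by increasing distance.
        idx = idx[np.argsort(dist[i, idx], kind="stable")]
        if max_neighbors is not None:
            idx = idx[:max_neighbors]
        neighbors[i] = idx.tolist()
    return neighbors


# Toy Nx3 coordinate matrix (made-up values, arbitrary length units).
coords = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [10.0, 10.0, 10.0],
])
nl = neighbor_list(coords, cutoff=2.5)
print(nl)
```

The all-pairs distance matrix costs O(N^2) memory, which is fine for small ligands but may be too slow for full protein-ligand complexes, where a cell-list or tree-based search is the usual alternative.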
@nitinprakash96 Can you post a minimal failing example? Will take a look.
import numpy as np
import os
import tensorflow as tf
import deepchem as dc
from deepchem.feat import Featurizer
import rdkit
data_dir = os.path.join(dc.utils.get_data_dir())
dataset_file= os.path.join(dc.utils.get_data_dir(), "pdbbind_core_df.csv.gz")
raw_dataset = dc.utils.save.load_from_disk(dataset_file)
df = raw_dataset['smiles'][:20]
for i in range(20):
    #print(df[i])
    mol = rdkit.Chem.MolFromSmiles(df[i])
    #print(mol)
    N = mol.GetNumAtoms()
    print(N)
    coords = np.zeros((N, 3))
    #print(coords)
    coords_raw = [mol.GetConformer(0).GetAtomPosition(i) for i in range(N)]
    for atom in range(N):
        coords[atom, 0] = coords_raw[atom].x
        coords[atom, 1] = coords_raw[atom].y
        coords[atom, 2] = coords_raw[atom].z
    coords = [coords]

And the error is
I think we deprecated
Hi, so the featurization part is okay and I got it working. But in the same
split = "random"
subset = "full"
pdbbind_tasks, pdbbind_datasets, transformers = load_pdbbind_grid(
    split=split, subset=subset)
train_dataset, valid_dataset, test_dataset = pdbbind_datasets

It returns
So looking into it a little more, if we use
@rbharath Is there a way to use
@rbharath can you please help me out here? The following example should train a simple ACNN model on the PDBbind dataset, but it throws the following error:
Exception in thread Thread-9:
Traceback (most recent call last):
  File "/home/nitin/anaconda3/envs/deepchem/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/home/nitin/anaconda3/envs/deepchem/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "/home/nitin/Documents/deepchem/deepchem/deepchem/models/tensorgraph/tensor_graph.py", line 1006, in _enqueue_batch
    for feed_dict in generator:
  File "/home/nitin/Documents/deepchem/deepchem/deepchem/models/tensorgraph/models/atomic_conv.py", line 238, in feed_dict_generator
    num_features = F_b[0][0].shape[1]
IndexError: tuple index out of range

Example code:
pdbbind_dataset, tasks = featurize_pdbbind(feat='grid', subset='core')
split = 'random'
splitters = {
    'index': dc.splits.IndexSplitter(),
    'random': dc.splits.RandomSplitter(),
    'time': dc.splits.TimeSplitterPDBbind(pdbbind_dataset.ids)
}
splitter = splitters[split]
train_dataset, valid_dataset, test_dataset = splitter.train_valid_test_split(pdbbind_dataset)
transformers = []
y_train = train_dataset.y
y_train *= -1 * 2.479 / 4.184
train_dataset = dc.data.DiskDataset.from_numpy(
    train_dataset.X,
    y_train,
    train_dataset.w,
    train_dataset.ids,
    tasks=tasks)
tg, feed_dict_generator, label = atomic_conv_model()
tg.fit_generator(feed_dict_generator(train_dataset, batch_size=24, epochs=10))
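A note on the label scaling `y_train *= -1 * 2.479 / 4.184` in the example: the thread does not say where the constants come from. One plausible reading, offered here as an assumption, is that 2.479 is RT in kJ/mol near room temperature and 4.184 is the number of kJ in one kcal, so the factor rescales the labels into kcal/mol with a sign flip. The sketch below only checks that arithmetic; the temperature of 298.15 K and the sample labels are assumptions.

```python
# Gas constant in kJ/(mol*K) and an assumed room temperature.
R_KJ_PER_MOL_K = 8.314462618e-3
T_KELVIN = 298.15  # assumption: the thread does not state a temperature
KJ_PER_KCAL = 4.184

# RT in kJ/mol; this comes out close to the 2.479 used in the example.
rt_kj = R_KJ_PER_MOL_K * T_KELVIN

# The factor exactly as written in the thread's example code.
scale = -1 * 2.479 / 4.184

# Made-up labels, scaled the same way as y_train in the example.
labels = [1.0, 2.0, 5.0]
scaled = [value * scale for value in labels]

print(round(rt_kj, 3), round(scale, 4))
```

If this reading is right, a label of 1.0 maps to roughly -0.59 kcal/mol, so larger original labels become more negative after scaling.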
Not sure what's happening here. Will need to find some time to sit down and take a crack at this code. Hope to do so later this week.
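One way to read the traceback, offered as a hypothesis rather than a confirmed diagnosis: the failing line `F_b[0][0].shape[1]` only works when entry `[0][0]` of the batch is an array with at least two dimensions. If the features are flat vectors, one per complex (which `feat='grid'` may produce), then `F_b[0][0]` is a single NumPy scalar whose shape is the empty tuple, and indexing that tuple raises exactly this `IndexError`. The sketch below reproduces the message with made-up array sizes; the nested layout at the end is likewise only an illustration.

```python
import numpy as np

# Hypothetical batch of flat features: 24 complexes, 8 features each.
# Both sizes are made up for illustration.
F_b = np.zeros((24, 8))

entry = F_b[0][0]   # a NumPy scalar, not a per-atom array
print(entry.shape)  # ()

try:
    num_features = entry.shape[1]
except IndexError as err:
    message = str(err)
    print("IndexError:", message)

# Illustrative nested layout in which the same indexing succeeds:
# entry [0][0] is a 2-D array, here 5 atoms by 3 coordinates.
nested = [(np.zeros((5, 3)),)]
print(nested[0][0].shape[1])
```

Printing `type(train_dataset.X[0])` and the shape of its first element before calling the generator would show which of the two layouts the dataset actually has.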
Closing since implemented in #2431
We currently lack a tutorial for running atomic convolutions. Here's some of the material it would be good to cover in such a tutorial: