GraphConvModel compute_saliency #1706

Open
abanerjee10 opened this issue Nov 4, 2019 · 1 comment

@abanerjee10 abanerjee10 commented Nov 4, 2019

Environment
Platform: Linux64
DeepChem=2.3
numpy=1.15.4
tensorflow=1.14.0
Installed with conda

Hello, I followed the tutorial at https://github.com/deepchem/deepchem/blob/master/examples/notebooks/graph_convolutional_networks_for_tox21.ipynb to construct a graphconv model as shown below.

import deepchem as dc
from deepchem.models.graph_models import GraphConvModel
# Load Tox21 dataset
tox21_tasks, tox21_datasets, transformers = dc.molnet.load_tox21(featurizer='GraphConv')
train_dataset, valid_dataset, test_dataset = tox21_datasets
n_tasks = len(tox21_tasks)
model = GraphConvModel(n_tasks, batch_size=50, mode='classification')
num_epochs = 10
losses = []
for i in range(num_epochs):
    loss = model.fit(train_dataset, nb_epoch=1)
    print("Epoch %d loss: %f" % (i, loss))
    losses.append(loss)

The Issue
After doing this, I would like to find the gradients of the model with respect to its inputs. I thought this would be supported by the compute_saliency method. I cannot find any documentation or examples indicating how to do this.

Minimum Failing Examples
I have tried

model.compute_saliency(train_dataset)

with AttributeError: 'DiskDataset' object has no attribute 'shape'.
I have also tried

model.compute_saliency(train_dataset.X)

with TypeError: float() argument must be a string or a number, not 'ConvMol'.
Based on the source code I found for the model, as well as the TensorGraphTutorial, I also tried

from deepchem.feat.mol_graphs import ConvMol
import numpy as np
for ind, (X_b, y_b, w_b, ids_b) in enumerate(
        train_dataset.iterbatches(
            batch_size=1, pad_batches=True, deterministic=True)):
    multiConvMol = ConvMol.agglomerate_mols(X_b)
    inputs = [multiConvMol.get_atom_features(), multiConvMol.deg_slice, np.array(multiConvMol.membership)]
    for i in range(1, len(multiConvMol.get_deg_adjacency_lists())):
        inputs.append(multiConvMol.get_deg_adjacency_lists()[i])
model.compute_saliency(inputs[0])

with TypeError: Failed to convert object of type <class 'list'> to Tensor. Contents: [<tensorflow.python.framework.ops.IndexedSlices object at 0x7fbe46f66f28>]. Consider casting elements to a supported type.

I am out of ideas. I have the same problems when trying to make predictions with the model. What should the model's input look like?

Thank you!

@peastman peastman commented Nov 5, 2019

compute_saliency() doesn't work with GraphConvModel. It's designed for models that take a simple numpy array as input and produce one or more arrays as output. It computes the gradient of each output element with respect to each input element. But GraphConvModel takes a ConvMol object as input, then implements default_generator() to construct a bunch of arrays describing the molecule in a somewhat complicated way. To make compute_saliency() work with that, we would somehow have to reverse that process and compute gradients with respect to... I'm not really sure what.
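
To illustrate what "the gradient of each output element with respect to each input element" means for a model that takes a plain array, here is a minimal sketch. It is not DeepChem code: `toy_model` and `saliency` are hypothetical names made up for this example, and the gradients are approximated with central finite differences rather than the TensorFlow gradients a real model would use.

```python
import math


def toy_model(x):
    """Hypothetical stand-in for a model: flat input of 3 values -> 2 outputs."""
    return [x[0] * x[1] + x[2], math.tanh(x[0]) + 2.0 * x[2]]


def saliency(f, x, eps=1e-6):
    """Approximate the Jacobian J[i][j] = d f(x)[i] / d x[j] by central differences."""
    n_out = len(f(x))
    jac = [[0.0] * len(x) for _ in range(n_out)]
    for j in range(len(x)):
        # Perturb one input element at a time, up and down by eps.
        x_hi = list(x)
        x_hi[j] += eps
        x_lo = list(x)
        x_lo[j] -= eps
        f_hi, f_lo = f(x_hi), f(x_lo)
        for i in range(n_out):
            jac[i][j] = (f_hi[i] - f_lo[i]) / (2 * eps)
    return jac


x = [1.0, 2.0, 3.0]
J = saliency(toy_model, x)  # 2 rows (outputs) by 3 columns (inputs)
for row in J:
    print(row)
```

The result has one row per output and one column per input element. This only works because the input is a single flat array of numbers; a ConvMol is turned into several arrays (atom features, degree slices, membership, adjacency lists), most of them integer indices, so there is no single array of this kind to differentiate with respect to.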
