[View in Colaboratory](https://colab.research.google.com/github/christianmerkwirth/colabs/blob/master/graph_convolutional_networks_for_tox21_on_colab.ipynb)

# Graph Convolutions For Tox21 on Google Colaboratory
In this Colab notebook, we first show how to install DeepChem on a Colab runtime running Python 2.7 or Python 3.6. We then explore the use of TensorGraph to create graph convolutional models with DeepChem on the Tox21 dataset.

Let's start with installing [DeepChem](https://deepchem.io).

In [0]:
%%bash
PYV=`python -c "import sys;t='{v[0]}.{v[1]}'.format(v=list(sys.version_info[:2]));sys.stdout.write(t)";`
echo "Python version $PYV detected"
if [ "$PYV" = "2.7" ]
then
  # Installing DeepChem for Python 2.7.
  apt-get install -y libxrender-dev
  apt-get install -y python-rdkit librdkit1 rdkit-data    # Install RDKit (-y avoids the confirmation prompt)

  pip install joblib simdna

  git clone https://github.com/deepchem/deepchem.git      # Clone deepchem source code from GitHub
  cd deepchem && python setup.py install  

  ls -la /usr/local/lib/python2.7/dist-packages/deepchem/
else
  # Installing DeepChem for Python 3.6 using MiniConda.
  wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O anaconda.sh;
  chmod +x anaconda.sh
  bash ./anaconda.sh -b -f -p /usr/local
  conda install -y --prefix /usr/local -c conda-forge rdkit joblib simdna

  git clone https://github.com/deepchem/deepchem.git      # Clone deepchem source code from GitHub
  cd deepchem && python setup.py install
  ls -la /usr/local/lib/python3.6/site-packages/deepchem
fi

In [0]:
# Last step of installing DeepChem under Python 3.6

import sys
if sys.version_info[0] >= 3:
    sys.path.append('/usr/local/lib/python3.6/site-packages/')
sys.path

Next, let's run some basic imports to check that the install was successful.

Note: Sometimes it is necessary to restart the runtime once after the initial install. After restarting, continue from the cell below.

In [0]:
from __future__ import division
from __future__ import print_function
from __future__ import unicode_literals

import numpy as np 
import tensorflow as tf
import deepchem as dc
from deepchem.models.tensorgraph.models.graph_models import GraphConvModel

Now, let's use MoleculeNet to load the Tox21 dataset. We need to process the data in a way that graph convolutional networks can use. For that, we set the featurizer option to 'GraphConv'. The MoleculeNet call returns a training set, a validation set, and a test set for us to use. The call also returns `transformers`, a list of data transformations that were applied to preprocess the dataset. (Most deep networks are quite finicky and require a set of data transformations to ensure that training proceeds stably.)

In [0]:
# Load Tox21 dataset
tox21_tasks, tox21_datasets, transformers = dc.molnet.load_tox21(featurizer='GraphConv')
train_dataset, valid_dataset, test_dataset = tox21_datasets
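
To make the idea of a data transformation concrete, here is a minimal NumPy sketch of one kind of transformation often applied to imbalanced classification data: per-task class balancing through sample weights. This is an illustration under assumptions, not DeepChem's implementation; the function name `balancing_weights` and the toy label matrix are hypothetical, and the text above does not state which transformations `load_tox21` actually applies.

```python
import numpy as np

def balancing_weights(y):
    """Return per-sample, per-task weights that equalize class totals.

    Hypothetical helper for illustration only. Negatives keep weight 1;
    positives in each task get weight n_neg / n_pos, so both classes
    contribute the same total weight to the loss.
    """
    y = np.asarray(y)
    w = np.ones(y.shape, dtype=float)
    for task in range(y.shape[1]):
        is_pos = y[:, task] == 1
        n_pos = int(is_pos.sum())
        n_neg = int((y[:, task] == 0).sum())
        if n_pos > 0:
            w[is_pos, task] = n_neg / n_pos
    return w

# Toy labels: 4 molecules, 2 tasks (made-up values, not Tox21 data).
toy_labels = np.array([[1, 0],
                       [0, 0],
                       [0, 1],
                       [0, 1]])
toy_weights = balancing_weights(toy_labels)
print(toy_weights)
```

In the first toy task there is one positive and three negatives, so the single positive is weighted three times as heavily as each negative; the second task is already balanced, so all its weights stay at 1.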

Let's now train a graph convolutional network on this dataset. DeepChem provides the class `GraphConvModel`, which wraps a standard graph convolutional architecture under the hood for user convenience. Let's instantiate an object of this class and train it on our dataset.

In [0]:
model = GraphConvModel(
    len(tox21_tasks), batch_size=50, mode='classification')
# Set nb_epoch=10 for better results.
model.fit(train_dataset, nb_epoch=1)
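
To give a feel for what a graph convolution computes, the sketch below implements one generic neighborhood-aggregation layer in NumPy: each atom combines its own features with the sum of its neighbors' features, followed by a ReLU. This is a simplified illustration under assumptions; the function `graph_conv_layer`, the toy molecule, and the weight shapes are hypothetical, and the layers inside `GraphConvModel` differ in detail.

```python
import numpy as np

def graph_conv_layer(atom_features, adjacency, weight_self, weight_neigh):
    """One generic graph convolution step (illustrative, not DeepChem's layer).

    atom_features: (n_atoms, n_in) feature matrix
    adjacency:     (n_atoms, n_atoms) 0/1 bond matrix without self-loops
    weight_self:   (n_in, n_out) weights applied to each atom's own features
    weight_neigh:  (n_in, n_out) weights applied to summed neighbor features
    """
    neighbor_sum = adjacency @ atom_features
    out = atom_features @ weight_self + neighbor_sum @ weight_neigh
    return np.maximum(out, 0.0)  # ReLU nonlinearity

# Toy molecule: three atoms bonded in a chain 0-1-2 (made-up features).
adjacency = np.array([[0, 1, 0],
                      [1, 0, 1],
                      [0, 1, 0]], dtype=float)
atom_features = np.array([[1.0, 0.0],
                          [0.0, 1.0],
                          [1.0, 1.0]])

rng = np.random.default_rng(0)
weight_self = rng.normal(size=(2, 4))
weight_neigh = rng.normal(size=(2, 4))

hidden = graph_conv_layer(atom_features, adjacency, weight_self, weight_neigh)
# Summing over atoms gives one fixed-size vector per molecule,
# which a classifier head could map to per-task predictions.
molecule_vector = hidden.sum(axis=0)
print(hidden.shape, molecule_vector.shape)
```

Stacking several such layers lets information travel further along the bond graph, since each layer mixes in features from atoms one more bond away.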

Let's now evaluate the performance of the model we've trained. For this, we need to define a metric, a measure of model performance. `dc.metrics` already holds a collection of metrics. For this dataset, it is standard to use the ROC-AUC score, the area under the receiver operating characteristic curve (which captures the tradeoff between the true positive rate and the false positive rate as the classification threshold varies). Luckily, the ROC-AUC score is already available in DeepChem.
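
For intuition, the ROC-AUC score equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that definition for a single task. It is a small illustration, not the implementation used by DeepChem; the helper name `pairwise_roc_auc` and the toy labels and scores are hypothetical.

```python
import numpy as np

def pairwise_roc_auc(y_true, y_score):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    Illustrative helper; it compares all pairs, so it is only suited to small inputs.
    """
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    if len(pos) == 0 or len(neg) == 0:
        raise ValueError("ROC-AUC needs at least one positive and one negative label")
    diff = pos[:, None] - neg[None, :]
    correct = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(correct / diff.size)

# Toy example: 3 of the 4 positive/negative pairs are ordered correctly.
toy_y = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = pairwise_roc_auc(toy_y, toy_scores)
print(toy_auc)  # 0.75
```

A score of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of positives above negatives.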

To measure the performance of the model under this metric, we can use the convenience function `model.evaluate()`.

In [0]:
metric = dc.metrics.Metric(
    dc.metrics.roc_auc_score, np.mean, mode="classification")

print("Evaluating model")
train_scores = model.evaluate(train_dataset, [metric], transformers)
print("Training ROC-AUC Score: %f" % train_scores["mean-roc_auc_score"])
valid_scores = model.evaluate(valid_dataset, [metric], transformers)
print("Validation ROC-AUC Score: %f" % valid_scores["mean-roc_auc_score"])