### Code in this notebook is based on Chapter 3 of Deep Learning for the Life Sciences
Building a model to predict molecule toxicity.

When using DeepChem, start with a new environment. Install DeepChem with conda (not pip), then install its dependencies. God willing, this code will still work the next time I run it.

In [2]:
import numpy as np
import deepchem as dc

DeepChem has a module named dc.molnet (molnet = MoleculeNet), which contains preprocessed datasets for ML.

In [3]:
# load tox21 toxicity dataset
tox21_tasks, tox21_datasets, transformers = dc.molnet.load_tox21()
training_dataset, validation_dataset, test_dataset = tox21_datasets

DeepChem's dc.models contains many different life science-specific models.

In [9]:
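# build the model: 12 Tox21 tasks, 1024 input features per molecule,
# and a single hidden layer of 10 units
model = dc.models.MultitaskClassifier(n_tasks=12, n_features=1024, layer_sizes=[10])
# train for 10 epochs
model.fit(training_dataset, nb_epoch=10)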

0.8749472935994466
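
To get a feel for what a model with these settings computes, here is a rough NumPy sketch of a forward pass: 1024 input features, one shared hidden layer of 10 units, and 12 tasks with 2 classes each. This is not DeepChem's implementation. The ReLU activation, the weight initialisation, the reshape into (batch, tasks, classes), and all names (`W1`, `W2`, `forward`) are assumptions made for illustration, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sizes taken from the model call above; n_classes=2 is an assumption (toxic / not toxic).
n_features, n_hidden, n_tasks, n_classes = 1024, 10, 12, 2

# Randomly initialised weights stand in for trained parameters (hypothetical values).
W1 = rng.normal(scale=0.02, size=(n_features, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.02, size=(n_hidden, n_tasks * n_classes))
b2 = np.zeros(n_tasks * n_classes)

def forward(X):
    """Return class probabilities with shape (batch, n_tasks, n_classes)."""
    hidden = np.maximum(X @ W1 + b1, 0.0)            # shared hidden layer with ReLU
    logits = (hidden @ W2 + b2).reshape(-1, n_tasks, n_classes)
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=-1, keepdims=True)     # softmax over classes, per task

# Hypothetical batch of 5 molecules encoded as 1024-bit vectors.
X = rng.integers(0, 2, size=(5, n_features)).astype(float)
probs = forward(X)
print(probs.shape)  # (5, 12, 2)
```

The point of the sketch is that every task shares the same hidden representation, and only the final layer is task-specific.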

In [10]:
# evaluate performance using ROC (receiver operating characteristic) AUC across all tasks
metric = dc.metrics.Metric(dc.metrics.roc_auc_score, np.mean)

# get scores using model.evaluate
training_scores = model.evaluate(training_dataset, [metric], transformers)
test_scores = model.evaluate(test_dataset, [metric], transformers)

print(f"Training {training_scores}")
print(f"Test {test_scores}")

Training {'mean-roc_auc_score': 0.8617081529130873}
Test {'mean-roc_auc_score': 0.6928166727159947}
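
For intuition about the metric, here is a minimal NumPy sketch of a mean ROC AUC across tasks: compute the AUC for each task separately, then average with np.mean. It uses the pairwise-ranking definition of AUC (the fraction of positive/negative pairs that are ordered correctly). It is a simplification, not what DeepChem does internally: it ignores sample weights and missing labels, and the helper names and toy data are made up for illustration.

```python
import numpy as np

def roc_auc(y_true, y_score):
    """ROC AUC for one binary task via pairwise ranking of positives vs negatives."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    # Count positive/negative pairs ranked correctly; ties count as half.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

def mean_roc_auc(y_true, y_score):
    """Average the per-task AUC over the columns (one column per task)."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    per_task = [roc_auc(y_true[:, t], y_score[:, t]) for t in range(y_true.shape[1])]
    return float(np.mean(per_task))

# Toy data: 4 molecules, 2 tasks (hypothetical values, not Tox21).
labels = np.array([[0, 1], [0, 0], [1, 1], [1, 0]])
scores = np.array([[0.1, 0.9], [0.4, 0.2], [0.35, 0.3], [0.8, 0.6]])
toy_auc = mean_roc_auc(labels, scores)
print(toy_auc)  # 0.75
```

A score of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is a useful frame for reading the training and test scores above.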


# MNIST Case Study
In this section, we'll create a new deep learning architecture.