# Training SageMaker Models for Molecular Property Prediction Using DGL with PyTorch Backend

The **SageMaker Python SDK** makes it easy to train DGL models. In this example, we train a simple graph neural network for molecular toxicity prediction using [DGL](https://github.com/dmlc/dgl) and the Tox21 dataset.

The dataset contains qualitative toxicity measurements for 8014 compounds on 12 different targets, including nuclear
receptors and stress response pathways. Each target yields a binary classification problem. Because each compound can be represented as a graph, with atoms as nodes and bonds as edges, we can model the task as graph classification.
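
To make the problem setup concrete, here is a minimal, library-free sketch of how one compound can be viewed as a graph with 12 binary labels. The toy molecule, the function names `build_adjacency` and `masked_bce`, and the label mask are illustrative assumptions and are not taken from `main.py`. The mask assumes that some compounds have no measurement for some targets, so those targets are skipped when computing the loss.

```python
import math

# Hypothetical toy molecule: the heavy atoms of ethanol (C, C, O) and its two bonds.
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]


def build_adjacency(num_atoms, bonds):
    """Return an undirected adjacency list: atom index -> bonded atom indices."""
    adjacency = {i: [] for i in range(num_atoms)}
    for u, v in bonds:
        adjacency[u].append(v)
        adjacency[v].append(u)
    return adjacency


def masked_bce(logits, labels, mask):
    """Mean binary cross-entropy over the targets whose mask entry is 1."""
    total, count = 0.0, 0
    for z, y, m in zip(logits, labels, mask):
        if not m:
            continue  # no measurement for this target (assumption), so skip it
        p = 1.0 / (1.0 + math.exp(-z))  # sigmoid turns a logit into a probability
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        count += 1
    return total / count if count else 0.0


NUM_TARGETS = 12  # one binary label per toxicity target

adjacency = build_adjacency(len(atoms), bonds)
logits = [0.0] * NUM_TARGETS      # placeholder model outputs, one per target
labels = [0] * NUM_TARGETS        # placeholder binary labels
mask = [1] * 6 + [0] * 6          # hypothetical: only the first six targets were measured
loss = masked_bce(logits, labels, mask)
```

In an actual model, the logits would come from a graph neural network that aggregates atom features over `adjacency` and pools them into one vector per molecule.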

## Setup

We define a few variables that are needed later in the example.

In [None]:
import sagemaker
from sagemaker import get_execution_role

# Set up the SageMaker session
sess = sagemaker.Session()

# S3 bucket for saving code and model artifacts.
# Feel free to specify a different bucket here if you wish.
bucket = sess.default_bucket()

# Location to put your custom code.
custom_code_upload_location = 'customcode'

# IAM execution role that gives SageMaker access to resources in your AWS account.
# We can use the SageMaker Python SDK to get the role from our notebook environment. 
role = get_execution_role()
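
This part of the notebook does not show how `custom_code_upload_location` is consumed. One plausible reading, stated here as an assumption, is that it serves as a key prefix inside `bucket`. The sketch below shows how a bucket name and a prefix combine into an S3 URI; the helper `s3_uri` and the bucket name are hypothetical and are not part of the SageMaker SDK.

```python
def s3_uri(bucket, *parts):
    """Join a bucket name and key parts into an s3:// URI without duplicate slashes."""
    key = "/".join(p.strip("/") for p in parts if p.strip("/"))
    return f"s3://{bucket}/{key}" if key else f"s3://{bucket}"


example_bucket = "my-example-bucket"  # hypothetical; the notebook uses sess.default_bucket()
code_uri = s3_uri(example_bucket, "customcode")
```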

## Training Script

`main.py` provides all the code we need for training a SageMaker model.

In [None]:
!cat main.py
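
For orientation, SageMaker training scripts typically receive hyperparameters as command-line arguments and read the model output directory from the `SM_MODEL_DIR` environment variable. The following is a minimal sketch of that argument-handling pattern only. It is not the content of `main.py`, and the hyperparameter names and defaults are hypothetical.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse hyperparameters the way a SageMaker training script commonly does."""
    parser = argparse.ArgumentParser()
    # Hypothetical hyperparameters; the real script may define different ones.
    parser.add_argument("--epochs", type=int, default=10)
    parser.add_argument("--learning-rate", type=float, default=1e-3)
    # SageMaker sets SM_MODEL_DIR inside the training container;
    # fall back to a local directory when running outside SageMaker.
    parser.add_argument(
        "--model-dir", type=str, default=os.environ.get("SM_MODEL_DIR", "./model")
    )
    return parser.parse_args(argv)


# Simulate the arguments SageMaker would pass for hyperparameters={"epochs": 5}.
args = parse_args(["--epochs", "5"])
```

Accepting `argv` as a parameter keeps the function testable locally, since the same script can then be exercised without launching a training job.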