## Sixth Session (Related to the Course Project)

---------------

## Graph Regression with [Deep Graph Library (DGL)](https://docs.dgl.ai/index.html) for the graduate course "[Graph Machine learning](https://github.com/zahta/graph_ml)"

### Dataset: PDBbind-C

##### by [Zahra Taheri](https://github.com/zahta), 06 June 2023

---------------

### This Tutorial Is Prepared Based on the Following References

- [FunQG: Molecular Representation Learning via Quotient Graphs](https://pubs.acs.org/doi/10.1021/acs.jcim.3c00445)
- [Supporting Information of FunQG](https://pubs.acs.org/doi/suppl/10.1021/acs.jcim.3c00445/suppl_file/ci3c00445_si_001.pdf)
- [GitHub Repository of FunQG](https://github.com/hhaji/funqg)

In [1]:
pip install  dgl -f https://data.dgl.ai/wheels/repo.html

Looking in links: https://data.dgl.ai/wheels/repo.html
Collecting dgl
  Downloading dgl-1.1.1-cp310-cp310-manylinux1_x86_64.whl (6.3 MB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m6.3/6.3 MB[0m [31m82.2 MB/s[0m eta [36m0:00:00[0m
Installing collected packages: dgl
Successfully installed dgl-1.1.1


In [None]:
#pip install  dgl -f https://data.dgl.ai/wheels/cu113/repo.html

In [None]:
#pip install  dglgo -f https://data.dgl.ai/wheels-test/repo.html

In [2]:
pip install  dglgo -f https://data.dgl.ai/wheels-test/repo.html

Looking in links: https://data.dgl.ai/wheels-test/repo.html
Collecting dglgo
  Downloading dglgo-0.0.2-py3-none-any.whl (63 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m63.5/63.5 kB[0m [31m6.5 MB/s[0m eta [36m0:00:00[0m
Collecting isort>=5.10.1 (from dglgo)
  Downloading isort-5.12.0-py3-none-any.whl (91 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m91.2/91.2 kB[0m [31m9.2 MB/s[0m eta [36m0:00:00[0m
[?25hCollecting autopep8>=1.6.0 (from dglgo)
  Downloading autopep8-2.0.2-py2.py3-none-any.whl (45 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m45.2/45.2 kB[0m [31m5.7 MB/s[0m eta [36m0:00:00[0m
[?25hCollecting numpydoc>=1.1.0 (from dglgo)
  Downloading numpydoc-1.5.0-py3-none-any.whl (52 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m52.4/52.4 kB[0m [31m6.8 MB/s[0m eta [36m0:00:00[0m
Collecting ruamel.yaml>=0.17.20 (from dglgo)
  Downloading ruamel.yaml-0.17.32-py3-none-any.whl (1

In [3]:
%matplotlib inline
import os

os.environ["DGLBACKEND"] = "pytorch"
import dgl
import numpy as np
import networkx as nx
import torch
import torch.nn as nn
import dgl.function as fn
import torch.nn.functional as F
import shutil
from torch.utils.data import DataLoader
import cloudpickle
from dgl.nn import GraphConv
from dgl.nn import GINConv
from dgl.nn import SAGEConv
from dgl.nn import GATConv
from sklearn.preprocessing import StandardScaler
from sklearn import preprocessing

#### Set Path

In [4]:
!pip install unzip

Collecting unzip
  Downloading unzip-1.0.0.tar.gz (704 bytes)
  Preparing metadata (setup.py) ... [?25l[?25hdone
Building wheels for collected packages: unzip
  Building wheel for unzip (setup.py) ... [?25l[?25hdone
  Created wheel for unzip: filename=unzip-1.0.0-py3-none-any.whl size=1279 sha256=576716018462b8b9227b0e5acbc1d8cdf9068d10bf67f0949b3522c042490a45
  Stored in directory: /root/.cache/pip/wheels/80/dc/7a/f8af45bc239e7933509183f038ea8d46f3610aab82b35369f4
Successfully built unzip
Installing collected packages: unzip
Successfully installed unzip-1.0.0


In [5]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [6]:
!unzip /content/drive/MyDrive/graph_data0.zip

Archive:  /content/drive/MyDrive/graph_data0.zip
  inflating: scaffold_0_smiles_train.pickle  
  inflating: scaffold_0_test.bin     
  inflating: scaffold_0_val.bin      
  inflating: scaffold_0_smiles_val.pickle  
  inflating: scaffold_0_smiles_test.pickle  
  inflating: scaffold_0_train.bin    


In [7]:
# Define the path to the current directory where the ZIP file is located.
current_dir = "/content/drive/MyDrive/graph_data0.zip"

# Create the path to the directory where model checkpoints will be saved.
checkpoint_path = current_dir + "save_models/model_checkpoints/" + "checkpoint"
os.makedirs(checkpoint_path, exist_ok=True)

# Define the path to the directory where the best model will be saved.
best_model_path = current_dir + "save_models/best_model/"

# Create a temporary folder path for data manipulation.
folder_data_temp = current_dir + "data_temp/"

# Remove the temporary folder if it exists, ignoring any errors if it does not.
shutil.rmtree(folder_data_temp, ignore_errors=True)

# Define the path to save the unpacked files from the ZIP archive.
path_save = current_dir

# Unpack the contents of the ZIP archive to the temporary folder.
shutil.unpack_archive(path_save, folder_data_temp)


#### Custom PyTorch Datasets

In [8]:
# Define a custom dataset class for regression datasets.
""" Regression Dataset """
class DGLDatasetReg(torch.utils.data.Dataset):
    def __init__(self, address, transform=None, train=True, scaler=None , scaler_regression=None):
        # Initialize the dataset with relevant parameters.
        self.train = train
        self.scaler = scaler

        # Load the graphs from the given binary file.
        self.data_set, train_labels_masks_globals = dgl.load_graphs(address+".bin")

        # Extract labels, masks, and global features for each graph.
        num_graphs = len(self.data_set)
        self.labels = train_labels_masks_globals["labels"].view(num_graphs,-1)
        self.masks = train_labels_masks_globals["masks"].view(num_graphs,-1)
        self.globals = train_labels_masks_globals["globals"].view(num_graphs,-1)

        # Store the data transformation function (if provided).
        self.transform = transform
        self.scaler_regression = scaler_regression

    def scaler_method(self):
        # Create and fit a standard scaler to normalize the labels during training.
        if self.train:
            scaler = preprocessing.StandardScaler().fit(self.labels)
            return scaler
        else:
            return None

    def __len__(self):
        # Return the number of graphs in the dataset.
        return len(self.data_set)

    def __getitem__(self, idx):
        if self.scaler_regression:
            # Apply the scaler to the labels if specified.
            """ With Scaler """
            return self.data_set[idx], torch.tensor(self.scaler.transform(self.labels)[idx]).float(), self.masks[idx], self.globals[idx]
        else:
            # Return the data without applying the scaler.
            """ Without Scaler """
            return self.data_set[idx], self.labels[idx].float(), self.masks[idx], self.globals[idx]


#### Defining Train, Validation, and Test Set

In [9]:
# Create an instance of the StandardScaler class for label scaling.
scaler = StandardScaler()

# Define the path to the temporary data folder with a specific scaffold and index.
path_data_temp = folder_data_temp + "scaffold" + "_" + str(0)

# Create the training dataset using the DGLDatasetReg class and the specified data address.
train_set = DGLDatasetReg(address=path_data_temp + "_train")

# Fit the scaler using the transformed labels from the training set.
scaler.fit(train_set.scaler_method().transform(train_set.labels))

# Create the validation dataset using the DGLDatasetReg class, the specified data address, and the scaler.
val_set = DGLDatasetReg(address=path_data_temp + "_val", scaler=scaler)

# Create the test dataset using the DGLDatasetReg class, the specified data address, and the scaler.
test_set = DGLDatasetReg(address=path_data_temp + "_test", scaler=scaler)

# Print the lengths of the training, validation, and test sets.
print(len(train_set), len(val_set), len(test_set))


134 16 18


#### Data Loader

In [10]:
# Define a collate function to process a batch of data samples.
def collate(batch):
    # Extract the graphs from the batch and create a batched graph using dgl.batch.
    graphs = [e[0] for e in batch]
    g = dgl.batch(graphs)

    # Extract the labels from the batch and stack them into a tensor.
    labels = [e[1] for e in batch]
    labels = torch.stack(labels, 0)

    # Extract the masks from the batch and stack them into a tensor.
    masks = [e[2] for e in batch]
    masks = torch.stack(masks, 0)

    # Extract the global features from the batch and stack them into a tensor.
    globals = [e[3] for e in batch]
    globals = torch.stack(globals, 0)

    # Return the batched graph, labels, masks, and globals.
    return g, labels, masks, globals


# Define a loader function to create data loaders for the training, validation, and test sets.
def loader(batch_size=64):
    # Create a data loader for the training set.
    train_dataloader = DataLoader(train_set,
                                  batch_size=batch_size,
                                  collate_fn=collate,
                                  drop_last=False,
                                  shuffle=True,
                                  num_workers=1)

    # Create a data loader for the validation set.
    val_dataloader = DataLoader(val_set,
                                batch_size=batch_size,
                                collate_fn=collate,
                                drop_last=False,
                                shuffle=False,
                                num_workers=1)

    # Create a data loader for the test set.
    test_dataloader = DataLoader(test_set,
                                 batch_size=batch_size,
                                 collate_fn=collate,
                                 drop_last=False,
                                 shuffle=False,
                                 num_workers=1)

    # Return the data loaders for training, validation, and test sets.
    return train_dataloader, val_dataloader, test_dataloader


In [11]:
# Create data loaders for the training, validation, and test sets with a batch size of 64.
train_dataloader, val_dataloader, test_dataloader = loader(batch_size=64)

#### Defining A GNN

##### Some Variables

In [12]:
#Bace dataset has 1 task. Some other datasets may have some more number of tasks, e.g., tox21 has 12 tasks.
num_tasks = 1

# Size of global feature of each graph
global_size = 200

# Number of epochs to train the model
num_epochs = 100

# Number of steps to wait if the model performance on the validation set does not improve
patience = 10

#Configurations to instantiate the model
config = {"node_feature_size":127, "edge_feature_size":12, "hidden_size":100}


In [13]:
#Define a GNN (Graph Neural Network) class as a subclass of nn.Module
class GNN(nn.Module):
    def __init__(self, config, global_size = 200, num_tasks = 1):
        super().__init__()
        self.config = config
        self.num_tasks = num_tasks

        # Node feature size
        self.node_feature_size = self.config.get('node_feature_size', 127)

        # Edge feature size
        self.edge_feature_size = self.config.get('edge_feature_size', 12)

        # Hidden size
        self.hidden_size = self.config.get('hidden_size', 100)

        # Create the first GraphConv layer with input size equal to the node feature size and output size equal to the hidden size.
        self.conv1 = GraphConv(self.node_feature_size, self.hidden_size)

        # Create the second GraphConv layer with input size equal to the hidden size and output size equal to the number of tasks
        self.conv2 = GraphConv(self.hidden_size, self.num_tasks)

    # def forward(self, g, in_feat):
    def forward(self, mol_dgl_graph, globals):
        mol_dgl_graph.ndata["v"]= mol_dgl_graph.ndata["v"][:,:self.node_feature_size]
        mol_dgl_graph.edata["e"] = mol_dgl_graph.edata["e"][:,:self.edge_feature_size]
        h = self.conv1(mol_dgl_graph, mol_dgl_graph.ndata["v"])
        h = F.relu(h)
        h = self.conv2(mol_dgl_graph, h)
        mol_dgl_graph.ndata["h"] = h
        return dgl.mean_nodes(mol_dgl_graph, "h")

#### Function to Compute Score of the Model

In [14]:
import math
from sklearn.metrics import mean_squared_error
def compute_score(model, data_loader, val_size, num_tasks):
  model.eval()
  loss_sum = nn.MSELoss(reduction='sum') # MSE with sum instead of mean, i.e., sum_i[(y_i)^2-(y'_i)^2]
  final_loss = 0
  state = torch.get_rng_state()
  with torch.no_grad():
            for i, (mol_dgl_graph, labels, masks, globals) in enumerate(data_loader):
                prediction = model(mol_dgl_graph, globals)
                prediction = torch.tensor(scaler.inverse_transform(prediction.detach().cpu()))
                labels = torch.tensor(scaler.inverse_transform(labels.cpu()))
                loss = loss_sum(prediction, labels)
                final_loss += loss.item()
            final_loss /= val_size
            final_loss = math.sqrt(final_loss)
  return final_loss / num_tasks

In [15]:
# Import the required modules.
import math
from sklearn.metrics import mean_squared_error

# Define a function to compute the score using the given model, data loader, validation size, and number of tasks.
def compute_score(model, data_loader, val_size, num_tasks):
    # Set the model to evaluation mode.
    model.eval()

    # Define the loss function as Mean Squared Error (MSE) with sum reduction.
    loss_sum = nn.MSELoss(reduction='sum')

    # Initialize the final loss variable.
    final_loss = 0

    # Get the current random number generator state.
    state = torch.get_rng_state()

    # Disable gradient calculation since we are in evaluation mode.
    with torch.no_grad():
        # Iterate over the data loader.
        for i, (mol_dgl_graph, labels, masks, globals) in enumerate(data_loader):
            # Make predictions using the model.
            prediction = model(mol_dgl_graph, globals)

            # Inverse transform the predictions using the scaler.
            prediction = torch.tensor(scaler.inverse_transform(prediction.detach().cpu()))

            # Inverse transform the labels using the scaler.
            labels = torch.tensor(scaler.inverse_transform(labels.cpu()))

            # Compute the loss between the predictions and labels.
            loss = loss_sum(prediction, labels)

            # Accumulate the loss.
            final_loss += loss.item()

        # Compute the average loss.
        final_loss /= val_size

        # Take the square root of the average loss to obtain the final score.
        final_loss = math.sqrt(final_loss) #RMSE

    # Return the final score divided by the number of tasks.
    return final_loss / num_tasks


#### Loss Function

In [16]:
# Define a function to compute the loss using the given output, label, mask, and number of tasks.
def loss_func(output, label, mask, num_tasks):
    # Create a tensor of ones with shape (1, num_tasks) as the positive weight.
    pos_weight = torch.ones((1, num_tasks))

    # Create a criterion using Mean Squared Error (MSE) loss with no reduction.
    criterion = nn.MSELoss(reduction='none')

    # Compute the element-wise loss by multiplying the mask with the criterion output.
    loss = mask * criterion(output, label)

    # Compute the average loss by summing the masked losses and dividing by the sum of the mask.
    loss = loss.sum() / mask.sum()

    # Return the computed loss.
    return loss


#### Training and Evaluation

##### Training Function

In [17]:
# Define a function to train a single epoch using the given training data loader, model, and optimizer.
def train_epoch(train_dataloader, model, optimizer):
    # Initialize the epoch train loss and iterations.
    epoch_train_loss = 0
    iterations = 0

    # Set the model to train mode.
    model.train()

    # Iterate over the training data loader.
    for i, (mol_dgl_graph, labels, masks, globals) in enumerate(train_dataloader):
        # Make predictions using the model.
        prediction = model(mol_dgl_graph, globals)

        # Compute the training loss using the loss function.
        loss_train = loss_func(prediction, labels, masks, num_tasks)

        # Zero the gradients of the model parameters.
        optimizer.zero_grad(set_to_none=True)

        # Perform backpropagation to compute gradients.
        loss_train.backward()

        # Update the model parameters using the optimizer.
        optimizer.step()

        # Accumulate the training loss.
        epoch_train_loss += loss_train.detach().item()

        # Increment the iterations count.
        iterations += 1

    # Compute the average epoch train loss.
    epoch_train_loss /= iterations

    # Return the average epoch train loss.
    return epoch_train_loss


In [45]:
# Define a function to train and evaluate the model.
def train_evaluate():
    # Create a new instance of the GNN model with the given configuration.
    model = GNN(config, global_size, num_tasks)

    # Create an Adam optimizer for training the model.
    optimizer = torch.optim.Adam(model.parameters(), lr=0.0001)

    # Initialize variables for tracking the best validation score and patience count.
    best_val = float('inf')
    patience_count = 1
    epoch = 1

    # Continue training until reaching the maximum number of epochs.
    while epoch <= num_epochs:
        # Check if the patience count is within the allowed limit.
        if patience_count <= patience:
            # Set the model to train mode and compute the training loss for the current epoch.
            model.train()
            loss_train = train_epoch(train_dataloader, model, optimizer)

            # Set the model to eval mode and compute the validation score.
            model.eval()
            score_val = compute_score(model, val_dataloader, len(val_set), num_tasks)

            # Check if the current validation score is better than the best validation score so far.
            if score_val > best_val:
                best_val = score_val
                print("Save checkpoint")
                path = os.path.join(checkpoint_path, 'checkpoint.pth')

                # Create a dictionary to store the checkpoint information.
                dict_checkpoint = {"score_val": score_val}
                dict_checkpoint.update({"model_state_dict": model.state_dict(), "optimizer_state": optimizer.state_dict()})

                # Save the checkpoint to a file using cloudpickle.
                with open(path, "wb") as outputfile:
                    cloudpickle.dump(dict_checkpoint, outputfile)

                patience_count = 1
            else:
                print("Patience", patience_count)
                patience_count += 1

            # Print the training and validation scores for the current epoch.
            print("Epoch: {}/{} | Training Loss: {:.3f} | Valid Score: {:.3f}".format(epoch, num_epochs, loss_train, score_val))
            print(" ")
            print("Epoch: {}/{} | Best Valid Score Until Now: {:.3f}".format(epoch, num_epochs, best_val), "\n")

        epoch += 1

    # Save the best model by copying the checkpoint directory.
    shutil.rmtree(best_model_path, ignore_errors=True)
    shutil.copytree(checkpoint_path, best_model_path)

    # Print the final results.
    print("Final results:")
    print("Average Valid Score: {:.3f}".format(np.mean(best_val)), "\n")


##### Function to compute test set score of the final saved model

##### Train the model and evaluate its performance

In [46]:
def test_evaluate():
    # Create the final model
    final_model = GNN(config, global_size, num_tasks)

    # Set the path to the best model checkpoint file
    path = os.path.join(best_model_path, 'checkpoint.pth')

    # Open the best model checkpoint file and load it
    with open(path, 'rb') as f:
        checkpoint = cloudpickle.load(f)

    # Load the state dictionary of the best model
    final_model.load_state_dict(checkpoint["model_state_dict"])

    # Set the final model to evaluation mode
    final_model.eval()

    # Compute the test score
    test_score = compute_score(final_model, test_dataloader, len(test_set), num_tasks)

    # Print the test score
    print("Test Score: {:.3f}".format(test_score), "\n")

    # Print the execution time
    print("Execution time: {:.3f} seconds".format(time.time() - start_time))


In [47]:
#This line imports the time module, which provides various time-related functions
import time
#This line records the current time using time.time() and assigns it to the variable start_time. It serves as the starting point for measuring the execution time
start_time = time.time()
#This line calls the train_evaluate() function, which is likely responsible for training and evaluating a model.
train_evaluate()
#This line calls the test_evaluate() function, which probably performs evaluation on a separate test dataset.
test_evaluate()

  return F.mse_loss(input, target, reduction=self.reduction)
  return F.mse_loss(input, target, reduction=self.reduction)


Patience 1
Epoch: 1/100 | Training Loss: 7231.685 | Valid Score: 58.796
 
Epoch: 1/100 | Best Valid Score Until Now: inf 



  return F.mse_loss(input, target, reduction=self.reduction)


Patience 2
Epoch: 2/100 | Training Loss: 7166.269 | Valid Score: 58.571
 
Epoch: 2/100 | Best Valid Score Until Now: inf 

Patience 3
Epoch: 3/100 | Training Loss: 7102.994 | Valid Score: 58.351
 
Epoch: 3/100 | Best Valid Score Until Now: inf 

Patience 4
Epoch: 4/100 | Training Loss: 7042.010 | Valid Score: 58.133
 
Epoch: 4/100 | Best Valid Score Until Now: inf 

Patience 5
Epoch: 5/100 | Training Loss: 6982.257 | Valid Score: 57.914
 
Epoch: 5/100 | Best Valid Score Until Now: inf 

Patience 6
Epoch: 6/100 | Training Loss: 6922.615 | Valid Score: 57.696
 
Epoch: 6/100 | Best Valid Score Until Now: inf 

Patience 7
Epoch: 7/100 | Training Loss: 6862.157 | Valid Score: 57.477
 
Epoch: 7/100 | Best Valid Score Until Now: inf 

Patience 8
Epoch: 8/100 | Training Loss: 6802.364 | Valid Score: 57.258
 
Epoch: 8/100 | Best Valid Score Until Now: inf 

Patience 9
Epoch: 9/100 | Training Loss: 6743.176 | Valid Score: 57.040
 
Epoch: 9/100 | Best Valid Score Until Now: inf 

Patience 10
Epoc

  return F.mse_loss(input, target, reduction=self.reduction)


#GNN2

In [21]:
class GNN(nn.Module):
    def __init__(self, config, global_size = 200, num_tasks = 1):
        super().__init__()
        self.config = config
        self.num_tasks = num_tasks

        # Node feature size
        self.node_feature_size = self.config.get('node_feature_size', 127)

        # Edge feature size
        self.edge_feature_size = self.config.get('edge_feature_size', 12)

        # Hidden size
        self.hidden_size = self.config.get('hidden_size', 100)

        self.conv1 = SAGEConv(self.node_feature_size, self.hidden_size, aggregator_type='mean')
        self.conv2 = SAGEConv(self.hidden_size, self.num_tasks, aggregator_type='mean')

    # def forward(self, g, in_feat):
    def forward(self, mol_dgl_graph, globals):
        mol_dgl_graph.ndata["v"]= mol_dgl_graph.ndata["v"][:,:self.node_feature_size]
        mol_dgl_graph.edata["e"] = mol_dgl_graph.edata["e"][:,:self.edge_feature_size]
        h = self.conv1(mol_dgl_graph, mol_dgl_graph.ndata["v"])
        h = F.relu(h)
        h = self.conv2(mol_dgl_graph, h)
        mol_dgl_graph.ndata["h"] = h
        return dgl.mean_nodes(mol_dgl_graph, "h")

In [22]:
import math
from sklearn.metrics import mean_squared_error
def compute_score(model, data_loader, val_size, num_tasks):
  model.eval()
  loss_sum = nn.MSELoss(reduction='sum') # MSE with sum instead of mean, i.e., sum_i[(y_i)^2-(y'_i)^2]
  final_loss = 0
  state = torch.get_rng_state()
  with torch.no_grad():
            for i, (mol_dgl_graph, labels, masks, globals) in enumerate(data_loader):
                prediction = model(mol_dgl_graph, globals)
                prediction = torch.tensor(scaler.inverse_transform(prediction.detach().cpu()))
                labels = torch.tensor(scaler.inverse_transform(labels.cpu()))
                loss = loss_sum(prediction, labels)
                final_loss += loss.item()
            final_loss /= val_size
            final_loss = math.sqrt(final_loss) # RMSE
  return final_loss / num_tasks

In [23]:
def loss_func(output, label, mask, num_tasks):
    pos_weight = torch.ones((1, num_tasks))
    pos_weight
    criterion = nn.MSELoss(reduction='none')
    loss = mask*criterion(output,label)
    loss = loss.sum() / mask.sum()
    return loss

In [24]:
def train_epoch(train_dataloader, model, optimizer):
    epoch_train_loss = 0
    iterations = 0
    model.train() # Prepare model for training
    for i, (mol_dgl_graph, labels, masks, globals) in enumerate(train_dataloader):
        prediction = model(mol_dgl_graph, globals)
        loss_train = loss_func(prediction, labels, masks, num_tasks)
        optimizer.zero_grad(set_to_none=True)
        loss_train.backward()
        optimizer.step()
        epoch_train_loss += loss_train.detach().item()
        iterations += 1
    epoch_train_loss /= iterations
    return epoch_train_loss

In [25]:
def train_evaluate():

    model = GNN(config, global_size, num_tasks)
    optimizer = torch.optim.Adam(model.parameters(), lr = 0.0001)

    best_val = 0
    patience_count = 1
    epoch = 1

    while epoch <= num_epochs:
        if patience_count <= patience:
            model.train()
            loss_train = train_epoch(train_dataloader, model, optimizer)
            model.eval()
            score_val = compute_score(model, val_dataloader, len(val_set), num_tasks)
            if score_val > best_val:
                best_val = score_val
                print("Save checkpoint")
                path = os.path.join(checkpoint_path, 'checkpoint.pth')
                dict_checkpoint = {"score_val": score_val}
                dict_checkpoint.update({"model_state_dict": model.state_dict(), "optimizer_state": optimizer.state_dict()})
                with open(path, "wb") as outputfile:
                    cloudpickle.dump(dict_checkpoint, outputfile)
                patience_count = 1
            else:
                print("Patience", patience_count)
                patience_count += 1

            print("Epoch: {}/{} | Training Loss: {:.3f} | Valid Score: {:.3f}".format(
            epoch, num_epochs, loss_train, score_val))

            print(" ")
            print("Epoch: {}/{} | Best Valid Score Until Now: {:.3f}".format(epoch, num_epochs, best_val), "\n")
        epoch += 1

    # best model save
    shutil.rmtree(best_model_path, ignore_errors=True)
    shutil.copytree(checkpoint_path, best_model_path)

    print("Final results:")
    print("Average Valid Score: {:.3f}".format(np.mean(best_val)), "\n")


In [26]:
def test_evaluate():
    final_model = GNN(config, global_size, num_tasks)
    path = os.path.join(best_model_path, 'checkpoint.pth')
    with open(path, 'rb') as f:
        checkpoint = cloudpickle.load(f)
    final_model.load_state_dict(checkpoint["model_state_dict"])
    final_model.eval()
    test_score = compute_score(final_model, test_dataloader, len(test_set), num_tasks)

    print("Test Score: {:.3f}".format(test_score), "\n")
    print("Execution time: {:.3f} seconds".format(time.time() - start_time))

In [27]:
import time
start_time = time.time()

train_evaluate()
test_evaluate()

Save checkpoint
Epoch: 1/100 | Training Loss: 26.925 | Valid Score: 6.431
 
Epoch: 1/100 | Best Valid Score Until Now: 6.431 

Patience 1
Epoch: 2/100 | Training Loss: 29.046 | Valid Score: 6.353
 
Epoch: 2/100 | Best Valid Score Until Now: 6.431 

Patience 2
Epoch: 3/100 | Training Loss: 38.460 | Valid Score: 6.276
 
Epoch: 3/100 | Best Valid Score Until Now: 6.431 

Patience 3
Epoch: 4/100 | Training Loss: 24.076 | Valid Score: 6.201
 
Epoch: 4/100 | Best Valid Score Until Now: 6.431 

Patience 4
Epoch: 5/100 | Training Loss: 25.473 | Valid Score: 6.127
 
Epoch: 5/100 | Best Valid Score Until Now: 6.431 

Patience 5
Epoch: 6/100 | Training Loss: 31.716 | Valid Score: 6.054
 
Epoch: 6/100 | Best Valid Score Until Now: 6.431 

Patience 6
Epoch: 7/100 | Training Loss: 25.288 | Valid Score: 5.980
 
Epoch: 7/100 | Best Valid Score Until Now: 6.431 

Patience 7
Epoch: 8/100 | Training Loss: 25.859 | Valid Score: 5.908
 
Epoch: 8/100 | Best Valid Score Until Now: 6.431 

Patience 8
Epoch: 9

#GNN3

In [28]:
class GNN(nn.Module):
    def __init__(self, config, global_size = 200, num_tasks = 1):
        super().__init__()
        self.config = config
        self.num_tasks = num_tasks

        # Node feature size
        self.node_feature_size = self.config.get('node_feature_size', 127)

        # Edge feature size
        self.edge_feature_size = self.config.get('edge_feature_size', 12)

        # Hidden size
        self.hidden_size = self.config.get('hidden_size', 100)

        self.conv1 = SAGEConv(self.node_feature_size, self.hidden_size, aggregator_type='mean')
        self.conv2 = SAGEConv(self.hidden_size, self.num_tasks, aggregator_type='mean')

    # def forward(self, g, in_feat):
    def forward(self, mol_dgl_graph, globals):
        mol_dgl_graph.ndata["v"]= mol_dgl_graph.ndata["v"][:,:self.node_feature_size]
        mol_dgl_graph.edata["e"] = mol_dgl_graph.edata["e"][:,:self.edge_feature_size]
        h = self.conv1(mol_dgl_graph, mol_dgl_graph.ndata["v"])
        h = F.relu(h)
        h = self.conv2(mol_dgl_graph, h)
        mol_dgl_graph.ndata["h"] = h
        return dgl.mean_nodes(mol_dgl_graph, "h")

In [29]:
import math
from sklearn.metrics import mean_squared_error
def compute_score(model, data_loader, val_size, num_tasks, scaler=None):
  model.eval()
  loss_sum = nn.MSELoss(reduction='sum') # MSE with sum instead of mean, i.e., sum_i[(y_i)^2-(y'_i)^2]
  final_loss = 0
  state = torch.get_rng_state()
  with torch.no_grad():
            for i, (mol_dgl_graph, labels, masks, globals) in enumerate(data_loader):
                prediction = model(mol_dgl_graph, globals)
                if scaler is not None:
                  prediction = torch.tensor(scaler.inverse_transform(prediction.detach().cpu()))
                  labels = torch.tensor(scaler.inverse_transform(labels.cpu()))
                loss = loss_sum(prediction, labels)
                final_loss += loss.item()
            final_loss /= val_size
            final_loss = math.sqrt(final_loss) # RMSE
  return final_loss / num_tasks

In [30]:
def loss_func(output, label, mask, num_tasks):
    pos_weight = torch.ones((1, num_tasks))
    pos_weight
    criterion = nn.MSELoss(reduction='none')
    loss = mask*criterion(output,label)
    loss = loss.sum() / mask.sum()
    return loss

In [31]:
def train_epoch(train_dataloader, model, optimizer):
    epoch_train_loss = 0
    iterations = 0
    model.train() # Prepare model for training
    for i, (mol_dgl_graph, labels, masks, globals) in enumerate(train_dataloader):
        prediction = model(mol_dgl_graph, globals)
        loss_train = loss_func(prediction, labels, masks, num_tasks)
        optimizer.zero_grad(set_to_none=True)
        loss_train.backward()
        optimizer.step()
        epoch_train_loss += loss_train.detach().item()
        iterations += 1
    epoch_train_loss /= iterations
    return epoch_train_loss

In [32]:
def train_evaluate():

    model = GNN(config, global_size, num_tasks)
    optimizer = torch.optim.Adam(model.parameters(), lr = 0.0001)

    best_val = 0
    patience_count = 1
    epoch = 1

    while epoch <= num_epochs:
        if patience_count <= patience:
            model.train()
            loss_train = train_epoch(train_dataloader, model, optimizer)
            model.eval()
            score_val = compute_score(model, val_dataloader, len(val_set), num_tasks)
            if score_val > best_val:
                best_val = score_val
                print("Save checkpoint")
                path = os.path.join(checkpoint_path, 'checkpoint.pth')
                dict_checkpoint = {"score_val": score_val}
                dict_checkpoint.update({"model_state_dict": model.state_dict(), "optimizer_state": optimizer.state_dict()})
                with open(path, "wb") as outputfile:
                    cloudpickle.dump(dict_checkpoint, outputfile)
                patience_count = 1
            else:
                print("Patience", patience_count)
                patience_count += 1

            print("Epoch: {}/{} | Training Loss: {:.3f} | Valid Score: {:.3f}".format(
            epoch, num_epochs, loss_train, score_val))

            print(" ")
            print("Epoch: {}/{} | Best Valid Score Until Now: {:.3f}".format(epoch, num_epochs, best_val), "\n")
        epoch += 1

    # best model save
    shutil.rmtree(best_model_path, ignore_errors=True)
    shutil.copytree(checkpoint_path, best_model_path)

    print("Final results:")
    print("Average Valid Score: {:.3f}".format(np.mean(best_val)), "\n")


In [33]:
def test_evaluate():
    final_model = GNN(config, global_size, num_tasks)
    path = os.path.join(best_model_path, 'checkpoint.pth')
    with open(path, 'rb') as f:
        checkpoint = cloudpickle.load(f)
    final_model.load_state_dict(checkpoint["model_state_dict"])
    final_model.eval()
    test_score = compute_score(final_model, test_dataloader, len(test_set), num_tasks)

    print("Test Score: {:.3f}".format(test_score), "\n")
    print("Execution time: {:.3f} seconds".format(time.time() - start_time))

In [34]:
import time
start_time = time.time()

train_evaluate()
test_evaluate()

Save checkpoint
Epoch: 1/100 | Training Loss: 36.900 | Valid Score: 7.426
 
Epoch: 1/100 | Best Valid Score Until Now: 7.426 

Patience 1
Epoch: 2/100 | Training Loss: 41.333 | Valid Score: 7.346
 
Epoch: 2/100 | Best Valid Score Until Now: 7.426 

Patience 2
Epoch: 3/100 | Training Loss: 35.836 | Valid Score: 7.267
 
Epoch: 3/100 | Best Valid Score Until Now: 7.426 

Patience 3
Epoch: 4/100 | Training Loss: 39.173 | Valid Score: 7.188
 
Epoch: 4/100 | Best Valid Score Until Now: 7.426 

Patience 4
Epoch: 5/100 | Training Loss: 35.245 | Valid Score: 7.109
 
Epoch: 5/100 | Best Valid Score Until Now: 7.426 

Patience 5
Epoch: 6/100 | Training Loss: 35.816 | Valid Score: 7.030
 
Epoch: 6/100 | Best Valid Score Until Now: 7.426 

Patience 6
Epoch: 7/100 | Training Loss: 32.535 | Valid Score: 6.952
 
Epoch: 7/100 | Best Valid Score Until Now: 7.426 

Patience 7
Epoch: 8/100 | Training Loss: 33.585 | Valid Score: 6.875
 
Epoch: 8/100 | Best Valid Score Until Now: 7.426 

Patience 8
Epoch: 9

#GNN4

In [35]:
class GNN(nn.Module):
    def __init__(self, config, global_size = 200, num_tasks = 1):
        super().__init__()
        self.config = config
        self.num_tasks = num_tasks
        self.num_heads = 4
        # Node feature size
        self.node_feature_size = self.config.get('node_feature_size', 127)

        # Edge feature size
        self.edge_feature_size = self.config.get('edge_feature_size', 12)

        # Hidden size
        self.hidden_size = self.config.get('hidden_size', 100)

        self.conv1 = GATConv(self.node_feature_size, self.hidden_size, num_heads=self.num_heads)
        self.conv2 = GATConv(self.hidden_size, self.num_tasks, num_heads=1)

    # def forward(self, g, in_feat):
    def forward(self, mol_dgl_graph, globals):
        mol_dgl_graph.ndata["v"]= mol_dgl_graph.ndata["v"][:,:self.node_feature_size]
        mol_dgl_graph.edata["e"] = mol_dgl_graph.edata["e"][:,:self.edge_feature_size]
        h = self.conv1(mol_dgl_graph, mol_dgl_graph.ndata["v"])
        h = F.relu(h)
        h = self.conv2(mol_dgl_graph, h)
        mol_dgl_graph.ndata["h"] = h
        return dgl.mean_nodes(mol_dgl_graph, "h")

In [36]:
import math
from sklearn.metrics import mean_squared_error
def compute_score(model, data_loader, val_size, num_tasks, scaler=None):
  model.eval()
  loss_sum = nn.MSELoss(reduction='sum') # MSE with sum instead of mean, i.e., sum_i[(y_i)^2-(y'_i)^2]
  final_loss = 0
  state = torch.get_rng_state()
  with torch.no_grad():
            for i, (mol_dgl_graph, labels, masks, globals) in enumerate(data_loader):
                prediction = model(mol_dgl_graph, globals)
                if scaler is not None:
                  prediction = torch.tensor(scaler.inverse_transform(prediction.detach().cpu()))
                  labels = torch.tensor(scaler.inverse_transform(labels.cpu()))
                loss = loss_sum(prediction, labels)
                final_loss += loss.item()
            final_loss /= val_size
            final_loss = math.sqrt(final_loss) # RMSE
  return final_loss / num_tasks

In [37]:
def loss_func(output, label, mask, num_tasks):
    pos_weight = torch.ones((1, num_tasks))
    pos_weight
    criterion = nn.MSELoss(reduction='none')
    loss = mask*criterion(output,label)
    loss = loss.sum() / mask.sum()
    return loss

In [38]:
def train_epoch(train_dataloader, model, optimizer):
    epoch_train_loss = 0
    iterations = 0
    model.train() # Prepare model for training
    for i, (mol_dgl_graph, labels, masks, globals) in enumerate(train_dataloader):
        prediction = model(mol_dgl_graph, globals)
        loss_train = loss_func(prediction, labels, masks, num_tasks)
        optimizer.zero_grad(set_to_none=True)
        loss_train.backward()
        optimizer.step()
        epoch_train_loss += loss_train.detach().item()
        iterations += 1
    epoch_train_loss /= iterations
    return epoch_train_loss

In [42]:
def train_evaluate():

    model = GNN(config, global_size, num_tasks)
    optimizer = torch.optim.Adam(model.parameters(), lr = 0.0001)

    best_val = math.inf
    patience_count = 1
    epoch = 1

    while epoch <= num_epochs:
        if patience_count <= patience:
            model.train()
            loss_train = train_epoch(train_dataloader, model, optimizer)
            model.eval()
            score_val = compute_score(model, val_dataloader, len(val_set), num_tasks)
            if score_val > best_val:
                best_val = score_val
                print("Save checkpoint")
                path = os.path.join(checkpoint_path, 'checkpoint.pth')
                dict_checkpoint = {"score_val": score_val}
                dict_checkpoint.update({"model_state_dict": model.state_dict(), "optimizer_state": optimizer.state_dict()})
                with open(path, "wb") as outputfile:
                    cloudpickle.dump(dict_checkpoint, outputfile)
                patience_count = 1
            else:
                print("Patience", patience_count)
                patience_count += 1

            print("Epoch: {}/{} | Training Loss: {:.3f} | Valid Score: {:.3f}".format(
            epoch, num_epochs, loss_train, score_val))

            print(" ")
            print("Epoch: {}/{} | Best Valid Score Until Now: {:.3f}".format(epoch, num_epochs, best_val), "\n")
        epoch += 1

    # best model save
    shutil.rmtree(best_model_path, ignore_errors=True)
    shutil.copytree(checkpoint_path, best_model_path)

    print("Final results:")
    print("Average Valid Score: {:.3f}".format(np.mean(best_val)), "\n")


In [43]:
def test_evaluate():
    final_model = GNN(config, global_size, num_tasks)
    path = os.path.join(best_model_path, 'checkpoint.pth')
    with open(path, 'rb') as f:
        checkpoint = cloudpickle.load(f)
    final_model.load_state_dict(checkpoint["model_state_dict"])
    final_model.eval()
    test_score = compute_score(final_model, test_dataloader, len(test_set), num_tasks)

    print("Test Score: {:.3f}".format(test_score), "\n")
    print("Execution time: {:.3f} seconds".format(time.time() - start_time))

In [44]:
import time
start_time = time.time()

train_evaluate()
test_evaluate()

  return F.mse_loss(input, target, reduction=self.reduction)
  return F.mse_loss(input, target, reduction=self.reduction)


Patience 1
Epoch: 1/100 | Training Loss: 6255.016 | Valid Score: 55.058
 
Epoch: 1/100 | Best Valid Score Until Now: inf 



  return F.mse_loss(input, target, reduction=self.reduction)


Patience 2
Epoch: 2/100 | Training Loss: 6202.146 | Valid Score: 54.868
 
Epoch: 2/100 | Best Valid Score Until Now: inf 

Patience 3
Epoch: 3/100 | Training Loss: 6155.103 | Valid Score: 54.679
 
Epoch: 3/100 | Best Valid Score Until Now: inf 

Patience 4
Epoch: 4/100 | Training Loss: 6105.192 | Valid Score: 54.491
 
Epoch: 4/100 | Best Valid Score Until Now: inf 

Patience 5
Epoch: 5/100 | Training Loss: 6057.119 | Valid Score: 54.303
 
Epoch: 5/100 | Best Valid Score Until Now: inf 

Patience 6
Epoch: 6/100 | Training Loss: 6009.181 | Valid Score: 54.115
 
Epoch: 6/100 | Best Valid Score Until Now: inf 

Patience 7
Epoch: 7/100 | Training Loss: 5959.991 | Valid Score: 53.927
 
Epoch: 7/100 | Best Valid Score Until Now: inf 

Patience 8
Epoch: 8/100 | Training Loss: 5911.547 | Valid Score: 53.738
 
Epoch: 8/100 | Best Valid Score Until Now: inf 

Patience 9
Epoch: 9/100 | Training Loss: 5860.475 | Valid Score: 53.549
 
Epoch: 9/100 | Best Valid Score Until Now: inf 

Patience 10
Epoc

  return F.mse_loss(input, target, reduction=self.reduction)
