# 1 PyTorch Geometric (Datasets and Data)


PyTorch Geometric generally has two classes for storing or transforming the graphs into tensor format. One is the `torch_geometric.datasets`, which contains a variety of common graph datasets. Another one is `torch_geometric.data` that provides the data handling of graphs in PyTorch tensors.

In this section, we will learn how to use the `torch_geometric.datasets` and `torch_geometric.data`.

## PyG Datasets

The `torch_geometric.datasets` has many common graph datasets. Here we will explore the usage by using one example dataset.

In [1]:
from torch_geometric.datasets import TUDataset

root = './enzymes'
name = 'ENZYMES'

# The ENZYMES dataset
pyg_dataset= TUDataset('./enzymes', 'ENZYMES')

# You can find that there are 600 graphs in this dataset
print(pyg_dataset)

ENZYMES(600)


In [2]:
type(pyg_dataset)

torch_geometric.datasets.tu_dataset.TUDataset

In [3]:
pyg_dataset.num_classes

6

## Question 1: What is the number of classes and number of features in the ENZYMES dataset? (5 points)

In [4]:
def get_num_classes(pyg_dataset):
    # TODO: Implement this function that takes a PyG dataset object
    # and return the number of classes for that dataset.

    num_classes = 0

    ############# Your code here ############
    ## (~1 line of code)
    ## Note
    ## 1. Colab autocomplete functionality might be useful.
    num_classes = pyg_dataset.num_classes
    #########################################

    return num_classes

def get_num_features(pyg_dataset):
    # TODO: Implement this function that takes a PyG dataset object
    # and return the number of features for that dataset.

    num_features = 0

    ############# Your code here ############
    ## (~1 line of code)
    ## Note
    ## 1. Colab autocomplete functionality might be useful.
    num_features = pyg_dataset.num_features
    #########################################

    return num_features

    # You may find that some information need to be stored in the dataset level,
    # specifically if there are multiple graphs in the dataset

    num_classes = get_num_classes(pyg_dataset)
    num_features = get_num_features(pyg_dataset)
    print("{} dataset has {} classes".format(name, num_classes))
    print("{} dataset has {} features".format(name, num_features))

## PyG Data

Each PyG dataset usually stores a list of `torch_geometric.data.Data` objects. Each `torch_geometric.data.Data` object usually represents a graph. You can easily get the `Data` object by indexing on the dataset.

For more information such as what will be stored in `Data` object, please refer to the [documentation](https://pytorch-geometric.readthedocs.io/en/latest/modules/data.html#torch_geometric.data.Data).

## Question 2: What is the label of the graph (index 100 in the ENZYMES dataset)? (5 points)

In [5]:
def get_graph_class(pyg_dataset, idx):
    # TODO: Implement this function that takes a PyG dataset object,
    # the index of the graph in dataset, and returns the class/label 
    # of the graph (in integer).

    label = -1

    ############# Your code here ############
    ## (~1 line of code)
    label = int(pyg_dataset[idx]['y'])
    #########################################

    return label

# Here pyg_dataset is a dataset for graph classification
graph_0 = pyg_dataset[0]
print(graph_0)
idx = 100
label = get_graph_class(pyg_dataset, idx)
print('Graph with index {} has label {}'.format(idx, label))

Data(edge_index=[2, 168], x=[37, 3], y=[1])
Graph with index 100 has label 4


## Question 3: What is the number of edges for the graph (index 200 in the ENZYMES dataset)? (5 points)

In [6]:
pyg_dataset[2]['edge_index'].shape

torch.Size([2, 92])

In [7]:
pyg_dataset[2]

Data(edge_index=[2, 92], x=[25, 3], y=[1])

In [8]:
pyg_dataset[2].to_dict()

{'edge_index': tensor([[ 0,  0,  0,  1,  1,  1,  2,  2,  2,  3,  3,  3,  4,  4,  4,  4,  5,  5,
           5,  5,  6,  6,  6,  6,  7,  7,  7,  7,  7,  7,  8,  8,  8,  8,  9,  9,
           9, 10, 10, 10, 10, 10, 11, 11, 11, 12, 12, 12, 12, 13, 13, 13, 14, 14,
          14, 15, 15, 15, 15, 15, 15, 16, 16, 16, 17, 17, 17, 18, 18, 18, 19, 19,
          19, 20, 20, 20, 21, 21, 21, 22, 22, 22, 22, 22, 23, 23, 23, 23, 24, 24,
          24, 24],
         [ 1, 19, 22,  0, 19, 22,  3, 22, 23,  2,  4, 23,  3,  5, 23, 24,  4,  6,
           7, 24,  5,  7,  8, 24,  5,  6,  8,  9, 10, 24,  6,  7,  9, 10,  7,  8,
          10,  7,  8,  9, 11, 21, 10, 20, 21, 13, 14, 15, 20, 12, 14, 15, 12, 13,
          15, 12, 13, 14, 16, 17, 18, 15, 17, 18, 15, 16, 18, 15, 16, 17,  0,  1,
          22, 11, 12, 21, 10, 11, 20,  0,  1,  2, 19, 23,  2,  3,  4, 22,  4,  5,
           6,  7]]),
 'x': tensor([[1., 0., 0.],
         [1., 0., 0.],
         [1., 0., 0.],
         [1., 0., 0.],
         [1., 0., 0.],
      

In [9]:
def get_graph_num_edges(pyg_dataset, idx):
    # TODO: Implement this function that takes a PyG dataset object,
    # the index of the graph in dataset, and returns the number of 
    # edges in the graph (in integer). You should not count an edge 
    # twice if the graph is undirected. For example, in an undirected 
    # graph G, if two nodes v and u are connected by an edge, this edge
    # should only be counted once.

    num_edges = 0

    ############# Your code here ############
    ## Note:
    ## 1. You can't return the data.num_edges directly
    ## 2. We assume the graph is undirected
    ## (~4 lines of code)
    num_edges = int(pyg_dataset[idx]['edge_index'].shape[1] / 2)
    #########################################

    return num_edges

idx = 200
num_edges = get_graph_num_edges(pyg_dataset, idx)
print('Graph with index {} has {} edges'.format(idx, num_edges))

Graph with index 200 has 53 edges



# 2 Open Graph Benchmark (OGB)

The Open Graph Benchmark (OGB) is a collection of realistic, large-scale, and diverse benchmark datasets for machine learning on graphs. Its datasets are automatically downloaded, processed, and split using the OGB Data Loader. The model performance can also be evaluated by using the OGB Evaluator in a unified manner.

In [10]:
import torch_geometric.transforms as T
from ogb.nodeproppred import PygNodePropPredDataset

dataset_name = 'ogbn-arxiv'
# Load the dataset and transform it to sparse tensor
dataset = PygNodePropPredDataset(name=dataset_name,
                                 transform=T.ToSparseTensor())
print('The {} dataset has {} graph'.format(dataset_name, len(dataset)))

# Extract the graph
data = dataset[0]
print(data)


The ogbn-arxiv dataset has 1 graph
Data(num_nodes=169343, x=[169343, 128], node_year=[169343, 1], y=[169343, 1], adj_t=[169343, 169343, nnz=1166243])


## Question 4: What is the number of features in the ogbn-arxiv graph? (5 points)

In [11]:
def graph_num_features(data):
    # TODO: Implement this function that takes a PyG data object,
    # and returns the number of features in the graph (in integer).

    num_features = 0

    ############# Your code here ############
    ## (~1 line of code)
    num_features = data.num_features
    #########################################

    return num_features

num_features = graph_num_features(data)
print('The graph has {} features'.format(num_features))

The graph has 128 features


In [12]:
data.to_dict()

{'num_nodes': 169343,
 'x': tensor([[-0.0579, -0.0525, -0.0726,  ...,  0.1734, -0.1728, -0.1401],
         [-0.1245, -0.0707, -0.3252,  ...,  0.0685, -0.3721, -0.3010],
         [-0.0802, -0.0233, -0.1838,  ...,  0.1099,  0.1176, -0.1399],
         ...,
         [-0.2205, -0.0366, -0.4022,  ...,  0.1134, -0.1614, -0.1452],
         [-0.1382,  0.0409, -0.2518,  ..., -0.0893, -0.0413, -0.3761],
         [-0.0299,  0.2684, -0.1611,  ...,  0.1208,  0.0776, -0.0910]]),
 'node_year': tensor([[2013],
         [2015],
         [2014],
         ...,
         [2020],
         [2020],
         [2020]]),
 'y': tensor([[ 4],
         [ 5],
         [28],
         ...,
         [10],
         [ 4],
         [ 1]]),
 'adj_t': SparseTensor(row=tensor([     0,      0,      0,  ..., 169341, 169341, 169341]),
              col=tensor([   411,    640,   1162,  ...,  30351,  35711, 103121]),
              size=(169343, 169343), nnz=1166243, density=0.00%)}

# 3 GNN: Node Property Prediction

In this section we will build our first graph neural network by using PyTorch Geometric and apply it on node property prediction (node classification).

We will build the graph neural network by using GCN operator ([Kipf et al. (2017)](https://arxiv.org/pdf/1609.02907.pdf)).

You should use the PyG built-in `GCNConv` layer directly. 

In [13]:
import torch
import torch.nn.functional as F
print(torch.__version__)

# The PyG built-in GCNConv
from torch_geometric.nn import GCNConv

import torch_geometric.transforms as T
from ogb.nodeproppred import PygNodePropPredDataset, Evaluator

1.11.0


In [14]:
dataset_name = 'ogbn-arxiv'
dataset = PygNodePropPredDataset(name=dataset_name,
                                 transform=T.ToSparseTensor())
data = dataset[0]

# Make the adjacency matrix to symmetric
data.adj_t = data.adj_t.to_symmetric()

device = 'cuda' if torch.cuda.is_available() else 'cpu'

# If you use GPU, the device should be cuda
print('Device: {}'.format(device))

data = data.to(device)
split_idx = dataset.get_idx_split()
train_idx = split_idx['train'].to(device)

Device: cuda


## GCN Model

Now we will implement our GCN model!

Please follow the figure below to implement your `forward` function.


![test](https://drive.google.com/uc?id=128AuYAXNXGg7PIhJJ7e420DoPWKb-RtL)

In [15]:
#         gcn_convs = [GCNConv(in_channels = input_dim, out_channels = num_features)]
#         for i in range(num_layers-2):
#             gcn_convs.append(GCNConv(in_channels=num_features, out_channels=num_features))
#         gcn_convs.append(GCNConv(in_channels = num_features, out_channels = output_dim))
#         self.bns = nn.ModuleList([nn.BatchNorm1d(num_features) for i in range(num_layers-1)])


In [16]:
import torch.nn as nn

class GCN(torch.nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim, num_layers,
                 dropout, return_embeds=False):
        # TODO: Implement this function that initializes self.convs, 
        # self.bns, and self.softmax.

        super(GCN, self).__init__()


        ############# Your code here ############
        ## Note:
        ## 1. You should use torch.nn.ModuleList for self.convs and self.bns
        ## 2. self.convs has num_layers GCNConv layers
        ## 3. self.bns has num_layers - 1 BatchNorm1d layers
        ## 4. You should use torch.nn.LogSoftmax for self.softmax
        ## 5. The parameters you can set for GCNConv include 'in_channels' and 
        ## 'out_channels'. More information please refer to the documentation:
        ## https://pytorch-geometric.readthedocs.io/en/latest/modules/nn.html#torch_geometric.nn.conv.GCNConv
        ## 6. The only parameter you need to set for BatchNorm1d is 'num_features'
        ## More information please refer to the documentation: 
        ## https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm1d.html
        ## (~10 lines of code)
        
        self.num_layers = num_layers
        
        # A list of GCNConv layers
        gcn_convs = [GCNConv(in_channels = input_dim, out_channels = num_features)]
        for i in range(num_layers-2):
            gcn_convs.append(GCNConv(in_channels = num_features // (2**i),
                                     out_channels=num_features // (2**(i+1))))
        gcn_convs.append(GCNConv(in_channels = num_features // (2**(num_layers-2)), out_channels = output_dim))
        self.convs = nn.ModuleList(gcn_convs)

        # A list of 1D batch normalization layers
        self.bns = nn.ModuleList([nn.BatchNorm1d(num_features // (2 ** i)) for i in range(num_layers-1)])

        # The log softmax layer
        self.softmax = torch.nn.LogSoftmax(dim=1)

        #########################################

        # Probability of an element to be zeroed
        self.dropout = dropout

        # Skip classification layer and return node embeddings
        self.return_embeds = return_embeds

    def reset_parameters(self):
        for conv in self.convs:
            conv.reset_parameters()
        for bn in self.bns:
            bn.reset_parameters()

    def forward(self, x, adj_t):
        # TODO: Implement this function that takes the feature tensor x,
        # edge_index tensor adj_t and returns the output tensor as
        # shown in the figure.

        ############# Your code here ############
        ## Note:
        ## 1. Construct the network as showing in the figure
        ## 2. torch.nn.functional.relu and torch.nn.functional.dropout are useful
        ## More information please refer to the documentation:
        ## https://pytorch.org/docs/stable/nn.functional.html
        ## 3. Don't forget to set F.dropout training to self.training
        ## 4. If return_embeds is True, then skip the last softmax layer
        ## (~7 lines of code)
        
        for i in range(self.num_layers-1):
            x = self.convs[i](x, adj_t)
            x = self.bns[i](x)
            x = F.relu(x)
            x = F.dropout(x, p=self.dropout, training=self.training)
        x = self.convs[self.num_layers-1](x, adj_t)
        if self.return_embeds:
            return x 
        return self.softmax(x)
    
        #########################################


In [17]:
def train(model, data, train_idx, optimizer, loss_fn):
    # TODO: Implement this function that trains the model by 
    # using the given optimizer and loss_fn.
    model.train()
    loss = 0

    ############# Your code here ############
    ## Note:
    ## 1. Zero grad the optimizer
    ## 2. Feed the data into the model
    ## 3. Slicing the model output and label by train_idx
    ## 4. Feed the sliced output and label to loss_fn
    ## (~4 lines of code)
    
    optimizer.zero_grad()
    out = model.forward(data.x, data.adj_t)
    loss = F.nll_loss(out[split_idx['train']], data.y[split_idx['train']].squeeze())
    
    #########################################

    loss.backward()
    optimizer.step()

    return loss.item()

In [18]:
# Test function here
@torch.no_grad()
def test(model, data, split_idx, evaluator):
    # TODO: Implement this function that tests the model by 
    # using the given split_idx and evaluator.
    model.eval()

    # The output of model on all data
    out = None

    ############# Your code here ############
    ## (~1 line of code)
    ## Note:
    ## 1. No index slicing here
    out = model.forward(data.x, data.adj_t)
    #########################################

    y_pred = out.argmax(dim=-1, keepdim=True)

    train_acc = evaluator.eval({
        'y_true': data.y[split_idx['train']],
        'y_pred': y_pred[split_idx['train']],
    })['acc']
    valid_acc = evaluator.eval({
        'y_true': data.y[split_idx['valid']],
        'y_pred': y_pred[split_idx['valid']],
    })['acc']
    test_acc = evaluator.eval({
        'y_true': data.y[split_idx['test']],
        'y_pred': y_pred[split_idx['test']],
    })['acc']

    return train_acc, valid_acc, test_acc

In [19]:
# Please do not change the args
args = {
    'device': device,
    'num_layers': 3,
    'hidden_dim': 256,
    'dropout': 0.5,
    'lr': 0.01,
    'epochs': 100,
}

model = GCN(data.num_features, args['hidden_dim'],
            dataset.num_classes, args['num_layers'],
            args['dropout']).to(device)
evaluator = Evaluator(name='ogbn-arxiv')

In [20]:
import copy

# reset the parameters to initial random value
model.reset_parameters()

optimizer = torch.optim.Adam(model.parameters(), lr=args['lr'])
loss_fn = F.nll_loss

best_model = None
best_valid_acc = 0

for epoch in range(1, 1 + args["epochs"]):
    loss = train(model, data, train_idx, optimizer, loss_fn)
    result = test(model, data, split_idx, evaluator)
    train_acc, valid_acc, test_acc = result
    if valid_acc > best_valid_acc:
        best_valid_acc = valid_acc
        best_model = copy.deepcopy(model)
    print(f'Epoch: {epoch:02d}, '
            f'Loss: {loss:.4f}, '
            f'Train: {100 * train_acc:.2f}%, '
            f'Valid: {100 * valid_acc:.2f}% '
            f'Test: {100 * test_acc:.2f}%')

Epoch: 01, Loss: 4.2124, Train: 16.52%, Valid: 24.91% Test: 23.01%
Epoch: 02, Loss: 3.1466, Train: 25.01%, Valid: 28.83% Test: 25.92%
Epoch: 03, Loss: 2.6837, Train: 29.08%, Valid: 31.97% Test: 29.89%
Epoch: 04, Loss: 2.4358, Train: 41.06%, Valid: 45.84% Test: 44.34%
Epoch: 05, Loss: 2.2674, Train: 43.22%, Valid: 43.98% Test: 41.88%
Epoch: 06, Loss: 2.1560, Train: 41.83%, Valid: 41.80% Test: 38.93%
Epoch: 07, Loss: 2.0690, Train: 41.39%, Valid: 41.90% Test: 39.07%
Epoch: 08, Loss: 1.9896, Train: 41.28%, Valid: 42.28% Test: 39.80%
Epoch: 09, Loss: 1.9209, Train: 41.49%, Valid: 43.37% Test: 41.85%
Epoch: 10, Loss: 1.8672, Train: 42.19%, Valid: 44.81% Test: 44.97%
Epoch: 11, Loss: 1.8093, Train: 43.79%, Valid: 47.07% Test: 48.09%
Epoch: 12, Loss: 1.7668, Train: 45.86%, Valid: 49.42% Test: 50.68%
Epoch: 13, Loss: 1.7240, Train: 48.45%, Valid: 51.97% Test: 52.77%
Epoch: 14, Loss: 1.6790, Train: 51.41%, Valid: 54.49% Test: 54.74%
Epoch: 15, Loss: 1.6411, Train: 54.55%, Valid: 56.61% Test: 56

## Question 5: What are your `best_model` validation and test accuracy? Please report them on Gradescope. For example, for an accuracy such as 50.01%, just report 50.01 and please don't include the percent sign. (20 points)

In [21]:
best_valid_acc

0.7108292224571294

# 4 GNN: Graph Property Prediction

In this section we will create a graph neural network for graph property prediction (graph classification)


In [22]:
from ogb.graphproppred import PygGraphPropPredDataset, Evaluator
from torch_geometric.data import DataLoader
from tqdm import tqdm

# Load the dataset 
dataset = PygGraphPropPredDataset(name='ogbg-molhiv')

device = 'cuda' if torch.cuda.is_available() else 'cpu'
print('Device: {}'.format(device))

split_idx = dataset.get_idx_split()

# Check task type
print('Task type: {}'.format(dataset.task_type))

# Load the data sets into dataloader
# We will train the graph classification task on a batch of 32 graphs
# Shuffle the order of graphs for training set
train_loader = DataLoader(dataset[split_idx["train"]], batch_size=32, shuffle=True, num_workers=0)
valid_loader = DataLoader(dataset[split_idx["valid"]], batch_size=32, shuffle=False, num_workers=0)
test_loader = DataLoader(dataset[split_idx["test"]], batch_size=32, shuffle=False, num_workers=0)


# Please do not change the args
args = {
    'device': device,
    'num_layers': 5,
    'hidden_dim': 256,
    'dropout': 0.5,
    'lr': 0.001,
    'epochs': 30,
}
args

Device: cuda
Task type: binary classification




{'device': 'cuda',
 'num_layers': 5,
 'hidden_dim': 256,
 'dropout': 0.5,
 'lr': 0.001,
 'epochs': 30}

In [23]:
from ogb.graphproppred.mol_encoder import AtomEncoder
from torch_geometric.nn import global_add_pool, global_mean_pool

### GCN to predict graph property
class GCN_Graph(torch.nn.Module):
    def __init__(self, hidden_dim, output_dim, num_layers, dropout):
        super(GCN_Graph, self).__init__()

        # Load encoders for Atoms in molecule graphs
        self.node_encoder = AtomEncoder(hidden_dim)

        # Node embedding model
        # Note that the input_dim and output_dim are set to hidden_dim
        self.gnn_node = GCN(hidden_dim, hidden_dim,
            hidden_dim, num_layers, dropout, return_embeds=True)

        self.pool = None

        ############# Your code here ############
        ## Note:
        ## 1. Initialize the self.pool to global mean pooling layer
        ## More information please refer to the documentation:
        ## https://pytorch-geometric.readthedocs.io/en/latest/modules/nn.html#global-pooling-layers
        ## (~1 line of code)
        
        self.pool = global_mean_pool

        #########################################

        # Output layer
        self.linear = torch.nn.Linear(hidden_dim, output_dim)


    def reset_parameters(self):
        self.gnn_node.reset_parameters()
        self.linear.reset_parameters()

    def forward(self, batched_data):
        # TODO: Implement this function that takes the input tensor batched_data,
        # returns a batched output tensor for each graph.
        x, edge_index, batch = batched_data.x, batched_data.edge_index, batched_data.batch
        embed = self.node_encoder(x)

        out = None

        ############# Your code here ############
        ## Note:
        ## 1. Construct node embeddings using existing GCN model
        ## 2. Use global pooling layer to construct features for the whole graph
        ## More information please refer to the documentation:
        ## https://pytorch-geometric.readthedocs.io/en/latest/modules/nn.html#global-pooling-layers
        ## 3. Use a linear layer to predict the graph property 
        ## (~3 lines of code)
        
        x = self.gnn_node(embed, edge_index)
        x = self.pool(x, batch)
        out = self.linear(x)

        #########################################

        return out

In [24]:
def train(model, device, data_loader, optimizer, loss_fn):
    # TODO: Implement this function that trains the model by 
    # using the given optimizer and loss_fn.
    model.train()
    loss = 0

    for step, batch in enumerate(tqdm(data_loader, desc="Iteration")):
        batch = batch.to(device)

        if batch.x.shape[0] == 1 or batch.batch[-1] == 0:
            pass
        else:
            ## ignore nan targets (unlabeled) when computing training loss.
            is_labeled = batch.y == batch.y

            ############# Your code here ############
            ## Note:
            ## 1. Zero grad the optimizer
            ## 2. Feed the data into the model
            ## 3. Use `is_labeled` mask to filter output and labels
            ## 4. You might change the type of label
            ## 5. Feed the output and label to loss_fn
            ## (~3 lines of code)
            
            optimizer.zero_grad()
            out = model.forward(batch)
            loss = loss_fn(out[is_labeled], batch.y[is_labeled].float())
            
            #########################################

            loss.backward()
            optimizer.step()

    return loss.item()

# The evaluation function
def eval(model, device, loader, evaluator):
    model.eval()
    y_true = []
    y_pred = []

    for step, batch in enumerate(tqdm(loader, desc="Iteration")):
        batch = batch.to(device)

        if batch.x.shape[0] == 1:
            pass
        else:
            with torch.no_grad():
                pred = model(batch)

            y_true.append(batch.y.view(pred.shape).detach().cpu())
            y_pred.append(pred.detach().cpu())

    y_true = torch.cat(y_true, dim = 0).numpy()
    y_pred = torch.cat(y_pred, dim = 0).numpy()

    input_dict = {"y_true": y_true, "y_pred": y_pred}

    return evaluator.eval(input_dict)

In [25]:
model = GCN_Graph(args['hidden_dim'],
            dataset.num_tasks, args['num_layers'],
            args['dropout']).to(device)
evaluator = Evaluator(name='ogbg-molhiv')

In [26]:
import copy

model.reset_parameters()

optimizer = torch.optim.Adam(model.parameters(), lr=args['lr'])
loss_fn = torch.nn.BCEWithLogitsLoss()

best_model = None
best_valid_acc = 0

for epoch in range(1, 1 + args["epochs"]):
    print('Training...')
    loss = train(model, device, train_loader, optimizer, loss_fn)

    print('Evaluating...')
    train_result = eval(model, device, train_loader, evaluator)
    val_result = eval(model, device, valid_loader, evaluator)
    test_result = eval(model, device, test_loader, evaluator)

    train_acc, valid_acc, test_acc = train_result[dataset.eval_metric], val_result[dataset.eval_metric], test_result[dataset.eval_metric]
    if valid_acc > best_valid_acc:
        best_valid_acc = valid_acc
        best_model = copy.deepcopy(model)
    print(f'Epoch: {epoch:02d}, '
        f'Loss: {loss:.4f}, '
        f'Train: {100 * train_acc:.2f}%, '
        f'Valid: {100 * valid_acc:.2f}% '
        f'Test: {100 * test_acc:.2f}%')

Training...


Iteration: 100%|███████████████████████████| 1029/1029 [00:07<00:00, 142.71it/s]


Evaluating...


Iteration: 100%|███████████████████████████| 1029/1029 [00:03<00:00, 303.50it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 250.75it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 210.55it/s]


Epoch: 01, Loss: 0.0377, Train: 71.41%, Valid: 74.75% Test: 66.73%
Training...


Iteration: 100%|███████████████████████████| 1029/1029 [00:06<00:00, 162.57it/s]


Evaluating...


Iteration: 100%|███████████████████████████| 1029/1029 [00:03<00:00, 305.07it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 289.53it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 290.64it/s]


Epoch: 02, Loss: 0.0403, Train: 74.18%, Valid: 74.95% Test: 73.29%
Training...


Iteration: 100%|███████████████████████████| 1029/1029 [00:06<00:00, 161.83it/s]


Evaluating...


Iteration: 100%|███████████████████████████| 1029/1029 [00:03<00:00, 300.81it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 281.53it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 311.73it/s]


Epoch: 03, Loss: 0.6944, Train: 73.70%, Valid: 71.51% Test: 73.23%
Training...


Iteration: 100%|███████████████████████████| 1029/1029 [00:06<00:00, 163.86it/s]


Evaluating...


Iteration: 100%|███████████████████████████| 1029/1029 [00:03<00:00, 303.72it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 286.34it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 315.95it/s]


Epoch: 04, Loss: 0.0520, Train: 75.65%, Valid: 76.78% Test: 69.59%
Training...


Iteration: 100%|███████████████████████████| 1029/1029 [00:06<00:00, 163.18it/s]


Evaluating...


Iteration: 100%|███████████████████████████| 1029/1029 [00:03<00:00, 305.51it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 299.60it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 295.67it/s]


Epoch: 05, Loss: 0.0172, Train: 75.42%, Valid: 74.90% Test: 72.76%
Training...


Iteration: 100%|███████████████████████████| 1029/1029 [00:06<00:00, 161.95it/s]


Evaluating...


Iteration: 100%|███████████████████████████| 1029/1029 [00:03<00:00, 303.43it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 290.25it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 305.90it/s]


Epoch: 06, Loss: 0.9050, Train: 76.36%, Valid: 72.33% Test: 73.09%
Training...


Iteration: 100%|███████████████████████████| 1029/1029 [00:06<00:00, 163.27it/s]


Evaluating...


Iteration: 100%|███████████████████████████| 1029/1029 [00:03<00:00, 302.97it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 304.65it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 315.62it/s]


Epoch: 07, Loss: 0.0297, Train: 77.42%, Valid: 73.01% Test: 72.37%
Training...


Iteration: 100%|███████████████████████████| 1029/1029 [00:06<00:00, 163.31it/s]


Evaluating...


Iteration: 100%|███████████████████████████| 1029/1029 [00:03<00:00, 305.44it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 300.70it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 310.32it/s]


Epoch: 08, Loss: 0.0256, Train: 77.99%, Valid: 76.75% Test: 71.09%
Training...


Iteration: 100%|███████████████████████████| 1029/1029 [00:06<00:00, 163.34it/s]


Evaluating...


Iteration: 100%|███████████████████████████| 1029/1029 [00:03<00:00, 304.36it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 304.57it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 310.37it/s]


Epoch: 09, Loss: 0.0587, Train: 78.31%, Valid: 77.95% Test: 72.29%
Training...


Iteration: 100%|███████████████████████████| 1029/1029 [00:06<00:00, 163.31it/s]


Evaluating...


Iteration: 100%|███████████████████████████| 1029/1029 [00:03<00:00, 303.96it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 297.90it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 311.20it/s]


Epoch: 10, Loss: 0.0508, Train: 78.68%, Valid: 76.66% Test: 71.41%
Training...


Iteration: 100%|███████████████████████████| 1029/1029 [00:06<00:00, 161.90it/s]


Evaluating...


Iteration: 100%|███████████████████████████| 1029/1029 [00:03<00:00, 298.97it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 287.15it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 313.40it/s]


Epoch: 11, Loss: 0.0262, Train: 79.03%, Valid: 78.10% Test: 73.43%
Training...


Iteration: 100%|███████████████████████████| 1029/1029 [00:06<00:00, 161.69it/s]


Evaluating...


Iteration: 100%|███████████████████████████| 1029/1029 [00:03<00:00, 302.94it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 297.24it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 308.23it/s]


Epoch: 12, Loss: 0.0205, Train: 78.83%, Valid: 75.85% Test: 71.03%
Training...


Iteration: 100%|███████████████████████████| 1029/1029 [00:06<00:00, 162.91it/s]


Evaluating...


Iteration: 100%|███████████████████████████| 1029/1029 [00:03<00:00, 306.20it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 305.73it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 312.99it/s]


Epoch: 13, Loss: 0.0298, Train: 79.43%, Valid: 76.11% Test: 72.65%
Training...


Iteration: 100%|███████████████████████████| 1029/1029 [00:06<00:00, 163.12it/s]


Evaluating...


Iteration: 100%|███████████████████████████| 1029/1029 [00:03<00:00, 305.30it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 300.16it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 315.82it/s]


Epoch: 14, Loss: 0.0443, Train: 79.29%, Valid: 76.74% Test: 71.44%
Training...


Iteration: 100%|███████████████████████████| 1029/1029 [00:06<00:00, 163.01it/s]


Evaluating...


Iteration: 100%|███████████████████████████| 1029/1029 [00:03<00:00, 304.99it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 312.03it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 315.26it/s]


Epoch: 15, Loss: 0.0514, Train: 80.04%, Valid: 78.42% Test: 73.26%
Training...


Iteration: 100%|███████████████████████████| 1029/1029 [00:06<00:00, 162.90it/s]


Evaluating...


Iteration: 100%|███████████████████████████| 1029/1029 [00:03<00:00, 305.42it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 309.99it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 311.72it/s]


Epoch: 16, Loss: 0.8854, Train: 79.41%, Valid: 77.08% Test: 72.93%
Training...


Iteration: 100%|███████████████████████████| 1029/1029 [00:06<00:00, 163.39it/s]


Evaluating...


Iteration: 100%|███████████████████████████| 1029/1029 [00:03<00:00, 299.24it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 298.31it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 303.96it/s]


Epoch: 17, Loss: 0.0257, Train: 79.47%, Valid: 77.28% Test: 70.12%
Training...


Iteration: 100%|███████████████████████████| 1029/1029 [00:06<00:00, 163.33it/s]


Evaluating...


Iteration: 100%|███████████████████████████| 1029/1029 [00:03<00:00, 304.92it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 294.94it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 318.41it/s]


Epoch: 18, Loss: 0.0218, Train: 80.58%, Valid: 75.84% Test: 70.69%
Training...


Iteration: 100%|███████████████████████████| 1029/1029 [00:06<00:00, 163.98it/s]


Evaluating...


Iteration: 100%|███████████████████████████| 1029/1029 [00:03<00:00, 300.49it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 289.22it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 308.09it/s]


Epoch: 19, Loss: 0.0167, Train: 78.49%, Valid: 77.13% Test: 73.16%
Training...


Iteration: 100%|███████████████████████████| 1029/1029 [00:06<00:00, 163.49it/s]


Evaluating...


Iteration: 100%|███████████████████████████| 1029/1029 [00:03<00:00, 304.86it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 308.13it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 311.05it/s]


Epoch: 20, Loss: 0.6799, Train: 79.85%, Valid: 75.30% Test: 74.12%
Training...


Iteration: 100%|███████████████████████████| 1029/1029 [00:06<00:00, 162.31it/s]


Evaluating...


Iteration: 100%|███████████████████████████| 1029/1029 [00:03<00:00, 303.83it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 281.23it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 306.37it/s]


Epoch: 21, Loss: 0.9018, Train: 80.25%, Valid: 73.79% Test: 73.42%
Training...


Iteration: 100%|███████████████████████████| 1029/1029 [00:06<00:00, 163.21it/s]


Evaluating...


Iteration: 100%|███████████████████████████| 1029/1029 [00:03<00:00, 302.50it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 284.88it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 307.51it/s]


Epoch: 22, Loss: 0.0241, Train: 80.31%, Valid: 78.87% Test: 72.33%
Training...


Iteration: 100%|███████████████████████████| 1029/1029 [00:06<00:00, 163.38it/s]


Evaluating...


Iteration: 100%|███████████████████████████| 1029/1029 [00:03<00:00, 306.47it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 274.52it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 304.88it/s]


Epoch: 23, Loss: 0.0324, Train: 80.82%, Valid: 75.91% Test: 73.09%
Training...


Iteration: 100%|███████████████████████████| 1029/1029 [00:06<00:00, 163.13it/s]


Evaluating...


Iteration: 100%|███████████████████████████| 1029/1029 [00:03<00:00, 303.56it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 268.27it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 304.69it/s]


Epoch: 24, Loss: 0.8580, Train: 80.18%, Valid: 76.15% Test: 73.13%
Training...


Iteration: 100%|███████████████████████████| 1029/1029 [00:06<00:00, 163.69it/s]


Evaluating...


Iteration: 100%|███████████████████████████| 1029/1029 [00:03<00:00, 305.02it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 279.82it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 304.68it/s]


Epoch: 25, Loss: 0.0311, Train: 81.02%, Valid: 72.96% Test: 70.78%
Training...


Iteration: 100%|███████████████████████████| 1029/1029 [00:06<00:00, 162.69it/s]


Evaluating...


Iteration: 100%|███████████████████████████| 1029/1029 [00:03<00:00, 304.22it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 314.30it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 307.68it/s]


Epoch: 26, Loss: 0.0177, Train: 81.04%, Valid: 77.32% Test: 74.04%
Training...


Iteration: 100%|███████████████████████████| 1029/1029 [00:06<00:00, 162.19it/s]


Evaluating...


Iteration: 100%|███████████████████████████| 1029/1029 [00:03<00:00, 303.03it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 280.81it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 306.25it/s]


Epoch: 27, Loss: 0.9750, Train: 81.50%, Valid: 77.56% Test: 73.91%
Training...


Iteration: 100%|███████████████████████████| 1029/1029 [00:06<00:00, 162.77it/s]


Evaluating...


Iteration: 100%|███████████████████████████| 1029/1029 [00:03<00:00, 304.74it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 301.25it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 297.44it/s]


Epoch: 28, Loss: 0.4485, Train: 80.84%, Valid: 77.23% Test: 72.52%
Training...


Iteration: 100%|███████████████████████████| 1029/1029 [00:06<00:00, 162.02it/s]


Evaluating...


Iteration: 100%|███████████████████████████| 1029/1029 [00:03<00:00, 304.80it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 308.56it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 317.56it/s]


Epoch: 29, Loss: 0.0283, Train: 82.07%, Valid: 76.53% Test: 73.76%
Training...


Iteration: 100%|███████████████████████████| 1029/1029 [00:06<00:00, 163.61it/s]


Evaluating...


Iteration: 100%|███████████████████████████| 1029/1029 [00:03<00:00, 304.12it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 304.81it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 311.89it/s]

Epoch: 30, Loss: 0.7446, Train: 81.52%, Valid: 76.25% Test: 73.15%





In [27]:
train_acc = eval(best_model, device, train_loader, evaluator)[dataset.eval_metric]
valid_acc = eval(best_model, device, valid_loader, evaluator)[dataset.eval_metric]
test_acc = eval(best_model, device, test_loader, evaluator)[dataset.eval_metric]

print(f'Best model: '
      f'Train: {100 * train_acc:.2f}%, '
      f'Valid: {100 * valid_acc:.2f}% '
      f'Test: {100 * test_acc:.2f}%')

Iteration: 100%|███████████████████████████| 1029/1029 [00:03<00:00, 299.66it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 299.43it/s]
Iteration: 100%|█████████████████████████████| 129/129 [00:00<00:00, 316.93it/s]

Best model: Train: 80.31%, Valid: 78.87% Test: 72.33%





## Question 6: What are your `best_model` validation and test ROC-AUC score? Please report them on Gradescope. For example, for an ROC-AUC score such as 50.01%, just report 50.01 and please don't include the percent sign. (20 points)

In [28]:
print(dataset.eval_metric)
print(valid_acc)
print(test_acc)


rocauc
0.7886629188712522
0.7233376465362404


## Question 7 (Optional): Experiment with other two global pooling layers other than mean pooling in Pytorch Geometric.

## References 
---
- http://snap.stanford.edu/class/cs224w-2020/
- https://colab.research.google.com/drive/1Aa0eKSmyYef1gORvlHv7EeQzSVRb30eL?usp=sharing
- https://pytorch-geometric.readthedocs.io/en/latest/notes/introduction.html