# **CS224W - Colab 2**

In Colab 2, we will work to construct our own graph neural network using PyTorch Geometric (PyG) and then apply that model on two Open Graph Benchmark (OGB) datasets. These two datasets will be used to benchmark your model's performance on two different graph-based tasks: 1) node property prediction, predicting properties of single nodes and 2) graph property prediction, predicting properties of entire graphs or subgraphs.

First, we will learn how PyTorch Geometric stores graphs as PyTorch tensors.

Then, we will load and inspect one of the Open Graph Benchmark (OGB) datasets by using the `ogb` package. OGB is a collection of realistic, large-scale, and diverse benchmark datasets for machine learning on graphs. The `ogb` package not only provides data loaders for each dataset but also model evaluators.

Lastly, we will build our own graph neural network using PyTorch Geometric. We will then train and evaluate our model on the OGB node property prediction and graph property prediction tasks.

**Note**: Make sure to **sequentially run all the cells in each section**, so that the intermediate variables and packages carry over to the next cell.

We recommend you save a copy of this colab in your drive so you don't lose progress!

The expected time to finish this Colab is 2 hours. However, debugging training loops can easily take a while. So, don't worry at all if it takes you longer! Have fun and good luck on Colab 2 :)

# Device
You might need to use a GPU for this Colab to run quickly.

Please click `Runtime` and then `Change runtime type`. Then set the `hardware accelerator` to **GPU**.

# Setup
As discussed in Colab 0, installing PyG on Colab can be a little tricky. First, let us check which version of PyTorch you are running.

In [None]:
import torch
import os
print("PyTorch has version {}".format(torch.__version__))

PyTorch has version 2.4.0+cu121


In [None]:
if 'IS_GRADESCOPE_ENV' not in os.environ:
  !pip install torch==2.4.0



Download the necessary packages for PyG. Make sure that your version of torch matches the output from the cell above. In case of any issues, more information can be found on the [PyG installation page](https://pytorch-geometric.readthedocs.io/en/latest/notes/installation.html).

In [None]:
# Install torch geometric
if 'IS_GRADESCOPE_ENV' not in os.environ:
  torch_version = str(torch.__version__)
  scatter_src = f"https://pytorch-geometric.com/whl/torch-{torch_version}.html"
  sparse_src = f"https://pytorch-geometric.com/whl/torch-{torch_version}.html"
  !pip install torch-scatter -f $scatter_src
  !pip install torch-sparse -f $sparse_src
  !pip install torch-geometric
  !pip install ogb

Looking in links: https://pytorch-geometric.com/whl/torch-2.4.0+cu121.html
Looking in links: https://pytorch-geometric.com/whl/torch-2.4.0+cu121.html


# 1) PyTorch Geometric (Datasets and Data)


PyTorch Geometric has two modules for storing and/or transforming graphs into tensor format. One is `torch_geometric.datasets`, which contains a variety of common graph datasets. The other is `torch_geometric.data`, which provides the data handling of graphs in PyTorch tensors.

In this section, we will learn how to use `torch_geometric.datasets` and `torch_geometric.data` together.

## PyG Datasets

The `torch_geometric.datasets` module has many common graph datasets. Here we will explore its usage through one example dataset.

In [None]:
from torch_geometric.datasets import TUDataset

if 'IS_GRADESCOPE_ENV' not in os.environ:
  root = './enzymes'
  name = 'ENZYMES'

  # The ENZYMES dataset
  pyg_dataset = TUDataset(root, name)

  # You will find that there are 600 graphs in this dataset
  print(pyg_dataset)

Downloading https://www.chrsmrrs.com/graphkerneldatasets/ENZYMES.zip
Processing...


ENZYMES(600)


Done!


## Question 1: What is the number of classes and number of features in the ENZYMES dataset?

In [None]:
def get_num_classes(pyg_dataset):
  # returns the number of classes for that dataset.

  return pyg_dataset.num_classes

def get_num_features(pyg_dataset):
  # returns the number of features for that dataset.

  return pyg_dataset.num_features

if 'IS_GRADESCOPE_ENV' not in os.environ:
  num_classes = get_num_classes(pyg_dataset)
  num_features = get_num_features(pyg_dataset)
  print("{} dataset has {} classes".format(name, num_classes))
  print("{} dataset has {} features".format(name, num_features))

ENZYMES dataset has 6 classes
ENZYMES dataset has 3 features


## PyG Data

Each PyG dataset stores a list of `torch_geometric.data.Data` objects, where each `torch_geometric.data.Data` object represents a graph. We can easily get the `Data` object by indexing into the dataset.

For more information such as what is stored in the `Data` object, please refer to the [documentation](https://pytorch-geometric.readthedocs.io/en/latest/modules/data.html#torch_geometric.data.Data).
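
To make the storage layout concrete, here is a minimal sketch in plain Python (no PyG required) of the COO connectivity format that a `Data` object keeps in `edge_index`. The toy graph and the `neighbors` helper dictionary are hypothetical and only for illustration; in PyG, `edge_index` is a tensor of shape `[2, num_edges]`, and an undirected edge is conventionally stored as two directed entries.

```python
# Toy undirected path graph 0 - 1 - 2 (hypothetical, not taken from ENZYMES).
# Row 0 holds source nodes, row 1 holds target nodes; each undirected edge
# appears once per direction, so 2 undirected edges give 4 columns.
edge_index = [
    [0, 1, 1, 2],  # sources
    [1, 0, 2, 1],  # targets
]

num_nodes = 3
neighbors = {node: [] for node in range(num_nodes)}
for src, dst in zip(edge_index[0], edge_index[1]):
    neighbors[src].append(dst)

print(neighbors)  # {0: [1], 1: [0, 2], 2: [1]}
```

Reading the two rows column by column recovers the edge list, which is why the second dimension of `edge_index` equals the number of stored (directed) edges.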

## Question 2: What is the label of the graph with index 100 in the ENZYMES dataset?

In [None]:
def get_graph_class(pyg_dataset, idx):
  # Get the graph with the given index
  graph = pyg_dataset[idx]

  # Get the label of the graph
  label = graph.y.item()

  return label

# Here pyg_dataset is a dataset for graph classification
if 'IS_GRADESCOPE_ENV' not in os.environ:
  graph_0 = pyg_dataset[0]
  print(graph_0)
  idx = 100
  label = get_graph_class(pyg_dataset, idx)
  print('Graph with index {} has label {}'.format(idx, label))

Data(edge_index=[2, 168], x=[37, 3], y=[1])
Graph with index 100 has label 4


## Question 3: How many edges does the graph with index 200 have?

In [None]:
def get_graph_num_edges(pyg_dataset, idx):
  # Get the graph with the given index
  graph = pyg_dataset[idx]

  # Get the edge index
  edge_index = graph.edge_index

  # Convert the edge index to a set of tuples
  edges = set()
  for i in range(edge_index.shape[1]):
    node1 = edge_index[0, i].item()
    node2 = edge_index[1, i].item()
    # Add the edge to the set, ensuring we don't count it twice
    edges.add(tuple(sorted((node1, node2))))

  # The number of edges is the length of the set
  num_edges = len(edges)

  return num_edges

if 'IS_GRADESCOPE_ENV' not in os.environ:
  idx = 200
  num_edges = get_graph_num_edges(pyg_dataset, idx)
  print('Graph with index {} has {} edges'.format(idx, num_edges))

Graph with index 200 has 53 edges
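
The set-based count above treats `(u, v)` and `(v, u)` as the same edge. The following plain-Python sketch shows the same idea on a hypothetical triangle graph; `count_undirected_edges` is an illustrative helper, not a PyG function. When every edge is stored in both directions and there are no self-loops, the result matches half the number of stored columns.

```python
def count_undirected_edges(edge_index):
    """Count unique undirected edges in a COO edge list given as two lists."""
    # Sorting each pair maps (u, v) and (v, u) to the same key.
    unique = {tuple(sorted(pair)) for pair in zip(edge_index[0], edge_index[1])}
    return len(unique)


# Triangle 0 - 1 - 2 - 0, with each edge stored in both directions.
toy_edge_index = [
    [0, 1, 1, 2, 2, 0],
    [1, 0, 2, 1, 0, 2],
]

print(count_undirected_edges(toy_edge_index))  # 3
print(len(toy_edge_index[0]) // 2)             # 3
```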


# 2) Open Graph Benchmark (OGB)

The Open Graph Benchmark (OGB) is a collection of realistic, large-scale, and diverse benchmark datasets for machine learning on graphs. Its datasets are automatically downloaded, processed, and split using the OGB Data Loader. The model performance can then be evaluated by using the OGB Evaluator in a unified manner.

## Dataset and Data

OGB also supports PyG dataset and data classes. Here we take a look at the `ogbn-arxiv` dataset.

In [None]:
import torch_geometric.transforms as T
from ogb.nodeproppred import PygNodePropPredDataset

if 'IS_GRADESCOPE_ENV' not in os.environ:
  dataset_name = 'ogbn-arxiv'
  # Load the dataset and transform it to sparse tensor
  dataset = PygNodePropPredDataset(name=dataset_name,
                                  transform=T.ToSparseTensor())
  print('The {} dataset has {} graph'.format(dataset_name, len(dataset)))

  # Extract the graph
  data = dataset[0]
  print(data)

Downloading http://snap.stanford.edu/ogb/data/nodeproppred/arxiv.zip


Downloaded 0.08 GB: 100%|██████████| 81/81 [00:01<00:00, 50.87it/s]


Extracting dataset/arxiv.zip


Processing...


Loading necessary files...
This might take a while.
Processing graphs...


100%|██████████| 1/1 [00:00<00:00, 1857.53it/s]


Converting graphs into PyG objects...


100%|██████████| 1/1 [00:00<00:00, 3738.24it/s]

Saving...



Done!


The ogbn-arxiv dataset has 1 graph
Data(num_nodes=169343, x=[169343, 128], node_year=[169343, 1], y=[169343, 1], adj_t=[169343, 169343])


## Question 4: How many features are in the ogbn-arxiv graph?

In [None]:
def graph_num_features(data):
  # returns the number of features in the graph (as an integer).

  return data.num_features

if 'IS_GRADESCOPE_ENV' not in os.environ:
  num_features = graph_num_features(data)
  print('The graph has {} features'.format(num_features))

The graph has 128 features


# 3) GNN: Node Property Prediction

In this section we will build our first graph neural network using PyTorch Geometric. Then we will apply it to the task of node property prediction (node classification).

Specifically, we will use GCN as the foundation for your graph neural network ([Kipf et al. (2017)](https://arxiv.org/pdf/1609.02907.pdf)). To do so, we will work with PyG's built-in `GCNConv` layer.
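
For intuition about what a GCN layer computes before its learned linear transform, the sketch below builds the symmetrically normalized adjacency with self-loops, $\hat{D}^{-1/2}(A + I)\hat{D}^{-1/2}$, from Kipf and Welling's formulation. It is written in plain Python on a hypothetical 3-node path graph; `gcn_normalize` is an illustrative name, and the code assumes the input adjacency has a zero diagonal. It is not how `GCNConv` is implemented internally.

```python
import math


def gcn_normalize(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a dense adjacency (list of lists)."""
    n = len(adj)
    # Add self-loops (assumes adj has no self-loops already).
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    # Degrees are computed after adding self-loops.
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


# Path graph 0 - 1 - 2 with one scalar feature per node.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
x = [1.0, 2.0, 3.0]

norm_adj = gcn_normalize(adj)
# One propagation step: each node takes a normalized sum over itself and its neighbors.
propagated = [sum(norm_adj[i][j] * x[j] for j in range(len(x)))
              for i in range(len(x))]
print(norm_adj[0][0])  # 0.5
```

The normalization keeps feature magnitudes comparable across nodes with different degrees, which is one reason the adjacency is made symmetric during preprocessing.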

## Setup

In [None]:
import torch
import pandas as pd
import torch.nn.functional as F
print(torch.__version__)

# The PyG built-in GCNConv
from torch_geometric.nn import GCNConv

import torch_geometric.transforms as T
from ogb.nodeproppred import PygNodePropPredDataset, Evaluator

2.4.0+cu121


## Load and Preprocess the Dataset

In [None]:
if 'IS_GRADESCOPE_ENV' not in os.environ:
  dataset_name = 'ogbn-arxiv'
  # Symmetrize the edges with ToUndirected before building the sparse
  # adjacency. data.adj_t.to_symmetric() exists only on
  # torch_sparse.SparseTensor and raises AttributeError when adj_t is a
  # native torch sparse tensor.
  dataset = PygNodePropPredDataset(
      name=dataset_name,
      transform=T.Compose([T.ToUndirected(), T.ToSparseTensor()]))
  data = dataset[0]

  device = 'cuda' if torch.cuda.is_available() else 'cpu'

  # If you use GPU, the device should be cuda
  print('Device: {}'.format(device))

  data = data.to(device)
  split_idx = dataset.get_idx_split()
  train_idx = split_idx['train'].to(device)

## GCN Model

Now we will implement our GCN model!

Please follow the figure below to implement the `forward` function.


![test](https://drive.google.com/uc?id=128AuYAXNXGg7PIhJJ7e420DoPWKb-RtL)

# Graph Convolutional Network (GCN) Implementation

This code implements a flexible Graph Convolutional Network with several key architectural features:

## Architecture Components
1. **Multiple GCN Layers**:
   - Configurable number of layers
   - Consistent hidden dimension across intermediate layers
   - Input and output dimensions can be specified separately

2. **Regularization Features**:
   - Batch normalization after each hidden layer
   - Dropout for preventing overfitting
   - ReLU activation between layers

3. **Flexibility Options**:
   - Can return node embeddings or classification outputs
   - Includes parameter reset functionality
   - Configurable architecture depth

## Processing Flow
The network processes graph data through sequential layers of:
1. Graph convolution
2. Batch normalization
3. ReLU activation
4. Dropout
Finally, log-softmax is applied for classification (unless node embeddings are returned).
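
The final classification step can be illustrated without PyTorch. The sketch below computes a numerically stable log-softmax over hypothetical logits for one node and then the negative log-likelihood of the true class, which is the quantity a loss such as `F.nll_loss` averages over nodes. The function names `log_softmax` and `nll` here are illustrative plain-Python helpers.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)  # subtract the max to avoid overflow in exp
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_sum for v in logits]


def nll(log_probs, target):
    """Negative log-likelihood of the target class for a single example."""
    return -log_probs[target]


logits = [2.0, 0.5, -1.0]  # made-up scores for a 3-class problem
log_probs = log_softmax(logits)
loss = nll(log_probs, target=0)

# Exponentiating the log-probabilities recovers a distribution that sums to 1.
total_prob = sum(math.exp(lp) for lp in log_probs)
```

Because the model already outputs log-probabilities, pairing it with a negative log-likelihood loss is equivalent to applying cross-entropy to the raw logits.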

In [None]:
class GCN(torch.nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim, num_layers,
                 dropout, return_embeds=False):

        super(GCN, self).__init__()

        # A list of GCNConv layers
        self.convs = torch.nn.ModuleList()
        self.convs.append(GCNConv(input_dim, hidden_dim))  # First layer
        for _ in range(num_layers - 2):  # Hidden layers
            self.convs.append(GCNConv(hidden_dim, hidden_dim))
        self.convs.append(GCNConv(hidden_dim, output_dim))  # Last layer

        # A list of 1D batch normalization layers
        self.bns = torch.nn.ModuleList()
        for _ in range(num_layers - 1):
            self.bns.append(torch.nn.BatchNorm1d(hidden_dim))

        # The log softmax layer
        self.softmax = torch.nn.LogSoftmax(dim=-1)

        # Probability of an element getting zeroed
        self.dropout = dropout

        # Skip classification layer and return node embeddings
        self.return_embeds = return_embeds

    def reset_parameters(self):
        for conv in self.convs:
            conv.reset_parameters()
        for bn in self.bns:
            bn.reset_parameters()

    def forward(self, x, adj_t):

        # First layer
        out = self.convs[0](x, adj_t)
        out = self.bns[0](out)
        out = F.relu(out)
        out = F.dropout(out, p=self.dropout, training=self.training)

        # Hidden layers
        for i in range(1, len(self.convs) - 1):
            out = self.convs[i](out, adj_t)
            out = self.bns[i](out)
            out = F.relu(out)
            out = F.dropout(out, p=self.dropout, training=self.training)

        # Last layer
        out = self.convs[-1](out, adj_t)

        # Skip classification layer and return node embeddings
        if self.return_embeds:
            return out

        # Apply log softmax for classification
        out = self.softmax(out)

        return out

# GNN Training Function

A streamlined training function for Graph Neural Networks that implements a single training step. The function handles:

## Core Components
1. **Model State Management**:
   - Sets model to training mode
   - Handles gradient zeroing and updates

2. **Forward Pass Processing**:
   - Processes node features and adjacency matrix
   - Focuses on specified training indices
   - Applies loss function to relevant nodes only

3. **Optimization Step**:
   - Computes gradients via backpropagation
   - Updates model parameters using optimizer
   - Returns the computed loss for monitoring

This function represents one complete training iteration and is typically called within an epoch loop.

In [None]:
def train(model, data, train_idx, optimizer, loss_fn):
    """Train the model for one iteration"""

    # Set model to training mode (dropout is active; batch norm uses batch statistics)
    model.train()
    loss = 0

    # Clear accumulated gradients from previous iteration
    optimizer.zero_grad()

    # Forward pass: compute model predictions
    # data.x: node features, data.adj_t: adjacency matrix
    out = model(data.x, data.adj_t)

    # Calculate loss only on training nodes
    # train_idx selects which nodes to use for training
    # squeeze() removes extra dimensions from target labels
    loss = loss_fn(out[train_idx], data.y[train_idx].squeeze())

    # Backward pass: compute gradients
    loss.backward()

    # Update model parameters using optimizer
    optimizer.step()

    # Return scalar loss value for monitoring
    return loss.item()

# GNN Evaluation Function

A comprehensive evaluation function for Graph Neural Networks that tests model performance across multiple data splits (train/validation/test). Key features include:

## Functionality
1. **Performance Evaluation**:
   - Runs model in evaluation mode
   - Computes predictions for entire dataset
   - Calculates accuracy for each data split
   
2. **Prediction Processing**:
   - Uses argmax for class prediction
   - Keeps predictions as a column (`keepdim=True`) to match the label shape
   - Runs a single forward pass over the full graph

3. **Results Management**:
   - Optional result saving functionality
   - Exports predictions to CSV
   - Returns accuracies for all splits

The function is decorated with `@torch.no_grad()` for memory efficiency during evaluation.
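
As a rough illustration of the evaluation logic, the plain-Python sketch below takes the argmax of per-node scores and computes accuracy separately for each split. All values are made up, and the `accuracy` helper is hypothetical; the notebook itself delegates this computation to the OGB `Evaluator`.

```python
def accuracy(y_true, y_pred, idx):
    """Fraction of correctly predicted labels among the nodes listed in idx."""
    correct = sum(1 for i in idx if y_true[i] == y_pred[i])
    return correct / len(idx)


# Made-up class scores for 4 nodes and 2 classes.
scores = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
# argmax over classes gives one predicted label per node.
y_pred = [max(range(len(row)), key=row.__getitem__) for row in scores]
y_true = [1, 0, 0, 0]

# Hypothetical split of node indices into train/valid/test.
split_idx = {'train': [0, 1], 'valid': [2], 'test': [3]}
accs = {name: accuracy(y_true, y_pred, idx) for name, idx in split_idx.items()}
print(accs)  # {'train': 1.0, 'valid': 0.0, 'test': 1.0}
```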

In [None]:
@torch.no_grad()  # Disable gradient computation for efficiency
def test(model, data, split_idx, evaluator, save_model_results=False):
    """Evaluate model performance on train, validation, and test sets"""

    # Set model to evaluation mode (disables dropout; batch norm uses running statistics)
    model.eval()

    # Forward pass on full dataset
    out = model(data.x, data.adj_t)

    # Convert log-probabilities to class predictions via argmax
    # keepdim=True maintains dimension for evaluator compatibility
    y_pred = out.argmax(dim=-1, keepdim=True)

    # Calculate accuracy on training set
    train_acc = evaluator.eval({
        'y_true': data.y[split_idx['train']],  # True labels for training nodes
        'y_pred': y_pred[split_idx['train']],  # Predicted labels for training nodes
    })['acc']

    # Calculate accuracy on validation set
    valid_acc = evaluator.eval({
        'y_true': data.y[split_idx['valid']],  # True labels for validation nodes
        'y_pred': y_pred[split_idx['valid']],  # Predicted labels for validation nodes
    })['acc']

    # Calculate accuracy on test set
    test_acc = evaluator.eval({
        'y_true': data.y[split_idx['test']],  # True labels for test nodes
        'y_pred': y_pred[split_idx['test']],  # Predicted labels for test nodes
    })['acc']

    # Optionally save model predictions to CSV
    if save_model_results:
        print("Saving Model Predictions")

        # Prepare predictions for saving
        # (named `results` to avoid shadowing the `data` argument)
        results = {
            'y_pred': y_pred.view(-1).cpu().detach().numpy()  # Flatten and convert to numpy
        }

        # Create DataFrame and save to CSV
        df = pd.DataFrame(data=results)
        df.to_csv('ogbn-arxiv_node.csv', sep=',', index=False)

    return train_acc, valid_acc, test_acc

In [None]:
# Please do not change the args
if 'IS_GRADESCOPE_ENV' not in os.environ:
  args = {
      'device': device,
      'num_layers': 3,
      'hidden_dim': 256,
      'dropout': 0.5,
      'lr': 0.01,
      'epochs': 100,
  }
  args

In [None]:
if 'IS_GRADESCOPE_ENV' not in os.environ:
  model = GCN(data.num_features, args['hidden_dim'],
              dataset.num_classes, args['num_layers'],
              args['dropout']).to(device)
  evaluator = Evaluator(name='ogbn-arxiv')

In [None]:
# Please do not change these args
# Training should take <10min using GPU runtime
import copy
if 'IS_GRADESCOPE_ENV' not in os.environ:
  # reset the parameters to initial random value
  model.reset_parameters()

  optimizer = torch.optim.Adam(model.parameters(), lr=args['lr'])
  loss_fn = F.nll_loss

  best_model = None
  best_valid_acc = 0

  for epoch in range(1, 1 + args["epochs"]):
    loss = train(model, data, train_idx, optimizer, loss_fn)
    result = test(model, data, split_idx, evaluator)
    train_acc, valid_acc, test_acc = result
    if valid_acc > best_valid_acc:
        best_valid_acc = valid_acc
        best_model = copy.deepcopy(model)
    print(f'Epoch: {epoch:02d}, '
          f'Loss: {loss:.4f}, '
          f'Train: {100 * train_acc:.2f}%, '
          f'Valid: {100 * valid_acc:.2f}% '
          f'Test: {100 * test_acc:.2f}%')

## Question 5: What are your `best_model` validation and test accuracies?

Run the cell below to see the results of your best model and save your model's predictions to a file named *ogbn-arxiv_node.csv*. You can view this file by clicking on the *Folder* icon on the left side panel. Report the results on Gradescope.

In [None]:
if 'IS_GRADESCOPE_ENV' not in os.environ:
  best_result = test(best_model, data, split_idx, evaluator, save_model_results=True)
  train_acc, valid_acc, test_acc = best_result
  print(f'Best model: '
        f'Train: {100 * train_acc:.2f}%, '
        f'Valid: {100 * valid_acc:.2f}% '
        f'Test: {100 * test_acc:.2f}%')
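
The training loop in this section selects `best_model` by validation accuracy and snapshots it with `copy.deepcopy`. The sketch below isolates that pattern in plain Python; the dictionary standing in for a model and the per-epoch accuracies are made up for illustration.

```python
import copy

model = {"weight": 0.0}                 # hypothetical stand-in for a torch.nn.Module
valid_accs = [0.50, 0.62, 0.58, 0.61]   # made-up validation accuracy per epoch

best_model = None
best_valid_acc = 0.0

for epoch, valid_acc in enumerate(valid_accs, start=1):
    model["weight"] = float(epoch)      # pretend a training step changed the parameters
    if valid_acc > best_valid_acc:
        best_valid_acc = valid_acc
        # deepcopy takes a snapshot that later training steps cannot modify
        best_model = copy.deepcopy(model)

print(best_valid_acc, best_model)  # 0.62 {'weight': 2.0}
```

Without the deep copy, `best_model` would alias the live model and end up holding the final epoch's parameters rather than the best ones.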

# 4) GNN: Graph Property Prediction

In this section we will create a graph neural network for graph property prediction (graph classification).


## Load and preprocess the dataset

In [None]:
from ogb.graphproppred import PygGraphPropPredDataset, Evaluator
from torch_geometric.loader import DataLoader  # torch_geometric.data.DataLoader is deprecated
from tqdm.notebook import tqdm

if 'IS_GRADESCOPE_ENV' not in os.environ:
  # Load the dataset
  dataset = PygGraphPropPredDataset(name='ogbg-molhiv')

  device = 'cuda' if torch.cuda.is_available() else 'cpu'
  print('Device: {}'.format(device))

  split_idx = dataset.get_idx_split()

  # Check task type
  print('Task type: {}'.format(dataset.task_type))

In [None]:
# Load the dataset splits into corresponding dataloaders
# We will train the graph classification task on a batch of 32 graphs
# Shuffle the order of graphs for training set
if 'IS_GRADESCOPE_ENV' not in os.environ:
  train_loader = DataLoader(dataset[split_idx["train"]], batch_size=32, shuffle=True, num_workers=0)
  valid_loader = DataLoader(dataset[split_idx["valid"]], batch_size=32, shuffle=False, num_workers=0)
  test_loader = DataLoader(dataset[split_idx["test"]], batch_size=32, shuffle=False, num_workers=0)

In [None]:
if 'IS_GRADESCOPE_ENV' not in os.environ:
  # Please do not change the args
  args = {
      'device': device,
      'num_layers': 5,
      'hidden_dim': 256,
      'dropout': 0.5,
      'lr': 0.001,
      'epochs': 30,
  }
  args

## Graph Prediction Model

### Graph Mini-Batching
Before diving into the actual model, we introduce the concept of mini-batching with graphs. In order to parallelize the processing of a mini-batch of graphs, PyG combines the graphs into a single disconnected graph data object (*torch_geometric.data.Batch*). *torch_geometric.data.Batch* inherits from *torch_geometric.data.Data* (introduced earlier) and contains an additional attribute called `batch`.

The `batch` attribute is a vector mapping each node to the index of its corresponding graph within the mini-batch:

    batch = [0, ..., 0, 1, ..., n - 2, n - 1, ..., n - 1]

This attribute is crucial for associating which graph each node belongs to and can be used to e.g. average the node embeddings for each graph individually to compute graph level embeddings.
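
To show how the `batch` vector drives per-graph averaging, here is a minimal plain-Python sketch of mean pooling over a mini-batch. The function name `global_mean_pool_sketch` and the toy embeddings are hypothetical; PyG's `global_mean_pool` performs the corresponding operation on tensors.

```python
def global_mean_pool_sketch(node_embeddings, batch):
    """Average node embeddings per graph, using batch[i] as node i's graph index."""
    num_graphs = max(batch) + 1
    dim = len(node_embeddings[0])
    sums = [[0.0] * dim for _ in range(num_graphs)]
    counts = [0] * num_graphs
    for emb, graph_id in zip(node_embeddings, batch):
        counts[graph_id] += 1
        for d in range(dim):
            sums[graph_id][d] += emb[d]
    return [[s / counts[g] for s in sums[g]] for g in range(num_graphs)]


# Three nodes in total: the first two belong to graph 0, the last to graph 1.
node_embeddings = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0]]
batch = [0, 0, 1]

graph_embeddings = global_mean_pool_sketch(node_embeddings, batch)
print(graph_embeddings)  # [[2.0, 3.0], [10.0, 20.0]]
```

The output has one row per graph in the mini-batch, regardless of how many nodes each graph contains.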



### Implementation
Now, we have all of the tools to implement a GCN Graph Prediction model!

We will reuse the existing GCN model to generate `node_embeddings` and then use `Global Pooling` over the nodes to create graph-level embeddings that can be used to predict properties for each graph. Remember that the `batch` attribute will be essential for performing Global Pooling over our mini-batch of graphs.

In [None]:
from ogb.graphproppred.mol_encoder import AtomEncoder
from torch_geometric.nn import global_add_pool, global_mean_pool

### GCN to predict graph property
class GCN_Graph(torch.nn.Module):
    def __init__(self, hidden_dim, output_dim, num_layers, dropout):
        super(GCN_Graph, self).__init__()

        # Load encoders for Atoms in molecule graphs
        self.node_encoder = AtomEncoder(hidden_dim)

        # Node embedding model
        # Note that the input_dim and output_dim are set to hidden_dim
        self.gnn_node = GCN(hidden_dim, hidden_dim,
            hidden_dim, num_layers, dropout, return_embeds=True)

        # Global mean pooling for graph-level representation
        self.pool = global_mean_pool

        # Output layer
        self.linear = torch.nn.Linear(hidden_dim, output_dim)


    def reset_parameters(self):
        self.gnn_node.reset_parameters()
        self.linear.reset_parameters()

    def forward(self, batched_data):
        # Input is a mini-batch of graphs (torch_geometric.data.Batch) and
        # output is the predicted graph property for each graph.

        # Extract important attributes of our mini-batch
        x, edge_index, batch = batched_data.x, batched_data.edge_index, batched_data.batch
        embed = self.node_encoder(x)

        # Generate node embeddings using GCN
        out = self.gnn_node(embed, edge_index)

        # Aggregate node embeddings to graph-level representation
        out = self.pool(out, batch)

        # Predict graph property using linear layer
        out = self.linear(out)

        return out

# Batch-wise GNN Training Function

A training function designed for handling batched graph data with special considerations for:

## Key Features
1. **Batch Processing**:
   - Iterates through data loader with progress bar
   - Handles device (CPU/GPU) transfer
   - Skips degenerate batches (a single node, or only one graph in the batch)

2. **Label Handling**:
   - Manages unlabeled data points
   - Uses masking for labeled data only
   - Converts labels to appropriate type

3. **Training Loop**:
   - Standard optimization cycle
   - Progress tracking via tqdm
   - Graceful handling of edge cases

The function is particularly robust for real-world datasets where some graphs may be unlabeled or have special cases requiring different handling.
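
The label mask used in the training function relies on the fact that NaN is not equal to itself, so `y == y` is false exactly for unlabeled (NaN) targets. The plain-Python sketch below demonstrates the idea with made-up labels and predictions.

```python
nan = float("nan")

labels = [1.0, nan, 0.0, nan]   # NaN marks an unlabeled graph (made-up data)
preds = [0.9, 0.4, 0.2, 0.7]    # made-up model outputs

# NaN != NaN, so the comparison is False only for unlabeled entries.
is_labeled = [y == y for y in labels]

# Keep only (prediction, label) pairs that have a real label.
kept = [(p, y) for p, y, keep in zip(preds, labels, is_labeled) if keep]

print(is_labeled)  # [True, False, True, False]
print(kept)        # [(0.9, 1.0), (0.2, 0.0)]
```

With tensors, the same comparison yields a boolean mask that can index both the predictions and the targets before the loss is computed.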

In [None]:
def train(model, device, data_loader, optimizer, loss_fn):
    # Train your model by using the given optimizer and loss_fn.
    model.train()
    loss = 0

    for step, batch in enumerate(tqdm(data_loader, desc="Iteration")):
      batch = batch.to(device)

      if batch.x.shape[0] == 1 or batch.batch[-1] == 0:
          pass
      else:
        ## ignore nan targets (unlabeled) when computing training loss.
        is_labeled = batch.y == batch.y

        optimizer.zero_grad()  # Reset gradients
        out = model(batch)  # Forward pass

        # Apply mask for labeled graphs and calculate loss
        loss = loss_fn(out[is_labeled], batch.y[is_labeled].type(torch.float32))


        loss.backward()
        optimizer.step()

    return loss.item()

# GNN Evaluation Function for Graph-Level Prediction

A comprehensive evaluation function for graph-level predictions that includes:

## Key Components
1. **Prediction Collection**:
   - Batch-wise processing with progress tracking
   - Skips batches that contain only a single node
   - Accumulates predictions and ground truth

2. **Data Processing**:
   - Memory-efficient evaluation using torch.no_grad()
   - Concatenates results across batches
   - Converts tensors to numpy arrays

3. **Results Management**:
   - Optional saving of predictions to CSV
   - Structured output format (y_pred | y_true)
   - Flexible file naming with save_file parameter

The function is designed for evaluation scenarios where model predictions need to be both assessed and potentially stored for later analysis.

In [None]:
# The evaluation function
def eval(model, device, loader, evaluator, save_model_results=False, save_file=None):
    model.eval()
    y_true = []
    y_pred = []

    for step, batch in enumerate(tqdm(loader, desc="Iteration")):
        batch = batch.to(device)

        if batch.x.shape[0] == 1:
            pass
        else:
            with torch.no_grad():
                pred = model(batch)

            y_true.append(batch.y.view(pred.shape).detach().cpu())
            y_pred.append(pred.detach().cpu())

    y_true = torch.cat(y_true, dim = 0).numpy()
    y_pred = torch.cat(y_pred, dim = 0).numpy()

    input_dict = {"y_true": y_true, "y_pred": y_pred}

    if save_model_results:
        print("Saving Model Predictions")

        # Create a pandas dataframe with two columns
        # y_pred | y_true
        data = {}
        data['y_pred'] = y_pred.reshape(-1)
        data['y_true'] = y_true.reshape(-1)

        df = pd.DataFrame(data=data)
        # Save to csv
        df.to_csv('ogbg-molhiv_graph_' + save_file + '.csv', sep=',', index=False)

    return evaluator.eval(input_dict)

In [None]:
if 'IS_GRADESCOPE_ENV' not in os.environ:
  model = GCN_Graph(args['hidden_dim'],
              dataset.num_tasks, args['num_layers'],
              args['dropout']).to(device)
  evaluator = Evaluator(name='ogbg-molhiv')

In [None]:
# Please do not change these args
# Training should take <10min using GPU runtime
import copy

if 'IS_GRADESCOPE_ENV' not in os.environ:
  model.reset_parameters()

  optimizer = torch.optim.Adam(model.parameters(), lr=args['lr'])
  loss_fn = torch.nn.BCEWithLogitsLoss()

  best_model = None
  best_valid_acc = 0

  for epoch in range(1, 1 + args["epochs"]):
    print('Training...')
    loss = train(model, device, train_loader, optimizer, loss_fn)

    print('Evaluating...')
    train_result = eval(model, device, train_loader, evaluator)
    val_result = eval(model, device, valid_loader, evaluator)
    test_result = eval(model, device, test_loader, evaluator)

    train_acc, valid_acc, test_acc = train_result[dataset.eval_metric], val_result[dataset.eval_metric], test_result[dataset.eval_metric]
    if valid_acc > best_valid_acc:
        best_valid_acc = valid_acc
        best_model = copy.deepcopy(model)
    print(f'Epoch: {epoch:02d}, '
          f'Loss: {loss:.4f}, '
          f'Train: {100 * train_acc:.2f}%, '
          f'Valid: {100 * valid_acc:.2f}% '
          f'Test: {100 * test_acc:.2f}%')

## Question 6: What are your `best_model` validation and test ROC-AUC scores? (20 points)

Run the cell below to see the results of your best model and save your model's predictions in files named *ogbg-molhiv_graph_[valid,test].csv*. Again, you can view the files by clicking on the *Folder* icon on the left side panel. Report the results on Gradescope.

In [None]:
if 'IS_GRADESCOPE_ENV' not in os.environ:
  train_auroc = eval(best_model, device, train_loader, evaluator)[dataset.eval_metric]
  valid_auroc = eval(best_model, device, valid_loader, evaluator, save_model_results=True, save_file="valid")[dataset.eval_metric]
  test_auroc  = eval(best_model, device, test_loader, evaluator, save_model_results=True, save_file="test")[dataset.eval_metric]

  print(f'Best model: '
      f'Train: {100 * train_auroc:.2f}%, '
      f'Valid: {100 * valid_auroc:.2f}% '
      f'Test: {100 * test_auroc:.2f}%')
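
For readers unfamiliar with the metric, ROC-AUC can be read as the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one. The sketch below computes it from that pairwise definition in plain Python on made-up data; `roc_auc` is an illustrative helper and is not the OGB evaluator's implementation.

```python
def roc_auc(y_true, y_score):
    """ROC-AUC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


y_true = [1, 0, 1, 0]
y_score = [0.9, 0.8, 0.4, 0.1]  # made-up scores; only their ranking matters

print(roc_auc(y_true, y_score))  # 0.75
```

Since only the ranking of scores matters, raw logits can be passed to the metric without first applying a sigmoid.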

## Question 7 (Optional): Experiment with the two other global pooling layers in PyTorch Geometric.
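
As a starting point, the sketch below shows in plain Python what sum and max pooling compute over a mini-batch; in PyG these correspond to `global_add_pool` and `global_max_pool`. The helper `global_pool_sketch` and the toy data are hypothetical, and the code assumes every graph in the batch has at least one node.

```python
def global_pool_sketch(node_embeddings, batch, reduce):
    """Pool node embeddings per graph with a column-wise reduction (e.g. sum or max)."""
    num_graphs = max(batch) + 1
    groups = [[] for _ in range(num_graphs)]
    for emb, graph_id in zip(node_embeddings, batch):
        groups[graph_id].append(emb)
    # zip(*rows) iterates over feature columns of the nodes in one graph.
    return [[reduce(col) for col in zip(*rows)] for rows in groups]


node_embeddings = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0]]
batch = [0, 0, 1]  # first two nodes belong to graph 0, the last to graph 1

add_pooled = global_pool_sketch(node_embeddings, batch, sum)
max_pooled = global_pool_sketch(node_embeddings, batch, max)

print(add_pooled)  # [[4.0, 6.0], [10.0, 20.0]]
print(max_pooled)  # [[3.0, 4.0], [10.0, 20.0]]
```

Sum pooling is sensitive to graph size, while max pooling keeps only the strongest activation per feature; comparing them against mean pooling is the point of this optional question.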

# Submission

To submit Colab 2, please submit to the following assignments on Gradescope:

1. "Colab 2": submit your answers to the questions in this assignment
2. "Colab 2 Code": submit your completed *CS224W_Colab_2.ipynb*. From the "File" menu select "Download .ipynb" to save a local copy of your completed Colab. **PLEASE DO NOT CHANGE THE NAME!** The autograder depends on the .ipynb file being called "CS224W_Colab_2.ipynb".

Clarification:
- In "Colab 2 Code", we grade Q1-Q4 (non-training questions) using the autograder.
- In "Colab 2", we grade Q5-Q6 (training questions), where Q1-Q4 are assigned 0 points.