This code is based on the notebook at https://colab.research.google.com/drive/1r_FWLSFf9iL0OWeHeD31d_Opt031P1Nq?usp=sharing and was modified to fit our needs, with some comments added.

In [12]:
import torch
from torch import Tensor
print(torch.__version__)

2.1.0


## Heterogeneous Graph Creation

### Load dataset

In [13]:
import pandas as pd
movies_path = '../data/movies.csv'
ratings_path = '../data/reviews_clean.csv'

In [14]:
movies_df = pd.read_csv(movies_path, index_col='rotten_tomatoes_link')
movie_feat = torch.from_numpy(movies_df.values).to(torch.float)
assert movie_feat.size() == (17255,31) 
movies_df.head()

rotten_tomatoes_link,tomatometer_status,tomatometer_rating,audience_status,audience_rating,Action & Adventure,Comedy,Drama,Science Fiction & Fantasy,Romance,Classics,...,Television,Cult Movies,Anime & Manga,Gay & Lesbian,content_rating_G,content_rating_NC17,content_rating_NR,content_rating_PG,content_rating_PG.1,content_rating_R
m/0814255,0.0,49.0,0,53.0,1,1,1,1,0,0,...,0,0,0,0,0,0,0,1,1,0
m/0878835,2.0,87.0,1,64.0,0,1,0,0,0,0,...,0,0,0,0,0,0,0,0,0,1
m/10,1.0,67.0,0,53.0,0,1,0,0,1,0,...,0,0,0,0,0,0,0,0,0,1
m/1000013-12_angry_men,2.0,100.0,1,97.0,0,0,1,0,0,1,...,0,0,0,0,0,0,1,0,0,0
m/1000079-20000_leagues_under_the_sea,1.0,89.0,1,74.0,1,0,1,0,0,0,...,0,0,0,0,1,0,0,0,0,0


The `reviews_clean.csv` data connects reviewers (as given by `Source`) and movies (as given by `Target`).
Here, we first read the `*.csv` file from disk, and create a mapping that maps entry IDs to a consecutive value in the range `{ 0, ..., num_rows - 1 }`.
This is needed as we want our final data representation to be as compact as possible, *e.g.*, the representation of a movie in the first row should be accessible via `x[0]`.

Afterwards, we obtain the final `edge_index` representation of shape `[2, num_ratings]` from `reviews_clean.csv` by merging mapped reviewer and movie indices with the raw indices given by the original data frame.
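
The mapping step described above can be illustrated on a toy example. This is a plain-Python sketch rather than the pandas code used in the notebook; the reviewer names, movie links and variable names are made up for illustration.

```python
# Toy review rows: (reviewer name, movie link). All values are illustrative.
reviews = [
    ("Alice", "m/b"),
    ("Bob", "m/a"),
    ("Alice", "m/a"),
]
# Movie links, in the same order as the rows of the movie feature matrix.
movie_links = ["m/a", "m/b", "m/c"]

# Reviewers are numbered in order of first appearance.
reviewer_to_idx = {}
for reviewer, _ in reviews:
    reviewer_to_idx.setdefault(reviewer, len(reviewer_to_idx))

# Movies are numbered by their row position in the feature table.
movie_to_idx = {link: i for i, link in enumerate(movie_links)}

# COO edge list: the first row holds source (reviewer) indices,
# the second row holds target (movie) indices.
edge_index = [
    [reviewer_to_idx[r] for r, _ in reviews],
    [movie_to_idx[m] for _, m in reviews],
]
print(edge_index)  # [[0, 1, 0], [1, 0, 0]]
```

Numbering movies by their position in the feature table, rather than by first appearance in the reviews, keeps index `i` aligned with row `i` of the feature matrix.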

In [15]:
ratings_df = pd.read_csv(ratings_path)
ratings_df.head()

Unnamed: 0,Source,Target,review_score_scaled,top_critic,review_type
0,Ben McEachen,m/0814255,7,0,1
1,Nick Schager,m/0814255,2,0,0
2,Bill Goodykoontz,m/0814255,7,1,1
3,Jordan Hoffman,m/0814255,7,0,1
4,Jim Schembri,m/0814255,6,1,1


In [16]:
# Create a mapping from unique reviewer indices to range [0, num_reviewer_nodes):
unique_reviewer_id = ratings_df['Source'].unique()
unique_reviewer_id = pd.DataFrame(data={
    'reviewerId': unique_reviewer_id,
    'mappedID': pd.RangeIndex(len(unique_reviewer_id)),
})
print("Mapping of reviewer IDs to consecutive values:")
print("==========================================")
print(unique_reviewer_id.head())
print()
# Create a mapping from movie IDs to range [0, num_movie_nodes).
# The mapping follows the row order of `movies_df`, so that mapped index `i`
# corresponds to row `i` of `movie_feat`:
unique_movie_id = pd.DataFrame(data={
    'movieId': movies_df.index,
    'mappedID': pd.RangeIndex(len(movies_df)),
})
print("Mapping of movie IDs to consecutive values:")
print("===========================================")
print(unique_movie_id.head())

# Perform merge to obtain the edges from reviewers and movies:
ratings_reviewer_id = pd.merge(ratings_df['Source'], unique_reviewer_id,
                            left_on='Source', right_on='reviewerId', how='left')
ratings_reviewer_id = torch.from_numpy(ratings_reviewer_id['mappedID'].values)
ratings_movie_id = pd.merge(ratings_df['Target'], unique_movie_id,
                            left_on='Target', right_on='movieId', how='left')
ratings_movie_id = torch.from_numpy(ratings_movie_id['mappedID'].values)

# With this, we are ready to construct our `edge_index` in COO format following PyG semantics:
edge_index_reviewer_to_movie = torch.stack([ratings_reviewer_id, ratings_movie_id], dim=0)
assert edge_index_reviewer_to_movie.size() == (2, 692227)

print()
print("Final edge indices pointing from reviewers to movies:")
print("=================================================")
print(edge_index_reviewer_to_movie)

Mapping of reviewer IDs to consecutive values:
         reviewerId  mappedID
0      Ben McEachen         0
1      Nick Schager         1
2  Bill Goodykoontz         2
3    Jordan Hoffman         3
4      Jim Schembri         4

Mapping of movie IDs to consecutive values:
                                 movieId  mappedID
0                              m/0814255         0
1                              m/0878835         1
2                                   m/10         2
3                 m/1000013-12_angry_men         3
4  m/1000079-20000_leagues_under_the_sea         4

Final edge indices pointing from reviewers to movies:
tensor([[    0,     1,     2,  ...,   176,   238,  1142],
        [    0,     0,     0,  ..., 17254, 17254, 17254]])


With this, we are ready to initialize our `HeteroData` object and pass the necessary information to it.
Note that we also pass in a `node_id` vector to each node type in order to reconstruct the original node indices from sampled subgraphs.
We also take care of adding reverse edges to the `HeteroData` object.
This allows our GNN model to use both directions of the edge for message passing:

In [17]:
from torch_geometric.data import HeteroData
import torch_geometric.transforms as T

data = HeteroData()

# Save node indices:
data["reviewer"].node_id = torch.arange(len(unique_reviewer_id))
data["movie"].node_id = torch.arange(len(movies_df))

# Add the node features and edge indices:
data["movie"].x = movie_feat
data["reviewer", "rates", "movie"].edge_index = edge_index_reviewer_to_movie

ratings_ = torch.from_numpy(ratings_df[['review_score_scaled','top_critic','review_type']].values).to(torch.float)
data["reviewer", "rates", "movie"].edge_attr = ratings_

# We also need to make sure to add the reverse edges from movies to reviewers
# in order to let a GNN be able to pass messages in both directions.
# We can leverage the `T.ToUndirected()` transform for this from PyG:
data = T.ToUndirected()(data)
print(data.edge_types)
print(data)

assert data.node_types == ["reviewer", "movie"]
assert data.edge_types == [("reviewer", "rates", "movie"),
                           ("movie", "rev_rates", "reviewer")]
assert data["reviewer"].num_nodes == 6791
assert data["reviewer"].num_features == 0
assert data["movie"].num_nodes == 17255
assert data["movie"].num_features == 31
assert data["reviewer", "rates", "movie"].num_edges == 692227
assert data["movie", "rev_rates", "reviewer"].num_edges == 692227

[('reviewer', 'rates', 'movie'), ('movie', 'rev_rates', 'reviewer')]
HeteroData(
  reviewer={ node_id=[6791] },
  movie={
    node_id=[17255],
    x=[17255, 31],
  },
  (reviewer, rates, movie)={
    edge_index=[2, 692227],
    edge_attr=[692227, 3],
  },
  (movie, rev_rates, reviewer)={
    edge_index=[2, 692227],
    edge_attr=[692227, 3],
  }
)
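
Conceptually, the reverse edge type in the output above holds the same edges with source and target swapped, which is why both edge types report the same number of edges and attributes. A simplified sketch on a toy edge list (this is not the transform's implementation, and the values are illustrative):

```python
# Toy COO edge list for ("reviewer", "rates", "movie"):
# row 0 holds reviewer indices, row 1 holds movie indices.
edge_index = [[0, 1, 0], [1, 0, 0]]

# The reverse edge type ("movie", "rev_rates", "reviewer") swaps the two rows,
# so every edge now points from the movie back to the reviewer.
rev_edge_index = [edge_index[1], edge_index[0]]
print(rev_edge_index)  # [[1, 0, 0], [0, 1, 0]]
```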


## Defining Edge-level Training Splits

We split the `("reviewer", "rates", "movie")` edges into training, validation and test sets.

During the training of the GNN, some edges are used for message passing and others for supervision. The former propagate information through the graph (each node aggregates information from its neighbors), while the latter define the training task and are used to calculate the loss.

Note that we also need to specify the reverse edge type `("movie", "rev_rates", "reviewer")`. This allows the `RandomLinkSplit` transform to drop the corresponding reverse edges so that no information leaks into the training phase.

In [18]:
# For this, we first split the set of edges into training (80%), validation (10%) and testing edges (10%).
# Across the training edges, we use 70% of edges for message passing and 30% of edges for supervision.
# We further want to generate fixed negative edges for evaluation with a ratio of 2:1.
# Negative edges during training will be generated on-the-fly.
# We can leverage the `RandomLinkSplit()` transform for this from PyG:
transform = T.RandomLinkSplit(
    num_val=0.1,
    num_test=0.1,
    disjoint_train_ratio=0.3,
    neg_sampling_ratio=2.0,
    add_negative_train_samples=False,
    edge_types=("reviewer", "rates", "movie"),
    rev_edge_types=("movie", "rev_rates", "reviewer"),
)

train_data, val_data, test_data = transform(data)
print("Training data:")
print("==============")
print(train_data)
print()
print("Validation data:")
print("================")
print(val_data)

Training data:
HeteroData(
  reviewer={ node_id=[6791] },
  movie={
    node_id=[17255],
    x=[17255, 31],
  },
  (reviewer, rates, movie)={
    edge_index=[2, 387649],
    edge_attr=[387649, 3],
    edge_label=[166134],
    edge_label_index=[2, 166134],
  },
  (movie, rev_rates, reviewer)={
    edge_index=[2, 387649],
    edge_attr=[387649, 3],
  }
)

Validation data:
HeteroData(
  reviewer={ node_id=[6791] },
  movie={
    node_id=[17255],
    x=[17255, 31],
  },
  (reviewer, rates, movie)={
    edge_index=[2, 553783],
    edge_attr=[553783, 3],
    edge_label=[207666],
    edge_label_index=[2, 207666],
  },
  (movie, rev_rates, reviewer)={
    edge_index=[2, 553783],
    edge_attr=[553783, 3],
  }
)


In [19]:
assert train_data["reviewer", "rates", "movie"].num_edges == 387649
assert train_data["reviewer", "rates", "movie"].edge_label_index.size(1) == 166134
assert train_data["movie", "rev_rates", "reviewer"].num_edges == 387649
# No negative edges added:
assert train_data["reviewer", "rates", "movie"].edge_label.min() == 1
assert train_data["reviewer", "rates", "movie"].edge_label.max() == 1

In [20]:
assert val_data["reviewer", "rates", "movie"].num_edges == 553783
assert val_data["reviewer", "rates", "movie"].edge_label_index.size(1) == 207666
assert val_data["movie", "rev_rates", "reviewer"].num_edges == 553783
# Negative edges with ratio 2:1:
assert val_data["reviewer", "rates", "movie"].edge_label.long().bincount().tolist() == [138444, 69222]

In [21]:
assert test_data["reviewer", "rates", "movie"].num_edges == 623005
assert test_data["reviewer", "rates", "movie"].edge_label_index.size(1) == 207666
assert test_data["movie", "rev_rates", "reviewer"].num_edges == 623005
# Negative edges with ratio 2:1:
assert test_data["reviewer", "rates", "movie"].edge_label.long().bincount().tolist() == [138444, 69222]
test_data

HeteroData(
  reviewer={ node_id=[6791] },
  movie={
    node_id=[17255],
    x=[17255, 31],
  },
  (reviewer, rates, movie)={
    edge_index=[2, 623005],
    edge_attr=[623005, 3],
    edge_label=[207666],
    edge_label_index=[2, 207666],
  },
  (movie, rev_rates, reviewer)={
    edge_index=[2, 623005],
    edge_attr=[623005, 3],
  }
)
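
The edge counts asserted above follow from the split ratios. The arithmetic below reproduces them, assuming each fractional count is truncated to an integer (an assumption that is consistent with the reported sizes); the variable names are illustrative.

```python
num_edges = 692227

# 10% of the edges each for validation and test; the rest is for training.
num_val = int(0.1 * num_edges)
num_test = int(0.1 * num_edges)
num_train = num_edges - num_val - num_test

# 30% of the training edges are held out for supervision,
# the remaining 70% are used for message passing.
num_supervision = int(0.3 * num_train)
num_message_passing = num_train - num_supervision

# Evaluation labels: every positive edge comes with two sampled negatives.
num_eval_negatives = 2 * num_val
num_eval_labels = num_val + num_eval_negatives

# Validation uses all training edges for message passing;
# testing additionally uses the validation edges.
val_message_passing = num_train
test_message_passing = num_train + num_val

print(num_message_passing, num_supervision)       # 387649 166134
print(val_message_passing, test_message_passing)  # 553783 623005
print(num_eval_negatives, num_val, num_eval_labels)  # 138444 69222 207666
```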

## Defining Mini-batch Loaders

We are now ready to create a mini-batch loader that will generate subgraphs that can be used as input into our GNN.
While this step is not strictly necessary for small-scale graphs, it is required to apply GNNs to larger graphs that would otherwise not fit into GPU memory.
We make use of the [`loader.LinkNeighborLoader`](https://pytorch-geometric.readthedocs.io/en/latest/modules/loader.html#torch_geometric.loader.LinkNeighborLoader), which samples multiple hops from both ends of a link and creates a subgraph from it.
Here, `edge_label_index` serves as the "seed links" to start sampling from.
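
The idea of fan-out-limited multi-hop sampling can be sketched in plain Python. This is a simplified, homogeneous illustration and not PyG's implementation: it ignores edge types, and the function name `sample_subgraph` and the toy graph are hypothetical.

```python
import random

def sample_subgraph(adj, seeds, num_neighbors, rng):
    """Return the nodes reached from `seeds` when each hop keeps at most
    `num_neighbors[hop]` randomly chosen neighbors per frontier node."""
    visited = set(seeds)
    frontier = set(seeds)
    for fanout in num_neighbors:
        next_frontier = set()
        for node in frontier:
            neighbors = adj.get(node, [])
            k = min(fanout, len(neighbors))
            next_frontier.update(rng.sample(neighbors, k))
        # Only nodes that were not seen before are expanded in the next hop.
        frontier = next_frontier - visited
        visited |= next_frontier
    return visited

# Toy undirected graph as an adjacency list.
adj = {0: [1, 2, 3, 4], 1: [0, 5], 2: [0], 3: [0], 4: [0], 5: [1]}

# Both endpoints of a seed link (0, 1) act as starting nodes.
nodes = sample_subgraph(adj, seeds=[0, 1], num_neighbors=[2, 1], rng=random.Random(0))
print(sorted(nodes))
```

Capping the number of neighbors per hop bounds the size of each sampled subgraph, which is what keeps mini-batches tractable on large graphs.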

In [22]:
# In the first hop, we sample at most 20 neighbors.
# In the second hop, we sample at most 10 neighbors.
# In addition, during training, we want to sample negative edges on-the-fly with a ratio of 2:1.
# We can make use of the `loader.LinkNeighborLoader` from PyG:
from torch_geometric.loader import LinkNeighborLoader

# Define seed edges:
edge_label_index = train_data["reviewer", "rates", "movie"].edge_label_index
edge_label = train_data["reviewer", "rates", "movie"].edge_label
num_neighbors = [20, 10]    # maximum number of neighbors sampled per node in hop 1 and hop 2
batch_size = 128    # number of seed edges in each batch

train_loader = LinkNeighborLoader(
    data=train_data,
    num_neighbors=num_neighbors,
    neg_sampling_ratio=2.0,
    edge_label_index=(("reviewer", "rates", "movie"), edge_label_index),
    edge_label=edge_label,
    batch_size=batch_size,
    shuffle=True,
)

# Inspect a sample:
sampled_data = next(iter(train_loader))

print("Sampled mini-batch:")
print("===================")
print(sampled_data)
# Disabled checks:
# assert sampled_data["reviewer", "rates", "movie"].edge_label_index.size(1) == 3 * batch_size
# assert sampled_data["reviewer", "rates", "movie"].edge_label.min() == 0
# assert sampled_data["reviewer", "rates", "movie"].edge_label.max() == 1

Sampled mini-batch:
HeteroData(
  reviewer={
    node_id=[3030],
    n_id=[3030],
  },
  movie={
    node_id=[7707],
    x=[7707, 31],
    n_id=[7707],
  },
  (reviewer, rates, movie)={
    edge_index=[2, 34631],
    edge_attr=[34631, 3],
    edge_label=[384],
    edge_label_index=[2, 384],
    e_id=[34631],
    input_id=[128],
  },
  (movie, rev_rates, reviewer)={
    edge_index=[2, 15965],
    edge_attr=[15965, 3],
    e_id=[15965],
  }
)

## Creating a Heterogeneous Link-level GNN

We are now ready to create our heterogeneous GNN.
The GNN is responsible for learning enriched node representations from the surrounding subgraphs, which can be then used to derive edge-level predictions.
For defining our heterogeneous GNN, we make use of [`nn.SAGEConv`](https://pytorch-geometric.readthedocs.io/en/latest/modules/nn.html#torch_geometric.nn.conv.SAGEConv) or `nn.GraphConv` and the [`nn.to_hetero()`](https://pytorch-geometric.readthedocs.io/en/latest/modules/nn.html#torch_geometric.nn.to_hetero_transformer.to_hetero) function, which transforms a GNN defined on homogeneous graphs so that it can be applied to heterogeneous ones.

In addition, we define a final link-level classifier, which simply takes both node embeddings of the link we are trying to predict, and applies a dot-product on them.

As reviewers do not have any node-level information, we choose to learn their features jointly via a `torch.nn.Embedding` layer. In order to improve the expressiveness of movie features, we do the same for movie nodes, and simply add their shallow embeddings to the linearly projected pre-defined movie features (ratings, genres and content ratings).
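
The dot-product classifier described above can be illustrated with plain lists instead of tensors. The embedding values and index pairs are made up for illustration.

```python
# Toy node embeddings after the GNN (2 reviewers, 3 movies, 2 dimensions).
reviewer_emb = [[1.0, 0.0], [0.5, 0.5]]
movie_emb = [[2.0, 1.0], [0.0, -1.0], [1.0, 1.0]]

# Supervision edges: row 0 holds reviewer indices, row 1 holds movie indices.
edge_label_index = [[0, 1, 1], [0, 1, 2]]

# One score (logit) per supervision edge: the dot product of the two
# endpoint embeddings.
scores = [
    sum(a * b for a, b in zip(reviewer_emb[r], movie_emb[m]))
    for r, m in zip(*edge_label_index)
]
print(scores)  # [2.0, -0.5, 1.0]
```

A larger score means the model considers the reviewer-movie link more likely; the scores are unbounded logits, not probabilities.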

In [33]:
from torch_geometric.nn import SAGEConv, GraphConv, to_hetero
import torch.nn.functional as F

class GNN(torch.nn.Module):
    def __init__(self, hidden_channels):
        super().__init__()

        #self.conv1 = GraphConv(hidden_channels, hidden_channels, add_self_loops=False)
        #self.conv2 = GraphConv(hidden_channels, hidden_channels, add_self_loops=False)
        #self.conv2 = GCNConv(hidden_channels, hidden_channels, add_self_loops=False)   #Cannot be used with heterogeneous graph
        # `aggr` is the SAGEConv keyword for the neighborhood aggregation scheme:
        self.conv1 = SAGEConv(hidden_channels, hidden_channels, aggr='mean')
        self.conv2 = SAGEConv(hidden_channels, hidden_channels, aggr='mean')

    def forward(self, x: Tensor, edge_index: Tensor) -> Tensor:
        x = F.relu(self.conv1(x, edge_index))
        x = self.conv2(x, edge_index)
        return x

# Our final classifier applies the dot-product between source and destination
# node embeddings to derive edge-level predictions:
class Classifier(torch.nn.Module):
    def forward(self, x_reviewer: Tensor, x_movie: Tensor, edge_label_index: Tensor) -> Tensor:
        # Convert node embeddings to edge-level representations:
        edge_feat_reviewer = x_reviewer[edge_label_index[0]]
        edge_feat_movie = x_movie[edge_label_index[1]]

        # Apply dot-product to get a prediction per supervision edge:
        return (edge_feat_reviewer * edge_feat_movie).sum(dim=-1)


class Model(torch.nn.Module):
    def __init__(self, hidden_channels):
        super().__init__()
        # we also learn two embedding matrices for reviewers and movies
        self.movie_lin = torch.nn.Linear(31, hidden_channels)
        self.reviewer_emb = torch.nn.Embedding(data["reviewer"].num_nodes, hidden_channels)
        self.movie_emb = torch.nn.Embedding(data["movie"].num_nodes, hidden_channels)

        # Instantiate homogeneous GNN
        self.gnn = GNN(hidden_channels)

        # Convert the GNN model into a heterogeneous variant.
        # This step is what allows GraphSAGE to run on heterogeneous graphs:
        self.gnn = to_hetero(self.gnn, metadata=data.metadata())

        self.classifier = Classifier()

    def forward(self, data: HeteroData) -> Tensor:
        x_dict = {
          "reviewer": self.reviewer_emb(data["reviewer"].node_id),
          "movie": self.movie_lin(data["movie"].x) + self.movie_emb(data["movie"].node_id),
        }

        # `x_dict` holds feature matrices of all node types
        # `edge_index_dict` holds all edge indices of all edge types
        x_dict = self.gnn(x_dict, data.edge_index_dict)
        pred = self.classifier(
            x_dict["reviewer"],
            x_dict["movie"],
            data["reviewer", "rates", "movie"].edge_label_index,
        )

        return pred

hidden_channels = 64
model = Model(hidden_channels=hidden_channels)
print(model)

Model(
  (movie_lin): Linear(in_features=31, out_features=64, bias=True)
  (reviewer_emb): Embedding(6791, 64)
  (movie_emb): Embedding(17255, 64)
  (gnn): GraphModule(
    (conv1): ModuleDict(
      (reviewer__rates__movie): SAGEConv(64, 64, aggr=mean)
      (movie__rev_rates__reviewer): SAGEConv(64, 64, aggr=mean)
    )
    (conv2): ModuleDict(
      (reviewer__rates__movie): SAGEConv(64, 64, aggr=mean)
      (movie__rev_rates__reviewer): SAGEConv(64, 64, aggr=mean)
    )
  )
  (classifier): Classifier()
)


## Training a Heterogeneous Link-level GNN

The training loop iterates over our mini-batches, applies the forward computation of the model, computes the loss from ground-truth labels and obtained predictions (here we make use of binary cross entropy), and adjusts model parameters via back-propagation and the Adam optimizer.

In [24]:
# Use validation set to keep the best model
# Define the validation seed edges:
edge_label_index = val_data["reviewer", "rates", "movie"].edge_label_index  # pairs of node ids (reviewer,movie) that define an edge
edge_label = val_data["reviewer", "rates", "movie"].edge_label  # if an edge exists or not

# val_loader takes batches (subset of nodes and some edges) from the validation set
val_loader = LinkNeighborLoader(
    data=val_data,
    num_neighbors=num_neighbors,
    edge_label_index=(("reviewer", "rates", "movie"), edge_label_index),
    edge_label=edge_label,
    batch_size=3 * batch_size,
    shuffle=False,
)

sampled_data = next(iter(val_loader))

#print("Sampled mini-batch:")
#print("===================")
#print(sampled_data)

assert sampled_data["reviewer", "rates", "movie"].edge_label_index.size(1) == 3 * batch_size
assert sampled_data["reviewer", "rates", "movie"].edge_label.min() >= 0
assert sampled_data["reviewer", "rates", "movie"].edge_label.max() <= 1

In [25]:
import tqdm
import time
start_time = time.time()
import torch.nn.functional as F
from sklearn.metrics import roc_auc_score

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Device: '{device}'")

best_auc = 0.0
best_epoch = 0
early_stopping_limit = 2
epochs = 10
learning_rate = 0.001

model = model.to(device)
#optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

for epoch in range(0, epochs):
    # Training
    model.train()
    total_loss = total_examples = 0
    for sampled_data in tqdm.tqdm(train_loader):
        optimizer.zero_grad()

        sampled_data.to(device)
        pred = model(sampled_data)

        ground_truth = sampled_data["reviewer", "rates", "movie"].edge_label
        loss = F.binary_cross_entropy_with_logits(pred, ground_truth)

        loss.backward()
        optimizer.step()
        total_loss += float(loss) * pred.numel()
        total_examples += pred.numel()
    print(f"Epoch: {epoch+1}, Training Loss: {total_loss / total_examples:.4f}")

    # Validation
    model.eval()
    preds_val = []
    ground_truths_val = []
    for sampled_data_val in tqdm.tqdm(val_loader):
        with torch.no_grad():
            sampled_data_val.to(device)
            preds_val.append(model(sampled_data_val))
            ground_truths_val.append(sampled_data_val["reviewer", "rates", "movie"].edge_label)

    pred_val = torch.cat(preds_val, dim=0).cpu().numpy()
    ground_truth_val = torch.cat(ground_truths_val, dim=0).cpu().numpy()
    auc_val = roc_auc_score(ground_truth_val, pred_val)
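    # Note: `total_loss / total_examples` is the training loss accumulated above;
    # no separate validation loss is computed, so the value printed here repeats it.
    print(f"Epoch: {epoch+1},\n Validation Loss: {total_loss / total_examples:.4f}")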
    print(f" Validation AUC: {auc_val:.4f}")

    if auc_val > best_auc: 
        best_auc = auc_val
        best_epoch = epoch
        torch.save(model.state_dict(), 'best_model.pth')
    elif epoch - best_epoch >= early_stopping_limit: # Early stopping
        print(f"Early stopping at epoch {epoch+1}")
        break

print(f"Training time: {time.time() - start_time}")

Device: 'cpu'


100%|██████████| 1298/1298 [08:23<00:00,  2.58it/s]


Epoch: 1, Training Loss: 0.6031


100%|██████████| 541/541 [01:10<00:00,  7.64it/s]


Epoch: 1,
 Validation Loss: 0.6031
 Validation AUC: 0.9321


100%|██████████| 1298/1298 [08:29<00:00,  2.55it/s]


Epoch: 2, Training Loss: 0.2655


100%|██████████| 541/541 [01:11<00:00,  7.52it/s]


Epoch: 2,
 Validation Loss: 0.2655
 Validation AUC: 0.9560


100%|██████████| 1298/1298 [08:28<00:00,  2.55it/s]


Epoch: 3, Training Loss: 0.2330


100%|██████████| 541/541 [01:12<00:00,  7.50it/s]


Epoch: 3,
 Validation Loss: 0.2330
 Validation AUC: 0.9616


100%|██████████| 1298/1298 [10:41<00:00,  2.02it/s]


Epoch: 4, Training Loss: 0.2185


100%|██████████| 541/541 [00:35<00:00, 15.18it/s]


Epoch: 4,
 Validation Loss: 0.2185
 Validation AUC: 0.9645


100%|██████████| 1298/1298 [03:18<00:00,  6.54it/s]


Epoch: 5, Training Loss: 0.2067


100%|██████████| 541/541 [00:35<00:00, 15.34it/s]


Epoch: 5,
 Validation Loss: 0.2067
 Validation AUC: 0.9656


100%|██████████| 1298/1298 [03:16<00:00,  6.62it/s]


Epoch: 6, Training Loss: 0.1985


100%|██████████| 541/541 [00:35<00:00, 15.12it/s]


Epoch: 6,
 Validation Loss: 0.1985
 Validation AUC: 0.9667


100%|██████████| 1298/1298 [03:20<00:00,  6.48it/s]


Epoch: 7, Training Loss: 0.1902


100%|██████████| 541/541 [00:35<00:00, 15.25it/s]


Epoch: 7,
 Validation Loss: 0.1902
 Validation AUC: 0.9695


100%|██████████| 1298/1298 [03:19<00:00,  6.49it/s]


Epoch: 8, Training Loss: 0.1843


100%|██████████| 541/541 [00:35<00:00, 15.17it/s]


Epoch: 8,
 Validation Loss: 0.1843
 Validation AUC: 0.9690


100%|██████████| 1298/1298 [03:20<00:00,  6.47it/s]


Epoch: 9, Training Loss: 0.1799


100%|██████████| 541/541 [00:35<00:00, 15.15it/s]


Epoch: 9,
 Validation Loss: 0.1799
 Validation AUC: 0.9704


100%|██████████| 1298/1298 [03:22<00:00,  6.41it/s]


Epoch: 10, Training Loss: 0.1760


100%|██████████| 541/541 [00:35<00:00, 15.11it/s]

Epoch: 10,
 Validation Loss: 0.1760
 Validation AUC: 0.9700
Training time: 3827.3066504001617
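
In the log above, the "Validation Loss" values are identical to the training loss of the same epoch, because the loop prints `total_loss / total_examples` in both places. A separate validation loss could be computed from the logits and labels that the validation loop already collects. The sketch below shows the numerically stable binary cross entropy on logits in plain Python; the function name `bce_with_logits` and the sample values are illustrative. In the notebook itself, the equivalent would be to call `F.binary_cross_entropy_with_logits` on the concatenated validation predictions and labels.

```python
import math

def bce_with_logits(logits, labels):
    """Mean binary cross entropy computed directly on raw logits,
    using the stable form max(x, 0) - x*y + log(1 + exp(-|x|))."""
    total = 0.0
    for x, y in zip(logits, labels):
        total += max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
    return total / len(logits)

# Illustrative validation logits and 0/1 labels.
val_logits = [2.5, -1.5, 0.0, -4.0]
val_labels = [1.0, 0.0, 1.0, 0.0]
print(f"Validation loss: {bce_with_logits(val_logits, val_labels):.4f}")
```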





## Evaluating a Heterogeneous Link-level GNN

After training, we evaluate our model on unseen data coming from the test set.
For this, we define a new `LinkNeighborLoader` (which now iterates over the edges in the test set), obtain the predictions on test edges by running the model, and evaluate the performance of the model by computing the AUC score over the set of predictions and their corresponding ground-truth edges (including both positive and negative edges). Finally, we evaluate our model with recall, precision, accuracy and F1-score metrics.
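
Before evaluating, it helps to check which epoch the saved checkpoint corresponds to. The sketch below replays the checkpointing and early-stopping logic of the training loop on the validation AUC values printed in the training log; it is a plain-Python replay with illustrative variable names, not part of the original notebook.

```python
# Validation AUC per epoch, copied from the training log.
val_auc = [0.9321, 0.9560, 0.9616, 0.9645, 0.9656,
           0.9667, 0.9695, 0.9690, 0.9704, 0.9700]
early_stopping_limit = 2

best_auc, best_epoch, stopped_at = 0.0, 0, None
for epoch, auc in enumerate(val_auc):
    if auc > best_auc:
        # This is where the training loop saves 'best_model.pth'.
        best_auc, best_epoch = auc, epoch
    elif epoch - best_epoch >= early_stopping_limit:
        stopped_at = epoch
        break

print(f"Best epoch (1-based): {best_epoch + 1}, AUC: {best_auc:.4f}")
print(f"Early stopping triggered: {stopped_at is not None}")
```

According to this replay, the checkpoint was last written in epoch 9, while the model held in memory carries the weights of epoch 10. The cells shown here evaluate the in-memory model; evaluating the saved checkpoint instead would require reloading it first, for example with `model.load_state_dict(torch.load('best_model.pth'))`.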

#### Verify that train and test sets do not overlap

In [26]:
train_edge_label_index = train_data["reviewer", "rates", "movie"].edge_label_index
test_edge_label_index = test_data["reviewer", "rates", "movie"].edge_label_index
train_edge_set = set(map(tuple, train_edge_label_index.t().tolist()))
test_edge_set = set(map(tuple, test_edge_label_index.t().tolist()))

common_edges = train_edge_set.intersection(test_edge_set)

if common_edges:
    print("There are common edges between train and test sets. Common Edges:", common_edges)
else:
    print("There are no common edges between train and test sets.")

There are no common edges between train and test sets.


#### Testing edges Loader

In [27]:
# Define the test seed edges:
edge_label_index_test = test_data["reviewer", "rates", "movie"].edge_label_index
edge_label_test = test_data["reviewer", "rates", "movie"].edge_label

test_loader = LinkNeighborLoader(
    data=test_data,
    num_neighbors=num_neighbors,
    edge_label_index=(("reviewer", "rates", "movie"), edge_label_index_test),
    edge_label=edge_label_test,
    batch_size=3 * batch_size,
    shuffle=False,
)
sampled_data = next(iter(test_loader))

#print("Sampled mini-batch:")
#print("===================")
#print(sampled_data)

assert sampled_data["reviewer", "rates", "movie"].edge_label_index.size(1) == 3 * batch_size
assert sampled_data["reviewer", "rates", "movie"].edge_label.min() >= 0
assert sampled_data["reviewer", "rates", "movie"].edge_label.max() <= 1

#### Evaluate

In [28]:
from sklearn.metrics import roc_auc_score

preds = []
ground_truths = []
for sampled_data in tqdm.tqdm(test_loader):
    with torch.no_grad():
        sampled_data.to(device)
        preds.append(model(sampled_data))
        ground_truths.append(sampled_data["reviewer", "rates", "movie"].edge_label)

pred = torch.cat(preds, dim=0).cpu().numpy()
ground_truth = torch.cat(ground_truths, dim=0).cpu().numpy()

100%|██████████| 541/541 [00:36<00:00, 14.94it/s]


In [29]:
auc = roc_auc_score(ground_truth, pred)
print(f"Test AUC (no binary predictions): {auc:.4f}")

Test AUC (no binary predictions): 0.9710


In [35]:
print(list(pred))
print(list(ground_truth))

[2.593857, 3.0406468, 4.647346, -1.5495868, 4.2514095, 3.9373038, 2.9787707, -1.2746379, 4.039027, 4.9126606, 0.521905, 1.5274695, 3.1096566, 3.498466, 2.5542753, 4.716, 3.0437503, 2.529894, 3.4602773, 4.2274103, 1.9041864, -0.488862, 3.8267412, 4.807456, -0.3206, 3.8114338, -2.5194175, -6.39192, -2.2850282, 0.93491113, 3.5634408, 1.8696761, 0.10269589, -0.16039659, 1.6045913, 1.8435196, 4.8715687, 3.2183914, 5.372587, 2.98614, 3.6915774, -0.056677505, 2.2065222, 3.3412535, 2.2679048, 2.249319, 4.1121836, 4.3114076, 2.8331392, 0.6051452, -0.5108763, 3.2549644, 3.9657154, 1.8195585, -1.2102293, 3.7382028, 1.0086737, 2.2644546, -1.999914, 4.327442, 2.2387135, -0.7878195, 0.82025653, 3.5044274, 3.0573814, 1.0650883, -0.84842706, 4.1058493, 2.1601288, 3.4527617, 5.1833143, 4.769571, 0.8919572, 2.5631068, 3.8344102, 1.9713312, 0.8268811, 3.9562442, 3.5975926, 1.7469169, 5.145015, 2.22163, 2.3988633, 0.33136308, 3.729048, 1.9743538, 4.52674, 0.9171269, 2.3207796, 0.934528, 2.921299, 3.231189

In [34]:
# Threshold the raw logits at 0, which corresponds to a predicted probability of 0.5:
binary_predictions = (pred >= 0).astype(int)
binary_predictions

array([1, 1, 1, ..., 0, 0, 0])
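
Thresholding the logits at 0 is equivalent to thresholding the sigmoid probabilities at 0.5, since `sigmoid(0) = 0.5` and the sigmoid is monotonically increasing. A minimal check in plain Python, using a few logits taken from the printed predictions:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# A few logits copied from the printed predictions above.
logits = [2.593857, -1.5495868, 0.521905, -6.39192]

by_logit = [int(x >= 0) for x in logits]
by_probability = [int(sigmoid(x) >= 0.5) for x in logits]

print(by_logit)        # [1, 0, 1, 0]
print(by_probability)  # [1, 0, 1, 0]
```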

In [32]:
from sklearn.metrics import accuracy_score, recall_score, precision_score, f1_score

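# Note: computed on hard 0/1 predictions, ROC AUC reduces to balanced accuracy
# (the mean of recall and specificity), which is why it differs from the
# score-based AUC reported earlier.
auc = roc_auc_score(ground_truth, binary_predictions)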
accuracy = accuracy_score(ground_truth, binary_predictions)
recall = recall_score(ground_truth, binary_predictions)
precision = precision_score(ground_truth, binary_predictions)
f1 = f1_score(ground_truth, binary_predictions)

print(f"Test AUC: {auc:.4f}")
print(f'Accuracy: {accuracy:.4f}')
print(f'Recall: {recall:.4f}')
print(f'Precision: {precision:.4f}')
print(f'F1 Score: {f1:.4f}')

Test AUC: 0.9175
Accuracy: 0.9207
Recall: 0.9082
Precision: 0.8613
F1 Score: 0.8841
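
The "Test AUC" of 0.9175 above is computed on thresholded predictions, so it measures a single operating point rather than the ranking quality captured by the score-based AUC of 0.9710. With hard 0/1 predictions, ROC AUC equals balanced accuracy, i.e. the mean of recall and specificity. The sketch below checks this on a toy example with a pairwise AUC written in plain Python (the function name and toy data are illustrative), and then recovers the reported value from the printed recall and accuracy, assuming the 2:1 negative-to-positive ratio of the test labels.

```python
def pairwise_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly,
    counting ties as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy labels with a 2:1 negative-to-positive ratio and hard predictions.
labels = [1, 1, 1, 0, 0, 0, 0, 0, 0]
preds = [1, 1, 0, 0, 0, 0, 0, 0, 1]

recall = sum(p for y, p in zip(labels, preds) if y == 1) / labels.count(1)
specificity = sum(1 - p for y, p in zip(labels, preds) if y == 0) / labels.count(0)
balanced_accuracy = (recall + specificity) / 2
print(pairwise_auc(labels, preds), balanced_accuracy)

# Same relation applied to the reported metrics: with twice as many negatives
# as positives, accuracy = (recall + 2 * specificity) / 3.
reported_recall, reported_accuracy = 0.9082, 0.9207
implied_specificity = (3 * reported_accuracy - reported_recall) / 2
implied_auc = (reported_recall + implied_specificity) / 2
print(f"{implied_auc:.3f}")
```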
