# Explore the BlogCatalog Dataset

For our part, we chose to work with the BlogCatalog dataset. It is a graph dateset that represents a network of social relationships, where the nodes represent blogger authors and the labels reprensent the bloggers' interest such as *Education*, *Food* and *Health*. The problem we are trying to solve is a multi-label classification of nodes which means that a node (blogger) might have one corresponding label (interest). 

Before diving into our ML-GCN method for multi label classification on graphs, we'll first take a look at our dataset, to get a better understanding of it and make the implementation easier. 

As described in the paper, the graph has over 10312 nodes with over 333983 edges connecting them. For each node, there 39 possible labels with iver 615 co-occurence relationships between them. 
The blog dataset is given in the form of two main csv files. The first one is called group-edges.csv. It gives the nodes and their corresponding labels. It should be noted that each node might have more than one label. The first step before moving along with classification would be to attribute each nodes its corresponding labels in the form of 39 lenghted vector, with the value 1 in the index corresponding to a related label and 0 otherwise.  

The second file is called edges.csv and as its name indicates it represents edge relationships between different nodes to form the initial graph. This graph will be used to build the node graph along with extracting the adjacency matrix and the features vector of each one of the nodes. 

One first important step in any deep learning method is to load the dataset to make it usable for our deep/machine learning algorithm. 

Before doing that, it's important to know what we need for our algorithm (a simple GCN at first). As an input for our model, we will need the adjacency matrix of the graph (normalized with added self loops) $\hat{A}$ and the features matrix $X$. To compute the train/validation loss, we will need the ground truth labels for the training/validation dataset. Last but not least, to feed our dataset to our GCN model, we need our set of nodes split somehow to three subsets for training, validation and testing. 

#### Imports:

In [1]:
import pandas as pd 
import numpy as np
import networkx as nx
import scipy.sparse as sp
from sklearn import preprocessing
import matplotlib.pyplot as plt
import torch

# Define the function to load the data 

In [2]:
def normalize(mx):
    """Row-normalize sparse matrix"""
    rowsum = np.array(mx.sum(1))
    r_inv = np.power(rowsum, -1).flatten()
    r_inv[np.isinf(r_inv)] = 0.
    r_mat_inv = sp.diags(r_inv)
    mx = r_mat_inv.dot(mx)
    return mx

In [3]:
def sparse_mx_to_torch_sparse_tensor(sparse_mx):
    """Convert a scipy sparse matrix to a torch sparse tensor."""
    sparse_mx = sparse_mx.tocoo().astype(np.float32)
    indices = torch.from_numpy(
        np.vstack((sparse_mx.row, sparse_mx.col)).astype(np.int64))
    values = torch.from_numpy(sparse_mx.data)
    shape = torch.Size(sparse_mx.shape)
    return torch.sparse.FloatTensor(indices, values, shape)

In [4]:
def load_data(data_name): 
    print("Loading {} dataset...".format(data_name))
    edges_file = data_name + "/edges.csv"
    node_label_file = data_name + "/group-edges.csv"
    
    # We'll first dive into the group_edges.csv file in order to extract a list of the nodes along with their 
    # corresponding labels
    label_raw, nodes = [], []
    with open(node_label_file) as file_to_read: 
        while True:
            lines = file_to_read.readline()
            if not lines:
                break 
            node, label = lines.split(",")
            label_raw.append(int(label))
            nodes.append(int(node))
    unique_nodes = np.unique(nodes)
    # Now we have a list of nodes and a list of their labels
    # Since a node can have multiple labels, we should give each node a corresponding 39 lengthed vector that 
    # encodes 1 when the node has the label corresponding to the index and 0 otherwise.  
    label_raw = np.array(label_raw)
    nodes = np.array(nodes)
    labels = np.zeros((unique_nodes.shape[0], 39))
    for l in range(1, 40, 1):
        indices = np.argwhere(label_raw == l).reshape(-1)
        n_l = nodes[indices]
        for n in n_l:
            labels[n-1][l-1] = 1
    
    # Now we can build our BlogCatalog graph using the file edges.csv 
    
    file_to_read = open(edges_file, 'rb')
    print(file_to_read)
    G = nx.read_edgelist(file_to_read, delimiter = ",", nodetype = int)
    
    # Let's now extract our adjacency matrix from the graph 
    A = nx.adjacency_matrix(G, nodelist = unique_nodes) # Already a symmetric matrix 
    A = sp.coo_matrix(A.todense())
    
    # Let's extract the feature matrix as well
    X = sp.csr_matrix(A)
    
    # As we saw in the paper, we need the normalized version of the adjacency matrix with the added self loops
    A = normalize(A + sp.eye(A.shape[0]))
    # X = normalize(X) --> Why do we need to do that ? 
    
    # Let's define the train, validation and test sets 
    indices = np.arange(A.shape[0]).astype('int32')
    # np.random.shuffle(indices)
    idx_train = indices[:A.shape[0] // 3]
    idx_val = indices[A.shape[0] // 3: (2 * A.shape[0]) // 3]
    idx_test = indices[(2 * A.shape[0]) // 3:]
    
    # Convert to tensors 
    X = torch.FloatTensor(np.array(X.todense()))
    labels = torch.LongTensor(labels)
    A = sparse_mx_to_torch_sparse_tensor(A)
    idx_train = torch.LongTensor(idx_train)
    idx_val = torch.LongTensor(idx_val)
    idx_test = torch.LongTensor(idx_test)
    
    return A, X, labels, idx_train, idx_val, idx_test

In [5]:
A, X, labels, idx_train, idx_val, idx_test = load_data("BlogCatalog")

Loading BlogCatalog dataset...
<_io.BufferedReader name='BlogCatalog/edges.csv'>


# Define the function that gives the accuracy 

Simple accuracy is equal to the number of correct predictions over the number of samples, in the multi-label classification, the accuracy can be defined in 2 ways: 

- Sample view : A sample is correctly classified when all of its labels are correctly classified and accuracy means the number of correctly classified samples over the total number of samples.
- Sample-class view : Assume that the output of the classifier is a matrix with dimension N by C where N is the number of samples and C is the number of classes. Accuracy means how many of the N by C elements in this output are correctly classified elements divided by the number of elements of the matrix. 

We can implement both views in the following way: 

In [6]:
def accuracy_sample(output, labels):
    """ 
    output and labels are tensors
    output is of shape (N,C)
    Labels is of shape (N,C)
    Result : acc gives the accuracy computed according to the sample view
    """
    N = labels.shape[0]
    corr = np.sum(np.all(np.equal(output, labels), axis=1))
    # corr is the number of equal rows and thus the number of correctly classified samples
    acc = corr / N
    return acc       

In [7]:
def accuracy_sample_class(output, labels):
    """ 
    output is of shape (N,C)
    Labels is of shape (N,C)
    Result : acc gives the accuracy computed according to the sample-class view
    """
    N = labels.shape[0]
    C = labels.shape[1]
    corr = np.sum(np.equal(output, labels))
    # corr is the number of equal elements between labels and output and thus the number of correctly classified 
    # labels for each sample 
    acc = corr/(N*C)
    return acc

In [8]:
def threshold(output):
    output[output > 0.5] = 1
    output[output <= 0.5] = 0
    return output

# How is the loss computed for our case? 

When you want to use a simple classifier over a multi-label and multi-class dataset, your loss function is the sum/average over the binary classifier of each class.

Here, in our case, we are trying to apply a simple classifier to solve a multi label classification problem. There are a few things that we should pay attention to in order to succeed in buiding our model for this specific task. In summary:

- Number of nodes in the output layer matches the number of labels. 
- A sigmoid function should be applied for each node in the output layer. 
- A binary cross-entropy loss is the loss that should be used over each one of the labels. 

# The training phase : 

In [9]:
from __future__ import division
from __future__ import print_function

import time
import argparse
import numpy as np

import torch
import torch.nn.functional as F
import torch.optim as optim

#from pygcn.utils import load_data, accuracy_sample, accuracy_sample_class
from pygcn.models import GCN

In [32]:
parser = argparse.ArgumentParser()
parser.add_argument('--no-cuda', action='store_true', default=False,
                    help='Disables CUDA training.')
parser.add_argument('--fastmode', action='store_true', default=False,
                    help='Validate during training pass.')
parser.add_argument('--seed', type=int, default=42, help='Random seed.')
parser.add_argument('--epochs', type=int, default=200,
                    help='Number of epochs to train.')
parser.add_argument('--lr', type=float, default=0.01,
                    help='Initial learning rate.')
parser.add_argument('--weight_decay', type=float, default=5e-4,
                    help='Weight decay (L2 loss on parameters).')
parser.add_argument('--hidden', type=int, default=16,
                    help='Number of hidden units.')
parser.add_argument('--dropout', type=float, default=0.5,
                    help='Dropout rate (1 - keep probability).')
parser.add_argument('-f')

args = parser.parse_args()
args.cuda = not args.no_cuda and torch.cuda.is_available()
np.random.seed(args.seed)
torch.manual_seed(args.seed)
if args.cuda:
    torch.cuda.manual_seed(args.seed)

In [11]:
# Load the data 
adj, features, labels, idx_train, idx_val, idx_test = load_data("BlogCatalog")

Loading BlogCatalog dataset...
<_io.BufferedReader name='BlogCatalog/edges.csv'>


In [12]:
# Model and Optimizer 
model = GCN(nfeat=features.shape[1],
            nhid=args.hidden,
            nclass=labels.shape[1],
            dropout=args.dropout)
optimizer = optim.SGD(model.parameters(),
                      lr = args.lr, weight_decay = args.weight_decay)

In [13]:
if args.cuda:
    model.cuda()
    features = features.cuda()
    adj = adj.cuda()
    labels = labels.cuda()
    idx_train = idx_train.cuda()
    idx_val = idx_val.cuda()
    idx_test = idx_test.cuda()

In [14]:
model.train()

GCN(
  (gc1): GraphConvolution (10312 -> 16)
  (gc2): GraphConvolution (16 -> 39)
)

In [15]:
optimizer.zero_grad()

In [16]:
output = model(features, adj)



In [17]:
labels = labels.float()

In [18]:
loss_train = np.sum([F.binary_cross_entropy_with_logits(output[idx_train][:,i], labels[idx_train][:,i]) for i in range(39)])

In [19]:
loss_train

tensor(36.9938, grad_fn=<AddBackward0>)

In [20]:
acc_train = accuracy_sample_class(threshold(output.detach().numpy()[idx_train]), labels.detach().numpy()[idx_train])

In [21]:
acc_train

0.5373797960356005

In [22]:
loss_train = np.sum([F.binary_cross_entropy_with_logits(output[idx_val][:,i], labels[idx_val][:,i]) for i in range(39)])

In [23]:
acc_val = accuracy_sample_class(threshold(output.detach().numpy()[idx_val]), labels.detach().numpy()[idx_val])

In [24]:
acc_val

0.5412367673060138

In [25]:
loss_train.backward()
optimizer.step()

In [26]:
if not args.fastmode:
        # Evaluate validation set performance separately,
        # deactivates dropout during validation run.
        model.eval()
        output = model(features, adj)

In [30]:
loss_val = np.sum([F.binary_cross_entropy_with_logits(output[idx_val][:,i], labels[idx_val][:,i]) for i in range(39)])
acc_val = accuracy_sample_class(threshold(output.detach().numpy()[idx_val]), labels.detach().numpy()[idx_val])

In [31]:
acc_val

0.5690039763359519

In [None]:
threshold(output.detach().numpy()[idx_val])

In [None]:
acc_val

In [39]:
def train(epoch):
    t = time.time()
    model.train()
    optimizer.zero_grad()
    output = model(features, adj)
    loss_train = np.sum([F.binary_cross_entropy_with_logits(output[idx_train][:,i], labels[idx_train][:,i]) for i in range(39)])
    acc_train = accuracy_sample_class(threshold(output.detach().numpy()[idx_train]), labels.detach().numpy()[idx_train])
    print(np.sum(threshold(output.detach().numpy()[idx_train])>0))
    loss_train.backward()
    optimizer.step()

    #if not args.fastmode:
        # Evaluate validation set performance separately,
        # deactivates dropout during validation run.
        #model.eval()
        #output = model(features, adj)

    loss_val = np.sum([F.binary_cross_entropy_with_logits(output[idx_val][:,i], labels[idx_val][:,i]) for i in range(39)])
    acc_val = accuracy_sample_class(threshold(output.detach().numpy()[idx_val]), labels.detach().numpy()[idx_val])
    print('Epoch: {:04d}'.format(epoch+1),
          'loss_train: {:.4f}'.format(loss_train.item()),
          'acc_train: {:.4f}'.format(acc_train.item()),
          'loss_val: {:.4f}'.format(loss_val.item()),
          'acc_val: {:.4f}'.format(acc_val.item()),
          'time: {:.4f}s'.format(time.time() - t))

In [34]:
# without taking into consideration the imbalance 
t_total = time.time()
for epoch in range(args.epochs):
    train(epoch)
print("Optimization Finished!")
print("Total time elapsed: {:.4f}s".format(time.time() - t_total))

58335
Epoch: 0001 loss_train: 36.8453 acc_train: 0.5547 loss_val: 36.8165 acc_val: 0.5572 time: 0.3541s
56752
Epoch: 0002 loss_train: 36.6824 acc_train: 0.5661 loss_val: 36.6596 acc_val: 0.5673 time: 0.3122s
54927
Epoch: 0003 loss_train: 36.4693 acc_train: 0.5786 loss_val: 36.4404 acc_val: 0.5796 time: 0.3271s
53356
Epoch: 0004 loss_train: 36.3028 acc_train: 0.5900 loss_val: 36.2703 acc_val: 0.5884 time: 0.4239s
50579
Epoch: 0005 loss_train: 36.0459 acc_train: 0.6094 loss_val: 36.0009 acc_val: 0.6096 time: 0.4338s
44502
Epoch: 0006 loss_train: 35.8528 acc_train: 0.6525 loss_val: 35.7934 acc_val: 0.6534 time: 0.3830s
39081
Epoch: 0007 loss_train: 35.6202 acc_train: 0.6914 loss_val: 35.5878 acc_val: 0.6929 time: 0.3271s
36106
Epoch: 0008 loss_train: 35.3937 acc_train: 0.7131 loss_val: 35.3487 acc_val: 0.7121 time: 0.3092s
33550
Epoch: 0009 loss_train: 35.1208 acc_train: 0.7310 loss_val: 35.0455 acc_val: 0.7256 time: 0.3293s
31709
Epoch: 0010 loss_train: 34.8532 acc_train: 0.7433 loss_val

Epoch: 0082 loss_train: 27.3460 acc_train: 0.9629 loss_val: 27.3163 acc_val: 0.9637 time: 0.3820s
12
Epoch: 0083 loss_train: 27.3335 acc_train: 0.9631 loss_val: 27.3385 acc_val: 0.9635 time: 0.4059s
36
Epoch: 0084 loss_train: 27.3367 acc_train: 0.9629 loss_val: 27.3591 acc_val: 0.9634 time: 0.3920s
21
Epoch: 0085 loss_train: 27.3515 acc_train: 0.9630 loss_val: 27.3487 acc_val: 0.9637 time: 0.3780s
26
Epoch: 0086 loss_train: 27.3716 acc_train: 0.9630 loss_val: 27.3242 acc_val: 0.9635 time: 0.3790s
26
Epoch: 0087 loss_train: 27.3492 acc_train: 0.9630 loss_val: 27.3288 acc_val: 0.9636 time: 0.3710s
16
Epoch: 0088 loss_train: 27.3226 acc_train: 0.9631 loss_val: 27.3313 acc_val: 0.9635 time: 0.4338s
15
Epoch: 0089 loss_train: 27.3120 acc_train: 0.9630 loss_val: 27.3111 acc_val: 0.9636 time: 0.4628s
27
Epoch: 0090 loss_train: 27.3138 acc_train: 0.9630 loss_val: 27.2898 acc_val: 0.9635 time: 0.4819s
38
Epoch: 0091 loss_train: 27.3427 acc_train: 0.9629 loss_val: 27.3382 acc_val: 0.9635 time: 0

Epoch: 0164 loss_train: 27.1752 acc_train: 0.9629 loss_val: 27.1629 acc_val: 0.9634 time: 0.4398s
3
Epoch: 0165 loss_train: 27.1674 acc_train: 0.9631 loss_val: 27.1483 acc_val: 0.9636 time: 0.5037s
5
Epoch: 0166 loss_train: 27.1669 acc_train: 0.9631 loss_val: 27.1642 acc_val: 0.9636 time: 0.4957s
23
Epoch: 0167 loss_train: 27.1517 acc_train: 0.9630 loss_val: 27.1566 acc_val: 0.9637 time: 0.4379s
12
Epoch: 0168 loss_train: 27.1656 acc_train: 0.9631 loss_val: 27.1497 acc_val: 0.9637 time: 0.4149s
11
Epoch: 0169 loss_train: 27.1330 acc_train: 0.9631 loss_val: 27.1175 acc_val: 0.9637 time: 0.4109s
12
Epoch: 0170 loss_train: 27.1383 acc_train: 0.9631 loss_val: 27.1629 acc_val: 0.9636 time: 0.4209s
6
Epoch: 0171 loss_train: 27.1268 acc_train: 0.9631 loss_val: 27.1285 acc_val: 0.9637 time: 0.4289s
8
Epoch: 0172 loss_train: 27.1852 acc_train: 0.9631 loss_val: 27.1518 acc_val: 0.9637 time: 0.4199s
1
Epoch: 0173 loss_train: 27.1531 acc_train: 0.9631 loss_val: 27.1588 acc_val: 0.9637 time: 0.4478

In [35]:
def test():
    model.eval()
    output = model(features, adj)
    loss_test = np.sum([F.binary_cross_entropy_with_logits(output[idx_test][:,i], labels[idx_test][:,i]) for i in range(39)])
    acc_test = accuracy_sample_class(threshold(output.detach().numpy()[idx_test]), labels.detach().numpy()[idx_test])
    print("Test set results:",
          "loss= {:.4f}".format(loss_test.item()),
          "accuracy= {:.4f}".format(acc_test.item()))

In [36]:
test()

Test set results: loss= 27.0957 accuracy= 0.9651


# Deal with the imbalance of the dataset

If we consider each label with its own binary classification problem, the dataset is clearly imbalanced in each problem with a favorable position for the label 0 that represents the absence of the label for the corresponding node. 

### First attempt : use the weight parameter in the binary_cross_entropy loss 

In [58]:
def weight(c):
    '''
    c is a set of labels represented by 0 and 1
    '''
    n_0 = np. count_nonzero(c == 0)
    n_1 = np.count_nonzero(c == 1)
    N = c.shape[0]
    res = torch.tensor([1/n_0,1/n_1])*N
    return res

In [59]:
# Model and Optimizer 
model = GCN(nfeat=features.shape[1],
            nhid=args.hidden,
            nclass=labels.shape[1],
            dropout=args.dropout)
optimizer = optim.SGD(model.parameters(),
                      lr = args.lr, weight_decay = args.weight_decay)

In [60]:
def train(epoch):
    t = time.time()
    model.train()
    optimizer.zero_grad()
    output = model(features, adj)
    #weight = torch.tensor([2.0,1.0])
    loss_train = np.sum([(F.binary_cross_entropy_with_logits(output[idx_train][:,i], labels[idx_train][:,i])*weight(labels[idx_train][:,i])).mean() for i in range(39)])
    acc_train = accuracy_sample_class(threshold(output.detach().numpy()[idx_train]), labels.detach().numpy()[idx_train])
    loss_train.backward()
    optimizer.step()

    #if not args.fastmode:
        # Evaluate validation set performance separately,
        # deactivates dropout during validation run.
        #model.eval()
        #output = model(features, adj)

    loss_val = np.sum([(F.binary_cross_entropy_with_logits(output[idx_val][:,i], labels[idx_val][:,i])*weight(labels[idx_val][:,i])).mean() for i in range(39)])
    acc_val = accuracy_sample_class(threshold(output.detach().numpy()[idx_val]), labels.detach().numpy()[idx_val])
    print('Epoch: {:04d}'.format(epoch+1),
          'loss_train: {:.4f}'.format(loss_train.item()),
          'acc_train: {:.4f}'.format(acc_train.item()),
          'loss_val: {:.4f}'.format(loss_val.item()),
          'acc_val: {:.4f}'.format(acc_val.item()),
          'time: {:.4f}s'.format(time.time() - t))

In [61]:
t_total = time.time()
for epoch in range(args.epochs):
    train(epoch)
print("Optimization Finished!")
print("Total time elapsed: {:.4f}s".format(time.time() - t_total))

Epoch: 0001 loss_train: 2551.3767 acc_train: 0.3933 loss_val: 3165.1851 acc_val: 0.4094 time: 0.6459s
Epoch: 0002 loss_train: 1916.8715 acc_train: 0.6995 loss_val: 2342.1960 acc_val: 0.6960 time: 0.6266s
Epoch: 0003 loss_train: 1796.5952 acc_train: 0.8862 loss_val: 2227.0566 acc_val: 0.8864 time: 0.6153s
Epoch: 0004 loss_train: 1781.4299 acc_train: 0.9621 loss_val: 2210.3606 acc_val: 0.9627 time: 0.6274s
Epoch: 0005 loss_train: 1779.4535 acc_train: 0.9625 loss_val: 2209.1763 acc_val: 0.9630 time: 0.5481s
Epoch: 0006 loss_train: 1776.4615 acc_train: 0.9627 loss_val: 2206.1196 acc_val: 0.9632 time: 0.6083s
Epoch: 0007 loss_train: 1775.4109 acc_train: 0.9626 loss_val: 2204.6379 acc_val: 0.9633 time: 0.6054s
Epoch: 0008 loss_train: 1775.0363 acc_train: 0.9624 loss_val: 2205.0432 acc_val: 0.9630 time: 0.6421s
Epoch: 0009 loss_train: 1774.9130 acc_train: 0.9629 loss_val: 2203.7161 acc_val: 0.9635 time: 0.6489s
Epoch: 0010 loss_train: 1775.4124 acc_train: 0.9625 loss_val: 2204.6025 acc_val: 0

Epoch: 0082 loss_train: 1772.9731 acc_train: 0.9626 loss_val: 2202.4473 acc_val: 0.9632 time: 0.8643s
Epoch: 0083 loss_train: 1773.0850 acc_train: 0.9626 loss_val: 2202.0564 acc_val: 0.9633 time: 0.7481s
Epoch: 0084 loss_train: 1772.9656 acc_train: 0.9627 loss_val: 2201.8384 acc_val: 0.9635 time: 0.6952s
Epoch: 0085 loss_train: 1772.3230 acc_train: 0.9629 loss_val: 2201.5459 acc_val: 0.9636 time: 0.6659s
Epoch: 0086 loss_train: 1772.5586 acc_train: 0.9628 loss_val: 2201.3513 acc_val: 0.9634 time: 0.5926s
Epoch: 0087 loss_train: 1772.8250 acc_train: 0.9627 loss_val: 2202.0757 acc_val: 0.9633 time: 0.6953s
Epoch: 0088 loss_train: 1772.3594 acc_train: 0.9628 loss_val: 2201.3726 acc_val: 0.9636 time: 0.6151s
Epoch: 0089 loss_train: 1772.5969 acc_train: 0.9628 loss_val: 2201.7280 acc_val: 0.9633 time: 0.6230s
Epoch: 0090 loss_train: 1772.6995 acc_train: 0.9628 loss_val: 2201.7283 acc_val: 0.9634 time: 0.6347s
Epoch: 0091 loss_train: 1772.1119 acc_train: 0.9630 loss_val: 2201.9194 acc_val: 0

Epoch: 0163 loss_train: 1772.3411 acc_train: 0.9629 loss_val: 2201.4639 acc_val: 0.9636 time: 0.6008s
Epoch: 0164 loss_train: 1772.3208 acc_train: 0.9629 loss_val: 2201.4717 acc_val: 0.9635 time: 0.6415s
Epoch: 0165 loss_train: 1772.7601 acc_train: 0.9627 loss_val: 2201.9971 acc_val: 0.9635 time: 0.6338s
Epoch: 0166 loss_train: 1772.0277 acc_train: 0.9631 loss_val: 2202.0537 acc_val: 0.9635 time: 0.6128s
Epoch: 0167 loss_train: 1771.8601 acc_train: 0.9631 loss_val: 2201.3157 acc_val: 0.9636 time: 0.6037s
Epoch: 0168 loss_train: 1772.3959 acc_train: 0.9629 loss_val: 2202.1816 acc_val: 0.9634 time: 0.6121s
Epoch: 0169 loss_train: 1771.9132 acc_train: 0.9631 loss_val: 2201.9905 acc_val: 0.9634 time: 0.6375s
Epoch: 0170 loss_train: 1771.9006 acc_train: 0.9631 loss_val: 2201.3743 acc_val: 0.9636 time: 0.6151s
Epoch: 0171 loss_train: 1772.2080 acc_train: 0.9630 loss_val: 2201.6567 acc_val: 0.9636 time: 0.6131s
Epoch: 0172 loss_train: 1772.3909 acc_train: 0.9629 loss_val: 2201.5469 acc_val: 0

In [62]:
def test():
    model.eval()
    output = model(features, adj)
    loss_test = np.sum([(F.binary_cross_entropy_with_logits(output[idx_test][:,i], labels[idx_test][:,i])*weight(labels[idx_test][:,i])).mean() for i in range(39)])
    acc_test = accuracy_sample_class(threshold(output.detach().numpy()[idx_test]), labels.detach().numpy()[idx_test])
    print("Test set results:",
          "loss= {:.4f}".format(loss_test.item()),
          "accuracy= {:.4f}".format(acc_test.item()))

In [63]:
test()

Test set results: loss= 2129.0698 accuracy= 0.9651


### Second Attempt : try nn.BCEwithlogits with weights (inverse of the classes' occurence in the labeling vector)

In [None]:
# Model and Optimizer 
model = GCN(nfeat=features.shape[1],
            nhid=args.hidden,
            nclass=labels.shape[1],
            dropout=args.dropout)
optimizer = optim.SGD(model.parameters(),
                      lr = args.lr, weight_decay = args.weight_decay)