<h1 style='color:blue'><center>Recommendation System using Graph Neural Networks</center></h1>

---

<b style='color:DodgerBlue'><center><a href='https://www.linkedin.com/in/mugheesasif/'>Mughees Asif</a></center></b>
<b style='color:DodgerBlue'><center><a href='https://www.sems.qmul.ac.uk/staff/a.nanjangud'>Dr Angadh Nanjangud</a></center></b>
<i style='color:rgb(0, 122, 172)'><center><a href='http://www.eecs.qmul.ac.uk/'>School of Electronic Engineering and Computer Science</a></center></i>
<i style='color:rgb(0, 122, 172)'><center><a href='https://www.qmul.ac.uk/'>Queen Mary, University of London</a></center></i>

## Abstract

Recently, neural networks have been used to build recommendation systems that parse graph-structured data and learn meaningful representations from user-to-item relationships and social network information. Recommendation systems built with Graph Neural Networks (GNNs) streamline the aggregation of macro-level (e.g., topological structure) and micro-level (e.g., node information) signals, and therefore enhance the overall information filtering capabilities of the system. However, the representation learning process is non-linear, because both social relationships and item interactions need to be considered for optimal results. This research project addresses this by proposing a recommendation system that is capable of using the underlying social connections between users and items. The system was split into three variants, and several metrics were used to compare them with published academic recommendation systems. The models were trained on two real-world datasets that contain user-to-user and user-to-item information. The results show the system performing on par with the published academic models, and highlight its suitability for recommendation tasks.

The associated report for this research can be accessed [here](https://drive.google.com/file/d/1rlg-qpLjy5kA0SMW5FDsr1YtZ0OJK90X/view?usp=sharing).

## Contents<a class="anchor" id="contents"></a>

---

**1** &nbsp;&nbsp;**[Import dependencies](#dw-dep)**<br>

**2** &nbsp;&nbsp;**[Download data](#data)**<br>

**3** &nbsp;&nbsp;**[Getting interactions](#preprocess-data)**<br>

**4** &nbsp;&nbsp;**[Graph Neural Network](#gnn)**<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;4.1.&nbsp;&nbsp;*[Multi-Layer Perceptron](#mlp)*<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;4.2.&nbsp;&nbsp;*[Aggregator](#aggregator)*<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;4.3.&nbsp;&nbsp;*[Item modelling](#item-model)*<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;4.4.&nbsp;&nbsp;*[User modelling](#user-model)*<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;4.5.&nbsp;&nbsp;*[Model $X_a$](#model-x)*<br>

**5** &nbsp;&nbsp;**[Training](#training)**<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;5.1.&nbsp;&nbsp;*[Preprocess dataset and create graph](#preprocess-dataset)*<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;5.2.&nbsp;&nbsp;*[Hyperparameters and setting up the model](#params)*<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;5.3.&nbsp;&nbsp;*[Train the model](#train)*<br>

**6** &nbsp;&nbsp;**[Testing](#testing)**<br>

**7** &nbsp;&nbsp;**[Results](#results)**<br>

**8** &nbsp;&nbsp;**[References](#references)**<br>

**9** &nbsp;&nbsp;**[Notes](#notes)**<br>

<div class="alert alert-block alert-info">
<b>Tip:</b> To return to the contents, press the 🔝 icon located in the title of each chapter.</div>

## 1&nbsp;&nbsp;Import dependencies <a class="anchor" id="dw-dep"></a> [🔝](#contents)

This section imports the libraries needed to develop the GNN recommendation system.

<div class="alert alert-block alert-warning">
    <b>Note:</b> <li>Please ensure that the <a href='https://github.com/mughees-asif/postgraduate-research-project/blob/main/collate.py'><code>collate.py</code></a> file is in the root folder before running the experiment.</li>
</div>

In [9]:
# Main
import torch
import torch.nn as nn
import pickle
import numpy as np

# Utilities
from tqdm import tqdm
from torch.utils.data import Dataset, DataLoader
from collate import collate_fn

## 2&nbsp;&nbsp;Download data <a class="anchor" id="data"></a> [🔝](#contents)

This section downloads the Ciao and Epinions data sets from the official [GitHub](https://github.com/mughees-asif/) repository of this project. Alternatively, they can also be sourced from [here](https://www.cse.msu.edu/~tangjili/datasetcode/truststudy.htm).

<div class="alert alert-block alert-warning">
<b>Note:</b> <li>The following cellblock should only be uncommented if the notebook is being run on a <b>Google Colab</b> instance.</li> <li>The command downloads the datasets and sets them up correctly in the Colab environment.</li> <li>If running on a local machine, please disregard the cellblock and ensure that the <code>data</code> folder (download from above) is present in the root folder of the project, i.e., <code>~/postgraduate-research-project/data/</code>.</li>
</div>

In [10]:
# !wget https://github.com/mughees-asif/postgraduate-research-project/raw/master/data.zip
# !unzip /content/data.zip

## 3&nbsp;&nbsp;Getting interactions <a class="anchor" id="preprocess-data"></a> [🔝](#contents)

This section will sift through the dataset to establish the interactions between the user-user and user-item categories.

In [11]:
class CreateInteractions(Dataset):
    def __init__(self, data, user_item_list, user_user_list, user_user_item_list, item_user_list):
        self.data = data
        self.u_items_list = user_item_list
        self.u_users_list = user_user_list
        self.u_users_items_list = user_user_item_list
        self.i_users_list = item_user_list

    def __getitem__(self, index):
        # Baseline parameters
        user_id = self.data[index][0]
        item_id = self.data[index][1]
        label = self.data[index][2]
        
        # Interaction information
        user_item = self.u_items_list[user_id]
        user_user = self.u_users_list[user_id]
        user_user_item = self.u_users_items_list[user_id]
        item_user = self.i_users_list[item_id]

        return (user_id, item_id, label), user_item, user_user, user_user_item, item_user

    def __len__(self):
        return len(self.data)

## 4&nbsp;&nbsp;Graph Neural Network (GNN)<a class="anchor" id="gnn"></a> [🔝](#contents)

A graph $\mathcal{G}$ is represented by the notation $\mathcal{G}=(\mathcal{V},\mathcal{E})$, where $\mathcal{V}$ is representative of the set of available nodes and $\mathcal{E}$ is the set of edges. 

Furthermore, $v_i \in \mathcal{V}$ is a node with an edge $e_{ij}=(v_i,v_j) \in \mathcal{E}$ extending from $v_j$ to $v_i$, and the local neighbourhood of the node $v$ can be denoted as $\mathcal{N}(v)=\{u \in \mathcal{V}|(v,u) \in \mathcal{E}\}$.
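
As a minimal sketch of this notation, the neighbourhood $\mathcal{N}(v)$ can be built from an edge list in plain Python. The helper name `build_neighbourhoods` and the toy edges are illustrative assumptions and are not part of the project code; each edge is treated as a directed pair $(v, u)$, matching the set definition above.

```python
from collections import defaultdict


def build_neighbourhoods(edges):
    """Map each node v to N(v) = {u | (v, u) in E} (hypothetical helper)."""
    neighbourhoods = defaultdict(set)
    for v, u in edges:
        neighbourhoods[v].add(u)
    return dict(neighbourhoods)


# Toy graph: node 1 points to nodes 2 and 3, node 2 points to node 3.
toy_edges = [(1, 2), (1, 3), (2, 3)]
toy_neighbourhoods = build_neighbourhoods(toy_edges)
print(toy_neighbourhoods)
```

Node 3 has no outgoing edges in this toy graph, so it does not appear as a key.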

The main intuition behind GNNs is the iterative aggregation (Section 4.2.) of the feature information from a neighbourhood of nodes that is integrated (Section 4.1.) with the information from a current node during the propagation process.

### 4.1.&nbsp;&nbsp;Multi-Layer Perceptron (MLP)<a class="anchor" id="mlp"></a>

<img src="./images/mlp.png" alt="rating" style="width: 500px;"/>
<h5 align="center">Figure 1: Multi-Layer Perceptron</h5>

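A Multi-Layer Perceptron (MLP) is introduced to combine the two vectors (see Figure 1 above): they are concatenated and passed through $l$ layers, resulting in:

$$
\begin{align}
    \boldsymbol{\mathrm{h}}_{i} = \sigma (W_{l} \cdot c_{l-1} + b_{l})
\end{align}
$$

where $c_1 = \left[ \boldsymbol{\mathrm{h}}_{i}^{I} \oplus \boldsymbol{\mathrm{h}}_{i}^{S} \right]$ is the concatenated input and $c_{k} = \sigma (W_{k} \cdot c_{k-1} + b_{k})$ for the intermediate layers $k = 2, \ldots, l-1$.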

In [12]:
class MLP(nn.Module):
    def __init__(self, input_dim, output_dim):
        super(MLP, self).__init__()
        # Sequential container for 2 linear transformation layers with ReLU
        # as the activation function between them
        self.mlp = nn.Sequential(
            nn.Linear(input_dim, input_dim//2, bias=True),
            nn.ReLU(),
            nn.Linear(input_dim//2, output_dim, bias=True)
        )

    def forward(self, x):
        return self.mlp(x)

### 4.2.&nbsp;&nbsp;Aggregator<a class="anchor" id="aggregator"></a>

$$
\mathrm{Aggregation}:\;\;\boldsymbol{\mathrm{n}}_v = \mathrm{Aggregator}_l \left(\left\{ \boldsymbol{\mathrm{h}}_{u}^{l}, \forall u \in \mathcal{N}(v) \right\}\right)
$$

where $\boldsymbol{\mathrm{h}}_{u}^{l}$ is the representation of the neighbouring node $u$ at the $l^{th}$ layer and $\mathcal{N}(v)$ is the neighbourhood of the node $v$.
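
To make the aggregation step concrete, the sketch below uses an element-wise mean as the aggregator. This is an assumption chosen for simplicity: the project itself aggregates with attention-weighted sums followed by a linear layer and ReLU, and `mean_aggregate` is a hypothetical helper name.

```python
def mean_aggregate(neighbour_features):
    """Element-wise mean of neighbour feature vectors (illustrative aggregator)."""
    if not neighbour_features:
        raise ValueError("a node needs at least one neighbour to aggregate")
    dim = len(neighbour_features[0])
    count = len(neighbour_features)
    return [sum(vec[k] for vec in neighbour_features) / count for k in range(dim)]


# Features h_u of three neighbours u in N(v), each of dimension 2.
h_neighbours = [[1.0, 0.0], [3.0, 2.0], [2.0, 4.0]]
n_v = mean_aggregate(h_neighbours)
print(n_v)  # [2.0, 2.0]
```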

In [13]:
class Aggregator(nn.Module):
    def __init__(self, input_dim, output_dim):
        super(Aggregator, self).__init__()
        # Single linear layer (`y = xW^T + b`) followed by a ReLU activation
        self.mlp = nn.Sequential(
            nn.Linear(input_dim, output_dim, bias=True),
            nn.ReLU()
        )

    def forward(self, x):
        return self.mlp(x)

### 4.3.&nbsp;&nbsp;Item modelling<a class="anchor" id="item-model"></a>

<img src="./images/item-modelling.png" alt="rating" style="width: 400px;"/>
<h5 align="center">Figure 2: Item modelling framework</h5>

For each item, the preferences of the users who rated it are aggregated. Ratings take values from $1$ to $5$ $(r \in \{1,2,3,4,5\})$, and, using an MLP, the two vectors holding the plain user embedding $\boldsymbol{\mathrm{p}}_{t}$ and the opinion embedding $\boldsymbol{\mathrm{e}}_{r}$ are used to develop a user representation $\boldsymbol{\mathrm{f}}_{jt}$:

$$
\begin{align}
    \boldsymbol{\mathrm{f}}_{jt} = g_{u} (\boldsymbol{\mathrm{p}}_{t} \oplus \boldsymbol{\mathrm{e}}_{r})
\end{align}
$$

The latent factors are derived by the introduction of an attention mechanism:

$$
\begin{align}
    \boldsymbol{\mathrm{z}}_{j} = \sigma \left( \boldsymbol{\mathrm{W}} \cdot A_{\mathrm{users}} \left( {\boldsymbol{\mathrm{f}}}_{jt}, \forall t \in B(j) \right) + \boldsymbol{b}\right) \\
    \\
    \boldsymbol{\mathrm{z}}_{j} = \sigma \left( \boldsymbol{\mathrm{W}} \cdot \left\{ \sum_{t \in B(j)}\mu_{jt}{\boldsymbol{\mathrm{f}}}_{jt}\right\} + \boldsymbol{b}\right)    
\end{align}
$$

where $\mu_{jt}$ is the attention weight given to user $u_t$ for item $v_j$, introduced "$\ldots$ to capture heterogeneous influence from user-item interactions on learning item latent factor"<sup>1</sup>.

<div class="alert alert-block alert-info">
<b>Tip:</b> Please refer to the mathematical notation to understand the variables.</div>
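
The attention weights are normalised with a softmax that skips padded entries: the scores are exponentiated, multiplied by a 0/1 mask, and divided by their sum plus a small constant. The plain-Python sketch below mirrors that pattern for a single item; `masked_softmax` and the toy scores are illustrative assumptions, not project code.

```python
import math


def masked_softmax(scores, mask, eps=1e-10):
    """Softmax over `scores` that ignores positions where `mask` is 0."""
    weighted = [math.exp(s) * m for s, m in zip(scores, mask)]
    total = sum(weighted) + eps  # eps avoids division by zero for all-padding rows
    return [w / total for w in weighted]


# Three user slots for one item; the last slot is padding (mask = 0).
scores = [0.0, 0.0, 5.0]
mask = [1.0, 1.0, 0.0]
mu = masked_softmax(scores, mask)
print(mu)
```

The padded slot receives zero weight even though its raw score is the largest, and the two real users share the weight equally.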

In [14]:
class ItemModelling(nn.Module):
    def __init__(self, embedded_dimensions, user_embedding, item_embedding, rating_embedding):
        super(ItemModelling, self).__init__()
        self.emb_dim = embedded_dimensions
        self.user_emb = user_embedding
        self.item_emb = item_embedding
        self.rating_emb = rating_embedding

        self.g_u = MLP(2 * self.emb_dim, self.emb_dim)

        self.item_users_attn = MLP(2 * self.emb_dim, 1)
        self.aggr_users = Aggregator(self.emb_dim, self.emb_dim)

        self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
        self.eps = 1e-10

    def forward(self, iids, i_user_pad):
        p_t = self.user_emb(i_user_pad[:, :, 0])
        i_user_er = self.rating_emb(i_user_pad[:, :, 1])
        mask_i = torch.where(i_user_pad[:, :, 0] > 0, torch.tensor([1.], device=self.device),
                             torch.tensor([0.], device=self.device))
        
        f_jt = self.g_u(torch.cat([p_t, i_user_er], dim=2).view(-1, 2 * self.emb_dim)).view(p_t.size())
        q_j = mask_i.unsqueeze(2).expand_as(f_jt) * self.item_emb(iids).unsqueeze(1).expand_as(f_jt)
        
        mu_jt = self.item_users_attn(torch.cat([f_jt, q_j], dim=2).view(-1, 2 * self.emb_dim)).view(mask_i.size())
        mu_jt = torch.exp(mu_jt) * mask_i
        mu_jt = mu_jt / (torch.sum(mu_jt, 1).unsqueeze(1).expand_as(mu_jt) + self.eps)

        z_j = self.aggr_users(torch.sum(mu_jt.unsqueeze(2).expand_as(f_jt) * f_jt, 1))

        return z_j

### 4.4.&nbsp;&nbsp;User modelling<a class="anchor" id="user-model"></a>

#### Modelling components<sup>2</sup>

The user modelling operation aims to learn the latent factors $\boldsymbol{\mathrm{h}}_{i} \in \mathcal{R}^d$ of users $u_i$.  This operation requires the concatenation of two latent factors to obtain the user latent factors $\boldsymbol{\mathrm{h}}_{i}$: item space user latent factor $\boldsymbol{\mathrm{h}}_{i}^{I} \in \mathcal{R}^d$ from the user-item graph, and a social space user latent factor $\boldsymbol{\mathrm{h}}_{i}^{S} \in \mathcal{R}^d$ from the social graph.

<img src="./images/user-modelling-0.png" alt="combo" style="width: 750px;"/> 
<h5 align="center">Figure 3: User modelling components</h5>

##### Item space

The item space operation utilises the interactions between the users and items, together with the users' preferences regarding each item, all encoded as a user-item graph. The main premise is to learn the item-space user latent factor $\boldsymbol{\mathrm{h}}_{i}^{I}$. This can be written in a form analogous to the linear function $y=mx+c$:

$$
\begin{align}
    \boldsymbol{\mathrm{h}}_{i}^{I} = \sigma \left( \boldsymbol{\mathrm{W}} \cdot A_{\mathrm{item}} \left( {\boldsymbol{\mathrm{x}}}_{ia}, \forall a \in C(i) \right) + \boldsymbol{b}\right)
    \label{eq3}
\end{align}
$$

where $\sigma$ is the rectified linear unit function, $\boldsymbol{\mathrm{W}}$ are the weights of the network, $\boldsymbol{b}$ is the bias, $A_{\mathrm{item}}$ is the aggregation operation, $C(i)$ is the set of items the user interacted with, and $\boldsymbol{\mathrm{x}}_{ia}$ is the representation vector that includes the user's opinion.

The opinion-aware representation vector $\boldsymbol{\mathrm{x}}_{ia}$ for a certain item is obtained from the item embedding $\boldsymbol{\mathrm{q}}_a$ and the opinion embedding $\boldsymbol{\mathrm{e}}_r$:

$$
\begin{align}
    \boldsymbol{\mathrm{x}}_{ia} = g_{v}\left( \boldsymbol{\mathrm{q}}_a \oplus \boldsymbol{\mathrm{e}}_r \right)
\end{align}
$$

where, $g_v$ is the MLP.

A $2$-layer attention mechanism intervenes where each interaction is given an individual weight dependent on the user's interest in the item:

$$
\begin{align}
    \boldsymbol{\mathrm{h}}_{i}^{I} = \sigma \left( \boldsymbol{\mathrm{W}} \cdot \left\{ \sum_{a \in C(i)}\alpha_{ia}{\boldsymbol{\mathrm{x}}}_{ia}\right\} + \boldsymbol{b}\right)
\end{align}
$$

where $\alpha_{ia}$ is the attention weight of the interaction between the user $u_i$ and the item $v_a$.

##### Social space

The social-space latent factor $\boldsymbol{\mathrm{h}}_{i}^{S}$ is an aggregation of the item-space latent factors of the neighbouring users, $A_{\mathrm{neighbours}}(\cdot)$:

$$
\begin{align}
    \boldsymbol{\mathrm{h}}_{i}^{S} = \sigma \left( \boldsymbol{\mathrm{W}} \cdot A_{\mathrm{neighbours}} \left( {\boldsymbol{\mathrm{h}}}_{o}^{I}, \forall o \in N(i) \right) + \boldsymbol{b}\right)
\end{align}
$$

To diminish the impact of assuming that all neighbours contribute equally, an attention mechanism using a $2$-layer neural network is introduced that develops the correlation between user-to-user and user-to-item interaction:

$$
\begin{align}
     \boldsymbol{\mathrm{h}}_{i}^{S} = \sigma \left( \boldsymbol{\mathrm{W}} \cdot \left\{ \sum_{o \in N(i)}\beta_{io}{\boldsymbol{\mathrm{h}}}_{o}^{I}\right\} + \boldsymbol{b}\right)   
\end{align}
$$

where $\beta_{io}$ is the attention weight assigned to the neighbour $u_o$ when aggregating the social circle $N(i)$ of the user $u_i$.

#### Overall operation

Using the MLP, the vectors are concatenated.

<img src="./images/user-modelling.png" alt="rating" style="width: 400px;"/>
<h5 align="center">Figure 4: User modelling overall</h5>

#### Variants

The variants can be developed as follows:

* $X_a$: As described above and coded below (default setting).
* $X_b$: To disable the item-space operation, comment out/alter the **relevant lines of code indicated below**. Restart the kernel and run the notebook again to train the GNN.
* $X_c$: To disable the social-space operation, follow the same procedure as for $X_b$ using the corresponding lines of code indicated below.

<div class="alert alert-block alert-info">
<b>Tip:</b> Please refer to the mathematical notation to understand the variables.</div>
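
The difference between the variants comes down to which latent factors reach the final user MLP. The sketch below illustrates this with plain lists, assuming $X_b$ keeps only the social space and $X_c$ keeps only the item space; `user_mlp_input` is a hypothetical helper and not part of the notebook.

```python
def user_mlp_input(h_iI, h_iS, variant="a"):
    """Return the vector fed to the final user MLP for a given variant.

    variant "a": item space and social space concatenated (default)
    variant "b": item space disabled, social space only
    variant "c": social space disabled, item space only
    """
    if variant == "a":
        return h_iI + h_iS  # list concatenation, length 2 * d
    if variant == "b":
        return h_iS
    if variant == "c":
        return h_iI
    raise ValueError(f"unknown variant: {variant!r}")


# Toy latent factors with d = 2.
item_space = [0.1, 0.2]
social_space = [0.3, 0.4]
for name in ("a", "b", "c"):
    print(name, user_mlp_input(item_space, social_space, name))
```

Note that the single-space variants produce a vector of length $d$ rather than $2d$, so the first linear layer of the MLP that consumes it has to be sized accordingly.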

In [15]:
class UserModelling(nn.Module):
    def __init__(self, embedded_dimensions, user_embedding, item_embedding, rating_embedding):
        super(UserModelling, self).__init__()
        self.emb_dim = embedded_dimensions
        self.user_emb = user_embedding
        self.item_emb = item_embedding
        self.rating_emb = rating_embedding

        self.g_v = MLP(2 * self.emb_dim, self.emb_dim)

        self.user_item_attn = MLP(2 * self.emb_dim, 1)
        self.aggr_items = Aggregator(self.emb_dim, self.emb_dim)

        self.user_user_attn = MLP(2 * self.emb_dim, 1)
        self.aggr_neighbors = Aggregator(self.emb_dim, self.emb_dim)

        self.mlp = nn.Sequential(
            nn.Linear(2 * self.emb_dim, self.emb_dim, bias=True),
            nn.ReLU(),
            nn.Linear(self.emb_dim, self.emb_dim, bias=True),
            nn.ReLU(),
            nn.Linear(self.emb_dim, self.emb_dim, bias=True),
            nn.ReLU()
        )

        self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
        self.eps = 1e-10

    def forward(self, uids, u_item_pad, u_user_pad, u_user_item_pad):
        q_a = self.item_emb(u_item_pad[:, :, 0])
        u_item_er = self.rating_emb(u_item_pad[:, :, 1])
        x_ia = self.g_v(torch.cat([q_a, u_item_er], dim=2).view(-1, 2 * self.emb_dim)).view(q_a.size())
        mask_u = torch.where(u_item_pad[:, :, 0] > 0, torch.tensor([1.], device=self.device),
                             torch.tensor([0.], device=self.device))
        p_i = mask_u.unsqueeze(2).expand_as(x_ia) * self.user_emb(uids).unsqueeze(1).expand_as(x_ia)
        alpha = self.user_item_attn(torch.cat([x_ia, p_i], dim=2).view(-1, 2 * self.emb_dim)).view(mask_u.size())
        alpha = torch.exp(alpha) * mask_u
        alpha = alpha / (torch.sum(alpha, 1).unsqueeze(1).expand_as(alpha) + self.eps)
        ################################################################
        # Comment out the line below to disable the item-space operation
        ################################################################
        h_iI = self.aggr_items(torch.sum(alpha.unsqueeze(2).expand_as(x_ia) * x_ia, 1))
        ################################################################

        q_a_s = self.item_emb(u_user_item_pad[:, :, :, 0])
        u_user_item_er = self.rating_emb(u_user_item_pad[:, :, :, 1])
        # Concatenate along the embedding axis (dim=3) of the 4-D tensors
        x_ia_s = self.g_v(torch.cat([q_a_s, u_user_item_er], dim=3).view(-1, 2 * self.emb_dim)).view(q_a_s.size())
        mask_s = torch.where(u_user_item_pad[:, :, :, 0] > 0, torch.tensor([1.], device=self.device),
                             torch.tensor([0.], device=self.device))
        p_i_s = mask_s.unsqueeze(3).expand_as(x_ia_s) * self.user_emb(u_user_pad).unsqueeze(2).expand_as(x_ia_s)
        alpha_s = self.user_item_attn(torch.cat([x_ia_s, p_i_s], dim=3).view(-1, 2 * self.emb_dim)).view(mask_s.size())
        alpha_s = torch.exp(alpha_s) * mask_s
        alpha_s = alpha_s / (torch.sum(alpha_s, 2).unsqueeze(2).expand_as(alpha_s) + self.eps)
        h_oI_temp = torch.sum(alpha_s.unsqueeze(3).expand_as(x_ia_s) * x_ia_s, 2)
        h_oI = self.aggr_items(h_oI_temp.view(-1, self.emb_dim)).view(h_oI_temp.size())

        beta = self.user_user_attn(torch.cat([h_oI, self.user_emb(u_user_pad)], dim=2).view(-1, 2 * self.emb_dim)).view(
            u_user_pad.size())
        mask_su = torch.where(u_user_pad > 0, torch.tensor([1.], device=self.device),
                              torch.tensor([0.], device=self.device))
        beta = torch.exp(beta) * mask_su
        beta = beta / (torch.sum(beta, 1).unsqueeze(1).expand_as(beta) + self.eps)
        ################################################################
        # Comment out the line below to disable the social-space operation
        ################################################################
        h_iS = self.aggr_neighbors(torch.sum(beta.unsqueeze(2).expand_as(h_oI) * h_oI, 1))
        ################################################################
        
        ################################################################
        # Uncomment the relevant user latent factor below.
        # Note: `self.mlp` expects 2 * emb_dim input features, so for the
        # two single-space variants its first layer must be changed to
        # nn.Linear(self.emb_dim, self.emb_dim).
        ################################################################
        # Default: Includes both operations
        h_i = self.mlp(torch.cat([h_iI, h_iS], dim=1))
        # Item-space operation eliminated
        # h_i = self.mlp(h_iS)
        # Social-space operation eliminated
        # h_i = self.mlp(h_iI)
        ################################################################

        return h_i

### 4.5.&nbsp;&nbsp;Model $X_a$<a class="anchor" id="model-x"></a>

<img src="./images/rating.png" alt="rating" style="width: 400px;"/>
<h5 align="center">Figure 5: Model $X$ framework displaying the two main components needed for a rating prediction</h5>

The latent factors gained from the previous sections are concatenated, $\boldsymbol{\mathrm{h}}_i \oplus \boldsymbol{\mathrm{z}}_j$, and passed through an MLP to get the final predicted rating $r_{ij}^{'}$:

In [16]:
class GraphNeuralNetwork(nn.Module):
    def __init__(self, users, items, ratings, emb_dim=64):
        super(GraphNeuralNetwork, self).__init__()
        self.n_users = users
        self.n_items = items
        self.n_ratings = ratings
        self.emb_dim = emb_dim

        self.user_emb = nn.Embedding(self.n_users, self.emb_dim, padding_idx=0)
        self.item_emb = nn.Embedding(self.n_items, self.emb_dim, padding_idx=0)
        self.rating_emb = nn.Embedding(self.n_ratings, self.emb_dim, padding_idx=0)

        self.user_model = UserModelling(self.emb_dim, self.user_emb, self.item_emb, self.rating_emb)
        self.item_model = ItemModelling(self.emb_dim, self.user_emb, self.item_emb, self.rating_emb)

        self.mlp = nn.Sequential(
            nn.Linear(2 * self.emb_dim, self.emb_dim, bias=True),
            nn.ReLU(),
            nn.Linear(self.emb_dim, self.emb_dim, bias=True),
            nn.ReLU(),
            nn.Linear(self.emb_dim, 1)
        )

    def forward(self, uids, iids, u_item_pad, u_user_pad, u_user_item_pad, i_user_pad):
        h_i = self.user_model(uids, u_item_pad, u_user_pad, u_user_item_pad)
        z_j = self.item_model(iids, i_user_pad)

        r_ij = self.mlp(torch.cat([h_i, z_j], dim=1))

        return r_ij

## 5&nbsp;&nbsp;Training <a class="anchor" id="training"></a> [🔝](#contents)

### 5.1.&nbsp;&nbsp;Preprocess dataset and create graph<a class="anchor" id="preprocess-dataset"></a>

<div class="alert alert-block alert-warning">
    <b>Note:</b> <li>If running on a <b>Google Colab</b> instance, please change the filepaths accordingly:<ul><code>'/content/dataset_epinions.pkl' OR '/content/dataset_ciao.pkl'</code></ul><ul><code>'/content/list_epinions.pkl' OR '/content/list_ciao.pkl'</code></ul></li>
    <li>If running on a <b>local machine</b> instance, please change the filepaths accordingly:<ul><code>'./data/dataset_epinions.pkl' OR './data/dataset_ciao.pkl'</code></ul><ul><code>'./data/list_epinions.pkl' OR './data/list_ciao.pkl'</code></ul></li>
</div>
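
One way to avoid editing the paths by hand is a small helper that picks the folder from the environment. The sketch below is only an illustration: `data_paths` is a hypothetical function, and the file names assume the `dataset_<name>.pkl` / `list_<name>.pkl` pattern used in the cell that follows.

```python
def data_paths(dataset, on_colab=False):
    """Return (ratings_path, lists_path) for 'epinions' or 'ciao' (hypothetical helper)."""
    if dataset not in ("epinions", "ciao"):
        raise ValueError(f"unknown dataset: {dataset!r}")
    # Colab unzips the archive into /content; a local run reads from ./data
    root = "/content" if on_colab else "./data"
    return f"{root}/dataset_{dataset}.pkl", f"{root}/list_{dataset}.pkl"


dataset_path, list_path = data_paths("epinions", on_colab=False)
print(dataset_path, list_path)
```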

In [17]:
with open('./data/dataset_epinions.pkl', 'rb') as f:
    train = pickle.load(f)
    validate = pickle.load(f)
    test = pickle.load(f)

with open('./data/list_epinions.pkl', 'rb') as f:
    user_item_list = pickle.load(f)
    user_user_list = pickle.load(f)
    user_user_item_list = pickle.load(f)
    item_user_list = pickle.load(f)
    (user_count, item_count, rate_count) = pickle.load(f)

In [19]:
batch_size = 128

train_data = CreateInteractions(train, user_item_list, user_user_list, user_user_item_list, item_user_list)
valid_data = CreateInteractions(validate, user_item_list, user_user_list, user_user_item_list, item_user_list)
test_data = CreateInteractions(test, user_item_list, user_user_list, user_user_item_list, item_user_list)

train_loader = DataLoader(train_data, batch_size=batch_size, shuffle=True, collate_fn=collate_fn)
valid_loader = DataLoader(valid_data, batch_size=batch_size, shuffle=False, collate_fn=collate_fn)
test_loader = DataLoader(test_data, batch_size=batch_size, shuffle=False, collate_fn=collate_fn)

### 5.2.&nbsp;&nbsp;Hyperparameters and setting up the model<a class="anchor" id="params"></a>

In [26]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
embed_dim = 64
learning_rate = 1e-3
num_epochs = 25

model = GraphNeuralNetwork(user_count+1, item_count+1, rate_count+1, embed_dim).to(device)
optimizer = torch.optim.RMSprop(model.parameters(), learning_rate)
criterion = nn.MSELoss()
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size = 4, gamma = 0.1)

print(model)

GraphNeuralNetwork(
  (user_emb): Embedding(22167, 64, padding_idx=0)
  (item_emb): Embedding(296278, 64, padding_idx=0)
  (rating_emb): Embedding(28, 64, padding_idx=0)
  (user_model): UserModelling(
    (user_emb): Embedding(22167, 64, padding_idx=0)
    (item_emb): Embedding(296278, 64, padding_idx=0)
    (rating_emb): Embedding(28, 64, padding_idx=0)
    (g_v): MLP(
      (mlp): Sequential(
        (0): Linear(in_features=128, out_features=64, bias=True)
        (1): ReLU()
        (2): Linear(in_features=64, out_features=64, bias=True)
      )
    )
    (user_item_attn): MLP(
      (mlp): Sequential(
        (0): Linear(in_features=128, out_features=64, bias=True)
        (1): ReLU()
        (2): Linear(in_features=64, out_features=1, bias=True)
      )
    )
    (aggr_items): Aggregator(
      (mlp): Sequential(
        (0): Linear(in_features=64, out_features=64, bias=True)
        (1): ReLU()
      )
    )
    (user_user_attn): MLP(
      (mlp): Sequential(
        (0): Linear(
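
The `StepLR` scheduler configured above multiplies the learning rate by `gamma` every `step_size` epochs. The plain-Python sketch below reproduces that rule for the settings used here so the decay can be inspected without training; `step_lr` is a hypothetical helper name and not part of PyTorch.

```python
def step_lr(base_lr, epoch, step_size=4, gamma=0.1):
    """Learning rate in effect during 0-based `epoch` under step decay."""
    return base_lr * gamma ** (epoch // step_size)


# Settings taken from the cell above: 25 epochs, lr=1e-3, step_size=4, gamma=0.1.
lr_schedule = [step_lr(1e-3, epoch) for epoch in range(25)]
for epoch in (0, 4, 8, 24):
    print(f"epoch {epoch + 1:2d}: lr = {lr_schedule[epoch]:.1e}")
```

Under these settings the learning rate has dropped by six orders of magnitude by the final epoch, so most of the parameter updates of meaningful size plausibly happen in the first few epochs.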

### 5.3.&nbsp;&nbsp;Train the model<a class="anchor" id="train"></a>


<div class="alert alert-block alert-warning">
    <b>Note:</b> <li>Ensure there is an empty <code>'/trained_models/'</code> folder in the root directory i.e., <code>'~/postgraduate-research-project/trained_models/'</code>.</li>
    <li>This applies to <b>both</b> <code>Google Colab</code> and local machine instances.</li>
</div>

In [None]:
for epoch in range(num_epochs):
    # Training
    model.train()
    s_loss = 0
    for i, (user_id, item_id, labels, user_item, user_user, user_user_item, item_user) in tqdm(enumerate(train_loader),
                                                                                               total=len(train_loader)):
        user_id = user_id.to(device)
        item_id = item_id.to(device)
        labels = labels.to(device)
        user_item = user_item.to(device)
        user_user = user_user.to(device)
        user_user_item = user_user_item.to(device)
        item_user = item_user.to(device)

        optimizer.zero_grad()
        outputs = model(user_id, item_id, user_item, user_user, user_user_item, item_user)
        loss = criterion(outputs, labels.unsqueeze(1))

        loss.backward()
        optimizer.step()

        loss_val = loss.item()
        s_loss += loss_val

        iter_num = epoch * len(train_loader) + i + 1

    # Validating
    model.eval()
    errors = []
    with torch.no_grad():
        for user_id, item_id, labels, user_item, user_user, user_user_item, item_user in tqdm(valid_loader):
            user_id = user_id.to(device)
            item_id = item_id.to(device)
            labels = labels.to(device)
            user_item = user_item.to(device)
            user_user = user_user.to(device)
            user_user_item = user_user_item.to(device)
            item_user = item_user.to(device)
            preds = model(user_id, item_id, user_item, user_user, user_user_item, item_user)
            error = torch.abs(preds.squeeze(1) - labels)
            errors.extend(error.data.cpu().numpy().tolist())

    # Evaluation metrics
    mae = np.mean(errors)
    rmse = np.sqrt(np.mean(np.power(errors, 2)))

    scheduler.step()

    ckpt_dict = {
        'epoch': epoch + 1,
        'state_dict': model.state_dict(),
        'optimizer': optimizer.state_dict()
    }

    torch.save(ckpt_dict, 'trained_models/latest_checkpoint.pth')

    # Save the model with the lowest validation MAE for testing
    if epoch == 0 or mae < best_mae:
        best_mae = mae
        torch.save(ckpt_dict, 'trained_models/best_checkpoint_{}.pth'.format(embed_dim))

    print('Epoch #{}: MAE: {:.4f}, RMSE: {:.4f}, Best MAE: {:.4f}'.format(epoch + 1, mae, rmse, best_mae))

## 6&nbsp;&nbsp;Testing <a class="anchor" id="testing"></a> [🔝](#contents)

The model is evaluated with two negatively-oriented error metrics, for which lower values indicate more accurate predictions of the rating a user gives to an item. Both metrics serve as an indication of the error in the final rating prediction. Training was run for $25$ epochs, and the checkpoint with the lowest validation MAE was saved to be used later for testing. The two metrics are:

* **Mean Absolute Error** (MAE): Measures the average magnitude of the errors between the predictions $y_i$ and the true ratings $x_i$ over $n$ points, without considering their direction:
$$
\begin{align}
    \mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - x_i|
\end{align}
$$

* **Root Mean Squared Error** (RMSE): Similar to MAE, but the errors are squared before averaging and the square root of the average is taken, which makes the metric more sensitive to large errors:

$$
\begin{align}
    \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(|y_i - x_i|\right)^2}
\end{align}
$$
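
As a worked example of the two formulas, the sketch below computes both metrics for four toy predictions in plain Python. The function names and the numbers are illustrative assumptions and do not come from the experiments.

```python
import math


def mean_absolute_error(predictions, targets):
    """Average of |y_i - x_i| over all points."""
    errors = [abs(y - x) for y, x in zip(predictions, targets)]
    return sum(errors) / len(errors)


def root_mean_squared_error(predictions, targets):
    """Square root of the average of (y_i - x_i)^2 over all points."""
    squared = [(y - x) ** 2 for y, x in zip(predictions, targets)]
    return math.sqrt(sum(squared) / len(squared))


predicted = [4.0, 3.0, 5.0, 2.0]
actual = [5.0, 3.0, 4.0, 4.0]
toy_mae = mean_absolute_error(predicted, actual)
toy_rmse = root_mean_squared_error(predicted, actual)
print(f"MAE: {toy_mae:.4f}, RMSE: {toy_rmse:.4f}")  # MAE: 1.0000, RMSE: 1.2247
```

The single error of size $2$ raises the RMSE above the MAE, which illustrates the greater sensitivity of RMSE to large errors.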

In [None]:
embed_dim = 64
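# map_location lets a checkpoint trained on a GPU be loaded on a CPU-only machine
checkpoint = torch.load('trained_models/best_checkpoint_{}.pth'.format(embed_dim), map_location=device)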
model = GraphNeuralNetwork(user_count+1, item_count+1, rate_count+1, embed_dim).to(device)
model.load_state_dict(checkpoint['state_dict'])

model.eval()
test_errors = []
with torch.no_grad():
    for user_id, item_id, labels, user_item, user_user, user_user_item, item_user in tqdm(test_loader):
        user_id = user_id.to(device)
        item_id = item_id.to(device)
        labels = labels.to(device)
        user_item = user_item.to(device)
        user_user = user_user.to(device)
        user_user_item = user_user_item.to(device)
        item_user = item_user.to(device)
        predictions = model(user_id, item_id, user_item, user_user, user_user_item, item_user)
        error = torch.abs(predictions.squeeze(1) - labels)
        test_errors.extend(error.data.cpu().numpy().tolist())

test_mae = np.mean(test_errors)
test_rmse = np.sqrt(np.mean(np.power(test_errors, 2)))
print('Test: MAE: {:.4f}, RMSE: {:.4f}'.format(test_mae, test_rmse))

## 7&nbsp;&nbsp;Results <a class="anchor" id="results"></a> [🔝](#contents)

The framework with the associated variants was benchmarked against published recommendation systems from academic literature, where the results were averaged to enable comparison. 

##### The links in the references below will take the reader to the implementations that were used for this research.

* **SoRec**: Ma, H., Yang, H., Lyu, M.R. and King, I., 2008, October. [SoRec: Social Recommendation using Probabilistic Matrix Factorisation.](https://github.com/SeizeTheMoment/SoRec-Social-Recommendation-Using-Probabilistic-Matrix-Factorization) _In Proceedings of the 17th ACM Conference on Information and Knowledge Management_ (pp. 931-940).

* **SoReg**: Ma, H., Zhou, D., Liu, C., Lyu, M.R. and King, I., 2011, February. [Recommender Systems with Social Regularization.](https://github.com/Coder-Yu/QRec) _In Proceedings of the fourth ACM International Conference on Web Search and Data Mining_ (pp. 287-296).

* **DeepSoR**: Fan, W., Li, Q. and Cheng, M., 2018, April. [Deep Modelling of Social Relations for Recommendation.](https://github.com/wenqifan03/GraphRec-WWW19) _In Proceedings of the AAAI Conference on Artificial Intelligence_ (Vol. 32, No. 1).

* **GC-MC**: Berg, R.V.D., Kipf, T.N. and Welling, M., 2017. [Graph Convolutional Matrix Completion.](https://github.com/riannevdberg/gc-mc) _arXiv preprint_ arXiv:1706.02263.

### Evaluation metrics for all the models

|  Dataset | Metric | Algorithm |       |         |         |       |         |         |
|:--------:|:------:|:---------:|:-----:|:-------:|:-------:|:-----:|:-------:|:-------:|
|          |        |   **SoRec**   | **SoReg** | **DeepSoR** | **GCMC+SN** | $\boldsymbol{X_a}$    | $\boldsymbol{X_b}$ | $\boldsymbol{X_c}$ |
|   **Ciao**   | _MAE_    | $0.87$      | $0.85$  | $0.82$    | $0.81$    |$0.73$          | $0.79$       | $0.88$       |
|          | _RMSE_   | $1.04$      | $1.06$  | $1.03$    | $1.02$    | $1.00$          | $0.98$       | $1.01$       |
| **Epinions** | _MAE_    | $1.09$      | $1.07$  | $0.89$    | $1.01$    |$1.01$          | $1.04$       | $1.02$       |
|          | _RMSE_   | $1.14$      | $1.17$  | $1.09$    | $1.07$    | $0.87$          | $1.02$       | $0.99$     |

<img src="./images/mae.png" alt="mae" style="width: 500px;"/>
<h5 align="center">Figure 6: MAE for both datasets</h5>

<img src="./images/rmse.png" alt="rmse" style="width: 500px;"/>
<h5 align="center">Figure 7: RMSE for both datasets</h5>

### Weighted averages

| Algorithm | Dataset |          |
|:---------:|:-------:|:--------:|
|           |   **Ciao**  | **Epinions** |
|   **SoRec**   |  $0.925$  |   $1.115$  |
|   **SoReg**   |  $0.950$  |   $1.120$  |
|  **DeepSoR**  |  $0.985$  |   $1.000$  |
|   **GC-MC**   |  $0.945$  |   $1.040$ |
|    $\boldsymbol{X_a}$    |  $0.885$  |   $0.980$  |
|    $\boldsymbol{X_b}$   |  $0.940$  |   $1.000$  |
|    $\boldsymbol{X_c}$    |  $0.945$  |   $0.945$  |

<img src="./images/average.png" alt="rating" style="width: 500px;"/>
<h5 align="center">Figure 8: Averaged metrics for all the tested models</h5>

## 8&nbsp;&nbsp;References <a class="anchor" id="references"></a> [🔝](#contents)

<sup>1</sup>Wu, S., Sun, F., Zhang, W., Xie, X. and Cui, B., 2020. Graph Neural Networks in Recommender Systems: A Survey. <i>ACM Computing Surveys</i> (CSUR).<br/>
<sup>2</sup>Fan, W., Ma, Y., Li, Q., He, Y., Zhao, E., Tang, J. and Yin, D., 2019, May. Graph Neural Networks for Social Recommendation. <i>In The World Wide Web Conference</i> (pp. 417-426).

## 9&nbsp;&nbsp;Notes <a class="anchor" id="notes"></a> [🔝](#contents)

* This project was developed for the [MSc. Artificial Intelligence](https://www.qmul.ac.uk/postgraduate/taught/coursefinder/courses/artificial-intelligence-msc/) programme as a dissertation project at Queen Mary, University of London. 

* The motivation for this project aligns with the author's life ambition and interest in modelling relationships found in the Criminal Justice System (CJS) of the U.K. using a 21st-century technological perspective.  Contributing factors include prisoner recidivism and associated influences, and leveraging big data to analyse the bias in the foundational triumvirate of law: judge-lawyer-defendant. Please review the [Lammy Report](https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/643001/lammy-review-final-report.pdf) to read about missing state intervention in the rising recidivism rate.

* To cite, please use the following information:

_BibTeX_:

`@misc{asif_nanjangud_2022,`<br />
    `title = {Recommendation System using Graph Neural Networks},`<br />
    `url = {https://github.com/mughees-asif/postgraduate-research-project},`<br />
    `journal = {GitHub},`<br />
    `publisher = {Mughees Asif},`<br />
    `author = {Asif, Mughees and Nanjangud, Angadh},`<br />
    `year = {2022},`<br />
    `month = {Aug}`<br />
   `}`

_Harvard_:

`Asif, M. and Nanjangud, A. (2022). Recommendation System using Graph Neural Networks.`

_APA_:

`Asif, M., & Nanjangud, A. (2022). Recommendation System using Graph Neural Networks.`

<h1 style='color:purple'><center><a href='#contents'>🔝</a>&#8592;&#8592;&#8592;&#8592; END &#8594;&#8594;&#8594;&#8594;<a href='#contents'>🔝</a></center></h1>