# Attention Fusion of Text and User Embeddings
## Model training notebook

This notebook demonstrates how to train the models using Google Colab or Amazon SageMaker Studio.

The notebook clones [our fork](https://github.com/guptaviha/GF-OLD) of the code published by Miao 2022. The original [code repository](https://github.com/mzx4936/GF-OLD) and [data repository](https://github.com/mzx4936/GF-OLD-Dataset) are also available. Our modifications are:
1. adding implementations of GATv2 and pre-trained RoBERTa models; and
2. modifying the main training loop to run multiple iterations of a given model and to save training metrics and test results to disk.
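
The second modification can be pictured with a minimal sketch of a multi-trial loop. This is not the code from the fork: `run_trials`, `train_once`, `fake_train`, and the `trial_<n>.json` file layout are hypothetical names chosen for illustration, and the sketch only assumes that one training run returns a dictionary of metrics.

```python
import json
import tempfile
from pathlib import Path


def run_trials(train_once, num_trials, out_dir):
    """Call train_once(seed=...) num_trials times and save each result as JSON."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    all_metrics = []
    for trial in range(num_trials):
        # Hypothetical contract: train_once returns a JSON-serializable dict.
        metrics = train_once(seed=trial)
        (out / f"trial_{trial}.json").write_text(json.dumps(metrics, indent=2))
        all_metrics.append(metrics)
    return all_metrics


def fake_train(seed):
    # Stand-in for a real training run; the values are placeholders, not results.
    return {"seed": seed, "epochs": 20}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        run_trials(fake_train, num_trials=3, out_dir=tmp)
        print(sorted(p.name for p in Path(tmp).iterdir()))
```

Writing one file per trial means that an interrupted session still leaves the completed trials on disk.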

**Note:** The models can be trained on a CPU, but we strongly recommend using a GPU instance if one is available. Training on a GPU can take up to 1 hour; training on a CPU will take significantly longer.
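
As a complement to `nvidia-smi`, the same check can be made from Python. The sketch below assumes only that PyTorch may or may not be installed; `pick_device` is a hypothetical helper name and is not part of the training code.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # PyTorch is not installed, so no GPU can be used
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Training would run on: {device}")
```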

In [None]:
# to check if GPU is running...
!nvidia-smi

Sun Dec 18 19:41:59 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   51C    P0    26W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

For our experiments, we saved checkpoints and results to Google Drive to guard against service timeouts and interruptions. Uncomment the code cell below to mount a Google Drive, if desired.

In [None]:
#from google.colab import drive
#drive.mount('/content/drive')
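
If the Drive mount is optional, the log location can be chosen at run time. The following is a minimal sketch, assuming the Drive appears under `/content/drive/MyDrive` once mounted (the path used for `--log-path` later in this notebook); `resolve_log_dir` and the local `logs` fallback directory are hypothetical.

```python
from pathlib import Path


def resolve_log_dir(drive_root="/content/drive/MyDrive", fallback="logs"):
    """Use a folder on Google Drive when it is mounted, else a local folder."""
    root = Path(drive_root)
    if root.is_dir():
        # Drive is mounted: keep logs where they survive a runtime reset.
        log_dir = root / "dl-project" / "logs"
    else:
        # Drive is not mounted: fall back to the local (ephemeral) disk.
        log_dir = Path(fallback)
    log_dir.mkdir(parents=True, exist_ok=True)
    return log_dir


if __name__ == "__main__":
    print(f"Logs will be written to: {resolve_log_dir()}")
```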

## Install dependencies

At the time of running, Google Colab has these package versions installed:

```
NumPY version:  1.21.6
pandas version: 1.3.5
PyTorch version: 1.13.0+cu116
```

For replication of this notebook, we advise using the same package versions.
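
To make a version mismatch visible before training, the installed packages can be compared against the versions listed above. This is a sketch using the standard library's `importlib.metadata`; `EXPECTED` and `find_mismatches` are hypothetical names, and the distribution names `numpy`, `pandas`, and `torch` are assumed to match the installed packages.

```python
from importlib import metadata

# Versions reported by Google Colab at the time of running (see above).
EXPECTED = {"numpy": "1.21.6", "pandas": "1.3.5", "torch": "1.13.0+cu116"}


def find_mismatches(expected, get_version=metadata.version):
    """Return {package: (expected, found)} for every package that differs."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


for name, (wanted, found) in find_mismatches(EXPECTED).items():
    print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```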

In [2]:
import numpy as np
import pandas as pd
import torch

print(f'NumPY version:  {np.__version__}')
print(f'pandas version: {pd.__version__}')
print(f'PyTorch version: {torch.__version__}')

NumPY version:  1.21.6
pandas version: 1.3.5
PyTorch version: 1.13.0+cu116


In [None]:
# install dependencies
# Might have to restart runtime after this
!pip install dgl-cu116 dglgo -f https://data.dgl.ai/wheels/repo.html

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Looking in links: https://data.dgl.ai/wheels/repo.html


In [None]:
# install remaining dependencies
!pip install emoji
!pip install wordsegment
!pip install ekphrasis
!pip install transformers

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [None]:
# Remove any previous checkout, then download the code
!rm -rf GF-OLD/ && git clone https://github.com/guptaviha/GF-OLD.git

Cloning into 'graph-fusion-old'...
remote: Enumerating objects: 280, done.
remote: Counting objects: 100% (133/133), done.
remote: Compressing objects: 100% (105/105), done.
remote: Total 280 (delta 45), reused 95 (delta 25), pack-reused 147
Receiving objects: 100% (280/280), 4.15 MiB | 18.88 MiB/s, done.
Resolving deltas: 100% (124/124), done.


In [None]:
# change to directory GF-OLD
%cd /content/GF-OLD

/content/graph-fusion-old


In [None]:
# Clean cached version of fine-tuned RoBERTa model from HuggingFace
!rm -rf cardiffnlp/

The following cell runs a single iteration of our GATv2+TwitterRoBERTa model and saves the results to the local disk (`GF-OLD/results` and `GF-OLD/saved_models`).

In [None]:
!python train_joint.py \
  -bs=32 \
  -lr_other=1e-5 \
  -lr_gat=1e-2 \
  -ep=20 \
  -dr=0.5 \
  -ad=0.1 \
  -hs=768 \
  --model=jointv2_twitter_roberta \
  --clip \
  --cuda=1 \
  --num-trials=1 \
#   --log-path=/content/drive/MyDrive/dl-project/logs/final
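
The command above can also be assembled in Python, which makes it easier to vary the number of trials or to add a log path only when one is wanted. This is a sketch: `build_train_command` is a hypothetical helper, and it uses only the flags and values that appear in this notebook.

```python
import shlex


def build_train_command(num_trials=1, log_path=None):
    """Return the train_joint.py invocation as a list of arguments."""
    args = [
        "python", "train_joint.py",
        "-bs=32", "-lr_other=1e-5", "-lr_gat=1e-2", "-ep=20",
        "-dr=0.5", "-ad=0.1", "-hs=768",
        "--model=jointv2_twitter_roberta", "--clip", "--cuda=1",
        f"--num-trials={num_trials}",
    ]
    if log_path is not None:
        # Only pass --log-path when a destination was requested.
        args.append(f"--log-path={log_path}")
    return args


# Print the command as a single shell-quoted line for inspection.
print(shlex.join(build_train_command(num_trials=1)))
```

The resulting list can be passed to `subprocess.run(..., check=True)` from the repository directory instead of using the `!` shell escape.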

The following cell runs the same configuration for ten iterations and shows how to save the results to Google Drive, assuming it has been mounted. The cell output saved below shows the results of the training session.

In [None]:
# !python train_joint.py \
#   -bs=32 \
#   -lr_other=1e-5 \
#   -lr_gat=1e-2 \
#   -ep=20 \
#   -dr=0.5 \
#   -ad=0.1 \
#   -hs=768 \
#   --model=jointv2_twitter_roberta \
#   --clip \
#   --cuda=1 \
#   --num-trials=10 \
#   --log-path=/content/drive/MyDrive/dl-project/logs/final

  self.tok = re.compile(r"({})".format("|".join(pipeline)))
Reading twitter - 1grams ...
Reading twitter - 2grams ...
  regexes = {k.lower(): re.compile(self.expressions[k]) for k, v in
Reading twitter - 1grams ...
Graph(num_nodes=1260, num_edges=10137,
      ndata_schemes={'features': Scheme(shape=(2,), dtype=torch.float32)}
      edata_schemes={})
Some weights of the model checkpoint at cardiffnlp/twitter-roberta-base-offensive were not used when initializing RobertaModel: ['classifier.dense.weight', 'classifier.dense.bias', 'classifier.out_proj.bias', 'classifier.out_proj.weight']
- This IS expected if you are initializing RobertaModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification m