# Download the code and install the dependencies

In [None]:
!git clone https://github.com/YuvalUner/dEFEND_paper_reproduction.git

In [None]:
import os
os.chdir('dEFEND_paper_reproduction')

In [None]:
!pip install -r requirements.txt

In [None]:
!python -m spacy download en_core_web_sm

Finally, the model itself uses the `glove.6B.100d.txt` embeddings.\
You can download them from Kaggle [here](https://www.kaggle.com/datasets/sawarn69/glove6b100dtxt) or from the original source [here](https://nlp.stanford.edu/projects/glove/).\
You can place the file in the "data" directory or anywhere else you like; if you place it elsewhere, change the `--embedding_path` argument accordingly.
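
Before training, it can help to confirm that the embedding file is where you expect and has the 100-dimensional layout the model assumes. The following is a minimal sketch and not part of the repository: `check_glove_file` is a hypothetical helper that reads the first few lines of a GloVe text file and checks that each line holds a token followed by the expected number of numeric values.

```python
from pathlib import Path


def check_glove_file(path, expected_dim=100, max_lines=5):
    """Validate the first few lines of a GloVe text file and return their tokens.

    Hypothetical helper (not part of the repository). GloVe text files store
    one token per line, followed by its space-separated vector components.
    """
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"Embedding file not found: {path}")

    tokens = []
    with path.open(encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            if line_no > max_lines:
                break
            parts = line.rstrip("\n").split(" ")
            token, values = parts[0], parts[1:]
            if len(values) != expected_dim:
                raise ValueError(
                    f"Line {line_no}: expected {expected_dim} values, got {len(values)}"
                )
            for value in values:
                float(value)  # raises ValueError if a component is not numeric
            tokens.append(token)
    return tokens
```

For example, `check_glove_file("data/glove.6B.100d.txt")` should return the first five tokens if the file is in the default location, and raise an error otherwise.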

# Optional: Preprocess the data

Running the cell below will preprocess the data we have provided in the "data" directory.\
If you intend to train the model more than once, we recommend preprocessing the data once with this cell and saving the result, as preprocessing can take a while.\
Otherwise, you can skip this cell and the model will preprocess the data before training. If you skip it, remember to set the `--require_preprocessing` argument to `True` in the training cell.\
Also, change the `--dataset_name` argument to the name of the dataset you want to preprocess (either `"politifact"` or `"gossipcop"`).

In [None]:
!python preprocess.py --dataset_name politifact --dataroot data

# Train the model

Running the cell below will train the model on the specified dataset.\
Please refer to the `--help` flag (`python train.py --help`) for more information on the available arguments.\
\
Change the `--use_comments` argument to `True` if you want to use the article-comments pairs in the dataset.\
However, be aware that the dataset we provide contains auto-generated comments, because we could not access the original ones.\
As such, the comments may not be very useful for training, even though the model can use them as described in the original paper.

In [None]:
!python train.py --gpu_ids 0 --dataset_name politifact --dataroot data --require_preprocessing False --save_epoch_freq 5 --use_comments False --name "defend_politifact"

# Model explainability

One of the main contributions of dEFEND is its explainability.\
In the cells below, you can load the trained model, and use it to make explainable predictions on article / article-comments pairs.
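
In the original paper, explanations come from attention weights: the sentences and comments that receive the most attention are shown as the reasons for a prediction. The sketch below illustrates only that ranking idea on made-up data. It is not the repository's code: `top_k_by_attention` is a hypothetical helper, and the sentences and weights are invented for illustration.

```python
def top_k_by_attention(items, weights, k=3):
    """Return the k (item, weight) pairs with the largest weights, highest first."""
    if len(items) != len(weights):
        raise ValueError("items and weights must have the same length")
    ranked = sorted(zip(items, weights), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Made-up sentences and attention weights, for illustration only.
sentences = ["Sentence A.", "Sentence B.", "Sentence C.", "Sentence D."]
attention = [0.10, 0.45, 0.05, 0.40]

for sentence, weight in top_k_by_attention(sentences, attention, k=2):
    print(f"{weight:.2f}  {sentence}")
```

The same ranking applies to comments when the model is trained with `use_comments` enabled.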

## Import the model and set up the options

In [None]:
from dEFEND_paper_reproduction import *
import model

The model requires these options to be set in order to load.\
Set `name` to the name of the model you want to load, and update `use_comments` and `embedding_path` if you changed the corresponding arguments during training.

In [None]:
import argparse

options_dict = {
    "dataroot": "data",
    "embedding_path": "data/glove.6B.100d.txt",
    "gpu_ids": [0],
    "batch_size": 30,
    "max_sentence_len": 120,
    "max_sentence_count": 50,
    "max_comment_count": 50,
    "max_comment_len": 120,
    "embedding_dim": 100,
    "vocab_size": 20000,
    "name": "defend_politifact",
    "bidirectional": True,
    "RMSprop_ro_param": 0.9,
    "RMSprop_eps": 0.1,
    "RMSprop_decay": 0.0,
    "max_epochs": 20,
    "checkpoints_dir": "./checkpoints",
    "save_epoch_freq": 1,
    "d": 100,
    "k": 80,
    "lr": 0.02,
    "use_comments": False,
    "require_preprocessing": False,
    "dataset_name": "politifact"
}

opt = argparse.Namespace(**options_dict)
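
If you want to switch between several configurations (for example, a model trained on each dataset), one option is to treat the dictionary as a set of defaults and build variants from it. This is a minimal sketch under stated assumptions: `make_options` is a hypothetical helper, the abbreviated `defaults` dictionary stands in for the full `options_dict` above, and `"defend_gossipcop"` is an example name for a model you would have trained yourself.

```python
import argparse


def make_options(defaults, **overrides):
    """Build an argparse.Namespace from defaults, rejecting unknown option names."""
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError(f"Unknown option(s): {sorted(unknown)}")
    return argparse.Namespace(**{**defaults, **overrides})


# Abbreviated stand-in for the full options_dict defined above.
defaults = {
    "name": "defend_politifact",
    "use_comments": False,
    "dataset_name": "politifact",
}

# Hypothetical variant: a model trained on the gossipcop dataset.
opt_gossipcop = make_options(defaults, name="defend_gossipcop", dataset_name="gossipcop")
print(opt_gossipcop)
```

Rejecting unknown keys catches typos such as `use_comment`, which would otherwise silently add a new attribute that the model never reads.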

## Load the model

In [None]:
if opt.use_comments:
    defend = model.Defend(opt)
else:
    defend = model.DefendNoComments(opt)

In [None]:
defend.load_model(f"{opt.checkpoints_dir}/{opt.name}.pt")
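
If loading fails, the most likely cause is a checkpoint path that does not match what training produced. The sketch below assumes the `{checkpoints_dir}/{name}.pt` layout used in the cell above; `resolve_checkpoint` is a hypothetical helper, not part of the repository, and the file names written during training may differ (for instance, if intermediate epochs are saved under other names).

```python
from pathlib import Path


def resolve_checkpoint(opt):
    """Return the checkpoint path for opt, or raise a descriptive error if missing.

    Assumes checkpoints live at <checkpoints_dir>/<name>.pt, as in the cell above.
    """
    path = Path(opt.checkpoints_dir) / f"{opt.name}.pt"
    if not path.is_file():
        parent = path.parent
        # List the .pt files that do exist, to make a naming mismatch easy to spot.
        available = sorted(p.name for p in parent.glob("*.pt")) if parent.is_dir() else []
        raise FileNotFoundError(f"No checkpoint at {path}. Available: {available}")
    return path
```

You could then call `defend.load_model(str(resolve_checkpoint(opt)))` in place of the hard-coded path.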

## Load the data

In [None]:
from data import load_articles_with_comments
articles, comments, true_labels = load_articles_with_comments(opt)

## Predict and explain

In [None]:
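# Pick one example; use separate names so the full `comments` list is not overwritten.
idx = 0
article, article_comments, label = articles[idx], comments[idx], true_labels[idx]

In [None]:
print(article)

In [None]:
print(label)

In [None]:
# Explain the same article that was printed above, together with its own comments.
pred, top_sent, top_com = defend.predict_explain(article, article_comments)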
print(pred)
print(top_sent)
print(top_com)