# Visualizing Paragraph Attention

This notebook contains the code for the project:

*From RoBERTa to aLEXa: **A**utomated **L**egal **Ex**pert **A**rbitrator for Neural Legal Judgement Prediction*.

## 1. Installations & Imports

**This code should be run from within the src directory.**

### 1.2 Imports

In [1]:
import torch
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from datasets import load_dataset, DatasetDict
from tqdm import tqdm

### 1.3 Initialization

In [2]:
%matplotlib inline

In [3]:
# matplotlib.pyplot is already imported as plt in the imports cell above
plt.rcParams["figure.figsize"] = (20, 8)  # wide default figure size

## 2. Load Model

In [4]:
from dataset import generate_echr_dataset
from preprocessing import preprocess_dataset
from model import load_model

In [5]:
%load_ext autoreload
%autoreload 2

In [40]:
# Variables
n_subset = 20
max_paragraph_len = 224
max_paragraphs = 48
num_labels = 1
base_model = 'bert-base-uncased'
model_path = 'model_outputs/alexa-bert-binary-t1/pytorch_model.bin'

# Load dataset
dataset = generate_echr_dataset('..', n_subset=n_subset, attention_forcing=True, shuffle=False)
dataset = preprocess_dataset(dataset, 'binary', base_model, False, True, max_paragraphs, max_paragraph_len, remove_cols=False)

Loading dataset from path: ..
100%|██████████| 7100/7100 [00:00<00:00, 10285.01it/s]
100%|██████████| 1380/1380 [00:00<00:00, 10458.72it/s]
100%|██████████| 2998/2998 [00:00<00:00, 11679.41it/s]
Loaded train, val, and test splits.
Train split size: 20


Preprocessing dataset for binary classification objective.


Tokenizing dataset using bert-base-uncased tokenizer.


... Not removing facts and rationales.
... Not removing facts and rationales.
... Not removing facts and rationales.


In [15]:
# Load model (reuse num_labels from the variables cell instead of a hard-coded 1)
model = load_model(model_path, False, True, base_model, num_labels, max_paragraphs, max_paragraph_len)

Initializing hierarchical aLEXa model with input shape [-1, 48, 224].


In [12]:
dataset['train']['attention_label_mask'].sum()

tensor(0.)

In [14]:
dataset['train']['attention_labels'].sum()

tensor(0.)

In [44]:
predict_dataset = dataset.remove_columns(['facts', 'rationale'])

In [45]:
val_set = predict_dataset['val']

In [46]:
from model import predict

In [47]:
output = predict(model, val_set, False, True)

In [48]:
attn_labels, attn_mask, class_logits, attn_logits, attn_factor, class_factor = output.predictions
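
The unpacked `class_logits` are raw model outputs. As a minimal sketch, assuming they hold a single unnormalized logit per case (suggested by `num_labels = 1` and the binary objective, but not confirmed here), they could be turned into probabilities and hard predictions as follows. The `example_logits` values are hypothetical and are not taken from the model.

```python
import numpy as np

def sigmoid(x):
    """Map raw logits to probabilities in (0, 1)."""
    x = np.asarray(x, dtype=np.float64)
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical logits standing in for class_logits (assumption: one logit per case)
example_logits = np.array([-2.0, 0.0, 3.0])

probs = sigmoid(example_logits)
preds = (probs >= 0.5).astype(int)  # threshold at 0.5 for a binary decision
print(probs, preds)
```

With the real outputs, `example_logits` would be replaced by `class_logits.reshape(-1)`, provided the shape assumption above holds.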

In [41]:
facts = ["4. The applicants are spouses. They were born in 1949 and 1965 respectively and live in Vienna, Austria.", "5. On 13 April 2005 the applicants brought an action seeking dissolution of joint ownership of a real estate before the Dunajsk\u00e1 Streda District Court (file no. 9C 70/2005).", "6. On 6 September 2006, at its fifth hearing, the District Court delivered a judgment. The defendant appealed. The applicants requested the District Court to give a supplementary judgment. On 9 November 2006 the case file was submitted to the Trnava Regional Court.", "7. On 20 March 2007 the Regional Court returned the case file to the first-instance court as incomplete. On 11 September 2007 the District Court gave a supplementary judgment and on 11 January 2008 the case file was again submitted to the Regional Court.", "8. In 2008 the Regional Court stayed the proceedings for two months pending the outcome of inheritance proceedings after the defendant had died.", "9. On 31 March 2009 the Regional Court quashed the first-instance judgment and remitted the case to the District Court for a new determination.", "10. On 20 August 2010 the applicants complained before the Constitutional Court about the length of the proceedings before the District Court.", "11. On 4 October 2010 the District Court approved the friendly settlement of the case reached between the parties. This decision became final on 30 October 2010.", "12. On 24 November 2010 the Constitutional Court declared the applicants\u2019 complaint inadmissible as being manifestly ill-founded (case no. I. \u00daS 455/2010). It held that there had been no significant delays in the proceedings before the District Court in breach of Article 6 \u00a7 1 of the Convention and its constitutional equivalent."]

In [42]:
facts

['4. The applicants are spouses. They were born in 1949 and 1965 respectively and live in Vienna, Austria.',
 '5. On 13 April 2005 the applicants brought an action seeking dissolution of joint ownership of a real estate before the Dunajská Streda District Court (file no. 9C 70/2005).',
 '6. On 6 September 2006, at its fifth hearing, the District Court delivered a judgment. The defendant appealed. The applicants requested the District Court to give a supplementary judgment. On 9 November 2006 the case file was submitted to the Trnava Regional Court.',
 '7. On 20 March 2007 the Regional Court returned the case file to the first-instance court as incomplete. On 11 September 2007 the District Court gave a supplementary judgment and on 11 January 2008 the case file was again submitted to the Regional Court.',
 '8. In 2008 the Regional Court stayed the proceedings for two months pending the outcome of inheritance proceedings after the defendant had died.',
 '9. On 31 March 2009 the Regional 

In [61]:
num_facts = len(facts)
num_facts

9

In [53]:
from sklearn.preprocessing import MinMaxScaler

In [55]:
scaler = MinMaxScaler()

In [62]:
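# Negate the paragraph attention logits of example 2, keep only its first
# num_facts paragraphs, and min-max scale them to the range [0, 1]
scaled_logits = scaler.fit_transform(-attn_logits[2][:num_facts].reshape(-1, 1)).flatten()
scaled_logits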

array([0.22236133, 0.11718947, 0.03363597, 0.        , 0.11040962,
       0.487998  , 0.0706135 , 0.07451683, 1.        ], dtype=float32)
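
For a single column, min-max scaling maps the smallest value to 0 and the largest to 1. The following is a minimal NumPy sketch of that computation, intended only to illustrate what the scaling step does. The helper name `min_max_scale` and the `example` values are hypothetical.

```python
import numpy as np

def min_max_scale(values):
    """Rescale a 1-D array linearly so that min -> 0 and max -> 1."""
    values = np.asarray(values, dtype=np.float64)
    lo, hi = values.min(), values.max()
    if hi == lo:
        # All values identical: return zeros to avoid dividing by zero
        return np.zeros_like(values)
    return (values - lo) / (hi - lo)

# Hypothetical input, not taken from the model
example = np.array([2.0, 4.0, 6.0, 10.0])
scaled = min_max_scale(example)
print(scaled)
```

Because the scaling is relative to the paragraphs of a single case, the resulting weights are comparable within that case but not across cases.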

In [64]:
import html
from IPython.display import display, HTML  # IPython.core.display is deprecated for these names

In [67]:
def html_escape(text):
    return html.escape(text)

In [73]:
cut_off = 0.3  # paragraphs with a scaled weight at or below this value stay unhighlighted

highlighted_text = []
for idx in range(num_facts):
    weight = scaled_logits[idx]
    text = html_escape(facts[idx])  # escape every paragraph, highlighted or not
    if weight > cut_off:
        # Use the scaled weight as the alpha channel of the highlight colour
        highlighted_text.append('<span style="background-color:rgba(231,76,60,' + str(weight) + ');">' + text + '</span>')
    else:
        highlighted_text.append(text)

highlighted_text = '<br>'.join(highlighted_text)
print(highlighted_text)
display(HTML(highlighted_text))

4. The applicants are spouses. They were born in 1949 and 1965 respectively and live in Vienna, Austria.<br>5. On 13 April 2005 the applicants brought an action seeking dissolution of joint ownership of a real estate before the Dunajská Streda District Court (file no. 9C 70/2005).<br>6. On 6 September 2006, at its fifth hearing, the District Court delivered a judgment. The defendant appealed. The applicants requested the District Court to give a supplementary judgment. On 9 November 2006 the case file was submitted to the Trnava Regional Court.<br>7. On 20 March 2007 the Regional Court returned the case file to the first-instance court as incomplete. On 11 September 2007 the District Court gave a supplementary judgment and on 11 January 2008 the case file was again submitted to the Regional Court.<br>8. In 2008 the Regional Court stayed the proceedings for two months pending the outcome of inheritance proceedings after the defendant had died.<br><span style="background-color:rgba(231,7
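
The highlighting loop above can also be wrapped in a reusable function. This is a sketch under stated assumptions rather than part of the project code: the name `highlight_paragraphs` and its parameters are hypothetical, and the weights are assumed to be already scaled to [0, 1].

```python
import html

def highlight_paragraphs(paragraphs, weights, cut_off=0.3, rgb=(231, 76, 60)):
    """Return an HTML string in which paragraphs whose weight exceeds
    cut_off get a background colour with the weight as its opacity."""
    if len(paragraphs) != len(weights):
        raise ValueError("paragraphs and weights must have the same length")
    r, g, b = rgb
    lines = []
    for text, weight in zip(paragraphs, weights):
        safe = html.escape(text)  # escape every paragraph before embedding it
        if weight > cut_off:
            lines.append(
                f'<span style="background-color:rgba({r},{g},{b},{float(weight):.3f});">'
                f'{safe}</span>'
            )
        else:
            lines.append(safe)
    return "<br>".join(lines)

# Hypothetical two-paragraph example
demo = highlight_paragraphs(["A < B", "C & D"], [0.1, 0.9])
print(demo)
```

In the notebook, the result could be rendered with `display(HTML(highlight_paragraphs(facts, scaled_logits)))`.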