<a href="https://colab.research.google.com/github/alexlimatds/circle-2022/blob/main/RRLLJ_Mixup.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### Rhetorical Role Labeling for Legal Judgments - experiments with Mixup data augmentation

In this notebook we utilize Sentence BERT (SBERT) features to represent the sentences. We also apply the Mixup data augmentation method.
We use the SentenceTransformer library for SBERT implementation.

### Installing dependencies

In [1]:
pip install -U sentence-transformers

Collecting sentence-transformers
  Downloading sentence-transformers-2.2.0.tar.gz (79 kB)
[K     |████████████████████████████████| 79 kB 3.2 MB/s 
[?25hCollecting transformers<5.0.0,>=4.6.0
  Downloading transformers-4.17.0-py3-none-any.whl (3.8 MB)
[K     |████████████████████████████████| 3.8 MB 32.4 MB/s 
Collecting sentencepiece
  Downloading sentencepiece-0.1.96-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2 MB)
[K     |████████████████████████████████| 1.2 MB 38.7 MB/s 
[?25hCollecting huggingface-hub
  Downloading huggingface_hub-0.4.0-py3-none-any.whl (67 kB)
[K     |████████████████████████████████| 67 kB 3.5 MB/s 
Collecting tokenizers!=0.11.3,>=0.11.1
  Downloading tokenizers-0.11.6-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (6.5 MB)
[K     |████████████████████████████████| 6.5 MB 32.0 MB/s 
[?25hCollecting pyyaml>=5.1
  Downloading PyYAML-6.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64

### Loading dataset

In [2]:
from google.colab import drive
drive.mount('/content/gdrive', force_remount=True)
g_drive_dir = "/content/gdrive/MyDrive/"

Mounted at /content/gdrive


In [3]:
!mkdir data
!mkdir data/train
!tar -xf {g_drive_dir}AILA_2021/AILA_2021_train.tar.xz -C data/train

train_dir = 'data/train/'

In [4]:
import pandas as pd
import numpy as np
from os import listdir

def read_docs(dir_name):
  docs_ = {} # key: file name, value: dataframe with sentences and labels
  labels_ = set()
  for f in listdir(dir_name):
    df = pd.read_csv(
        dir_name + f, 
        sep='\t', 
        names=['sentence', 'label'])
    docs_[f] = df
    labels_.update(df['label'].to_list())
  return docs_, labels_

docs_train, labels_train = read_docs(train_dir)
n_classes = len(labels_train)
print(f'TRAIN: {len(docs_train)} documents read.')
print(f'Number of classes: {n_classes}')

TRAIN: 60 documents read.
Number of classes: 7


### Mixup data
The augmented data was generated in other notebook. Here, we just load them. 

The augmented data was generated with different alpha values. So, there are more than one set of augmented data.

In [5]:
mixup_dir = g_drive_dir + 'RRLLJ/'

In [6]:
from os import listdir

mixup_features = {} # key: alpha value, value: feature vectors (numpy matrix)
mixup_targets = {}  # key: alpha value, value: target vectors (numpy matrix)
for f in listdir(mixup_dir):
  if f.endswith('.npy'):
    tensor_ = np.load(mixup_dir + f)
    sep_idx_ = f.rindex('_')
    alpha_ = f[17:sep_idx_].replace('_', '.')
    suffix_ = f[sep_idx_+1:f.rindex('.')]
    if suffix_ == 'targets':
      mixup_targets[alpha_] = tensor_
    elif suffix_ == 'features':
      mixup_features[alpha_] = tensor_
    else:
      print('WARNING: unknow file suffix:', suffix_)
    print(f'Alpha value: {alpha_} ({suffix_})')

Alpha value: 1.0 (features)
Alpha value: 1.0 (targets)
Alpha value: 0.7 (targets)
Alpha value: 0.7 (features)
Alpha value: 0.3 (targets)
Alpha value: 0.1 (targets)
Alpha value: 0.3 (features)
Alpha value: 0.1 (features)


### Label encoder

The labels were encoded as one-hot vectors during the generation of the Mixup data. Let's load the mapping between the labels and the one-hot vectors.

In [7]:
class LabelEncoder:

  def __init__(self):
    with open(f'{mixup_dir}labels.txt', 'r') as file:
      lines = file.readlines()
      self.labels = [None] * len(lines)
      self.vectors = [None] * len(lines)
      for l in lines:
        tokens = l.split(':')
        label_ = tokens[0]
        vector_ = np.fromstring(tokens[1].strip()[1:-1], sep=' ', dtype=np.float_)
        idx = vector_.argmax()
        self.labels[idx] = label_
        self.vectors[idx] = vector_
    self.classes_ = self.labels
  
  def encode_single(self, label_):
    idx = self.labels.index(label_) # throws exception if label_ isn't in the list
    return self.vectors[idx]
  
  def decode_single(self, vector_):
    #idx = self.vectors.index(vector_.all()) # throws exception if vector_ isn't in the list
    idx = vector_.argmax()
    return self.labels[idx]

  def encode(self, labels_):
    return [self.encode_single(l) for l in labels_]
  
  def decode(self, vectors_):
    return [self.decode_single(v) for v in vectors_]

label_encoder = LabelEncoder()

In [8]:
label_encoder.labels

['Argument',
 'Facts',
 'Precedent',
 'Ratio of the decision',
 'Ruling by Lower Court',
 'Ruling by Present Court',
 'Statute']

In [9]:
docs_train_targets = {} # key: file id, value: matrix of one-hot encoded labels
for doc_id, df in docs_train.items():
  docs_train_targets[doc_id] = label_encoder.encode(df['label'].tolist())


### SBERT features

In [10]:
from sentence_transformers import SentenceTransformer

sent_encoder = SentenceTransformer('sentence-transformers/LaBSE')

Downloading:   0%|          | 0.00/391 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/1.62k [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/804 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/122 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/461 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/1.88G [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/53.0 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/112 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/9.62M [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/411 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/5.22M [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/190 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/114 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/2.36M [00:00<?, ?B/s]

In [11]:
n_features = sent_encoder.get_sentence_embedding_dimension()
print(f'Features dimension: {n_features}')

Features dimension: 768


In [12]:
%%time
docs_train_embedding = {} # key: file id, value: numpy matrix of features
for doc_id, df in docs_train.items():
  docs_train_embedding[doc_id] = sent_encoder.encode(df['sentence'].tolist())


CPU times: user 1min 24s, sys: 1.68 s, total: 1min 25s
Wall time: 1min 35s


### Evaluation functions

In [25]:
import numpy as np
import sklearn
from sklearn.model_selection import KFold
from sklearn.metrics import precision_recall_fscore_support
from IPython.display import display, HTML

def docs_to_sentences(docs_idx, doc_keys_list):
  features_ = None
  targets_ = None
  for idx in docs_idx:
    doc_id = doc_keys_list[idx]
    if features_ is None:
      features_ = docs_train_embedding[doc_id]
      targets_ = docs_train_targets[doc_id]
    else:
      features_ = np.vstack((features_, docs_train_embedding[doc_id]))
      targets_ = np.vstack((targets_, docs_train_targets[doc_id]))
  return features_, targets_

def metrics_report(title, averages, stds):
  report_df = pd.DataFrame(columns=['Score', 'Standard Deviation'])
  report_df.loc['Precision'] = [f'{averages[0]:.4f}', f'{stds[0]:.4f}']
  report_df.loc['Recall'] = [f'{averages[1]:.4f}', f'{stds[1]:.4f}']
  report_df.loc['F1'] = [f'{averages[2]:.4f}', f'{stds[2]:.4f}']
  display(HTML(f'<br><span style="font-weight: bold">{title}: cross-validation macro averages</span>'))
  display(report_df)

def classification_report(metrics):
  report_df = pd.DataFrame(columns=['Precision', 'Recall', 'F1'])
  for i, l in enumerate(label_encoder.classes_):
    report_df.loc[l] = [
      f'{metrics[i, 0]:.4f}', 
      f'{metrics[i, 1]:.4f}', 
      f'{metrics[i, 2]:.4f}', 
    ]
  display(HTML(f'<br><span style="font-weight: bold">Classification Report (cross-validation test averages)</span>'))
  display(report_df)

test_metrics = {}

def cross_validation(trainer, augmented_features, augmented_targets, alpha_str):
  # Cross validation: for train, it uses augmented data + folder data. For test, it uses folder data only
  train_metrics_cross = []
  test_metrics_cross = []
  test_metrics_by_class = np.zeros((n_classes, 3)) # 3 metrics (P, R, F1)
  n_folds = 5
  skf = KFold(n_splits=n_folds) # for cross-validation
  docs_list = list(docs_train.keys())
  for train_docs_idx, test_docs_idx in skf.split(docs_list): # The cross-validation splitting is document-oriented
    # train: includes augmented data
    train_features_fold, train_targets_fold = docs_to_sentences(train_docs_idx, docs_list)
    model = trainer(
        np.vstack((train_features_fold, augmented_features)), 
        np.vstack((train_targets_fold, augmented_targets)))
    # test: no augmented data
    test_features_fold, test_targets_fold = docs_to_sentences(test_docs_idx, docs_list)
    predictions = model.predict(test_features_fold)
    test_labels_fold = label_encoder.decode(test_targets_fold)
    # averaged test metrics
    p_test, r_test, f1_test, _ = precision_recall_fscore_support(
        test_labels_fold, 
        predictions, 
        average='macro', 
        zero_division=0)
    test_metrics_cross.append([p_test, r_test, f1_test])
    # test metrics by class
    metrics = precision_recall_fscore_support(
        test_labels_fold, 
        predictions, 
        average=None, 
        labels=label_encoder.labels, 
        zero_division=0)
    test_metrics_by_class = test_metrics_by_class + np.hstack((
        metrics[0].reshape(-1, 1),  # precision
        metrics[1].reshape(-1, 1),  # recall
        metrics[1].reshape(-1, 1))) # F1
    # train metrics
    predictions = model.predict(train_features_fold)
    p_train, r_train, f1_train, _ = precision_recall_fscore_support(
        label_encoder.decode(train_targets_fold), 
        predictions, 
        average='macro', 
        zero_division=0)
    train_metrics_cross.append([p_train, r_train, f1_train])
  
  #print(f'**** RESULTS (alpha = {alpha_str}) ****')
  display(HTML(f'<br><span style="font-weight: bold">**** RESULTS (alpha={alpha_str}) ****</span>'))

  train_metrics_cross = np.array(train_metrics_cross)
  train_mean = np.mean(train_metrics_cross, axis=0)
  train_std = np.std(train_metrics_cross, axis=0)
  metrics_report('TRAIN', train_mean, train_std)

  test_metrics_cross = np.array(test_metrics_cross)
  test_mean = np.mean(test_metrics_cross, axis=0)
  test_std = np.std(test_metrics_cross, axis=0)
  metrics_report('TEST', test_mean, test_std)

  test_metrics_by_class /= n_classes
  classification_report(test_metrics_by_class)

  model_metrics = test_metrics.get(model.__class__.__name__, [])
  model_metrics.append((alpha_str, test_mean))
  test_metrics[model.__class__.__name__] = model_metrics

### PyTorch models

In [14]:
import torch

pu_device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

In [15]:
from torch.utils.data import Dataset

class MyDataset(Dataset):
  def __init__(self, inputs, targets, device):
    self.X = torch.from_numpy(inputs).float().to(device)
    self.y = torch.from_numpy(targets).float().to(device)

  def __len__(self):
    return len(self.X)

  def __getitem__(self, idx):
    return [self.X[idx], self.y[idx]]


In [16]:
from torch.optim import Adam
from torch.utils.data import DataLoader
from sklearn.model_selection import ShuffleSplit

torch.manual_seed(1)

class MLPTrainer:

  def __init__(self, model, device, l2_penalty=0.0001):
    self.model = model
    setattr(self.model.__class__, 'predict', self.predict)
    self.device = device
    self.model.to(device)
    # The training replicates the default configuration from scikit-learn's MLPClassifier
    self.criterion = torch.nn.CrossEntropyLoss().to(device)
    self.lambd = l2_penalty # weight decay for Adam optmizer
    self.n_epochs = 200

  def fit(self, inputs, targets, verbose=False):
    # early stopping params and variables
    tol = 0.001
    n_iter_no_change = 7
    early_stop_count = 0
    best_loss_validation = float("inf")
    # splitting train data into train and validation sets in order to perform early stopping
    spl = ShuffleSplit(n_splits=1, train_size=0.9, random_state=1)
    for train_index, val_index in spl.split(inputs):
      # getting datasets
      train_x = inputs[train_index]
      train_y = targets[train_index]
      validation_x = inputs[val_index]
      validation_y = targets[val_index]
      train_dl = DataLoader(
        MyDataset(train_x, train_y, self.device), 
        batch_size=64)
      validation_dl = DataLoader(
        MyDataset(validation_x, validation_y, self.device), 
        batch_size=len(validation_x))
      # training
      self.model.train()
      optimizer = Adam(
        self.model.parameters(), 
        weight_decay=self.lambd)
      for epoch in range(self.n_epochs):
        # iterate mini batches
        for x, y in train_dl:
          optimizer.zero_grad()
          yhat = self.model(x)
          loss = self.criterion(yhat, y)
          loss.backward()
          optimizer.step()
        # Early stopping
        for x, y in validation_dl:
          loss_validation = self.criterion(self.model(x), y)
        if loss_validation >= best_loss_validation - tol:
          early_stop_count += 1
        else:
          early_stop_count = 0
          best_loss_validation = loss_validation
        if early_stop_count == n_iter_no_change:
          break
    if verbose:
      print(f'TRAIN: Stopped at epoch {epoch + 1} {"(MAX EPOCH)" if epoch + 1 == self.n_epochs else ""}')

    self.model.eval()
    return self.model

  def predict(self, X):
    y = self.model.forward(torch.from_numpy(X).float().to(self.device))
    return label_encoder.decode(y.detach().to('cpu').numpy())

#### TorchMLP

In [20]:
import torch.nn
from torch.nn.init import xavier_uniform_
from torch.nn.init import kaiming_uniform_

class TorchMLP(torch.nn.Module):
  def __init__(self, n_inputs, n_classes):
    super(TorchMLP, self).__init__()
    # hidden layer
    n_hidden_units = 100
    hidden1 = torch.nn.Linear(n_inputs, n_hidden_units)
    kaiming_uniform_(hidden1.weight, nonlinearity='relu')
    relu = torch.nn.ReLU()
    # output layer
    output = torch.nn.Linear(n_hidden_units, n_classes)
    xavier_uniform_(output.weight)
    # There's no need of softmax function because it's included in the CrossEntropyLoss function
    self.layers = torch.nn.Sequential(
      hidden1, 
      relu, 
      output)
  
  def forward(self, X):
    return self.layers(X)
  

In [30]:
def torch_mlp_trainer(X, y):
  trainer = MLPTrainer(
      TorchMLP(n_features, n_classes), 
      pu_device, 
      l2_penalty=0.0015)
  return trainer.fit(X, y, verbose=True)

In [31]:
%%time
for alpha_, features_ in mixup_features.items():
  cross_validation(
      torch_mlp_trainer, 
      features_, 
      mixup_targets[alpha_], 
      alpha_)

TRAIN: Stopped at epoch 111 
TRAIN: Stopped at epoch 101 
TRAIN: Stopped at epoch 136 
TRAIN: Stopped at epoch 102 
TRAIN: Stopped at epoch 117 


Unnamed: 0,Score,Standard Deviation
Precision,0.6628,0.0112
Recall,0.4765,0.0191
F1,0.5082,0.0162


Unnamed: 0,Score,Standard Deviation
Precision,0.5422,0.0597
Recall,0.3967,0.0391
F1,0.4189,0.0381


Unnamed: 0,Precision,Recall,F1
Argument,0.3459,0.1863,0.1863
Facts,0.379,0.4673,0.4673
Precedent,0.3562,0.2047,0.2047
Ratio of the decision,0.3848,0.4847,0.4847
Ruling by Lower Court,0.25,0.0129,0.0129
Ruling by Present Court,0.6099,0.2677,0.2677
Statute,0.3851,0.3601,0.3601


TRAIN: Stopped at epoch 121 
TRAIN: Stopped at epoch 117 
TRAIN: Stopped at epoch 114 
TRAIN: Stopped at epoch 122 
TRAIN: Stopped at epoch 128 


Unnamed: 0,Score,Standard Deviation
Precision,0.6772,0.0429
Recall,0.4819,0.0168
F1,0.5114,0.0147


Unnamed: 0,Score,Standard Deviation
Precision,0.5247,0.0638
Recall,0.4045,0.045
F1,0.4233,0.0424


Unnamed: 0,Precision,Recall,F1
Argument,0.3656,0.1838,0.1838
Facts,0.3775,0.4889,0.4889
Precedent,0.3555,0.2082,0.2082
Ratio of the decision,0.3948,0.4835,0.4835
Ruling by Lower Court,0.1429,0.0049,0.0049
Ruling by Present Court,0.6036,0.2804,0.2804
Statute,0.3833,0.3728,0.3728


TRAIN: Stopped at epoch 115 
TRAIN: Stopped at epoch 117 
TRAIN: Stopped at epoch 112 
TRAIN: Stopped at epoch 139 
TRAIN: Stopped at epoch 137 


Unnamed: 0,Score,Standard Deviation
Precision,0.6577,0.0163
Recall,0.4875,0.017
F1,0.5168,0.0123


Unnamed: 0,Score,Standard Deviation
Precision,0.5431,0.0861
Recall,0.4081,0.0385
F1,0.429,0.0391


Unnamed: 0,Precision,Recall,F1
Argument,0.3765,0.201,0.201
Facts,0.3833,0.4721,0.4721
Precedent,0.3524,0.2131,0.2131
Ratio of the decision,0.391,0.482,0.482
Ruling by Lower Court,0.2286,0.0124,0.0124
Ruling by Present Court,0.6009,0.2769,0.2769
Statute,0.3827,0.3831,0.3831


TRAIN: Stopped at epoch 88 
TRAIN: Stopped at epoch 111 
TRAIN: Stopped at epoch 113 
TRAIN: Stopped at epoch 101 
TRAIN: Stopped at epoch 135 


Unnamed: 0,Score,Standard Deviation
Precision,0.6548,0.0088
Recall,0.4801,0.0239
F1,0.513,0.0189


Unnamed: 0,Score,Standard Deviation
Precision,0.5177,0.0614
Recall,0.3989,0.0388
F1,0.4216,0.0405


Unnamed: 0,Precision,Recall,F1
Argument,0.3432,0.1935,0.1935
Facts,0.3832,0.4698,0.4698
Precedent,0.3584,0.1913,0.1913
Ratio of the decision,0.3881,0.4939,0.4939
Ruling by Lower Court,0.1071,0.0114,0.0114
Ruling by Present Court,0.6163,0.2689,0.2689
Statute,0.3922,0.3656,0.3656


CPU times: user 13min 28s, sys: 26.7 s, total: 13min 55s
Wall time: 13min 50s


#### TorchMLPMaxPool

In [21]:
import math

class TorchMLPMaxPool(torch.nn.Module):
  def __init__(self, n_inputs, n_classes):
    super(TorchMLPMaxPool, self).__init__()
    # max pool
    window_size = 2
    max_pool = torch.nn.MaxPool1d(window_size, ceil_mode=True)
    n_out_pool = math.ceil((n_inputs - window_size) / window_size + 1)
    # hidden layers
    n_hidden_units = 100
    hidden1 = torch.nn.Linear(n_out_pool, n_hidden_units)
    kaiming_uniform_(hidden1.weight, nonlinearity='relu')
    relu = torch.nn.ReLU()
    # output layer
    output = torch.nn.Linear(n_hidden_units, n_classes)
    xavier_uniform_(output.weight)
    # There's no need of softmax function because it's included in the CrossEntropyLoss function
    self.layers = torch.nn.Sequential(
      max_pool, 
      hidden1, 
      relu, 
      output)
  
  def forward(self, X):
    return self.layers(X)
  

In [32]:
def torch_mlp_maxpool_trainer(X, y):
  trainer = MLPTrainer(
      TorchMLPMaxPool(n_features, n_classes), 
      pu_device, 
      l2_penalty=0.00015)
  return trainer.fit(X, y, verbose=True)

In [33]:
%%time
for alpha_, features_ in mixup_features.items():
  cross_validation(
      torch_mlp_maxpool_trainer, 
      features_, 
      mixup_targets[alpha_], 
      alpha_)

TRAIN: Stopped at epoch 37 
TRAIN: Stopped at epoch 55 
TRAIN: Stopped at epoch 47 
TRAIN: Stopped at epoch 62 
TRAIN: Stopped at epoch 66 


Unnamed: 0,Score,Standard Deviation
Precision,0.7546,0.0424
Recall,0.6243,0.031
F1,0.669,0.0365


Unnamed: 0,Score,Standard Deviation
Precision,0.4738,0.0407
Recall,0.4102,0.0447
F1,0.4236,0.0396


Unnamed: 0,Precision,Recall,F1
Argument,0.281,0.194,0.194
Facts,0.3767,0.4375,0.4375
Precedent,0.308,0.2203,0.2203
Ratio of the decision,0.3838,0.4465,0.4465
Ruling by Lower Court,0.1553,0.0744,0.0744
Ruling by Present Court,0.4819,0.3296,0.3296
Statute,0.3822,0.3489,0.3489


TRAIN: Stopped at epoch 49 
TRAIN: Stopped at epoch 43 
TRAIN: Stopped at epoch 55 
TRAIN: Stopped at epoch 60 
TRAIN: Stopped at epoch 85 


Unnamed: 0,Score,Standard Deviation
Precision,0.7558,0.0431
Recall,0.6339,0.0439
F1,0.6759,0.0457


Unnamed: 0,Score,Standard Deviation
Precision,0.4797,0.0507
Recall,0.4122,0.0617
F1,0.4254,0.0564


Unnamed: 0,Precision,Recall,F1
Argument,0.2975,0.1922,0.1922
Facts,0.3732,0.4458,0.4458
Precedent,0.3173,0.2271,0.2271
Ratio of the decision,0.3869,0.4488,0.4488
Ruling by Lower Court,0.152,0.072,0.072
Ruling by Present Court,0.4857,0.333,0.333
Statute,0.3857,0.3421,0.3421


TRAIN: Stopped at epoch 67 
TRAIN: Stopped at epoch 58 
TRAIN: Stopped at epoch 51 
TRAIN: Stopped at epoch 70 
TRAIN: Stopped at epoch 55 


Unnamed: 0,Score,Standard Deviation
Precision,0.7696,0.0291
Recall,0.6482,0.0277
F1,0.6914,0.0262


Unnamed: 0,Score,Standard Deviation
Precision,0.4787,0.0462
Recall,0.4185,0.057
F1,0.4332,0.0518


Unnamed: 0,Precision,Recall,F1
Argument,0.3061,0.2159,0.2159
Facts,0.3791,0.4369,0.4369
Precedent,0.3098,0.2469,0.2469
Ratio of the decision,0.39,0.4454,0.4454
Ruling by Lower Court,0.1241,0.0715,0.0715
Ruling by Present Court,0.4941,0.3315,0.3315
Statute,0.3903,0.3447,0.3447


TRAIN: Stopped at epoch 45 
TRAIN: Stopped at epoch 41 
TRAIN: Stopped at epoch 60 
TRAIN: Stopped at epoch 65 
TRAIN: Stopped at epoch 55 


Unnamed: 0,Score,Standard Deviation
Precision,0.756,0.0292
Recall,0.6274,0.0377
F1,0.6712,0.0349


Unnamed: 0,Score,Standard Deviation
Precision,0.4763,0.0535
Recall,0.406,0.0513
F1,0.4225,0.0504


Unnamed: 0,Precision,Recall,F1
Argument,0.2993,0.1967,0.1967
Facts,0.3766,0.4305,0.4305
Precedent,0.2992,0.2214,0.2214
Ratio of the decision,0.3813,0.4497,0.4497
Ruling by Lower Court,0.1411,0.0806,0.0806
Ruling by Present Court,0.5051,0.3229,0.3229
Statute,0.3789,0.3281,0.3281


CPU times: user 6min 45s, sys: 12.4 s, total: 6min 57s
Wall time: 6min 55s


#### TorchLogisticRegression

In [17]:
class TorchLogisticRegression(torch.nn.Module):
  def __init__(self, n_inputs, n_classes, device, verbose=False):
    super(TorchLogisticRegression, self).__init__()
    self.verbose = verbose
    self.device = device
    self.layer = torch.nn.Linear(n_inputs, n_classes)
    xavier_uniform_(self.layer.weight)

  def forward(self, X):
    return self.layer(X)
  
  def predict(self, X):
    y = self.forward(torch.from_numpy(X).float().to(self.device))
    return label_encoder.decode(y.detach().to('cpu').numpy())

  def fit(self, X, y):
    # SGD params
    learning_rate = 0.5
    momentum = 0.9
    lambda_param = 0.0001 # L2 regularization
    n_iterations = 1000
    decay_rate = 0.95  # learning rate decay
    # early stopping params and variables
    tol = 0.001
    n_iter_no_change = 5
    early_stop_count = 0
    best_loss = float("inf")
    # loss function and optmizer
    self.train()
    criterion = torch.nn.CrossEntropyLoss().to(self.device)
    optimizer = torch.optim.SGD(
      self.parameters(), 
      lr=learning_rate, 
      momentum=momentum, 
      weight_decay=lambda_param)
    lr_scheduler = torch.optim.lr_scheduler.ExponentialLR(
      optimizer=optimizer, 
      gamma=decay_rate)
    # Data loader
    batch_size = 64
    train_dl = DataLoader(
      MyDataset(X, y, self.device), 
      batch_size=batch_size, 
      shuffle=True)
    # Train loop
    for i in range(1, n_iterations + 1):
      # iterate mini batches
      for x_batch, y_batch in train_dl:
        optimizer.zero_grad()
        y_hat = self(x_batch)
        loss = criterion(y_hat, y_batch)
        loss.backward()
        optimizer.step()
      lr_scheduler.step()
      # early stop
      if loss >= best_loss - tol:
        early_stop_count += 1
      else:
        early_stop_count = 0
        best_loss = loss
      if early_stop_count == n_iter_no_change:
        break
    
    if self.verbose:
      print(f'TRAIN: Stopped at iteration {i} {"(MAX ITERATION)" if i == n_iterations else ""}')
    self.eval()
    return self


In [18]:
def torch_lr_trainer(X, y):
  lr_ = TorchLogisticRegression(n_features, n_classes, pu_device, verbose=True).to(pu_device)
  return lr_.fit(X, y)

In [26]:
%%time
for alpha_, features_ in mixup_features.items():
  cross_validation(
      torch_lr_trainer, 
      features_, 
      mixup_targets[alpha_], 
      alpha_)

TRAIN: Stopped at iteration 9 
TRAIN: Stopped at iteration 9 
TRAIN: Stopped at iteration 7 
TRAIN: Stopped at iteration 12 
TRAIN: Stopped at iteration 7 


Unnamed: 0,Score,Standard Deviation
Precision,0.6038,0.0367
Recall,0.4479,0.0209
F1,0.4684,0.015


Unnamed: 0,Score,Standard Deviation
Precision,0.5047,0.0608
Recall,0.4058,0.0464
F1,0.4193,0.0419


Unnamed: 0,Precision,Recall,F1
Argument,0.2952,0.2175,0.2175
Facts,0.3844,0.4537,0.4537
Precedent,0.354,0.2007,0.2007
Ratio of the decision,0.3927,0.4701,0.4701
Ruling by Lower Court,0.1429,0.0119,0.0119
Ruling by Present Court,0.5889,0.2783,0.2783
Statute,0.3654,0.3967,0.3967


TRAIN: Stopped at iteration 17 
TRAIN: Stopped at iteration 16 
TRAIN: Stopped at iteration 11 
TRAIN: Stopped at iteration 13 
TRAIN: Stopped at iteration 13 


Unnamed: 0,Score,Standard Deviation
Precision,0.6013,0.0196
Recall,0.4438,0.0206
F1,0.4687,0.0144


Unnamed: 0,Score,Standard Deviation
Precision,0.5046,0.0671
Recall,0.3949,0.0337
F1,0.4103,0.0328


Unnamed: 0,Precision,Recall,F1
Argument,0.3206,0.189,0.189
Facts,0.3928,0.43,0.43
Precedent,0.3498,0.1942,0.1942
Ratio of the decision,0.3791,0.4928,0.4928
Ruling by Lower Court,0.119,0.0049,0.0049
Ruling by Present Court,0.5926,0.2706,0.2706
Statute,0.3689,0.3928,0.3928


TRAIN: Stopped at iteration 8 
TRAIN: Stopped at iteration 9 
TRAIN: Stopped at iteration 11 
TRAIN: Stopped at iteration 16 
TRAIN: Stopped at iteration 18 


Unnamed: 0,Score,Standard Deviation
Precision,0.6254,0.0395
Recall,0.4442,0.0157
F1,0.4707,0.0091


Unnamed: 0,Score,Standard Deviation
Precision,0.5107,0.0583
Recall,0.3965,0.0432
F1,0.414,0.0411


Unnamed: 0,Precision,Recall,F1
Argument,0.3425,0.1756,0.1756
Facts,0.3862,0.4486,0.4486
Precedent,0.3199,0.2225,0.2225
Ratio of the decision,0.3852,0.4738,0.4738
Ruling by Lower Court,0.1333,0.0092,0.0092
Ruling by Present Court,0.6148,0.2678,0.2678
Statute,0.3714,0.3846,0.3846


TRAIN: Stopped at iteration 11 
TRAIN: Stopped at iteration 13 
TRAIN: Stopped at iteration 10 
TRAIN: Stopped at iteration 9 
TRAIN: Stopped at iteration 13 


Unnamed: 0,Score,Standard Deviation
Precision,0.592,0.0099
Recall,0.4465,0.0191
F1,0.4743,0.0157


Unnamed: 0,Score,Standard Deviation
Precision,0.5088,0.0674
Recall,0.4005,0.0424
F1,0.4206,0.0389


Unnamed: 0,Precision,Recall,F1
Argument,0.2996,0.2277,0.2277
Facts,0.3901,0.4296,0.4296
Precedent,0.3293,0.2026,0.2026
Ratio of the decision,0.3832,0.4812,0.4812
Ruling by Lower Court,0.1587,0.0196,0.0196
Ruling by Present Court,0.595,0.2754,0.2754
Statute,0.3881,0.3665,0.3665


CPU times: user 1min 5s, sys: 2.7 s, total: 1min 8s
Wall time: 1min 9s


### Summary

In [38]:
from IPython.display import display, update_display

metrics_df = pd.DataFrame(columns=['Model', 'Mixup alpha', 'Precision', 'Recall', 'F1'])
i = 0
for model_name, metrics in test_metrics.items():
  for m in metrics:
    metrics_df.loc[i] = [model_name, m[0], f'{m[1][0]:.4f}', f'{m[1][1]:.4f}', f'{m[1][2]:.4f}']
    i += 1
metrics_display = display(metrics_df, display_id='metrics_table')

Unnamed: 0,Model,Mixup alpha,Precision,Recall,F1
0,TorchLogisticRegression,1.0,0.5047,0.4058,0.4193
1,TorchLogisticRegression,0.7,0.5046,0.3949,0.4103
2,TorchLogisticRegression,0.3,0.5107,0.3965,0.414
3,TorchLogisticRegression,0.1,0.5088,0.4005,0.4206
4,TorchMLP,1.0,0.5422,0.3967,0.4189
5,TorchMLP,0.7,0.5247,0.4045,0.4233
6,TorchMLP,0.3,0.5431,0.4081,0.429
7,TorchMLP,0.1,0.5177,0.3989,0.4216
8,TorchMLPMaxPool,1.0,0.4738,0.4102,0.4236
9,TorchMLPMaxPool,0.7,0.4797,0.4122,0.4254
