# Initialization 

## Install Packages
Before the pipeline can be executed, the required packages need to be installed. Set the parameter `install_packages` to `True` to install them, or to `False` if they are already installed.

In [1]:
install_packages = True #@param ["True", "False"] {type:"raw"}

In [2]:
if install_packages:
  !pip install seqeval
  !pip install transformers

Collecting seqeval
  Downloading https://files.pythonhosted.org/packages/9d/2d/233c79d5b4e5ab1dbf111242299153f3caddddbb691219f363ad55ce783d/seqeval-1.2.2.tar.gz (43kB)
Building wheels for collected packages: seqeval
  Building wheel for seqeval (setup.py) ... done
  Created wheel for seqeval: filename=seqeval-1.2.2-cp37-none-any.whl size=16172 sha256=5eb55fc9367ce2d983783b279d42eb1cc99e4297801ed4f785e2236329bf95b4
  Stored in directory: /root/.cache/pip/wheels/52/df/1b/45d75646c37428f7e626214704a0e35bd3cfc32eda37e59e5f
Successfully built seqeval
Installing collected packages: seqeval
Successfully installed seqeval-1.2.2
Collecting transform

## Import required Packages

In [None]:
#GENERAL UTILITIES
import pandas as pd
import numpy as np
import os
import pickle
from progressbar import ProgressBar


#IMPORTS FOR NEURAL NETWORK APPLICATION 
import transformers
import transformers as ppb
from transformers import BertModel, BertConfig, BertPreTrainedModel, AdamW, BertForTokenClassification
from transformers import get_linear_schedule_with_warmup

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn import CrossEntropyLoss
from tqdm import tqdm, trange

#IMPORTS FOR CREATING BROWN DATASET
import spacy
import nltk

#IMPORT FOR CREATING RUSSIAN FAIRYTALES DATASET
import xml.dom.minidom
import xml.etree.ElementTree as ET

#IMPORTS FOR PREPROCESSING THE DATA
from keras.preprocessing.sequence import pad_sequences
from sklearn.model_selection import train_test_split
from torch.utils.data import TensorDataset, DataLoader, RandomSampler, SequentialSampler

#IMPORTS FOR EVALUATION OF PREDICTIONS
from seqeval.metrics import f1_score, accuracy_score
from sklearn.metrics import confusion_matrix

## Mount Google Drive
The notebook needs permission to access Google Drive. Follow the link shown in the cell output and paste the authorization code into the input field.

In [None]:
# mount the drive
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


## Set the Working Directory
This sets the parameter `working_dir`, the general working directory and root directory for the notebook.

In [None]:
working_dir = "/content/drive/path_to_unzipped_repository" #@param {type:"string"}

# Semantic Role Labeling

##  Setting Global Variables
To make it easier to study each model, we set a few global parameters at the beginning of the notebook.
The first parameter is `model_type`. It selects one of five neural network configurations:
  1. `both_model_simple`: a single linear layer on top of BERT contextualized embeddings, trained on both the fairy tale and the Brown dataset
  2. `fairy_model_simple`: a single linear layer on top of BERT contextualized embeddings, trained on the fairy tale dataset
  3. `fairy_model_complex`: an additional hidden linear layer before the classification layer on top of BERT contextualized embeddings, trained on the fairy tale dataset
  4. `fairy_model_withAnim`: a linear layer on top of BERT contextualized embeddings combined with animacy indicators, trained on the fairy tale dataset
  5. `fairy_model_withPred`: a linear layer on top of BERT contextualized embeddings combined with predicate indicators, trained on the fairy tale dataset







In [None]:
model_type = "fairy_model_withAnim" #@param ["both_model_simple", "fairy_model_complex", "fairy_model_simple", "fairy_model_withAnim", "fairy_model_withPred"]


Next is the parameter `fast_loading`. It defaults to `True` and enables loading the preprocessed dictionary files of the fairy tale and Brown datasets. These can be found on the GitHub page under /input/data_dicts/

In [None]:
fast_loading = True #@param ["True", "False"] {type:"raw"}

## Loading Datasets - Fast Loading
Fast loading can be used if the preprocessed files are already available (default). The files store the sentence dictionaries in pickle format. We specify two input paths for the Brown and fairy tale datasets: `input_dir_brown` and `input_dir_fairytale` (with the default setup the user does not need to change anything). Both datasets are loaded into the environment. Depending on `model_type`, either both datasets or only the fairy tale dataset are collected in one list called `data_dict`. Each entry is one sentence with all the given annotations.
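
As a minimal sketch of what fast loading does, the snippet below writes two tiny pickle files and merges them according to `model_type`. The helper `load_sentences`, the temporary file names and the toy sentences are assumptions made for illustration; only the keys `tokens` and `apred1` are taken from the notebook code.

```python
import os
import pickle
import tempfile

def load_sentences(path_brown, path_fairytale, model_type):
    """Hypothetical helper: load both pickles and merge them by model type."""
    with open(path_brown, "rb") as handle:
        brown = pickle.load(handle)
    with open(path_fairytale, "rb") as handle:
        fairytale = pickle.load(handle)
    # Only "both_model_simple" is trained on both datasets.
    if model_type == "both_model_simple":
        return brown + fairytale
    return list(fairytale)

# Toy data standing in for the real preprocessed files (made up for this sketch).
toy_brown = [{"tokens": ["He", "ran"], "apred1": ["A0", "O"]}]
toy_fairy = [{"tokens": ["Ivan", "slept"], "apred1": ["A0", "O"]}]

with tempfile.TemporaryDirectory() as tmp:
    p_brown = os.path.join(tmp, "brown.pickle")
    p_fairy = os.path.join(tmp, "fairytale.pickle")
    for path, data in ((p_brown, toy_brown), (p_fairy, toy_fairy)):
        with open(path, "wb") as handle:
            pickle.dump(data, handle)
    both = load_sentences(p_brown, p_fairy, "both_model_simple")
    fairy_only = load_sentences(p_brown, p_fairy, "fairy_model_simple")

print(len(both), len(fairy_only))
```

Keeping the merge logic in one function avoids repeating it for the fast and the non-fast loading branch.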



In [None]:
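# os.path.join inserts the path separator that plain string concatenation would omit
input_dir_brown     = os.path.join(working_dir, 'data/srl_detection/input/data_dict_brown.pickle')
input_dir_fairytale = os.path.join(working_dir, 'data/srl_detection/input/data_dict_fairytaile.pickle')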

In [None]:
# Opens the dictionary files and saves them into data_dict

if fast_loading:

  with open(input_dir_brown, 'rb') as handle:
      data_dict_brown = pickle.load(handle)
  with open(input_dir_fairytale, 'rb') as handle:
      data_dict_fairytale = pickle.load(handle)

  data_dict = []

  # Check for global variable and set data to specific model_type
  if model_type == "both_model_simple":
    data_dict.extend(data_dict_brown)
    data_dict.extend(data_dict_fairytale)
  else:
    data_dict.extend(data_dict_fairytale)


  compute_animacy = False
  if model_type == "fairy_model_withAnim":
    compute_animacy = True

else:
  print("Fast loading set to false, please provide the paths to the processed files otherwise default is used")
  try:
    with open(input_dir_brown, 'rb') as handle:
        data_dict_brown = pickle.load(handle)
    with open(input_dir_fairytale, 'rb') as handle:
        data_dict_fairytale = pickle.load(handle)

    data_dict = []

    # Check for global variable and set data to specific model_type
    if model_type == "both_model_simple":
      data_dict.extend(data_dict_brown)
      data_dict.extend(data_dict_fairytale)
    else:
      data_dict.extend(data_dict_fairytale)


    compute_animacy = False
    if model_type == "fairy_model_withAnim":
      compute_animacy = True
  except IOError:
    print("File not accessible")

Number of sentences within the dataset:

In [None]:
len(data_dict)

3812

##  Processing - Create Tensors and Label Set
This is the final processing step to convert the individual sentences into a tensor data structure.

The following steps are used:

1. Creation of target class label dictionaries for predicate sense disambiguation and semantic arguments.
2. Tokenization of target sentences by BERT from the `transformers` package.
3. Padding of target tensors, creation of predicate indicators, animacy indicators and verb position indicators.
4. Combining all individual tensors into one dataset, on which a train/test split is performed.



### Processing - Step 1
In this step we sweep over the given `data_dict`. For each sentence, three additional annotations are introduced:
* `"O"` signifying words with missing annotations 
* `"X"` indicator for subtokens of words produced by BERT
* `"PAD"` indicator for padded words of a sentence

Furthermore, the semantic role annotations of each sentence are converted to the BIO tagging scheme. All target class labels are saved in two pairs of dictionaries that map each label to its index and vice versa:

* `tag2idx` , `idx2tag` are the predicate annotations
* `bio2idx` , `idx2bio` are the semantic class labels
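
The conversion and the label dictionaries can be sketched on a toy sentence as follows. The function names `to_bio` and `build_label_dicts` and the role tags are assumptions for illustration; the special handling of `[CLS]` and `[SEP]` in the notebook code is left out here.

```python
def to_bio(role_tags):
    """Convert per-word role tags to BIO tags; "O" stays unchanged."""
    bio = []
    prev = None
    for tag in role_tags:
        if tag == "O":
            bio.append(tag)
        elif tag == prev:
            bio.append("I-" + tag)   # continues the span of the previous word
        else:
            bio.append("B-" + tag)   # starts a new span
        prev = tag
    return bio

def build_label_dicts(labels):
    """Index "O" first, then the sorted labels, then "X" and "PAD"."""
    values = ["O"] + sorted(set(labels) - {"O"}) + ["X", "PAD"]
    label2idx = {label: i for i, label in enumerate(values)}
    idx2label = {i: label for label, i in label2idx.items()}
    return label2idx, idx2label

toy_roles = ["A0", "A0", "O", "A1", "A1", "A1"]   # made-up role tags
toy_bio = to_bio(toy_roles)
toy_bio2idx, toy_idx2bio = build_label_dicts(toy_bio)

print(toy_bio)
print(toy_bio2idx)
```

Placing `"O"` at index 0 and `"X"` and `"PAD"` at the end keeps the helper labels easy to locate in the index range.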


In [None]:
#Helper functions to perform step 1. Annotations of "O","X" and "PAD".

def create_set_list(data_dict,list_identifier):
  only_preds = []
  for index,value in enumerate(data_dict):
    pred_list1 = value[list_identifier]
    only_pred = list(filter(lambda a: a != "O", pred_list1))
    only_preds.append(only_pred)
  only_preds = [item for sublist in only_preds for item in sublist]
  frame_labels = list(set(only_preds))
  return frame_labels

def create_dicts(label_list):
  tag_values = ["O"]
  tag_values.extend(list(sorted(label_list)))
  tag_values.append("X")
  tag_values.append("PAD")
  tag2idx = {t: i for i, t in enumerate(tag_values)}
  candidate_labels_ids = tag2idx
  idx2tag =  dict((v,k) for k,v in candidate_labels_ids.items())
  return tag2idx,idx2tag

In [None]:
#Sweeping over each sentence to convert semantic annotations according to the bio
#tagging scheme criteria.
for i in range(0,len(data_dict)):
  first_token = data_dict[i]["apred1"][0]
  if first_token == "O":
    pass
  else:
    first_token = "B-" + first_token

  bio_tagged_scheme = [first_token]
  for index,current_val in enumerate(data_dict[i]["apred1"][1:]):
    index    = index + 1
    prev_val =  data_dict[i]["apred1"][(index-1)]
    if current_val == "O" or current_val == "[CLS]" or current_val == "[SEP]":
      pass
    elif prev_val == current_val:
      current_val = "I-" + current_val
    else:
      current_val = "B-" + current_val
    bio_tagged_scheme.append(current_val)
  data_dict[i]["bio_tagged_list"] = bio_tagged_scheme

In [None]:
#Creating the class label sets for predicates and semantic roles
bio2idx,idx2bio = create_dicts(create_set_list(data_dict,"bio_tagged_list"))
tag_values = list(bio2idx.keys())
tag2idx,idx2tag = create_dicts(create_set_list(data_dict,"sense_list"))

### Processing - Step 2
In this step we load the `BertTokenizer` from the `transformers` package. We use `"bert-base-cased"` because it achieved the highest accuracy scores in our preliminary analysis. The tokenizer splits each word into one or more WordPiece subtokens (this is subword tokenization, not lemmatization). The first subtoken of a word keeps the word's label. Each remaining subtoken is tagged `"X"` for semantic roles and `"O"` for predicate senses.
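
The label alignment can be sketched without downloading BERT. In the snippet below, `toy_tokenize` is a hypothetical stand-in that cuts words into fixed-size pieces; the real tokenizer splits by its WordPiece vocabulary instead. The words and labels are made up.

```python
def toy_tokenize(word, piece_len=4):
    """Stand-in for a WordPiece tokenizer: cut a word into fixed-size pieces."""
    pieces = [word[i:i + piece_len] for i in range(0, len(word), piece_len)]
    return [pieces[0]] + ["##" + p for p in pieces[1:]]

def align_labels(words, labels, filler):
    """Give the first piece the word's label and every other piece `filler`."""
    out_tokens, out_labels = [], []
    for word, label in zip(words, labels):
        pieces = toy_tokenize(word)
        out_tokens.extend(pieces)
        out_labels.extend([label] + [filler] * (len(pieces) - 1))
    return out_tokens, out_labels

toy_words = ["Vasilisa", "went", "home"]
toy_role_tags = ["B-A0", "O", "B-A1"]
toy_tokens, toy_labels = align_labels(toy_words, toy_role_tags, filler="X")

print(toy_tokens)
print(toy_labels)
```

After alignment the token list and the label list always have the same length, which is what the later padding step relies on.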

In [None]:
#Initialize Bert Tokenizer
model_class, tokenizer_class, pretrained_weights = (ppb.BertModel, ppb.BertTokenizer, 'bert-base-cased')
tokenizer = tokenizer_class.from_pretrained(pretrained_weights)

In [None]:
def tokenize_and_preserve_labels(sentence, text_labels, sense):
    tokenized_sentence = []
    labels = []

    for word, label in zip(sentence, text_labels):

        # Tokenize the word and count # of subwords the word is broken into
        tokenized_word = tokenizer.tokenize(word)
        n_subwords = len(tokenized_word)

        # Add the tokenized word to the final tokenized word list
        tokenized_sentence.extend(tokenized_word)

        # The first subword keeps the word's label. The remaining subwords get
        # "O" for predicate senses (sense=True) or "X" for semantic roles.
        if sense:
          labels.extend([label])
          if n_subwords > 1:
            labels.extend(["O"] * (n_subwords-1))
        else:
          labels.extend([label])
          if n_subwords > 1:
            labels.extend(["X"] * (n_subwords-1))

    return tokenized_sentence, labels

In [None]:
# Tokenize for predicate labels
tokenized_texts_and_labels = [
    tokenize_and_preserve_labels(curr_dict["tokens"], curr_dict["sense_list"],True)
    for curr_dict in data_dict
]

# Tokenize for bio labels
tokenized_texts_and_labels_bio = [
    tokenize_and_preserve_labels(curr_dict["tokens"], curr_dict["bio_tagged_list"],False)
    for curr_dict in data_dict
]

In [None]:
# Sentence
tokenized_texts = [token_label_pair[0] for token_label_pair in tokenized_texts_and_labels]

# Predicates
labels          = [token_label_pair[1] for token_label_pair in tokenized_texts_and_labels]

# Semantic Roles
bio_labels      = [token_label_pair[1] for token_label_pair in tokenized_texts_and_labels_bio]

### Processing - Step 3
Next, the longest sentence within the dataset is identified and its length is saved in the variable `MAX_LENGTH`. Each sentence is converted to integer indices and padded to `MAX_LENGTH`; the label sequences are padded with the previously introduced `"PAD"` label. Furthermore, we introduce the predicate embedding indicator (`pred_indicator`), the animacy embedding indicator (`anim_indicator`) and the verb position indicator (`verb_indicator`). Each embedding indicator uses two separate random float vectors of length 10: one marks words that are predicates (or animate), the other marks all remaining words. The verb position indicator is a 0/1 vector that marks the position of the predicate.
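
A minimal sketch of the padding and of the indicator lookup, using plain Python lists instead of `pad_sequences` and NumPy. The helper `pad_post`, the toy token ids and the `toy_` names are assumptions for illustration.

```python
import random

PAD_ID = 0  # assumed padding index for this sketch

def pad_post(seq, max_len, value):
    """Truncate or right-pad a list to exactly max_len entries."""
    return list(seq[:max_len]) + [value] * max(0, max_len - len(seq))

toy_sentences = [[5, 8, 2], [7, 3, 9, 4, 6]]        # toy token ids
toy_pred_flags = [[0, 1, 0], [0, 0, 1, 0, 0]]        # 1 marks the predicate token

toy_max_length = max(len(s) for s in toy_sentences)
toy_input_ids = [pad_post(s, toy_max_length, PAD_ID) for s in toy_sentences]
toy_flags = [pad_post(f, toy_max_length, 0) for f in toy_pred_flags]

# Two fixed random vectors of length 10: one for predicates, one for all other tokens.
rng = random.Random(0)
pred_is = [rng.random() for _ in range(10)]
pred_non = [rng.random() for _ in range(10)]
toy_pred_indicator = [
    [pred_is if flag == 1 else pred_non for flag in row] for row in toy_flags
]

print(toy_input_ids)
print(len(toy_pred_indicator), len(toy_pred_indicator[0]), len(toy_pred_indicator[0][0]))
```

The resulting indicator has one length-10 vector per token, so it can later be concatenated to the token embeddings along the last dimension.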


In [None]:
# Find the longest sentence within the data set and set to MAX_LENGTH
MAX_LENGTH = 0
for tupel in tokenized_texts_and_labels:
  leng = int(len(tupel[0]))
  if leng >= MAX_LENGTH:
    MAX_LENGTH = leng

In [None]:
print( "The longest sentence is of length: " +str(MAX_LENGTH))

The longest sentence is of length: 74


In [None]:
# Convert word tokens to integers and pad to MAX_LENGTH
input_ids = pad_sequences([tokenizer.convert_tokens_to_ids(txt) for txt in tokenized_texts],
                          maxlen=MAX_LENGTH, dtype="long", value=0.0,
                          truncating="post", padding="post")

In [None]:
#Convert word labels to integers and pad to MAX_LENGTH
tt = [[tag2idx.get(l) if None != tag2idx.get(l) else (len(tag2idx)+1)  for l in lab] for lab in labels]
tags = pad_sequences(tt,
                     maxlen=MAX_LENGTH, value=tag2idx["PAD"], padding="post",
                     dtype="long", truncating="post")

In [None]:
# Convert word semantic roles to integers and pad to MAX_LENGTH
tt_bio = [[bio2idx.get(l) for l in lab] for lab in bio_labels]
tags_bio = pad_sequences(tt_bio,
                     maxlen=MAX_LENGTH, value=bio2idx["PAD"], padding="post",
                     dtype="long", truncating="post")

In [None]:
# Find the word index for the predicates of the sentence
verb_indicator = []

# Iterate through each sentence
for curr_tag in tags:
  curr_indicator = np.zeros(MAX_LENGTH)

  # Iterate through single sentence
  for i,single_tags in enumerate(curr_tag):
    # Check for predicate
    if single_tags != tag2idx["PAD"] and single_tags != tag2idx["O"]:
      curr_indicator[i] = 1

  # Append exactly one indicator row per sentence so that verb_indicator
  # stays aligned with input_ids, whatever the number of predicates
  verb_indicator.append(curr_indicator)
verb_indicator = np.array(verb_indicator)

In [None]:
# Find the word index for the predicates of the sentence and generate
# a random vector of length 10. Pred_is indicates a predicate and pred_non
# indicates a non predicate word.

pred_is  = np.random.rand(10)
pred_non = np.random.rand(10)
pred_indicator = []

# Iterate through each sentence
for i,curr_tag in enumerate(tags):
  one_tag = []


  # Iterate through single sentence
  for lables_int in curr_tag:

    # Check for predicate
    if lables_int != tag2idx["O"] and lables_int != tag2idx["PAD"]:
      pred_ind = pred_is
    else:
      pred_ind = pred_non
    one_tag.append(pred_ind)
  pred_indicator.append(one_tag)
pred_indicator      = np.array(pred_indicator)

In [None]:
# Find the word index for the animate words of the sentence and generate
# a random vector of length 10. anim_is indicates a animate word and anim_non
# indicates a non animate word.


if compute_animacy:

  tokenized_texts_and_labels_anim = [
      tokenize_and_preserve_labels(curr_dict["tokens"], curr_dict["animacy"],True)
      for curr_dict in data_dict
  ]


  anim_labels     = [token_label_pair[1] for token_label_pair in tokenized_texts_and_labels_anim]

  #Need to define a sublabeling set just for animacy
  anim2idx = {"O":0,"A":1,"PAD":2}
  candidate_labels_ids = anim2idx
  idx2anim =  dict((v,k) for k,v in candidate_labels_ids.items())

  tt_anim = [[anim2idx.get(l) for l in lab] for lab in anim_labels]

  tags_anim = pad_sequences(tt_anim,
                      maxlen=MAX_LENGTH, value=anim2idx["PAD"], padding="post",
                      dtype="long", truncating="post")

  #Generate the final anim_indicator embeddings
  anim_is = np.random.rand(10)
  anim_non = np.random.rand(10)
  anim_indicator = []

  #Iterate through each sentence 
  for i,curr_tag in enumerate(tags_anim):
    one_tag = []

    #Iterate through single sentence
    for lables_int in curr_tag:

      #Check for animacy
      if lables_int == anim2idx["A"]:
        anim_ind = anim_is
      else:
        anim_ind = anim_non
      one_tag.append(anim_ind)
    anim_indicator.append(one_tag)
  anim_indicator      = np.array(anim_indicator)
else:
  anim_indicator = pred_indicator

### Processing - Step 4
Finally, all arrays are converted to tensors and combined into one final dataset. A $90\%$ to $10\%$ split into training and validation data is performed. Additionally, an attention mask is introduced: a vector with 1 for the actual tokens of a sentence and 0 for padding tokens. The resulting datasets are `train_data` and `valid_data`; they are later wrapped into the data loaders `train_dataloader` and `valid_dataloader` for the neural network.
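
The notebook keeps the arrays aligned by calling `train_test_split` repeatedly with the same `random_state`. As an illustration of the same idea, the sketch below swaps in a different, standard-library-only technique: it shuffles one index list and applies it to every array. The helper `split_together` and the toy data are assumptions.

```python
import math
import random

def split_together(arrays, test_size=0.1, seed=2018):
    """Shuffle one index list and apply it to every array, so rows stay aligned."""
    n = len(arrays[0])
    assert all(len(a) == n for a in arrays), "all arrays need the same length"
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_val = math.ceil(n * test_size)
    val_idx, train_idx = indices[:n_val], indices[n_val:]
    train = [[a[i] for i in train_idx] for a in arrays]
    val = [[a[i] for i in val_idx] for a in arrays]
    return train, val

toy_ids = [[i + 1, i + 2, 0, 0] for i in range(10)]   # 0 is the padding id
toy_masks = [[float(tok != 0) for tok in row] for row in toy_ids]
toy_tags = [[1, 2, 3, 3] for _ in range(10)]

(toy_tr_ids, toy_tr_masks, toy_tr_tags), (toy_val_ids, toy_val_masks, toy_val_tags) = (
    split_together([toy_ids, toy_masks, toy_tags])
)

print(len(toy_tr_ids), len(toy_val_ids))
```

Because every array is indexed with the same shuffled indices, each attention mask still belongs to its own sentence after the split.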

In [None]:
# Create attention mask vector
attention_masks = [[float(i != 0.0) for i in ii] for ii in input_ids]

In [None]:

# Perform the same train test split for each individual data class

tr_inputs, val_inputs, tr_tags, val_tags = train_test_split(input_ids, tags,
                                                            random_state=2018, test_size=0.1)
tr_masks, val_masks, _, _ = train_test_split(attention_masks, input_ids,
                                             random_state=2018, test_size=0.1)
tr_tags_bio, val_tags_bio, _, _ = train_test_split(tags_bio, input_ids,
                                             random_state=2018, test_size=0.1)
tr_verb_indicator, val_verb_indicator, _, _ = train_test_split(verb_indicator, input_ids,
                                             random_state=2018, test_size=0.1)
tr_anim_indicator, val_anim_indicator, _, _ = train_test_split(anim_indicator, input_ids,
                                             random_state=2018, test_size=0.1)
tr_pred_indicator, val_pred_indicator, _, _ = train_test_split(pred_indicator, input_ids,
                                             random_state=2018, test_size=0.1)

In [None]:
# Create the tensor vectors

tr_inputs = torch.tensor(tr_inputs,dtype=torch.long)
val_inputs = torch.tensor(val_inputs,dtype=torch.long)
tr_tags = torch.tensor(tr_tags,dtype=torch.long)
val_tags = torch.tensor(val_tags,dtype=torch.long)
tr_tags_bio = torch.tensor(tr_tags_bio,dtype=torch.long)
val_tags_bio = torch.tensor(val_tags_bio,dtype=torch.long)
tr_masks = torch.tensor(tr_masks,dtype=torch.long)
val_masks = torch.tensor(val_masks,dtype=torch.long)
tr_verb_indicator = torch.tensor(tr_verb_indicator,dtype=torch.long)
val_verb_indicator = torch.tensor(val_verb_indicator,dtype=torch.long)
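# The indicator embeddings are random floats in [0, 1). They must stay floats,
# because casting them to long would truncate every value to 0.
tr_anim_indicator = torch.tensor(tr_anim_indicator,dtype=torch.float)
val_anim_indicator = torch.tensor(val_anim_indicator,dtype=torch.float)
tr_pred_indicator = torch.tensor(tr_pred_indicator,dtype=torch.float)
val_pred_indicator = torch.tensor(val_pred_indicator,dtype=torch.float)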

In [None]:
# Load the tensors into one dataset each and initialize the sampling
train_data = TensorDataset(tr_inputs, tr_masks, tr_tags,tr_tags_bio,tr_verb_indicator,tr_anim_indicator,tr_pred_indicator)
train_sampler = RandomSampler(train_data)

valid_data = TensorDataset(val_inputs, val_masks, val_tags,val_tags_bio,val_verb_indicator,val_anim_indicator,val_pred_indicator)
valid_sampler = SequentialSampler(valid_data)




## Neural Network Application
The general outline of the model follows the paper by Shi, Peng, and Jimmy Lin: *Simple BERT Models for Relation Extraction and Semantic Role Labeling.*

The neural network consists of a BERT contextual embedder in combination with a linear layer for classification. 
BERT converts the target data into numerical contextual embeddings that preserve the semantic relations within the sentence. Each token is converted to a vector of 768 floats. Thus a target dataset converts to a tensor with dimensions $(sentence_{all},sentence_{length},768)$. If we use predicate or animacy indicators, the last dimension increases by $10$, i.e. $768+10 = 778$. The output of our linear classification layer has size $n = len(labelset)$.
The three parameters we deemed most important can be set as global variables:
 1. `BATCH_SIZE`: the number of sentences shown to the neural network at once
 2. `EPOCHS`: the number of full passes over the training dataset
 3. `learning_rate`: the step size of each parameter update
 
In the literature, a `BATCH_SIZE` of $32$ was deemed optimal.
In our analysis we used an `EPOCHS` value of $15$. Smaller values are suitable too, because returns diminish after $8$ epochs. The `learning_rate` is set to $0.00003$.
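
To make the arithmetic concrete, the following sketch computes the classifier input width and the number of optimizer steps for the 3812 sentences reported above. The function names are hypothetical, and the rounding of the 90/10 split (validation size rounded up) as well as keeping the last partial batch are assumptions of this sketch.

```python
import math

HIDDEN_SIZE = 768      # size of one BERT token embedding, as stated above
INDICATOR_SIZE = 10    # length of the predicate/animacy indicator vector

def classifier_input_size(use_indicator):
    """Width of the vector fed into the linear classification layer."""
    return HIDDEN_SIZE + (INDICATOR_SIZE if use_indicator else 0)

def total_training_steps(n_sentences, test_size, batch_size, epochs):
    """Number of optimizer steps, assuming the last partial batch is kept."""
    n_val = math.ceil(n_sentences * test_size)   # assumed rounding of the split
    n_train = n_sentences - n_val
    steps_per_epoch = math.ceil(n_train / batch_size)
    return n_train, steps_per_epoch, steps_per_epoch * epochs

print(classifier_input_size(False), classifier_input_size(True))
print(total_training_steps(3812, 0.1, 32, 15))
```

Under these assumptions the run consists of 108 batches per epoch and 1620 parameter updates in total.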

In [None]:
BATCH_SIZE = 32 #@param {type:"integer"}
EPOCHS     = 15 #@param {type:"integer"}
learning_rate = 3e-5   #@param {type:"number"}

In [None]:
# Initializes the data loader at the specified batch size
train_dataloader = DataLoader(train_data,sampler=train_sampler, batch_size=BATCH_SIZE)
valid_dataloader = DataLoader(valid_data,sampler=valid_sampler, batch_size=BATCH_SIZE)

### Base Class Override
We use the base class `BertForTokenClassification` from the `transformers` package as a template. It includes the BERT layer and a simple linear layer stacked on top of the embeddings. Starting from its implementation, we make small modifications to fit our use case: the `BertForTokenClassificationComplex` class adds an additional hidden linear layer, and the `BertForTokenClassificationInd` class adds the predicate/animacy indicators.

In [None]:
class BertForTokenClassificationInd(BertPreTrainedModel):

    _keys_to_ignore_on_load_unexpected = [r"pooler"]

    def __init__(self, config):
        super().__init__(config)

        # Number of unique target labels
        self.num_labels = config.num_labels

        # Init bert and linear layers with dropout and hidden layer sizes
        self.bert       = BertModel(config, add_pooling_layer=False)
        self.dropout    = nn.Dropout(config.hidden_dropout_prob)
        self.classifier = nn.Linear(config.hidden_size + 10, config.num_labels)

        self.init_weights()

    def forward(
        self,

        #Input sentence
        input_ids=None,

        #Input attenmask
        attention_mask=None,

        #Whats the second sentence, in our case the predicate
        token_type_ids=None,

        #Not relevant
        position_ids=None,
        head_mask=None,
        inputs_embeds=None,

        #Target labels
        labels=None,
        output_attentions=None,
        output_hidden_states=None,
        return_dict=None,

        #Target indicators
        embedder_indicator=None,
    ):
        r"""
        labels (:obj:`torch.LongTensor` of shape :obj:`(batch_size, sequence_length)`, `optional`):
            Labels for computing the token classification loss. Indices should be in ``[0, ..., config.num_labels -
            1]``.
        """
        return_dict = return_dict if return_dict is not None else self.config.use_return_dict

        outputs = self.bert(
            input_ids,
            attention_mask=attention_mask,
            token_type_ids=token_type_ids,
            position_ids=position_ids,
            head_mask=head_mask,
            inputs_embeds=inputs_embeds,
            output_attentions=output_attentions,
            output_hidden_states=output_hidden_states,
            return_dict=return_dict,
        )


        # Bert outputs 
        sequence_output = outputs[0]

        # Concatenate the indicators to the output
        sequence_output = torch.cat((sequence_output,embedder_indicator),dim=2)
        sequence_output = self.dropout(sequence_output)

        # Forward pass through the network to compute logits
        batch_size,sequence_length,embedding_dim = sequence_output.size()
        logits                 = self.classifier(sequence_output)


        # If labels available compute loss for gradient computation 
        loss = None
        if labels is not None:
            loss_fct = CrossEntropyLoss()
            # Only keep active parts of the loss
            if attention_mask is not None:
                active_loss = attention_mask.view(-1) == 1
                active_logits = logits.view(-1, self.num_labels)
                active_labels = torch.where(
                    active_loss, labels.view(-1), torch.tensor(loss_fct.ignore_index).type_as(labels)
                )
                loss = loss_fct(active_logits, active_labels)
            else:
                loss = loss_fct(logits.view(-1, self.num_labels), labels.view(-1))

        if not return_dict:
            output = (logits,) + outputs[2:]
            return ((loss,) + output) if loss is not None else output

        return loss,logits


class BertForTokenClassificationComplex(BertPreTrainedModel):

    _keys_to_ignore_on_load_unexpected = [r"pooler"]

    def __init__(self, config):
        super().__init__(config)
        self.num_labels = config.num_labels


        # Init of all bert and multiple layers with hidden size 300
        self.bert       = BertModel(config, add_pooling_layer=False)
        self.dropout    = nn.Dropout(config.hidden_dropout_prob)
        self.dropout_hidden   = nn.Dropout(0.2)
        self.hidden_layer = nn.Linear(config.hidden_size, 300)
        self.classifier   = nn.Linear(300 , config.num_labels)

        self.init_weights()

    def forward(
        self,
        #Input sentence
        input_ids=None,

        #Input attenmask
        attention_mask=None,

        #Whats the second sentence, in our case the predicate
        token_type_ids=None,

        #Not relevant
        position_ids=None,
        head_mask=None,
        inputs_embeds=None,

        #Target labels
        labels=None,
        output_attentions=None,
        output_hidden_states=None,
        return_dict=None,

        #Target indicators
        embedder_indicator=None,
    ):
        r"""
        labels (:obj:`torch.LongTensor` of shape :obj:`(batch_size, sequence_length)`, `optional`):
            Labels for computing the token classification loss. Indices should be in ``[0, ..., config.num_labels -
            1]``.
        """
        return_dict = return_dict if return_dict is not None else self.config.use_return_dict

        outputs = self.bert(
            input_ids,
            attention_mask=attention_mask,
            token_type_ids=token_type_ids,
            position_ids=position_ids,
            head_mask=head_mask,
            inputs_embeds=inputs_embeds,
            output_attentions=output_attentions,
            output_hidden_states=output_hidden_states,
            return_dict=return_dict,
        )



        # Forward pass through the linear layers, with added linear layers
        sequence_output = outputs[0]
        sequence_output = self.dropout(sequence_output)
        batch_size,sequence_length,embedding_dim = sequence_output.size()
        sequence_output        = self.hidden_layer(sequence_output)
        sequence_output        = F.relu(sequence_output)
        sequence_output        = self.dropout_hidden(sequence_output)
        logits                 = self.classifier(sequence_output)


        # Compute loss if labels are available
        loss = None
        if labels is not None:
            loss_fct = CrossEntropyLoss()
            # Only keep active parts of the loss
            if attention_mask is not None:
                active_loss = attention_mask.view(-1) == 1
                active_logits = logits.view(-1, self.num_labels)
                active_labels = torch.where(
                    active_loss, labels.view(-1), torch.tensor(loss_fct.ignore_index).type_as(labels)
                )
                loss = loss_fct(active_logits, active_labels)
            else:
                loss = loss_fct(logits.view(-1, self.num_labels), labels.view(-1))

        if not return_dict:
            output = (logits,) + outputs[2:]
            return ((loss,) + output) if loss is not None else output

        return loss,logits

In [None]:

#Initialize the specific model based on the global variable model_type

if model_type == "both_model_simple" or model_type == "fairy_model_simple":
  model = BertForTokenClassification.from_pretrained(
      "bert-base-cased",
      num_labels=len(bio2idx),
      output_attentions = False,
      output_hidden_states = False
  )

elif model_type == "fairy_model_withAnim" or model_type == "fairy_model_withPred":
  model = BertForTokenClassificationInd.from_pretrained(
      "bert-base-cased",
      num_labels=len(bio2idx),
      output_attentions = False,
      output_hidden_states = False
  )

elif model_type == "fairy_model_complex":
  model = BertForTokenClassificationComplex.from_pretrained(
      "bert-base-cased",
      num_labels=len(bio2idx),
      output_attentions = False,
      output_hidden_states = False
  )



Some weights of the model checkpoint at bert-base-cased were not used when initializing BertForTokenClassificationInd: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']
- This IS expected if you are initializing BertForTokenClassificationInd from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassificationInd from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForTokenClassificationInd were not initialized from the model checkpoint at b

In [None]:

#The following code snippets are based on the tutorial at:
#https://www.depends-on-the-definition.com/named-entity-recognition-with-bert/

#Weight decay is a regularization method that penalizes the complexity
#of the model. Furthermore we use the AdamW optimizer.


FULL_FINETUNING = True
if FULL_FINETUNING:
    param_optimizer = list(model.named_parameters())
    # Bias and LayerNorm weights are excluded from weight decay.
    # AdamW reads the per-group key 'weight_decay'.
    no_decay = ['bias', 'LayerNorm.weight']
    optimizer_grouped_parameters = [
        {'params': [p for n, p in param_optimizer if not any(nd in n for nd in no_decay)],
         'weight_decay': 0.01},
        {'params': [p for n, p in param_optimizer if any(nd in n for nd in no_decay)],
         'weight_decay': 0.0}
    ]
else:
    param_optimizer = list(model.classifier.named_parameters())
    optimizer_grouped_parameters = [{"params": [p for n, p in param_optimizer]}]

optimizer = AdamW(
    optimizer_grouped_parameters,
    lr=learning_rate,
    eps=1e-8
)

In [None]:
max_grad_norm = 1.0

# Total number of training steps is number of batches * number of epochs.
total_steps = len(train_dataloader) * EPOCHS

# Create the learning rate scheduler.
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=0,
    num_training_steps=total_steps
)

model.cuda();
device = "cuda"

## Training
This is the final training and validation step for the initialized neural network. We iterate over the whole dataset `EPOCHS` times, presenting the model with `BATCH_SIZE` sentences per step. Each individual step of the training is commented in the code cell and follows this general outline:

 * Loop over `EPOCHS` and set the model to training mode
  * Loop over the training data in batches of `BATCH_SIZE`
    *  Compute the forward pass through the model
    *  Compute the loss and perform the backward pass to obtain the gradients
    *  Clip the gradients, update the parameters with the optimizer, and update the learning rate
 * Set the model to evaluation mode and compute the loss and accuracy on the validation data
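
The training cell clips the gradient norm to `max_grad_norm` before each optimizer step. As a rough illustration of what norm clipping does, the sketch below rescales a plain list of gradient values so that their joint L2 norm does not exceed a limit. It is a simplified stand-in, not the `torch.nn.utils.clip_grad_norm_` implementation, and `clip_by_global_norm` is a hypothetical name.

```python
import math

def clip_by_global_norm(grads, max_norm):
    """Scale all gradients by one factor so their joint L2 norm is at most max_norm."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        return list(grads), total_norm
    scale = max_norm / total_norm
    return [g * scale for g in grads], total_norm

# Toy gradient with norm 5.0, clipped to a maximum norm of 1.0.
clipped, norm_before = clip_by_global_norm([3.0, 4.0], max_norm=1.0)
print(norm_before)
print(clipped)
```

The direction of the gradient is preserved; only its length is reduced, which limits the size of a single update when gradients become very large.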


In [None]:
# Store the average loss after each epoch so we can plot the values.
loss_values, validation_loss_values, val_accuracies = [], [], []

for _ in trange(EPOCHS, desc="Epoch"):

    # TRAINING

    # Perform one full pass over the training set.

    # Put the model into training mode.
    model.train()
    # Reset the total loss for this epoch.
    total_loss = 0

    # Training loop
    for step, batch in enumerate(train_dataloader):

        # Add batch to gpu
        batch = tuple(t.to(device) for t in batch)
        b_input_ids, b_input_mask, _,b_labels,b_frame_indicator,b_anim_indicator,b_pred_indicator  = batch


        # Select the extra indicator input that matches the global `model_type`
        if model_type == "fairy_model_withAnim": embedder_indicator = b_anim_indicator
        if model_type == "fairy_model_withPred": embedder_indicator = b_pred_indicator

        # The simple models take no extra indicator; the other models also receive `embedder_indicator`
        if model_type == "both_model_simple" or model_type == "fairy_model_simple":
          outputs = model(b_input_ids, token_type_ids=b_frame_indicator,attention_mask=b_input_mask, labels=b_labels)
          loss = outputs[0]
        else:
          loss,logits = model(b_input_ids, token_type_ids=b_frame_indicator,attention_mask=b_input_mask, labels=b_labels,embedder_indicator=embedder_indicator)
        
        # Always clear any previously calculated gradients before performing a backward pass.
        model.zero_grad()
        
        # Perform a backward pass to calculate the gradients.
        loss.backward()
        # track train loss
        total_loss += loss.item()
        
        # Clip the norm of the gradient
        # This is to help prevent the "exploding gradients" problem.
        torch.nn.utils.clip_grad_norm_(parameters=model.parameters(), max_norm=max_grad_norm)
        # update parameters
        optimizer.step()
        # Update the learning rate.
        scheduler.step()

    # Calculate the average loss over the training data.
    avg_train_loss = total_loss / len(train_dataloader)
    print("Average train loss: {}".format(avg_train_loss))

    # Store the loss value for plotting the learning curve.
    loss_values.append(avg_train_loss)



    # VALIDATION


    # Put the model into evaluation mode
    model.eval()

    # Reset the validation loss for this epoch.
    eval_loss, eval_accuracy = 0, 0
    nb_eval_steps, nb_eval_examples = 0, 0
    predictions , true_labels = [], []
    for batch in valid_dataloader:

        # Add batch to gpu
        batch = tuple(t.to(device) for t in batch)
        b_input_ids, b_input_mask, _,b_labels,b_frame_indicator,b_anim_indicator,b_pred_indicator  = batch

        # Select the extra indicator input that matches the global `model_type`
        if model_type == "fairy_model_withAnim": embedder_indicator = b_anim_indicator
        if model_type == "fairy_model_withPred": embedder_indicator = b_pred_indicator

        # The simple models take no extra indicator; the other models also receive `embedder_indicator`
        with torch.no_grad():
            if model_type == "both_model_simple" or model_type == "fairy_model_simple":
              outputs = model(b_input_ids, token_type_ids=b_frame_indicator,attention_mask=b_input_mask, labels=b_labels)
              loss    = outputs[0]
              logits  = outputs[1]
            else:
              loss,logits = model(b_input_ids, token_type_ids=b_frame_indicator,attention_mask=b_input_mask, labels=b_labels,embedder_indicator=embedder_indicator)
        
        # Move logits and labels to CPU
        logits    = logits.detach().cpu().numpy()
        label_ids = b_labels.to('cpu').numpy()

        eval_loss += loss.mean().item()
        predictions.extend([list(p) for p in np.argmax(logits, axis=2)])
        true_labels.extend(label_ids)


    # Calculate the accuracy for the whole epoch. Save predictions in 
    # list for later access.
    eval_loss = eval_loss / len(valid_dataloader)
    validation_loss_values.append(eval_loss)
    print("Validation loss: {}".format(eval_loss))
    pred_tags = [tag_values[p_i] for p, l in zip(predictions, true_labels)
                                 for p_i, l_i in zip(p, l) if tag_values[l_i] != "PAD"]
    valid_tags = [tag_values[l_i] for l in true_labels
                                  for l_i in l if tag_values[l_i] != "PAD"]
    val_accuracy = accuracy_score(valid_tags, pred_tags)
    print("Validation Accuracy: {}".format(val_accuracy))
    val_accuracies.append(val_accuracy)

    print()

Epoch:   0%|          | 0/15 [00:00<?, ?it/s]

Average train loss: 1.3287757574408143


Epoch:   7%|▋         | 1/15 [00:46<10:56, 46.86s/it]

Validation loss: 0.8648726840813955
Validation Accuracy: 0.7841269841269841

Average train loss: 0.7207662002355965


Epoch:  13%|█▎        | 2/15 [01:35<10:17, 47.49s/it]

Validation loss: 0.5690808420379957
Validation Accuracy: 0.84

Average train loss: 0.4779277957148022


Epoch:  20%|██        | 3/15 [02:23<09:31, 47.65s/it]

Validation loss: 0.43902693192164105
Validation Accuracy: 0.8791534391534391

Average train loss: 0.3418442108840854


Epoch:  27%|██▋       | 4/15 [03:12<08:46, 47.86s/it]

Validation loss: 0.3676605820655823
Validation Accuracy: 0.9013756613756614

Average train loss: 0.26638245258342336


Epoch:  33%|███▎      | 5/15 [04:00<08:00, 48.01s/it]

Validation loss: 0.3207383304834366
Validation Accuracy: 0.9082539682539682

Average train loss: 0.20940720752157546


Epoch:  40%|████      | 6/15 [04:49<07:13, 48.15s/it]

Validation loss: 0.30398569876948994
Validation Accuracy: 0.917989417989418

Average train loss: 0.1729775776879655


Epoch:  47%|████▋     | 7/15 [05:37<06:25, 48.23s/it]

Validation loss: 0.3011486480633418
Validation Accuracy: 0.921058201058201

Average train loss: 0.14789561437511886


Epoch:  53%|█████▎    | 8/15 [06:25<05:38, 48.30s/it]

Validation loss: 0.2823372036218643
Validation Accuracy: 0.9267724867724868

Average train loss: 0.12679987207606988


Epoch:  60%|██████    | 9/15 [07:14<04:50, 48.34s/it]

Validation loss: 0.2792387157678604
Validation Accuracy: 0.9296296296296296

Average train loss: 0.10900005367067125


Epoch:  67%|██████▋   | 10/15 [08:02<04:01, 48.36s/it]

Validation loss: 0.2814231564601262
Validation Accuracy: 0.9307936507936508

Average train loss: 0.09468761531429158


Epoch:  73%|███████▎  | 11/15 [08:51<03:13, 48.36s/it]

Validation loss: 0.2947673213978608
Validation Accuracy: 0.9306878306878307

Average train loss: 0.08348076833687999


Epoch:  80%|████████  | 12/15 [09:39<02:25, 48.34s/it]

Validation loss: 0.30159852902094525
Validation Accuracy: 0.9312169312169312

Average train loss: 0.07656184059602243


Epoch:  87%|████████▋ | 13/15 [10:27<01:36, 48.34s/it]

Validation loss: 0.2971922419965267
Validation Accuracy: 0.9314285714285714

Average train loss: 0.06974895236392815


Epoch:  93%|█████████▎| 14/15 [11:16<00:48, 48.33s/it]

Validation loss: 0.29927175864577293
Validation Accuracy: 0.9315343915343915

Average train loss: 0.06747123348975072


Epoch: 100%|██████████| 15/15 [12:04<00:00, 48.29s/it]

Validation loss: 0.2975124195218086
Validation Accuracy: 0.9308994708994709
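
In the log above, the training loss keeps falling while the validation loss stops improving partway through training. One way to read such a log is to look up the epoch with the lowest validation loss. The sketch below does this for the validation losses printed above, rounded to four decimals; `best_epoch` is a hypothetical helper, and choosing a checkpoint this way is a suggestion rather than something the notebook does.

```python
def best_epoch(validation_losses):
    """Return the 1-based epoch with the lowest validation loss and that loss."""
    index = min(range(len(validation_losses)), key=validation_losses.__getitem__)
    return index + 1, validation_losses[index]

# Validation losses copied from the training log above (rounded).
logged_validation_losses = [
    0.8649, 0.5691, 0.4390, 0.3677, 0.3207,
    0.3040, 0.3011, 0.2823, 0.2792, 0.2814,
    0.2948, 0.3016, 0.2972, 0.2993, 0.2975,
]

epoch, loss = best_epoch(logged_validation_losses)
print(epoch, loss)  # 9 0.2792
```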






## Save Model and Results
In this step of the notebook, all results are saved to files. The trained model is saved in the `models/` folder. The validation file containing the accuracy and loss scores, as well as the confusion matrix file, is saved in `data/srl_detection/output/`.

In [None]:
# Create results dataframe
temp_dict = {"validation_accuracies":val_accuracies,"loss_value":loss_values,"validation_loss_value":validation_loss_values,"model":model_type}
temp_data_df = pd.DataFrame(temp_dict)

# Save validation data results
output_validation_data = working_dir + "data/srl_detection/output/" + model_type + "_validation_results.csv"
temp_data_df.to_csv(output_validation_data, index=False) 

In [None]:
temp_data_df

| | validation_accuracies | loss_value | validation_loss_value | model |
|---|---|---|---|---|
| 0 | 0.784127 | 1.328776 | 0.864873 | fairy_model_withAnim |
| 1 | 0.84 | 0.720766 | 0.569081 | fairy_model_withAnim |
| 2 | 0.879153 | 0.477928 | 0.439027 | fairy_model_withAnim |
| 3 | 0.901376 | 0.341844 | 0.367661 | fairy_model_withAnim |
| 4 | 0.908254 | 0.266382 | 0.320738 | fairy_model_withAnim |
| 5 | 0.917989 | 0.209407 | 0.303986 | fairy_model_withAnim |
| 6 | 0.921058 | 0.172978 | 0.301149 | fairy_model_withAnim |
| 7 | 0.926772 | 0.147896 | 0.282337 | fairy_model_withAnim |
| 8 | 0.92963 | 0.1268 | 0.279239 | fairy_model_withAnim |
| 9 | 0.930794 | 0.109 | 0.281423 | fairy_model_withAnim |


In [None]:
# Create and save confusion matrix file
confusion_matrix_both = pd.DataFrame(confusion_matrix(valid_tags, pred_tags, labels=tag_values))
output_confusion_matrix = working_dir + "data/srl_detection/output/" + model_type + "_confusion_matrix.csv"
confusion_matrix_both.to_csv(output_confusion_matrix,index=False)

In [None]:
# Save the trained model into file
output_trained_model = working_dir + "models/" + model_type + "_trained"
torch.save(model.state_dict(),output_trained_model)

# Trained Model Application
This is the final step of the notebook. Here we showcase our best-performing trained models. First, we load both models into the environment. Next, we make final predictions on target sentences from mainstream literature that the models have not seen before.
Unfortunately, the trained model file exceeds the maximum file size allowed by GitHub. The required file can be downloaded manually from the following link: https://drive.google.com/u/0/uc?export=download&confirm=S9Jm&id=1-CpTgM7WfSPdpNnEFEignMgWheFwrbue

Because Google Drive shows a notice that large files cannot be scanned for viruses, the download cannot be automated with wget. Save the file under `"./data/srl_detection/input/"`. Once the download is complete, continue with the cells below.
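
Since the download is manual, it can help to verify that the file is in place before running the loading cell. The following is a minimal sketch using only the standard library; the helper `require_file` is hypothetical, and the file name `both_model_simple` is taken from the loading cell below.

```python
from pathlib import Path

def require_file(path):
    """Return the path if the file exists, otherwise raise a descriptive error."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(
            f"Expected the downloaded model at '{path}'. "
            "Download it manually and save it in this folder first."
        )
    return path

# Example call (the path is relative to the notebook's working directory):
# require_file("./data/srl_detection/input/both_model_simple")
```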


In [None]:
# Load the pretrained animacy neural network model.
# The class definition must first be loaded into the environment.
class Net(nn.Module):
  
  def __init__(self):
      super().__init__()
      self.fc1 = nn.Linear(1582, 300)
      self.relu1 = nn.ReLU()
      self.dout = nn.Dropout(0.2)
      self.fc2 = nn.Linear(300, 100)
      self.prelu = nn.PReLU(1)
      self.out = nn.Linear(100, 1)
      self.out_act = nn.Sigmoid()
      
  def forward(self, input_):
      a1 = self.fc1(input_)
      h1 = self.relu1(a1)
      dout = self.dout(h1)
      a2 = self.fc2(dout)
      h2 = self.prelu(a2)
      a3 = self.out(h2)
      y = self.out_act(a3)
      return y
  
net = Net()
input_animate   = working_dir + "models/AnimacyDetection_MLP_model"
input_model_srl = working_dir + "data/srl_detection/input/"

# Loading the model
net.load_state_dict(torch.load(input_animate, map_location='cpu'))
net.eval()

# Loading the class
model = BertForTokenClassification.from_pretrained(
    "bert-base-cased",
    num_labels=52,
    output_attentions = False,
    output_hidden_states = False
)

# Loading the model

input_model_srl = input_model_srl + "both_model_simple"
model.load_state_dict(torch.load(input_model_srl, map_location='cpu'))
model.eval()


Some weights of the model checkpoint at bert-base-cased were not used when initializing BertForTokenClassification: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForTokenClassification were not initialized from the model checkpoint at bert-base-cas

BertForTokenClassification(
  (bert): BertModel(
    (embeddings): BertEmbeddings(
      (word_embeddings): Embedding(28996, 768, padding_idx=0)
      (position_embeddings): Embedding(512, 768)
      (token_type_embeddings): Embedding(2, 768)
      (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
    (encoder): BertEncoder(
      (layer): ModuleList(
        (0): BertLayer(
          (attention): BertAttention(
            (self): BertSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwis

In [None]:
#Load tokenizer class
model_class, tokenizer_class, pretrained_weights = (transformers.BertModel, transformers.BertTokenizer, 'bert-base-cased')
tokenizer = tokenizer_class.from_pretrained(pretrained_weights)

In [None]:
# Example mainstream sentences with predicate indicator vectors
one_liners = [
    ("When you play the game of thrones, you win or you die.", [0,0,1,0,0,0,0,0,0,0,0,0,0,0,0]),  # GOT ONE-LINERS
    ("A man with no motive is a man no one suspects.", [0,0,0,0,0,0,0,0,0,0,1,0], [0,0,0,0,0,1,0,0,0,0,0,0]),  # GOT ONE-LINERS
    ("Now he realised the cruelty of his gift.", [0,0,1,0,0,0,0,0,0]),  # GREEK MYTHS
    ("Even the smallest person can change the course of the future.", [0,0,0,0,0,1,0,0,0,0,0,0]),  # LORD OF THE RINGS
    ("No other country has counted so many deaths in the pandemic.", [0,0,0,0,1,0,0,0,0,0,0,0,0,0]),
    ("A talking tree sits on a bench and looks at a dog.", [0,0,0,0,0,0,0,0,1,0,0,0,0]),  # NEW YORK TIMES ARTICLE
]
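
Each sentence above is paired with one indicator vector per predicate: the vector has one entry per WordPiece token and a 1 at the predicate. A minimal sketch of how such a vector could be built from a token list follows; the helper `predicate_indicator` is hypothetical, and the token list is written out by hand (it matches the tokenization shown in the prediction output further below) instead of calling the tokenizer.

```python
def predicate_indicator(tokens, predicate_index):
    """Return a 0/1 vector with a single 1 at the predicate's token position."""
    if not 0 <= predicate_index < len(tokens):
        raise IndexError("predicate_index is outside the token list")
    return [1 if i == predicate_index else 0 for i in range(len(tokens))]

# WordPiece tokens of the first example sentence, written out by hand.
tokens = ["When", "you", "play", "the", "game", "of", "throne", "##s",
          ",", "you", "win", "or", "you", "die", "."]

indicator = predicate_indicator(tokens, tokens.index("play"))
print(indicator)  # [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
```

The vector must have the same length as the tokenized sentence because it is passed to the model as `token_type_ids`.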

In [None]:
# Helper function to predict the SRL tags of a sentence with a pretrained model
def predicte_sentence(sentence, pred_ind):
    # Encode without [CLS]/[SEP] so the tokens align with the predicate indicator
    tokenized_sentence = tokenizer.encode(sentence, add_special_tokens=False)
    input_ids          = torch.tensor([tokenized_sentence]).cpu()
    verb_indicator     = torch.tensor([pred_ind]).cpu()
    # Inference only, so no gradients are needed
    with torch.no_grad():
        outputs = model(input_ids, token_type_ids=verb_indicator)
    logits = outputs[0].detach().cpu().numpy()
    # Pick the highest-scoring tag for every token
    tag_ids = [list(p) for p in np.argmax(logits, axis=2)]
    pred_tags = [tag_values[p_i] for p in tag_ids for p_i in p]
    return (tokenizer.tokenize(sentence), pred_tags)

In [None]:

# Due to the highly complex preprocessing required for the animacy features,
# we load the preprocessed files of the given target sentences into the
# environment.
input_path = working_dir + "data/srl_detection/input/"
all_data = []
for i in range(6):
    file_name = input_path + "sentence" + str(i) + ".pt"
    all_data.append(file_name)


pre_one_liners = []

for i in all_data:
    test1     = torch.load(i)
    # as_tensor also accepts an existing tensor without a copy-construct warning
    tr_inputs = torch.as_tensor(test1, dtype=torch.float32)
    pre_one_liners.append(tr_inputs)



In [None]:
def classifier_final(tr_inputs, tokens, predindicator):
    # SRL tags come from the BERT model, animacy labels from the animacy network
    tokens, preds = predicte_sentence(tokens, predindicator)
    output        = net(tr_inputs)
    pred_labels   = output.squeeze() >= 0.5
    pred_labels   = pred_labels.detach().cpu().numpy()
    index = 0
    for i, j in zip(tokens, preds):
        # WordPiece continuation tokens ("##...") have no animacy label of their own
        if i.startswith("##"):
            anim = "O"
        else:
            anim = "Animate" if pred_labels[index] else "O"
            index += 1

        print("TOKEN: " + i.ljust(10) + "SRL: " + j.ljust(10) + "\t ANIMACY: " + anim)
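
The loop above aligns two sequences of different length: the SRL tags have one entry per WordPiece token, whereas the animacy labels are consumed one per word. The sketch below isolates that alignment step with hand-written inputs; `align_word_labels` is a hypothetical helper, and the labels are placeholders made up for illustration.

```python
def align_word_labels(tokens, word_labels, continuation_label="O"):
    """Spread word-level labels over WordPiece tokens.

    Continuation pieces (starting with "##") get `continuation_label`
    and do not consume a word-level label.
    """
    aligned, word_index = [], 0
    for token in tokens:
        if token.startswith("##"):
            aligned.append(continuation_label)
        else:
            aligned.append(word_labels[word_index])
            word_index += 1
    return aligned

# Illustrative input: "the pandemic ." splits into five WordPiece tokens.
tokens = ["the", "pan", "##de", "##mic", "."]
word_labels = ["label_0", "label_1", "label_2"]  # placeholders, one per word
aligned = align_word_labels(tokens, word_labels)
print(aligned)  # ['label_0', 'label_1', 'O', 'O', 'label_2']
```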

## New Sentence Prediction

"When you play the game of thrones, you win or you die."


In [None]:
classifier_final(pre_one_liners[0],one_liners[0][0],one_liners[0][1])

TOKEN: When      SRL: B-ARGM-TMP	 ANIMACY: O
TOKEN: you       SRL: B-ARG0    	 ANIMACY: Animate
TOKEN: play      SRL: O         	 ANIMACY: O
TOKEN: the       SRL: B-ARG1    	 ANIMACY: O
TOKEN: game      SRL: I-ARG1    	 ANIMACY: O
TOKEN: of        SRL: I-ARG1    	 ANIMACY: O
TOKEN: throne    SRL: I-ARG1    	 ANIMACY: O
TOKEN: ##s       SRL: PAD       	 ANIMACY: O
TOKEN: ,         SRL: O         	 ANIMACY: O
TOKEN: you       SRL: O         	 ANIMACY: Animate
TOKEN: win       SRL: O         	 ANIMACY: O
TOKEN: or        SRL: O         	 ANIMACY: O
TOKEN: you       SRL: O         	 ANIMACY: Animate
TOKEN: die       SRL: O         	 ANIMACY: O
TOKEN: .         SRL: O         	 ANIMACY: O


"A man with no motive is a man no one suspects."

In [None]:
classifier_final(pre_one_liners[1],one_liners[1][0],one_liners[1][1])

TOKEN: A         SRL: O         	 ANIMACY: O
TOKEN: man       SRL: O         	 ANIMACY: Animate
TOKEN: with      SRL: O         	 ANIMACY: O
TOKEN: no        SRL: O         	 ANIMACY: O
TOKEN: motive    SRL: O         	 ANIMACY: O
TOKEN: is        SRL: O         	 ANIMACY: O
TOKEN: a         SRL: B-ARG1    	 ANIMACY: O
TOKEN: man       SRL: I-ARG1    	 ANIMACY: Animate
TOKEN: no        SRL: B-ARG0    	 ANIMACY: O
TOKEN: one       SRL: I-ARG0    	 ANIMACY: O
TOKEN: suspects  SRL: O         	 ANIMACY: Animate
TOKEN: .         SRL: O         	 ANIMACY: O


In [None]:
classifier_final(pre_one_liners[1],one_liners[1][0],one_liners[1][2])

TOKEN: A         SRL: B-ARG0    	 ANIMACY: O
TOKEN: man       SRL: I-ARG0    	 ANIMACY: Animate
TOKEN: with      SRL: I-ARG0    	 ANIMACY: O
TOKEN: no        SRL: I-ARG0    	 ANIMACY: O
TOKEN: motive    SRL: I-ARG0    	 ANIMACY: O
TOKEN: is        SRL: O         	 ANIMACY: O
TOKEN: a         SRL: B-ARG1    	 ANIMACY: O
TOKEN: man       SRL: I-ARG2    	 ANIMACY: Animate
TOKEN: no        SRL: I-ARG2    	 ANIMACY: O
TOKEN: one       SRL: I-ARG0    	 ANIMACY: O
TOKEN: suspects  SRL: I-ARG0    	 ANIMACY: Animate
TOKEN: .         SRL: O         	 ANIMACY: O


 "Now he realised the cruelty of his gift"

In [None]:
classifier_final(pre_one_liners[2],one_liners[2][0],one_liners[2][1])

TOKEN: Now       SRL: B-ARGM-TMP	 ANIMACY: O
TOKEN: he        SRL: B-ARG0    	 ANIMACY: Animate
TOKEN: realised  SRL: O         	 ANIMACY: O
TOKEN: the       SRL: B-ARG1    	 ANIMACY: O
TOKEN: cruelty   SRL: I-ARG1    	 ANIMACY: O
TOKEN: of        SRL: I-ARG1    	 ANIMACY: O
TOKEN: his       SRL: I-ARG1    	 ANIMACY: Animate
TOKEN: gift      SRL: I-ARG1    	 ANIMACY: O
TOKEN: .         SRL: O         	 ANIMACY: O


"Even the smallest person can change the course of the future."

In [None]:
classifier_final(pre_one_liners[3],one_liners[3][0],one_liners[3][1])

TOKEN: Even      SRL: B-ARGM-DIS	 ANIMACY: O
TOKEN: the       SRL: B-ARG0    	 ANIMACY: O
TOKEN: smallest  SRL: I-ARG0    	 ANIMACY: O
TOKEN: person    SRL: I-ARG0    	 ANIMACY: Animate
TOKEN: can       SRL: B-ARGM-MOD	 ANIMACY: O
TOKEN: change    SRL: O         	 ANIMACY: O
TOKEN: the       SRL: B-ARG1    	 ANIMACY: O
TOKEN: course    SRL: I-ARG1    	 ANIMACY: O
TOKEN: of        SRL: I-ARG1    	 ANIMACY: O
TOKEN: the       SRL: I-ARG1    	 ANIMACY: O
TOKEN: future    SRL: I-ARG1    	 ANIMACY: O
TOKEN: .         SRL: O         	 ANIMACY: O


"No other country has counted so many deaths in the pandemic."

In [None]:
classifier_final(pre_one_liners[4],one_liners[4][0],one_liners[4][1])


TOKEN: No        SRL: B-ARG0    	 ANIMACY: O
TOKEN: other     SRL: I-ARG0    	 ANIMACY: O
TOKEN: country   SRL: I-ARG0    	 ANIMACY: O
TOKEN: has       SRL: O         	 ANIMACY: O
TOKEN: counted   SRL: O         	 ANIMACY: O
TOKEN: so        SRL: B-ARG1    	 ANIMACY: O
TOKEN: many      SRL: I-ARG1    	 ANIMACY: O
TOKEN: deaths    SRL: I-ARG1    	 ANIMACY: O
TOKEN: in        SRL: I-ARG1    	 ANIMACY: O
TOKEN: the       SRL: I-ARG1    	 ANIMACY: O
TOKEN: pan       SRL: I-ARG1    	 ANIMACY: O
TOKEN: ##de      SRL: PAD       	 ANIMACY: O
TOKEN: ##mic     SRL: PAD       	 ANIMACY: O
TOKEN: .         SRL: O         	 ANIMACY: O


"A talking tree sits on a bench and looks at a dog."

In [None]:
classifier_final(pre_one_liners[5],one_liners[5][0],one_liners[5][1])


TOKEN: A         SRL: B-ARG0    	 ANIMACY: O
TOKEN: talking   SRL: I-ARG0    	 ANIMACY: O
TOKEN: tree      SRL: I-ARG0    	 ANIMACY: O
TOKEN: sits      SRL: O         	 ANIMACY: O
TOKEN: on        SRL: O         	 ANIMACY: O
TOKEN: a         SRL: O         	 ANIMACY: O
TOKEN: bench     SRL: O         	 ANIMACY: O
TOKEN: and       SRL: O         	 ANIMACY: O
TOKEN: looks     SRL: O         	 ANIMACY: O
TOKEN: at        SRL: O         	 ANIMACY: O
TOKEN: a         SRL: B-ARG1    	 ANIMACY: O
TOKEN: dog       SRL: I-ARG1    	 ANIMACY: Animate
TOKEN: .         SRL: O         	 ANIMACY: O
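
The tagged output can be easier to read when consecutive `B-`/`I-` tags are merged into argument spans. The following sketch groups the tokens and SRL tags printed above for "Now he realised the cruelty of his gift."; the helper `bio_to_spans` is hypothetical and not part of the notebook, and it treats `O` and `PAD` as span boundaries.

```python
def bio_to_spans(tokens, tags):
    """Merge BIO-tagged tokens into (label, text) spans."""
    spans, label, words = [], None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-") or (tag.startswith("I-") and tag[2:] != label):
            # A new span starts; close the previous one first.
            if label is not None:
                spans.append((label, " ".join(words)))
            label, words = tag[2:], [token]
        elif tag.startswith("I-"):
            words.append(token)
        else:
            # "O" or "PAD": close any open span.
            if label is not None:
                spans.append((label, " ".join(words)))
            label, words = None, []
    if label is not None:
        spans.append((label, " ".join(words)))
    return spans

tokens = ["Now", "he", "realised", "the", "cruelty", "of", "his", "gift", "."]
tags = ["B-ARGM-TMP", "B-ARG0", "O", "B-ARG1", "I-ARG1", "I-ARG1", "I-ARG1", "I-ARG1", "O"]
spans = bio_to_spans(tokens, tags)
print(spans)
# [('ARGM-TMP', 'Now'), ('ARG0', 'he'), ('ARG1', 'the cruelty of his gift')]
```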
