## Code for training and evaluating the LSTM classifiers

This notebook reproduces the results reported in Table 5 of the Living Machines paper (COLING 2020). It consists of two parts: (a) training and (b) evaluation. All experiments use the FLAIR framework. We use BERT as a feature extractor (i.e. to obtain a representation for each token); these representations are then forwarded to an LSTM sequence classifier that assigns one of the classes "O", "ANIMATE" or "INANIMATE" to each word. In the process, we allow fine-tuning of the last three layers of the transformer itself.

### Training

This section provides code for training three BERT-based LSTM models.

- Experiment 0 uses the Jahan et al. data.
- Experiment 1 uses the annotations of "machine" sentences, based on the 19th century British Library books corpus. The target category is "Animacy". In this scenario, we continue fine-tuning the model trained on Jahan data in Experiment 0.
- Experiment 2 uses the same data as Experiment 1, but in this case the target category is "Humanness".

In [None]:
%load_ext autoreload

In [None]:
%autoreload 2

In [None]:
%matplotlib inline
import flair
from flair.data import Sentence
from flair.datasets import ColumnCorpus
from flair.data import Corpus
from flair.trainers import ModelTrainer
from flair.embeddings import *
from typing import List
from pathlib import Path
from flair.datasets import DataLoader
from flair.models import SequenceTagger
import pandas as pd
import pickle
from tqdm.notebook import tqdm
from tools.helpers import fscore_results # write_csv_data,categorize_data,split_data
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import classification_report, precision_score, recall_score
from sklearn.svm import SVC,LinearSVC

Make sure that you use the correct FLAIR version. The cell below should return `True`.

In [None]:
flair.__version__ == '0.6.0.post1'

### Load Embeddings

In [None]:
embeddings_dict = {'glove' : WordEmbeddings('glove'),
                   'bert_ft' : TransformerWordEmbeddings('bert-base-uncased',
                                                      fine_tune=True,
                                                      layers='-1,-2,-3', 
                                                      pooling_operation='mean'), #
    
#                  'bert' : TransformerWordEmbeddings('bert-base-uncased',
#                                                       fine_tune=False,
#                                                       pooling_operation='mean'), #
#                  'histo_bert':BertEmbeddings('/datadrive/khosseini/LM_with_bert/models/bert/FT_bert_base_uncased_after_1875_before_1890_v002/final_model')
                  }

### Select Experiment

Select which experiment to run. Experiments are described at the beginning of this Notebook.

In [None]:
experiment_dict = {0: {"train_data": 'jahan', "test_data": 'jahan', "animacy": "animacy"},
                   1: {"train_data": 'machine_sents', "test_data": 'machine_sents', "animacy": "animacy"},
                   2: {"train_data": 'machine_sents', "test_data": 'machine_sents', "animacy": "humanness"}}
                   

In [None]:
experiment_number = 2 # 0, 1 or 2

### Data Parameters

Select the data for the chosen experiment. Change the `save_experiments_to` and `path_to_data` variables to your preferred folders.

In [None]:
save_experiments_to = '/deezy_datadrive/kaspar-playground/rnn_experiments/coling'
path_to_data = '/deezy_datadrive/kaspar-playground/livmach_code_review/AtypicalAnimacy/data/'

In [None]:
train_data = experiment_dict[experiment_number]['train_data']
test_data = experiment_dict[experiment_number]['test_data']
animacy = experiment_dict[experiment_number]['animacy']
root_dir = Path(save_experiments_to)
root_dir.mkdir(exist_ok=True)

In [None]:
path_data = Path(path_to_data)

if train_data == 'machine_sents':
    sent_data_path_train = path_data / f'machines19thC/{animacy}_train.pkl'
elif train_data == 'jahan':
    sent_data_path_train = path_data / 'stories/train.pkl'

if test_data == 'machine_sents':
    sent_data_path_test = path_data / f'machines19thC/{animacy}_test.pkl'
elif test_data == 'jahan':
    sent_data_path_test = path_data / 'stories/test.pkl'
    
df_train = pd.read_pickle(sent_data_path_train).sample(frac=1,random_state=42).reset_index(drop=True)
# further divide train into train and dev
cutoff = int(df_train.shape[0]*.8)

df_dev = df_train.iloc[cutoff:]
df_train = df_train.iloc[:cutoff]


df_test = pd.read_pickle(sent_data_path_test)
print(f"{train_data}-train\n",df_train.animated.value_counts())
print(f"{train_data}-dev\n",df_dev.animated.value_counts())
print(f"{test_data}-test\n",df_test.animated.value_counts())
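The shuffle-then-cut logic used above for the train/dev split can be tried on toy data without pandas. This is a minimal sketch: `split_train_dev` is a hypothetical helper, not part of the notebook, and it uses Python's `random` module rather than `DataFrame.sample`, so the resulting order will differ from the notebook's.

```python
import random

def split_train_dev(rows, train_fraction=0.8, seed=42):
    """Shuffle rows reproducibly, then cut them into train and dev parts."""
    shuffled = list(rows)                   # copy so the input stays untouched
    random.Random(seed).shuffle(shuffled)   # fixed seed gives the same order every run
    cutoff = int(len(shuffled) * train_fraction)
    return shuffled[:cutoff], shuffled[cutoff:]

train_rows, dev_rows = split_train_dev(range(10))
print(len(train_rows), len(dev_rows))
```

Because the seed is fixed, rerunning the cell yields the same split, which keeps the dev set stable across experiments.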

### Data processing parameters

In [None]:
target='center' # 'all' or 'center'
masked=False # True or False
data_format = f'sequential_{target}_{masked}'
csv_data_path = root_dir / f'{train_data}_{animacy}_{data_format}'
csv_data_path.mkdir(exist_ok=True)

### Split data into train, dev, test

In [None]:
def write_csv_data(df, path, name, target='all', masked=False, tokenize_target_expr=False, num2str=None):
    """Write df as a two-column (token, tag) file `<name>.txt` that ColumnCorpus can read."""
    if num2str is None:
        num2str = {0: 'INANIMATE', 1: 'ANIMATE'}

    with open(path / (name + '.txt'), 'w') as out_csv:
        for _, row in df.iterrows():
            masked_sentence = row.maskedSentence
            # skip rows that contain more than one [MASK] placeholder
            if len(masked_sentence.split('[MASK]')) > 2:
                continue
            # 'center': keep only the [SEP]-delimited segment that contains the mask
            if target == 'center':
                for s in masked_sentence.split('[SEP]'):
                    if '[MASK]' in s:
                        masked_sentence = s
                        break

            target_expr = '[TARGET]' if masked else row.targetExpression

            try:
                csv = []
                for t in Sentence(masked_sentence):
                    if t.text != 'MASK':
                        csv.append([t.text.lower(), 'O'])
                    else:
                        # wrap the labelled target tokens in TARGET markers
                        csv.append(['TARGET', 'O'])
                        csv.extend([[te.text.lower(), num2str[row.animated]] for te in Sentence(target_expr)])
                        csv.append(['TARGET', 'O'])

                csv = '\n'.join(['\t'.join(l) for l in csv])
                out_csv.write(f'{csv}\n\n')
            except Exception as e:
                # report the problem and move on to the next row
                print(e)

In [None]:
write_csv_data(df=df_train,
               path=csv_data_path,
               name='train', 
               target=target,
               masked=masked)

write_csv_data(df=df_dev,
               path=csv_data_path,
               name='dev',
               target=target,
               masked=masked)

write_csv_data(df=df_test,
               path=csv_data_path,
               name='test',
               target=target,
               masked=masked)

In [None]:
!head -n 10 {str(csv_data_path)}/dev.txt
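To make the layout of these files concrete, the sketch below builds the token/tag rows for a single made-up sentence. It is a simplification under stated assumptions: `to_column_rows` is a hypothetical helper, the example sentence is invented, and tokens are split on whitespace, whereas the notebook tokenizes with FLAIR's `Sentence` (which appears to split the brackets off the `[MASK]` placeholder, so the real files may contain extra bracket tokens).

```python
def to_column_rows(masked_sentence, target_expr, label):
    """Build (token, tag) rows for one sentence in the two-column layout."""
    rows = []
    for token in masked_sentence.split():      # assumption: whitespace tokens
        if token != "[MASK]":
            rows.append((token.lower(), "O"))
        else:
            rows.append(("TARGET", "O"))       # opening marker
            rows.extend((t.lower(), label) for t in target_expr.split())
            rows.append(("TARGET", "O"))       # closing marker
    return rows

rows = to_column_rows("The [MASK] groaned all night", "steam engine", "ANIMATE")
print("\n".join("\t".join(r) for r in rows))
```

Only the tokens of the target expression carry an `ANIMATE` or `INANIMATE` tag; every other token, including the two `TARGET` markers, is tagged `O`.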

### Model hyperparameters and embeddings

In [None]:
#embeddings_list = ['bert_ft','glove'] # ,'glove'
train_with_dev = False
if train_data == 'machine_sents':
    learning_rate = 1e-3 #.05
    epochs = 3
    continue_from = "/deezy_datadrive/kaspar-playground/rnn_experiments/coling/classifier/classifier_jahan_0_seq/best-model.pt" # path or False
elif train_data == 'jahan':
    learning_rate = .05
    epochs = 20
    continue_from = False

In [None]:
trainer_folder = root_dir / 'classifier' 
trainer_folder.mkdir(exist_ok=True)
trainer_path = trainer_folder/ ('classifier_' + f'{train_data}' + f'_{experiment_number}_seq')
trainer_path.mkdir(exist_ok=True)

### Save hyperparameters in file
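One possible way to do this is to dump the settings to a JSON file next to the model. The sketch below is an assumption, not the authors' code: `save_hyperparameters` and the file name `hyperparameters.json` are hypothetical, the values are illustrative copies of the settings used for the machine-sentence experiments, and a temporary directory stands in for `trainer_path`.

```python
import json
import tempfile
from pathlib import Path

def save_hyperparameters(folder, params, filename="hyperparameters.json"):
    """Write a dict of settings as JSON into folder and return the file path."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    out_path = folder / filename
    with open(out_path, "w") as out_json:
        json.dump(params, out_json, indent=2, sort_keys=True)
    return out_path

# Illustrative values mirroring the machine-sentence settings in this notebook.
params = {"train_data": "machine_sents", "learning_rate": 1e-3,
          "epochs": 3, "mini_batch_size": 32, "patience": 3}
saved = save_hyperparameters(tempfile.mkdtemp(), params)  # temp dir for the demo
print(json.loads(saved.read_text()))
```

In the notebook itself, `trainer_path` would be the natural target folder, so that each trained model is stored together with the settings that produced it.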

### Load corpus

In [None]:
tag_type = 'animacy'
columns = {0: "text", 1: "animacy"}

corpus = ColumnCorpus(csv_data_path, columns,
                              train_file='train.txt',
                              test_file='test.txt',
                              dev_file='dev.txt')


tag_dictionary = corpus.make_tag_dictionary(tag_type=tag_type)


In [None]:
tag_dictionary.get_items()

### Train Model

In [None]:
if continue_from:
    tagger = SequenceTagger.load(continue_from)
    trainer = ModelTrainer(tagger, corpus)
else:
    embeddings = embeddings_dict['bert_ft']

    tagger = SequenceTagger(hidden_size=256,
                        embeddings=embeddings,
                        tag_dictionary=tag_dictionary,
                        tag_type='animacy',
                        use_crf=False,
                        use_rnn=True,
                        loss_weights={'ANIMATE': 10., 'INANIMATE':10., 'O':.1}) # 

    trainer = ModelTrainer(tagger, corpus)

results = trainer.train(trainer_path,
              learning_rate=learning_rate,
              mini_batch_size=32,
              patience=3,
              anneal_with_restarts=True,
              monitor_test=False,
              max_epochs=epochs)

### Evaluate Classifier

In [None]:
model_folder = '/deezy_datadrive/kaspar-playground/rnn_experiments/coling/classifier/'

In [None]:
!ls {model_folder}

In [None]:
trainer_path = '/deezy_datadrive/kaspar-playground/rnn_experiments/coling/classifier/classifier_machine_sents_2_seq/'

In [None]:
!ls {trainer_path}

In [None]:
!head -n 10 {trainer_path}"/test.tsv"

In [None]:
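def sentence_level_prediction(tsv_file):
    """Collapse the token-level tags in test.tsv to one gold and one predicted label per sentence."""
    string2int = {'ANIMATE': 1, 'INANIMATE': 0}
    with open(tsv_file, 'r') as in_tsv:
        tsv = in_tsv.read()

    # sentences are separated by blank lines; each line reads "token gold predicted"
    sents = tsv.strip().split("\n\n")
    y_true, y_pred = [], []
    for s in sents:
        s = s.strip().split("\n")
        y_t = [l.split(" ")[1] for l in s if l.split(" ")[1] != 'O']
        y_p = [l.split(" ")[2] for l in s if l.split(" ")[2] != 'O']

        if y_p and y_t:
            # majority vote over the non-'O' tags
            y_true.append(string2int[max(set(y_t), key=y_t.count)])
            y_pred.append(string2int[max(set(y_p), key=y_p.count)])
        else:
            # no non-'O' tag on one side: print the tags and skip the sentence
            print(y_t, y_p)
    return y_true, y_pred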
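The majority vote that turns token-level tags into one label per sentence can be checked on a small hand-written example. This is a sketch on invented data: `toy_tsv` and `majority_label` are hypothetical names, and the toy string only imitates the space-separated "token gold predicted" layout that the function above expects.

```python
from collections import Counter

# Toy stand-in for test.tsv: "token gold predicted", blank line between sentences.
toy_tsv = """the O O
TARGET O O
steam ANIMATE ANIMATE
engine ANIMATE INANIMATE
plough ANIMATE ANIMATE
TARGET O O

a O O
TARGET O O
machine INANIMATE INANIMATE
TARGET O O
"""

def majority_label(labels):
    """Return the most frequent non-'O' label, or None if there is none."""
    kept = [label for label in labels if label != "O"]
    return Counter(kept).most_common(1)[0][0] if kept else None

results = []
for block in toy_tsv.strip().split("\n\n"):
    columns = [line.split(" ") for line in block.split("\n")]
    gold = majority_label(c[1] for c in columns)
    pred = majority_label(c[2] for c in columns)
    results.append((gold, pred))

print(results)
```

In the first sentence one of the three target tokens is predicted `INANIMATE`, but the vote still yields `ANIMATE` for the sentence. Ties are resolved differently in the two versions: `Counter.most_common` keeps first-seen order, while `max` over a `set` depends on set iteration order.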
        

In [None]:
y_true, y_pred = sentence_level_prediction(Path(trainer_path) / "test.tsv")
print(classification_report(y_true,y_pred,digits=3)) 

In [None]:
# Note: this definition overrides the `fscore_results` imported from `tools.helpers` above.
def fscore_results(y_true, y_pred, all_scores=True):
    """Return macro precision, macro recall, their harmonic mean and a MAP score (rounded to 3 digits)."""
    precision = precision_score(y_true, y_pred, average='macro')
    recall = recall_score(y_true, y_pred, average='macro')
    if precision == 0 and recall == 0:
        fscore = 0.0
    else:
        fscore = (2.0 * precision * recall) / (precision + recall)

    # micro-averaged scores are computed but not returned
    precision_micro = precision_score(y_true, y_pred, average='micro')
    recall_micro = recall_score(y_true, y_pred, average='micro')
    if precision_micro == 0 and recall_micro == 0:
        fscore_micro = 0.0
    else:
        fscore_micro = (2.0 * precision_micro * recall_micro) / (precision_micro + recall_micro)

    # rank the [gold, predicted] pairs by predicted label, positives first
    rank = [[y_true[x], y_pred[x]] for x in range(len(y_pred))]
    rank.sort(key=lambda x: x[1], reverse=True)

    map_ = 0
    correct = 0
    for x in range(len(rank)):
        if rank[x][0] == 1:
            correct += 1
            map_ += correct / (x + 1)
    # guard against a division by zero when there is no positive gold label
    final_map = map_ / correct if correct else 0.0
    if all_scores:
        return round(precision, 3), round(recall, 3), round(fscore, 3), round(final_map, 3) # round(fscore_micro,3),
    return round(fscore, 3)

In [None]:
fscore_results(y_true,y_pred)
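Note that the F-score computed by `fscore_results` is the harmonic mean of macro-averaged precision and macro-averaged recall, which is generally not the same number as the average of the per-class F1 scores. The sketch below illustrates the difference in plain Python on made-up labels; `per_class_scores` is a hypothetical helper and the label vectors are invented for the example.

```python
def per_class_scores(y_true, y_pred, label):
    """Precision, recall and F1 for one class label."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Made-up gold and predicted sentence labels (1 = ANIMATE, 0 = INANIMATE).
y_true_toy = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred_toy = [1, 0, 0, 0, 0, 0, 0, 1]

scores = [per_class_scores(y_true_toy, y_pred_toy, label) for label in (0, 1)]
macro_p = sum(s[0] for s in scores) / len(scores)
macro_r = sum(s[1] for s in scores) / len(scores)
harmonic = 2 * macro_p * macro_r / (macro_p + macro_r)   # F-score as defined in fscore_results
mean_f1 = sum(s[2] for s in scores) / len(scores)        # average of per-class F1
print(round(harmonic, 3), round(mean_f1, 3))
```

The two values differ on this toy input, so scores from `fscore_results` should only be compared with numbers computed the same way.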