# Train and test using the complete DCC
In this notebook we first train a biLSTM on the complete DCC. We then evaluate the model on that same complete DCC with MetaCAT's eval() function, and also run a custom evaluation function that processes every example separately and saves the results in a result dataframe.

In [1]:
import numpy as np
from pathlib import Path
from medcat.meta_cat import MetaCAT
from medcat.config_meta_cat import ConfigMetaCAT
from medcat.tokenizers.meta_cat_tokenizers import TokenizerWrapperBPE
from utils import evaluate_per_example




In [2]:
# Input & output
data_dir = Path.cwd().parents[0] / 'data'
annotation_file = data_dir / 'emc-dcc_ann_Augmented.json'
model_dir = Path.cwd().parents[0] / 'models' / 'bilstm'
embeddings_file = model_dir / 'embeddings.npy'
result_dir = Path.cwd().parents[0] / 'results'
bilstm_result_file = result_dir / 'bilstm_predictions.csv.gz'

# Config
config_metacat = ConfigMetaCAT()
config_metacat.general['category_name'] = 'Temporality'
config_metacat.train['nepochs'] = 10
config_metacat.train['score_average'] = 'macro'
config_metacat.model['nclasses'] = 3
config_metacat.model['hidden_size'] = 256
config_metacat.model['dropout'] = 0.5
config_metacat.model['num_directions'] = 2
config_metacat.model['input_size'] = 300


In [3]:
config_metacat

ConfigMetaCAT(general=General(device='cpu', disable_component_lock=False, seed=13, description='No description', category_name='Temporality', category_value2id={}, vocab_size=None, lowercase=True, cntx_left=15, cntx_right=10, replace_center=None, batch_size_eval=5000, annotate_overlapping=False, tokenizer_name='bbpe', save_and_reuse_tokens=False, pipe_batch_size_in_chars=20000000), model=Model(model_name='lstm', num_layers=2, input_size=300, hidden_size=256, dropout=0.5, num_directions=2, nclasses=3, padding_idx=-1, emb_grad=True, ignore_cpos=False), train=Train(batch_size=100, nepochs=10, lr=0.001, test_size=0.1, shuffle_data=True, class_weights=None, score_average='macro', prerequisites={}, cui_filter=None, auto_save_model=True, last_train_on=None, metric={'base': 'weighted avg', 'score': 'f1-score'}))
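
The config above sets `score_average` to `'macro'`. As a reminder of what that choice implies, here is a minimal sketch comparing macro and support-weighted F1 on made-up counts. The helper names, the counts and the third class name `other` are illustrative assumptions, not taken from the DCC or from MetaCAT's internals.

```python
def f1_from_counts(tp, fp, fn):
    """F1 score from raw counts; returns 0.0 when the score is undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_and_weighted_f1(counts):
    """`counts` maps a class name to a (tp, fp, fn) tuple."""
    per_class = {label: f1_from_counts(*c) for label, c in counts.items()}
    # Support = number of true examples of a class = tp + fn
    support = {label: tp + fn for label, (tp, fp, fn) in counts.items()}
    total = sum(support.values())
    # Macro: every class counts equally, regardless of its size
    macro = sum(per_class.values()) / len(per_class)
    # Weighted: every class counts in proportion to its support
    weighted = sum(per_class[label] * support[label] for label in per_class) / total
    return macro, weighted


# Made-up, imbalanced counts (tp, fp, fn) for three classes
toy_counts = {
    "recent": (90, 5, 10),
    "historical": (8, 10, 2),
    "other": (1, 1, 4),  # hypothetical third class
}
macro_f1, weighted_f1 = macro_and_weighted_f1(toy_counts)
print(f"macro F1: {macro_f1:.3f}, weighted F1: {weighted_f1:.3f}")
```

With an imbalanced label distribution the macro average is pulled down by rare classes, whereas the weighted average is dominated by the majority class.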

## Load tokenizer and embeddings matrix
Load the project-wide tokenizer and embeddings matrix, both created in `01_tokenizer_embeddings.ipynb`.

In [4]:
tokenizer = TokenizerWrapperBPE.load(model_dir)
embeddings = np.load(embeddings_file)
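
A quick sanity check can catch a mismatch between the embeddings matrix and the config before training starts. The sketch below assumes that the embedding width has to equal `config_metacat.model['input_size']` (300 here); `check_embedding_dim` is a hypothetical helper, and the zero matrix only stands in for the real `embeddings.npy`.

```python
import numpy as np


def check_embedding_dim(embeddings, input_size):
    """Return (vocab_size, dim) if the matrix is 2-D and dim matches input_size."""
    if embeddings.ndim != 2:
        raise ValueError(f"expected a 2-D matrix, got {embeddings.ndim} dimension(s)")
    vocab_size, dim = embeddings.shape
    if dim != input_size:
        raise ValueError(f"embedding width {dim} does not match input_size {input_size}")
    return vocab_size, dim


# Stand-in for np.load(embeddings_file); the vocabulary size is made up
dummy_embeddings = np.zeros((1000, 300), dtype=np.float32)
vocab_size, dim = check_embedding_dim(dummy_embeddings, input_size=300)
print(f"vocab size: {vocab_size}, embedding width: {dim}")
```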

## Train biLSTM

In [5]:
# Initialize MetaCAT
meta_cat = MetaCAT(tokenizer=tokenizer, embeddings=embeddings, config=config_metacat)

In [6]:
# Train model
results = meta_cat.train_from_json(json_path=str(annotation_file), save_dir_path=str(model_dir))

  _warn_prf(average, modifier, msg_start, len(result))


In [7]:
meta_cat.save(save_dir_path=str(model_dir))

## Simple evaluation with MetaCAT's eval()
MetaCAT's eval() function returns aggregate scores and example contexts, but not the example ID.

In [8]:
# Load biLSTM
meta_cat = MetaCAT.load(model_dir)
# Evaluate with MetaCAT's eval()
results = meta_cat.eval(json_path=annotation_file)

In [9]:
results

{'precision': 0.9912969199430934,
 'recall': 0.9450875273524172,
 'f1': 0.9668101469809479,
 'examples': {'FP': {'recent': [('Predicted: recent, True: historical',
     'x-thoracolumbale wervelkolom: status na thoracale xii <<werv>> elfractuur met operatieve stabilisatie.\n|  '),
    ('Predicted: recent, True: historical',
     " na arthrotomie.\nin 1990 is patient gezien na een <<trauma>> , waarbij rontgenfoto's van de rechter"),
    ('Predicted: recent, True: historical',
     ' op 15-11-1995 een scheef liggend bekken, met een milde <<heup>> dysplasie rechts.\nadvies: ik heb de heup'),
    ('Predicted: recent, True: historical',
     'aan geplaatst tussen de tweede en derde metacarpaal na klieven van de <<syn>> ostose om deze verder gescheiden te houden.\ne'),
    ('Predicted: recent, True: historical',
     " de horeca volgt, had in augustus'95 na een val waarschijnlijk een <<luxatie>>  van het pip-gewricht van de linker ring"),
    ('Predicted: recent, True: historical',
     'arpa

In [10]:
# Print full F1 score to check for changes in result
print(f'F1: {results["f1"]}')

F1: 0.9668101469809479
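
The `examples` entry shown above nests lists of `(description, context)` tuples by error type and class. A minimal sketch for summarizing such a structure follows; the toy dictionary only mimics the shape visible in the output, and the helper names are hypothetical.

```python
from collections import Counter


def count_examples(examples):
    """Count the listed examples per error type and per class."""
    return {
        error_type: {label: len(items) for label, items in per_class.items()}
        for error_type, per_class in examples.items()
    }


def confusion_pairs(examples, error_type="FP"):
    """Count how often each 'Predicted: x, True: y' description occurs."""
    pairs = Counter()
    for items in examples.get(error_type, {}).values():
        for description, _context in items:
            pairs[description] += 1
    return pairs


# Toy data in the same shape as results['examples']; the contexts are made up
toy_examples = {
    "FP": {
        "recent": [
            ("Predicted: recent, True: historical", "context a"),
            ("Predicted: recent, True: historical", "context b"),
        ],
        "historical": [
            ("Predicted: historical, True: recent", "context c"),
        ],
    }
}
per_class = count_examples(toy_examples)
pairs = confusion_pairs(toy_examples)
print(per_class)
print(pairs.most_common())
```

On the real output this could be called as `count_examples(results['examples'])`, assuming the full dictionary follows the same shape as the visible part.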

## Custom evaluation per example
In this project we want to know, for each individual example, whether its Temporality label has been predicted correctly. MetaCAT does not offer this functionality: it only returns the scores, predictions and examples.

In this part we iterate through all annotations in an annotation file (MedCAT Trainer format), create an ID for every example (`exampleID = documentID_start_end`), and collect the prediction per example.
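
The ID scheme described above can be sketched as follows. This is not the implementation of `evaluate_per_example` in `utils`; it assumes the usual MedCAT Trainer export layout (`projects` > `documents` > `annotations`, each annotation carrying `start`, `end` and `meta_anns`), uses the document `name` as `documentID`, and runs on a made-up export.

```python
def iter_example_ids(export, category_name):
    """Yield (example_id, true_label) for every annotation that has the category."""
    for project in export["projects"]:
        for document in project["documents"]:
            for ann in document["annotations"]:
                meta = ann.get("meta_anns", {})
                # Assumption: some exports store meta annotations as a list
                if isinstance(meta, list):
                    meta = {m["name"]: m for m in meta}
                if category_name not in meta:
                    continue
                # exampleID = documentID_start_end
                example_id = f'{document["name"]}_{ann["start"]}_{ann["end"]}'
                yield example_id, meta[category_name]["value"]


# Made-up export with one document and two annotations
toy_export = {
    "projects": [{
        "documents": [{
            "name": "DL1111",
            "text": "...",
            "annotations": [
                {"start": 10, "end": 16,
                 "meta_anns": {"Temporality": {"name": "Temporality", "value": "recent"}}},
                {"start": 40, "end": 47,
                 "meta_anns": [{"name": "Temporality", "value": "historical"}]},
            ],
        }]
    }]
}
example_ids = list(iter_example_ids(toy_export, "Temporality"))
print(example_ids)
```

Because the ID combines the document identifier with the character offsets, it stays unique as long as no two annotations in one document share the same span.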

In [12]:
# Collect one prediction per example (the helper does not accept a `label` keyword)
bilstm_predictions = evaluate_per_example(annotation_file, meta_cat, 'bilstm')
# Note: pandas >= 1.5 renames `line_terminator` to `lineterminator`
bilstm_predictions.to_csv(bilstm_result_file, index=False, compression='gzip', line_terminator='\n')
bilstm_predictions