# Train and test using the complete DCC
In this notebook we first train a biLSTM on the complete DCC. We then use MetaCAT's eval() function to evaluate the model on that same complete DCC, and we also use a custom evaluation function to process every example separately and save those results in a result dataframe.

In [1]:
import numpy as np
from pathlib import Path
from medcat.meta_cat import MetaCAT
from medcat.config_meta_cat import ConfigMetaCAT
from medcat.tokenizers.meta_cat_tokenizers import TokenizerWrapperBPE
from utils import evaluate_per_example




In [15]:
# Input & output
data_dir = Path.cwd().parents[0] / 'data'
annotation_file = data_dir / 'emc-dcc_ann_Augmented.json'
model_dir = Path.cwd().parents[0] / 'models' / 'bilstm'
embeddings_file = model_dir / 'embeddings.npy'
result_dir = Path.cwd().parents[0] / 'results'
bilstm_result_file = result_dir / 'bilstm_predictions.csv.gz'

# Config
config_metacat = ConfigMetaCAT()
config_metacat.general['category_name'] = 'Experiencer'
config_metacat.train['nepochs'] = 10
config_metacat.train['score_average'] = 'binary'  # averaging mode used for precision, recall and F1
config_metacat.model['nclasses'] = 2
config_metacat.model['hidden_size'] = 256
config_metacat.model['dropout'] = 0.5
config_metacat.model['num_directions'] = 2
config_metacat.model['input_size'] = 300


In [16]:
config_metacat

ConfigMetaCAT(general=General(device='cpu', disable_component_lock=False, seed=13, description='No description', category_name='Experiencer', category_value2id={}, vocab_size=None, lowercase=True, cntx_left=15, cntx_right=10, replace_center=None, batch_size_eval=5000, annotate_overlapping=False, tokenizer_name='bbpe', save_and_reuse_tokens=False, pipe_batch_size_in_chars=20000000), model=Model(model_name='lstm', num_layers=2, input_size=300, hidden_size=256, dropout=0.5, num_directions=2, nclasses=2, padding_idx=-1, emb_grad=True, ignore_cpos=False), train=Train(batch_size=100, nepochs=10, lr=0.001, test_size=0.1, shuffle_data=True, class_weights=None, score_average='binary', prerequisites={}, cui_filter=None, auto_save_model=True, last_train_on=None, metric={'base': 'weighted avg', 'score': 'f1-score'}))

## Load tokenizer and embeddings matrix
Load a project-wide tokenizer and embeddings matrix which are created in `01_tokenizer_embeddings.ipynb`.

In [17]:
tokenizer = TokenizerWrapperBPE.load(model_dir)
embeddings = np.load(embeddings_file)
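
Before training, it can help to confirm that the loaded matrix matches the configuration. The sketch below assumes the embeddings are stored as a 2-D array of shape `(vocab_size, embedding_dim)` and that `embedding_dim` has to equal `config_metacat.model['input_size']`. The helper `check_embedding_shape` is hypothetical and not part of MedCAT, and the shape in the last line is illustrative only.

```python
def check_embedding_shape(shape, input_size):
    """Return the vocabulary size if `shape` is (vocab_size, input_size).

    Hypothetical helper, not part of MedCAT. Assumes a 2-D embeddings
    matrix with one row per token and one column per embedding dimension.
    """
    if len(shape) != 2:
        raise ValueError(f'expected a 2-D embeddings matrix, got shape {shape}')
    vocab_size, embedding_dim = shape
    if embedding_dim != input_size:
        raise ValueError(
            f'embedding dimension {embedding_dim} does not match input_size {input_size}'
        )
    return vocab_size


# In this notebook the call would be:
# check_embedding_shape(embeddings.shape, config_metacat.model['input_size'])
print(check_embedding_shape((20000, 300), 300))  # illustrative shape only
```

Failing early here gives a clearer message than a shape mismatch raised later inside the model.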

## Train biLSTM

In [18]:
# Instantiate MetaCAT
meta_cat = MetaCAT(tokenizer=tokenizer, embeddings=embeddings, config=config_metacat)

In [19]:
# Train model
results = meta_cat.train_from_json(json_path=str(annotation_file), save_dir_path=str(model_dir))

In [20]:
meta_cat.save(save_dir_path=str(model_dir))

## Simple evaluation with MetaCAT's eval()
MetaCAT's eval() function returns aggregate scores and example snippets, but not the ID of each example.

In [21]:
# Load biLSTM
meta_cat = MetaCAT.load(model_dir)
# Evaluate with MetaCAT's eval()
results = meta_cat.eval(json_path=str(annotation_file))

In [22]:
results

{'precision': 0.9991161819058332,
 'recall': 0.9983942191890807,
 'f1': 0.998755070077507,
 'examples': {'FP': {'other': [('Predicted: other, True: patient',
     'spreekuur, voor beoordeling van haar wervelkolom, in verband met een mogelijke <<scoliose>> .\nfamilie-anamnese: patientje'),
    ('Predicted: other, True: patient',
     'er heeft wel een <<infectie>>  plaatsgevonden laat in de zwangerschap in de zin van een'),
    ('Predicted: other, True: patient',
     ' distale radiusfractuur links.\nhet capitulum radii lijkt niet te <<sporen>> .\n'),
    ('Predicted: other, True: patient',
     '-team in ons ziekenhuis, die samen met de andere teamleden de diagnose <<neuro>> fibromatose heeft gesteld.\nconclusie: anter'),
    ('Predicted: other, True: patient',
     'de schoolarts ontdekte een <<scoliose>> .\nx-thoraxcolumbalewervelk'),
    ('Predicted: other, True: patient',
     'ze is bekend met een <<oste>> ogenesis imperfecta.\nbij onderzoek'),
    ('Predicted: other, True: patien

In [8]:
# Print full F1 score to check for changes in result
print(f'F1: {results["f1"]}')

KeyError: 'f1'
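
As an extra sanity check on the scores shown in the eval() output above, F1 can be recomputed from precision and recall. This is a minimal sketch that assumes the `'binary'` averaging mode, where F1 is the harmonic mean of the reported precision and recall. The helper `f1_from_precision_recall` is hypothetical, and the values are copied from the output above.

```python
import math


def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the eval() output above.
reported = {
    'precision': 0.9991161819058332,
    'recall': 0.9983942191890807,
    'f1': 0.998755070077507,
}

recomputed = f1_from_precision_recall(reported['precision'], reported['recall'])
# Compare with a tolerance rather than exact equality (floating point).
print(math.isclose(recomputed, reported['f1'], rel_tol=1e-9))
```

If the averaging mode is changed (for example to a weighted average), this identity no longer has to hold, so a mismatch is then not necessarily a bug.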

## Custom evaluation per example
In this project we want to know, for each example, whether a negation has been identified or not. MetaCAT does not offer this functionality: it only returns the scores, predictions and examples.

In this part we iterate through all annotations in an annotation file (MedCAT Trainer format), create an ID for every example (`exampleID=documentID_start_end`), and collect the prediction per example.
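
The ID scheme can be sketched as follows. This is not the implementation of `evaluate_per_example` in `utils`. It only illustrates how `documentID_start_end` IDs could be built, assuming an export laid out as projects, documents and annotations, with an `id` field on each document and `start`/`end` fields on each annotation. The function `iter_example_ids` and the miniature export are hypothetical.

```python
def iter_example_ids(export):
    """Yield one ID per annotation in the form documentID_start_end.

    Assumes the layout projects -> documents -> annotations, with `id`
    on the document and `start`/`end` on the annotation.
    """
    for project in export['projects']:
        for document in project['documents']:
            for annotation in document['annotations']:
                yield f"{document['id']}_{annotation['start']}_{annotation['end']}"


# Hypothetical miniature export, for illustration only.
toy_export = {
    'projects': [
        {
            'documents': [
                {
                    'id': 7,
                    'annotations': [
                        {'start': 27, 'end': 35},
                        {'start': 40, 'end': 48},
                    ],
                },
                {
                    'id': 8,
                    'annotations': [
                        {'start': 0, 'end': 5},
                    ],
                },
            ]
        }
    ]
}

print(list(iter_example_ids(toy_export)))
# ['7_27_35', '7_40_48', '8_0_5']
```

Because the ID combines the document ID with the character offsets, it stays unique as long as no two annotations in the same document cover exactly the same span.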

In [12]:
bilstm_predictions = evaluate_per_example(annotation_file, meta_cat, 'bilstm', label=config_metacat.general['category_name'])
bilstm_predictions.to_csv(bilstm_result_file, index=False, compression='gzip', line_terminator='\n')
bilstm_predictions

TypeError: evaluate_per_example() got an unexpected keyword argument 'label'