# Evaluation of a pretrained BERT model on a data subset of Kyoto-2016
<b>We show how to load a data subset, load a pretrained model, and evaluate it with ROC_AUC, AUCPR_IN, AUCPR_OUT, F1_Inlier and F1_Outlier.</b>

* Load BERT for masked language modelling from a checkpoint (we load a model trained in iid mode on the 2006 to 2011 sets)

In [9]:
import sys
sys.path.append("..")

from transformers import AutoModelForMaskedLM

train_from = 2006
train_until = 2011
train_mode = 'iid'

model_path = f'../saved_models/bert_small_word_level/principal/kyoto-2016_subset_principal_{train_mode}_trainon_{train_from}-{train_until}_final/'
print("Loading model from ", model_path)
model = AutoModelForMaskedLM.from_pretrained(model_path).cuda()

Loading model from  ../saved_models/bert_small_word_level/principal/kyoto-2016_subset_principal_iid_trainon_2006-2011_final/


* Load dataframe for a given year that we will test on (we will use a subset of 300k samples from the 2013 set)

In [10]:
import pandas as pd
pd.options.mode.chained_assignment = None

test_year = 2013
ds_size = "subset"
label_col_name = '18'
label_col_pos_val = '1'

# We only keep features 0 to 13
cols = [str(i) for i in range(14)] + [label_col_name, ]

test_year_path = f"../datasets/Kyoto-2016_AnoShift/{ds_size}/{test_year}_{ds_size}.parquet"
print("Loading test set:", test_year_path)

df_test_year = pd.read_parquet(test_year_path, columns=cols)

from data_processor.data_loader import split_set

# Split set in inliers and outliers
df_test_inlier, df_test_outlier = split_set(
        df_test_year, label_col_name=label_col_name, label_col_pos_val=label_col_pos_val
)

df_test = [(str(test_year), df_test_inlier, df_test_outlier),]

Loading test set: ../datasets/Kyoto-2016_AnoShift/subset/2013_subset.parquet
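
`split_set` comes from the repository's `data_processor.data_loader` module and its implementation is not shown in this notebook. As a rough sketch of the idea only, assuming rows whose label column equals `label_col_pos_val` are the inliers and all other rows are outliers, the split could look like the following. The helper name `split_by_label` and the toy dataframe are hypothetical.

```python
import pandas as pd


def split_by_label(df, label_col_name, label_col_pos_val):
    """Hypothetical stand-in for split_set (assumed behavior, not the real code)."""
    # Rows whose label matches the positive value are treated as inliers
    is_inlier = df[label_col_name] == label_col_pos_val
    return df[is_inlier], df[~is_inlier]


# Toy frame: one feature column "0" and the label column "18" (labels stored as strings)
toy = pd.DataFrame({"0": ["a", "b", "c", "d"], "18": ["1", "-1", "1", "-2"]})
toy_inlier, toy_outlier = split_by_label(toy, "18", "1")
print(len(toy_inlier), len(toy_outlier))
```

Note that the label value is compared as a string here, matching `label_col_pos_val = '1'` in the cell above.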


* We instantiate a word-level tokenizer loaded from a saved tokenizer file
* The tokenizer contains the vocabulary for the kyoto-2016 dataset (which is of finite size, due to our binning scheme)

In [11]:
from transformers import PreTrainedTokenizerFast

tokenizer_path = '../saved_tokenizers/kyoto-2016.json'
tokenizer = PreTrainedTokenizerFast(tokenizer_file=tokenizer_path)
tokenizer.add_special_tokens(
    {"pad_token": "[PAD]", "unk_token": "[UNK]", "mask_token": "[MASK]"}
)


0
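
To illustrate why binning yields a finite vocabulary, here is a minimal sketch. The bin edges, the `bin_token` helper and the token naming are all hypothetical and are not the scheme used to build `kyoto-2016.json`; the point is only that every numeric value maps to one of a fixed set of tokens.

```python
import bisect

# Hypothetical bin edges for one numeric feature (not the dataset's real edges)
EDGES = [1, 10, 100, 1000]


def bin_token(feature_idx, value, edges=EDGES):
    """Map a numeric value to a token such as '3_2' (feature 3, bin 2)."""
    bin_idx = bisect.bisect_right(edges, value)  # ranges from 0 to len(edges)
    return f"{feature_idx}_{bin_idx}"


# Every value lands in one of len(EDGES) + 1 bins, so the token set is bounded
tokens = {bin_token(3, v) for v in [0, 0.5, 7, 42, 999, 10**6]}
print(sorted(tokens))
```

With a bounded token set per feature, a word-level tokenizer only needs a fixed lookup table, plus special tokens such as `[PAD]`, `[UNK]` and `[MASK]`.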

* Prepare the dataset object from the dataframes by tokenizing each entry

In [12]:
from language_models.data_utils import prepare_test_ds

ds_test = prepare_test_ds(
    dfs_test=df_test, tokenizer=tokenizer, block_size=len(cols)-1
)


* Evaluate the model on the dataset

In [13]:
from language_models.evaluation_utils import eval_rocauc
eval_rocauc(
    model=model,
    dss_test=ds_test,
    bs_eval=256,
    tokenizer=tokenizer,
    epoch=0,
    tb_writer=None,
)

{'inlier': Dataset({
    features: ['input_ids', 'token_type_ids', 'attention_mask'],
    num_rows: 299995
}), 'outlier': Dataset({
    features: ['input_ids', 'token_type_ids', 'attention_mask'],
    num_rows: 1237765
})}


100%|██████████| 1170/1170 [02:12<00:00,  8.83it/s]


Class: inlier Anomaly score: 0.4263514987247481


100%|██████████| 4834/4834 [09:03<00:00,  8.89it/s]


Class: outlier Anomaly score: 0.5505516537948927
ROC AUC       2013: 0.8498379975910056
AUCPR INLIER  2013: 0.5689764262412097
AUCPR OUTLIER 2013: 0.9574600471765427
F1 INLIER 2013: 0.5753914810648456
F1 OUTLIER 2013: 0.9039597922799439
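
The ROC AUC reported above summarizes how well the anomaly score separates the two classes. As a minimal sketch, assuming outliers are the positive class and a higher score means more anomalous, ROC AUC equals the probability that a randomly chosen outlier scores above a randomly chosen inlier, with ties counting one half. The `roc_auc` helper and the toy scores are hypothetical; the internals of `eval_rocauc` are not shown in this notebook and may differ.

```python
def roc_auc(inlier_scores, outlier_scores):
    """Pairwise (Mann-Whitney) estimate of ROC AUC, outliers as positives."""
    wins = 0.0
    for out_score in outlier_scores:
        for in_score in inlier_scores:
            if out_score > in_score:
                wins += 1.0  # outlier correctly ranked above inlier
            elif out_score == in_score:
                wins += 0.5  # ties count as half
    return wins / (len(inlier_scores) * len(outlier_scores))


# Toy anomaly scores, not taken from the run above
toy_inlier_scores = [0.1, 0.4, 0.35]
toy_outlier_scores = [0.8, 0.3, 0.6]
auc = roc_auc(toy_inlier_scores, toy_outlier_scores)
print(f"toy ROC AUC: {auc:.4f}")
```

The pairwise loop is quadratic in the number of samples, so it is only suitable for illustration. For sets the size of the 2013 subset, a library routine such as scikit-learn's `roc_auc_score` is the practical choice.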
