<a href="https://colab.research.google.com/github/konan-91/OcularClassification/blob/master/notebooks/3_model_testing.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Benchmarking Our Model

During training we obtained several metrics describing the model's accuracy. To check whether it generalises to novel data, we now test it on an entirely new dataset. We can also benchmark it against MNE-Python's built-in blink removal algorithms to get a sense of how well it performs in practice.



In [1]:
from fastai.vision.all import *
from fastai.vision.widgets import *

In [2]:
import os

Loading the model and benchmark dataset.

In [3]:
learn = load_learner('/content/ocular_classifier_25.pkl')

`load_learner` uses Python's insecure pickle module, which can execute malicious arbitrary code when loading. Only load files you trust.
If you only need to load model weights and optimizer state, use the safe `Learner.load` instead.


In [None]:
!unzip /content/img_test_blinks.zip
!rm -rf /content/img_test_blinks.zip

In [None]:
!unzip /content/img_rest_new.zip
!rm -rf /content/img_rest_new.zip

In [None]:
!unzip /content/img_h_saccades_new.zip
!rm -rf /content/img_h_saccades_new.zip

In [None]:
!unzip /content/img_v_saccades_new.zip
!rm -rf /content/img_v_saccades_new.zip

## Evaluating Model on Test Data

Here we run the model on the epoched MNE test data and calculate its accuracy for each class.

In [None]:
# Sanity check on a single blink image; pred[2] holds the class probabilities.
pred = learn.predict('/content/img_test_blinks/img__test_blinks_0.png')
print(pred[2][0], pred[2][1])
print(pred[2][0] > pred[2][1])

tensor(0.9799) tensor(0.0201)
tensor(True)


Blink accuracy test

In [8]:
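blink_score = 0
for img in os.listdir('/content/img_test_blinks'):
    pred = learn.predict('/content/img_test_blinks/' + img)
    # Correct when the first class probability (blink) is the larger one.
    if pred[2][0] > pred[2][1]:
        blink_score += 1
# Fraction of blink images classified as blinks.
print(blink_score / len(os.listdir('/content/img_test_blinks')))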

1.0


In [9]:
print(len(os.listdir('/content/img_test_blinks')))
print(blink_score)

319
319
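The same scoring loop is repeated below for the rest and saccade folders. As a minimal sketch, it could be factored into one helper. The names `folder_accuracy` and `stub_predict` are hypothetical, and the helper assumes, as the comparisons in this notebook imply, that index 0 of the probability tensor is the blink class. The stub stands in for `learn.predict` so the example runs without the trained model.

```python
import os
import tempfile

def folder_accuracy(predict_fn, folder, expect_blink):
    """Return (correct, total) over the PNG files in `folder`.

    predict_fn should behave like `learn.predict`: its third return value
    holds the class probabilities, with the blink class at index 0
    (an assumption based on the comparisons used in this notebook).
    """
    files = sorted(f for f in os.listdir(folder) if f.endswith('.png'))
    correct = 0
    for name in files:
        probs = predict_fn(os.path.join(folder, name))[2]
        if expect_blink:
            correct += int(probs[0] > probs[1])
        else:
            correct += int(probs[0] < probs[1])
    return correct, len(files)

# Hypothetical stand-in for learn.predict, used only for this demonstration.
def stub_predict(path):
    p_blink = 0.9 if 'blink' in os.path.basename(path) else 0.2
    return ('label', None, (p_blink, 1 - p_blink))

with tempfile.TemporaryDirectory() as demo_dir:
    # Three images plus one non-image file that should be ignored.
    for name in ('blink_0.png', 'blink_1.png', 'rest_0.png', 'notes.txt'):
        open(os.path.join(demo_dir, name), 'w').close()
    demo_correct, demo_total = folder_accuracy(stub_predict, demo_dir, expect_blink=True)

print(demo_correct, demo_total)
```

With the real model this would be called as `folder_accuracy(learn.predict, '/content/img_test_blinks', expect_blink=True)`, and with `expect_blink=False` for the non-blink folders.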


Rest accuracy test

In [11]:
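rest_score = 0
for img in os.listdir('/content/img_rest_new'):
    pred = learn.predict('/content/img_rest_new/' + img)
    # Correct when the model does not favour the blink class.
    if pred[2][0] < pred[2][1]:
        rest_score += 1
# Fraction of rest images classified as non-blinks.
print(rest_score / len(os.listdir('/content/img_rest_new')))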

0.7556818181818182


In [12]:
print(len(os.listdir('/content/img_rest_new')))
print(rest_score)

704
532


Saccade accuracy tests (horizontal, then vertical)


In [17]:
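h_score = 0
# Count only the PNG images so the denominator matches the files actually scored.
h_files = [f for f in os.listdir('/content/img_h_saccades_new/') if f.endswith('.png')]
for img in h_files:
    pred = learn.predict('/content/img_h_saccades_new/' + img)
    # Correct when the model does not favour the blink class.
    if pred[2][0] < pred[2][1]:
        h_score += 1
print(h_score / len(h_files))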

0.908235294117647


In [19]:
print(len(os.listdir('/content/img_h_saccades_new')))
print(h_score)

425
386


In [21]:
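v_score = 0
# Count only the PNG images so the denominator matches the files actually scored.
v_files = [f for f in os.listdir('/content/img_v_saccades_new/') if f.endswith('.png')]
for img in v_files:
    pred = learn.predict('/content/img_v_saccades_new/' + img)
    # Correct when the model does not favour the blink class.
    if pred[2][0] < pred[2][1]:
        v_score += 1
print(v_score / len(v_files))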

0.4959349593495935


In [22]:
print(len(os.listdir('/content/img_v_saccades_new')))
print(v_score)

492
244
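The counts printed above can be combined into overall figures. The sketch below treats blinks as the positive class and pools rest and both saccade types as negatives. That pooling, and the variable names, are assumptions made here for illustration rather than something the notebook itself computes.

```python
# (correct, total) per class, copied from the cell outputs above.
results = {
    'blinks': (319, 319),
    'rest': (532, 704),
    'h_saccades': (386, 425),
    'v_saccades': (244, 492),
}

# Accuracy within each class.
per_class = {name: correct / total for name, (correct, total) in results.items()}

# Sensitivity: share of blink images classified as blinks.
tp, n_pos = results['blinks']
sensitivity = tp / n_pos

# Specificity: share of non-blink images classified as non-blinks
# (assumption: all three non-blink classes are pooled).
tn = sum(c for name, (c, t) in results.items() if name != 'blinks')
n_neg = sum(t for name, (c, t) in results.items() if name != 'blinks')
specificity = tn / n_neg

# Accuracy over every test image.
overall = (tp + tn) / (n_pos + n_neg)

for name, acc in per_class.items():
    print(f'{name:12s} {acc:.3f}')
print(f'{"sensitivity":12s} {sensitivity:.3f}')
print(f'{"specificity":12s} {specificity:.3f}')
print(f'{"overall":12s} {overall:.3f}')
```

Pooled this way, the model misses no blinks in this test set, while vertical saccades account for just over half of the non-blink images wrongly flagged as blinks.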


## Testing ICA on Test Data

McNemar's test is a statistical test for paired categorical data.

## McNemar’s Test

We test whether the two methods' performances differ significantly from one another.
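To illustrate how the test works: only the discordant pairs matter, that is, the cases where one method is right and the other is wrong. The sketch below computes the exact (binomial) version using only the standard library. The function name `mcnemar_exact` and the counts in `table` are invented for illustration and are not results from this notebook. `statsmodels`' `mcnemar(table, exact=True)` should report the same p-value for the same table.

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the two discordant counts.

    b: cases method A classified correctly and method B did not
    c: cases method B classified correctly and method A did not
    """
    n = b + c
    if n == 0:
        return 1.0  # the methods never disagree, so there is no evidence of a difference
    k = min(b, c)
    # Under the null hypothesis each discordant pair is equally likely to
    # favour either method, so the smaller count follows Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical 2x2 table of paired outcomes (rows: method A, columns: method B).
#         B correct  B wrong
table = [[50,        10],   # A correct
         [2,         38]]   # A wrong

p_value = mcnemar_exact(table[0][1], table[1][0])
print(f'p = {p_value:.4f}')
```

The concordant cells (50 and 38 here) do not enter the calculation: they say how often the methods agree, not which one is better.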

In [None]:
from statsmodels.stats.contingency_tables import mcnemar
import numpy as np