# McNemar's Test for Image Classification

Our goal is to test whether our model's improvement over the baseline model, measured on the test set, is statistically significant.

## What is McNemar's test?

McNemar's test is a non-parametric test for paired nominal data. It is used to detect a change in proportion between paired observations. Here, we apply it to the per-image predictions of the baseline and our model to determine whether our model performs better than the baseline. To learn more, see [McNemar's Test](https://www.statisticshowto.com/mcnemar-test/).

![mc](http://rasbt.github.io/mlxtend/user_guide/evaluate/mcnemar_table_files/mcnemar_contingency_table.png)
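
For reference, the test only looks at the two discordant cells of the 2x2 table. Writing $b$ for the images the baseline gets right and our model gets wrong, and $c$ for the reverse (these labels are assumed here and may differ from the ones in the figure), the continuity-corrected statistic is commonly written as:

$$
\chi^2 = \frac{(|b - c| - 1)^2}{b + c}
$$

Under the null hypothesis that both models are equally likely to be right on a discordant image, this statistic approximately follows a chi-square distribution with one degree of freedom.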

We compute the test statistic with an [online calculator](https://www2.ccrb.cuhk.edu.hk/stat/confidence%20interval/McNemar%20Test.htm).

## Imports

In [3]:
#imports
import zipfile
import pandas as pd

## Dataset from Repository

Get the dataset from [Pinkshepz/Alzheimers-Class](https://github.com/Pinkshepz/Alzheimers-Class). The MRI images were downloaded from [kaggle.com/alzheimers-dataset-4-class-of-images](https://www.kaggle.com/tourist55/alzheimers-dataset-4-class-of-images).

In [4]:
!wget --no-check-certificate \
    "https://github.com/Pinkshepz/Alzheimers-Class/archive/refs/heads/main.zip" \
    -O "/content/alzheimers-image.zip"


# Open the zip file in read mode and extract its contents into /content.
# The context manager closes the archive automatically.
with zipfile.ZipFile('/content/alzheimers-image.zip', 'r') as zip_ref:
    zip_ref.extractall('/content')

--2021-06-30 02:50:26--  https://github.com/Pinkshepz/Alzheimers-Class/archive/refs/heads/main.zip
Resolving github.com (github.com)... 192.30.255.113
Connecting to github.com (github.com)|192.30.255.113|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://codeload.github.com/Pinkshepz/Alzheimers-Class/zip/refs/heads/main [following]
--2021-06-30 02:50:26--  https://codeload.github.com/Pinkshepz/Alzheimers-Class/zip/refs/heads/main
Resolving codeload.github.com (codeload.github.com)... 192.30.255.120
Connecting to codeload.github.com (codeload.github.com)|192.30.255.120|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/zip]
Saving to: ‘/content/alzheimers-image.zip’

/content/alzheimers     [      <=>           ]  35.11M  32.5MB/s    in 1.1s    

2021-06-30 02:50:27 (32.5 MB/s) - ‘/content/alzheimers-image.zip’ saved [36814126]



### Obtain prediction dataframe

In [22]:
preds_base = pd.read_csv('/content/Alzheimers-Class-main/base_preds.csv').loc[:,['fname', 'Score_base']]
preds_mod = pd.read_csv('/content/Alzheimers-Class-main/mod_preds.csv').loc[:,['fname', 'Score_mod']]

In [24]:
predict = preds_base.merge(preds_mod, how='inner', on='fname')
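
McNemar's test assumes each image appears exactly once in the paired table, so it can be worth checking the merge. The following is a minimal sketch on toy frames (the file names and scores are hypothetical stand-ins for `preds_base` and `preds_mod`); it relies on the `validate` argument of `DataFrame.merge` and on a row-count comparison.

```python
import pandas as pd

# Toy stand-ins for preds_base and preds_mod; file names and scores are hypothetical.
toy_base = pd.DataFrame({'fname': ['a.jpg', 'b.jpg', 'c.jpg'], 'Score_base': [1, 0, 1]})
toy_mod = pd.DataFrame({'fname': ['a.jpg', 'b.jpg', 'd.jpg'], 'Score_mod': [1, 1, 0]})

# validate='one_to_one' makes merge raise pandas.errors.MergeError
# if a file name is duplicated in either frame.
toy_pairs = toy_base.merge(toy_mod, how='inner', on='fname', validate='one_to_one')

# An inner join silently drops unmatched images, so count how many were lost.
n_dropped = max(len(toy_base), len(toy_mod)) - len(toy_pairs)
print(f'{len(toy_pairs)} paired images, {n_dropped} dropped')
```

If the number of dropped images is not zero, the two prediction files do not cover the same test set and the paired comparison is run on a subset.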

In [25]:
predict

| | fname | Score_base | Score_mod |
|---|---|---|---|
| 0 | /content/Alzheimers-Class-main/Image-dataset/t... | 1 | 1 |
| 1 | /content/Alzheimers-Class-main/Image-dataset/t... | 1 | 1 |
| 2 | /content/Alzheimers-Class-main/Image-dataset/t... | 1 | 1 |
| 3 | /content/Alzheimers-Class-main/Image-dataset/t... | 1 | 1 |
| 4 | /content/Alzheimers-Class-main/Image-dataset/t... | 1 | 1 |
| ... | ... | ... | ... |
| 1274 | /content/Alzheimers-Class-main/Image-dataset/t... | 0 | 0 |
| 1275 | /content/Alzheimers-Class-main/Image-dataset/t... | 0 | 0 |
| 1276 | /content/Alzheimers-Class-main/Image-dataset/t... | 0 | 1 |
| 1277 | /content/Alzheimers-Class-main/Image-dataset/t... | 0 | 1 |


In [27]:
print('Base accuracy = ' + str(predict['Score_base'].mean()))
print('Model accuracy = ' + str(predict['Score_mod'].mean()))

Base accuracy = 0.6262705238467553
Model accuracy = 0.743549648162627


In [30]:
# Count the four agreement/disagreement cases between the two models.
# The mean of each score column is the accuracy, so 1 marks a correct
# prediction and 0 an incorrect one. In each counter name, the first
# word refers to our model and the second to the baseline.
mod_ok = predict['Score_mod'] == 1
base_ok = predict['Score_base'] == 1

pos_pos = int((mod_ok & base_ok).sum())    # both correct
pos_neg = int((mod_ok & ~base_ok).sum())   # model correct, baseline wrong
neg_pos = int((~mod_ok & base_ok).sum())   # model wrong, baseline correct
neg_neg = int((~mod_ok & ~base_ok).sum())  # both wrong

In [31]:
pos_neg, pos_pos, neg_neg, neg_pos 

(240, 711, 238, 90)

In [32]:
chi_table = pd.DataFrame({'Model_pos':[pos_pos, pos_neg], 
                          'Model_neg':[neg_pos, neg_neg]}, 
                         index=['Base_pos', 'Base_neg'], 
                         columns=['Model_pos', 'Model_neg'])
chi_table

| | Model_pos | Model_neg |
|---|---|---|
| Base_pos | 711 | 90 |
| Base_neg | 240 | 238 |
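
As an alternative to counting the four cells by hand, `pd.crosstab` can build the same kind of table directly from the two score columns. This is a sketch on toy data (the values in `toy` are hypothetical), not the code used to produce the table above.

```python
import pandas as pd

# Toy correctness flags (1 = correct, 0 = wrong); values are hypothetical.
toy = pd.DataFrame({'Score_base': [1, 1, 0, 0, 1, 0],
                    'Score_mod':  [1, 0, 1, 1, 1, 0]})

# Rows are baseline outcomes, columns are model outcomes.
table = pd.crosstab(toy['Score_base'], toy['Score_mod'])

# Put the "correct" (1) row and column first, filling any missing cell with 0.
table = table.reindex(index=[1, 0], columns=[1, 0], fill_value=0)
table.index = ['Base_pos', 'Base_neg']
table.columns = ['Model_pos', 'Model_neg']
print(table)
```

The off-diagonal cells (`Base_pos`/`Model_neg` and `Base_neg`/`Model_pos`) are the discordant counts that McNemar's test uses.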


The online calculator reports a continuity-corrected test statistic of 67.27576, a p-value below 0.0001, and an odds ratio of 0.375 (90/240, the ratio of the two discordant counts).
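
These figures can be cross-checked without the calculator. The sketch below assumes the continuity-corrected form of McNemar's statistic and uses only the two discordant counts from the table (90 and 240); the variable names are illustrative. For one degree of freedom, the chi-square tail probability equals `erfc(sqrt(x / 2))`, so the standard library is enough.

```python
import math

# Discordant counts from the contingency table.
b = 90   # baseline correct, model wrong
c = 240  # baseline wrong, model correct

# Continuity-corrected McNemar statistic.
chi2 = (abs(b - c) - 1) ** 2 / (b + c)

# Upper-tail probability of a chi-square variable with 1 degree of freedom:
# P(X > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

# Ratio of the discordant counts, as reported by the calculator.
odds_ratio = b / c

print(f'chi2 = {chi2:.5f}, p = {p_value:.3g}, odds ratio = {odds_ratio:.3f}')
```

The statistic and odds ratio printed here should agree with the calculator's 67.27576 and 0.375, and the p-value falls well below 0.0001.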

We conclude that our model predicts more accurately than the baseline model (p < 0.0001).