# Model Comparison

**Objectives**
- Take note of the results of the previous performance tests that used the holdout method (i.e., during the Model Robustness Test).
- Using McNemar's test, determine whether there are significant differences between GBDT models (e.g., LGBM Default vs CatBoost Default), between configurations (e.g., LGBM Default vs LGBM Tuned), and between behavior types (e.g., Time-based (TB) LGBM Tuned vs IB LGBM Tuned).
- Use whichever dataset is appropriate (probably the Test/Holdout split).
- Take note of the results.

Assume a `significance level` of **0.05 (5%)**, as mentioned in the RRL relating to Model Comparison; use it as the reference value for the significance level.

<hr>

*Kindly double check the statement(s) that will follow:*

Assume the null hypothesis is *"there is no significant difference between the two models"* (i.e., both models have the same error rate on the test set). `<== Modify this accordingly depending on which will be compared (whether if GBDT vs GBDT or Default vs Tuned)`

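If the resulting `p-value` is larger than the `significance level`, the null hypothesis is not rejected, and the two models are not considered significantly different. Otherwise (`p-value` < `significance level`), the null hypothesis is rejected, and the difference between the two models is considered statistically significant.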

Interpreting the resulting array:
[[a,b]
 [c,d]]

- a = Both models are correct
- b = Model 1 correct, Model 2 wrong
- c = Model 1 wrong, Model 2 correct
- d = Both models are wrong

Only the discordant cells `b` and `c` enter the test statistic.
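
As a worked illustration of how the table turns into a test result, the sketch below computes the uncorrected McNemar statistic, (b - c)^2 / (b + c), and its p-value from a chi-squared distribution with one degree of freedom. The helper name `mcnemar_chi2` is hypothetical and not part of `mlxtend` or `statsmodels`; it only mirrors what the `exact=False`, uncorrected setting used in this notebook is expected to compute.

```python
import numpy as np
from scipy.stats import chi2


def mcnemar_chi2(table):
    """Uncorrected McNemar statistic and p-value from a 2x2 table [[a, b], [c, d]]."""
    table = np.asarray(table)
    b, c = table[0, 1], table[1, 0]
    if b + c == 0:
        # No discordant samples: the two models never differ in correctness.
        return 0.0, 1.0
    stat = (b - c) ** 2 / (b + c)
    p_value = chi2.sf(stat, df=1)  # upper tail of chi-squared with 1 degree of freedom
    return float(stat), float(p_value)


# Table reported for Comparison 1 in this notebook.
stat, p = mcnemar_chi2([[4068, 6], [14, 36]])
print(f"chi2 = {stat:.4f}, p = {p:.4f}")
```

The cells `a` and `d` do not affect the result, which is why two models with very high agreement can still differ significantly if their disagreements are lopsided.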

References:
- [https://rasbt.github.io/mlxtend/user_guide/evaluate/mcnemar/](https://rasbt.github.io/mlxtend/user_guide/evaluate/mcnemar/)

In [1]:
import statsmodels.stats.contingency_tables as statsmodels #mcnemar
import mlxtend.evaluate as mlxtend #mcnemar_table, mcnemar
import pandas as pd
import numpy as np
import lightgbm as lgbm
import catboost as catb
from joblib import load

import warnings
warnings.filterwarnings("ignore")

In [2]:
DF_LGBM_TB = pd.read_csv('../Dataset/TB/LGBM_TB_Test.csv', low_memory=False) #<== Point these to the proper Test/Holdout datasets.
DF_LGBM_IB = pd.read_csv('../Dataset/IB/LGBM_IB_Test.csv', low_memory=False)
DF_CATB_TB = pd.read_csv('../Dataset/TB/CATB_TB_Test.csv', low_memory=False) #<== Point these to the proper Test/Holdout datasets.
DF_CATB_IB = pd.read_csv('../Dataset/IB/CATB_IB_Test.csv', low_memory=False)
DF_CATB_IB.iloc[:,1:101] = DF_CATB_IB.iloc[:,1:101].astype('str')
DF_CATB_IB.replace("nan", "NaN", inplace=True)

y_target = DF_LGBM_TB['malware'] # <---- labels are equal across all datasets
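
McNemar's test pairs predictions sample by sample, so the four datasets need to list the same samples in the same order, not just share the same labels. A minimal sketch of such a check follows; `check_aligned` is a hypothetical helper, the toy frames only stand in for the real datasets, and the `hash` and `malware` column names are taken from the tables displayed in this notebook.

```python
import pandas as pd


def check_aligned(frames, key_cols=("hash", "malware")):
    """Return True if every frame has identical key_cols values in the same row order."""
    cols = list(key_cols)
    reference = frames[0][cols].reset_index(drop=True)
    return all(
        reference.equals(frame[cols].reset_index(drop=True))
        for frame in frames[1:]
    )


# Toy frames standing in for DF_LGBM_TB and DF_CATB_TB (hypothetical values).
toy_lgbm = pd.DataFrame({"malware": [1, 0], "t_0": [240, 82], "hash": ["aa", "bb"]})
toy_catb = pd.DataFrame({"malware": [1, 0], "t_0": ["LdrLoadDll", "NtClose"], "hash": ["aa", "bb"]})
print(check_aligned([toy_lgbm, toy_catb]))
```

With the real data, the call would presumably be `check_aligned([DF_LGBM_TB, DF_LGBM_IB, DF_CATB_TB, DF_CATB_IB])`, run once before any comparison.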

In [3]:
display(DF_LGBM_TB)
display(DF_LGBM_IB)
display(DF_CATB_TB)
display(DF_CATB_IB)

Unnamed: 0,malware,t_0,t_1,t_2,t_3,t_4,t_5,t_6,t_7,t_8,...,t_92,t_93,t_94,t_95,t_96,t_97,t_98,t_99,hash,type
0,1,240,117,240,117,240,117,240,117,240,...,240,117,240,117,172,60,225,35,0fe987c56cfb02db5d810534d6098d93,trojan
1,1,82,198,86,82,274,37,240,117,260,...,274,215,274,158,215,37,158,215,f58d31adac5b879b50ce07a9da086736,trojan
2,1,215,208,228,117,228,240,117,228,159,...,230,35,240,117,208,89,225,35,f07d9fa9d2852bd4b7b36f39dd531b4a,pua
3,1,159,208,260,141,65,208,20,34,215,...,187,135,171,262,208,262,187,262,7a493fa07f0f7d0c9e372eedae03036b,ransomware
4,1,82,240,117,240,117,240,117,240,117,...,260,141,260,141,260,141,260,141,ba60236d9f9fe6cd0a10ffbbf2296669,trojan
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4119,1,208,228,240,117,240,117,82,240,117,...,208,172,117,187,208,187,172,117,b2e6e058b25a25175f4f2d264f0bb83c,trojan
4120,1,82,208,172,117,172,208,16,110,172,...,279,208,82,112,123,65,208,112,e768ea96da507ed239ce925c24b8fd1a,trojan
4121,1,82,240,117,240,117,93,117,16,147,...,230,240,117,225,35,208,89,225,5a99618b63178d7a221552fe962992e3,trojan
4122,1,112,274,158,215,274,158,215,298,76,...,297,135,171,215,35,208,56,71,10a935e723a4b1cc416adb7af2bc4965,trojan


Unnamed: 0,malware,t_0,t_1,t_2,t_3,t_4,t_5,t_6,t_7,t_8,...,t_92,t_93,t_94,t_95,t_96,t_97,t_98,t_99,hash,type
0,1,240,117,228,215,274,158,172,198,208,...,307,307,307,307,307,307,307,307,0fe987c56cfb02db5d810534d6098d93,trojan
1,1,82,198,86,274,37,240,117,260,40,...,307,307,307,307,307,307,307,307,f58d31adac5b879b50ce07a9da086736,trojan
2,1,215,208,228,117,240,159,187,260,141,...,307,307,307,307,307,307,307,307,f07d9fa9d2852bd4b7b36f39dd531b4a,pua
3,1,159,208,260,141,65,20,34,215,172,...,307,307,307,307,307,307,307,307,7a493fa07f0f7d0c9e372eedae03036b,ransomware
4,1,82,240,117,172,16,11,274,158,215,...,307,307,307,307,307,307,307,307,ba60236d9f9fe6cd0a10ffbbf2296669,trojan
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4119,1,208,228,240,117,82,245,210,65,172,...,307,307,307,307,307,307,307,307,b2e6e058b25a25175f4f2d264f0bb83c,trojan
4120,1,82,208,172,117,16,110,286,257,215,...,307,307,307,307,307,307,307,307,e768ea96da507ed239ce925c24b8fd1a,trojan
4121,1,82,240,117,93,16,147,228,208,71,...,307,307,307,307,307,307,307,307,5a99618b63178d7a221552fe962992e3,trojan
4122,1,112,274,158,215,298,76,208,172,117,...,307,307,307,307,307,307,307,307,10a935e723a4b1cc416adb7af2bc4965,trojan


Unnamed: 0,malware,t_0,t_1,t_2,t_3,t_4,t_5,t_6,t_7,t_8,...,t_92,t_93,t_94,t_95,t_96,t_97,t_98,t_99,hash,type
0,1,LdrLoadDll,LdrGetProcedureAddress,LdrLoadDll,LdrGetProcedureAddress,LdrLoadDll,LdrGetProcedureAddress,LdrLoadDll,LdrGetProcedureAddress,LdrLoadDll,...,LdrLoadDll,LdrGetProcedureAddress,LdrLoadDll,LdrGetProcedureAddress,LdrGetDllHandle,FindResourceExW,DrawTextExW,GetSystemMetrics,0fe987c56cfb02db5d810534d6098d93,trojan
1,1,GetSystemTimeAsFileTime,GetSystemInfo,NtCreateMutant,GetSystemTimeAsFileTime,NtOpenKey,NtOpenKeyEx,LdrLoadDll,LdrGetProcedureAddress,RegOpenKeyExW,...,NtOpenKey,NtClose,NtOpenKey,NtQueryValueKey,NtClose,NtOpenKeyEx,NtQueryValueKey,NtClose,f58d31adac5b879b50ce07a9da086736,trojan
2,1,NtClose,NtAllocateVirtualMemory,NtProtectVirtualMemory,LdrGetProcedureAddress,NtProtectVirtualMemory,LdrLoadDll,LdrGetProcedureAddress,NtProtectVirtualMemory,NtDelayExecution,...,GetUserNameW,GetSystemMetrics,LdrLoadDll,LdrGetProcedureAddress,NtAllocateVirtualMemory,NtDuplicateObject,DrawTextExW,GetSystemMetrics,f07d9fa9d2852bd4b7b36f39dd531b4a,pua
3,1,NtDelayExecution,NtAllocateVirtualMemory,RegOpenKeyExW,RegQueryValueExW,RegCloseKey,NtAllocateVirtualMemory,NtOpenFile,NtQueryInformationFile,NtClose,...,NtFreeVirtualMemory,NtCreateSection,NtMapViewOfSection,NtQuerySystemInformation,NtAllocateVirtualMemory,NtQuerySystemInformation,NtFreeVirtualMemory,NtQuerySystemInformation,7a493fa07f0f7d0c9e372eedae03036b,ransomware
4,1,GetSystemTimeAsFileTime,LdrLoadDll,LdrGetProcedureAddress,LdrLoadDll,LdrGetProcedureAddress,LdrLoadDll,LdrGetProcedureAddress,LdrLoadDll,LdrGetProcedureAddress,...,RegOpenKeyExW,RegQueryValueExW,RegOpenKeyExW,RegQueryValueExW,RegOpenKeyExW,RegQueryValueExW,RegOpenKeyExW,RegQueryValueExW,ba60236d9f9fe6cd0a10ffbbf2296669,trojan
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4119,1,NtAllocateVirtualMemory,NtProtectVirtualMemory,LdrLoadDll,LdrGetProcedureAddress,LdrLoadDll,LdrGetProcedureAddress,GetSystemTimeAsFileTime,LdrLoadDll,LdrGetProcedureAddress,...,NtAllocateVirtualMemory,LdrGetDllHandle,LdrGetProcedureAddress,NtFreeVirtualMemory,NtAllocateVirtualMemory,NtFreeVirtualMemory,LdrGetDllHandle,LdrGetProcedureAddress,b2e6e058b25a25175f4f2d264f0bb83c,trojan
4120,1,GetSystemTimeAsFileTime,NtAllocateVirtualMemory,LdrGetDllHandle,LdrGetProcedureAddress,LdrGetDllHandle,NtAllocateVirtualMemory,SetUnhandledExceptionFilter,OleInitialize,LdrGetDllHandle,...,RegisterHotKey,NtAllocateVirtualMemory,GetSystemTimeAsFileTime,RegOpenKeyExA,RegQueryValueExA,RegCloseKey,NtAllocateVirtualMemory,RegOpenKeyExA,e768ea96da507ed239ce925c24b8fd1a,trojan
4121,1,GetSystemTimeAsFileTime,LdrLoadDll,LdrGetProcedureAddress,LdrLoadDll,LdrGetProcedureAddress,GetFileType,LdrGetProcedureAddress,SetUnhandledExceptionFilter,FindWindowA,...,GetUserNameW,LdrLoadDll,LdrGetProcedureAddress,DrawTextExW,GetSystemMetrics,NtAllocateVirtualMemory,NtDuplicateObject,DrawTextExW,5a99618b63178d7a221552fe962992e3,trojan
4122,1,RegOpenKeyExA,NtOpenKey,NtQueryValueKey,NtClose,NtOpenKey,NtQueryValueKey,NtClose,NtQueryAttributesFile,LoadStringA,...,NtCreateFile,NtCreateSection,NtMapViewOfSection,NtClose,GetSystemMetrics,NtAllocateVirtualMemory,CreateActCtxW,GetSystemWindowsDirectoryW,10a935e723a4b1cc416adb7af2bc4965,trojan


Unnamed: 0,malware,t_0,t_1,t_2,t_3,t_4,t_5,t_6,t_7,t_8,...,t_92,t_93,t_94,t_95,t_96,t_97,t_98,t_99,hash,type
0,1,LdrLoadDll,LdrGetProcedureAddress,NtProtectVirtualMemory,NtClose,NtOpenKey,NtQueryValueKey,LdrGetDllHandle,GetSystemInfo,NtAllocateVirtualMemory,...,,,,,,,,,0fe987c56cfb02db5d810534d6098d93,trojan
1,1,GetSystemTimeAsFileTime,GetSystemInfo,NtCreateMutant,NtOpenKey,NtOpenKeyEx,LdrLoadDll,LdrGetProcedureAddress,RegOpenKeyExW,RegQueryInfoKeyW,...,,,,,,,,,f58d31adac5b879b50ce07a9da086736,trojan
2,1,NtClose,NtAllocateVirtualMemory,NtProtectVirtualMemory,LdrGetProcedureAddress,LdrLoadDll,NtDelayExecution,NtFreeVirtualMemory,RegOpenKeyExW,RegQueryValueExW,...,,,,,,,,,f07d9fa9d2852bd4b7b36f39dd531b4a,pua
3,1,NtDelayExecution,NtAllocateVirtualMemory,RegOpenKeyExW,RegQueryValueExW,RegCloseKey,NtOpenFile,NtQueryInformationFile,NtClose,LdrGetDllHandle,...,,,,,,,,,7a493fa07f0f7d0c9e372eedae03036b,ransomware
4,1,GetSystemTimeAsFileTime,LdrLoadDll,LdrGetProcedureAddress,LdrGetDllHandle,SetUnhandledExceptionFilter,CryptAcquireContextW,NtOpenKey,NtQueryValueKey,NtClose,...,,,,,,,,,ba60236d9f9fe6cd0a10ffbbf2296669,trojan
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4119,1,NtAllocateVirtualMemory,NtProtectVirtualMemory,LdrLoadDll,LdrGetProcedureAddress,GetSystemTimeAsFileTime,RegCreateKeyExA,RegSetValueExA,RegCloseKey,LdrGetDllHandle,...,,,,,,,,,b2e6e058b25a25175f4f2d264f0bb83c,trojan
4120,1,GetSystemTimeAsFileTime,NtAllocateVirtualMemory,LdrGetDllHandle,LdrGetProcedureAddress,SetUnhandledExceptionFilter,OleInitialize,SetErrorMode,FindFirstFileExW,NtClose,...,,,,,,,,,e768ea96da507ed239ce925c24b8fd1a,trojan
4121,1,GetSystemTimeAsFileTime,LdrLoadDll,LdrGetProcedureAddress,GetFileType,SetUnhandledExceptionFilter,FindWindowA,NtProtectVirtualMemory,NtAllocateVirtualMemory,GetSystemWindowsDirectoryW,...,,,,,,,,,5a99618b63178d7a221552fe962992e3,trojan
4122,1,RegOpenKeyExA,NtOpenKey,NtQueryValueKey,NtClose,NtQueryAttributesFile,LoadStringA,NtAllocateVirtualMemory,LdrGetDllHandle,LdrGetProcedureAddress,...,,,,,,,,,10a935e723a4b1cc416adb7af2bc4965,trojan


**Battle Chart:**

**GBDT vs GBDT**
- LGBM TB vs CatBoost TB
- LGBM IB vs CatBoost IB
- Tuned LGBM TB vs Tuned CatBoost TB
- Tuned LGBM IB vs Tuned CatBoost IB

**Default vs Tuned**
- LGBM TB vs Tuned LGBM TB
- LGBM IB vs Tuned LGBM IB
- CatBoost TB vs Tuned CatBoost TB
- CatBoost IB vs Tuned CatBoost IB

**TB vs IB**
- LGBM TB vs LGBM IB
- Tuned LGBM TB vs Tuned LGBM IB
- CatBoost TB vs CatBoost IB
- Tuned CatBoost TB vs Tuned CatBoost IB

In [4]:
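def mcnemar_test(model1, model2, dataset1, dataset2):
    """Run McNemar's test on the predictions of two models and print the results.

    Relies on the global y_target; both datasets must list the samples in the same order.
    """
    # Predict on positional columns 1..100 of each dataset.
    y_pred1 = model1.predict(dataset1.iloc[:,1:101])
    y_pred2 = model2.predict(dataset2.iloc[:,1:101])
    # 2x2 table of agreement between the two models with respect to y_target.
    table = mlxtend.mcnemar_table(y_target,y_pred1,y_pred2)
    display(table)
    # Chi-squared approximation without continuity correction (statsmodels).
    print("statsmodels.mcnemar:")
    print(statsmodels.mcnemar(table, exact=False, correction=False))
    # Same test via mlxtend; the two results should match.
    chi2, p = mlxtend.mcnemar(table, exact=False, corrected=False)
    print("\nmlxtend.mcnemar (sanity check):")
    print(f"pvalue:\t{p}\nchi2:\t{chi2}\n")
    print("")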
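
The mlxtend user guide referenced above suggests an exact binomial test when the discordant counts are small (roughly `b + c < 25`), which is the case for most tables in this notebook. The sketch below shows one way to compute that two-sided exact p-value directly; `mcnemar_exact_p` is a hypothetical helper, and its result is assumed (not verified here) to correspond to passing `exact=True` to either library call.

```python
from scipy.stats import binom


def mcnemar_exact_p(b, c):
    """Two-sided exact binomial p-value for McNemar's test from the discordant counts."""
    n = b + c
    if n == 0:
        # No discordant samples, so there is no evidence of a difference.
        return 1.0
    # Under the null hypothesis, each discordant sample falls in b or c with probability 0.5.
    p_value = 2.0 * binom.cdf(min(b, c), n, 0.5)
    return float(min(p_value, 1.0))


# Discordant counts of Comparison 1 (b = 6, c = 14).
print(mcnemar_exact_p(6, 14))
```

The exact p-value is generally more conservative than the uncorrected chi-squared approximation, so borderline results may change when it is used.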

In [5]:
print('COMPARISON 1: DEFAULT LGBM TB vs DEFAULT CATBOOST TB\n')
lgbm_tb = load('../GBDT_Training/Outputs/LGBM/Default/RYZEN3b_LGBM_TB.model')
catb_tb = catb.CatBoostClassifier().load_model("../GBDT_Training/Outputs/CATB/Default/RYZEN3b_CATB_TB.model", format='json')

mcnemar_test(lgbm_tb, catb_tb, DF_LGBM_TB, DF_CATB_TB)

COMPARISON 1: DEFAULT LGBM TB vs DEFAULT CATBOOST TB



array([[4068,    6],
       [  14,   36]])

statsmodels.mcnemar:
pvalue      0.07363827012030257
statistic   3.2

mlxtend.mcnemar (sanity check):
pvalue:	0.07363827012030257
chi2:	3.2




In [6]:
print('COMPARISON 2: DEFAULT LGBM IB vs DEFAULT CATBOOST IB\n')
lgbm_ib = load('../GBDT_Training/Outputs/LGBM/Default/RYZEN3b_LGBM_IB.model')
catb_ib = catb.CatBoostClassifier().load_model("../GBDT_Training/Outputs/CATB/Default/RYZEN3b_CATB_IB.model", format='json')

mcnemar_test(lgbm_ib, catb_ib, DF_LGBM_IB, DF_CATB_IB)

COMPARISON 2: DEFAULT LGBM IB vs DEFAULT CATBOOST IB



array([[4075,    6],
       [   9,   34]])

statsmodels.mcnemar:
pvalue      0.4385780260809997
statistic   0.6

mlxtend.mcnemar (sanity check):
pvalue:	0.4385780260809997
chi2:	0.6




In [7]:
print('COMPARISON 3: TUNED LGBM TB vs TUNED CATBOOST TB\n')
lgbm_tb = load('../GBDT_Training/Outputs/LGBM/Tuned/TUNED_RYZEN3b_LGBM_TB.model')
catb_tb = catb.CatBoostClassifier().load_model("../GBDT_Training/Outputs/CATB/Tuned/TUNED_RYZEN3b_CATB_TB.model", format='json')

mcnemar_test(lgbm_tb, catb_tb, DF_LGBM_TB, DF_CATB_TB)

COMPARISON 3: TUNED LGBM TB vs TUNED CATBOOST TB



array([[4069,    8],
       [  11,   36]])

statsmodels.mcnemar:
pvalue      0.4912971242158931
statistic   0.47368421052631576

mlxtend.mcnemar (sanity check):
pvalue:	0.4912971242158931
chi2:	0.47368421052631576




In [8]:
print('COMPARISON 4: TUNED LGBM IB vs TUNED CATBOOST IB\n')
lgbm_ib = load('../GBDT_Training/Outputs/LGBM/Tuned/TUNED_RYZEN3b_LGBM_IB.model')
catb_ib = catb.CatBoostClassifier().load_model("../GBDT_Training/Outputs/CATB/Tuned/TUNED_RYZEN3b_CATB_IB.model", format='json')

mcnemar_test(lgbm_ib, catb_ib, DF_LGBM_IB, DF_CATB_IB)

COMPARISON 4: TUNED LGBM IB vs TUNED CATBOOST IB



array([[4077,   10],
       [   7,   30]])

statsmodels.mcnemar:
pvalue      0.46685427082272524
statistic   0.5294117647058824

mlxtend.mcnemar (sanity check):
pvalue:	0.46685427082272524
chi2:	0.5294117647058824




In [9]:
print('COMPARISON 5: DEFAULT LGBM TB vs TUNED LGBM TB\n')
default_tb = load('../GBDT_Training/Outputs/LGBM/Default/RYZEN3b_LGBM_TB.model')
tuned_tb = load('../GBDT_Training/Outputs/LGBM/Tuned/TUNED_RYZEN3b_LGBM_TB.model')

mcnemar_test(default_tb, tuned_tb, DF_LGBM_TB, DF_LGBM_TB)

COMPARISON 5: DEFAULT LGBM TB vs TUNED LGBM TB



array([[4072,    2],
       [   5,   45]])

statsmodels.mcnemar:
pvalue      0.25683925795785334
statistic   1.2857142857142858

mlxtend.mcnemar (sanity check):
pvalue:	0.25683925795785334
chi2:	1.2857142857142858




In [10]:
print('COMPARISON 6: DEFAULT LGBM IB vs TUNED LGBM IB\n')
default_ib = load('../GBDT_Training/Outputs/LGBM/Default/RYZEN3b_LGBM_IB.model')
tuned_ib = load('../GBDT_Training/Outputs/LGBM/Tuned/TUNED_RYZEN3b_LGBM_IB.model')

mcnemar_test(default_ib, tuned_ib, DF_LGBM_IB, DF_LGBM_IB)

COMPARISON 6: DEFAULT LGBM IB vs TUNED LGBM IB



array([[4079,    2],
       [   8,   35]])

statsmodels.mcnemar:
pvalue      0.05777957112359715
statistic   3.6

mlxtend.mcnemar (sanity check):
pvalue:	0.05777957112359715
chi2:	3.6




In [11]:
print('COMPARISON 7: DEFAULT CATBOOST TB vs TUNED CATBOOST TB\n')
default_tb = catb.CatBoostClassifier().load_model("../GBDT_Training/Outputs/CATB/Default/RYZEN3b_CATB_TB.model", format='json')
tuned_tb = catb.CatBoostClassifier().load_model("../GBDT_Training/Outputs/CATB/Tuned/TUNED_RYZEN3b_CATB_TB.model", format='json')

mcnemar_test(default_tb, tuned_tb, DF_CATB_TB, DF_CATB_TB)

COMPARISON 7: DEFAULT CATBOOST TB vs TUNED CATBOOST TB



array([[4074,    8],
       [   6,   36]])

statsmodels.mcnemar:
pvalue      0.5929800980174266
statistic   0.2857142857142857

mlxtend.mcnemar (sanity check):
pvalue:	0.5929800980174266
chi2:	0.2857142857142857




In [12]:
print('COMPARISON 8: DEFAULT CATBOOST IB vs TUNED CATBOOST IB\n')
default_ib = catb.CatBoostClassifier().load_model("../GBDT_Training/Outputs/CATB/Default/RYZEN3b_CATB_IB.model", format='json')
tuned_ib = catb.CatBoostClassifier().load_model("../GBDT_Training/Outputs/CATB/Tuned/TUNED_RYZEN3b_CATB_IB.model", format='json')

mcnemar_test(default_ib, tuned_ib, DF_CATB_IB, DF_CATB_IB)

COMPARISON 8: DEFAULT CATBOOST IB vs TUNED CATBOOST IB



array([[4080,    4],
       [   4,   36]])

statsmodels.mcnemar:
pvalue      1.0
statistic   0.0

mlxtend.mcnemar (sanity check):
pvalue:	1.0
chi2:	0.0




In [13]:
print('COMPARISON 9: DEFAULT LIGHTGBM TB vs DEFAULT LIGHTGBM IB\n')
default_tb = load('../GBDT_Training/Outputs/LGBM/Default/RYZEN3b_LGBM_TB.model')
default_ib = load('../GBDT_Training/Outputs/LGBM/Default/RYZEN3b_LGBM_IB.model')

mcnemar_test(default_tb, default_ib, DF_LGBM_TB, DF_LGBM_IB)

COMPARISON 9: DEFAULT LIGHTGBM TB vs DEFAULT LIGHTGBM IB



array([[4069,    5],
       [  12,   38]])

statsmodels.mcnemar:
pvalue      0.08955507441364248
statistic   2.8823529411764706

mlxtend.mcnemar (sanity check):
pvalue:	0.08955507441364248
chi2:	2.8823529411764706




In [14]:
print('COMPARISON 10: TUNED LIGHTGBM TB vs TUNED LIGHTGBM IB\n')
tuned_tb = load('../GBDT_Training/Outputs/LGBM/Tuned/TUNED_RYZEN3b_LGBM_TB.model')
tuned_ib = load('../GBDT_Training/Outputs/LGBM/Tuned/TUNED_RYZEN3b_LGBM_IB.model')

mcnemar_test(tuned_tb, tuned_ib, DF_LGBM_TB, DF_LGBM_IB)

COMPARISON 10: TUNED LIGHTGBM TB vs TUNED LIGHTGBM IB



array([[4073,    4],
       [  14,   33]])

statsmodels.mcnemar:
pvalue      0.01842212545409897
statistic   5.555555555555555

mlxtend.mcnemar (sanity check):
pvalue:	0.01842212545409897
chi2:	5.555555555555555




In [15]:
print('COMPARISON 11: DEFAULT CATBOOST TB vs DEFAULT CATBOOST IB\n')
default_tb = catb.CatBoostClassifier().load_model("../GBDT_Training/Outputs/CATB/Default/RYZEN3b_CATB_TB.model", format='json')
default_ib = catb.CatBoostClassifier().load_model("../GBDT_Training/Outputs/CATB/Default/RYZEN3b_CATB_IB.model", format='json')

mcnemar_test(default_tb, default_ib, DF_CATB_TB, DF_CATB_IB)

COMPARISON 11: DEFAULT CATBOOST TB vs DEFAULT CATBOOST IB



array([[4072,   10],
       [  12,   30]])

statsmodels.mcnemar:
pvalue      0.6698153575994165
statistic   0.18181818181818182

mlxtend.mcnemar (sanity check):
pvalue:	0.6698153575994165
chi2:	0.18181818181818182




In [16]:
print('COMPARISON 12: TUNED CATBOOST TB vs TUNED CATBOOST IB\n')
tuned_tb = catb.CatBoostClassifier().load_model("../GBDT_Training/Outputs/CATB/Tuned/TUNED_RYZEN3b_CATB_TB.model", format='json')
tuned_ib = catb.CatBoostClassifier().load_model("../GBDT_Training/Outputs/CATB/Tuned/TUNED_RYZEN3b_CATB_IB.model", format='json')

mcnemar_test(tuned_tb, tuned_ib, DF_CATB_TB, DF_CATB_IB)

COMPARISON 12: TUNED CATBOOST TB vs TUNED CATBOOST IB



array([[4067,   13],
       [  17,   27]])

statsmodels.mcnemar:
pvalue      0.4652088184521417
statistic   0.5333333333333333

mlxtend.mcnemar (sanity check):
pvalue:	0.4652088184521417
chi2:	0.5333333333333333
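
To take note of the results in one place, the sketch below applies the decision rule from the top of the notebook to the twelve p-values printed above. The values are copied from the outputs; the names `ALPHA`, `p_values`, and `is_significant` are illustrative, and no correction for running multiple tests is applied.

```python
ALPHA = 0.05  # significance level assumed at the top of this notebook

# p-values copied from the outputs above, keyed by comparison number.
p_values = {
    1: 0.07363827012030257,
    2: 0.4385780260809997,
    3: 0.4912971242158931,
    4: 0.46685427082272524,
    5: 0.25683925795785334,
    6: 0.05777957112359715,
    7: 0.5929800980174266,
    8: 1.0,
    9: 0.08955507441364248,
    10: 0.01842212545409897,
    11: 0.6698153575994165,
    12: 0.4652088184521417,
}


def is_significant(p_value, alpha=ALPHA):
    """Reject the null hypothesis of no difference when the p-value is below alpha."""
    return p_value < alpha


significant = [number for number, p in p_values.items() if is_significant(p)]
print("Comparisons with a significant difference:", significant)
```

Under these assumptions, only Comparison 10 (Tuned LightGBM TB vs Tuned LightGBM IB) falls below the 0.05 threshold.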


