# Hyperparameter experiment - effect of the number of samples on the KME and Wishart test

In this experiment I test how much the sample size per variable influences the scores of the KME and the Wishart test. This is an interesting experiment because the number of samples is one of the few settings that can affect the Wishart test. For this experiment, 10 test datasets were used per combination of hyperparameter settings. It was assumed that giving the KME the same sample size during training and testing would give the best results.

The Wishart test clearly performs better when given more samples per variable. Interestingly, the improvement comes from better results on cases that are not t-separated: at low sample sizes, the Wishart test too easily classifies a tetrad as t-separated.
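To make this concrete, the sketch below recomputes accuracy, false positive rate and recall from the mean confusion counts reported in the linear Wishart overview of this notebook (rows with b = 0, d = 0). It assumes that "positive" means the test classified the tetrad as t-separated; the helper name `rates` is only illustrative.

```python
# Mean confusion counts copied from the linear Wishart overview (b = 0, d = 0).
# Assumption: "positive" means the tetrad was classified as t-separated.
counts = {
    200:   {"trueneg": 252.1, "falseneg": 5.1, "truepos": 594.9, "falsepos": 632.9},
    500:   {"trueneg": 506.8, "falseneg": 5.1, "truepos": 594.9, "falsepos": 378.2},
    1000:  {"trueneg": 640.7, "falseneg": 4.0, "truepos": 596.0, "falsepos": 244.3},
    2000:  {"trueneg": 736.9, "falseneg": 3.3, "truepos": 596.7, "falsepos": 148.1},
    10000: {"trueneg": 804.0, "falseneg": 4.0, "truepos": 596.0, "falsepos": 81.0},
}


def rates(c):
    """Derive accuracy, false positive rate and recall from confusion counts."""
    total = sum(c.values())
    return {
        "accuracy": (c["trueneg"] + c["truepos"]) / total,
        # Share of truly non-t-separated cases that were labelled t-separated.
        "false_positive_rate": c["falsepos"] / (c["falsepos"] + c["trueneg"]),
        # Share of truly t-separated cases that were recognised as such.
        "recall": c["truepos"] / (c["truepos"] + c["falseneg"]),
    }


summary = {n: rates(c) for n, c in counts.items()}
for n, r in summary.items():
    print(f"{n:>6}  acc={r['accuracy']:.3f}  "
          f"fpr={r['false_positive_rate']:.3f}  recall={r['recall']:.3f}")
```

Under this reading, recall stays around 0.99 at every sample size, while the false positive rate drops from roughly 0.72 at 200 samples to roughly 0.09 at 10000 samples, which is where the gain in accuracy comes from.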

In [1]:
import pandas as pd
import numpy as np
import tabulate  # needed by pandas' DataFrame.to_markdown(), which is used below

ModuleNotFoundError: No module named 'tabulate'

In [2]:
KME_results = pd.read_csv('experiment_sample_size_results.csv')
KME_results2 = pd.read_csv('experiment_large_sample_trees_weights_results.csv')
wishart_results = pd.read_csv('wishart_experiment_samplesize_results.csv')

In [3]:
wish_res_nonlin = wishart_results[wishart_results['linear'] == False]  
wish_res_lin = wishart_results[wishart_results['linear'] == True]

Overview of the Wishart test results for the linear case.

In [23]:
wish_lin_mean = wish_res_lin.groupby(['b','d','nsamples']).mean()
wish_lin_mean.insert(2, 'acc_std', wish_res_lin.groupby(['b','d','nsamples'])[['accuracy']].std())
print(wish_lin_mean.reset_index().to_markdown())

|    |    b |   d |   nsamples | linear   |   accuracy |    acc_std |   trueneg |   falseneg |   truepos |   falsepos |
|---:|-----:|----:|-----------:|:---------|-----------:|-----------:|----------:|-----------:|----------:|-----------:|
|  0 | 0    |   0 |        200 | True     |   0.57037  | 0.0317539  |     252.1 |        5.1 |     594.9 |      632.9 |
|  1 | 0    |   0 |        500 | True     |   0.741886 | 0.0356175  |     506.8 |        5.1 |     594.9 |      378.2 |
|  2 | 0    |   0 |       1000 | True     |   0.832795 | 0.015716   |     640.7 |        4   |     596   |      244.3 |
|  3 | 0    |   0 |       2000 | True     |   0.898047 | 0.00941798 |     736.9 |        3.3 |     596.7 |      148.1 |
|  4 | 0    |   0 |      10000 | True     |   0.942761 | 0.0336506  |     804   |        4   |     596   |       81   |
|  5 | 0.05 |   0 |        200 | True     |   0.577037 | 0.0430888  |     260.9 |        4   |     596   |      624.1 |
|  6 | 0.05 |   0 |        500 | True   

Overview of the Wishart test results for the nonlinear case.

In [22]:
wish_nonlin_mean = wish_res_nonlin.groupby(['b','d','nsamples']).mean()
wish_nonlin_mean.insert(2, 'acc_std', wish_res_nonlin.groupby(['b','d','nsamples'])[['accuracy']].std())
print(wish_nonlin_mean.reset_index().to_markdown())

|    |    b |    d |   nsamples | linear   |   accuracy |   acc_std |   trueneg |   falseneg |   truepos |   falsepos |
|---:|-----:|-----:|-----------:|:---------|-----------:|----------:|----------:|-----------:|----------:|-----------:|
|  0 | 0.01 | 0.01 |        200 | False    |   0.563165 | 0.026451  |     240   |        3.7 |     596.3 |      645   |
|  1 | 0.01 | 0.01 |        500 | False    |   0.73899  | 0.0267751 |     502.4 |        5   |     595   |      382.6 |
|  2 | 0.01 | 0.01 |       1000 | False    |   0.824983 | 0.0236458 |     631.8 |        6.7 |     593.3 |      253.2 |
|  3 | 0.01 | 0.01 |       2000 | False    |   0.895623 | 0.0116377 |     734.4 |        4.4 |     595.6 |      150.6 |
|  4 | 0.01 | 0.01 |      10000 | False    |   0.9567   | 0.0199461 |     825.1 |        4.4 |     595.6 |       59.9 |
|  5 | 0.01 | 0.05 |        200 | False    |   0.587677 | 0.0407839 |     279.8 |        7.1 |     592.9 |      605.2 |
|  6 | 0.01 | 0.05 |        500 | False 

Results per sample size for the Wishart test and the KME, for b = 0.05 and d = 0.05.

In [None]:
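# Select the nonlinear runs with b = 0.05 and d = 0.05.
wish_sel = wish_res_nonlin[(wish_res_nonlin['b'] == 0.05) & (wish_res_nonlin['d'] == 0.05)]
wish_mean = wish_sel.groupby(['nsamples']).mean()
# Take the standard deviation over the individual runs, not over the already averaged rows.
wish_mean.insert(4, 'acc_std', wish_sel.groupby(['nsamples'])['accuracy'].std())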
wish_mean['accuracy'] = wish_mean['accuracy'] * 100
wish_mean['acc_std'] = wish_mean['acc_std'] * 100
print(wish_mean.to_markdown())
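The spread per sample size has to be computed from the individual test datasets: once the runs are averaged, only one value per sample size remains and a sample standard deviation is no longer defined (pandas returns NaN for a group of size one). A minimal dependency-free sketch of the intended aggregation, using made-up accuracies rather than values from the result files:

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical (nsamples, accuracy) pairs, one per test dataset.
# These numbers are illustrative only; they are not taken from the result files.
runs = [
    (200, 0.55), (200, 0.59), (200, 0.57),
    (500, 0.72), (500, 0.76), (500, 0.74),
]

# Group the per-dataset accuracies by sample size.
by_size = defaultdict(list)
for n, acc in runs:
    by_size[n].append(acc)

# statistics.stdev is the sample standard deviation (n - 1 in the denominator),
# the same convention pandas uses by default (ddof=1).
summary = {
    n: {"accuracy": 100 * mean(accs), "acc_std": 100 * stdev(accs)}
    for n, accs in by_size.items()
}

for n, row in sorted(summary.items()):
    print(f"{n:>5}  accuracy={row['accuracy']:.1f}%  acc_std={row['acc_std']:.1f}%")
```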

In [31]:
KME_mean = KME_results[(KME_results['b'] == 0.05) & (KME_results['d'] == 0.05)].groupby(['n_samples']).mean()
KME_mean = KME_mean.rename(columns={'var_score':'stdev_score'})
KME_mean['best_score'] = KME_mean['best_score'] * 100
print(KME_mean.to_markdown())

|   n_samples |    b |    d |   KME |    E |   K |   n_distributions |   best_score |   mean_score |   stdev_score |
|------------:|-----:|-----:|------:|-----:|----:|------------------:|-------------:|-------------:|--------------:|
|         200 | 0.05 | 0.05 |     4 | 1000 | 400 |              4000 |      81.6162 |      74.8687 |       4.12293 |
|         500 | 0.05 | 0.05 |     4 | 1000 | 400 |              4000 |      82.6936 |      80.3098 |       1.38242 |
|        1000 | 0.05 | 0.05 |     4 | 1000 | 400 |              4000 |      81.4141 |      79.0707 |       1.2619  |
|        2000 | 0.05 | 0.05 |     4 | 1000 | 400 |              4000 |      84.3771 |      79.9529 |       3.67877 |


In [34]:
KME_results2.groupby(['b']).mean()

|    b |    d |   KME |   E |   K |   n_samples |   n_distributions |   score |   trueneg |   falseneg |   truepos |   falsepos |
|-----:|-----:|------:|----:|----:|------------:|------------------:|--------:|----------:|-----------:|----------:|-----------:|
| 0.05 | 0.05 |     4 | 500 | 400 |        2000 |              4000 | 0.80835 |     824.8 |      224.4 |     375.6 |       60.2 |


In [33]:
print(KME_results2.to_markdown())

|    | train_lin   |    b |    d |   KME |   E |   K |   n_samples |   n_distributions |    score |   trueneg |   falseneg |   truepos |   falsepos |
|---:|:------------|-----:|-----:|------:|----:|----:|------------:|------------------:|---------:|----------:|-----------:|----------:|-----------:|
|  0 | [False]     | 0.05 | 0.05 |     4 | 500 | 400 |        2000 |              4000 | 0.816835 |       843 |        230 |       370 |         42 |
|  1 | [False]     | 0.05 | 0.05 |     4 | 500 | 400 |        2000 |              4000 | 0.823569 |       848 |        225 |       375 |         37 |
|  2 | [False]     | 0.05 | 0.05 |     4 | 500 | 400 |        2000 |              4000 | 0.783165 |       809 |        246 |       354 |         76 |
|  3 | [False]     | 0.05 | 0.05 |     4 | 500 | 400 |        2000 |              4000 | 0.847138 |       870 |        212 |       388 |         15 |
|  4 | [False]     | 0.05 | 0.05 |     4 | 500 | 400 |        2000 |              4000 | 0.826263 | 