# Stats analysis

This notebook performins a one-way analysis of variance (ANOVA) or a Kruskal-Wallis test on the classification accuracies of the `WANG2016` dataset. 
- If the classification data if found to have independence of observations, be normally distributed, and have homogeneity of variances, the one-way ANOVA was performed. 
- Otherwise, the Kruskal-Wallis test will take place. 

The interest of doing the one-way ANOVA is to test the null hypothesis ($\text{H}_0$) that one of the groups is different from the other. Our expectation is that we will reject the null hypothesis. Thus, implying that the proposed classifiers of the SSVEP toolbox perform as well as the Riemmanian geometry classifier suggested by the MOABB tool. However, if the null hypothesis is not rejected, then a post-hoc test with a Bonferroni correction for multiple comparisons were used to compare the classifiers.

## Import libraries

In [1]:
import numpy as np
import pandas as pd
from scipy.stats import shapiro, f_oneway, kruskal
from statsmodels.stats.multitest import multipletests
from pingouin import sphericity
import matplotlib.pyplot as plt
import scikit_posthocs as sp

## Import data

In [2]:
results_file = "results_ncan_moabb.csv"

data = pd.read_csv(results_file)
data.set_index('Subject', inplace=True)

# Normality and variance test

Do a Shapiro-Wilk test for normality and a Mauchly's test for sphericity to check the variance of the data

In [3]:
def check_normality(data):
    stat, pvalue = shapiro(data)
    return pvalue >= 0.05

# Mauchly's test for sphericity
def check_sphericity(data):
    result = sphericity(
        data,
        method='mauchly',
        alpha=0.05
        )
    return result

# Check normality and sphericity for each column
normality_passed = all(check_normality(data[col]) for col in data.columns)
sphericity_passed = check_sphericity(data).spher

# Statistical tests

If normality and sphericity test have passed, perform a one-way ANOVA. Otherwise, perform a Kruskal-Wallis test

In [4]:
if (normality_passed and sphericity_passed):
    print("Data is normal and spheric")
    stat, pvalue = f_oneway(*[data[col] for col in data.columns])
    print(f"- One-way ANOVA p-value: {pvalue}")
else:
    print("Data is not normal or not spheric")
    stat, pvalue = kruskal(*[data[col] for col in data.columns])
    print(f"- Kruskal-Wallis p-value: {pvalue}")

Data is not normal or not spheric
- Kruskal-Wallis p-value: 2.2164269995166644e-07


## Multiple test comparison

Do a Bonferroni correction to do a multiple test comparison

In [5]:
from scipy.stats import f_oneway, friedmanchisquare
stats = []
pvalues = []
if normality_passed and sphericity_passed:
    # Perform repeated measures ANOVA
    stats.append, pvalue.append = f_oneway(*[data[col] for col in data.columns])
    print(f"Repeated Measures ANOVA p-value: {pvalue}")
else:
    # Perform Friedman's test
    stats.append(friedmanchisquare(*[data[col] for col in data.columns])[0])
    pvalues.append(friedmanchisquare(*[data[col] for col in data.columns])[1])
    print(f"Friedman's test p-value: {pvalue}")


Friedman's test p-value: 2.2164269995166644e-07


## Posthoc tests

Got suggestion for this processing [here](https://scikit-posthocs.readthedocs.io/en/latest/generated/scikit_posthocs.posthoc_nemenyi_friedman.html). Look at the first paper reference.


In [38]:
from pingouin import anova, rm_anova

if normality_passed and sphericity_passed:
    # Perform repeated measures ANOVA
    long_format_df = data.reset_index().melt(
        id_vars='Subject',
        var_name='Classifier',
        value_name='Accuracy'
        )
    multiple_comparison_results = sp.posthoc_tukey_hsd(
        long_format_df["Accuracy"],
        long_format_df["Classifier"],
        alpha=0.05
        )
    print("Tukey Honestly Significant Difference:")
else:
    multiple_comparison_results = sp.posthoc_nemenyi_friedman(data)
    print(f"Nemenyi Friedman test:")

print(multiple_comparison_results)

Nemenyi Friedman test:
              fbCCA       MSI    MEC  RG_logreg
fbCCA      1.000000  0.180649  0.001      0.001
MSI        0.180649  1.000000  0.001      0.001
MEC        0.001000  0.001000  1.000      0.900
RG_logreg  0.001000  0.001000  0.900      1.000
