# 09 Non-Parametric Tests - Task 2

Does the testStimulus (independent variable) have a significant influence on
speech quality ratings (dependent variable)? If yes, for which cases?

Please assume that each of the six files was assessed by a different set of test participants and only use the ratings of the first repetition.
Use the quality ratings provided in the dataset “speech_quality_repetition_dataset”.

In [22]:
import numpy as np
import pandas as pd
import scipy

# !pip install pingouin
import pingouin as pg

import matplotlib.pyplot as plt
import seaborn as sns
sns.set(style="whitegrid", context="talk")
cm = sns.diverging_palette(127, 14, s=99, l=55, as_cmap=True)

FIGSIZE = (20,4)

### Loading the data

In [23]:
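df = pd.read_csv("../datasets/DB03_speech_quality_repetition_dataset.csv")

# Keep only the ratings from the first repetition, as the task requires
mask = df["repetition"] == 1
df = df.loc[mask]
# List the six test stimuli (the levels of the independent variable)
print(df["testStimulus"].unique())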

df

['haus_m_700_bpf_200_2800_normAsl_-26' 'haus_m_700_mnru_Q_14_normAsl_-26'
 'haus_m_700_normAsl_-26' 'maus_m_700_bpf_200_2800_normAsl_-26'
 'maus_m_700_mnru_Q_14_normAsl_-26' 'maus_m_700_normAsl_-26']


| | subjectCode | testStimulus | repetition | rating |
|---|---|---|---|---|
| 0 | vp01 | haus_m_700_bpf_200_2800_normAsl_-26 | 1 | 2 |
| 4 | vp02 | haus_m_700_bpf_200_2800_normAsl_-26 | 1 | 2 |
| 8 | vp03 | haus_m_700_bpf_200_2800_normAsl_-26 | 1 | 3 |
| 12 | vp04 | haus_m_700_bpf_200_2800_normAsl_-26 | 1 | 3 |
| 16 | vp05 | haus_m_700_bpf_200_2800_normAsl_-26 | 1 | 2 |
| ... | ... | ... | ... | ... |
| 868 | vp33 | maus_m_700_normAsl_-26 | 1 | 4 |
| 872 | vp34 | maus_m_700_normAsl_-26 | 1 | 5 |
| 876 | vp35 | maus_m_700_normAsl_-26 | 1 | 5 |
| 880 | vp36 | maus_m_700_normAsl_-26 | 1 | 5 |
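
Because the ratings are ordinal, a per-stimulus count and median is a useful sanity check before running any test. The following is a minimal sketch on a small made-up DataFrame: the `toy` data and the `clean` / `bandpass` labels are hypothetical and not taken from the dataset. It uses the same column names as above, and each stimulus is rated by its own set of participants, mirroring the assumption stated in the task.

```python
import pandas as pd

# Hypothetical toy data with the same columns as the real dataset.
# Each stimulus has its own participants (independent groups).
toy = pd.DataFrame({
    "subjectCode":  ["vp01", "vp01", "vp02", "vp02", "vp03", "vp03",
                     "vp04", "vp04", "vp05", "vp05", "vp06", "vp06"],
    "testStimulus": ["clean"] * 6 + ["bandpass"] * 6,
    "repetition":   [1, 2] * 6,
    "rating":       [5, 4, 4, 4, 5, 5, 2, 3, 3, 2, 2, 2],
})

# Keep the first repetition only, as in the cell above
first = toy.loc[toy["repetition"] == 1]

# Group size and median rating per stimulus
summary = first.groupby("testStimulus")["rating"].agg(n="count", median="median")
print(summary)
```

Equal group sizes and clearly separated medians do not replace the test, but they make the later results easier to interpret.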


## Non-parametric Tests
![flowchart_nonparametric](https://pingouin-stats.org/_images/flowchart_nonparametric.svg)
Source: https://pingouin-stats.org/guidelines.html#id7

### Kruskal-Wallis Test

In [28]:
df.groupby("testStimulus").ngroups

6

In [24]:
result = pg.kruskal(df, dv="rating", between="testStimulus")
result.style.background_gradient(cmap=cm, subset=["p-unc"])

| | Source | ddof1 | H | p-unc |
|---|---|---|---|---|
| Kruskal | testStimulus | 5 | 130.897 | 0.0 |


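With H = 130.897 on 5 degrees of freedom, the p-value is so small that it is displayed as 0.0, so `testStimulus` has a significant influence on the ratings. The columns of the result are:

- `Source` : Name of the between-subjects factor
- `ddof1` : Degrees of freedom (number of groups minus one)
- `H` : The Kruskal-Wallis H statistic, corrected for ties
- `p-unc` : Uncorrected p-value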

Source: https://pingouin-stats.org/generated/pingouin.kruskal.html
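
To make the phrase "corrected for ties" concrete, here is a minimal pure-Python sketch of the textbook H statistic. The helper names `average_ranks` and `kruskal_h` and the toy ratings are hypothetical; this is not pingouin's implementation, only the standard formula written out so the arithmetic is visible. All values are ranked jointly (ties receive the mean of their positions), H is computed from the per-group rank sums, and the result is divided by a correction factor that depends on the tie counts.

```python
from collections import Counter


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_h(groups):
    """Tie-corrected Kruskal-Wallis H and its degrees of freedom.

    Assumes the pooled data contain at least two distinct values.
    """
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)

    # Sum of (rank sum squared / group size) over all groups
    acc, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        acc += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n_total * (n_total + 1)) * acc - 3 * (n_total + 1)

    # Tie correction: 1 - sum(t^3 - t) / (N^3 - N), t = size of each tie group
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - ties / (n_total ** 3 - n_total)
    return h / correction, len(groups) - 1


# Hypothetical 5-point ratings for three stimuli
ratings_by_stimulus = [
    [2, 2, 3, 3, 2],
    [1, 2, 2, 3, 2],
    [4, 5, 5, 5, 4],
]
h, dof = kruskal_h(ratings_by_stimulus)
print(f"H = {h:.3f}, dof = {dof}")
```

With 5-point ratings almost every value is tied, so the correction factor is well below 1 and noticeably increases H compared with the uncorrected formula.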

In [25]:
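# Post-hoc Mann-Whitney U tests (parametric=False, unpaired) with Bonferroni correction.
# Note: newer pingouin releases name this function pg.pairwise_tests.
result = pg.pairwise_ttests(df, dv="rating", between="testStimulus", parametric=False, padjust='bonf')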
result.style.background_gradient(cmap=cm, subset=["p-corr"])

| | Contrast | A | B | Paired | Parametric | U-val | Tail | p-unc | p-corr | p-adjust | hedges |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | testStimulus | haus_m_700_bpf_200_2800_normAsl_-26 | haus_m_700_mnru_Q_14_normAsl_-26 | False | False | 827.5 | two-sided | 0.101691 | 1.0 | bonf | 0.375 |
| 1 | testStimulus | haus_m_700_bpf_200_2800_normAsl_-26 | haus_m_700_normAsl_-26 | False | False | 88.5 | two-sided | 0.0 | 0.0 | bonf | -2.344 |
| 2 | testStimulus | haus_m_700_bpf_200_2800_normAsl_-26 | maus_m_700_bpf_200_2800_normAsl_-26 | False | False | 667.5 | two-sided | 0.849032 | 1.0 | bonf | -0.056 |
| 3 | testStimulus | haus_m_700_bpf_200_2800_normAsl_-26 | maus_m_700_mnru_Q_14_normAsl_-26 | False | False | 913.0 | two-sided | 0.008877 | 0.133151 | bonf | 0.623 |
| 4 | testStimulus | haus_m_700_bpf_200_2800_normAsl_-26 | maus_m_700_normAsl_-26 | False | False | 110.0 | two-sided | 0.0 | 0.0 | bonf | -2.118 |
| 5 | testStimulus | haus_m_700_mnru_Q_14_normAsl_-26 | haus_m_700_normAsl_-26 | False | False | 40.0 | two-sided | 0.0 | 0.0 | bonf | -3.034 |
| 6 | testStimulus | haus_m_700_mnru_Q_14_normAsl_-26 | maus_m_700_bpf_200_2800_normAsl_-26 | False | False | 516.5 | two-sided | 0.052347 | 0.78521 | bonf | -0.449 |
| 7 | testStimulus | haus_m_700_mnru_Q_14_normAsl_-26 | maus_m_700_mnru_Q_14_normAsl_-26 | False | False | 774.5 | two-sided | 0.293264 | 1.0 | bonf | 0.254 |
| 8 | testStimulus | haus_m_700_mnru_Q_14_normAsl_-26 | maus_m_700_normAsl_-26 | False | False | 64.0 | two-sided | 0.0 | 0.0 | bonf | -2.683 |
| 9 | testStimulus | haus_m_700_normAsl_-26 | maus_m_700_bpf_200_2800_normAsl_-26 | False | False | 1280.5 | two-sided | 0.0 | 0.0 | bonf | 2.4 |
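
The relation between `p-unc` and `p-corr` can be checked by hand. A Bonferroni correction multiplies each uncorrected p-value by the number of comparisons and caps the result at 1; six stimuli give 15 unordered pairs. The sketch below is a minimal illustration, with `bonferroni` as a hypothetical helper rather than a pingouin function, applied to the displayed `p-unc` values of rows 0, 3 and 6.

```python
from math import comb


def bonferroni(p_values, n_tests):
    """Multiply each p-value by the number of tests and cap the result at 1."""
    return [min(1.0, p * n_tests) for p in p_values]


n_stimuli = 6
n_tests = comb(n_stimuli, 2)  # number of unordered pairs of stimuli

# Displayed (rounded) p-unc values of rows 0, 3 and 6 in the table above
p_unc = [0.101691, 0.008877, 0.052347]
p_corr = bonferroni(p_unc, n_tests)

for raw, adj in zip(p_unc, p_corr):
    print(f"p-unc = {raw:.6f} -> p-corr = {adj:.6f}")
```

The results agree with the `p-corr` column up to the rounding of the displayed `p-unc` values. In the ten rows shown, the only comparisons that stay below 0.05 after correction are those in which exactly one of the two files has no `bpf` or `mnru` tag in its name (rows 1, 4, 5, 8 and 9).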


- `A` : Name of first measurement
- `B` : Name of second measurement
- `Paired` : indicates whether the two measurements are paired or not
- `Parametric` : indicates if (non)-parametric tests were used
- `Tail` : indicates whether the p-values are one-sided or two-sided
- `T` : T statistic (only if parametric=True)
- `U-val` : Mann-Whitney U stat (if parametric=False and unpaired data)
- `W-val` : Wilcoxon W stat (if parametric=False and paired data)
- `p-unc` : Uncorrected p-values
- `p-corr` : Corrected p-values
- `p-adjust` : p-values correction method
- `BF10` : Bayes Factor
- `hedges` : effect size (or any effect size defined in ``effsize``)

Source: https://pingouin-stats.org/generated/pingouin.pairwise_ttests.html
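
For intuition about the `U-val` column, the Mann-Whitney U statistic of a sample counts, over all cross-group pairs, how often its value is the larger one, with ties counting one half. The following is a minimal sketch with a hypothetical helper `mann_whitney_u` and made-up ratings. Which of the two U values a library reports is a convention that varies, so treat the choice here as an assumption.

```python
def mann_whitney_u(a, b):
    """Return (U_a, U_b): pairwise 'wins' of each sample, ties count one half."""
    u_a = 0.0
    for x in a:
        for y in b:
            if x > y:
                u_a += 1.0
            elif x == y:
                u_a += 0.5
    # The two statistics always sum to the number of cross-group pairs
    u_b = len(a) * len(b) - u_a
    return u_a, u_b


# Hypothetical 5-point ratings for two stimuli
ratings_a = [3, 4, 5]
ratings_b = [3, 3, 4]

u_a, u_b = mann_whitney_u(ratings_a, ratings_b)
print(f"U_a = {u_a}, U_b = {u_b}, pairs = {len(ratings_a) * len(ratings_b)}")
```

A U value near half the number of pairs indicates strongly overlapping groups, while a value near 0 or near the number of pairs indicates that one group is rated almost uniformly higher than the other.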