# 09 Non-Parametric Tests - Task 1

Provide a script and HTML file which calculates the suitable **non-parametric test** to answer the
following research question (RQ). Please also report the results as a text
conclusion including the test statistic value with its degrees of freedom, the significance
value, and the pairwise comparisons.

Does changing the bitrate (independent variable: 2000, 4000, 6000, 50000 kbps) have a
significant effect on the video quality (VQ) ratings (dependent variable)?
Please consider all ratings at
a resolution of 1080p and a framerate of 60 fps for the first game. Use the ratings provided in the
gaming video quality dataset.

## Import and Initializing

In [106]:
import numpy as np
import pandas as pd
import scipy

#! pip install pingouin
import pingouin as pg

import matplotlib.pyplot as plt
import seaborn as sns
sns.set(style="whitegrid", context="talk")
cm = sns.diverging_palette(127, 14, s=99, l=55, as_cmap=True)

FIGSIZE = (20,4)

## Loading the data

In [116]:
dataset = pd.read_excel(
    "../datasets/DB01_gaming_video_quality_dataset.xlsx",
    usecols=["PID", "Game", "Resolution", "Framerate", "Bitrate", "VQ"],
    dtype={"Bitrate": str}
).dropna()
mask_condition = ((dataset['Resolution'] == 1080) & (dataset['Framerate'] == 60))
print(dataset.loc[mask_condition].Bitrate.unique())
mask_bitrate = dataset["Bitrate"].isin(["2000", "4000", "6000", "50000"])
mask_game =(dataset['Game'] == 'Game1')
dataset = dataset.loc[mask_condition & mask_bitrate & mask_game]
dataset.info()

['2000' '4000' '6000' '50000']
<class 'pandas.core.frame.DataFrame'>
Int64Index: 100 entries, 2867 to 3361
Data columns (total 6 columns):
 #   Column      Non-Null Count  Dtype  
---  ------      --------------  -----  
 0   PID         100 non-null    int64  
 1   Game        100 non-null    object 
 2   Resolution  100 non-null    int64  
 3   Framerate   100 non-null    int64  
 4   Bitrate     100 non-null    object 
 5   VQ          100 non-null    float64
dtypes: float64(1), int64(3), object(2)
memory usage: 5.5+ KB


## Check general requirements

### Measurement

Independent variable: `Bitrate`.<br>
The dependent variable (`VQ`) is measured at the interval level.

### Balance
Is the number of measurements the same for each bitrate in `Game1`?

In [108]:
dataset.groupby(['Bitrate']).size()

Bitrate
2000     25
4000     25
50000    25
6000     25
dtype: int64

Each bitrate has 25 measurements, so the design is balanced and no changes are necessary.
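Equal group sizes do not by themselves show that every participant rated every bitrate exactly once, which a repeated-measures test assumes. The following is a minimal sketch of such a check; the helper name `is_complete_within_design` and the toy data are hypothetical and only illustrate the idea.

```python
import pandas as pd


def is_complete_within_design(df, subject="PID", within="Bitrate"):
    """Return True if every subject has exactly one row per level of `within`."""
    # Count rows per (subject, level); missing combinations become 0
    counts = df.groupby([subject, within]).size().unstack(fill_value=0)
    return bool((counts == 1).all().all())


# Hypothetical toy data: 3 participants x 2 bitrates
toy = pd.DataFrame({
    "PID": [1, 1, 2, 2, 3, 3],
    "Bitrate": ["2000", "4000"] * 3,
    "VQ": [2.0, 3.5, 1.5, 3.0, 2.5, 4.0],
})

print(is_complete_within_design(toy))
```

Applied to the notebook's data, the call would be `is_complete_within_design(dataset)`.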

## Non-parametric Tests
![flowchart_nonparametric](https://pingouin-stats.org/_images/flowchart_nonparametric.svg)
Source: https://pingouin-stats.org/guidelines.html#id7

## Friedman Test

In [109]:
resultBitrate = pg.friedman(dataset, dv='VQ', subject='PID', within= 'Bitrate')
resultBitrate

| | Source | ddof1 | Q | p-unc |
|---|---|---|---|---|
| Friedman | Bitrate | 3 | 56.716 | 2.954771e-12 |


- `Q` : The Friedman Q statistic, corrected for ties
- `p-unc` : Uncorrected p-value
- `ddof1` : Degrees of freedom

Source: https://pingouin-stats.org/generated/pingouin.friedman.html#pingouin.friedman
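As a cross-check, the same kind of test can be run with `scipy.stats.friedmanchisquare`, which expects one array per condition rather than long-format data. The sketch below assumes one rating per participant and bitrate; the helper `friedman_from_long` and the toy data are hypothetical and are not part of the original analysis.

```python
import pandas as pd
from scipy import stats


def friedman_from_long(df, dv="VQ", subject="PID", within="Bitrate"):
    """Run a Friedman test on long-format data; return (Q, p, degrees of freedom)."""
    # One row per subject, one column per condition
    wide = df.pivot(index=subject, columns=within, values=dv)
    samples = [wide[col].to_numpy() for col in wide.columns]
    q, p = stats.friedmanchisquare(*samples)
    return q, p, wide.shape[1] - 1


# Hypothetical toy data: 3 participants x 3 bitrates, ratings rise with bitrate
toy = pd.DataFrame({
    "PID": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "Bitrate": ["2000", "4000", "6000"] * 3,
    "VQ": [1.0, 2.0, 3.0, 1.5, 2.5, 3.5, 2.0, 3.0, 4.0],
})

q, p, dof = friedman_from_long(toy)
print(f"Q({dof}) = {q:.3f}, p = {p:.4f}")
```

On the notebook's data, `friedman_from_long(dataset)` should give a Q value comparable to the one reported by `pg.friedman`.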

In [110]:
result = pg.pairwise_ttests(dataset, dv="VQ", subject="PID", within='Bitrate', parametric=False, padjust='bonf')
result.style.background_gradient(cmap=cm, subset=["p-unc"])

| | Contrast | A | B | Paired | Parametric | W-val | Tail | p-unc | p-corr | p-adjust | hedges |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Bitrate | 2000 | 4000 | True | False | 27.5 | two-sided | 0.000486 | 0.002914 | bonf | -0.926 |
| 1 | Bitrate | 2000 | 6000 | True | False | 15.0 | two-sided | 0.00012 | 0.00072 | bonf | -1.489 |
| 2 | Bitrate | 2000 | 50000 | True | False | 1.0 | two-sided | 1.5e-05 | 8.8e-05 | bonf | -3.032 |
| 3 | Bitrate | 4000 | 6000 | True | False | 30.5 | two-sided | 0.001915 | 0.011489 | bonf | -0.669 |
| 4 | Bitrate | 4000 | 50000 | True | False | 1.0 | two-sided | 1.5e-05 | 8.8e-05 | bonf | -2.268 |
| 5 | Bitrate | 6000 | 50000 | True | False | 5.0 | two-sided | 5.5e-05 | 0.000329 | bonf | -1.421 |


- `A` : Name of first measurement
- `B` : Name of second measurement
- `Paired` : indicates whether the two measurements are paired or not
- `Parametric` : indicates if (non)-parametric tests were used
- `W-val` : Wilcoxon W stat (if parametric=False and paired data)
- `Tail` : indicates whether the p-values are one-sided or two-sided
- `p-unc` : Uncorrected p-values
- `p-corr` : Corrected p-values
- `p-adjust` : p-values correction method
- `hedges` : effect size (or any effect size defined in ``effsize``)

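Source: https://pingouin-stats.org/generated/pingouin.pairwise_ttests.html

## Conclusion

A Friedman test showed that the bitrate had a significant effect on the VQ ratings, Q(3) = 56.716, p < .001. Bonferroni-corrected pairwise Wilcoxon signed-rank tests were significant for all six bitrate pairs (all p-corr ≤ 0.0115). The Hedges' g values are negative for every pair, which indicates that the higher bitrate received the higher ratings in each comparison.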
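The `p-corr` column follows from `p-unc` through the Bonferroni correction: each uncorrected p-value is multiplied by the number of comparisons (six here) and capped at 1. A minimal sketch of that arithmetic follows; the helper `bonferroni` is hypothetical, and because it uses the rounded p-values as displayed in the pairwise results, its output can differ from the `p-corr` column in the last digits.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Uncorrected p-values as displayed (rounded) in the pairwise results
p_unc = [0.000486, 0.00012, 1.5e-05, 0.001915, 1.5e-05, 5.5e-05]
p_corr = bonferroni(p_unc)

for p, pc in zip(p_unc, p_corr):
    print(f"p-unc = {p:.6f} -> p-corr = {pc:.6f}")
```

All six corrected values stay below 0.05, so the correction does not change which comparisons are significant.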