# Epsilon Square
*By P. Stikker*<br>
https://PeterStatistics.com<br>
https://www.youtube.com/stikpet<br>

## Introduction

Unfortunately, there is no single agreed-upon effect size measure for the Kruskal-Wallis test. However, epsilon square (ε<sup>2</sup>) (Kelley, 1935) seems to be a good choice (see King & Minium (2009), as cited in Tomczak & Tomczak, 2014).

An epsilon square of 0 would mean no differences (and no influence), while a value of 1 would indicate a full dependency. Unfortunately, there is no formal way to determine whether a value such as 0.40 is high or low, and I have not been able to find any rules of thumb for the interpretation. Since this is a squared measure, I would use the same rule of thumb as for a correlation coefficient, but with the lower and upper bounds of each bin squared. Using the interpretation of r from Rea and Parker (2014), this gives the following:

|ε<sup>2</sup>| Interpretation|
|-------|---------------|
|0.00 < 0.01| Negligible|
|0.01 < 0.04 |Weak|
|0.04 < 0.16| Moderate|
|0.16 < 0.36| Relatively strong|
|0.36 < 0.64| Strong|
|0.64 <= 1.00| Very strong|
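
The bounds in this table can be reproduced by squaring the bounds for r. Below is a minimal sketch, assuming the cut-offs for r are 0.10, 0.20, 0.40, 0.60 and 0.80 (the square roots of the values in the table above). The variable names are only illustrative.

```python
# Upper bounds of the bins for r (assumption: square roots of the table's bounds)
r_bounds = [0.10, 0.20, 0.40, 0.60, 0.80]
labels = ['Negligible', 'Weak', 'Moderate',
          'Relatively strong', 'Strong', 'Very strong']

# Squaring each bound for r gives the corresponding bound for epsilon square
esq_bounds = [round(r**2, 2) for r in r_bounds]

# Pair each label with its lower and upper bound (the last bin ends at 1.00)
lowers = [0.0] + esq_bounds
uppers = esq_bounds + [1.0]
for label, low, high in zip(labels, lowers, uppers):
    print(f'{low:.2f} to {high:.2f}: {label}')
```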

Let's find out how we can determine this ε<sup>2</sup> with Python, by example.

## Example

To show an example, I'll load some data as a pandas dataframe. So I'll need the '<a href="https://pandas.pydata.org">pandas</a>' library:

In [1]:
#!pip install pandas
import pandas as pd

And then load the example data using <a href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html">'read_csv'</a>.

In [2]:
myDf = pd.read_csv('../../Data/csv/StudentStatistics.csv')
myDf.head()

Unnamed: 0,RespNr,Location,OAA_ObjCourse,OAA_ObjClass,OAA_CourseExec,OAA_RelActObj,OAA_RelActExa,OAA_RelObjExa,OAA_LearProcAct,OAA_LearProcPrep,...,Over_Grade,Over_Strong,Over_Impr,Gen_Gender,Gen_Age,Gen_SecSchool,Gen_Classes,Gen_NumberSubj,Gen_Time,Comments
0,1.0,Rotterdam,Fully Disagree,Fully Disagree,Fully Disagree,Disagree,Fully Disagree,Fully Disagree,Fully Disagree,Fully Disagree,...,20.0,"None, if there was a teacher that teaches how ...",A better teacher/teaching method,Female,22.0,,,Fully agree,20 < 30,Even when I revise my work I still cannot unde...
1,2.0,Haarlem,Disagree,Disagree,,Fully Disagree,Neither disagree nor agree,Agree,Disagree,Neither disagree nor agree,...,50.0,Blackboard,More motivation! Clearer explanation in class,Male,,The Netherlands,6.0,Disagree,10 < 20,"If the survey is anonymous, there shouldn't be..."
2,3.0,Diemen,Fully agree,Fully agree,Agree,Fully agree,Fully agree,Fully agree,Fully agree,Agree,...,80.0,Notably it has motivated alot about my study c...,,Male,37.0,Africa,7.0,Agree,10 < 20,
3,4.0,Rotterdam,Fully Disagree,Neither disagree nor agree,Disagree,Neither disagree nor agree,Neither disagree nor agree,Fully Disagree,Fully Disagree,Neither disagree nor agree,...,15.0,The clearly layout of every subject eacht week,The explanation of the teacher and motivation,Female,24.0,The Netherlands,6.0,Agree,10 < 20,Practice exams
4,5.0,Haarlem,Disagree,Agree,Fully Disagree,Neither disagree nor agree,Fully agree,Fully agree,Neither disagree nor agree,Fully agree,...,40.0,The online learning material,Classes were just really bad and were very con...,Male,19.0,The Netherlands,7.0,Fully agree,10 < 20,


The example will use 'Location' as the nominal field, and 'Teach_Motivate' (whether the teacher was able to motivate the student) as the ordinal field.

In [3]:
myNom = myDf['Location']
myOrd = myDf['Teach_Motivate']

To get a quick look at the counts from this, we can use pandas '<a href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.crosstab.html">crosstab</a>'. 

In [4]:
myCrosstable = pd.crosstab(myOrd, myNom)
myCrosstable

Location,Diemen,Haarlem,Rotterdam
Teach_Motivate,,,
Agree,5,1,1
Disagree,0,6,3
Fully Disagree,1,10,9
Fully agree,5,1,0
Neither disagree nor agree,6,3,3


The categories in the ordinal field are stored as their original text labels, but the calculations need numeric values, so we should recode the field.

Let's first see which options there were (using '<a href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.unique.html">unique</a>').

In [5]:
myOrd.unique()

array(['Fully Disagree', 'Disagree', 'Fully agree',
       'Neither disagree nor agree', nan, 'Agree'], dtype=object)

Let's assign these to numeric values, by making a dictionary out of the coding:

In [6]:
myCoding = {'Fully Disagree': 1, 'Disagree': 2, 'Neither disagree nor agree': 3, 'Agree': 4, 'Fully agree': 5}

And now to replace the labels with their new codes (using '<a href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.replace.html">replace</a>'):

In [7]:
myDf['Teach_MotivateRec'] = myDf['Teach_Motivate'].replace(myCoding)

A quick check to see if it worked:

In [8]:
myOrd = myDf['Teach_MotivateRec']
myOrd.value_counts()

1.0    20
3.0    12
2.0     9
4.0     7
5.0     6
Name: Teach_MotivateRec, dtype: int64
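
One thing worth checking with this approach is that every label actually occurs in the coding dictionary, since 'replace' leaves any value it cannot find unchanged. A small sketch in plain Python, with the labels copied from the 'unique' output above (the helper name `is_missing` is only illustrative):

```python
import math

# Labels as listed by unique() above, including the missing value (nan)
labels = ['Fully Disagree', 'Disagree', 'Fully agree',
          'Neither disagree nor agree', float('nan'), 'Agree']

coding = {'Fully Disagree': 1, 'Disagree': 2,
          'Neither disagree nor agree': 3, 'Agree': 4, 'Fully agree': 5}

def is_missing(value):
    """True for a float nan, which is how the missing label shows up here."""
    return isinstance(value, float) and math.isnan(value)

# Any non-missing label without a code would stay as text after recoding
unmapped = [lab for lab in labels if not is_missing(lab) and lab not in coding]
print(unmapped)
```

An empty list means all labels will be recoded. A typo in the dictionary (for example 'Fully Agree' with a capital A) would show up here.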

I'm not aware of any package or library that can calculate epsilon square directly, but the formula is not that complicated:

\begin{equation*}
\epsilon_{KW}^2 = H\times\frac{n+1}{n^2-1}
\end{equation*}

Here $H$ is the test statistic of the Kruskal-Wallis test itself, and $n$ is the sample size.
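
Because $n^2 - 1 = (n - 1)(n + 1)$, the formula can also be simplified, which makes the result easy to check by hand:

\begin{equation*}
\epsilon_{KW}^2 = H\times\frac{n+1}{n^2-1} = H\times\frac{n+1}{(n-1)(n+1)} = \frac{H}{n-1}
\end{equation*}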

How to obtain the H-value is explained in the separate documentation on the Kruskal-Wallis H test, so see that document for more details. Here is the code from it in short:

In [9]:
# !pip install pingouin
from pingouin import kruskal
kwTest = kruskal(data=myDf, dv='Teach_MotivateRec', between='Location')

We can get the H-value from the test results:

In [10]:
# Select the H column by name and take its first (only) value by position
H = kwTest['H'].iloc[0]
H

21.328066442489817
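
As a cross-check, the H-value can also be recomputed by hand from the counts in the crosstable shown earlier. The sketch below uses only the standard library and assumes the usual tie-corrected Kruskal-Wallis formula with average ranks for tied scores. The function name `kruskal_h` is hypothetical and not part of pingouin.

```python
from collections import Counter

# Counts copied from the crosstable above: {location: {recoded score: count}}
counts = {
    'Diemen':    {1: 1,  2: 0, 3: 6, 4: 5, 5: 5},
    'Haarlem':   {1: 10, 2: 6, 3: 3, 4: 1, 5: 1},
    'Rotterdam': {1: 9,  2: 3, 3: 3, 4: 1, 5: 0},
}

def kruskal_h(counts):
    """Tie-corrected Kruskal-Wallis H and total n from per-group score counts."""
    # Total number of cases per score, over all groups
    totals = Counter()
    for group in counts.values():
        totals.update(group)
    n = sum(totals.values())

    # Average rank per score: tied cases share the mean of their ranks
    mid_rank, used = {}, 0
    for score in sorted(totals):
        t = totals[score]
        mid_rank[score] = used + (t + 1) / 2
        used += t

    # Sum over the groups of (rank sum squared / group size)
    ss = 0.0
    for group in counts.values():
        size = sum(group.values())
        rank_sum = sum(c * mid_rank[s] for s, c in group.items())
        ss += rank_sum**2 / size

    h = 12 / (n * (n + 1)) * ss - 3 * (n + 1)
    # Correction for ties: divide by 1 - sum(t^3 - t) / (n^3 - n)
    ties = 1 - sum(t**3 - t for t in totals.values()) / (n**3 - n)
    return h / ties, n

H_check, n_check = kruskal_h(counts)
print(H_check, n_check)
```

The result should land very close to the H-value reported by the test above. A clear difference would suggest a mistake in the recoding or in the copied counts.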

Using the pandas '<a href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sum.html">sum</a>' function we can get the total number of cases. We sum the crosstable twice: the first sum returns the total for each column (each location), and the second adds those up to get the total of all counts. Note that the crosstable only counts cases where both fields are filled in.

In [11]:
n = myCrosstable.sum().sum()
n

54

Now we can complete the formula for epsilon square:

In [12]:
esq = H * (n + 1)/(n**2 - 1)
esq

0.40241634797150594
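
If the calculation is needed more than once, it could be wrapped in a small function. This is only a sketch: the name `epsilon_squared` and the input check are my own additions, and the values passed in are the H and n found above.

```python
def epsilon_squared(h, n):
    """Epsilon square for a Kruskal-Wallis H-value h and sample size n."""
    if n < 2:
        # With n = 1 the denominator n**2 - 1 would be zero
        raise ValueError('n must be at least 2')
    return h * (n + 1) / (n**2 - 1)

# Values taken from the example above
esq_check = epsilon_squared(21.328066442489817, 54)
print(round(esq_check, 4))
```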

From the table in the introduction we can see that a value of 0.40 falls in the 'Strong' category, so the location appears to have a strong influence on the opinion about the motivational qualities of the teacher.

With some if and elif statements we could also let Python look this up for us:

In [13]:
if esq < .01:
    qual = 'Negligible'
elif esq < .04:
    qual = 'Weak'
elif esq < .16:
    qual = 'Moderate'
elif esq < .36:
    qual = 'Relatively strong'
elif esq < .64:
    qual = 'Strong'
else:
    qual = 'Very strong'
    
qual

'Strong'
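
An alternative to the chain of if and elif statements is to keep the bounds and labels in two lists and look up the position with '<a href="https://docs.python.org/3/library/bisect.html">bisect</a>' from the standard library. A sketch, using the same bounds as the table in the introduction (the function name `interpret_esq` is only illustrative):

```python
from bisect import bisect_right

# Upper bounds of each bin (exclusive) and the matching labels
ESQ_BOUNDS = [0.01, 0.04, 0.16, 0.36, 0.64]
ESQ_LABELS = ['Negligible', 'Weak', 'Moderate',
              'Relatively strong', 'Strong', 'Very strong']

def interpret_esq(esq):
    """Return the rule-of-thumb label for an epsilon square value."""
    # bisect_right counts how many bounds are <= esq, which is the bin index
    return ESQ_LABELS[bisect_right(ESQ_BOUNDS, esq)]

print(interpret_esq(0.40241634797150594))
```

This behaves the same as the if and elif version, including at the bounds themselves (a value of exactly 0.01 is classified as 'Weak'), and the table only has to be changed in one place.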

## References

Kelley, T. L. (1935). An Unbiased Correlation Ratio Measure. *Proceedings of the National Academy of Sciences of the United States of America, 21*(9), 554–559.

Rea, L. M., & Parker, R. A. (2014). *Designing and conducting survey research: A comprehensive guide* (4th ed.). Jossey-Bass.

Tomczak, M., & Tomczak, E. (2014). The need to report effect size estimates revisited: An overview of some recommended measures of effect size. *Trends in Sport Sciences, 1*(21), 19–25.