# Post Hoc Test - Tukey's HSD Test

Tukey's range test, also known as Tukey's test, the Tukey method, Tukey's honest significance test, or Tukey's HSD (honestly significant difference) test, is a single-step multiple comparison procedure and statistical test. It identifies which group means differ significantly from one another while controlling the family-wise error rate (FWER) across all comparisons.

Tukey's test compares the mean of every treatment to the mean of every other treatment; that is, it applies simultaneously to the set of all pairwise comparisons.
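As a compact statement of the decision rule, the following uses notation that is not in the original text: $k$ groups, $N$ observations in total, $MS_W$ for the within-group mean square from the one-way ANOVA, and $q_{\alpha;\,k,\,N-k}$ for the upper-$\alpha$ critical value of the studentized range distribution. With equal group sizes $n$, two means $\bar{y}_i$ and $\bar{y}_j$ are declared significantly different when

$$
\left|\bar{y}_i - \bar{y}_j\right| \;>\; q_{\alpha;\,k,\,N-k}\,\sqrt{\frac{MS_W}{n}} .
$$

When group sizes differ, the Tukey-Kramer form is commonly used, replacing the square root with

$$
\sqrt{\frac{MS_W}{2}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)} .
$$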

### Example 1:

Using the mpg dataset from the Seaborn library, check whether miles per gallon (mpg) differs among 4-cylinder cars from the USA, Japan, and Europe.

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

In [2]:
sns.get_dataset_names()

['anagrams',
 'anscombe',
 'attention',
 'brain_networks',
 'car_crashes',
 'diamonds',
 'dots',
 'dowjones',
 'exercise',
 'flights',
 'fmri',
 'geyser',
 'glue',
 'healthexp',
 'iris',
 'mpg',
 'penguins',
 'planets',
 'seaice',
 'taxis',
 'tips',
 'titanic']

In [3]:
mpg = sns.load_dataset("mpg")

In [4]:
mpg.head()

|   | mpg | cylinders | displacement | horsepower | weight | acceleration | model_year | origin | name |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 18.0 | 8 | 307.0 | 130.0 | 3504 | 12.0 | 70 | usa | chevrolet chevelle malibu |
| 1 | 15.0 | 8 | 350.0 | 165.0 | 3693 | 11.5 | 70 | usa | buick skylark 320 |
| 2 | 18.0 | 8 | 318.0 | 150.0 | 3436 | 11.0 | 70 | usa | plymouth satellite |
| 3 | 16.0 | 8 | 304.0 | 150.0 | 3433 | 12.0 | 70 | usa | amc rebel sst |
| 4 | 17.0 | 8 | 302.0 | 140.0 | 3449 | 10.5 | 70 | usa | ford torino |


In [6]:
mpg.shape

(398, 9)

In [7]:
mpg.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 398 entries, 0 to 397
Data columns (total 9 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   mpg           398 non-null    float64
 1   cylinders     398 non-null    int64  
 2   displacement  398 non-null    float64
 3   horsepower    392 non-null    float64
 4   weight        398 non-null    int64  
 5   acceleration  398 non-null    float64
 6   model_year    398 non-null    int64  
 7   origin        398 non-null    object 
 8   name          398 non-null    object 
dtypes: float64(4), int64(3), object(2)
memory usage: 28.1+ KB


In [8]:
mpg.describe()

|   | mpg | cylinders | displacement | horsepower | weight | acceleration | model_year |
|---|---|---|---|---|---|---|---|
| count | 398.0 | 398.0 | 398.0 | 392.0 | 398.0 | 398.0 | 398.0 |
| mean | 23.514573 | 5.454774 | 193.425879 | 104.469388 | 2970.424623 | 15.56809 | 76.01005 |
| std | 7.815984 | 1.701004 | 104.269838 | 38.49116 | 846.841774 | 2.757689 | 3.697627 |
| min | 9.0 | 3.0 | 68.0 | 46.0 | 1613.0 | 8.0 | 70.0 |
| 25% | 17.5 | 4.0 | 104.25 | 75.0 | 2223.75 | 13.825 | 73.0 |
| 50% | 23.0 | 4.0 | 148.5 | 93.5 | 2803.5 | 15.5 | 76.0 |
| 75% | 29.0 | 8.0 | 262.0 | 126.0 | 3608.0 | 17.175 | 79.0 |
| max | 46.6 | 8.0 | 455.0 | 230.0 | 5140.0 | 24.8 | 82.0 |


In [None]:
# Store the mpg and origin columns for all 4-cylinder cars in a separate DataFrame.

In [9]:
df = mpg[mpg["cylinders"] == 4][["mpg","origin"]]

In [10]:
df

|   | mpg | origin |
|---|---|---|
| 14 | 24.0 | japan |
| 18 | 27.0 | japan |
| 19 | 26.0 | europe |
| 20 | 25.0 | europe |
| 21 | 24.0 | europe |
| ... | ... | ... |
| 393 | 27.0 | usa |
| 394 | 44.0 | europe |
| 395 | 32.0 | usa |
| 396 | 28.0 | usa |
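Before running the test it can help to look at the size and mean of each group, since Tukey's HSD compares exactly these group means. The sketch below is only illustrative: it rebuilds a tiny frame (the name `sample` is hypothetical) from the nine rows visible in the truncated display above, so its numbers do not describe the full `df`. In the notebook, the same `groupby` call could be applied to `df` directly.

```python
import pandas as pd

# Only the nine rows visible in the truncated display above.
# The full `df` in the notebook has many more rows, so these
# summary values are illustrative, not the real group statistics.
sample = pd.DataFrame(
    {
        "mpg": [24.0, 27.0, 26.0, 25.0, 24.0, 27.0, 44.0, 32.0, 28.0],
        "origin": ["japan", "japan", "europe", "europe", "europe",
                   "usa", "europe", "usa", "usa"],
    }
)

# Group size, mean, and standard deviation of mpg per origin.
group_summary = sample.groupby("origin")["mpg"].agg(["count", "mean", "std"])
print(group_summary)
```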


In [11]:
# Performing Tukey's HSD Test Using Statsmodels Library:

from statsmodels.stats.multicomp import pairwise_tukeyhsd

In [12]:
pairwise_tukeyhsd(endog = df["mpg"], groups = df["origin"], alpha=0.05)

<statsmodels.sandbox.stats.multicomp.TukeyHSDResults at 0x18460c995b0>

In [13]:
result = pairwise_tukeyhsd(endog = df["mpg"], groups = df["origin"], alpha=0.05)

In [15]:
print(result)

Multiple Comparison of Means - Tukey HSD, FWER=0.05 
group1 group2 meandiff p-adj   lower   upper  reject
----------------------------------------------------
europe  japan   3.1845  0.003  0.9266  5.4425   True
europe    usa  -0.5708 0.7995 -2.8062  1.6646  False
 japan    usa  -3.7554  0.001 -5.9383 -1.5724   True
----------------------------------------------------


In [16]:
# Based on the adjusted p-values, we fail to reject the null hypothesis for Europe vs. USA (p-adj = 0.7995), so their mean
# mpg values are not significantly different.

# We reject the null hypothesis for Europe vs. Japan (p-adj = 0.003) and Japan vs. USA (p-adj = 0.001). Japan's mean mpg is
# significantly different from both: about 3.2 mpg above Europe and about 3.8 mpg above the USA.

### Example 2:

In [17]:
machine_1 = [150, 151, 152, 152, 151, 150]
machine_2 = [153, 152, 148, 151, 149, 152]
machine_3 = [156, 154, 155, 156, 157, 155]
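Tukey's HSD is commonly run as a follow-up to a one-way ANOVA, which asks whether any group mean differs at all before the pairwise comparisons say which ones. The notebook does not show that step; the following is a minimal sketch of it, assuming SciPy is available and using `scipy.stats.f_oneway` on the three machine samples.

```python
from scipy.stats import f_oneway

machine_1 = [150, 151, 152, 152, 151, 150]
machine_2 = [153, 152, 148, 151, 149, 152]
machine_3 = [156, 154, 155, 156, 157, 155]

# One-way ANOVA: the null hypothesis is that all three machines
# have the same mean volume.
f_stat, p_value = f_oneway(machine_1, machine_2, machine_3)
print(f"F = {f_stat:.2f}, p = {p_value:.6f}")

# A small p-value suggests at least one mean differs, which is
# what motivates the pairwise post hoc comparison.
anova_significant = p_value < 0.05
```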

In [18]:
# Combine the three samples into a single long-format DataFrame (one row per measurement):

machine_df = pd.concat([pd.DataFrame(data = {"Machine" : "Machine 1", "Volume" : machine_1}),
                       pd.DataFrame(data = {"Machine" : "Machine 2", "Volume" : machine_2}),
                       pd.DataFrame(data = {"Machine" : "Machine 3", "Volume" : machine_3})])

In [19]:
machine_df

|   | Machine | Volume |
|---|---|---|
| 0 | Machine 1 | 150 |
| 1 | Machine 1 | 151 |
| 2 | Machine 1 | 152 |
| 3 | Machine 1 | 152 |
| 4 | Machine 1 | 151 |
| 5 | Machine 1 | 150 |
| 0 | Machine 2 | 153 |
| 1 | Machine 2 | 152 |
| 2 | Machine 2 | 148 |
| 3 | Machine 2 | 151 |


In [20]:
# Performing Tukey's HSD Test Using Statsmodels Library:

from statsmodels.stats.multicomp import pairwise_tukeyhsd

In [22]:
results = pairwise_tukeyhsd(endog= machine_df["Volume"], groups= machine_df["Machine"], alpha= 0.05)

In [23]:
print(results)

  Multiple Comparison of Means - Tukey HSD, FWER=0.05   
  group1    group2  meandiff p-adj  lower  upper  reject
--------------------------------------------------------
Machine 1 Machine 2  -0.1667   0.9 -2.2269 1.8936  False
Machine 1 Machine 3      4.5 0.001  2.4397 6.5603   True
Machine 2 Machine 3   4.6667 0.001  2.6064 6.7269   True
--------------------------------------------------------
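Besides the printed table, the result object exposes the same numbers as arrays, which is convenient when the comparisons feed into later code. The sketch below reruns the test on the machine data using plain NumPy arrays instead of the DataFrame (the names `volumes`, `labels`, and `res` are introduced here for illustration) and reads the `groupsunique`, `meandiffs`, `confint`, and `reject` attributes. Pairs are reported in the same order as the table rows, and `meandiff` is the mean of `group2` minus the mean of `group1`.

```python
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

machine_1 = [150, 151, 152, 152, 151, 150]
machine_2 = [153, 152, 148, 151, 149, 152]
machine_3 = [156, 154, 155, 156, 157, 155]

# Flatten the samples into one value array and one label array.
volumes = np.array(machine_1 + machine_2 + machine_3)
labels = np.repeat(["Machine 1", "Machine 2", "Machine 3"], 6)

res = pairwise_tukeyhsd(endog=volumes, groups=labels, alpha=0.05)

print(res.groupsunique)  # group labels in sorted order
print(res.meandiffs)     # mean(group2) - mean(group1) for each pair
print(res.confint)       # lower and upper bound for each pair
print(res.reject)        # True where the null hypothesis is rejected
```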


In [24]:
# Based on the adjusted p-values, we fail to reject the null hypothesis for Machine 1 vs. Machine 2 (p-adj = 0.9), so their
# mean volumes are not significantly different.

# We reject the null hypothesis for Machine 1 vs. Machine 3 and Machine 2 vs. Machine 3 (p-adj = 0.001 for both). Machine 3's
# mean volume is significantly higher, by about 4.5 and 4.67 units respectively.
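The interval half-width in the table above (upper minus meandiff, about 2.06) can be reproduced by hand, which makes clear what the library is computing. This is a sketch under stated assumptions rather than statsmodels' own implementation: it assumes equal group sizes, uses `scipy.stats.studentized_range` (available in SciPy 1.7 and later) for the critical value, and introduces the names `groups`, `mse`, `q_crit`, and `hsd` for illustration.

```python
import numpy as np
from scipy.stats import studentized_range

groups = {
    "Machine 1": [150, 151, 152, 152, 151, 150],
    "Machine 2": [153, 152, 148, 151, 149, 152],
    "Machine 3": [156, 154, 155, 156, 157, 155],
}

k = len(groups)              # number of groups
n = 6                        # observations per group (equal sizes assumed)
df_within = k * n - k        # within-group degrees of freedom

means = {name: float(np.mean(values)) for name, values in groups.items()}

# Within-group sum of squares and mean square (pooled variance).
sse = sum(((np.array(v) - np.mean(v)) ** 2).sum() for v in groups.values())
mse = sse / df_within

# Critical value of the studentized range at FWER = 0.05.
q_crit = studentized_range.ppf(0.95, k, df_within)

# Honestly significant difference: pairs whose means differ by more
# than this are declared significantly different.
hsd = q_crit * np.sqrt(mse / n)

print(f"MSE = {mse:.4f}, q_crit = {q_crit:.3f}, HSD = {hsd:.3f}")
for a, b in [("Machine 1", "Machine 2"), ("Machine 1", "Machine 3"),
             ("Machine 2", "Machine 3")]:
    diff = means[b] - means[a]
    print(a, b, round(diff, 4), abs(diff) > hsd)
```

Only the Machine 1 vs. Machine 2 difference falls below the threshold, which agrees with the `reject` column in the statsmodels output.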