---
author: Krtin Juneja (KJUNEJA@falcon.bentley.edu)
---

The solution below uses an example dataset that details the counts of insects in an agricultural experiment with six types of insecticides, labeled A through F.  (See how to quickly load some sample data.)

In [1]:
from rdatasets import data
df = data('InsectSprays')
df

Unnamed: 0,count,spray
0,10,A
1,7,A
2,20,A
3,14,A
4,14,A
...,...,...
67,10,F
68,26,F
69,26,F
70,24,F


Before we perform any post hoc analysis, we need to check whether the mean count of insects depends on the type of insecticide, which we do by conducting a one-way ANOVA.  (See also how to do a one-way analysis of variance (ANOVA).)

In [2]:
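import statsmodels.api as sm
from statsmodels.formula.api import ols

# Fit a linear model with spray as a categorical predictor of count
model = ols('count ~ spray', data=df).fit()
# Build the one-way ANOVA table (Type I sums of squares)
sm.stats.anova_lm(model, typ=1)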

Unnamed: 0,df,sum_sq,mean_sq,F,PR(>F)
spray,5.0,2668.833333,533.766667,34.702282,3.182584e-17
Residual,66.0,1015.166667,15.381313,,


The p-value for `spray` (about 3.18e-17) is far below 0.05, so at the 5% significance level we conclude that the mean insect count differs for at least some of the insecticides. We assume that the model assumptions are met, but do not verify them here.
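As a quick check on how to read the ANOVA table, recall that each mean square is a sum of squares divided by its degrees of freedom, and the F statistic is the ratio of the two mean squares. The sketch below recomputes these from the values printed above; the variable names are illustrative only and are not part of `statsmodels`.

```python
# Values copied from the ANOVA table above
ss_spray, df_spray = 2668.833333, 5    # between-group (spray) row
ss_resid, df_resid = 1015.166667, 66   # residual row

# Mean square = sum of squares / degrees of freedom
ms_spray = ss_spray / df_spray
ms_resid = ss_resid / df_resid

# F statistic = between-group mean square / residual mean square
f_stat = ms_spray / ms_resid
print(round(ms_spray, 3), round(ms_resid, 3), round(f_stat, 3))
```

The result should agree with the `mean_sq` and `F` columns of the table up to rounding.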

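To compare every pair of sprays without any correction for multiple comparisons, we can use the pairwise t test provided by the `posthoc_ttest` function in the `scikit_posthocs` package. Setting `p_adjust=None` leaves the p-values unadjusted, and `pool_sd=True` uses a common standard deviation computed across all groups. Each entry in the resulting table is the p-value for comparing the spray in that row with the spray in that column.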

In [3]:
import scikit_posthocs as sp
sp.posthoc_ttest(df, val_col='count', group_col='spray', p_adjust=None, pool_sd=True)

Unnamed: 0,A,B,C,D,E,F
A,1.0,0.6044761,7.266893e-11,9.81691e-08,2.753922e-09,0.1805998
B,0.6044761,1.0,8.509776e-12,1.212803e-08,3.257986e-10,0.4079858
C,7.266893e-11,8.509776e-12,1.0,0.08141205,0.379475,2.794343e-13
D,9.81691e-08,1.212803e-08,0.08141205,1.0,0.379475,4.03561e-10
E,2.753922e-09,3.257986e-10,0.379475,0.379475,1.0,1.054387e-11
F,0.1805998,0.4079858,2.794343e-13,4.03561e-10,1.054387e-11,1.0


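Techniques that account for multiple comparisons include the Bonferroni correction, Fisher’s Least Significant Difference (LSD) method, Dunnett’s procedure, and Scheffé’s method.  To adjust the table above, pass the name of a p-value adjustment method in place of `None` for the `p_adjust` argument, for example `p_adjust='bonferroni'`. Not every technique named here is available through that argument; [see the function's documentation for the supported method names](https://scikit-posthocs.readthedocs.io/en/latest/generated/scikit_posthocs.posthoc_ttest/).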
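To make the Bonferroni correction concrete, the sketch below applies it by hand to the unadjusted p-values from the table above: each p-value is multiplied by the number of comparisons (15 pairs among six sprays) and capped at 1. This is plain Python for illustration only; `raw_p` and `bonferroni` are names introduced here and are not part of `scikit_posthocs`.

```python
# Unadjusted p-values copied from the upper triangle of the table above
raw_p = {
    ("A", "B"): 0.6044761, ("A", "C"): 7.266893e-11, ("A", "D"): 9.81691e-08,
    ("A", "E"): 2.753922e-09, ("A", "F"): 0.1805998,
    ("B", "C"): 8.509776e-12, ("B", "D"): 1.212803e-08,
    ("B", "E"): 3.257986e-10, ("B", "F"): 0.4079858,
    ("C", "D"): 0.08141205, ("C", "E"): 0.379475, ("C", "F"): 2.794343e-13,
    ("D", "E"): 0.379475, ("D", "F"): 4.03561e-10,
    ("E", "F"): 1.054387e-11,
}

def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capping at 1."""
    m = len(p_values)
    return {pair: min(1.0, p * m) for pair, p in p_values.items()}

adjusted = bonferroni(raw_p)

# Pairs that remain significant at the 5% level after the correction
significant = sorted(pair for pair, p in adjusted.items() if p < 0.05)
for pair in significant:
    print(pair, adjusted[pair])
```

For this dataset the correction does not change which pairs fall below 0.05, because the small p-values are many orders of magnitude under the threshold.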

You can also determine the magnitude of these differences; see how to perform post-hoc analysis with Tukey's HSD test.