## Table Of Contents
1. State null and alternative hypotheses
2. Set alpha value
3. Import relevant libraries
4. Import athleisure dataset
5. Hypothesis Testing
6. 2-Factor ANOVA
7. 3-Factor ANOVA

## 1. State Null and Alternative Hypotheses

$H_{01}$ - All athleisure-related keywords are equal in terms of average search volume.  
$H_{A1}$ - Some athleisure-related keywords have greater average search volumes than others.



$H_{02}$ - People will be equally likely to search for activewear-related terms in any given month.  
$H_{A2}$ - People will be more likely to search for activewear-related terms depending on the month.


$H_{03}$ - There will be an equal search volume for activewear-related terms on any platform.  
$H_{A3}$ - There will be a greater search volume for activewear-related terms on one particular platform.

## 2. Set Alpha Value

We will reject a null hypothesis when the **p-value** of its test is less than **alpha = 0.05**.
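
As a minimal sketch, the decision rule can be written as a small helper. The function name `reject_null` is hypothetical and is not used elsewhere in this notebook; the example p-values are illustrative only.

```python
ALPHA = 0.05  # significance level chosen in this notebook


def reject_null(p_value: float, alpha: float = ALPHA) -> bool:
    """Return True when the p-value falls strictly below the significance level."""
    return p_value < alpha


# Illustrative p-values (not taken from the dataset)
print(reject_null(0.01))  # small p-value: reject the null hypothesis
print(reject_null(0.20))  # large p-value: fail to reject the null hypothesis
```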


## 3. Import Relevant Libraries



In [2]:
import pandas as pd
import numpy as np
from scipy import stats

## 4. Import Athleisure Dataset


In [9]:
athleisure_df = pd.read_csv('../athleisure.csv')
athleisure_df.drop(['Unnamed: 0'], axis =1, inplace = True)
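
An `Unnamed: 0` column usually appears when a CSV was saved with its index. As an alternative sketch, `index_col=0` tells `pd.read_csv` to treat the first column as the index so nothing needs to be dropped afterwards. The inline CSV text below is made-up toy data standing in for `athleisure.csv`.

```python
import io

import pandas as pd

# Toy CSV (hypothetical rows) with an unnamed leading index column
csv_text = ",keyword,volume\n0,hoodie,100\n1,flex,50\n"

# index_col=0 reads the first column as the index instead of 'Unnamed: 0'
toy_df = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(toy_df.columns.tolist())
```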

## 5. Hypothesis Testing


### Hypothesis Test 1

Are there differences between pre-selected keywords?

$H_{01}$ - All athleisure-related keywords are equal in terms of average search volume.  
$H_{A1}$ - Some athleisure-related keywords have greater average search volumes than others. 

Mean search volume(keyword 1) = Mean search volume(keyword 2) = Mean search volume(keyword 3) = ...

In [11]:
keys = list(athleisure_df.keyword.unique())
values = []

for keyword in list(athleisure_df.keyword.unique()):
    values.append(list(athleisure_df.loc[athleisure_df.keyword == keyword, 'volume']))
    
data = dict(zip(keys, values))

In [None]:
# Obtain list of dict keyword references for easy input into anova test parameter
for keyword in keys:
    text = f" data['{keyword}'],"
    print(text)

In [12]:
import scipy.stats as stats
# stats.f_oneway takes the groups as input and returns the F-statistic and p-value
fvalue, pvalue = stats.f_oneway(data['yoga pants'],
 data['sweatpants'],
 data['sweatshirt'],
 data['crew neck'],
 data['thumb hole'],
 data['pullover'],
 data['fleece'],
 data['joggers'],
 data['hoodie'],
 data['hooded'],
 data['capri'],
 data['muscle tee'],
 data['basic tee'],
 data['sweatband'],
 data['windbreaker'],
 data['leggings'],
 data['tank top'],
 data['muscle tank'],
 data['long sleeve'],
 data['short sleeve'],
 data['mesh'],
 data['striped'],
 data['stripes'],
 data['3 stripes'],
 data['stripe'],
 data['stretch'],
 data['stretchy'],
 data['stretchable'],
 data['flex'],
 data['flexing'],
 data['lightweight'],
 data['spandex'],
 data['breathable'],
 data['loose'],
 data['loose fit'],
 data['fitted'],
 data['core'],
 data['blend'],
 data['cotton'],
 data['high waist'],
 data['tights'],
 data['baggy'],
 data['slim'],
 data['activewear'],
 data['sleeveless'],
 data['active'],
 data['athletic'],
 data['relaxed'],
 data['performance'],
 data['pockets'],
 data['drawstring'],
 data['squat'],
 data['tummy'],
 data['movement'],
 data['skinny'],
 data['workout'],
 data['racerback'],
 data['scoop neck'],
 data['v neck'],
 data['raglan'],
 data['tapered'],
 data['lined'],
 data['quick dry'],
 data['padded'],
 data['running'],
 data['ventilated'],
 data['warm up'],
 data['crop top'],
 data['crop hoodie'],
 data['elastic'],
 data['dri fit'],
 data['cropped'],
 data['wicking'],
 data['mid rise'],
 data['active gear'],
 data['running gear'],
 data['split back'])
print(f"Results of ANOVA test:\n The F-statistic is: {fvalue}\n The p-value is: {pvalue}")

Results of ANOVA test:
 The F-statistic is: 12.048623344578461
 The p-value is: 1.3293563590514185e-119
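
Listing every keyword by hand is error-prone. A shorter sketch, assuming the same `keyword` and `volume` column names, builds the groups with `groupby` and unpacks them into `stats.f_oneway`. The toy data frame below is made up for illustration and is not the athleisure dataset.

```python
import pandas as pd
from scipy import stats

# Toy data (hypothetical values) with the same column names as the notebook
toy_df = pd.DataFrame({
    "keyword": ["a", "a", "a", "b", "b", "b", "c", "c", "c"],
    "volume": [10, 12, 11, 20, 22, 21, 30, 29, 31],
})

# One array of volumes per keyword, in sorted keyword order
groups = [g.to_numpy() for _, g in toy_df.groupby("keyword")["volume"]]

# Unpack the list so each group is passed as a separate argument
fvalue, pvalue = stats.f_oneway(*groups)
print(f"F-statistic: {fvalue}, p-value: {pvalue}")
```

The same pattern would apply to the month and engine tests by changing the grouping column.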


### Results and Conclusions

**Result:** Reject the null hypothesis that mean search volume is equal across all athleisure-related keywords.

**Conclusion:** Keyword on its own is associated with a significant difference in average search volume for athleisure-related items.

### Tukey Test
 - We run the Tukey test to examine individual differences between specific groups, or keywords.

In [13]:
from statsmodels.stats.multicomp import pairwise_tukeyhsd
# perform multiple pairwise comparison (Tukey HSD)
m_comp = pairwise_tukeyhsd(endog=athleisure_df['volume'], groups=athleisure_df['keyword'], alpha=0.05)

In [14]:
tukey_data = pd.DataFrame(data=m_comp._results_table.data[1:], columns = m_comp._results_table.data[0])

group1_comp =tukey_data.loc[tukey_data.reject == True].groupby('group1').reject.count()
group2_comp = tukey_data.loc[tukey_data.reject == True].groupby('group2').reject.count()
tukey_data = pd.concat([group1_comp, group2_comp], axis=1)

tukey_data = tukey_data.fillna(0)
tukey_data.columns = ['reject1', 'reject2']
tukey_data['total_sum'] = tukey_data.reject1 + tukey_data.reject2



In [15]:
tukey_data.sort_values('total_sum',ascending=False).head(10)

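| keyword | reject1 | reject2 | total_sum |
| --- | --- | --- | --- |
| hoodie | 50.0 | 25.0 | 75.0 |
| running | 28.0 | 46.0 | 74.0 |
| sweatshirt | 11.0 | 62.0 | 73.0 |
| workout | 1.0 | 70.0 | 71.0 |
| flex | 33.0 | 13.0 | 46.0 |
| 3 stripes | 5.0 | 0.0 | 5.0 |
| spandex | 2.0 | 3.0 | 5.0 |
| muscle tank | 3.0 | 2.0 | 5.0 |
| muscle tee | 3.0 | 2.0 | 5.0 |
| pullover | 3.0 | 2.0 | 5.0 |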


### Results and Conclusions

The five keywords that differ significantly from the largest number of other keywords in the pairwise comparisons are:
- **hoodie, running, sweatshirt, workout, flex**  
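
The per-keyword rejection counts can also be computed without splitting into `group1` and `group2` columns first. The sketch below assumes a table with `group1`, `group2`, and `reject` columns, like the Tukey results table; the rows are made-up toy values and the `pairs` name is hypothetical.

```python
import pandas as pd

# Toy pairwise results (hypothetical) shaped like a Tukey HSD summary table
pairs = pd.DataFrame({
    "group1": ["a", "a", "a", "b", "b", "c"],
    "group2": ["b", "c", "d", "c", "d", "d"],
    "reject": [True, True, False, False, True, False],
})

# Keep only the pairs where the null hypothesis was rejected
rejected = pairs[pairs["reject"]]

# Stack both group columns and count how often each group appears
counts = (
    pd.concat([rejected["group1"], rejected["group2"]])
    .value_counts()
    .rename("total_rejections")
)
print(counts)
```

Stacking the two columns before counting avoids the `fillna(0)` step, because a group that appears in only one column is still counted.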

<br>
<br>
<br>


***


### Hypothesis Test 2

Are there any differences between months when considering search volume?

$H_{02}$ - People will be equally likely to search for activewear-related terms in any given month.  
$H_{A2}$ - People will be more likely to search for activewear-related terms depending on the month.

Average Volume(Jan) = Average Volume(Feb) = Average Volume(Mar) = Average Volume(Apr) = Average Volume(May) = Average Volume(Jun) = Average Volume(Jul) = Average Volume(Aug) = Average Volume(Sep) = Average Volume(Oct) = Average Volume(Nov) = Average Volume(Dec)

In [17]:
keys = list(athleisure_df.month_abbr.unique())

values = []
for month in list(athleisure_df.month_abbr.unique()):
    values.append(list(athleisure_df.loc[athleisure_df['month_abbr'] == month, 'volume']))

data = dict(zip(keys, values))

month_df = pd.DataFrame.from_dict(data)

In [18]:
import scipy.stats as stats
# stats.f_oneway takes the groups as input and returns the F-statistic and p-value
fvalue, pvalue = stats.f_oneway(month_df['Jan'],
                                month_df['Feb'], 
                                month_df['Mar'], 
                                month_df['Apr'],
                                month_df['May'],
                                month_df['Jun'],
                                month_df['Jul'],
                                month_df['Aug'],
                                month_df['Sep'],
                                month_df['Oct'],
                                month_df['Nov'],
                                month_df['Dec'])

print(f"Results of ANOVA test:\n The F-statistic is: {fvalue}\n The p-value is: {pvalue}")

Results of ANOVA test:
 The F-statistic is: 0.5315732325700696
 The p-value is: 0.8831258135517717


In [19]:
# get ANOVA table as R like output
import statsmodels.api as sm
from statsmodels.formula.api import ols
# reshape the month_df dataframe into long format for the statsmodels package
month_df_melt = pd.melt(month_df.reset_index(), id_vars=['index'], value_vars=['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'])
# replace column names
month_df_melt.columns = ['index', 'month', 'volume']
# Ordinary Least Squares (OLS) model
model = ols('volume ~ C(month)', data=month_df_melt).fit()
anova_table = sm.stats.anova_lm(model, typ=2)
anova_table

| term | sum_sq | df | F | PR(>F) |
| --- | --- | --- | --- | --- |
| C(month) | 859181600.0 | 11.0 | 0.531573 | 0.883126 |
| Residual | 377332500000.0 | 2568.0 | | |
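
As a worked check, the F-statistic in the ANOVA table is the ratio of the two mean squares, where each mean square is a sum of squares divided by its degrees of freedom. The sketch below plugs in the rounded values printed in the table, so the result matches only to the displayed precision.

```python
from scipy import stats

# Values copied from the ANOVA table (rounded as displayed)
ss_month, df_month = 859181600.0, 11
ss_resid, df_resid = 377332500000.0, 2568

# Mean square = sum of squares / degrees of freedom
ms_month = ss_month / df_month
ms_resid = ss_resid / df_resid

# F is the between-group mean square over the residual mean square
f_stat = ms_month / ms_resid

# Upper-tail probability of the F distribution gives the p-value
p_value = stats.f.sf(f_stat, df_month, df_resid)
print(f"F = {f_stat:.6f}, p = {p_value:.6f}")
```

An F-statistic below 1 means the variation between months is smaller than the variation within months, which is consistent with failing to reject the null hypothesis.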


### Results and Conclusions 

**Result:** Fail to reject the null hypothesis that mean search volume is equal across all months.

**Conclusion:** Month on its own is not associated with a significant difference in search volume for athleisure-related items.

<br>
<br>
There is no need to run the Tukey multiple comparisons test because we failed to reject the null hypothesis.
<br>
<br>
<br>
<br>

***

### Hypothesis Test 3 

Are there any differences among search engines when considering search volumes?

$H_{03}$ - There will be an equal search volume for activewear-related terms on any platform.  
$H_{A3}$ - There will be a greater search volume for activewear-related terms on one particular platform.

Average Volume(Google) = Average Volume(YouTube) = Average Volume(Amazon)

In [20]:
keys = list(athleisure_df.engine.unique())

values = []
for engine in list(athleisure_df.engine.unique()):
    values.append(list(athleisure_df.loc[athleisure_df['engine'] == engine, 'volume']))

data = dict(zip(keys, values))

In [21]:
import scipy.stats as stats
# stats.f_oneway takes the groups as input and returns the F-statistic and p-value
fvalue, pvalue = stats.f_oneway(data['google'],
                                data['youtube'], 
                                data['amazon'])

print(f"Results of ANOVA test:\n The F-statistic is: {fvalue}\n The p-value is: {pvalue}")

Results of ANOVA test:
 The F-statistic is: 40.08443136373594
 The p-value is: 7.19196465389629e-18


### Results and Conclusions

**Result:** Reject the null hypothesis that mean search volume is equal across all search engines.

**Conclusion:** Search engine on its own is associated with a significant difference in average search volume for athleisure-related items.

### Tukey Test

In [22]:
from statsmodels.stats.multicomp import pairwise_tukeyhsd
# perform multiple pairwise comparison (Tukey HSD)
m_comp = pairwise_tukeyhsd(endog=athleisure_df['volume'], groups=athleisure_df['engine'], alpha=0.05)
print(m_comp)

    Multiple Comparison of Means - Tukey HSD, FWER=0.05     
group1  group2  meandiff p-adj    lower      upper    reject
------------------------------------------------------------
amazon  google -3527.733  0.001 -4851.7536 -2203.7124   True
amazon youtube 1442.6709 0.0373    66.3663  2818.9755   True
google youtube 4970.4039  0.001   3615.639  6325.1687   True
------------------------------------------------------------


### Results and Conclusions

**Result:** In all cases, reject the null hypothesis that search engine 1 is equal to search engine 2 in terms of average search volume.

**Conclusion:** Search volumes are unique to each platform.  
<br>

Ordering the platforms by mean search volume, as implied by the mean differences: YouTube > Amazon > Google
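
The ordering can be derived from the `meandiff` column, which statsmodels reports as the mean of `group2` minus the mean of `group1`. The sketch below fixes Amazon at zero as an arbitrary reference point (an assumption made only for ranking) and uses the values from the Tukey table.

```python
# meandiff values copied from the Tukey table: mean(group2) - mean(group1)
amazon_to_google = -3527.733
amazon_to_youtube = 1442.6709

# Express each platform's mean relative to Amazon (reference set to 0)
relative_mean = {
    "amazon": 0.0,
    "google": amazon_to_google,
    "youtube": amazon_to_youtube,
}

# Sort platforms from highest to lowest relative mean
ranking = sorted(relative_mean, key=relative_mean.get, reverse=True)
print(" > ".join(ranking))

# Consistency check against the third row of the table (google vs youtube)
implied_google_to_youtube = relative_mean["youtube"] - relative_mean["google"]
print(round(implied_google_to_youtube, 4))
```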
<br>
<br>
<br>
***

## 6. 2-Factor ANOVA

Can we determine which specific 2-factor combinations of keyword/month/search engine generate the highest search volume?

$H_{01}$ - All keyword/engine combinations are equal in terms of mean search volume.  
$H_{A1}$ - Some keyword/engine combinations have greater mean search volume.



$H_{02}$ - All keyword/month combinations are equal in terms of mean search volume.  
$H_{A2}$ - Some keyword/month combinations have greater mean search volume.


$H_{03}$ - All engine/month combinations are equal in terms of mean search volume.  
$H_{A3}$ - Some engine/month combinations have greater mean search volume.

In [25]:
anova_df = athleisure_df.loc[:,['volume','keyword','engine','month_abbr']]

In [31]:
# ANOVA results with combinations of 2 groups:
formula = 'volume ~ C(keyword) + C(engine) + C(month_abbr) + C(keyword):C(engine) + C(keyword):C(month_abbr) + C(engine):C(month_abbr)'
lm = ols(formula, anova_df).fit()
table = sm.stats.anova_lm(lm, typ=2)
print(table)



                                sum_sq      df          F         PR(>F)
C(keyword)                1.021546e+11    76.0  19.630897  9.219712e-149
C(engine)                 1.072592e+10     2.0  78.324956   4.462158e-33
C(month_abbr)             8.591816e+08    11.0   1.140743   3.249134e-01
C(keyword):C(engine)      1.156474e+11   152.0  11.111892  1.008919e-151
C(keyword):C(month_abbr)  5.446891e+10   836.0   0.951564   7.896266e-01
C(engine):C(month_abbr)   1.143128e+09    22.0   0.758871   7.789742e-01
Residual                  1.024321e+11  1496.0        NaN            NaN


### Results and Conclusions

**Result:**
- Reject the null hypothesis that the mean search volume is equal among all Keyword/Engine combinations
- Fail to reject the null hypothesis that the mean search volume is equal among all Keyword/Month and Engine/Month combinations  

**Conclusion:**
- Mean search volumes differ significantly across Keyword/Engine combinations
- Mean search volumes do not differ significantly across Keyword/Month or Engine/Month combinations
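
The results above follow from comparing each term's p-value with alpha. As a small sketch, the p-values printed in the ANOVA table can be filtered directly; the dictionary is copied by hand from that output.

```python
# p-values copied from the PR(>F) column of the 2-factor ANOVA table
p_values = {
    "C(keyword)": 9.219712e-149,
    "C(engine)": 4.462158e-33,
    "C(month_abbr)": 3.249134e-01,
    "C(keyword):C(engine)": 1.008919e-151,
    "C(keyword):C(month_abbr)": 7.896266e-01,
    "C(engine):C(month_abbr)": 7.789742e-01,
}

alpha = 0.05

# Terms whose p-value falls below alpha are treated as significant
significant = [term for term, p in p_values.items() if p < alpha]
print(significant)
```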
<br>
<br>
<br>
***

### Tukey Test

In [35]:
anova_df['combination'] = anova_df.keyword + " / " + anova_df.engine

In [36]:
from statsmodels.stats.multicomp import pairwise_tukeyhsd
# perform multiple pairwise comparison (Tukey HSD)
m_comp = pairwise_tukeyhsd(endog=anova_df['volume'], groups=anova_df['combination'], alpha=0.05)

In [37]:
tukey_data = pd.DataFrame(data=m_comp._results_table.data[1:], columns = m_comp._results_table.data[0])

group1_comp =tukey_data.loc[tukey_data.reject == True].groupby('group1').reject.count()
group2_comp = tukey_data.loc[tukey_data.reject == True].groupby('group2').reject.count()
tukey_data = pd.concat([group1_comp, group2_comp], axis=1)

tukey_data = tukey_data.fillna(0)
tukey_data.columns = ['reject1', 'reject2']
tukey_data['total_sum'] = tukey_data.reject1 + tukey_data.reject2

tukey_data.sort_values('total_sum',ascending=False).head(20)



| combination | reject1 | reject2 | total_sum |
| --- | --- | --- | --- |
| running / youtube | 85.0 | 128.0 | 213.0 |
| hoodie / youtube | 142.0 | 71.0 | 213.0 |
| sweatshirt / youtube | 35.0 | 177.0 | 212.0 |
| workout / amazon | 4.0 | 206.0 | 210.0 |
| flex / youtube | 148.0 | 60.0 | 208.0 |
| hoodie / amazon | 141.0 | 67.0 | 208.0 |
| leggings / amazon | 127.0 | 67.0 | 194.0 |
| workout / youtube | 2.0 | 191.0 | 193.0 |
| joggers / amazon | 126.0 | 63.0 | 189.0 |
| cotton / youtube | 155.0 | 28.0 | 183.0 |


### Results and Conclusions

The 10 Keyword/Engine combinations that differ significantly in search volume from the largest number of other combinations are:  

**running / youtube**  
**hoodie / youtube**  
**sweatshirt / youtube**   
**workout / amazon**  
**flex / youtube**  
**hoodie / amazon**  
**leggings / amazon**    
**workout / youtube**  
**joggers / amazon**  
**cotton / youtube**  
<br>
<br>
<br>

***

## 7. 3-Factor ANOVA

In [33]:
anova_df['combination2'] = anova_df.keyword + " / " + anova_df.month_abbr + " / " + anova_df.engine

**Cannot run below due to lack of processing power**

In [32]:
# ANOVA results with combinations of 2 and 3 groups:
formula = 'volume ~ C(keyword) + C(engine) + C(month_abbr) + C(keyword):C(engine) + C(keyword):C(month_abbr) + C(engine):C(month_abbr) + C(engine):C(month_abbr):C(keyword)'
lm = ols(formula, anova_df).fit()
table = sm.stats.anova_lm(lm, typ=2)

ValueError: array must not contain infs or NaNs
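
One possible contributor to this error, offered as an assumption rather than something the notebook verifies, is that the three-way interaction asks for more keyword/engine/month cells than there are observations. The sketch below derives the counts from the degrees of freedom reported in the earlier ANOVA tables (levels = df + 1, observations = total df + 1).

```python
# Factor levels inferred from the main-effect degrees of freedom (df + 1)
n_keywords = 76 + 1
n_engines = 2 + 1
n_months = 11 + 1

# Number of distinct keyword/engine/month cells in a full three-way model
n_cells = n_keywords * n_engines * n_months

# Observations inferred from the one-way month ANOVA: df_month + df_resid + 1
n_obs = 11 + 2568 + 1

print(f"cells: {n_cells}, observations: {n_obs}")
print(f"more cells than observations: {n_cells > n_obs}")
```

If every cell holds at most one observation, the model with the three-way term may leave no residual degrees of freedom, in which case the F-tests cannot be computed.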

In [None]:
from statsmodels.stats.multicomp import pairwise_tukeyhsd
# perform multiple pairwise comparison (Tukey HSD)
m_comp = pairwise_tukeyhsd(endog=anova_df['volume'], groups=anova_df['combination2'], alpha=0.05)
print(m_comp)

In [None]:
tukey_data = pd.DataFrame(data=m_comp._results_table.data[1:], columns = m_comp._results_table.data[0])

group1_comp =tukey_data.loc[tukey_data.reject == True].groupby('group1').reject.count()
group2_comp = tukey_data.loc[tukey_data.reject == True].groupby('group2').reject.count()
tukey_data = pd.concat([group1_comp, group2_comp], axis=1)

tukey_data = tukey_data.fillna(0)
tukey_data.columns = ['reject1', 'reject2']
tukey_data['total_sum'] = tukey_data.reject1 + tukey_data.reject2

tukey_data.sort_values('total_sum',ascending=False).head(20)