# Lab 04

## Section 1: Group Comparisons with Continuous Data

The purpose of this lab is to learn some of the functions used for exploratory data analysis and statistical analysis of continuous data.

For this lab, we'll use the **Analysis of Variance (ANOVA)** technique to understand the difference in heights and weights between males of different countries. Do men from different countries differ significantly in height or weight?


In [None]:
import pandas as pd
import streamlit as st
import altair as alt
import seaborn as sns
from statsmodels.stats.multicomp import pairwise_tukeyhsd

### 1. Read the `males_ht_wt_ctry.csv` file into a dataframe

In [None]:
males = pd.read_csv('males_ht_wt_cntry.csv')
males

### 2. Examine the data

* Display some rows to make sure it imported correctly.
* Generate histograms of heights by country using Altair or Seaborn.
* Generate histograms of weights by country using Altair or Seaborn.

In [None]:
# Selector for Height or Weight
metric = st.selectbox(
    label = 'Metric',
    options = ['Height','Weight'],
    index = 0
)

# Selector for which Countries to show
countries = st.multiselect(
    label = 'Countries',
    options = males['Country'].unique(),
    default = males['Country'].unique()
)

# Create a subset of our data based on the selected Countries and Metric
males_subset = males[males['Country'].isin(countries)][['Country',metric]]

# Build a histogram
chart = alt.Chart(males_subset).mark_bar().encode(
    x=alt.X(metric, title=metric, bin=alt.Bin(maxbins=20)),
    y=alt.Y("count()", title="Count"),
    color="Country",
    column="Country"
)

st.altair_chart(chart)

In [None]:
# Seaborn's histplot can overlay a KDE curve on the histogram (kde=True)
sns.histplot(males,
        x='Weight',
        hue='Country',
        alpha=.2,
        kde=True)

### 3. Conduct ANOVA to determine if the weights differ by nationality

Use this [link](https://towardsdev.com/anova-test-in-python-af106c5142eb) as a reference. 

Make sure you use Levene’s test to check if the variance is close to equal.

In [None]:
from scipy import stats

# Build one Series of weights per Country (missing values dropped)
groups_w = [males.loc[males['Country'] == c, 'Weight'].dropna() for c in males['Country'].unique()]

F, p = stats.f_oneway(*groups_w)

print(f'ANOVA F-stat: {F:.3f}')
print(f'ANOVA p-val:  {p:.3g}')

In [None]:
lev, p = stats.levene(*groups_w, center='median')

print(f'Levene Stat:   {lev:.3f}')
print(f'Levene p-val:  {p:.3g}')

H0: The mean weights of males are the same across all countries.

H1: The mean weight of males from at least one country differs from that of the other countries.


The very small p-value for the ANOVA tells us to reject the null hypothesis and conclude that males from at least one country have a mean weight that differs significantly from the other countries.

We reject **H0** and accept **H1**.
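
To make the F-statistic less of a black box, here is a minimal sketch of the one-way ANOVA calculation in plain Python: the ratio of between-group to within-group mean squares. The function name `one_way_anova_f` and the toy data are illustrative assumptions, not part of the lab dataset. It should agree with the statistic returned by `stats.f_oneway`, but it does not compute a p-value.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of numeric groups."""
    k = len(groups)                                  # number of groups
    n_total = sum(len(g) for g in groups)            # total observations
    group_means = [mean(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of observations around their own group mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Toy data (hypothetical): the third group sits clearly above the other two
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat = one_way_anova_f(toy_groups)
print(f'F = {f_stat:.3f}')
```

A large F means the group means are spread out relative to the scatter inside each group, which is what produces a small p-value.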


We also used Levene's test to check whether the **variances of male weights** are the same across countries. We got a p-value > 0.05 in this case, so we cannot reject the null hypothesis that the variances are equal. Therefore, we can use a standard ANOVA rather than Welch's ANOVA.
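
With `center='median'`, Levene's test is effectively a one-way ANOVA run on each observation's absolute deviation from its group median. The sketch below illustrates that idea in plain Python. The function name `levene_median` and the toy data are hypothetical, and only the test statistic is computed, not the p-value.

```python
from statistics import mean, median


def levene_median(groups):
    """Return the Levene statistic using deviations from each group's median."""
    # Replace every observation by its absolute distance from the group median
    z = []
    for g in groups:
        centre = median(g)
        z.append([abs(x - centre) for x in g])

    k = len(z)
    n_total = sum(len(g) for g in z)
    z_means = [mean(g) for g in z]
    z_grand = sum(sum(g) for g in z) / n_total

    # Same between/within decomposition as ANOVA, applied to the deviations
    between = sum(len(g) * (m - z_grand) ** 2 for g, m in zip(z, z_means))
    within = sum((v - m) ** 2 for g, m in zip(z, z_means) for v in g)
    return (n_total - k) / (k - 1) * between / within


# Toy data (hypothetical): the second group is twice as spread out as the first
w_stat = levene_median([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]])
print(f'Levene statistic = {w_stat:.3f}')
```

Groups with identical spread give a statistic of 0 regardless of where their centres lie, which is why the test speaks to variances rather than means.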


### 4. Groupwise Comparison

Using [this link](https://www.statology.org/two-sample-t-test-python/) as a reference, compare the Italians to the Dutch. Then compare the Americans to the Dutch.

In [None]:
italy, usa, netherlands = [males.loc[males['Country'] == c, 'Weight'].dropna() for c in ['Italy','USA','Netherlands']]

t, p = stats.ttest_ind(italy, netherlands)
print(f'Italy-Netherlands : t-stat: {t:.3f}')
print(f'                  : p-val : {p:.3g}')

t, p = stats.ttest_ind(usa, netherlands)
print(f'USA  -Netherlands : t-stat: {t:.3f}')
print(f'                  : p-val : {p:.3g}')

t, p = stats.ttest_ind(usa, italy)
print(f'USA  -Italy       : t-stat: {t:.3f}')
print(f'                  : p-val : {p:.3g}')


What we can conclude from this groupwise analysis is that Italian males differ in weight from both Dutch and American males, while American and Dutch males cannot be distinguished.
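
By default, `stats.ttest_ind` assumes equal variances and pools them across the two samples. As a rough sketch of what that statistic measures, the hypothetical helper `pooled_t` below reproduces the calculation in plain Python on toy data (not the lab dataset). It returns only the t-statistic, not the p-value.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Return Student's two-sample t statistic using a pooled variance."""
    na, nb = len(a), len(b)
    # Weighted average of the two sample variances
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    # Difference in means divided by its estimated standard error
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))


# Toy data (hypothetical): same spread, means three units apart
t_stat = pooled_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(f't = {t_stat:.3f}')
```

The sign only reflects the order of the arguments. The magnitude is what determines significance.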

### 5. False Positives?

Conducting multiple tests like this increases the odds of getting falsely significant results. If you conduct tests for 3 comparisons (Italian vs Dutch, Italian vs American, American vs Dutch), what is the probability that at least one of these t-tests reports a significant difference that is not real (i.e. a false positive)?

In [None]:
# The false positive rate for a single test is 0.05,
# so each test has a 0.95 chance of avoiding a false positive.
# Assuming independent tests, all 3 avoid one with probability 0.95 * 0.95 * 0.95.
# The chance of at least one false positive is then 1 - 0.95**3
print(f'Probability of at least one false positive: {1 - 0.95**3:.4f}')
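
The same arithmetic generalizes to any number of comparisons. The sketch below wraps it in a hypothetical helper, `family_wise_error_rate`, and shows how a Bonferroni-style per-test threshold (alpha divided by the number of tests) pulls the overall rate back under 0.05. It assumes the tests are independent, which is only an approximation here because the three pairwise comparisons share groups.

```python
def family_wise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_tests


alpha = 0.05
n_tests = 3

# Unadjusted: every test uses alpha = 0.05
unadjusted = family_wise_error_rate(alpha, n_tests)

# Bonferroni-style: every test uses alpha / n_tests instead
adjusted = family_wise_error_rate(alpha / n_tests, n_tests)

print(f'Unadjusted FWER for {n_tests} tests: {unadjusted:.4f}')
print(f'FWER with per-test alpha {alpha / n_tests:.4f}: {adjusted:.4f}')
```

The error rate climbs quickly with the number of comparisons, which is the motivation for adjusted procedures such as Tukey's.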

### 6. Family-Wise Error Rate (FWER)

When comparing these groups, it’s better to control the Family-Wise Error Rate (FWER). Use a multiple comparison procedure with a Tukey adjustment. See [this link](https://www.statology.org/tukey-test-python/) for how to do this using the `pairwise_tukeyhsd()` function (`statsmodels.stats.multicomp.pairwise_tukeyhsd`).

In [None]:
# Pairwise Tukey
pw_cmp = pairwise_tukeyhsd(endog=males['Weight'], groups=males['Country'], alpha=0.05)

print(pw_cmp)


In the Tukey results, we can see the p-value for the USA/Netherlands comparison is even larger (0.9119 versus 0.695). This is because the Tukey procedure adjusts p-values upward to account for the higher chance of a false positive when several comparisons are made.