This notebook contains all the code for the post about [carrying out a two-sample t-test in Python](https://www.marsja.se/how-to-perform-a-two-sample-t-test-with-python-3-different-methods/).

## Importing data
Here's the code to import the example data:

In [1]:
import pandas as pd

data = 'https://gist.githubusercontent.com/baskaufs/1a7a995c1b25d6e88b45/raw/4bb17ccc5c1e62c27627833a4f25380f27d30b35/t-test.csv'
df = pd.read_csv(data)

df.head()

| | grouping | height |
|---|---|---|
| 0 | men | 181.5 |
| 1 | men | 187.3 |
| 2 | men | 175.3 |
| 3 | men | 178.3 |
| 4 | men | 169.0 |


## How to Subset Data
Here's how to create two variables, one per group, by subsetting the `height` column of the data frame:

In [2]:
# Subset data
male = df.query('grouping == "men"')['height']
female = df.query('grouping == "women"')['height']
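
The same subsets can also be taken with a boolean mask instead of `query()`. The sketch below is a minimal illustration on a small stand-in data frame (the name `toy` is hypothetical, and its six rows are only a handful of the values visible in this notebook's outputs, not the full data set):

```python
import pandas as pd

# Stand-in data: a few heights that appear in the outputs of this notebook.
toy = pd.DataFrame({
    "grouping": ["men", "men", "men", "women", "women", "women"],
    "height": [181.5, 187.3, 175.3, 165.2, 170.3, 181.1],
})

# query() version, as used in the cell above
men_query = toy.query('grouping == "men"')["height"]

# Boolean-mask version: select rows where grouping is "men", then the height column
men_mask = toy.loc[toy["grouping"] == "men", "height"]

# Both approaches return the same Series (same values, same index)
same_result = men_query.equals(men_mask)
print(same_result)
```

Either form works; the mask version can be handier when the group label is held in a Python variable.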

## Summary Statistics
Here's how to quickly compute summary statistics for each group:

In [3]:
df.groupby('grouping').describe()

Summary of `height`:

| grouping | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|
| men | 7.0 | 179.871429 | 6.216836 | 169.0 | 176.8 | 181.5 | 183.85 | 187.3 |
| women | 7.0 | 171.057143 | 5.697619 | 165.2 | 166.65 | 170.3 | 173.75 | 181.1 |
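
If only a few statistics are needed, `agg()` can replace `describe()`. This is a minimal sketch on a stand-in data frame (`toy` is a hypothetical name, and the rows are a small subset of values shown in this notebook, so the numbers will not match the full table):

```python
import pandas as pd

# Stand-in data: a few heights that appear in the outputs of this notebook.
toy = pd.DataFrame({
    "grouping": ["men", "men", "men", "women", "women", "women"],
    "height": [181.5, 187.3, 175.3, 165.2, 170.3, 181.1],
})

# Per-group count, mean, and sample standard deviation of height
summary = toy.groupby("grouping")["height"].agg(["count", "mean", "std"])
print(summary)
```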


## Normality Tests
Here's how to run the Shapiro-Wilk test of normality on each group:

In [None]:
from scipy import stats

stats.shapiro(male)

In [None]:
stats.shapiro(female)
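
`stats.shapiro` returns a test statistic and a p-value. A rough sketch of how the result might be read follows; the 0.05 threshold is a common convention assumed here, and the sample is just the five heights shown by `df.head()`, not a full group:

```python
from scipy import stats

# The five heights shown in the df.head() output (not the full "men" group)
sample = [181.5, 187.3, 175.3, 178.3, 169.0]

# Unpack the W statistic and the p-value
w_stat, p_value = stats.shapiro(sample)

alpha = 0.05  # assumed significance level
# A p-value above alpha means the test found no evidence against normality
no_evidence_against_normality = bool(p_value > alpha)

print(f"W = {w_stat:.3f}, p = {p_value:.3f}")
```

With samples this small the test has little power, so a non-significant result is weak evidence at best.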

## Levene's Test of Equal Variances
Here's how to test whether the two groups have equal variances:

In [None]:
stats.levene(male, female)
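
One possible use of Levene's test is to decide between Student's and Welch's t-test. The sketch below shows that idea under stated assumptions: the 0.05 cut-off is a convention, the two lists are small subsets of values visible in this notebook, and choosing the test from the data in this way is only one of several approaches.

```python
from scipy import stats

# Small subsets of the heights that appear in this notebook's outputs
group_a = [181.5, 187.3, 175.3, 178.3, 169.0]
group_b = [165.2, 170.3, 181.1]

# SciPy's levene() centers on the median by default (center='median')
lev_stat, lev_p = stats.levene(group_a, group_b)

alpha = 0.05  # assumed significance level
# Treat variances as equal unless Levene's test rejects that hypothesis
equal_var = bool(lev_p > alpha)

# equal_var=True gives Student's t-test, equal_var=False gives Welch's t-test
t_stat, t_p = stats.ttest_ind(group_a, group_b, equal_var=equal_var)
print(f"Levene p = {lev_p:.3f}, equal_var = {equal_var}, t = {t_stat:.3f}")
```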

## T-test using SciPy
Here's how to run an independent two-sample t-test that assumes equal variances:

In [None]:
res = stats.ttest_ind(male, female, 
                      equal_var=True)

display(res)
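
As a cross-check, the same equal-variance t-test can be computed from summary statistics alone with `stats.ttest_ind_from_stats`. The means, standard deviations, and counts below are copied from the `describe()` output earlier in this notebook, so the result should agree with the test on the raw data up to rounding:

```python
from scipy import stats

# Values copied from the describe() table: mean, std, and count per group
t_stat, p_value = stats.ttest_ind_from_stats(
    mean1=179.871429, std1=6.216836, nobs1=7,   # men
    mean2=171.057143, std2=5.697619, nobs2=7,   # women
    equal_var=True,                             # pooled-variance (Student's) t-test
)

print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```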

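## T-test with Pingouin
Here's the same test with Pingouin; `correction=False` turns off the Welch correction for unequal variances: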

In [None]:
import pingouin as pg

res = pg.ttest(male, female, correction=False)


display(res)
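
Pingouin's result table reports more than the t statistic; to my knowledge it also includes an effect size (Cohen's d). As a rough illustration of what that number means, here is a hand computation of Cohen's d with a pooled standard deviation, using the summary statistics from the `describe()` output above. This is a sketch of the textbook formula, not Pingouin's own code.

```python
import math

# Summary statistics copied from the describe() table earlier in this notebook
mean_men, sd_men, n_men = 179.871429, 6.216836, 7
mean_women, sd_women, n_women = 171.057143, 5.697619, 7

# Pooled standard deviation: weighted average of the two sample variances
pooled_sd = math.sqrt(
    ((n_men - 1) * sd_men**2 + (n_women - 1) * sd_women**2)
    / (n_men + n_women - 2)
)

# Cohen's d: difference in means expressed in pooled standard deviations
cohens_d = (mean_men - mean_women) / pooled_sd
print(f"Cohen's d = {cohens_d:.2f}")
```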

## T-test using Statsmodels
Here's the same test with Statsmodels, whose `ttest_ind` uses a pooled variance by default:

In [None]:
from statsmodels.stats.weightstats import ttest_ind


ttest_ind(male, female)
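
The Statsmodels function returns three values: the t statistic, the p-value, and the degrees of freedom. The sketch below works through the pooled-variance formula by hand to show where those three numbers come from. It uses NumPy and SciPy only, and the two arrays are small subsets of heights visible in this notebook, so the values differ from the full analysis.

```python
import numpy as np
from scipy import stats

# Small subsets of the heights that appear in this notebook's outputs
a = np.array([181.5, 187.3, 175.3, 178.3, 169.0])
b = np.array([165.2, 170.3, 181.1])

n_a, n_b = len(a), len(b)
dof = n_a + n_b - 2  # degrees of freedom for the pooled test

# Pooled variance: sample variances weighted by their degrees of freedom
pooled_var = ((n_a - 1) * a.var(ddof=1) + (n_b - 1) * b.var(ddof=1)) / dof

# t statistic: mean difference divided by its standard error
t_stat = (a.mean() - b.mean()) / np.sqrt(pooled_var * (1 / n_a + 1 / n_b))

# Two-sided p-value from the t distribution's survival function
p_value = 2 * stats.t.sf(abs(t_stat), dof)

print(t_stat, p_value, dof)
```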

## Plotting
Here's how to create a boxplot and a violin plot of height by group:


### Boxplot in Python


In [None]:
import seaborn as sns

sns.boxplot(x='grouping', y='height', data=df)

### Violin Plot with Python


In [None]:
import seaborn as sns

sns.violinplot(x='grouping', y='height', data=df)
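
The two plots can also be drawn side by side on shared axes, which makes them easier to compare. This is a minimal sketch, assuming Matplotlib is installed alongside Seaborn; `toy` is a hypothetical stand-in holding only a few of the heights shown in this notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs as a script; drop in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Stand-in data: a few heights that appear in the outputs of this notebook.
toy = pd.DataFrame({
    "grouping": ["men"] * 5 + ["women"] * 3,
    "height": [181.5, 187.3, 175.3, 178.3, 169.0, 165.2, 170.3, 181.1],
})

# One row, two panels, sharing the y-axis so heights line up
fig, axes = plt.subplots(1, 2, figsize=(8, 4), sharey=True)

sns.boxplot(x="grouping", y="height", data=toy, ax=axes[0])
axes[0].set_title("Boxplot")

sns.violinplot(x="grouping", y="height", data=toy, ax=axes[1])
axes[1].set_title("Violin plot")

fig.tight_layout()
```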