# One-Tailed t-Test

In [8]:
from scipy import stats
import numpy as np

new_machine = [42.1, 41, 41.3, 41.8, 42.4, 42.8, 43.2, 42.3, 41.8, 42.7]
old_machine = [42.7, 43.6, 43.8, 43.3, 42.5, 43.5, 43.1, 41.7, 44, 44.1]

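# Independent two-sample t-test (equal variances assumed by default).
# Halving the two-tailed p-value gives the one-tailed p-value only when the sign
# of t_stat matches the alternative hypothesis (here: new mean < old mean, so t < 0).
t_stat, p_value_two_tailed = stats.ttest_ind(new_machine, old_machine)
p_value_one_tailed = p_value_two_tailed / 2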

mean_new_machine = np.mean(new_machine)
mean_old_machine = np.mean(old_machine)

(mean_new_machine, mean_old_machine, t_stat, p_value_one_tailed)



(42.14, 43.230000000000004, -3.3972307061176026, 0.0016055712503872579)

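Null hypothesis: The new machine does not pack faster than the old machine. Since the one-tailed p-value (about 0.0016) is less than 0.05, we reject the null hypothesis. The new machine's mean (42.14) is significantly lower than the old machine's (43.23), so we conclude that the new machine is faster on average than the old machine.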
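As a cross-check on the halving step above, the same one-tailed test can be requested directly through the `alternative` parameter of `stats.ttest_ind`. This is a minimal sketch that assumes SciPy 1.6 or newer, where that parameter is available; it reuses the data from the cell above.

```python
from scipy import stats

new_machine = [42.1, 41, 41.3, 41.8, 42.4, 42.8, 43.2, 42.3, 41.8, 42.7]
old_machine = [42.7, 43.6, 43.8, 43.3, 42.5, 43.5, 43.1, 41.7, 44, 44.1]

# alternative='less' tests H1: mean(new_machine) < mean(old_machine)
# (assumes SciPy >= 1.6, where the `alternative` parameter exists)
one_sided = stats.ttest_ind(new_machine, old_machine, alternative='less')
two_sided = stats.ttest_ind(new_machine, old_machine)

print(f"one-sided p-value:     {one_sided.pvalue}")
print(f"two-sided p-value / 2: {two_sided.pvalue / 2}")
```

The two printed values should agree because the t-statistic is negative, which is the direction the alternative hypothesis predicts. Had the statistic been positive, the correct one-tailed p-value would be `1 - p_value_two_tailed / 2`, and passing `alternative` handles that case automatically.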

# Matched Pairs Test

In [11]:
import pandas as pd
from scipy import stats


pokemon_df = pd.read_csv('pokemon.csv')

print(pokemon_df.head())

t_stat, p_value = stats.ttest_rel(pokemon_df['Attack'], pokemon_df['Defense'])

print(f"t-statistic: {t_stat}, p-value: {p_value}")




   #                   Name Type 1  Type 2  Total  HP  Attack  Defense  \
0  1              Bulbasaur  Grass  Poison    318  45      49       49   
1  2                Ivysaur  Grass  Poison    405  60      62       63   
2  3               Venusaur  Grass  Poison    525  80      82       83   
3  3  VenusaurMega Venusaur  Grass  Poison    625  80     100      123   
4  4             Charmander   Fire     NaN    309  39      52       43   

   Sp. Atk  Sp. Def  Speed  Generation  Legendary  
0       65       65     45           1      False  
1       80       80     60           1      False  
2      100      100     80           1      False  
3      122      120     80           1      False  
4       60       50     65           1      False  
t-statistic: 4.325566393330478, p-value: 1.7140303479358558e-05


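Because the p-value (about 1.7e-05) is much smaller than our significance level of 0.05, we reject the null hypothesis that the mean Attack and Defense scores are equal. This means there is a statistically significant difference between the Pokémon's Attack and Defense scores in the dataset provided; the positive t-statistic indicates that Attack is higher than Defense on average.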
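To see what the matched pairs test does under the hood, the sketch below shows that `stats.ttest_rel` gives the same result as a one-sample t-test on the per-row differences. It uses only the five Attack and Defense values visible in the printed `head()` output, so the numbers illustrate the mechanics and do not reproduce the full-dataset result.

```python
import numpy as np
from scipy import stats

# First five rows of the Attack and Defense columns, copied from the head() output.
# This is a tiny illustrative subset, not the full pokemon.csv dataset.
attack = np.array([49, 62, 82, 100, 52])
defense = np.array([49, 63, 83, 123, 43])

# Paired t-test on the two columns
paired = stats.ttest_rel(attack, defense)

# Equivalent formulation: one-sample t-test of the differences against a mean of 0
on_differences = stats.ttest_1samp(attack - defense, popmean=0)

print(f"ttest_rel:   t = {paired.statistic:.4f}, p = {paired.pvalue:.4f}")
print(f"ttest_1samp: t = {on_differences.statistic:.4f}, p = {on_differences.pvalue:.4f}")
```

Because the test works on differences within each row, it is appropriate here: both scores belong to the same Pokémon, so the two columns are not independent samples.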

# ANOVA

State the null hypothesis: There is no effect of the power level of the plasma beam on the etching rate, meaning the mean etching rates are equal across all levels of power.


State the alternate hypothesis: There is an effect of the power level on the etching rate, meaning at least one group's mean etching rate is different from the others.


What is the significance level: 0.05, the level typically used in most scientific studies unless there is a reason to choose a different one.

In [14]:
import pandas as pd
import scipy.stats as stats


data_path = 'anova_lab_data.xlsx'
anova_df = pd.read_excel(data_path)

anova_df.columns = anova_df.columns.str.strip()

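# Collect the etching rates for each power level, then run a one-way ANOVA across the groups
power_levels = anova_df['Power'].unique()
groups = [anova_df.loc[anova_df['Power'] == level, 'Etching Rate'] for level in power_levels]
f_value, p_value = stats.f_oneway(*groups)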

print(f"F-value: {f_value}, P-value: {p_value}")



F-value: 36.87895470100505, P-value: 7.506584272358903e-06


Since the p-value is much less than the common significance level of 0.05, we can reject the null hypothesis.

The conclusion from this experiment is that the power of the plasma beam does indeed have an effect on the etching rate.
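The F-value reported by `stats.f_oneway` is the ratio of between-group to within-group variance. The sketch below computes it by hand and compares it with SciPy's result. The etching rates here are hypothetical values made up for illustration, since the contents of `anova_lab_data.xlsx` are not shown; only the mechanics carry over.

```python
import numpy as np
from scipy import stats

# Hypothetical etching rates for three power levels (illustrative values only,
# NOT taken from anova_lab_data.xlsx)
groups = [
    np.array([5.4, 5.7, 5.5, 5.9]),
    np.array([6.1, 6.4, 6.0, 6.3]),
    np.array([7.0, 6.8, 7.2, 7.1]),
]

all_values = np.concatenate(groups)
grand_mean = all_values.mean()
k = len(groups)       # number of groups
n = all_values.size   # total number of observations

# Between-group and within-group sums of squares
ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# F is the ratio of the between-group mean square to the within-group mean square
f_manual = (ss_between / (k - 1)) / (ss_within / (n - k))
# p-value: upper tail of the F distribution with (k - 1, n - k) degrees of freedom
p_manual = stats.f.sf(f_manual, k - 1, n - k)

f_scipy, p_scipy = stats.f_oneway(*groups)
print(f"manual: F = {f_manual:.4f}, p = {p_manual:.6f}")
print(f"scipy:  F = {f_scipy:.4f}, p = {p_scipy:.6f}")
```

A large F-value means the group means differ by much more than the spread within each group would explain. Note that ANOVA only indicates that at least one mean differs; it does not say which power levels differ from each other.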