# Lab | Inferential statistics - T-test & P-value

### Instructions

1. *One-tailed t-test* - In a packing plant, a machine packs cartons with jars. It is supposed that a new machine will pack faster on average than the machine currently used. To test that hypothesis, the times each machine takes to pack ten cartons are recorded. The results are in seconds in the tables in the file `files_for_lab/machine.txt`.
   Assuming the conditions for a t-test are met, does the data provide sufficient evidence that one machine is better than the other?

In [1]:
import numpy as np
import pandas as pd
from scipy.stats import ttest_rel

In [27]:
data = pd.read_csv("files_for_lab/machine.txt", sep='\t', encoding='utf-16')


In [28]:
data = data.rename(columns={'    Old machine': 'old_machine', 'New machine': 'new_machine'})

In [41]:
data

Unnamed: 0,new_machine,old_machine
0,42.1,42.7
1,41.0,43.6
2,41.3,43.8
3,41.8,43.3
4,42.4,42.5
5,42.8,43.5
6,43.2,43.1
7,42.3,41.7
8,41.8,44.0
9,42.7,44.1


## H0: The new machine packs at the same speed as the old one. $\mu_1 = \mu_2$

In [44]:
# Extract the relevant columns into NumPy arrays
new_machine = np.array(data['new_machine'])
old_machine = np.array(data['old_machine'])

# paired t-test 
t_statistic, p_value = ttest_rel(new_machine, old_machine)

# significance level
alpha = 0.05

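# One-tailed test: halve the two-tailed p-value. This is valid only when the
# t-statistic has the sign predicted by H1 (H1 is new < old, so t should be negative).
p_value /= 2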

# Print results
print("T-Statistic:", t_statistic)
print("P-Value (one-tailed):", p_value)

# Check the hypothesis
if p_value < alpha:
    print("Reject the null hypothesis. There is sufficient evidence that the new machine works faster than the old machine.")
else:
    print("Fail to reject the null hypothesis. There is not sufficient evidence that the new machine works faster than the old machine.")

T-Statistic: -3.0614273841115844
P-Value (one-tailed): 0.0067701678258162605
Reject the null hypothesis. There is sufficient evidence that the new machine works faster than the old machine.
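As a cross-check, the paired t-statistic can be reproduced by hand from the per-carton differences. The sketch below copies the ten rows from the table above and assumes the two columns are matched row by row, as the use of `ttest_rel` implies; the variable names are illustrative only.

```python
import numpy as np
from scipy import stats

# Packing times in seconds, copied from the table above
new_machine = np.array([42.1, 41.0, 41.3, 41.8, 42.4, 42.8, 43.2, 42.3, 41.8, 42.7])
old_machine = np.array([42.7, 43.6, 43.8, 43.3, 42.5, 43.5, 43.1, 41.7, 44.0, 44.1])

# Paired test = one-sample t-test on the differences
diff = new_machine - old_machine
n = diff.size
t_manual = diff.mean() / (diff.std(ddof=1) / np.sqrt(n))

# Lower-tail probability, because H1 is "new < old"
p_one_tailed = stats.t.cdf(t_manual, df=n - 1)

# Reference values from SciPy (two-tailed p-value)
t_scipy, p_two_tailed = stats.ttest_rel(new_machine, old_machine)

print("Manual t:", t_manual, "SciPy t:", t_scipy)
print("Manual one-tailed p:", p_one_tailed, "SciPy p / 2:", p_two_tailed / 2)
```

If the cartons are not actually matched between the two machines, an independent-samples test such as `scipy.stats.ttest_ind` may be the more suitable choice.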


2. *Matched Pairs Test* - In this challenge we will compare dependent samples of data describing our Pokemon (file `files_for_lab/pokemon.csv`). Our goal is to see whether there is a significant difference between each Pokemon's defense and attack scores. Our hypothesis is that the defense and attack scores are equal. Compare the two columns to see if there is a statistically significant difference between them and comment on your result.

In [49]:
pokemon_data = pd.read_csv('files_for_lab/pokemon.csv')

In [50]:
pokemon_data

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,False
1,2,Ivysaur,Grass,Poison,405,60,62,63,80,80,60,1,False
2,3,Venusaur,Grass,Poison,525,80,82,83,100,100,80,1,False
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,False
4,4,Charmander,Fire,,309,39,52,43,60,50,65,1,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...
795,719,Diancie,Rock,Fairy,600,50,100,150,100,150,50,6,True
796,719,DiancieMega Diancie,Rock,Fairy,700,50,160,110,160,110,110,6,True
797,720,HoopaHoopa Confined,Psychic,Ghost,600,80,110,60,150,130,70,6,True
798,720,HoopaHoopa Unbound,Psychic,Dark,680,80,160,60,170,130,80,6,True


## H0: There is no difference between the defense and attack scores of Pokemon. $\mu_1 = \mu_2$

In [54]:
sample = pokemon_data.sample(n=30, random_state=42)  

# Extract the relevant columns (pandas Series)
defense_sample = sample['Defense']
attack_sample = sample['Attack']

# Perform a paired t-test (matched pairs test) on the sample
t_statistic, p_value = ttest_rel(defense_sample, attack_sample)

# Define significance level
alpha = 0.05

# Print results
print("T-Statistic:", t_statistic.round(3))
print("P-Value:", p_value.round(3))

if p_value < alpha:
    print("Reject the null hypothesis. There is a significant difference between defense and attack scores for Pokemon.")
else:
    print("Fail to reject the null hypothesis. There is no significant difference between defense and attack scores for Pokemon.")

T-Statistic: 2.061
P-Value: 0.048
Reject the null hypothesis. There is a significant difference between defense and attack scores for Pokemon.
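A matched pairs test is equivalent to a one-sample t-test on the per-row differences. The sketch below illustrates this using only the five Pokemon rows visible at the top of the table above, so its numbers will not match the 30-row sample result; it is meant to show the equivalence, not to reproduce the output.

```python
import numpy as np
from scipy.stats import ttest_rel, ttest_1samp

# Attack and Defense for the first five rows shown in the table above
attack = np.array([49, 62, 82, 100, 52])
defense = np.array([49, 63, 83, 123, 43])

# Paired t-test on the two columns
t_paired, p_paired = ttest_rel(defense, attack)

# One-sample t-test on the differences against a mean of zero
t_diff, p_diff = ttest_1samp(defense - attack, popmean=0)

print("Paired:     t =", t_paired, "p =", p_paired)
print("Difference: t =", t_diff, "p =", p_diff)
```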


# Inferential statistics - ANOVA

Note: The following lab is divided into 2 sections which represent activities 3 and 4.

## Part 1

In this activity, we will look at another example. Your task is to understand the problem and write down all the steps to set up ANOVA. After the next lesson, we will ask you to solve this problem using Python. Here are the steps that you would need to work on:
- Null hypothesis
- Alternate hypothesis
- Level of significance
- Test statistic
- P-value
- F table

### Context

Suppose you are working as an analyst in a microprocessor chip manufacturing plant. You have been given the task of analyzing a plasma etching process with respect to changing Power (in Watts) of the plasma beam. Data was collected and provided to you to conduct statistical analysis and check if changing the power of the plasma beam has any effect on the etching rate by the machine. You will conduct ANOVA and check if there is any difference in the mean etching rate for different levels of power. You can find the data in the `anova_lab_data.xlsx` file in the `files_for_lab` folder.

- State the null hypothesis
- State the alternate hypothesis
- What is the significance level?
- What are the degrees of freedom of the model, the error term, and the total?

Data were collected randomly and provided to you in the table as shown: [link to the image - Data](https://education-team-2020.s3-eu-west-1.amazonaws.com/data-analytics/7.05/7.05-lab_data.png)


## Part 2

- In this section, use Python to conduct ANOVA.
- What conclusions can you draw from the experiment and why?


In [61]:
anova_data = pd.read_excel('files_for_lab/anova_lab_data.xlsx')

H0: The mean etching rate is the same for all levels of power in the plasma beam.

H1: There is a significant difference in the mean etching rate for different levels of power in the plasma beam.

Significance level: set at 0.05.

Degrees of freedom:

- Degrees of freedom of the model (between groups): df_model = 2 (number of groups - 1)
- Degrees of freedom of the error (within groups): df_error = 12 (total observations - number of groups)
- Total degrees of freedom: df_total = 14 (total observations - 1)
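The degrees of freedom above, together with the critical value that would otherwise be looked up in an F table, can be sketched in a few lines. The group count (3) and total observation count (15) are taken from the figures stated above rather than read from the file, and the variable names are illustrative.

```python
from scipy.stats import f

n_groups = 3   # 160 W, 180 W, 200 W
n_total = 15   # implied by df_total = 14

df_model = n_groups - 1        # between groups
df_error = n_total - n_groups  # within groups
df_total = n_total - 1

# Critical F value at alpha = 0.05 (replaces the F-table lookup)
alpha = 0.05
f_critical = f.ppf(1 - alpha, df_model, df_error)

print("df_model:", df_model, "df_error:", df_error, "df_total:", df_total)
print("Critical F:", round(f_critical, 3))
```

H0 is rejected when the computed F-statistic exceeds this critical value.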

ANOVA Summary:

- Source of variation: Power
- Sum of squares: 18.783
- Degrees of freedom: 2
- Mean squares: 9.3915
- F-statistic: 35.77714286
- P-value: <0.0001 (very low, indicating strong evidence against the null hypothesis)
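The summary figures are linked by two relationships: each mean square is its sum of squares divided by its degrees of freedom, and F is the ratio of the between-group to the within-group mean square. The sketch below starts from the summary values above, backs out the within-group mean square they imply, and computes the upper-tail p-value. The within-group degrees of freedom (12) come from the setup earlier in this section.

```python
from scipy.stats import f

# Values taken from the ANOVA summary above
ss_between = 18.783
df_between = 2
df_within = 12
f_reported = 35.77714286

# Mean square between groups = SS / df
ms_between = ss_between / df_between

# Within-group mean square implied by the reported F = MS_between / MS_within
ms_within = ms_between / f_reported

# Upper-tail probability of the F distribution
p_value = f.sf(f_reported, df_between, df_within)

print("MS between:", ms_between)
print("Implied MS within:", ms_within)
print("P-value:", p_value)
```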

In [66]:
anova_data

Unnamed: 0,Power,Etching Rate
0,160 W,5.43
1,180 W,6.24
2,200 W,8.79
3,160 W,5.71
4,180 W,6.71
5,200 W,9.2
6,160 W,6.22
7,180 W,5.98
8,200 W,7.9
9,160 W,6.01


In [65]:
anova_data.columns

Index(['Power ', 'Etching Rate'], dtype='object')
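The column index shows that `Power ` carries a trailing space, which is easy to miss when filtering. One option, sketched below on a hypothetical two-row frame rather than the real file, is to strip whitespace from the column names once after loading.

```python
import pandas as pd

# Hypothetical frame reproducing the column names shown above
df = pd.DataFrame({"Power ": ["160 W", "180 W"], "Etching Rate": [5.43, 6.24]})

# Remove leading and trailing whitespace from every column name
df.columns = df.columns.str.strip()

# The column can now be referenced without the trailing space
power_160 = df.loc[df["Power"] == "160 W", "Etching Rate"]
print(df.columns.tolist())
```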

In [68]:
from scipy.stats import f_oneway

# Perform ANOVA
f_statistic, p_value = f_oneway(anova_data[anova_data['Power '] == '160 W']['Etching Rate'],
                                 anova_data[anova_data['Power '] == '180 W']['Etching Rate'],
                                 anova_data[anova_data['Power '] == '200 W']['Etching Rate'])

# Print the ANOVA results
print("F-Statistic:", f_statistic)
print("P-Value:", p_value)

# Define significance level
alpha = 0.05

# Check the hypothesis
if p_value < alpha:
    print("Reject the null hypothesis. There is a significant difference in the mean etching rate for different power levels.")
else:
    print("Fail to reject the null hypothesis. There is no significant difference in the mean etching rate for different power levels.")

F-Statistic: 36.87895470100505
P-Value: 7.506584272358903e-06
Reject the null hypothesis. There is a significant difference in the mean etching rate for different power levels.


Based on the ANOVA results, the p-value is very low (<0.0001), which suggests strong evidence against the null hypothesis.

This means that at least one power level significantly differs from the others in terms of the etching rate. Further post-hoc tests can be conducted to identify which specific power levels show significant differences.
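To make the F-statistic less of a black box, the sketch below computes the between-group and within-group sums of squares by hand and compares the result with `f_oneway`. It uses only the ten rows displayed above, while the degrees of freedom reported earlier imply 15 observations in the full file, so the numbers will differ from the output above.

```python
import numpy as np
from scipy.stats import f_oneway

# Only the ten rows displayed above, grouped by power level
groups = {
    "160 W": np.array([5.43, 5.71, 6.22, 6.01]),
    "180 W": np.array([6.24, 6.71, 5.98]),
    "200 W": np.array([8.79, 9.20, 7.90]),
}
all_values = np.concatenate(list(groups.values()))
grand_mean = all_values.mean()

# Variation of group means around the grand mean
ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups.values())
# Variation of observations around their own group mean
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups.values())

df_between = len(groups) - 1
df_within = all_values.size - len(groups)

f_manual = (ss_between / df_between) / (ss_within / df_within)
f_scipy, p_scipy = f_oneway(*groups.values())

print("Manual F:", f_manual, "SciPy F:", f_scipy)
```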

In [74]:
from statsmodels.stats.multicomp import pairwise_tukeyhsd
tukey_results = pairwise_tukeyhsd(anova_data['Etching Rate'], anova_data['Power '])
print(tukey_results)

Multiple Comparison of Means - Tukey HSD, FWER=0.05
group1 group2 meandiff p-adj   lower  upper  reject
---------------------------------------------------
 160 W  180 W    0.446 0.3618 -0.3916 1.2836  False
 160 W  200 W    2.526    0.0  1.6884 3.3636   True
 180 W  200 W     2.08 0.0001  1.2424 2.9176   True
---------------------------------------------------


Based on the Tukey HSD post-hoc test, the hypothesis of equal means is rejected for two of the three pairwise comparisons: 160 W vs. 200 W and 180 W vs. 200 W. The mean etching rate at 200 W is therefore significantly different from the rate at both lower power levels. There is no significant difference between the mean etching rates for 160 W and 180 W.