# LAB | Two Sample Hypothesis Tests

### Before you start:
- Read the README.md file
- Comment as much as you can and use the resources (README.md file)
- Happy learning!

In [1]:
# import numpy and pandas
import numpy as np
import pandas as pd

import scipy.stats as stats


## Challenge 1 - Independent Sample T-tests

In this challenge, we will be using the Pokemon dataset. Before applying statistical methods to this data, let's first examine the data.

To load the data, run the code below.

In [2]:
# Run this code:
pokemon = pd.read_csv('../data/pokemon.csv')


1. Let's start by using the `head` function to preview the data in the cell below.

In [3]:
# Your code here:
pokemon.head()

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,False
1,2,Ivysaur,Grass,Poison,405,60,62,63,80,80,60,1,False
2,3,Venusaur,Grass,Poison,525,80,82,83,100,100,80,1,False
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,False
4,4,Charmander,Fire,,309,39,52,43,60,50,65,1,False


2. The first thing we would like to do is compare the legendary Pokemon to the regular Pokemon. To do this, we should examine the data further. What is the count of legendary vs. non-legendary Pokemon?

In [4]:
# Your code here:
s1 = pokemon[pokemon['Legendary']==True]  # for legendary
s2 = pokemon[pokemon['Legendary']==False] # for normal

print("The count of Legendary Pokemons:", s1.shape[0])
print('The count of non Legendary Pokemons:', s2.shape[0])

The count of Legendary Pokemons: 65
The count of non Legendary Pokemons: 735


3. Compute the mean and standard deviation of the total points for both legendary and non-legendary Pokemon.

In [5]:
# Your code here:
s1_leg_mean = s1['Total'].mean()
s1_leg_std = s1['Total'].std()

s2_noleg_mean = s2['Total'].mean()
s2_noleg_std = s2['Total'].std()

print('Legendary Pokemons - Mean:', s1_leg_mean)
print('Legendary Pokemons - St Dv:', s1_leg_std)

print('no Legendary Pokemons - Mean:', s2_noleg_mean)
print('no Legendary Pokemons - St Dv:', s2_noleg_std)


Legendary Pokemons - Mean: 637.3846153846154
Legendary Pokemons - St Dv: 60.93738905315346
no Legendary Pokemons - Mean: 417.21360544217686
no Legendary Pokemons - St Dv: 106.7604174571302


4. The computation of the mean might give us a clue regarding how the statistical test may turn out; however, it certainly does not prove whether there is a significant difference between the two groups.

In the cell below, use the `ttest_ind` function in `scipy.stats` to compare the total points for legendary and non-legendary Pokemon. Since we do not have any information about the population, assume the variances are not equal.

In [6]:
# Welch's t-test (unequal variances) on the 'Total' column of both groups
result = stats.ttest_ind(s1['Total'], s2['Total'], alternative='two-sided', equal_var=False)
print(result)

# Same test (two-sided is the default), unpacked for formatted printing
t_stat, p_value = stats.ttest_ind(s1['Total'], s2['Total'], equal_var=False)
print(f"T-statistic: {t_stat:.2f}")
print(f"P-value: {p_value}")

Ttest_indResult(statistic=25.8335743895517, pvalue=9.357954335957446e-47)
T-statistic: 25.83
P-value: 9.357954335957446e-47
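
The t-statistic above can be reproduced by hand from the group sizes, means, and standard deviations printed in the earlier cells. The sketch below is only an illustration of the Welch formula (difference in means divided by its standard error, with the Welch-Satterthwaite approximation for the degrees of freedom); the variable names are chosen here and are not part of the lab.

```python
import math

# Summary statistics copied from the outputs of the earlier cells.
n_leg, mean_leg, std_leg = 65, 637.3846153846154, 60.93738905315346
n_non, mean_non, std_non = 735, 417.21360544217686, 106.7604174571302

# Each group's contribution to the variance of the difference in means.
var_term_leg = std_leg ** 2 / n_leg
var_term_non = std_non ** 2 / n_non

# Welch's t-statistic: difference in means over its standard error.
std_err = math.sqrt(var_term_leg + var_term_non)
t_welch = (mean_leg - mean_non) / std_err

# Welch-Satterthwaite approximation for the degrees of freedom.
dof = (var_term_leg + var_term_non) ** 2 / (
    var_term_leg ** 2 / (n_leg - 1) + var_term_non ** 2 / (n_non - 1)
)

print(f"t = {t_welch:.2f}, approximate degrees of freedom = {dof:.1f}")
```

The statistic agrees with the `25.83` reported by `ttest_ind`, which suggests the library call is doing nothing more mysterious than this arithmetic.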


5. What do you conclude from this test? Write your conclusions below.

---

### **Step 1: Understand the Numbers**
- **T-statistic**: `25.83`  
  - This tells you *how far apart* the means of the two groups (legendary vs. non-legendary) are, relative to the variation in the data.  
  - A larger absolute value (like 25.83) means a bigger difference between groups.  

- **P-value**: `9.36e-47` (a decimal point followed by 46 zeros before the first nonzero digit)  
  - This is the probability of seeing a difference at least this large *if there were actually no difference* between legendary and non-legendary Pokémon (i.e., if the null hypothesis were true).  
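
One way to build intuition for what a p-value measures is a permutation test: shuffle the group labels many times and count how often the shuffled difference in means is at least as large as the observed one. This is a different technique from the t-test used in this lab, shown only as a rough sketch; `permutation_p_value` is a hypothetical helper and the two groups are made-up numbers, not Pokemon data.

```python
import random
import statistics


def permutation_p_value(a, b, n_perm=2000, seed=0):
    """Estimate a two-sided p-value for a difference in means by shuffling labels."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(a) - statistics.fmean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled_a, shuffled_b = pooled[:len(a)], pooled[len(a):]
        diff = abs(statistics.fmean(shuffled_a) - statistics.fmean(shuffled_b))
        if diff >= observed:
            hits += 1
    # Add 1 to numerator and denominator so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Made-up groups whose values do not overlap at all.
group_a = [10, 12, 11, 13, 12, 14, 11, 12]
group_b = [20, 22, 21, 19, 23, 21, 20, 22]
print(permutation_p_value(group_a, group_b))
```

When the groups are this clearly separated, almost no shuffle reproduces a gap as large as the observed one, so the estimated p-value is very small.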

---

### **Step 2: Compare the P-value to a Significance Level**  
- **Significance level (alpha)**: Usually set to **0.05** (5%).  
- **Rule**:  
  - If **p-value < 0.05**: Reject the null hypothesis (the difference is statistically significant).  
  - If **p-value ≥ 0.05**: Fail to reject the null hypothesis (no significant difference).  

**Your result**:  
`9.36e-47` is **WAY smaller than 0.05** (it’s practically zero).  
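
The decision rule above is simple enough to wrap in a small function. This is just a sketch: `decide` is a hypothetical helper name, and the second p-value is an arbitrary illustrative number.

```python
def decide(p_value, alpha=0.05):
    """Return the decision for a hypothesis test at significance level alpha."""
    if p_value < alpha:
        return "reject H0"
    return "fail to reject H0"


print(decide(9.357954335957446e-47))  # p-value from the legendary vs. non-legendary test
print(decide(0.20))                   # arbitrary larger p-value for contrast
```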

---

### **Step 3: Conclusion**  
- **Reject the null hypothesis**.  
- **Interpretation**: There is **strong evidence** that legendary Pokémon have a **different average total points** compared to non-legendary Pokémon.  

---

### **Real-World Analogy**  
Imagine testing two fertilizers on plants:  
- **Fertilizer A**: Plants grow 10cm on average.  
- **Fertilizer B**: Plants grow 15cm on average.  
If your p-value is tiny (like yours), you’d conclude: *"A difference this large would be very unlikely if the two fertilizers worked equally well, so Fertilizer B most likely does make plants grow taller."*  

---

### **Connecting to Your Data**  
Recall from **Question 3**:  
- You calculated the mean `Total` points for both groups.  
- Legendary Pokémon have a **much higher mean** (about 637 vs. 417), which is why the t-statistic is **positive** and huge.  

**Your final conclusion**:  
*"Legendary Pokémon have significantly higher total points than non-legendary Pokémon (p < 0.05). This difference is extremely unlikely to be due to random chance."*  

---

In [7]:
# Your conclusions here:

print("Conclusion:")
print("Reject the null hypothesis. There is a statistically significant difference")
print("in total points between legendary and non-legendary Pokémon (p < 0.05).")

Conclusion:
Reject the null hypothesis. There is a statistically significant difference
in total points between legendary and non-legendary Pokémon (p < 0.05).


6. How about we try to compare the different types of pokemon? In the cell below, list the types of Pokemon from column `Type 1` and the count of each type.

In [8]:
# Your code here:
pokemon['Type 1'].value_counts()


Type 1
Water       112
Normal       98
Grass        70
Bug          69
Psychic      57
Fire         52
Electric     44
Rock         44
Dragon       32
Ground       32
Ghost        32
Dark         31
Poison       28
Steel        27
Fighting     27
Ice          24
Fairy        17
Flying        4
Name: count, dtype: int64

7. Since water is the largest group of Pokemon, compare the mean and standard deviation of water Pokemon to all other Pokemon.

In [9]:
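# Your code here:

# Water-type Pokémon
water_total = pokemon[pokemon['Type 1'] == 'Water']['Total']
water_mean = water_total.mean()  # Mean
water_std = water_total.std()    # Sample standard deviation (pandas uses ddof=1)

# All other (non-Water) Pokémon, as the question asks
other_total = pokemon[pokemon['Type 1'] != 'Water']['Total']
other_mean = other_total.mean()
other_std = other_total.std()

# All Pokémon, Water included (total_mean is reused in a later cell)
total_mean = pokemon['Total'].mean()  # Overall mean
total_std = pokemon['Total'].std()    # Overall sample standard deviation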
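
A note on the standard deviations computed above: pandas' `Series.std()` divides by `n - 1` by default (the sample standard deviation), not by `n` (the population standard deviation). The standard library exposes both, which makes the difference easy to see. The numbers below are made up for illustration.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values, not from the Pokemon data

sample_std = statistics.stdev(data)       # divides by n - 1
population_std = statistics.pstdev(data)  # divides by n

print(f"sample std: {sample_std:.4f}")
print(f"population std: {population_std:.4f}")
```

For groups as large as the ones in this lab the two values are nearly identical, but the sample version is the one the t-tests rely on.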


8. Perform a hypothesis test comparing the mean of total points for water Pokemon to all non-water Pokemon. Assume the variances are equal. 

In [10]:
# Your code here:

water_p = pokemon[pokemon['Type 1'] == 'Water']      # Water-type Pokemon
non_water_p = pokemon[pokemon['Type 1'] != 'Water']  # all other Pokemon

# Two-sided independent-samples t-test on 'Total'.
# Note: equal_var=False runs Welch's t-test (unequal variances); the
# equal-variance test the question asks for would use equal_var=True.
t_stat, p_value = stats.ttest_ind(water_p['Total'], non_water_p['Total'], alternative='two-sided', equal_var=False)

print(f"T-statistic: {t_stat:.2f}")
print(f"P-value: {p_value:.4f}")



T-statistic: -0.46
P-value: 0.6434
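
The `equal_var` argument of `ttest_ind` switches between two formulas: Student's t-test with a pooled variance (`equal_var=True`) and Welch's t-test with separate variances (`equal_var=False`). The sketch below computes both statistics by hand so the difference is visible; `t_statistics` is a hypothetical helper and the samples are made-up numbers.

```python
import math
import statistics


def t_statistics(a, b):
    """Return (pooled_t, welch_t) for two independent samples."""
    n_a, n_b = len(a), len(b)
    mean_diff = statistics.fmean(a) - statistics.fmean(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)  # n - 1 in the denominator

    # Student's t: one pooled variance estimate shared by both groups.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    pooled_t = mean_diff / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))

    # Welch's t: each group keeps its own variance estimate.
    welch_t = mean_diff / math.sqrt(var_a / n_a + var_b / n_b)
    return pooled_t, welch_t


# Made-up samples with different sizes and spreads.
pooled_t, welch_t = t_statistics([1, 2, 3, 4, 5], [2, 4, 6, 8, 10, 12])
print(f"pooled: {pooled_t:.3f}, Welch: {welch_t:.3f}")
```

With equal group sizes the two statistics coincide; they diverge when both the sizes and the spreads differ. The p-values can still differ in either case because the two tests use different degrees of freedom.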


In [11]:
from scipy.stats import ttest_1samp

# Extract Water-type Total points
water_total = pokemon[pokemon['Type 1'] == 'Water']['Total']

# Perform one-sample t-test against the population mean (total_mean)
t_stat, p_value = ttest_1samp(water_total, total_mean)

print(f"T-statistic: {t_stat:.2f}")
print(f"P-value: {p_value:.4f}")

# Conclusion
alpha = 0.05
if p_value < alpha:
    print("Reject the null hypothesis: Water-type Pokémon differ significantly from the population average.")
else:
    print("No significant difference found.")

T-statistic: -0.43
P-value: 0.6648
No significant difference found.


9. Write your conclusion below.

In [12]:
# Your conclusions here:
# Conclusion
alpha = 0.05
if p_value < alpha:
    print("Reject the null hypothesis: Water-type Pokémon differ significantly from the population average.")
else:
    print("No significant difference found.")

No significant difference found.


## Challenge 2 - Matched Pairs Test

In this challenge we will compare dependent samples of data describing our Pokemon. Our goal is to see whether there is a significant difference between each Pokemon's defense and attack scores. Our hypothesis is that the defense and attack scores are equal. In the cell below, import the `ttest_rel` function from `scipy.stats` and compare the two columns to see if there is a statistically significant difference between them.

In [13]:
# Your code here:
attack = pokemon['Attack']
defense = pokemon['Defense']


t_stat,  p_value = stats.ttest_rel(attack, defense)

print(f"T-statistic: {t_stat:.2f}")
print(f"P-value: {p_value}")

T-statistic: 4.33
P-value: 1.7140303479358558e-05


Describe the results of the test in the cell below.

Reject the null hypothesis. There is a statistically significant difference between Pokémon attack and defense scores (p < 0.05). On average, attack scores are higher than defense scores.

In [14]:
# Your conclusions here:

# Interpret the results
alpha = 0.05
print("Conclusion:")
if p_value < alpha:
    if t_stat > 0:
        print("Reject null hypothesis: Pokémon have significantly higher Attack scores than Defense scores.")
    else:
        print("Reject null hypothesis: Pokémon have significantly higher Defense scores than Attack scores.")
else:
    print("Fail to reject null hypothesis: No significant difference between Attack and Defense scores.")

Conclusion:
Reject null hypothesis: Pokémon have significantly higher Attack scores than Defense scores.


We are also curious about whether there is a significant difference between the mean of special defense and the mean of special attack. Perform the hypothesis test in the cell below. 

In [15]:
# Your code here:
sp_attack = pokemon['Sp. Atk']
sp_defense = pokemon['Sp. Def']


t_stat,  p_value = stats.ttest_rel(sp_attack, sp_defense)

print(f"T-statistic: {t_stat:.2f}")
print(f"P-value: {p_value}")

T-statistic: 0.85
P-value: 0.3933685997548122


Describe the results of the test in the cell below.

In [16]:
# Your conclusions here:

# Add a conclusion
alpha = 0.05
if p_value < alpha:
    if t_stat > 0:
        print("Conclusion: Reject H₀. Special Attack is significantly HIGHER than Special Defense.")
    else:
        print("Conclusion: Reject H₀. Special Defense is significantly HIGHER than Special Attack.")
else:
    print("Conclusion: No significant difference between Special Attack and Special Defense.")

Conclusion: No significant difference between Special Attack and Special Defense.


As you may recall, a two sample matched pairs test can also be expressed as a one sample test of the difference between the two dependent columns.

Import the `ttest_1samp` function and perform a one sample t-test of the difference between defense and attack. Test the hypothesis that the difference between the means is zero. Confirm that the results of the test are the same.

In [17]:
# Your code here:
difference = pokemon['Attack'] - pokemon['Defense']
difference.describe()

count    800.000000
mean       5.158750
std       33.732342
min     -220.000000
25%      -14.250000
50%        5.000000
75%       25.000000
max      160.000000
dtype: float64

In [18]:
t_stat, p_value = stats.ttest_1samp(difference, popmean=0)
print(f"T-statistic: {t_stat:.2f}")
print(f"P-value: {p_value}")

T-statistic: 4.33
P-value: 1.7140303479358558e-05
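
The two results match because a paired t-test is, by construction, a one-sample t-test on the per-row differences: the statistic is the mean difference divided by its standard error. As a rough check, the sketch below recomputes it from the `describe()` summary above (the values are rounded there, so only the first few digits are meaningful).

```python
import math

# Summary of `difference` copied from the describe() output above.
n = 800
mean_diff = 5.158750
std_diff = 33.732342  # sample standard deviation, as reported by describe()

# One-sample t-statistic for H0: the mean difference is zero.
std_err = std_diff / math.sqrt(n)
t_paired = (mean_diff - 0) / std_err

print(f"t = {t_paired:.2f}")
```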


## Bonus Challenge - The Chi-Square Test

The Chi-Square test is used to determine whether there is a statistically significant difference between observed and expected frequencies. In other words, we are testing whether two categorical variables are related or independent. This test is an alternative to Fisher's exact test and is typically used when sample sizes are larger; with a large enough sample size, both tests produce similar results. Read more about the Chi Squared test [here](https://en.wikipedia.org/wiki/Chi-squared_test).
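
For comparison, Fisher's exact test on a 2x2 table can be sketched with nothing more than binomial coefficients. The code below is a minimal illustration, not a replacement for a library routine: `fisher_exact_two_sided` is a hypothetical helper name, the two-sided rule used (summing the probabilities of all tables no more likely than the observed one) is one common convention, and the table is a small illustrative example rather than Pokemon data.

```python
from math import comb


def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact p-value for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum every table that is no more likely than the observed one
    # (the small factor guards against floating-point ties).
    return sum(
        p for p in (prob(k) for k in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


p_fisher = fisher_exact_two_sided([[3, 1], [1, 3]])
print(f"p = {p_fisher:.4f}")
```

Because every possible table is enumerated, the test is exact even for tiny counts, which is where the chi-square approximation is least reliable.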

In the cell below, create a contingency table using `pd.crosstab` comparing whether a Pokemon is legendary or not and whether the Type 1 of a Pokemon is water or not.

In [19]:
# Flag Water-type Pokemon (True when 'Type 1' is 'Water')
pokemon['iswater'] = pokemon['Type 1'] == 'Water'
# Preview the Water-type rows
pokemon[pokemon['iswater']]

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary,iswater
9,7,Squirtle,Water,,314,44,48,65,50,64,43,1,False,True
10,8,Wartortle,Water,,405,59,63,80,65,80,58,1,False,True
11,9,Blastoise,Water,,530,79,83,100,85,105,78,1,False,True
12,9,BlastoiseMega Blastoise,Water,,630,79,103,120,135,115,78,1,False,True
59,54,Psyduck,Water,,320,50,52,48,65,50,55,1,False,True
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
724,656,Froakie,Water,,314,41,56,40,62,44,71,6,False,True
725,657,Frogadier,Water,,405,54,63,52,83,56,97,6,False,True
726,658,Greninja,Water,Dark,530,72,95,67,103,71,122,6,False,True
762,692,Clauncher,Water,,330,50,53,62,58,63,44,6,False,True


In [20]:
# Your code here:

contingency_table = pd.crosstab(index=pokemon['Legendary'], columns=pokemon['iswater'])
contingency_table

iswater,False,True
Legendary,,
False,627,108
True,61,4


Perform a chi-squared test using the `chi2_contingency` function in `scipy.stats`. You can read the documentation of the function [here](https://docs.scipy.org/doc/scipy-0.15.1/reference/generated/scipy.stats.chi2_contingency.html).

In [21]:
# Your code here:
chi2_stat, p_value, dof, expected = stats.chi2_contingency(contingency_table)

# Print results  
print(f"Chi-Square Statistic: {chi2_stat:.2f}")  
print(f"P-value: {p_value:.6f}")  

Chi-Square Statistic: 2.94
P-value: 0.086255
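
For a 2x2 table (one degree of freedom), `chi2_contingency` applies Yates' continuity correction by default, which is why the statistic above is smaller than the uncorrected chi-square value would be. The sketch below recomputes both numbers by hand from the contingency table; the variable names are chosen here for illustration.

```python
import math

# Rows: Legendary False/True; columns: iswater False/True.
observed = [[627, 108], [61, 4]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

chi2_yates = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        # Expected count under independence of the two variables.
        expected = row_totals[i] * col_totals[j] / total
        # Yates' correction: shrink each |observed - expected| by 0.5.
        corrected = max(abs(count - expected) - 0.5, 0.0)
        chi2_yates += corrected ** 2 / expected

# With 1 degree of freedom, the chi-square tail probability is erfc(sqrt(x / 2)).
p_yates = math.erfc(math.sqrt(chi2_yates / 2))

print(f"chi2 = {chi2_yates:.2f}, p = {p_yates:.6f}")
```

The expected counts come from multiplying each row total by each column total and dividing by the grand total, which is the independence assumption the test is checking.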


Based on a 95% confidence level, should we reject the null hypothesis?

In [25]:
# Interpret the results
confidence = 0.95
alpha = 1 - confidence

print("Conclusion:")
if p_value < alpha:
    print("Reject H₀: There is evidence of an association between the variables.")
else:
    print("Fail to reject H₀: There is no significant evidence of an association between the variables.")

Conclusion:
Fail to reject H₀: There is no significant evidence of an association between the variables.
