# Before you start:
- Read the README.md file
- Comment as much as you can and use the resources (README.md file)
- Happy learning!

- **Consider a significance level of 5% for all tests.**
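
The 5% rule is applied the same way after every test in this lab. As a minimal sketch, the helper below turns a p-value into a decision. The name `decide` and the two example p-values are hypothetical and not part of the lab material.

```python
ALPHA = 0.05  # significance level used for all tests in this lab


def decide(p_value, alpha=ALPHA):
    """Hypothetical helper: map a p-value to a test decision."""
    if p_value < alpha:
        return "reject H0"
    return "fail to reject H0"


# Illustrative p-values only (not results from the dataset)
print(decide(0.01))
print(decide(0.30))
```

A p-value below α leads to rejecting the null hypothesis; anything else means the data do not provide enough evidence against it.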

In [1]:
# import numpy and pandas
import numpy as np 
import pandas as pd

# additional libraries for further statistical analysis
from scipy.stats import ttest_ind
from scipy.stats import ttest_1samp
from scipy.stats import ttest_rel
from scipy.stats import chi2_contingency

# Challenge 1 - Independent Sample T-tests

In this challenge, we will be using the Pokemon dataset. Before applying statistical methods to this data, let's first examine the data.

To load the data, run the code below.

In [2]:
# Run this code:

pokemon = pd.read_csv('../pokemon.csv')

Let's start by looking at the first rows of the data with the `head` function in the cell below.

In [3]:
# Your code here:

pokemon.head()

Unnamed: 0,#,Name,Type 1,Type 2,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,False
1,2,Ivysaur,Grass,Poison,405,60,62,63,80,80,60,1,False
2,3,Venusaur,Grass,Poison,525,80,82,83,100,100,80,1,False
3,3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,False
4,4,Charmander,Fire,,309,39,52,43,60,50,65,1,False


The first thing we would like to do is compare the legendary Pokemon to the regular Pokemon. To do this, we should examine the data further. What is the count of legendary vs. non-legendary Pokemon?

In [4]:
# Your code here:

# Count the number of legendary and non-legendary Pokémon
legendary_count = pokemon['Legendary'].value_counts()

print(legendary_count)

False    735
True      65
Name: Legendary, dtype: int64


Compute the mean and standard deviation of the total points for both legendary and non-legendary Pokemon.

In [5]:
# Your code here:

# Group by 'Legendary' and calculate mean and standard deviation of 'Total'
total_stats = pokemon.groupby('Legendary')['Total'].agg(['mean', 'std'])

print(total_stats)

                 mean         std
Legendary                        
False      417.213605  106.760417
True       637.384615   60.937389


The computation of the mean might give us a clue regarding how the statistical test may turn out; however, it certainly does not prove whether there is a significant difference between the two groups.

In the cell below, use the `ttest_ind` function in `scipy.stats` to compare the total points for legendary and non-legendary Pokemon. Since we do not have any information about the population, assume the variances are not equal.

In [6]:
# Your code here:

# Separate the 'Total' points for legendary and non-legendary Pokémon
total_legendary = pokemon[pokemon['Legendary']]['Total']
total_non_legendary = pokemon[~pokemon['Legendary']]['Total']

# Perform the two-sample t-test (Welch's t-test)
t_statistic, p_value = ttest_ind(total_legendary, total_non_legendary, equal_var=False)

print(f"T-statistic: {t_statistic}")
print(f"P-value: {p_value}")

T-statistic: 25.8335743895517
P-value: 9.357954335957444e-47


What do you conclude from this test? Write your conclusions below.

In [7]:
# Your conclusions here:

# Interpretation:
# T-statistic: The large positive value (about 25.8) means the mean Total of legendary Pokémon lies many standard errors above that of non-legendary Pokémon.
# P-value: The p-value is much smaller than the significance level (α = 0.05), which means we reject the null hypothesis.

# Conclusion: 
# There is a statistically significant difference in the Total points between legendary and non-legendary Pokémon.
# Legendary Pokémon have significantly higher Total points on average compared to non-legendary Pokémon.
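
To make the test statistic less of a black box, the sketch below recomputes Welch's t-statistic from the group means, standard deviations and counts printed in the earlier cells. The variable names are illustrative, and the inputs are the rounded values shown above, so the result agrees with `ttest_ind` only up to rounding.

```python
import math
from scipy.stats import t as t_dist

# Summary statistics copied from the groupby and value_counts outputs above
mean_leg, std_leg, n_leg = 637.384615, 60.937389, 65
mean_non, std_non, n_non = 417.213605, 106.760417, 735

# Squared standard error contributed by each group
se2_leg = std_leg ** 2 / n_leg
se2_non = std_non ** 2 / n_non

# Welch's t-statistic: difference in means over the combined standard error
t_manual = (mean_leg - mean_non) / math.sqrt(se2_leg + se2_non)

# Welch-Satterthwaite approximation of the degrees of freedom
df = (se2_leg + se2_non) ** 2 / (
    se2_leg ** 2 / (n_leg - 1) + se2_non ** 2 / (n_non - 1)
)

# Two-sided p-value from the t distribution
p_manual = 2 * t_dist.sf(abs(t_manual), df)

print(f"T-statistic: {t_manual:.4f}")
print(f"Degrees of freedom: {df:.1f}")
print(f"P-value: {p_manual:.3e}")
```

The t-statistic lands at about 25.83, in line with the value reported by `ttest_ind(..., equal_var=False)`.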

How about we try to compare the different types of pokemon? In the cell below, list the types of Pokemon from column `Type 1` and the count of each type.

In [8]:
# Your code here:

# Count the occurrences of each type in 'Type 1'
type_counts = pokemon['Type 1'].value_counts()

print(type_counts)

Water       112
Normal       98
Grass        70
Bug          69
Psychic      57
Fire         52
Electric     44
Rock         44
Dragon       32
Ground       32
Ghost        32
Dark         31
Poison       28
Steel        27
Fighting     27
Ice          24
Fairy        17
Flying        4
Name: Type 1, dtype: int64


Since water is the largest group of Pokemon, compare the mean and standard deviation of water Pokemon to all other Pokemon.

In [9]:
# Your code here:

# Filter Water-type Pokémon
water_pokemon = pokemon[pokemon['Type 1'] == 'Water']

# Filter all other Pokémon
non_water_pokemon = pokemon[pokemon['Type 1'] != 'Water']

# Calculate mean and standard deviation for Water-type Pokémon
water_mean = water_pokemon['Total'].mean()
water_std = water_pokemon['Total'].std()

# Calculate mean and standard deviation for non-Water-type Pokémon
non_water_mean = non_water_pokemon['Total'].mean()
non_water_std = non_water_pokemon['Total'].std()

# Print the results
print("Water-type Pokémon:")
print(f"Mean Total: {water_mean}")
print(f"Standard Deviation: {water_std}\n")

print("Non-Water-type Pokémon:")
print(f"Mean Total: {non_water_mean}")
print(f"Standard Deviation: {non_water_std}")

Water-type Pokémon:
Mean Total: 430.45535714285717
Standard Deviation: 113.1882660643146

Non-Water-type Pokémon:
Mean Total: 435.85901162790697
Standard Deviation: 121.0916823020807


Perform a hypothesis test comparing the mean of total points for water Pokemon to all non-water Pokemon. Assume the variances are equal. 

In [10]:
# Your code here:

# To perform a hypothesis test comparing the mean Total points for Water-type Pokémon versus non-Water-type Pokémon, 
# we can use a two-sample t-test assuming equal variances. The ttest_ind function from scipy.stats will be used for this purpose.

# Hypotheses
# Null Hypothesis (H₀): The mean Total points for Water-type Pokémon and non-Water-type Pokémon are equal.
# Alternative Hypothesis (H₁): The mean Total points for Water-type Pokémon and non-Water-type Pokémon are not equal.

# The Water-type Pokémon and non-Water-type Pokémon have already been filtered (see above cells)

# Perform the two-sample t-test (assuming equal variances)
t_statistic, p_value = ttest_ind(water_pokemon['Total'], non_water_pokemon['Total'], equal_var=True)

print(f"T-statistic: {t_statistic}")
print(f"P-value: {p_value}")

T-statistic: -0.4418547448849676
P-value: 0.6587140317488793


Write your conclusion below.

In [11]:
# Your conclusions here:

# Interpretation:
# T-statistic: The small negative value (about -0.44) indicates that the mean Total for Water-type Pokémon is slightly lower than for non-Water-type Pokémon.
# P-value: The p-value (about 0.66) is greater than the significance level (α = 0.05), which means we fail to reject the null hypothesis.

# Conclusion:
# There is no statistically significant difference in the mean Total points between Water-type and non-Water-type Pokémon.
# The observed difference in means (about 430.5 vs. 435.9) is consistent with random variation.
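
One way to check how much the equal-variance assumption matters here is to rerun the comparison both ways from the summary statistics printed above. This is a sketch using `scipy.stats.ttest_ind_from_stats`; the dictionary names are illustrative, and the non-Water count of 688 is taken as 800 minus the 112 Water-type rows.

```python
from scipy.stats import ttest_ind_from_stats

# Means and sample standard deviations copied from the output above
water = {"mean": 430.45535714285717, "std": 113.1882660643146, "nobs": 112}
other = {"mean": 435.85901162790697, "std": 121.0916823020807, "nobs": 688}

results = {}
for equal_var in (True, False):
    # equal_var=True is the pooled (Student) test, False is Welch's test
    t_stat, p_val = ttest_ind_from_stats(
        water["mean"], water["std"], water["nobs"],
        other["mean"], other["std"], other["nobs"],
        equal_var=equal_var,
    )
    results[equal_var] = (t_stat, p_val)
    print(f"equal_var={equal_var}: t = {t_stat:.4f}, p = {p_val:.4f}")
```

Both variants give a p-value well above 0.05, so the decision does not hinge on the variance assumption in this case.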

# Challenge 2 - Matched Pairs Test

In this challenge we will compare dependent samples of data describing our Pokemon. Our goal is to see whether there is a significant difference between each Pokemon's defense and attack scores. Our hypothesis is that the defense and attack scores are equal. In the cell below, import the `ttest_rel` function from `scipy.stats` and compare the two columns to see if there is a statistically significant difference between them.

In [12]:
# Your code here:

# Hypotheses:
# Null Hypothesis (H₀): The mean defense score is equal to the mean attack score.
# Alternative Hypothesis (H₁): The mean defense score is not equal to the mean attack score.

# Perform the paired t-test between 'Defense' and 'Attack' columns
t_statistic, p_value = ttest_rel(pokemon['Defense'], pokemon['Attack'])

print(f"T-statistic: {t_statistic}")
print(f"P-value: {p_value}")

T-statistic: -4.325566393330478
P-value: 1.7140303479358558e-05


Describe the results of the test in the cell below.

In [13]:
# Your conclusions here:

# Interpretation:
# T-statistic: A negative value indicates that the mean defense score is lower than the mean attack score.
# P-value: The p-value is much smaller than the significance level (α = 0.05), which means we reject the null hypothesis.

# Conclusion:
# There is a statistically significant difference between the mean defense and attack scores of Pokémon. 
# Specifically, the mean defense score is significantly lower than the mean attack score.

We are also curious about whether there is a significant difference between the mean of special defense and the mean of special attack. Perform the hypothesis test in the cell below.

In [14]:
# Your code here:

# Hypotheses:
# Null Hypothesis (H₀): The mean special defense score is equal to the mean special attack score.
# Alternative Hypothesis (H₁): The mean special defense score is not equal to the mean special attack score.

# Perform the paired t-test between 'Sp. Atk' and 'Sp. Def' columns (differences are Sp. Atk - Sp. Def)
t_statistic, p_value = ttest_rel(pokemon['Sp. Atk'], pokemon['Sp. Def'])

print(f"T-statistic: {t_statistic}")
print(f"P-value: {p_value}")

T-statistic: 0.853986188453353
P-value: 0.3933685997548122


Describe the results of the test in the cell below.

In [15]:
# Your conclusions here:

# Interpretation:
# T-statistic: The small positive value (Sp. Atk minus Sp. Def) indicates that the mean special attack score is slightly higher than the mean special defense score.
# P-value: The p-value is greater than the significance level (α = 0.05), which means we fail to reject the null hypothesis.

# Conclusion:
# There is no statistically significant difference between the mean special defense and special attack scores
# of Pokémon. The observed difference in means is likely due to random variation.

As you may recall, a two sample matched pairs test can also be expressed as a one sample test of the difference between the two dependent columns.

Import the `ttest_1samp` function and perform a one sample t-test of the difference between defense and attack. Test the hypothesis that the difference between the means is zero. Confirm that the results of the test are the same.

In [16]:
# Your code here:
    
# Hypotheses:
# Null Hypothesis (H₀): The mean difference between defense and attack scores is zero.
# Alternative Hypothesis (H₁): The mean difference between defense and attack scores is not zero.

# Steps:
# Compute the difference between the Defense and Attack columns.
# Perform a one-sample t-test on the difference, testing whether the mean difference is zero.

# Compute the difference between 'Defense' and 'Attack'
difference = pokemon['Defense'] - pokemon['Attack']

# Perform the one-sample t-test on the difference
t_statistic, p_value = ttest_1samp(difference, 0)

print(f"T-statistic: {t_statistic}")
print(f"P-value: {p_value}")

T-statistic: -4.325566393330478
P-value: 1.7140303479358558e-05


In [17]:
# Your conclusions here:

# Interpretation:
# T-statistic: A negative value indicates that the mean defense score is lower than the mean attack score.
# P-value: The p-value is much smaller than the significance level (α = 0.05), which means we reject the null hypothesis.

# Conclusion:
# There is a statistically significant difference between the mean defense and attack scores of Pokémon.
# The mean defense score is significantly lower than the mean attack score.
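
The equivalence between the paired test and the one-sample test on the differences can also be checked on a small example. The sketch below uses made-up scores (not rows from `pokemon.csv`) and additionally computes the statistic by hand as the mean difference divided by its standard error.

```python
import numpy as np
from scipy.stats import ttest_rel, ttest_1samp

# Hypothetical paired scores for six made-up units
defense = np.array([50, 60, 70, 45, 80, 65], dtype=float)
attack = np.array([55, 58, 75, 52, 78, 72], dtype=float)

# Per-pair differences and the hand-computed t-statistic
diff = defense - attack
n = diff.size
t_manual = diff.mean() / (diff.std(ddof=1) / np.sqrt(n))

# The same test expressed two ways with scipy
t_rel, p_rel = ttest_rel(defense, attack)
t_one, p_one = ttest_1samp(diff, 0)

print(f"manual:      t = {t_manual:.6f}")
print(f"ttest_rel:   t = {t_rel:.6f}, p = {p_rel:.6f}")
print(f"ttest_1samp: t = {t_one:.6f}, p = {p_one:.6f}")
```

All three t-statistics coincide because a paired t-test is, by definition, a one-sample t-test on the per-pair differences.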

# Bonus Challenge - The Chi-Square Test

The Chi-Square test is used to determine whether there is a statistically significant difference between observed and expected frequencies. In other words, we are testing whether two categorical variables are related or independent. This test is an alternative to Fisher's exact test and is used when sample sizes are larger; with a large enough sample, both tests produce similar results. Read more about the Chi-Squared test [here](https://en.wikipedia.org/wiki/Chi-squared_test).

In the cell below, create a contingency table using `pd.crosstab` comparing whether a Pokemon is legendary or not and whether the Type 1 of a Pokemon is water or not.

In [18]:
# Your code here:

# Create a new column 'Is_Water' indicating whether Type 1 is Water
pokemon['Is_Water'] = pokemon['Type 1'] == 'Water'

# Create the contingency table
contingency_table = pd.crosstab(pokemon['Legendary'], pokemon['Is_Water'])

print(contingency_table)

Is_Water   False  True 
Legendary              
False        627    108
True          61      4


Perform a chi-squared test using the `chi2_contingency` function in `scipy.stats`. You can read the documentation of the function [here](https://docs.scipy.org/doc/scipy-0.15.1/reference/generated/scipy.stats.chi2_contingency.html).

In [19]:
# Your code here:

# Hypotheses:
# Null Hypothesis (H₀): There is no association between being legendary and being a Water-type Pokémon (the variables are independent).
# Alternative Hypothesis (H₁): There is an association between being legendary and being a Water-type Pokémon (the variables are dependent).

# 'contingency_table' is the table created earlier
# Perform the Chi-Square test
chi2_statistic, p_value, dof, expected = chi2_contingency(contingency_table)

print(f"Chi-Square Statistic: {chi2_statistic}")
print(f"P-value: {p_value}")
print(f"Degrees of Freedom: {dof}")
print("Expected Frequencies:")
print(expected)

Chi-Square Statistic: 2.9429200762850503
P-value: 0.0862546724955095
Degrees of Freedom: 1
Expected Frequencies:
[[632.1 102.9]
 [ 55.9   9.1]]


At the 95% confidence level (α = 0.05), should we reject the null hypothesis?

In [20]:
# Your answer here:

# Interpretation:
# Chi-Square Statistic: A larger value indicates a stronger deviation from what independence would predict; here it is about 2.94.
# P-value: The p-value (about 0.086) is greater than the significance level (α = 0.05), which means we fail to reject the null hypothesis.
# Degrees of Freedom: This is calculated as (rows - 1) * (columns - 1). For a 2x2 table, the degrees of freedom is 1.
# Expected Frequencies: These are the frequencies we would expect if the null hypothesis were true.

# Conclusion:
# The p-value (0.0863) is larger than the significance level (α = 0.05).
# This means there is no statistically significant evidence of an association between being a legendary Pokémon and being a Water-type Pokémon.
# In other words, a deviation at least this large from the expected frequencies would occur about 8.6% of the time if the variables were independent.
# Since the p-value > α, we fail to reject the null hypothesis at the 95% confidence level.
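
The numbers reported by `chi2_contingency` can be reproduced by hand from the contingency table above. The sketch below hard-codes the observed counts and applies the Yates continuity correction, which `chi2_contingency` uses by default for 2x2 tables. The variable names are illustrative.

```python
import numpy as np
from scipy.stats import chi2, chi2_contingency

# Observed counts from the crosstab: rows = Legendary (False, True),
# columns = Is_Water (False, True)
observed = np.array([[627, 108],
                     [61, 4]], dtype=float)

# Expected counts under independence: row total * column total / grand total
row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
expected = row_totals @ col_totals / observed.sum()

# Chi-square statistic with Yates correction: shrink |O - E| by 0.5
chi2_manual = ((np.abs(observed - expected) - 0.5) ** 2 / expected).sum()
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)
p_manual = chi2.sf(chi2_manual, dof)

# Cross-check against scipy's implementation
chi2_scipy, p_scipy, _, _ = chi2_contingency(observed)

print(expected)
print(f"manual: chi2 = {chi2_manual:.4f}, p = {p_manual:.4f}")
print(f"scipy:  chi2 = {chi2_scipy:.4f}, p = {p_scipy:.4f}")
```

The hand-computed statistic is about 2.94 with a p-value of about 0.086, which is above the 5% significance level.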