# Before you start:
- Read the README.md file
- Comment as much as you can and use the resources (README.md file)
- Happy learning!

- **Consider a significance level of 5% for all tests.**

In [2]:
# import libraries
import math
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

import scipy.stats as st
import statsmodels.api as sm
import statsmodels.formula.api as smf
from scipy.stats import chi2_contingency, f_oneway, ttest_1samp, ttest_rel
from scipy.stats.contingency import association

# Challenge 1 - Independent Sample T-tests

In this challenge, we will be using the Pokemon dataset. Before applying statistical methods to this data, let's first examine the data.

To load the data, run the code below.

In [4]:
pokemon = pd.read_csv(r'C:\Users\elham\OneDrive\Documents\IRONHACK\lab\EDA\pokemon.csv')

Let's start off by looking at the `head` function in the cell below.

In [6]:
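# work on a copy so the raw data stays untouched
df = pokemon.copy()
# normalise column names: lowercase, spaces replaced by underscores
df.columns = [col.lower().replace(" ", "_") for col in df.columns]
df.head(5)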

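| | # | name | type_1 | type_2 | total | hp | attack | defense | sp._atk | sp._def | speed | generation | legendary |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Bulbasaur | Grass | Poison | 318 | 45 | 49 | 49 | 65 | 65 | 45 | 1 | False |
| 1 | 2 | Ivysaur | Grass | Poison | 405 | 60 | 62 | 63 | 80 | 80 | 60 | 1 | False |
| 2 | 3 | Venusaur | Grass | Poison | 525 | 80 | 82 | 83 | 100 | 100 | 80 | 1 | False |
| 3 | 3 | VenusaurMega Venusaur | Grass | Poison | 625 | 80 | 100 | 123 | 122 | 120 | 80 | 1 | False |
| 4 | 4 | Charmander | Fire | NaN | 309 | 39 | 52 | 43 | 60 | 50 | 65 | 1 | False |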


The first thing we would like to do is compare the legendary Pokemon to the regular Pokemon. To do this, we should examine the data further. What are the counts of legendary and non-legendary Pokemon?

In [8]:
# 735 Pokemon are non-legendary, while 65 are legendary
df['legendary'].value_counts()


legendary
False    735
True      65
Name: count, dtype: int64

Compute the mean and standard deviation of the total points for both legendary and non-legendary Pokemon.

In [10]:
legend = df[df['legendary'] == True]
non_legend = df[df['legendary'] == False]


In [11]:
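# note: the mean of a boolean column is the share of True values,
# i.e. the proportion of legendary Pokemon, not the mean of total points
df['legendary'].mean()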

0.08125

In [12]:
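# note: this is the standard deviation of the True/False flag itself,
# not the standard deviation of total points
df['legendary'].std()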

0.27338958434995064

In [13]:
mean_legend = df[df['legendary'] == True].mean(numeric_only=True)
mean_legend

#             470.215385
total         637.384615
hp             92.738462
attack        116.676923
defense        99.661538
sp._atk       122.184615
sp._def       105.938462
speed         100.184615
generation      3.769231
legendary       1.000000
dtype: float64

In [14]:
mean_non_legend = df[df['legendary'] == False].mean(numeric_only=True)
mean_non_legend

#             353.315646
total         417.213605
hp             67.182313
attack         75.669388
defense        71.559184
sp._atk        68.454422
sp._def        68.892517
speed          65.455782
generation      3.284354
legendary       0.000000
dtype: float64
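The exercise asks for the mean and standard deviation of `total` in each group. One way to get both in a single call is a `groupby` followed by `agg`. This is a minimal sketch, assuming a frame with `legendary` and `total` columns; the `toy` frame and its values are made up for illustration and are not the Pokemon data.

```python
import pandas as pd

# Hypothetical toy data standing in for the Pokemon frame (values are made up)
toy = pd.DataFrame({
    "legendary": [True, True, False, False, False],
    "total": [600, 680, 300, 400, 500],
})

# Mean and sample standard deviation (ddof=1) of total points per group
summary = toy.groupby("legendary")["total"].agg(["mean", "std"])
print(summary)
```

On the notebook's own frame the equivalent call would be `df.groupby('legendary')['total'].agg(['mean', 'std'])`.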

In [15]:
df.dtypes

#              int64
name          object
type_1        object
type_2        object
total          int64
hp             int64
attack         int64
defense        int64
sp._atk        int64
sp._def        int64
speed          int64
generation     int64
legendary       bool
dtype: object

Comparing the means may give us a clue about how the statistical test will turn out; however, it does not by itself show whether the difference between the two groups is statistically significant.

In the cell below, use the `ttest_ind` function in `scipy.stats` to compare the total points for legendary and non-legendary Pokemon. Since we do not have any information about the population, assume the variances are not equal.

Null hypothesis: The mean total points of legendary and non-legendary Pokemon are equal.
Alternative hypothesis: The mean total points of legendary and non-legendary Pokemon are different.

In [18]:
legend = df[df['legendary'] == True]['total']
non_legend = df[df['legendary'] == False]['total']

#two-sample t-test for independent samples
t_stat, p_value = st.ttest_ind(legend, non_legend, equal_var=False) 
print(f"Test Statistic (t): {t_stat:.2f}")
print(f"P-Value: {p_value:.60f}")
print()

# Significance level
alpha = 0.05

# Decision-Making
if p_value > alpha:
    print("Fail to Reject the Null Hypothesis: The mean total points for legend and non_legend are not significantly different.")
else:
    print("Reject the Null Hypothesis: There is sufficient evidence to conclude that the mean of total points are different for legends and non_legends.")



Test Statistic (t): 25.83
P-Value: 0.000000000000000000000000000000000000000000000093579543359574

Reject the Null Hypothesis: There is sufficient evidence to conclude that the mean of total points are different for legends and non_legends.
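To make clear what `equal_var=False` does, the sketch below computes Welch's t statistic by hand and compares it with SciPy. The two samples `a` and `b` are hypothetical, made-up numbers, not values from the Pokemon dataset.

```python
import numpy as np
from scipy import stats

# Hypothetical samples (made up) with clearly different spreads
a = np.array([600.0, 680.0, 580.0, 700.0, 660.0])
b = np.array([300.0, 420.0, 510.0, 330.0, 450.0, 390.0])

# Welch's t: difference in means over the unpooled standard error
se = np.sqrt(a.var(ddof=1) / a.size + b.var(ddof=1) / b.size)
t_manual = (a.mean() - b.mean()) / se

# Same test via SciPy; equal_var=False selects Welch's version
t_scipy, p_scipy = stats.ttest_ind(a, b, equal_var=False)
print(f"manual t = {t_manual:.4f}, scipy t = {t_scipy:.4f}, p = {p_scipy:.3e}")
```

Printing very small p-values in scientific notation (`:.3e`) is usually easier to read than a long run of zeros.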


What do you conclude from this test? Write your conclusions below.

conclusion
On average, legendary and non-legendary Pokemon have different total points: the test rejects equality of the means (t = 25.83, p < 0.05), with legendary Pokemon scoring higher (mean 637.4 vs. 417.2).


### How about we try to compare the different types of pokemon? In the cell below, list the types of Pokemon from column `Type 1` and the count of each type.

In [22]:
type_1_pokeman = df['type_1'].value_counts()
type_1_pokeman

type_1
Water       112
Normal       98
Grass        70
Bug          69
Psychic      57
Fire         52
Electric     44
Rock         44
Dragon       32
Ground       32
Ghost        32
Dark         31
Poison       28
Steel        27
Fighting     27
Ice          24
Fairy        17
Flying        4
Name: count, dtype: int64

* Null hypothesis: All Pokemon types are equally frequent (the proportions are equal).
* Alternative hypothesis: The proportions of Pokemon types are different.

In [92]:
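# create a one-column frequency table of the types
# note: chi2_contingency tests independence between two variables; with a single
# column the expected counts equal the observed ones, so the p-value is trivially 1.0.
# The goodness-of-fit test in the next cell is the appropriate test here.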
contingency_tab = pd.crosstab(index=df['type_1'], columns='count')
chi2_stats, chi2_pvalue, dof, expected = chi2_contingency(contingency_tab)
print(f"chi2_pvalue: {chi2_pvalue:.10f}")

chi2_pvalue: 1.0000000000


In [90]:
from scipy.stats import chisquare
# use a chi-squared goodness-of-fit test instead:
# it compares the observed frequencies of one categorical variable with expected frequencies
pokeman_1= type_1_pokeman.values

# expected frequencies 
total_count = pokeman_1.sum()
expected_count = [total_count / len(pokeman_1)] * len(pokeman_1)

# chi-square goodness of fit is better suited to comparing frequencies
chi2_stats, chi2_pvalue = chisquare(f_obs=pokeman_1, f_exp=expected_count)

print(f"Chi-square Statistic: {chi2_stats:.4f}")
print(f"P-value: {chi2_pvalue:.40f}")

# Decision making
alpha = 0.05
if chi2_pvalue > alpha:  # use the chi-square p-value, not p_value left over from the t-test
    print("Fail to Reject the Null Hypothesis: The proportions of Pokemon types are not significantly different.")
else:
    print("Reject the Null Hypothesis: There is a significant difference in the proportions of Pokemon types.")



Chi-square Statistic: 297.7750
P-value: 0.0000000000000000000000000000000000000000
Reject the Null Hypothesis: There is a significant difference in the proportions of Pokemon types.
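The goodness-of-fit statistic is the sum of (observed - expected)² / expected over all categories. As a sketch, the block below recomputes it by hand from the Type 1 counts listed in the `value_counts()` output above and checks it against `chisquare`; the variable names are illustrative.

```python
import numpy as np
from scipy.stats import chisquare

# Observed Type 1 counts, copied from the value_counts() output above
observed = np.array([112, 98, 70, 69, 57, 52, 44, 44, 32,
                     32, 32, 31, 28, 27, 27, 24, 17, 4])

# Under H0 every type is equally likely, so each expected count is n / k
expected = np.full(observed.size, observed.sum() / observed.size)

# Pearson statistic: sum of (O - E)^2 / E, with k - 1 degrees of freedom
chi2_manual = ((observed - expected) ** 2 / expected).sum()
chi2_scipy, p_scipy = chisquare(f_obs=observed, f_exp=expected)

print(f"manual: {chi2_manual:.4f}, scipy: {chi2_scipy:.4f}, p = {p_scipy:.3e}")
```

The hand-computed value agrees with the 297.7750 reported by the cell above.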


In [26]:
from scipy.stats import f_oneway

# extract unique values in type_1
unique_types = df['type_1'].unique()

# create dictionary to store data for each type
type_data = {type_: df[df['type_1'] == type_]['total'] for type_ in unique_types}

# ANOVA test
anova_result = f_oneway(*type_data.values())

print(f"F-Statistic: {anova_result.statistic:.4f}")
print(f"P-Value: {anova_result.pvalue:.10f}")
print()

# Decision making
if anova_result.pvalue > alpha:  # use the ANOVA p-value, not p_value left over from the t-test
    print("Fail to Reject the Null Hypothesis: The means of total points do not significantly differ among Pokémon types.")
else:
    print("Reject the Null Hypothesis: The means of total points significantly differ among Pokémon types.")


F-Statistic: 4.6388
P-Value: 0.0000000021

Reject the Null Hypothesis: The means of total points significantly differ among Pokémon types.
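For intuition about what `f_oneway` reports, the F statistic is the between-group mean square divided by the within-group mean square. The sketch below works this out on three hypothetical groups (made-up values, not Pokemon data) and compares the result with SciPy.

```python
import numpy as np
from scipy.stats import f_oneway

# Hypothetical groups (made-up values), e.g. total points for three types
groups = [
    np.array([300.0, 320.0, 310.0, 330.0]),
    np.array([400.0, 420.0, 390.0]),
    np.array([500.0, 480.0, 510.0, 490.0, 520.0]),
]

all_values = np.concatenate(groups)
grand_mean = all_values.mean()
k, n = len(groups), all_values.size

# Between-group and within-group sums of squares
ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# F is the ratio of the two mean squares
f_manual = (ss_between / (k - 1)) / (ss_within / (n - k))
f_scipy, p_scipy = f_oneway(*groups)
print(f"manual F = {f_manual:.4f}, scipy F = {f_scipy:.4f}, p = {p_scipy:.3e}")
```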


### Since water is the largest group of Pokemon, compare the mean and standard deviation of water Pokemon to all other Pokemon.

In [28]:
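# creating two groups of water and non-water Pokemon
# caution: dropna() on the full frame also drops every row with a missing type_2,
# so the statistics below cover dual-type Pokemon only; selecting the 'total'
# column before calling dropna() (as in the t-test cell further down) keeps all rows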
water_pokemon = df[df['type_1'] == 'Water'].dropna()
non_water_pokemon = df[df['type_1'] != 'Water'].dropna()

# Calculate mean and standard deviation for the 'total' column
water_mean = water_pokemon['total'].mean()
water_std = water_pokemon['total'].std()

non_water_pokemon_mean = non_water_pokemon['total'].mean()
non_water_pokemon_std = non_water_pokemon['total'].std()


print(f"Water Pokemon -> Mean : {water_mean:.2f}, Std Dev : {water_std:.2f}")
print(f"Non_water Pokemon -> Mean : {non_water_pokemon_mean:.2f}, Std Dev : {non_water_pokemon_std:.2f}")



Water Pokemon -> Mean : 449.06, Std Dev : 109.27
Non_water Pokemon -> Mean : 457.74, Std Dev : 122.56


### Perform a hypothesis test comparing the mean of total points for water Pokemon to all non-water Pokemon. Assume the variances are equal. 

* Null hypothesis: The mean total points of Water and non-Water Pokemon are equal.
* Alternative hypothesis: The mean total points of Water and non-Water Pokemon are different.

In [31]:
# creating two groups of water and non-water Pokemon
water_pokemon = df[df['type_1'] == 'Water']['total'].dropna()
non_water_pokemon = df[df['type_1'] != 'Water']['total'].dropna()

# two-sample t-test assuming equal variances
t_stat, p_value = st.ttest_ind(water_pokemon, non_water_pokemon, equal_var=True)

print(f"Test Statistic (t): {t_stat:.2f}")
print(f"P-Value: {p_value:.6f}")

# significance level
alpha = 0.05

# Decision-Making
if p_value > alpha:
    print("Fail to Reject the Null Hypothesis: The means of total points for Water and non-Water Pokemon are not significantly different.")
else:
    print("Reject the Null Hypothesis: There is sufficient evidence to conclude that the means of total points for Water and non-Water Pokemon are significantly different.")


Test Statistic (t): -0.44
P-Value: 0.658714
Fail to Reject the Null Hypothesis: The means of total points for Water and non-Water Pokemon are not significantly different.


Write your conclusion below.

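The mean total points of Water and non-Water Pokemon are not significantly different (t = -0.44, p = 0.66), so we fail to reject the null hypothesis. The test gives no evidence that the two groups differ in overall strength.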

# Challenge 2 - Matched Pairs Test

In this challenge we will compare dependent samples of data describing our Pokemon. Our goal is to see whether there is a significant difference between each Pokemon's defense and attack scores. Our hypothesis is that the defense and attack scores are equal. In the cell below, import the `ttest_rel` function from `scipy.stats` and compare the two columns to see if there is a statistically significant difference between them.

Null hypothesis: The mean defense score and the mean attack score are equal.

Alternative hypothesis: The mean defense score and the mean attack score are different.

In [36]:
# paired ttest to compare attack and defence scores
t_stat, p_value = ttest_rel(df['defense'], df['attack'])

print(f"Test Statistic (t): {t_stat:.2f}")
print(f"P-Value: {p_value:.6f}")
print()

# significance level
alpha = 0.05

# Decision-making
if p_value > alpha:
    print("Fail to Reject the Null Hypothesis: The mean of defense and attack scores are not significantly different.")
else:
    print("Reject the Null Hypothesis: There is a significant difference between mean of defense and attack scores.")



Test Statistic (t): -4.33
P-Value: 0.000017

Reject the Null Hypothesis: There is a significant difference between mean of defense and attack scores.


Describe the results of the test in the cell below.

#### conclusions:
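There is a significant difference between the mean defense and mean attack scores (t = -4.33, p < 0.001). Because the test was run as defense minus attack, the negative statistic indicates that defense is lower than attack on average.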


We are also curious about whether there is a significant difference between the mean of special defense and the mean of special attack. Perform the hypothesis test in the cell below.

Null hypothesis: The mean special defense score and the mean special attack score are equal.
Alternative hypothesis: The mean special defense score and the mean special attack score are different.

In [41]:
# paired t-test to compare special attack and special defense scores
t_stat, p_value = ttest_rel(df['sp._atk'], df['sp._def'])

print(f"Test Statistic (t): {t_stat:.2f}")
print(f"P-Value: {p_value:.6f}")
print()

# significance level
alpha = 0.05

# Decision-making
if p_value > alpha:
    print("Fail to Reject the Null Hypothesis: The mean of special defense and special attack scores are not significantly different.")
else:
    print("Reject the Null Hypothesis: There is a significant difference between mean of special defense and special attack scores.")


Test Statistic (t): 0.85
P-Value: 0.393369

Fail to Reject the Null Hypothesis: The mean of special defense and special attack scores are not significantly different.


Describe the results of the test in the cell below.

### conclusions:
Unlike regular attack and defense, the means of special attack and special defense are not significantly different (t = 0.85, p = 0.39), so we fail to reject the null hypothesis.


As you may recall, a two sample matched pairs test can also be expressed as a one sample test of the difference between the two dependent columns.

Import the `ttest_1samp` function and perform a one sample t-test of the difference between defense and attack. Test the hypothesis that the difference between the means is zero. Confirm that the results of the test are the same.

In [45]:
# difference between 'defense' and 'attack'
difference = df['defense'] - df['attack']

# one-sample t-test on the difference
t_stat, p_value = ttest_1samp(difference, 0)


print(f"Test Statistic (t): {t_stat:.2f}")
print(f"P-Value: {p_value:.6f}")
print()

# significance level
alpha = 0.05

# Decision-making
if p_value > alpha:
    print("Fail to Reject the Null Hypothesis: The difference between defense and attack means is not significantly different from zero.")
else:
    print("Reject the Null Hypothesis: The difference between defense and attack means is significantly different from zero.")
    
    

Test Statistic (t): -4.33
P-Value: 0.000017

Reject the Null Hypothesis: The difference between defense and attack means is significantly different from zero.
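The equivalence can also be checked programmatically rather than by comparing printed output. A minimal sketch, assuming two paired arrays; the `defense` and `attack` values here are hypothetical and made up for illustration.

```python
import numpy as np
from scipy.stats import ttest_rel, ttest_1samp

# Hypothetical paired scores (made up), one pair per Pokemon
defense = np.array([49.0, 63.0, 83.0, 43.0, 58.0, 78.0])
attack = np.array([49.0, 62.0, 82.0, 52.0, 64.0, 84.0])

paired = ttest_rel(defense, attack)
one_sample = ttest_1samp(defense - attack, 0)

# Both calls test whether the mean of the pairwise differences is zero
same = np.isclose(paired.statistic, one_sample.statistic) and np.isclose(
    paired.pvalue, one_sample.pvalue
)
print(same)
```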


# Bonus Challenge - The Chi-Square Test

The Chi-Square test is used to determine whether there is a statistically significant difference between observed and expected frequencies. In other words, we are testing whether two categorical variables are related or independent. This test is an alternative to Fisher's exact test and is used in scenarios where sample sizes are larger. However, with a large enough sample size, both tests produce similar results. Read more about the Chi Squared test [here](https://en.wikipedia.org/wiki/Chi-squared_test).

In the cell below, create a contingency table using `pd.crosstab` comparing whether a Pokemon is legendary or not and whether the Type 1 of a Pokemon is water or not.

In [47]:
# creating new column to check whether type_1 is Water or not
df['is_water'] = df['type_1'] == 'Water'

# contingency table
contingency_table = pd.crosstab(df['is_water'], df['legendary'])
contingency_table



| is_water \ legendary | False | True |
|---|---|---|
| False | 627 | 61 |
| True | 108 | 4 |


Perform a chi-squared test using the `chi2_contingency` function in `scipy.stats`. You can read the documentation of the function [here](https://docs.scipy.org/doc/scipy-0.15.1/reference/generated/scipy.stats.chi2_contingency.html).

In [49]:
# Chi-Square test
chi2_stats, chi2_pvalue, dof, expected = chi2_contingency(contingency_table)
print(f"chi2_pvalue: {chi2_pvalue:.10f}")

chi2_pvalue: 0.0862546725
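Besides the p-value, `chi2_contingency` returns the expected counts under independence, which are the row total times the column total divided by the grand total. The sketch below reuses the counts from the crosstab above; note that for 2×2 tables SciPy applies Yates' continuity correction by default (`correction=True`).

```python
import numpy as np
from scipy.stats import chi2_contingency

# Counts copied from the crosstab above: rows = is_water, columns = legendary
table = np.array([[627, 61],
                  [108, 4]])

chi2, p, dof, expected = chi2_contingency(table)

# expected[i, j] = row_total[i] * col_total[j] / n under independence
manual_expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / table.sum()

print(f"chi2 = {chi2:.4f}, p = {p:.10f}, dof = {dof}")
print(expected)
```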


Based on a 95% confidence, should we reject the null hypothesis?

In [51]:
# the chi-squared p-value (0.086) is greater than alpha = 0.05
# Decision-making (use chi2_pvalue, not the p_value left over from the t-tests)
if chi2_pvalue > alpha:
    print("Fail to Reject the Null Hypothesis: There is no relationship between whether a Pokemon is legendary and whether its Type 1 is Water.")
else:
    print("Reject the Null Hypothesis: There is association or relationship between whether a Pokemon is legendary and whether its Type 1 is Water.")


Fail to Reject the Null Hypothesis: There is no relationship between whether a Pokemon is legendary and whether its Type 1 is Water.


In [52]:
# checking the strength of the association with Cramér's V
from scipy.stats.contingency import association
association(contingency_table, method='cramer')
# a value of about 0.07 indicates a very weak (negligible) association

0.06724447151934329
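Cramér's V rescales the chi-square statistic to the range 0 to 1, where values close to 0 mean little association. As a sketch, the block below recomputes it by hand from the same 2×2 counts, using the uncorrected chi-square statistic, and compares it with `association`.

```python
import numpy as np
from scipy.stats import chi2_contingency
from scipy.stats.contingency import association

# Counts copied from the crosstab above: rows = is_water, columns = legendary
table = np.array([[627, 61],
                  [108, 4]])

# Cramér's V is based on the chi-square statistic without Yates' correction
chi2, _, _, _ = chi2_contingency(table, correction=False)
n = table.sum()
v_manual = np.sqrt(chi2 / (n * (min(table.shape) - 1)))

v_scipy = association(table, method="cramer")
print(f"manual V = {v_manual:.4f}, scipy V = {v_scipy:.4f}")
```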