# Challenge 1 - T-test

In statistics, a t-test is used to test whether the means of two data samples differ significantly. This challenge uses two types of t-test:

* **Student's t-test** (a.k.a. independent or uncorrelated t-test). This type of t-test compares samples drawn from **two independent populations** (e.g. test scores of students in two different classes). `scipy` provides the [`ttest_ind`](https://docs.scipy.org/doc/scipy-0.15.1/reference/generated/scipy.stats.ttest_ind.html) function to conduct Student's t-test. When called with `equal_var=False`, it performs Welch's t-test, which does not assume equal variances.

* **Paired t-test** (a.k.a. dependent or correlated t-test). This type of t-test compares two sets of measurements taken on **the same population** (e.g. scores on two different tests taken by the students of one class). `scipy` provides the [`ttest_rel`](https://docs.scipy.org/doc/scipy-0.15.1/reference/generated/scipy.stats.ttest_rel.html) function to conduct a paired t-test.
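To make the distinction concrete, here is a minimal sketch that calls both functions. The data are synthetic and purely illustrative (the class scores, sample sizes, and random seed are assumptions, not part of this challenge).

```python
import numpy as np
from scipy.stats import ttest_ind, ttest_rel

rng = np.random.default_rng(0)

# Hypothetical scores from two different classes (independent samples,
# so the sample sizes are allowed to differ)
class_a = rng.normal(loc=70, scale=10, size=30)
class_b = rng.normal(loc=75, scale=10, size=35)
ind_result = ttest_ind(class_a, class_b)

# Hypothetical scores of the same 30 students on two tests (paired samples,
# so both arrays must have the same length and the same ordering)
test_1 = rng.normal(loc=70, scale=10, size=30)
test_2 = test_1 + rng.normal(loc=2, scale=3, size=30)
rel_result = ttest_rel(test_1, test_2)

print(f"independent: t={ind_result.statistic:.3f}, p={ind_result.pvalue:.4f}")
print(f"paired:      t={rel_result.statistic:.3f}, p={rel_result.pvalue:.4f}")
```

Both calls return a result holding the t-statistic and the p-value, which can also be unpacked as a tuple.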

Both types of t-tests return a t-statistic and a number called the **p-value**. If the p-value is below 0.05, we reject the null hypothesis (that the two means are equal) and call the difference significant. If the p-value is between 0.05 and 0.1, we may still reject the null hypothesis, but with less confidence. If the p-value is above 0.1, we do not reject the null hypothesis.
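The decision rule above can be sketched as a small helper. The function name `interpret_p_value` and the handling of values exactly on a threshold are assumptions made for illustration.

```python
def interpret_p_value(p_value, strong=0.05, weak=0.1):
    """Map a p-value to one of the three decisions described in the text.

    The thresholds default to 0.05 and 0.1; how exact boundary values are
    treated is an assumption of this sketch.
    """
    if p_value < strong:
        return "reject the null hypothesis (confident)"
    if p_value <= weak:
        return "reject the null hypothesis (low confidence)"
    return "do not reject the null hypothesis"


for p in (0.003, 0.07, 0.4):
    print(p, "->", interpret_p_value(p))
```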

Read more about the t-test in [this article](https://researchbasics.education.uconn.edu/t-test/) and [this Quora thread](https://www.quora.com/What-is-the-difference-between-a-paired-and-unpaired-t-test). Make sure you understand when to use which type of t-test.

In [1]:
# Import libraries

import pandas as pd

#### Import dataset

In this challenge we will work with the Pokémon dataset you used last week. The goal is to test whether different groups of Pokémon (e.g. Legendary vs. normal, Generation 1 vs. 2, single-type vs. dual-type) have different stats (e.g. HP, Attack, Defense).

In [3]:
# Import dataset

pokemon = pd.read_csv('../data/Pokemon.csv')

pokemon.head()

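| Unnamed: 0 | # | Name | Type 1 | Type 2 | Total | HP | Attack | Defense | Sp. Atk | Sp. Def | Speed | Generation | Legendary |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Bulbasaur | Grass | Poison | 318 | 45 | 49 | 49 | 65 | 65 | 45 | 1 | False |
| 1 | 2 | Ivysaur | Grass | Poison | 405 | 60 | 62 | 63 | 80 | 80 | 60 | 1 | False |
| 2 | 3 | Venusaur | Grass | Poison | 525 | 80 | 82 | 83 | 100 | 100 | 80 | 1 | False |
| 3 | 3 | VenusaurMega Venusaur | Grass | Poison | 625 | 80 | 100 | 123 | 122 | 120 | 80 | 1 | False |
| 4 | 4 | Charmander | Fire | NaN | 309 | 39 | 52 | 43 | 60 | 50 | 65 | 1 | False |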


#### First, define a function that tests the means of a set of features across two samples.

In the next cell, the function is annotated with a description of what it does, its arguments, and its return value. This type of annotation is called a **docstring**, a convention used among Python developers. Docstrings let developers write consistent technical documentation for their code so that others can read it. They also allow tools to parse the docstrings automatically and generate user-friendly documentation.

Follow the specifications of the docstring and complete the function.
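As a brief illustration of how docstrings are consumed, the sketch below defines a hypothetical function (`mean_difference` is not part of the exercise) and reads its docstring back at runtime with the standard library's `inspect.getdoc`.

```python
import inspect


def mean_difference(a, b):
    """Return the difference between the means of two sequences.

    Args:
        a (list): first sample
        b (list): second sample

    Returns:
        float: mean of a minus mean of b
    """
    return sum(a) / len(a) - sum(b) / len(b)


# getdoc returns the docstring with its indentation cleaned up;
# the raw text is also available as mean_difference.__doc__
print(inspect.getdoc(mean_difference))
```

In a notebook, `help(mean_difference)` or `mean_difference?` displays the same text.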

In [5]:
from scipy.stats import ttest_ind

def t_test_features(s1, s2, features=['HP', 'Attack', 'Defense', 'Sp. Atk', 'Sp. Def', 'Speed', 'Total']):
    """Test means of a feature set of two samples

    Args:
        s1 (dataframe): sample 1
        s2 (dataframe): sample 2
        features (list): names of the feature columns to test

    Returns:
        dict: a dictionary mapping each feature name to the p-value of its t-test
    """
    results = {}

    # Your code here
    for feature in features:
        # Perform the t-test on the feature
        t_stat, p_value = ttest_ind(s1[feature], s2[feature], equal_var=False)  # Welch's t-test for unequal variance
        
        # Store the p-value in the dictionary
        results[feature] = p_value
    
    return results
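The function above passes `equal_var=False`, which selects Welch's t-test instead of the classic Student's t-test. The sketch below, on synthetic data with deliberately unequal variances and sample sizes (all values are assumptions for illustration), shows that the two variants can give different results.

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(1)

# Two hypothetical samples with very different spreads and sizes
small_spread = rng.normal(loc=50, scale=5, size=200)
large_spread = rng.normal(loc=52, scale=30, size=20)

# Student's t-test pools the two variances; Welch's t-test does not
student = ttest_ind(small_spread, large_spread, equal_var=True)
welch = ttest_ind(small_spread, large_spread, equal_var=False)

print(f"Student: t={student.statistic:.3f}, p={student.pvalue:.4f}")
print(f"Welch:   t={welch.statistic:.3f}, p={welch.pvalue:.4f}")
```

Welch's variant is a common choice when there is no reason to believe the two groups share the same variance.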


#### Using the `t_test_features` function, conduct a t-test for Legendary vs non-Legendary Pokémon.

*Hint: your output should look like below:*

```
{'HP': 1.0026911708035284e-13,
 'Attack': 2.520372449236646e-16,
 'Defense': 4.8269984949193316e-11,
 'Sp. Atk': 1.5514614112239812e-21,
 'Sp. Def': 2.2949327864052826e-15,
 'Speed': 1.049016311882451e-18,
 'Total': 9.357954335957446e-47}
```

In [28]:
# Your code here

# Split the dataset into two groups using the boolean 'Legendary' column:
# Legendary Pokémon (True) and non-Legendary Pokémon (False)
legendary_pokemon = pokemon[pokemon['Legendary']]
non_legendary_pokemon = pokemon[~pokemon['Legendary']]

# Conduct the t-test for Legendary vs non-Legendary Pokémon
# We are comparing the means of each feature for the two groups (Legendary vs non-Legendary)
# The t_test_features function performs t-tests on the features (such as 'HP', 'Attack', etc.)
test_results = t_test_features(legendary_pokemon, non_legendary_pokemon)

# Output the results with identification
# We iterate through the test results dictionary and print each feature and its corresponding p-value
print("Legendary vs Non-Legendary Pokémon T-test Results:")
for feature, p_value in test_results.items():
    print(f"  {feature} - p-value: {p_value}")





Legendary vs Non-Legendary Pokémon T-test Results:
  HP - p-value: 1.0026911708035284e-13
  Attack - p-value: 2.520372449236646e-16
  Defense - p-value: 4.8269984949193316e-11
  Sp. Atk - p-value: 1.5514614112239812e-21
  Sp. Def - p-value: 2.2949327864052826e-15
  Speed - p-value: 1.049016311882451e-18
  Total - p-value: 9.357954335957446e-47


#### From the test results above, what conclusion can you make? Do Legendary and non-Legendary Pokémon have significantly different stats on each feature?


### Conclusion

From the t-test results, we can conclude that there are significant differences between Legendary and non-Legendary Pokémon for every feature. All p-values are extremely small (the largest is about `4.8e-11`), so the null hypothesis of equal means is rejected for each feature.

This means that the stats of Legendary Pokémon are statistically different from those of non-Legendary Pokémon in terms of **HP**, **Attack**, **Defense**, **Sp. Atk**, **Sp. Def**, **Speed**, and **Total**. Differences this large would be very unlikely if the two groups had equal means. The test is two-sided, so the p-values alone do not show the direction of the difference; comparing the group means is needed to confirm that Legendary Pokémon have the higher stats.
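A p-value says that a difference exists, not which group is higher. One way to check the direction is to compare group means directly. The sketch below uses a tiny hypothetical dataframe (`toy`, with made-up values) as a stand-in for the real dataset; the same indexing pattern would apply to `pokemon`.

```python
import pandas as pd

# Hypothetical miniature stand-in for the Pokémon dataframe
toy = pd.DataFrame({
    "Legendary": [True, True, False, False, False],
    "HP": [100, 90, 45, 60, 39],
    "Attack": [110, 100, 49, 62, 52],
})

features = ["HP", "Attack"]

# Mean of each feature within each group
legendary_means = toy.loc[toy["Legendary"], features].mean()
other_means = toy.loc[~toy["Legendary"], features].mean()

# Positive values mean the Legendary group has the higher average
mean_gap = legendary_means - other_means
print(mean_gap)
```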


#### Next, conduct a t-test for Generation 1 and Generation 2 Pokémon.

In [32]:
# Your code here

# Separate Generation 1 and Generation 2 Pokémon
# Filtering the dataset to create two groups: one for Generation 1 Pokémon (where 'Generation' column is 1)
# and one for Generation 2 Pokémon (where 'Generation' column is 2)
gen1_pokemon = pokemon[pokemon['Generation'] == 1]
gen2_pokemon = pokemon[pokemon['Generation'] == 2]

# Conduct the t-test for Generation 1 vs Generation 2 Pokémon
# We are comparing the means of each feature for the two groups (Generation 1 vs Generation 2)
# The t_test_features function performs t-tests on the features (such as 'HP', 'Attack', etc.)
test_results_gen1_gen2 = t_test_features(gen1_pokemon, gen2_pokemon)

# Output the results with identification
# We iterate through the test results dictionary and print each feature and its corresponding p-value
print("Generation 1 vs Generation 2 Pokémon T-test Results:")
for feature, p_value in test_results_gen1_gen2.items():
    print(f"  {feature} - p-value: {p_value}")  # p-value for each feature



Generation 1 vs Generation 2 Pokémon T-test Results:
  HP - p-value: 0.14551697834219623
  Attack - p-value: 0.24721958967217725
  Defense - p-value: 0.5677711011725426
  Sp. Atk - p-value: 0.12332165977104388
  Sp. Def - p-value: 0.18829872292645752
  Speed - p-value: 0.00239265937312135
  Total - p-value: 0.5631377907941676


#### What conclusions can you make?


### Conclusion for Generation 1 vs Generation 2 Pokémon

From the t-test results for Generation 1 and Generation 2 Pokémon, we can conclude that:

- For most stats (HP, Attack, Defense, Sp. Atk, Sp. Def, and Total), the p-values are all above 0.1, so we fail to reject the null hypothesis. The data give no evidence that Generation 1 and Generation 2 Pokémon differ in these stats, although failing to reject does not prove that the means are equal.
  
- However, for **Speed**, we observe a significant difference with a p-value of **0.0024**, which is below the 0.05 threshold. This means we can reject the null hypothesis for Speed and conclude that Generation 1 and Generation 2 Pokémon have significantly different Speed stats.

In summary, while most stats are similar across both generations, **Speed** is the one feature where there is a significant difference between Generation 1 and Generation 2 Pokémon.
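Because seven features are tested at once, a stricter threshold is sometimes applied to limit false positives. The sketch below applies a Bonferroni correction, a technique that is not part of the original exercise, to the p-values printed above.

```python
# P-values copied from the Generation 1 vs Generation 2 output above
p_values = {
    'HP': 0.14551697834219623,
    'Attack': 0.24721958967217725,
    'Defense': 0.5677711011725426,
    'Sp. Atk': 0.12332165977104388,
    'Sp. Def': 0.18829872292645752,
    'Speed': 0.00239265937312135,
    'Total': 0.5631377907941676,
}

alpha = 0.05
# Bonferroni: divide the threshold by the number of tests performed
adjusted_alpha = alpha / len(p_values)

still_significant = [f for f, p in p_values.items() if p < adjusted_alpha]
print(f"adjusted threshold: {adjusted_alpha:.5f}")
print(still_significant)  # ['Speed']
```

Under this stricter threshold the Speed difference remains significant, so the conclusion above is unchanged.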


#### Compare Pokémon that have a single type with those that have two types.

In [30]:
# Your code here

# Separate Pokémon with a single type and those with two types
# Pokémon with single type: Where 'Type 2' is NaN (i.e., no second type)
# Pokémon with two types: Where 'Type 2' is not NaN (i.e., has a second type)
single_type_pokemon = pokemon[pokemon['Type 2'].isna()]
dual_type_pokemon = pokemon[pokemon['Type 2'].notna()]

# Conduct the t-test for single type vs dual type Pokémon
# The t_test_features function will compare the means of each feature (such as 'HP', 'Attack', etc.)
# between the single-type and dual-type Pokémon groups. The function returns p-values for each feature.
test_results_types = t_test_features(single_type_pokemon, dual_type_pokemon)

# Output the results with identification
# We iterate through the test results dictionary and print each feature and its corresponding p-value
# The p-values indicate whether there is a statistically significant difference between the two groups for each feature
print("Single-type vs Dual-type Pokémon T-test Results:")
for feature, p_value in test_results_types.items():
    print(f"  {feature} - p-value: {p_value}")  # p-value for each feature


Single-type vs Dual-type Pokémon T-test Results:
  HP - p-value: 0.11314389855379413
  Attack - p-value: 0.00014932578145948305
  Defense - p-value: 2.7978540411514693e-08
  Sp. Atk - p-value: 0.00013876216585667907
  Sp. Def - p-value: 0.00010730610934512779
  Speed - p-value: 0.02421703281819093
  Total - p-value: 1.1157056505229961e-07


#### What conclusions can you make?


### Conclusion for Single-type vs Dual-type Pokémon

From the t-test results comparing single-type and dual-type Pokémon, we observe the following:

- **Attack, Defense, Sp. Atk, Sp. Def, Total**: These features have very low p-values (all below 0.001), most notably **Defense** and **Total** (`2.80e-08` and `1.12e-07` respectively). We can confidently reject the null hypothesis for each of these stats.
- **Speed**: The p-value is **0.0242**, which is below **0.05** but far less extreme than the others. The difference in Speed is significant, though the evidence is weaker.
- **HP**: The p-value is **0.1131**, which is greater than **0.1**, so we do not reject the null hypothesis. There is no significant difference between single-type and dual-type Pokémon in terms of HP.

#### Summary:
- **Significant Differences**: There are significant differences between single-type and dual-type Pokémon in terms of **Attack**, **Defense**, **Sp. Atk**, **Sp. Def**, **Speed**, and **Total**.
- **No Significant Difference**: The **HP** stat does not show a significant difference between the two groups.

This suggests that dual-type Pokémon tend to have significantly different stats from single-type Pokémon, particularly in their **defensive** and **overall stats**, but there is no significant difference in **HP**.
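The grouping in this conclusion can be reproduced programmatically. The sketch below ranks the p-values printed above and splits them at the 0.05 threshold; the variable names are illustrative.

```python
# P-values copied from the single-type vs dual-type output above
p_values = {
    'HP': 0.11314389855379413,
    'Attack': 0.00014932578145948305,
    'Defense': 2.7978540411514693e-08,
    'Sp. Atk': 0.00013876216585667907,
    'Sp. Def': 0.00010730610934512779,
    'Speed': 0.02421703281819093,
    'Total': 1.1157056505229961e-07,
}

# Rank features from strongest to weakest evidence of a difference
ranked = sorted(p_values.items(), key=lambda item: item[1])

significant = [name for name, p in ranked if p < 0.05]
not_significant = [name for name, p in ranked if p >= 0.05]

print("significant:", significant)
print("not significant:", not_significant)
```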


#### Now, test whether there are significant differences between `Attack` and `Defense`, and between `Sp. Atk` and `Sp. Def`, across all Pokémon. Please write your code below.

*Hint: are you comparing different populations or the same population?*

In [20]:
from scipy.stats import ttest_rel

# Conduct the paired t-test for Attack vs Defense
t_stat_attack_defense, p_value_attack_defense = ttest_rel(pokemon['Attack'], pokemon['Defense'])

# Conduct the paired t-test for Sp. Atk vs Sp. Def
t_stat_spatk_spdef, p_value_spatk_spdef = ttest_rel(pokemon['Sp. Atk'], pokemon['Sp. Def'])

# Output the results with identification
print("Attack vs Defense:")
print(f"  t-statistic: {t_stat_attack_defense}")
print(f"  p-value: {p_value_attack_defense}\n")

print("Sp. Atk vs Sp. Def:")
print(f"  t-statistic: {t_stat_spatk_spdef}")
print(f"  p-value: {p_value_spatk_spdef}")


Attack vs Defense:
  t-statistic: 4.325566393330478
  p-value: 1.7140303479358558e-05

Sp. Atk vs Sp. Def:
  t-statistic: 0.853986188453353
  p-value: 0.3933685997548122


#### What conclusions can you make?


### Conclusion for Attack vs Defense and Sp. Atk vs Sp. Def

From the paired t-test results, we can draw the following conclusions:

1. **Attack vs Defense**:
   - The p-value for the comparison between `Attack` and `Defense` is **1.71e-05**, which is much smaller than the threshold of **0.05**. We therefore **reject the null hypothesis**: there is a **statistically significant difference** between the `Attack` and `Defense` stats of Pokémon. Because `Attack` was passed as the first argument and the t-statistic is positive (4.33), the average `Attack` is higher than the average `Defense`.

2. **Sp. Atk vs Sp. Def**:
   - The p-value for the comparison between `Sp. Atk` and `Sp. Def` is **0.393**, which is much greater than **0.05**. This means we **fail to reject the null hypothesis**, suggesting that there is **no significant difference** between `Sp. Atk` and `Sp. Def`. The means of these two features are not significantly different.

#### Summary:
- **Significant difference**: `Attack` vs `Defense` – There is a statistically significant difference between these two features.
- **No significant difference**: `Sp. Atk` vs `Sp. Def` – There is no statistically significant difference between these two features.

This suggests that while Pokémon have significantly different `Attack` and `Defense` stats, their `Sp. Atk` and `Sp. Def` stats do not differ significantly.
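A paired t-test is equivalent to a one-sample t-test on the per-row differences, which explains why the sign of the t-statistic gives the direction of the difference. The sketch below checks this on synthetic data (the arrays only imitate the shape of the `Attack` and `Defense` columns and are assumptions for illustration).

```python
import numpy as np
from scipy.stats import ttest_1samp, ttest_rel

rng = np.random.default_rng(42)

# Hypothetical paired measurements on the same 200 individuals
attack = rng.normal(loc=80, scale=30, size=200)
defense = attack - rng.normal(loc=5, scale=25, size=200)

# Paired t-test on the two columns
paired = ttest_rel(attack, defense)

# One-sample t-test asking whether the mean difference is zero
differences = attack - defense
one_sample = ttest_1samp(differences, popmean=0.0)

print(f"paired:     t={paired.statistic:.4f}, p={paired.pvalue:.6f}")
print(f"one-sample: t={one_sample.statistic:.4f}, p={one_sample.pvalue:.6f}")
print(f"mean difference: {differences.mean():.3f}")
```

The two tests report the same statistic and p-value, and the statistic has the same sign as the mean of `attack - defense`.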
