<a href="https://www.kaggle.com/code/hassaneskikri/hypothesis-testing?scriptVersionId=168270717" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

In [1]:
%%html
<style>
    *{
        font-family: 'Arial', sans-serif;
        align-items: center;
        justify-content: center;
        max-width : 1000px;
    }
    h1{
        color: #FFD700;
        border-bottom: 3px solid #FFD700;
        text-align:center;
        padding-bottom: 0.3em;
        font-weight: bold;
    }
    h2{
        color:#2dd4bf;
        padding-bottom: 0.3em;
    }
    p, ol, ul {
        font-size: 18px;
        line-height: 1.5;
        color: #eee;
    }
    a {
        color: #d946ef;
        text-decoration: none;
    }
    a:hover {
        text-decoration: underline;
        color : #86198f;
    }
    img{
        display: flex;
        margin-left: auto;
        margin-right: auto;
        width: 700px;
        height: auto;
        text-align: center;
        border-radius: 15px;
    }
    
</style>


Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is most often used by scientists to test specific predictions, called hypotheses, that arise from theories.

There are 5 main steps in hypothesis testing:

- State your research hypothesis as a null hypothesis (H0) and an alternate hypothesis (Ha or H1).
- Collect data in a way designed to test the hypothesis.
- Perform an appropriate statistical test.
- Decide whether to reject or fail to reject your null hypothesis.
- Present the findings in your results and discussion section.

# Step 1: Formulate Hypotheses

- **Null Hypothesis (H0):** A statement of no effect or no difference. It is the hypothesis that researchers aim to test against.
- **Alternate Hypothesis (Ha):** A statement that indicates the presence of an effect or a difference. If the null hypothesis is rejected, the alternate hypothesis is considered supported.

Null and alternative hypotheses are similar in some ways:

- They’re both answers to the research question.
- They both make claims about the population.
- They’re both evaluated by statistical tests.

However, there are important differences between the two types of hypotheses, summarized in the following table.

![image.png](attachment:c2d03731-6fcb-431a-94a7-321e312c47e8.png)

# Step 2: Collect Data

Data collection should be methodically planned to ensure that the sample accurately represents the population. This involves deciding the sample size, sampling method, and ensuring the data collection process is unbiased.
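As a rough illustration of one sampling method, the sketch below draws a stratified random sample, taking the same number of units from each group so that neither is under-represented. The population, the group labels, and the `stratified_sample` helper are all hypothetical and exist only for this example.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical sampling frame: 10,000 person IDs, each with a group label
population_ids = np.arange(10_000)
groups = rng.choice(["men", "women"], size=population_ids.size)

def stratified_sample(ids, labels, n_per_group, rng):
    """Draw n_per_group IDs at random, without replacement, from each group."""
    sample = {}
    for label in np.unique(labels):
        members = ids[labels == label]
        sample[str(label)] = rng.choice(members, size=n_per_group, replace=False)
    return sample

sample = stratified_sample(population_ids, groups, n_per_group=50, rng=rng)
print({label: len(chosen) for label, chosen in sample.items()})
```

Sampling without replacement guarantees that no individual appears twice in the same group.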

# Step 3: Perform a Statistical Test

The choice of [statistical test](https://www.scribbr.com/statistics/statistical-tests/) depends on the nature of your data (e.g., categorical, continuous), the distribution of the data (e.g., normal distribution), and the hypothesis being tested (e.g., one-tailed or two-tailed test).
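One possible way to let the data guide the choice between a parametric and a non-parametric test is sketched below. The data are synthetic, and the `compare_two_groups` helper and its rule (Shapiro-Wilk on each group, then Welch's t-test or Mann-Whitney U) are assumptions made for illustration, not a universal recipe.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic continuous measurements for two independent groups
group_a = rng.normal(50, 5, 40)
group_b = rng.normal(53, 5, 40)

def compare_two_groups(a, b, alpha=0.05):
    """Choose a two-sample test from a rough normality check on each group."""
    _, p_norm_a = stats.shapiro(a)
    _, p_norm_b = stats.shapiro(b)
    if p_norm_a > alpha and p_norm_b > alpha:
        # No evidence against normality: use Welch's t-test
        name = "Welch t-test"
        _, p = stats.ttest_ind(a, b, equal_var=False)
    else:
        # Normality is doubtful: fall back to a rank-based test
        name = "Mann-Whitney U"
        _, p = stats.mannwhitneyu(a, b, alternative="two-sided")
    return name, p

test_name, p_value = compare_two_groups(group_a, group_b)
print(f"{test_name}: p = {p_value:.4f}")
```

In practice, a normality test is only one input; plots and knowledge of how the data were generated matter as well.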

# Step 4: Make a Decision

- **P-value:** The p-value obtained from the statistical test is the probability of observing results at least as extreme as those in your data, assuming the null hypothesis is true. A small p-value (typically ≤ 0.05) indicates evidence against the null hypothesis, leading to its rejection.

- **Significance Level (α):** Predetermined threshold for the p-value, below which the null hypothesis will be rejected. Commonly set at 0.05.
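The decision rule can be sketched in a few lines. The t-statistic and degrees of freedom below are made-up values used only to show how a p-value is derived from a test statistic and compared with α.

```python
from scipy import stats

# Hypothetical test result (not from a real study)
t_stat = 2.5
df = 58
alpha = 0.05

p_one_tailed = stats.t.sf(t_stat, df)            # P(T >= t) under H0
p_two_tailed = 2 * stats.t.sf(abs(t_stat), df)   # P(|T| >= |t|) under H0

for label, p in [("one-tailed", p_one_tailed), ("two-tailed", p_two_tailed)]:
    decision = "reject H0" if p <= alpha else "fail to reject H0"
    print(f"{label}: p = {p:.4f} -> {decision}")
```

Fixing α before looking at the data keeps the threshold from being tuned to the result.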

# Step 5: Present Findings

Discuss the implications of the findings in the context of the study. Presenting the results involves not just stating whether the null hypothesis was rejected or not, but also discussing the size and significance of the effect.
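A common way to report the size of an effect alongside the p-value is Cohen's d, the difference in means divided by the pooled standard deviation. The sketch below is a minimal implementation; the `cohens_d` helper and the sample values are illustrative assumptions.

```python
import numpy as np

def cohens_d(a, b):
    """Standardized mean difference using the pooled sample standard deviation."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * a.var(ddof=1) + (n_b - 1) * b.var(ddof=1)) / (n_a + n_b - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

# Made-up measurements for two small groups
treated = [2, 4, 6]
control = [1, 3, 5]
print(f"Cohen's d = {cohens_d(treated, control):.2f}")
```

As a rough convention, values of d near 0.2, 0.5, and 0.8 are often described as small, medium, and large effects.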



# Example: Investigating Height Differences by Gender

This example applies the hypothesis testing framework to explore whether there is a significant difference in average height between men and women.

## Step 1: Formulate Hypotheses

- **H0:** There is no difference in average height between men and women. (μ_men = μ_women)
- **Ha:** Men are, on average, taller than women. (μ_men > μ_women)

## Step 2: Collect Data

Imagine collecting height data from a sample of adults, ensuring an equal representation of men and women across various socio-economic backgrounds to minimize sampling bias. Let's say the sample is reasonably large, with over 30 participants for each gender.

## Step 3: Perform a Statistical Test

Given the data are continuous and the sample size is large, a one-tailed t-test for independent samples is chosen to compare the mean heights. Assume the data approximately follow a normal distribution, satisfying the test's requirements.

## Step 4: Make a Decision

After conducting the t-test, suppose you find a mean height of 175.4 cm for men and 161.7 cm for women, with a p-value of 0.002. Given this p-value is less than the significance level of 0.05, you reject the null hypothesis.

## Step 5: Present Findings

In the study, the significant difference in mean heights (13.7 cm) between genders and the p-value (0.002) strongly suggest men are, on average, taller than women. This result supports the alternate hypothesis and indicates a meaningful difference in height by gender.



In [2]:
import numpy as np
from scipy import stats

# Step 1: Generate synthetic data
np.random.seed(42)  # For reproducibility

men_heights = np.random.normal(175.4, 10, 100)  # Mean = 175.4 cm, SD = 10 cm, n = 100
women_heights = np.random.normal(161.7, 10, 100)  # Mean = 161.7 cm, SD = 10 cm, n = 100

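# Step 2: Perform a one-tailed t-test (Ha: men are taller, so we expect t > 0)
t_stat, p_value_two_tailed = stats.ttest_ind(men_heights, women_heights)
# Halving the two-tailed p-value is only valid when t points in the direction of Ha
p_value_one_tailed = p_value_two_tailed / 2 if t_stat > 0 else 1 - p_value_two_tailed / 2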

# Step 3: Interpret the results

alpha = 0.05  # Significance level
print(f"T-statistic: {t_stat}")
print(f"P-value (one-tailed): {p_value_one_tailed}")

if p_value_one_tailed < alpha:
    print("We reject the null hypothesis. Men are significantly taller than women.")
else:
    print("We fail to reject the null hypothesis. No significant difference in height between men and women.")


T-statistic: 9.445201415535255
P-value (one-tailed): 5.1173948808471035e-18
We reject the null hypothesis. Men are significantly taller than women.
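
To go beyond the reject/fail-to-reject decision, one option is to report a confidence interval for the difference in means. The sketch below regenerates the same synthetic data (same seed and parameters) and builds a two-sided 95% interval using the pooled variance, which matches the equal-variance assumption that `stats.ttest_ind` uses by default.

```python
import numpy as np
from scipy import stats

np.random.seed(42)  # Same seed and parameters as the synthetic data above
men_heights = np.random.normal(175.4, 10, 100)
women_heights = np.random.normal(161.7, 10, 100)

n1, n2 = len(men_heights), len(women_heights)
diff = men_heights.mean() - women_heights.mean()

# Pooled variance and standard error of the difference in means
pooled_var = ((n1 - 1) * men_heights.var(ddof=1)
              + (n2 - 1) * women_heights.var(ddof=1)) / (n1 + n2 - 2)
se = np.sqrt(pooled_var * (1 / n1 + 1 / n2))

# Two-sided 95% confidence interval based on the t distribution
t_crit = stats.t.ppf(0.975, df=n1 + n2 - 2)
ci_low, ci_high = diff - t_crit * se, diff + t_crit * se
print(f"Mean difference: {diff:.2f} cm, 95% CI: [{ci_low:.2f}, {ci_high:.2f}]")
```

An interval that excludes zero is consistent with rejecting the null hypothesis, and its width shows how precisely the difference is estimated.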


# Statistical significance

Statistical significance helps us determine if our research findings reflect a real effect or if they're just due to chance. It revolves around the concept of the p-value, which is a measure of the strength of evidence against the null hypothesis (H0).

## Key Concepts

- Statistical Significance: A result is statistically significant when results at least as extreme would be unlikely to occur by chance alone if the null hypothesis were true.
- P-value: The probability of obtaining the observed results, or more extreme, assuming the null hypothesis is true.
- Significance Level (α): A threshold set before the study begins (commonly 0.05) to decide if results are statistically significant.

Statistical significance is a crucial concept in research, providing a way to judge whether observed effects are likely genuine or due to random variation. However, it's just one part of the puzzle, and findings should be interpreted with caution, considering effect sizes and confidence intervals to understand the practical implications.
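
The gap between statistical and practical significance can be illustrated with a small simulation. In the sketch below, all numbers are invented: two very large groups differ by a tiny amount, so the test flags the difference as significant even though the standardized effect is negligible.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical scores: a true difference of 0.3 points on a scale with SD 15
n = 200_000
group_a = rng.normal(100.0, 15, n)
group_b = rng.normal(100.3, 15, n)

_, p_value = stats.ttest_ind(group_a, group_b)

# Cohen's d with equal group sizes: mean difference over pooled SD
pooled_sd = np.sqrt((group_a.var(ddof=1) + group_b.var(ddof=1)) / 2)
effect_size = (group_b.mean() - group_a.mean()) / pooled_sd

print(f"p-value: {p_value:.2e}, Cohen's d: {effect_size:.3f}")
```

With enough data, almost any nonzero difference becomes statistically significant, which is why the effect size needs to be reported too.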


# Type I and type II errors

Making a statistical decision always involves uncertainties, so the risks of making these errors are unavoidable in hypothesis testing.

The probability of making a Type I error is the significance level, or alpha (α), while the probability of making a Type II error is beta (β). These risks can be minimized through careful planning in your study design.

For example, suppose you decide to get tested for COVID-19 based on mild symptoms. Here the null hypothesis is that you are not infected, and two errors could occur:
- **Type I error (false positive):** the test result says you have coronavirus, but you actually don’t.
- **Type II error (false negative):** the test result says you don’t have coronavirus, but you actually do.

![image.png](attachment:c3e9a1d3-7bc2-4c32-8e71-73ece1a7d6b9.png)
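
Both error rates can be estimated by simulation. The sketch below repeats a two-sample t-test many times, first with no true difference (every rejection is a Type I error) and then with a true difference (every non-rejection is a Type II error). The sample size, effect size, and `rejection_rate` helper are assumptions chosen for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, n, n_sims = 0.05, 30, 2000

def rejection_rate(true_diff):
    """Fraction of simulated studies in which H0 is rejected at level alpha."""
    a = rng.normal(0.0, 1.0, size=(n_sims, n))
    b = rng.normal(true_diff, 1.0, size=(n_sims, n))
    _, p = stats.ttest_ind(a, b, axis=1)  # one t-test per simulated study
    return np.mean(p < alpha)

type1_rate = rejection_rate(0.0)      # H0 is true: rejections are false positives
type2_rate = 1 - rejection_rate(0.5)  # H0 is false: non-rejections are false negatives

print(f"Estimated Type I error rate:  {type1_rate:.3f} (alpha = {alpha})")
print(f"Estimated Type II error rate: {type2_rate:.3f}")
```

The Type I rate should land close to α by construction, while the Type II rate depends on the sample size and the true effect.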

# Statistical Power



Statistical power is the chance that a study will detect an actual effect when there truly is one. It's important for ensuring that the research can reveal meaningful insights about real differences or relationships.

## Why Statistical Power Matters

- **Avoiding Type II Errors:** High power means we're less likely to miss a real effect (making a Type II error).
- **Resource Efficiency:** Ensures that the time, money, and effort put into research are likely to yield conclusive results.
- **Ethical Research:** Particularly in clinical trials, it's unethical to involve participants in studies that are unlikely to produce useful findings.

## Components Influencing Power

1. **Sample Size:** Larger samples increase power.
2. **Effect Size:** Larger effects are easier to detect, increasing power.
3. **Significance Level (α):** The threshold for deciding if results are statistically significant, usually set at 5% (0.05). A higher α increases power, at the cost of a higher Type I error risk.

## Power Analysis

A power analysis is a calculation that helps you determine the necessary sample size for your study based on:
- Desired power (often 80%)
- Expected effect size
- Significance level
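
A minimal sketch of such a calculation for a two-sided, two-sample t-test with equal group sizes is shown below. It uses the noncentral t distribution from SciPy; the helper names `two_sample_power` and `required_n` and the assumed medium effect size (Cohen's d = 0.5) are illustrative choices, not part of any library.

```python
from scipy import stats

def two_sample_power(n_per_group, effect_size, alpha=0.05):
    """Power of a two-sided, equal-variance two-sample t-test."""
    df = 2 * n_per_group - 2
    ncp = effect_size * (n_per_group / 2) ** 0.5  # noncentrality parameter
    t_crit = stats.t.ppf(1 - alpha / 2, df)
    # Probability that the test statistic falls in either rejection region
    return stats.nct.sf(t_crit, df, ncp) + stats.nct.cdf(-t_crit, df, ncp)

def required_n(effect_size, power=0.80, alpha=0.05, max_n=100_000):
    """Smallest per-group sample size that reaches the desired power."""
    for n in range(2, max_n + 1):
        if two_sample_power(n, effect_size, alpha) >= power:
            return n
    raise ValueError("Desired power not reached within max_n")

n_needed = required_n(effect_size=0.5, power=0.80, alpha=0.05)
print(f"Participants needed per group: {n_needed}")
print(f"Achieved power: {two_sample_power(n_needed, 0.5):.3f}")
```

Smaller expected effects push the required sample size up quickly, so the effect size assumption deserves the most scrutiny.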

## Increasing Statistical Power

- **Increase Sample Size:** More data can make your test more likely to detect true effects.
- **Increase Effect Size:** If possible, adjusting the study design to amplify the effect can help.
- **Adjust Significance Level:** Accepting a higher risk of a Type I error (false positive) can increase power.
- **Reduce Measurement Error:** More accurate and precise measurements can improve power.

## Practical Example

Imagine a study investigating whether spending time in nature reduces stress in recent college graduates:

- **Null Hypothesis (H0):** Nature time has no effect on stress levels.
- **Alternative Hypothesis (Ha):** Nature time reduces stress levels.
- **Type I Error:** Wrongly concluding that nature reduces stress when it doesn't.
- **Type II Error:** Not detecting that nature reduces stress when it actually does.

In this scenario, statistical power is the likelihood of correctly identifying that spending time in nature does reduce stress if that effect truly exists.
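
Power for a scenario like this can be approximated by simulation. In the sketch below, the stress scores, the 5-point reduction, the standard deviation, and the group size are all hypothetical; the one-sided test uses the `alternative="less"` option of `stats.ttest_ind` (available in SciPy 1.6 and later).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha, n, n_sims = 0.05, 40, 5000

# Hypothetical stress scores: nature time lowers the mean from 50 to 45 (SD 10)
control = rng.normal(50, 10, size=(n_sims, n))
nature = rng.normal(45, 10, size=(n_sims, n))

# Ha: the nature group has lower mean stress than the control group
_, p = stats.ttest_ind(nature, control, axis=1, alternative="less")

power = np.mean(p < alpha)  # share of simulated studies that detect the effect
print(f"Estimated power with n = {n} per group: {power:.2f}")
```

Rerunning the simulation with a larger `n` or a larger assumed reduction shows how each one raises the estimated power.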


Statistical power is crucial for designing effective studies that are capable of detecting real effects. By carefully planning and conducting power analyses, researchers can set their studies up for success, ensuring that findings are both reliable and meaningful.


# Resources 

- [Hypothesis testing](https://www.scribbr.com/statistics/hypothesis-testing/)
- [Null vs. alternative hypotheses](https://www.scribbr.com/statistics/null-and-alternative-hypotheses/)
- [Statistical significance](https://www.scribbr.com/statistics/statistical-significance/)
- [P-values (StatQuest video)](https://www.youtube.com/watch?v=vemZtEM63GY&ab_channel=StatQuestwithJoshStarmer)
- [How to calculate the p-value (StatQuest video)](https://www.youtube.com/watch?v=JQc3yx0-Q9E&ab_channel=StatQuestwithJoshStarmer)
- [P-value article](https://www.scribbr.com/statistics/p-value/)
- [Type I and Type II errors video](https://www.youtube.com/watch?v=a_l991xUAOU&ab_channel=365DataScience)
- [Type I and Type II errors article](https://www.kaggle.com/code/hassaneskikri/hypothesis-testing/edit)
- [Statistical power video](https://www.youtube.com/watch?v=Rsc5znwR5FA&ab_channel=StatQuestwithJoshStarmer)
- [Statistical power article](https://www.scribbr.com/statistics/statistical-power/)