Hypothesis testing is a statistical method that uses sample data to evaluate claims about population parameters. The procedure starts from an initial assumption, known as the null hypothesis (\(H_0\)), and assesses whether the sample provides enough evidence against it in favor of an alternative hypothesis (\(H_a\) or \(H_1\)).

### Key Components of Hypothesis Testing:

1. **Null Hypothesis (\(H_0\))**:
    - The null hypothesis represents the default assumption or status quo.
    - It often states that there is no effect, no difference, or no association between variables.
    - Example: \(H_0: \mu = \mu_0\) (Null hypothesis that the population mean is equal to a specified value \( \mu_0 \)).

2. **Alternative Hypothesis (\(H_a\) or \(H_1\))**:
    - The alternative hypothesis represents the claim we seek evidence for; it is supported when the data are inconsistent with \(H_0\), but it is never proven outright.
    - It can be one-sided (e.g., \(H_a: \mu > \mu_0\), \(H_a: \mu < \mu_0\)) or two-sided (e.g., \(H_a: \mu \neq \mu_0\)).

3. **Test Statistic**:
    - A test statistic is a numerical value calculated from sample data.
    - It quantifies the difference between the observed data and what would be expected under the null hypothesis.

4. **P-value**:
    - The p-value is the probability of observing a test statistic as extreme as, or more extreme than, the one calculated from the sample data, assuming that the null hypothesis is true.
    - A smaller p-value indicates stronger evidence against the null hypothesis.

5. **Decision Rule**:
    - Based on the p-value and a predetermined significance level (\(\alpha\)), which is the maximum acceptable probability of rejecting a true null hypothesis, a decision is made to either reject the null hypothesis or fail to reject it.
    - Common significance levels are 0.05, 0.01, and 0.10.

6. **Conclusion**:
    - Based on the decision rule, a conclusion is drawn regarding the null hypothesis.
    - If the p-value is less than the significance level (\(\alpha\)), the null hypothesis is rejected in favor of the alternative hypothesis.
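    - If the p-value is greater than or equal to the significance level (\(\alpha\)), the null hypothesis is not rejected. This does not show that \(H_0\) is true, only that the data do not provide sufficient evidence against it.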
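
The components above can be traced in a short sketch of a one-sample t-test. The sample values and the reference mean `mu_0 = 50` are made up for illustration, and the manual calculation is cross-checked against `scipy.stats.ttest_1samp`.

```python
import numpy as np
from scipy import stats

# Hypothetical sample and reference value (illustrative assumptions, not real data)
sample = np.array([52.1, 48.3, 53.7, 51.2, 49.8, 54.0, 50.6, 52.9])
mu_0 = 50.0    # H0: mu = mu_0, Ha: mu != mu_0 (two-sided)
alpha = 0.05   # significance level, chosen before looking at the data

# Test statistic: t = (sample mean - mu_0) / (s / sqrt(n))
n = sample.size
t_stat = (sample.mean() - mu_0) / (sample.std(ddof=1) / np.sqrt(n))

# Two-sided p-value: probability of |T| >= |t_stat| under H0, with n - 1 degrees of freedom
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 1)

# Decision rule: reject H0 only when p_value < alpha
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"t = {t_stat:.3f}, p = {p_value:.4f}, decision: {decision}")

# Cross-check the manual calculation against SciPy's one-sample t-test
t_ref, p_ref = stats.ttest_1samp(sample, popmean=mu_0)
print(f"SciPy check: t = {t_ref:.3f}, p = {p_ref:.4f}")
```

Computing the statistic by hand makes the roles of the sample mean, the sample standard deviation, and the sample size explicit; in practice the library call alone is usually enough.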

### Steps in Hypothesis Testing:

1. **State the Null Hypothesis (\(H_0\)) and Alternative Hypothesis (\(H_a\) or \(H_1\))**.
2. **Choose the Significance Level (\(\alpha\))**.
3. **Select the Test and Its Test Statistic** (e.g., z-test, t-test, chi-square test).
4. **Calculate the Test Statistic** from the sample data.
5. **Calculate the P-value** associated with the test statistic.
6. **Make a Decision** based on the P-value and Significance Level.
7. **Draw a Conclusion** regarding the Null Hypothesis.
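
As a rough illustration of these seven steps, the sketch below runs a chi-square goodness-of-fit test on a die. The roll counts are hypothetical and \(\alpha = 0.05\) is an assumed choice; the point is only to show where each step appears in code.

```python
import numpy as np
from scipy import stats

# Step 1: H0: the die is fair (each face has probability 1/6); Ha: it is not.
# Step 2: choose the significance level.
alpha = 0.05

# Hypothetical counts of each face in 120 rolls (illustrative, not real data)
observed = np.array([18, 22, 16, 25, 19, 20])
expected = np.full(6, observed.sum() / 6)

# Steps 3-4: chi-square goodness-of-fit statistic, sum((O - E)^2 / E)
chi2_stat = ((observed - expected) ** 2 / expected).sum()

# Step 5: p-value from the chi-square distribution with k - 1 degrees of freedom
p_value = stats.chi2.sf(chi2_stat, df=observed.size - 1)

# Steps 6-7: decision and conclusion
reject = p_value < alpha
print(f"chi2 = {chi2_stat:.2f}, p = {p_value:.3f}")
print("Reject H0" if reject else "Fail to reject H0: no evidence that the die is unfair")
```

The same statistic and p-value can be obtained in one call with `stats.chisquare(observed)`, which assumes equal expected frequencies by default.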

### Applications of Hypothesis Testing:

1. **Medical Research**: Testing the effectiveness of a new treatment compared to a standard treatment.
2. **Quality Control**: Testing the quality of manufactured products to ensure they meet specifications.
3. **Market Research**: Testing hypotheses about consumer preferences or behavior.
4. **Environmental Science**: Testing hypotheses about the impact of pollutants on ecosystems.
5. **Business Analytics**: Testing hypotheses about sales trends, customer behavior, or marketing strategies.

### Conclusion:

Hypothesis testing is a fundamental method in statistics that provides a structured framework for making decisions and drawing conclusions based on sample data. It allows researchers, analysts, and decision-makers to evaluate claims, test theories, and make informed decisions in various fields by assessing the evidence against a null hypothesis. Understanding the principles, components, and applications of hypothesis testing is essential for conducting meaningful statistical analyses and interpreting the results correctly.

In [1]:
import numpy as np

# Exam scores for Group A and Group B
group_a_scores = np.array([85, 88, 84, 91, 76, 83, 89, 85, 90, 78, 
                           87, 82, 80, 86, 88, 85, 92, 81, 85, 89, 
                           79, 83, 87, 90, 88, 82, 84, 86, 88, 81])

group_b_scores = np.array([78, 80, 76, 85, 72, 79, 81, 77, 84, 73, 
                           82, 75, 71, 80, 78, 82, 86, 75, 79, 81, 
                           74, 76, 80, 85, 78, 77, 81, 79, 83, 76])


In [2]:
from scipy import stats

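# Independent two-sample t-test of H0: both groups have the same mean score.
# By default, ttest_ind assumes equal variances and returns a two-sided p-value.
t_stat, p_value = stats.ttest_ind(group_a_scores, group_b_scores)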

print(f"t-statistic: {t_stat}")
print(f"p-value: {p_value}")


t-statistic: 6.18987035012551
p-value: 6.614744644383411e-08
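
To complete the example, the p-value can be compared with a significance level. The sketch below assumes \(\alpha = 0.05\), which the notebook does not state, and adds Welch's t-test (`equal_var=False`) as a check that the conclusion does not hinge on the equal-variance assumption.

```python
import numpy as np
from scipy import stats

# Exam scores for Group A and Group B (same data as above)
group_a_scores = np.array([85, 88, 84, 91, 76, 83, 89, 85, 90, 78,
                           87, 82, 80, 86, 88, 85, 92, 81, 85, 89,
                           79, 83, 87, 90, 88, 82, 84, 86, 88, 81])
group_b_scores = np.array([78, 80, 76, 85, 72, 79, 81, 77, 84, 73,
                           82, 75, 71, 80, 78, 82, 86, 75, 79, 81,
                           74, 76, 80, 85, 78, 77, 81, 79, 83, 76])

alpha = 0.05  # assumed significance level; the example above does not fix one

# Pooled-variance t-test, as in the cell above
t_stat, p_value = stats.ttest_ind(group_a_scores, group_b_scores)

# Welch's t-test drops the equal-variance assumption
t_welch, p_welch = stats.ttest_ind(group_a_scores, group_b_scores, equal_var=False)

# Apply the decision rule to both results
for name, p in [("pooled", p_value), ("Welch", p_welch)]:
    decision = "reject H0" if p < alpha else "fail to reject H0"
    print(f"{name}: p = {p:.3g} -> {decision}")
```

With the p-value reported above (about 6.6e-08), the null hypothesis of equal mean scores is rejected at the 0.05 level, and the positive t-statistic indicates that Group A's sample mean is the higher of the two.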
