# Hypothesis Testing

* 🧪 What is Hypothesis Testing in Data Science?
Hypothesis testing is a statistical method used in data science to make inferences or decisions about a population based on a sample of data.

It helps you answer questions like:

* "Is this new marketing strategy better than the old one?"
* "Does this drug actually improve recovery?"
* "Is the mean user rating really above 4.0?"

🧠 The Core Idea
You start with two competing hypotheses:

Null Hypothesis (H₀):
A default statement that there is no effect, no difference, or no relationship.

Example: "There is no difference between the means of group A and group B."

Alternative Hypothesis (H₁ or Ha):
The claim you are looking for evidence of: that there is a difference or effect. A test can support this claim, but it never proves it.

Example: "Group A has a higher mean than group B."

🧮 Basic Steps in Hypothesis Testing
1) Formulate hypotheses
   * H₀: no change, no effect
   * H₁: there is a change or effect
2) Choose a significance level (α)
   * Common values: 0.05, 0.01, or 0.10
   * This is the probability of rejecting H₀ when it is actually true (Type I error)
3) Select a test
   * T-test: comparing the means of one or two groups
   * Chi-square test: testing for association between categorical variables
   * ANOVA: comparing means across three or more groups
   * Z-test: like the t-test, but used when the population standard deviation is known or the sample is large
4) Calculate the p-value
   * The probability of getting your observed result (or a more extreme one), assuming H₀ is true
5) Make a decision
   * If p-value ≤ α → reject H₀ → there is statistically significant evidence for H₁
   * If p-value > α → fail to reject H₀ → not enough evidence
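
The meaning of α and the decision rule can be checked by simulation. The sketch below is an illustration only. It assumes mock rating data drawn from a normal distribution with a known standard deviation (the values 4.0, 0.8, and 50 are made up), uses a one-sided z-test, and relies on the standard-library statistics.NormalDist instead of scipy.stats. The helper name one_sided_p_value is hypothetical. Because H₀ is true in every simulated experiment, each rejection is a Type I error, so the share of rejections should land near α.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def one_sided_p_value(sample, mu0, sigma):
    """Upper-tail p-value of a z-test with known population sd `sigma`."""
    z = (fmean(sample) - mu0) / (sigma / sqrt(len(sample)))
    return 1 - NormalDist().cdf(z)


rng = random.Random(0)          # fixed seed so the run is repeatable
alpha = 0.05
mu0, sigma, n = 4.0, 0.8, 50    # assumed mock values; H0 (mean = mu0) is true
n_experiments = 2000

rejections = 0
for _ in range(n_experiments):
    # Draw a sample from the H0 distribution, then apply the decision rule.
    sample = [rng.gauss(mu0, sigma) for _ in range(n)]
    if one_sided_p_value(sample, mu0, sigma) <= alpha:
        rejections += 1         # H0 is true here, so this is a Type I error

type_i_rate = rejections / n_experiments
print(f"Share of false rejections: {type_i_rate:.3f} (alpha = {alpha})")
```

Lowering alpha in this sketch lowers the share of false rejections accordingly. In a real experiment the price of a lower α is that true effects become harder to detect.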

🧑‍💻 Real Example (A/B Testing):
Goal: You launch a new website layout (B) and want to see if it increases conversion compared to the old one (A).

H₀: Conversion rate of A = B

H₁: Conversion rate of B > A

Perform a proportion z-test

If p-value ≤ 0.05, reject H₀ and conclude there is statistically significant evidence that layout B converts better
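
The A/B example can be sketched as a one-sided two-proportion z-test. The visitor and conversion counts below are made-up mock data, and the function name proportion_ztest_one_sided is hypothetical. The sketch uses the pooled-proportion standard error and the standard-library statistics.NormalDist, and it assumes samples large enough for the normal approximation to hold.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ztest_one_sided(conv_a, n_a, conv_b, n_b):
    """One-sided two-proportion z-test of H1: rate_B > rate_A."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate: the estimate of the common rate if H0 (A = B) is true.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Upper-tail probability under the standard normal distribution.
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# Mock data (assumed for illustration): 5000 visitors per layout.
z, p_value = proportion_ztest_one_sided(conv_a=200, n_a=5000,
                                        conv_b=250, n_b=5000)
alpha = 0.05
decision = "Reject H0" if p_value <= alpha else "Fail to reject H0"
print(f"z = {z:.3f}, p-value = {p_value:.4f} -> {decision}")
```

With these mock counts the observed lift from 4% to 5% gives a z statistic of roughly 2.4, so H₀ is rejected at α = 0.05. With smaller samples the same lift might not be significant.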

📊 Why Hypothesis Testing Matters in Data Science
Evaluates models, experiments, user behavior

Supports A/B testing, clinical trials, marketing analysis

Makes data-driven decisions with controlled error

Steps of hypothesis testing
1) State H0 and H1
2) Choose level of significance (α)
3) Find critical values
4) Find test statistic
5) Draw your conclusion

We compute the test statistic manually and get the critical z value from scipy.stats (imported as st).

In [5]:
import scipy.stats as st
import numpy as np

In [4]:
x_bar = 90   # sample mean
u_bar = 82   # population mean under H0
p_sd = 20    # population standard deviation
n = 81       # sample size

$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$

In [7]:
z_test = (x_bar-u_bar)/(p_sd/np.sqrt(n))
z_test

3.5999999999999996

In [9]:
# Critical value for a right-tailed test at alpha = 0.05
z_table = st.norm.ppf(0.95)
z_table

1.6448536269514722

In [11]:
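# Reject H0 only when the test statistic exceeds the critical value
if z_table < z_test:
    print("Reject H0: the data support the claim")
else:
    print("Fail to reject H0: not enough evidence for the claim")

Reject H0: the data support the claim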
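
The same conclusion can be reached with the p-value approach instead of a critical value. The sketch below reuses the numbers from the example (sample mean 90, hypothesized mean 82, population standard deviation 20, n = 81) and assumes a right-tailed test at α = 0.05. It uses the standard-library statistics.NormalDist so that it runs on its own. With SciPy, st.norm.sf(z) gives the same upper-tail probability.

```python
from math import sqrt
from statistics import NormalDist

x_bar, mu_0, sigma, n = 90, 82, 20, 81   # values from the example above
alpha = 0.05                             # assumed significance level

# Test statistic: how many standard errors x_bar lies above mu_0.
z = (x_bar - mu_0) / (sigma / sqrt(n))

# Right-tailed p-value: P(Z >= z) when H0 is true.
p_value = 1 - NormalDist().cdf(z)

if p_value <= alpha:
    print(f"z = {z:.2f}, p-value = {p_value:.6f}: reject H0")
else:
    print(f"z = {z:.2f}, p-value = {p_value:.6f}: fail to reject H0")
```

Both approaches always agree: the p-value falls below α exactly when the test statistic exceeds the critical value.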
