# Inferential Statistics — Python for Data Science


## What is Inferential Statistics?
Inferential Statistics helps us **draw conclusions** and make **predictions** about a population based on a sample.


It answers questions like:
- Is there a significant difference between two groups?
- Is the sample average statistically different from a known value?
- Do two variables have a meaningful relationship?

## Libraries Required


In [4]:
import numpy as np
import pandas as pd
from scipy import stats
import matplotlib.pyplot as plt
import seaborn as sns

## Sample Data


In [6]:
sample_data = [85, 90, 78, 92, 88, 76, 95, 89, 84, 91]


## Key Concepts in Inferential Statistics

## 1. Population vs Sample
Population: The entire group we want to draw conclusions about

Sample: A subset of the population, used to infer the population's properties
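
To make the distinction concrete, here is a minimal sketch that simulates a population and draws a small random sample from it. The population itself (10,000 normally distributed scores with mean 85 and standard deviation 6) and the seed are assumptions made purely for illustration; they are not part of the dataset used in this notebook.

```python
import numpy as np

# Hypothetical population: 10,000 simulated scores (illustrative values only)
rng = np.random.default_rng(seed=42)
population = rng.normal(loc=85, scale=6, size=10_000)

# Simple random sample of 10 scores, drawn without replacement
sample = rng.choice(population, size=10, replace=False)

# The sample mean is an estimate of the population mean, not an exact copy of it
print(f"Population mean: {population.mean():.2f}")
print(f"Sample mean:     {sample.mean():.2f}")
```

In practice the population is usually not observable, which is why the sample statistic has to stand in for it.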

## 2. Confidence Intervals
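A confidence interval gives a range of plausible values for a population parameter such as the mean.

Interpretation at the 95% level: if the sampling were repeated many times, about 95% of the intervals constructed this way would contain the true population mean.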

In [7]:
import math

sample_mean = np.mean(sample_data)
sample_std = np.std(sample_data, ddof=1)
n = len(sample_data)

# 95% confidence interval for the mean (normal approximation)
z = 1.96  # z-score for 95%; with n = 10, a t critical value (df = n - 1) would give a wider interval
margin_of_error = z * (sample_std / math.sqrt(n))
ci_lower = sample_mean - margin_of_error
ci_upper = sample_mean + margin_of_error

print(f"95% Confidence Interval: ({ci_lower:.2f}, {ci_upper:.2f})")


95% Confidence Interval: (83.03, 90.57)
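
With only 10 observations and an estimated standard deviation, a t-based interval is usually the safer choice than the z-based one above. The sketch below uses `stats.t.interval()` and `stats.sem()` on the same data; the variable names `sample_sem`, `t_lower`, and `t_upper` are introduced here for illustration. The confidence level is passed positionally because its keyword name has differed between SciPy versions.

```python
import numpy as np
from scipy import stats

sample_data = [85, 90, 78, 92, 88, 76, 95, 89, 84, 91]

sample_mean = np.mean(sample_data)
sample_sem = stats.sem(sample_data)  # standard error of the mean (ddof=1 by default)
df = len(sample_data) - 1            # degrees of freedom

# 95% interval from the t distribution, centered on the sample mean
t_lower, t_upper = stats.t.interval(0.95, df, loc=sample_mean, scale=sample_sem)

print(f"95% t-based Confidence Interval: ({t_lower:.2f}, {t_upper:.2f})")
```

The t-based interval should come out wider than the z-based one, reflecting the extra uncertainty from estimating the standard deviation on a small sample.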


## 3. Hypothesis Testing Overview
Null Hypothesis (H₀): No effect / no difference

Alternative Hypothesis (H₁): There is an effect / difference

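p-value: Probability of observing a result at least as extreme as the one in the sample, assuming the null hypothesis is true

If p < 0.05 (the conventional significance level α): Reject the null hypothesis; otherwise, fail to reject it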
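
The decision rule can be sketched as a small helper. The function name `decide` and its return strings are hypothetical, chosen only for this example; the 0.05 default mirrors the threshold used in this notebook.

```python
def decide(p_value, alpha=0.05):
    """Return the test decision at significance level alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if p_value < alpha:
        return "reject H0"
    # A large p-value does not prove H0; it only means the evidence is insufficient
    return "fail to reject H0"


print(decide(0.003))
print(decide(0.20))
```

Note the wording "fail to reject": a non-significant result is not evidence that the null hypothesis is true.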



## 4. One Sample t-test

In [10]:
# Test if the mean is significantly different from 80
t_stat, p_value = stats.ttest_1samp(sample_data, 80)
print("t-statistic:", t_stat)
print("p-value:", p_value)
# If p-value < 0.05, the sample mean differs significantly from 80



t-statistic: 3.5319711414286616
p-value: 0.006394135411873378
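
To see where these numbers come from, the statistic can be computed by hand as t = (sample mean - hypothesized mean) / standard error, with a two-sided p-value taken from the t distribution with n - 1 degrees of freedom. This is a sketch for checking the library result; the names `mu_0`, `t_manual`, and `p_manual` are introduced here for illustration.

```python
import numpy as np
from scipy import stats

sample_data = [85, 90, 78, 92, 88, 76, 95, 89, 84, 91]
mu_0 = 80  # hypothesized population mean

n = len(sample_data)
standard_error = np.std(sample_data, ddof=1) / np.sqrt(n)

# t-statistic: distance of the sample mean from mu_0 in standard-error units
t_manual = (np.mean(sample_data) - mu_0) / standard_error

# Two-sided p-value: probability in both tails beyond |t|
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)

t_scipy, p_scipy = stats.ttest_1samp(sample_data, mu_0)
print("Manual:", t_manual, p_manual)
print("SciPy: ", t_scipy, p_scipy)
```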


## 5. Two Sample t-test (Independent)
Used to compare the means of two independent groups. By default, `stats.ttest_ind()` assumes the two groups have equal population variances.



In [11]:
group1 = [78, 82, 88, 90, 85]
group2 = [92, 95, 89, 94, 91]

t_stat, p_value = stats.ttest_ind(group1, group2)
print("t-statistic:", t_stat)
print("p-value:", p_value)


t-statistic: -3.183289703016889
p-value: 0.012933289263518293
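
When the equal-variance assumption is doubtful, Welch's t-test is a common alternative; in SciPy it is selected with `equal_var=False`. The sketch below runs both versions on the same two groups for comparison. Whether Welch's test is the better choice for real data depends on the data, so treat this as an illustration rather than a recommendation.

```python
from scipy import stats

group1 = [78, 82, 88, 90, 85]
group2 = [92, 95, 89, 94, 91]

# Student's t-test: pools the two sample variances (SciPy's default)
t_pooled, p_pooled = stats.ttest_ind(group1, group2)

# Welch's t-test: does not assume equal variances
t_welch, p_welch = stats.ttest_ind(group1, group2, equal_var=False)

print("Pooled: t =", t_pooled, " p =", p_pooled)
print("Welch:  t =", t_welch, " p =", p_welch)
```

Because both groups have the same size here, the two t-statistics coincide; only the degrees of freedom, and therefore the p-value, differ.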


## 6. Paired t-test (Dependent Samples)
Used when comparing the same group before and after an intervention.



In [12]:
before = [70, 75, 78, 74, 72]
after  = [78, 80, 82, 79, 77]

t_stat, p_value = stats.ttest_rel(before, after)
print("t-statistic:", t_stat)
print("p-value:", p_value)


t-statistic: -7.961865632364446
p-value: 0.001348170975769803
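
A paired t-test is equivalent to a one-sample t-test on the per-subject differences, tested against a mean of 0. The sketch below checks that equivalence on the same data; the name `differences` is introduced here for illustration.

```python
import numpy as np
from scipy import stats

before = [70, 75, 78, 74, 72]
after = [78, 80, 82, 79, 77]

# Difference for each subject (before minus after, matching the argument order above)
differences = np.array(before) - np.array(after)

t_paired, p_paired = stats.ttest_rel(before, after)
t_diff, p_diff = stats.ttest_1samp(differences, 0)

print("Paired test:        ", t_paired, p_paired)
print("One-sample on diffs:", t_diff, p_diff)
```

This is why pairing matters: the test works on within-subject changes, so variation between subjects does not inflate the standard error.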


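## 7. Chi-Square Test for Independence
Used to test whether two categorical variables are independent (e.g., gender vs preference). For a 2x2 table, `stats.chi2_contingency()` applies Yates' continuity correction by default.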




In [13]:
# Contingency table
data = [[20, 30],
        [15, 35]]

chi2, p, dof, expected = stats.chi2_contingency(data)
print("Chi-square:", chi2)
print("p-value:", p)


Chi-square: 0.7032967032967032
p-value: 0.4016781664697727
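
The call above also returns the degrees of freedom and the expected counts under independence, which the cell does not print. The sketch below inspects them and, for comparison, reruns the test with `correction=False` to switch off Yates' continuity correction. The names `chi2_raw` and `p_raw` are introduced here for illustration.

```python
import numpy as np
from scipy import stats

data = [[20, 30],
        [15, 35]]

# Default call: Yates' continuity correction is applied to 2x2 tables
chi2, p, dof, expected = stats.chi2_contingency(data)
print("Degrees of freedom:", dof)
print("Expected counts under independence:")
print(expected)

# Same test without the continuity correction
chi2_raw, p_raw, _, _ = stats.chi2_contingency(data, correction=False)
print("Chi-square (uncorrected):", chi2_raw)
print("p-value (uncorrected):", p_raw)
```

Each expected count is (row total × column total) / grand total; comparing them with the observed table shows which cells drive the statistic.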


## 8. ANOVA (Analysis of Variance)
Use ANOVA to compare the means of three or more groups.



In [14]:
group_a = [85, 87, 90, 88]
group_b = [78, 79, 81, 80]
group_c = [92, 91, 95, 94]

f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
print("F-statistic:", f_stat)
print("p-value:", p_value)


F-statistic: 59.24999999999976
p-value: 6.596224997267661e-06
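
The F-statistic is the ratio of between-group variance to within-group variance. As a sketch of what `stats.f_oneway()` computes, the version below builds it from sums of squares on the same three groups; the names `ss_between`, `ss_within`, `f_manual`, and `p_manual` are introduced here for illustration.

```python
import numpy as np
from scipy import stats

groups = [
    np.array([85, 87, 90, 88]),  # group_a
    np.array([78, 79, 81, 80]),  # group_b
    np.array([92, 91, 95, 94]),  # group_c
]

all_values = np.concatenate(groups)
grand_mean = all_values.mean()
k = len(groups)            # number of groups
n_total = all_values.size  # total number of observations

# Between-group sum of squares: how far each group mean is from the grand mean
ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)

# Within-group sum of squares: spread of observations around their own group mean
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

f_manual = (ss_between / (k - 1)) / (ss_within / (n_total - k))
p_manual = stats.f.sf(f_manual, k - 1, n_total - k)

print("F-statistic (manual):", f_manual)
print("p-value (manual):", p_manual)
```

A significant F only says that at least one group mean differs; it does not identify which one, so a follow-up comparison is needed for that.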


## SUMMARY

| Test Type           | Purpose                                 | Function                      |
| ------------------- | --------------------------------------- | ----------------------------- |
| Confidence Interval | Estimate population mean range          | Manual / `stats.t.interval()` |
| One-sample t-test   | Compare sample mean to known value      | `stats.ttest_1samp()`         |
| Two-sample t-test   | Compare means of two independent groups | `stats.ttest_ind()`           |
| Paired t-test       | Compare the same group before and after | `stats.ttest_rel()`           |
| Chi-square test     | Test independence of categories         | `stats.chi2_contingency()`    |
| ANOVA               | Compare means of three or more groups   | `stats.f_oneway()`            |


Inferential statistics helps you move beyond simple summaries — into making decisions, testing hypotheses, and validating patterns!