# Statistics Fundamentals

## 1. Mean, Median, and Mode

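The **mean** is the arithmetic average of a dataset (the sum of the values divided by their count), the **median** is the middle value of the sorted data (or the average of the two middle values when the count is even), and the **mode** is the most frequently occurring value.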

In [None]:
import numpy as np
from scipy import stats

data = [1, 2, 2, 3, 4, 5, 5, 5, 6]
mean = np.mean(data)
median = np.median(data)
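# Newer SciPy releases return a scalar mode for 1-D input, older ones an array;
# np.atleast_1d makes the indexing work in both cases.
mode = np.atleast_1d(stats.mode(data).mode)[0]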

print(f'Mean: {mean}')
print(f'Median: {median}')
print(f'Mode: {mode}')
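
The definitions above can also be checked without NumPy or SciPy. The following is a minimal sketch using only Python's standard library; the helper names `manual_mean`, `manual_median`, and `manual_modes` are hypothetical and chosen for illustration. Unlike a single-valued mode, `manual_modes` returns every value tied for the highest count, which matters when a dataset has more than one mode.

```python
from collections import Counter

data = [1, 2, 2, 3, 4, 5, 5, 5, 6]

def manual_mean(values):
    # Sum of the values divided by how many there are
    return sum(values) / len(values)

def manual_median(values):
    # Middle value of the sorted data; average the two middle values if the count is even
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def manual_modes(values):
    # Every value whose count equals the highest count, in first-seen order
    counts = Counter(values)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]

print(f'Mean: {manual_mean(data)}')
print(f'Median: {manual_median(data)}')
print(f'Modes: {manual_modes(data)}')
```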

## 2. Standard Deviation and Variance

The **variance** is the average squared deviation of the values from their mean, and the **standard deviation** is its square root, which expresses the spread in the same units as the data.

In [None]:
# NumPy defaults to ddof=0, i.e. the population variance and standard deviation
variance = np.var(data)
std_deviation = np.std(data)

print(f'Variance: {variance}')
print(f'Standard Deviation: {std_deviation}')
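
When the data is a sample rather than the whole population, the variance is usually estimated by dividing by `n - 1` instead of `n`. The sketch below contrasts the two using NumPy's `ddof` parameter; the variable names are chosen for illustration, and it assumes the same `data` list used earlier in the notebook.

```python
import numpy as np

data = [1, 2, 2, 3, 4, 5, 5, 5, 6]
n = len(data)

# Population variance: squared deviations divided by n (ddof=0, the NumPy default)
population_variance = np.var(data)

# Sample variance: squared deviations divided by n - 1 (ddof=1)
sample_variance = np.var(data, ddof=1)
sample_std = np.std(data, ddof=1)

print(f'Population variance: {population_variance}')
print(f'Sample variance: {sample_variance}')
print(f'Sample standard deviation: {sample_std}')
```

The two estimates differ by a factor of `n / (n - 1)`, so the gap shrinks as the dataset grows.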

## 3. Probability Distributions

A **probability distribution** assigns probabilities to the possible values of a random variable. Common examples are the normal distribution (continuous) and the binomial and Poisson distributions (discrete).

In [None]:
import matplotlib.pyplot as plt

# Standard normal density (mean 0, standard deviation 1) over [-3, 3]
x = np.linspace(-3, 3, 1000)
y = stats.norm.pdf(x, loc=0, scale=1)
plt.plot(x, y)
plt.title('Normal Distribution')
plt.show()
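
The binomial and Poisson distributions mentioned above are discrete, so they are described by a probability mass function rather than a density. The following sketch evaluates both with `scipy.stats`; the parameter values (`n=10`, `p=0.5`, `mu=3`) are arbitrary choices for illustration, not values taken from the text.

```python
import numpy as np
from scipy import stats

# Outcomes 0..10
k = np.arange(0, 11)

# Binomial: number of successes in 10 independent trials, each with success probability 0.5
binom_pmf = stats.binom.pmf(k, n=10, p=0.5)

# Poisson: number of events in an interval with an average rate of 3 events
poisson_pmf = stats.poisson.pmf(k, mu=3)

for outcome, b, p in zip(k, binom_pmf, poisson_pmf):
    print(f'k={outcome:2d}  binomial={b:.4f}  poisson={p:.4f}')
```

The binomial probabilities over `0..10` sum to 1 because those are all possible outcomes, while the Poisson probabilities over the same range sum to slightly less than 1 because a Poisson variable has no upper limit.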

## 4. Hypothesis Testing

**Hypothesis testing** uses sample data to assess whether an assumption about a population parameter (the null hypothesis) is plausible.

In [None]:
# One-sample t-test: null hypothesis is that the population mean equals 3
sample_data = [2.1, 2.5, 2.8, 3.0, 3.3]
t_stat, p_value = stats.ttest_1samp(sample_data, popmean=3)

print(f'T-statistic: {t_stat}')
print(f'P-value: {p_value}')
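
To make the test less of a black box, the same statistic can be computed by hand: the t-statistic is the difference between the sample mean and the hypothesized mean, divided by the standard error of the mean. The sketch below assumes a two-sided test and uses illustrative variable names.

```python
import numpy as np
from scipy import stats

sample_data = [2.1, 2.5, 2.8, 3.0, 3.3]
popmean = 3
n = len(sample_data)

# Standard error of the mean, using the sample standard deviation (ddof=1)
sample_mean = np.mean(sample_data)
standard_error = np.std(sample_data, ddof=1) / np.sqrt(n)

# t-statistic: distance from the hypothesized mean, in standard-error units
t_manual = (sample_mean - popmean) / standard_error

# Two-sided p-value from the t distribution with n - 1 degrees of freedom
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)

print(f'Manual T-statistic: {t_manual}')
print(f'Manual P-value: {p_manual}')
```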

## 5. P-values and Confidence Intervals

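A **p-value** is the probability, assuming the null hypothesis is true, of observing a result at least as extreme as the one actually observed; a small p-value is evidence against the null hypothesis. A **confidence interval** gives a range of plausible values for a population parameter at a stated confidence level.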

In [None]:
# Confidence interval for a sample mean
confidence_level = 0.95
sample_mean = np.mean(sample_data)
sample_std = np.std(sample_data, ddof=1)
n = len(sample_data)
margin_error = stats.t.ppf((1 + confidence_level) / 2, df=n-1) * (sample_std / np.sqrt(n))

lower_bound = sample_mean - margin_error
upper_bound = sample_mean + margin_error

print(f'Confidence Interval: ({lower_bound}, {upper_bound})')
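
As a cross-check, SciPy can produce the same kind of interval directly. The sketch below uses `stats.sem` for the standard error and `stats.t.interval` for the bounds, passing the confidence level and degrees of freedom positionally; it assumes the same `sample_data` as the t-test above.

```python
import numpy as np
from scipy import stats

sample_data = [2.1, 2.5, 2.8, 3.0, 3.3]
n = len(sample_data)

sample_mean = np.mean(sample_data)
# stats.sem uses ddof=1 by default, matching the sample standard deviation
standard_error = stats.sem(sample_data)

# 95% confidence interval from the t distribution with n - 1 degrees of freedom
lower, upper = stats.t.interval(0.95, n - 1, loc=sample_mean, scale=standard_error)

print(f'Confidence Interval: ({lower}, {upper})')
```

For a two-sided one-sample t-test, a hypothesized mean that falls inside the 95% interval corresponds to a p-value above 0.05, so the interval and the test lead to the same conclusion.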