# Mean
The mean, often referred to as the average, is a measure of central tendency. In its most common form, the arithmetic mean, it is calculated by summing all the values in a dataset and dividing by the number of values. It summarizes a dataset with a single representative value, and different types of mean suit different kinds of data.

## Types of Mean

In [1]:
import numpy as np
import statistics
from scipy import stats

### 1. Arithmetic Mean
The arithmetic mean is the most common type of mean. It is calculated using the formula:
$$
\text{Arithmetic Mean} = \frac{\sum_{i=1}^{n} x_i}{n}
$$

where \( x_i \) represents each value in the dataset and \( n \) is the total number of values.

$$
AM = \frac{x_1 + x_2 + \cdots + x_n}{n}
$$

**When to use it:**

- General purpose averaging for data on an interval or ratio scale
- When values are independent of each other
- For data without extreme outliers
- Examples: Average test scores, average temperature, average sales per day


In [2]:
data = [2, 4, 6, 8, 10]

# Using NumPy
am_numpy = np.mean(data)
print(f"NumPy Arithmetic Mean: {am_numpy}")  # 6.0

# Using statistics module
am_stats = statistics.mean(data)
print(f"Statistics Module Mean: {am_stats}")  # 6 (an int, because the inputs are ints)

# Manual calculation
am_manual = sum(data) / len(data)
print(f"Manual Arithmetic Mean: {am_manual}")  # 6.0


NumPy Arithmetic Mean: 6.0
Statistics Module Mean: 6
Manual Arithmetic Mean: 6.0
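
To illustrate the caveat about extreme outliers, the following minimal sketch appends a single large value to the same data. The value `100` and the variable names are hypothetical, chosen only for illustration.

```python
import statistics

data = [2, 4, 6, 8, 10]

# Hypothetical outlier appended for illustration only
data_with_outlier = data + [100]

mean_clean = statistics.mean(data)
mean_outlier = statistics.mean(data_with_outlier)

print(f"Mean without outlier: {mean_clean}")
print(f"Mean with outlier: {mean_outlier:.2f}")
```

A single extreme value moves the arithmetic mean from 6 to roughly 21.67, which no longer resembles any of the original five values.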


### 2. Geometric Mean
The geometric mean is the nth root of the product of n numbers. It is defined only for positive values and is calculated using the formula:

$$
\text{Geometric Mean} = \left( \prod_{i=1}^{n} x_i \right)^{\frac{1}{n}}
$$

where \( x_i \) represents each value in the dataset and \( n \) is the total number of values.

$$
GM = (x_1 \cdot x_2 \cdots x_n)^{1/n}
$$

**When to use it:**
- Growth rates and compound interest
- Data that are multiplicative rather than additive
- Ratios and percentages
- When data spans multiple orders of magnitude
- Examples: Investment returns, population growth, bacterial growth

In [3]:
# Growth rates (e.g., 10% growth = 1.10, 5% loss = 0.95)
growth_rates = [1.10, 1.05, 0.95, 1.15, 1.08]

# Using SciPy
gm_scipy = stats.gmean(growth_rates)
print(f"SciPy Geometric Mean: {gm_scipy:.4f}")  # 1.0639

# Using manual calculation
product = np.prod(growth_rates)
gm_manual = product ** (1/len(growth_rates))
print(f"Manual Geometric Mean: {gm_manual:.4f}")  # 1.0639

# What this means: the average growth rate is about 6.39% per period

SciPy Geometric Mean: 1.0639
Manual Geometric Mean: 1.0639
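
For long sequences, multiplying every value first can overflow or underflow. A common alternative is to average the logarithms and exponentiate the result, which is algebraically the same quantity. The sketch below uses only the standard library; the variable names are illustrative, and it assumes all values are strictly positive.

```python
import math

growth_rates = [1.10, 1.05, 0.95, 1.15, 1.08]

# Geometric mean via logs: exp(mean(log(x))). Requires every x > 0.
log_mean = sum(math.log(x) for x in growth_rates) / len(growth_rates)
gm_log = math.exp(log_mean)

# Direct product form, for comparison
gm_direct = math.prod(growth_rates) ** (1 / len(growth_rates))

print(f"Log-domain Geometric Mean: {gm_log:.4f}")
print(f"Direct Geometric Mean: {gm_direct:.4f}")
```

Both forms agree up to floating-point rounding. The log form is usually the safer choice when the product could become very large or very small.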


### 3. Harmonic Mean
The reciprocal of the arithmetic mean of the reciprocals of the data points.

$$
\text{Harmonic Mean} = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}
$$
where \( x_i \) represents each value in the dataset and \( n \) is the total number of values.

$$
HM = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}}
$$

**When to use it:**
- Rates and ratios
- Average speeds when distances are equal
- Precision/Recall in machine learning (F1-score)
- Quantities expressed per unit of time (such as speed), where the amount of work or distance is the same for each observation
- Examples: Average speed, parallel resistance, P/E ratios in finance
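
As a small illustration of the F1-score item, the sketch below checks that the usual F1 formula is the harmonic mean of precision and recall. The precision and recall values are hypothetical, not taken from any real model.

```python
import statistics

# Hypothetical classifier metrics, for illustration only
precision = 0.5
recall = 1.0

# Standard F1 formula
f1_formula = 2 * precision * recall / (precision + recall)

# Same quantity computed as a harmonic mean
f1_harmonic = statistics.harmonic_mean([precision, recall])

# Arithmetic mean for comparison; it is less sensitive to the weaker metric
arithmetic = (precision + recall) / 2

print(f"F1 (formula): {f1_formula:.4f}")
print(f"F1 (harmonic mean): {f1_harmonic:.4f}")
print(f"Arithmetic mean: {arithmetic:.4f}")
```

The harmonic mean is pulled toward the smaller of the two values, which is why F1 penalizes a model that does well on only one of the two metrics.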

In [5]:
# Speeds for equal distances (km/h)
speeds = [60, 80, 40, 77, 120]

# Using SciPy
hm_scipy = stats.hmean(speeds)
print(f"SciPy Harmonic Mean: {hm_scipy:.2f}")  # 66.24

# Using statistics module
hm_stats = statistics.harmonic_mean(speeds)
print(f"Statistics Harmonic Mean: {hm_stats:.2f}")  # 66.24

# Why harmonic mean for speeds?
# If you travel 5 equal distances, one at each of these speeds,
# the harmonic mean gives the true average speed (total distance / total time)

SciPy Harmonic Mean: 66.24
Statistics Harmonic Mean: 66.24
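
The claim about average speed can be checked directly by computing total distance divided by total time. The sketch below assumes a hypothetical leg length of 10 km; any positive length gives the same result, because it cancels out.

```python
import statistics

speeds = [60, 80, 40, 77, 120]  # km/h, one per leg

# Hypothetical leg length in km; the exact value does not affect the result
leg_km = 10

total_distance = leg_km * len(speeds)
total_time = sum(leg_km / s for s in speeds)  # hours
true_avg_speed = total_distance / total_time

hm = statistics.harmonic_mean(speeds)
am = statistics.mean(speeds)

print(f"Total distance / total time: {true_avg_speed:.2f}")
print(f"Harmonic mean: {hm:.2f}")
print(f"Arithmetic mean: {am:.2f}")
```

The arithmetic mean (75.40) overstates the average speed because more time is spent on the slower legs.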


### 4. Weighted Mean
An arithmetic mean where each value has a specific weight.

$$
\text{Weighted Mean} = \frac{\sum_{i=1}^{n} w_i \cdot x_i}{\sum_{i=1}^{n} w_i}
$$

where \( w_i \) represents the weight for each value \( x_i \).

$$
WM = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{w_1 + w_2 + \cdots + w_n}
$$

**When to use it:**
- When some values are more important than others
- GPA calculation (credits as weights)
- Stock indices (market cap as weights)
- Survey data with different sample sizes
- Examples: Course grades with credit hours, portfolio returns


In [6]:
# Student grades and corresponding credit hours
grades = [85, 92, 78, 90]
credits = [3, 4, 2, 3]  # weights

# Using NumPy
wm_numpy = np.average(grades, weights=credits)
print(f"NumPy Weighted Mean: {wm_numpy:.2f}")  # 87.42

# Manual calculation
weighted_sum = sum(g * w for g, w in zip(grades, credits))
total_weights = sum(credits)
wm_manual = weighted_sum / total_weights
print(f"Manual Weighted Mean: {wm_manual:.2f}")  # 87.42

NumPy Weighted Mean: 87.42
Manual Weighted Mean: 87.42
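
Two properties follow from the formula: scaling every weight by the same factor leaves the result unchanged, and equal weights reduce to the arithmetic mean. The sketch below checks both using a hypothetical helper, `weighted_mean`, written here for illustration rather than taken from a library.

```python
def weighted_mean(values, weights):
    """Hypothetical helper: sum(w * x) / sum(w)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


grades = [85, 92, 78, 90]
credits = [3, 4, 2, 3]

wm = weighted_mean(grades, credits)

# Multiplying all weights by 10 does not change the weighted mean
wm_scaled = weighted_mean(grades, [10 * w for w in credits])

# Equal weights give the ordinary arithmetic mean
wm_equal = weighted_mean(grades, [1] * len(grades))
plain_mean = sum(grades) / len(grades)

print(f"Weighted mean: {wm:.2f}")
print(f"Scaled weights: {wm_scaled:.2f}")
print(f"Equal weights: {wm_equal:.2f} vs arithmetic mean: {plain_mean:.2f}")
```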


### 5. Trimmed Mean
An arithmetic mean calculated after removing a percentage of extreme values from both ends.

$$
\text{Trimmed Mean} = \frac{\sum_{i=k+1}^{n-k} x_i}{n - 2k}
$$
where \( x_1 \le x_2 \le \dots \le x_n \) are the values sorted in ascending order and \( k \) is the number of values trimmed from each end.

$$TM = \frac{x_{k+1} + x_{k+2} + \cdots + x_{n-k}}{n - 2k}$$

**When to use it:**
- Data with outliers that shouldn't be completely ignored
- Robust statistics when you want to reduce outlier influence
- Judged competitions (like Olympic judging) and economic data with extreme values
- When data might have measurement errors
- Examples: Average income with outliers removed, average test scores excluding extreme values

In [7]:
# Data with potential outliers
data_with_outliers = [15, 18, 19, 20, 21, 22, 23, 24, 25, 100]

# Using SciPy (trim 10% from each end)
trimmed_scipy = stats.trim_mean(data_with_outliers, 0.1)
print(f"SciPy Trimmed Mean: {trimmed_scipy:.2f}")  # 21.50

# Compare with regular mean
regular_mean = np.mean(data_with_outliers)
print(f"Regular Mean: {regular_mean:.2f}")  # 28.70

# The trimmed mean is much more representative of the central data!

SciPy Trimmed Mean: 21.50
Regular Mean: 28.70
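
The trimmed mean formula can also be written out by hand: sort the values, drop `k` from each end, and average the rest. The helper `trimmed_mean` below is a hypothetical sketch, not a library function. It assumes `k` is the trim proportion times `n`, truncated to an integer.

```python
def trimmed_mean(values, proportion):
    """Hypothetical helper: mean after dropping k values from each end."""
    ordered = sorted(values)
    n = len(ordered)
    k = int(proportion * n)  # number of values trimmed from each end
    kept = ordered[k:n - k]
    if not kept:
        raise ValueError("proportion too large: no values left after trimming")
    return sum(kept) / len(kept)


data_with_outliers = [15, 18, 19, 20, 21, 22, 23, 24, 25, 100]

trimmed_manual = trimmed_mean(data_with_outliers, 0.1)
print(f"Manual Trimmed Mean: {trimmed_manual:.2f}")
```

With 10 values and a 10% trim, `k` is 1, so the smallest value (15) and the largest (100) are dropped and the remaining eight values average to 21.50.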


## Comparison Example: When to Use Which Mean


In [10]:
# Example: Investment returns over 3 years
returns = [1.10, 1.25, 0.80]  # 10% gain, 25% gain, 20% loss

print("Investment Returns Analysis:")
print(f"Arithmetic Mean: {np.mean(returns):.3f}")
print(f"Geometric Mean: {stats.gmean(returns):.3f}")
print(f"Harmonic Mean: {stats.hmean(returns):.3f}")

# Which is correct for investment growth?
final_value = 1.0
for r in returns:
    final_value *= r
    
print(f"Actual growth over 3 periods: {final_value:.3f}")
print(f"Geometric mean applied 3 times: {stats.gmean(returns)**3:.3f}")

Investment Returns Analysis:
Arithmetic Mean: 1.050
Geometric Mean: 1.032
Harmonic Mean: 1.014
Actual growth over 3 periods: 1.100
Geometric mean applied 3 times: 1.100
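
The ordering in the output above is expected: for positive values, the arithmetic mean is always at least the geometric mean, which is always at least the harmonic mean. The sketch below checks this with the standard library and shows what happens if the arithmetic mean is compounded instead of the geometric mean. The variable names are illustrative.

```python
import statistics

returns = [1.10, 1.25, 0.80]  # 10% gain, 25% gain, 20% loss

am = statistics.fmean(returns)
gm = statistics.geometric_mean(returns)
hm = statistics.harmonic_mean(returns)

# Actual compounded growth over the three periods
actual_growth = 1.0
for r in returns:
    actual_growth *= r

# Compounding the arithmetic mean overstates the growth
am_compounded = am ** len(returns)
gm_compounded = gm ** len(returns)

print(f"AM >= GM >= HM: {am >= gm >= hm}")
print(f"Actual growth: {actual_growth:.3f}")
print(f"Arithmetic mean compounded: {am_compounded:.3f}")
print(f"Geometric mean compounded: {gm_compounded:.3f}")
```

Compounding the arithmetic mean suggests growth of about 15.8%, while the investment actually grew by 10%, which is what the geometric mean reproduces.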
