### **Python `statistics` Module – Notes**

---

#### **1. What is the `statistics` module?**

* The `statistics` module provides **functions for calculating mathematical statistics** of numerical data.
* Introduced in Python 3.4.
* Commonly used for **mean, median, mode, variance, standard deviation**, etc.
* **Import it before use:**

```python
import statistics
```

---

#### **2. Common Functions in `statistics` Module**

##### **A) Mean (Average)**

* `statistics.mean(data)` → Returns the **arithmetic mean**.

```python
import statistics as stats

nums = [1, 2, 3, 4, 5]
print(stats.mean(nums))  # 3
```


##### **B) Median**

* `statistics.median(data)` → Returns the **middle value**.
* If the dataset has an even number of values, returns the **average of the two middle values**.

```python
nums = [1, 2, 3, 4, 5]
print(stats.median(nums))   # 3
nums2 = [1, 2, 3, 4]
print(stats.median(nums2))  # 2.5
```


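* **`median_low()`** and **`median_high()`** return the lower and the higher of the two middle values, respectively, when the dataset has an even number of values. Unlike `median()`, they always return a value that is actually in the data.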

```python
print(stats.median_low([1, 2, 3, 4, 5, 6]))   # 3
print(stats.median_high([1, 2, 3, 4, 5, 6]))  # 4
```


##### **C) Mode**

* `statistics.mode(data)` → Returns the **most common value**.

```python
nums = [1, 2, 2, 3]
print(stats.mode(nums))  # 2
```


If there are multiple modes and you need all of them, use `multimode()` (Python 3.8+):

```python
nums = [1, 1, 2, 2, 3]
print(stats.multimode(nums))  # [1, 2]
```


##### **D) Variance**

* `statistics.variance(data)` → Returns the **sample variance**.
* Measures how far numbers spread from the mean.

```python
nums = [2, 4, 4, 4, 5, 5, 7, 9]
print(stats.variance(nums))  # 4.571428571428571
```


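The **sample standard deviation** is the square root of the sample variance, so both calls give the same result here:

```python
import math

nums = [2, 4, 4, 4, 5, 5, 7, 9]
std = math.sqrt(stats.variance(nums))
print(std)                # 2.138089935299395
print(stats.stdev(nums))  # 2.138089935299395
```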


If you want the **population variance**, use `pvariance()`:

```python
nums = [2, 4, 4, 4, 5, 5, 7, 9]
print(stats.pvariance(nums))  # 4
```
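
The difference between the two variances can be sketched by hand. This is a minimal illustration, not how the module computes the result internally: both versions sum the squared deviations from the mean, but the sample variance divides by `n - 1` and the population variance divides by `n`. The variable names are chosen for this example only.

```python
import statistics as stats

nums = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(nums)
mean = sum(nums) / n

# Sum of squared deviations from the mean
squared_devs = sum((x - mean) ** 2 for x in nums)

sample_var = squared_devs / (n - 1)   # divisor used by variance()
population_var = squared_devs / n     # divisor used by pvariance()

print(sample_var, stats.variance(nums))
print(population_var, stats.pvariance(nums))
```

Use the sample version when the data is a sample drawn from a larger population, and the population version when the data covers the whole population.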


##### **E) Standard Deviation**

* `statistics.stdev(data)` → Returns the **sample standard deviation**.
* **`pstdev()`** for **population standard deviation**.


```python
nums = [2, 4, 4, 4, 5, 5, 7, 9]
print(stats.stdev(nums))   # 2.138089935299395
print(stats.pstdev(nums))  # 2.0
```


##### **F) Harmonic Mean** (Python 3.6+)

* `statistics.harmonic_mean(data)` → Calculates the **harmonic mean** (useful in rates/ratios).

```python
nums = [40, 60]
print(stats.harmonic_mean(nums))  # 48.0
```
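
To see why the harmonic mean suits rates, consider a hypothetical round trip: the same distance is driven at 40 km/h one way and 60 km/h back. The sketch below works out the average speed from total distance and total time and compares it with `harmonic_mean()`. The 120 km distance is an assumption for illustration; any distance gives the same average.

```python
import statistics as stats

distance = 120  # km each way (hypothetical value)

time_out = distance / 40    # hours at 40 km/h
time_back = distance / 60   # hours at 60 km/h

# Average speed = total distance / total time
avg_speed = (2 * distance) / (time_out + time_back)

print(avg_speed)
print(stats.harmonic_mean([40, 60]))
```

The arithmetic mean of the two speeds would be 50, which overstates the true average because more time is spent at the slower speed.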


##### **G) Other Useful Functions**

* `statistics.fmean(data)` (Python 3.8+) → **Fast mean** as a float (even if integers are provided).
* `statistics.geometric_mean(data)` (Python 3.8+) → **Geometric mean**.

```python
nums = [1, 3, 9]
print(stats.geometric_mean(nums))  # about 3.0 (this run printed 3.0000000000000004)
```
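
`fmean()` has no example above, so here is a short sketch of how its return type differs from `mean()`. With integer input whose average is a whole number, `mean()` keeps the integer type, while `fmean()` always returns a float. The list `whole` is made up for this example.

```python
import statistics as stats

whole = [2, 4, 6]

exact = stats.mean(whole)   # keeps the int type when the result is whole
fast = stats.fmean(whole)   # always a float

print(exact, type(exact).__name__)
print(fast, type(fast).__name__)
```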


#### **3. When to Use Which Function?**

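* **mean:** To find the typical value when the data has no extreme outliers (for example, an average score).
* **median:** To find the middle value; it is less affected by outliers than the mean.
* **mode:** To find the most frequent value.
* **variance & stdev:** To measure how widely the values spread around the mean.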
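
A small sketch of the outlier effect mentioned in this list, using made-up salary figures (in thousands): one very large value pulls the mean far above most of the data, while the median stays close to the typical value.

```python
import statistics as stats

# Hypothetical salaries in thousands; 400 is the outlier
salaries = [30, 32, 35, 38, 400]

avg = stats.mean(salaries)
mid = stats.median(salaries)

print(avg)  # 107
print(mid)  # 35
```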

#### **4. Practice Exercises**

##### **Beginner:**

1. Calculate the mean, median, and mode for `[5, 10, 15, 20, 25]`.
2. Find the harmonic mean of `[10, 20, 40]`.

##### **Intermediate:**

3. Calculate the variance and standard deviation for `[2, 4, 4, 4, 5, 5, 7, 9]`.
4. For `[1, 1, 2, 2, 3]`, find all the modes using `multimode()`.

##### **Advanced:**

5. Write a function that accepts a list of numbers and returns a dictionary with **mean, median, variance, and standard deviation**.
6. Given the dataset `[10, 12, 23, 23, 16, 23, 21, 16]`, find the **population variance and population standard deviation**.

---

#### **5. Key Notes for Students**

* `statistics` functions work with lists, tuples, or any iterable containing numeric data.
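* `variance()` and `stdev()` require at least **two data points** and raise `StatisticsError` otherwise; `pvariance()` and `pstdev()` need only one.
* Before Python 3.8, `mode()` raises `StatisticsError` if there is no unique mode; from 3.8 onward it returns the first mode encountered. Use `multimode()` when you need every mode.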
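
The error cases in these notes can be handled with `try`/`except`. The sketch below is one possible approach; `safe_variance` is a hypothetical helper written for this example, not part of the `statistics` module.

```python
import statistics as stats

def safe_variance(data):
    """Return the sample variance, or None if there is too little data."""
    try:
        return stats.variance(data)
    except stats.StatisticsError:
        # Raised when fewer than two data points are supplied
        return None

print(safe_variance([5]))             # None
print(safe_variance([2, 4]))          # 2
print(stats.multimode([1, 1, 2, 2]))  # [1, 2]
```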