# 📘 Descriptive Statistics: Key Measures and Exercises
This notebook introduces five core descriptive statistics, each with a worked example and a set of hometasks (practice exercises with solutions). Topics include:
- Mean
- Median
- Mode
- Variance
- Standard Deviation

## 🔹 Mean (Average)
The **mean** is the sum of all values divided by the number of values.

In [3]:
import numpy as np

data = [10, 12, 14, 16, 18]
mean_value = np.mean(data)
mean_value

np.float64(14.0)
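
To make the definition concrete, here is a minimal sketch that computes the mean from its definition (sum divided by count) using only built-in functions. The variable names `total`, `count`, and `manual_mean` are illustrative and not part of the original notebook.

```python
# Mean from its definition: sum of the values divided by the number of values.
data = [10, 12, 14, 16, 18]

total = sum(data)            # 10 + 12 + 14 + 16 + 18 = 70
count = len(data)            # 5 values
manual_mean = total / count  # 70 / 5

print(manual_mean)  # 14.0
```

The result matches the value returned by `np.mean(data)` above.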

### 🏠 Mean: Hometasks
1. Calculate the mean for the dataset: `[55, 60, 65, 70, 75, 80]`
2. A worker's weekly hours are: `[40, 42, 38, 41, 44]`. What is the mean weekly working time?
3. Calculate the mean age of individuals: `[23, 29, 35, 41, 47, 53, 59]`

In [4]:
l1=[55, 60, 65, 70, 75, 80]
l1_mean=np.mean(l1)
l1_mean

np.float64(67.5)

In [6]:
import statistics as ss  # standard-library alternative to NumPy

l2=[40, 42, 38, 41, 44]
ss.mean(l2)

41

In [10]:
l3=[23, 29, 35, 41, 47, 53, 59]
l3_mean=sum(l3)/len(l3)
print(l3_mean)

l3_mean_2=np.mean(l3)
l3_mean_2

41.0


np.float64(41.0)

In [22]:
import pandas as pd

l4= pd.Series([23, 29, 35, 41, 47, 53, 59])
l4_mean=l4.mean()
l4_mean

np.float64(41.0)

## 🔹 Median
The **median** is the middle value in an ordered dataset. If there is an even number of observations, the median is the average of the two middle values.

In [11]:
data = [10, 20, 30, 40, 50, 60]
median_value = np.median(data)
median_value

np.float64(35.0)
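
The two cases in the definition (odd and even number of observations) can be sketched by hand as follows. The helper name `manual_median` is hypothetical and is only meant to illustrate the rule, not to replace `np.median`.

```python
def manual_median(values):
    """Return the median of a non-empty list of numbers (illustrative helper)."""
    ordered = sorted(values)  # the median is defined on ordered data
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single middle value
        return ordered[mid]
    # Even count: the average of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


print(manual_median([10, 20, 30, 40, 50, 60]))  # 35.0
print(manual_median([12, 15, 14, 11, 13]))      # 13
```

Sorting first matters: the second list is unordered, so taking its middle element directly would give 14 instead of 13.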

### 🏠 Median: Hometasks
1. Find the median of: `[12, 15, 14, 11, 13]`
2. Find the median income for: `[1200, 1500, 1300, 1100, 1600, 1700]`
3. Given house prices: `[100000, 150000, 200000, 250000, 300000, 350000, 400000]`, calculate the median.

In [16]:
m1=[12, 15, 14, 11, 13]
m1_median=np.median(m1)
m1_median

np.float64(13.0)

In [17]:
m2=[1200, 1500, 1300, 1100, 1600, 1700]
m2_median=ss.median(m2)
m2_median

1400.0

In [23]:
m3=pd.Series([100000, 150000, 200000, 250000, 300000, 350000, 400000])
m3.median()

250000.0

In [25]:
m4=[100000, 150000, 200000, 250000, 300000, 350000, 400000]
# Manual median: valid only for an odd number of values (takes the middle element of the sorted list)
sorted(m4)[len(m4)//2]

250000

## 🔹 Mode
The **mode** is the value that appears most frequently in a dataset.

In [26]:
import statistics as stats
data = [1, 2, 2, 3, 4, 4, 4, 5]
mode_value = stats.mode(data)
mode_value

4
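
As a rough illustration of "most frequent value", the sketch below counts occurrences with `collections.Counter` and then shows what happens when two values tie. The `tied` list is made-up example data, and `statistics.multimode` assumes Python 3.8 or newer.

```python
from collections import Counter
from statistics import multimode

data = [1, 2, 2, 3, 4, 4, 4, 5]

# Count how often each value occurs, then take the most frequent one
counts = Counter(data)
print(counts.most_common(1))  # [(4, 3)]  -> the value 4 occurs 3 times

# When several values share the highest count, multimode returns all of them
tied = [1, 1, 2, 2, 3]
print(multimode(tied))  # [1, 2]
```

A dataset can therefore have more than one mode, which is why `pandas.Series.mode()` returns a Series rather than a single number.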

### 🏠 Mode: Hometasks
1. Find the mode of `[5, 6, 7, 8, 8, 9, 10]`
2. A class's test scores: `[45, 55, 65, 65, 75, 85, 95, 65]`. What is the mode?
3. Determine the mode in household sizes: `[2, 3, 3, 4, 4, 4, 5, 6]`

In [27]:
n1=[5, 6, 7, 8, 8, 9, 10]
n1_mode=stats.mode(n1)
n1_mode

8

In [33]:
n2=pd.Series([45, 55, 65, 65, 75, 85, 95, 65])
n2_mode=n2.mode()
n2_mode

0    65
dtype: int64

In [34]:
n3=[2, 3, 3, 4, 4, 4, 5, 6]
print(stats.mode(n3))

4


## 🔹 Variance
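The **variance** is the average squared deviation from the mean, so it measures how spread out the values in a dataset are. A higher variance means more variability. The population variance divides the sum of squared deviations by n (`ddof=0`), while the sample variance divides by n - 1 (`ddof=1`).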

In [53]:
data = [5, 10, 15, 20, 25]
variance_value = np.var(data, ddof=1)  # Sample variance
variance_value

np.float64(62.5)
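
The same result can be reproduced step by step. The sketch below is a plain-Python illustration (the names `squared_devs`, `population_variance`, and `sample_variance` are chosen here for clarity) showing how the choice of divisor changes the answer.

```python
from statistics import pvariance, variance

data = [5, 10, 15, 20, 25]
n = len(data)
mean = sum(data) / n  # 15.0

# Squared deviations from the mean: 100, 25, 0, 25, 100 (sum = 250)
squared_devs = [(x - mean) ** 2 for x in data]

population_variance = sum(squared_devs) / n    # divide by n      (ddof=0)
sample_variance = sum(squared_devs) / (n - 1)  # divide by n - 1  (ddof=1)

print(population_variance)  # 50.0
print(sample_variance)      # 62.5

# Cross-check against the standard library
print(population_variance == pvariance(data))  # True
print(sample_variance == variance(data))       # True
```

Note that `np.var` defaults to `ddof=0` (population), whereas `pandas.Series.var` defaults to `ddof=1` (sample), so it is worth passing `ddof` explicitly.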

### 🏠 Variance: Hometasks
1. Find the variance of `[2, 4, 6, 8, 10]`
2. Calculate the variance for population: `[100, 150, 200, 250, 300]`
3. Compute the sample variance for exam scores: `[78, 82, 85, 90, 94]`

In [57]:
v1=[2, 4, 6, 8, 10]
print(np.var(v1))   # population variance (ddof=0 is NumPy's default)
np.var(v1, ddof=1)  # sample variance

8.0


np.float64(10.0)

In [58]:
v2=pd.Series([100, 150, 200, 250, 300])
v2.var(ddof=0)  # population variance; pandas defaults to ddof=1 (sample variance)

5000.0

In [94]:
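v3=[78, 82, 85, 90, 94]
v3_mean=np.mean(v3)
# Squared deviations from the mean (avoid naming a variable `list`, which shadows the built-in)
squared_devs=[(item-v3_mean)**2 for item in v3]
variance=sum(squared_devs)/(len(v3)-1)  # sample variance: divide by n - 1
variance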

np.float64(40.2)

In [96]:
np.var(v3,ddof=1)

np.float64(40.2)

## 🔹 Standard Deviation
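**Standard deviation** is the square root of the variance. It also measures the spread of the data, but in the same units as the data itself, which makes it easier to interpret than the variance.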

In [98]:
data = [5, 10, 15, 20, 25]  # same dataset as in the variance example
std_dev = np.std(data, ddof=1)  # sample standard deviation
std_dev

np.float64(7.905694150420948)
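
To check the "square root of the variance" relationship directly, here is a small sketch using only the standard library. It assumes the dataset `[5, 10, 15, 20, 25]` and the sample (n - 1) convention; the names `sample_variance` and `sample_std` are illustrative.

```python
import math
from statistics import stdev, variance

data = [5, 10, 15, 20, 25]

sample_variance = variance(data)        # 62.5, divides by n - 1
sample_std = math.sqrt(sample_variance)

print(round(sample_std, 4))  # 7.9057

# The library function agrees with the square root of the variance
print(math.isclose(sample_std, stdev(data)))  # True
```

Squaring the standard deviation recovers the variance, so the two measures carry the same information in different units.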

### 🏠 Standard Deviation: Hometasks
1. Calculate the standard deviation of: `[10, 12, 14, 16, 18]`
2. What is the standard deviation for population sizes: `[1000, 1050, 1100, 1150, 1200]`?
3. Find the sample standard deviation for: `[30, 35, 40, 45, 50]`

In [103]:
d1=[10, 12, 14, 16, 18]
std_dev_d1=np.std(d1,ddof=1)
std_dev_d1

np.float64(3.1622776601683795)

In [104]:
d2=pd.Series([1000, 1050, 1100, 1150, 1200])
d2.std()

79.05694150420949

In [105]:
np.std([1000, 1050, 1100, 1150, 1200],ddof=1)

np.float64(79.05694150420949)

In [111]:
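n3=[30, 35, 40, 45, 50]
n3_mean=np.mean(n3)
# Squared deviations from the mean
list_std=[(val-n3_mean)**2 for val in n3]
# Sample standard deviation: square root of the sum of squared deviations divided by n - 1
std_dev_n3=np.sqrt(sum(list_std)/(len(n3)-1))
std_dev_n3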

np.float64(7.905694150420948)

In [112]:
np.std(n3,ddof=1)

np.float64(7.905694150420948)