# Descriptive Statistics Exercises and Solutions
This notebook contains basic exercises on descriptive statistics: mean, median, mode, variance, standard deviation, the five-number summary, and boxplots.

## Exercise 1: Calculate Measures of Central Tendency
**Task:** Given the following data, calculate the mean, median, and mode:
`data = [12, 15, 12, 18, 19, 12, 16, 14, 18, 19]`

In [None]:
import statistics as stats
import numpy as np
import pandas as pd

data = [12, 15, 12, 18, 19, 12, 16, 14, 18, 19]
mean = np.mean(data)
median = np.median(data)
mode = stats.mode(data)  # returns a single value; stats.multimode(data) lists every tied mode
mean, median, mode
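
The cell above reports one mode, which is enough for this dataset because 12 is the only most frequent value. As a minimal sketch of what happens when several values tie, the block below compares `statistics.mode` with `statistics.multimode` (Python 3.8+). The list `tied` is a hypothetical example, not part of the exercise data.

```python
import statistics as stats

data = [12, 15, 12, 18, 19, 12, 16, 14, 18, 19]
tied = [1, 1, 2, 2, 3]  # hypothetical data with two equally frequent values

# mode() returns only the first most-frequent value it encounters
first_mode = stats.mode(tied)

# multimode() returns every most-frequent value, in order of first appearance
all_modes = stats.multimode(tied)
exercise_modes = stats.multimode(data)

print(first_mode, all_modes, exercise_modes)
```

When a dataset may have more than one mode, `multimode` (or `pandas.Series.mode()`, used in the homework below) avoids silently dropping the others.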

## Exercise 2: Calculate Measures of Dispersion
**Task:** Using the same data, calculate the variance and standard deviation.

In [None]:
variance = np.var(data, ddof=1)  # Sample variance
std_dev = np.std(data, ddof=1)  # Sample standard deviation
variance, std_dev
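
To make the role of `ddof=1` concrete, here is a minimal sketch that computes the same quantities by hand with the standard library only. The variable names are illustrative; the only assumption is the usual definition of variance as the sum of squared deviations divided by `n` (population) or `n - 1` (sample).

```python
import math

data = [12, 15, 12, 18, 19, 12, 16, 14, 18, 19]

n = len(data)
mean = sum(data) / n

# Sum of squared deviations from the mean
squared_devs = sum((x - mean) ** 2 for x in data)

population_variance = squared_devs / n       # corresponds to ddof=0
sample_variance = squared_devs / (n - 1)     # corresponds to ddof=1
sample_std = math.sqrt(sample_variance)

print(population_variance, sample_variance, sample_std)
```

The sample variance is always the larger of the two, because dividing by `n - 1` compensates for estimating the mean from the same data.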

## Exercise 3: Five Number Summary
**Task:** Compute the minimum, Q1, median (Q2), Q3, and maximum.

In [None]:
min_val = np.min(data)
q1 = np.percentile(data, 25)
q2 = np.percentile(data, 50)
q3 = np.percentile(data, 75)
max_val = np.max(data)
min_val, q1, q2, q3, max_val
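
Quartiles can be defined in several ways, and `np.percentile` uses linear interpolation between neighbouring sorted values by default. The sketch below reproduces that rule by hand so the Q1, Q2, and Q3 values are easier to follow. The helper `percentile_linear` is a hypothetical name introduced for illustration, not a NumPy function.

```python
import math

def percentile_linear(values, p):
    """Percentile p (0-100) using linear interpolation between sorted values."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * p / 100  # fractional index into the sorted list
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])

data = [12, 15, 12, 18, 19, 12, 16, 14, 18, 19]

q1 = percentile_linear(data, 25)
q2 = percentile_linear(data, 50)
q3 = percentile_linear(data, 75)

print(q1, q2, q3)
```

Other conventions (for example, taking the median of each half of the data) can give slightly different quartiles on small samples, so it helps to state which rule was used.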

## Exercise 4: Visualizing the Data
**Task:** Create a boxplot to visualize the distribution of the data.

In [None]:
import matplotlib.pyplot as plt
plt.boxplot(data, vert=False)
plt.title('Boxplot of Data')
plt.xlabel('Values')
plt.grid(True)
plt.show()

## 🏠 Homework Tasks for Descriptive Statistics
Solve the following tasks independently. Try to interpret your results as well.


### 📌 Homework: Measures of Central Tendency
#### Task 1:
Given the dataset: `[21, 23, 19, 25, 30, 21, 20, 19, 25, 23]`, calculate the **mean**, **median**, and **mode**.

#### Task 2:
You surveyed 15 households and collected the following data for the number of children: `[2, 3, 1, 4, 2, 3, 3, 2, 1, 4, 5, 3, 2, 1, 4]`. Find the **mean**, **median**, and **mode**.

#### Task 3:
In a small company, the monthly salaries (in hundreds) are: `[35, 40, 45, 50, 40, 35, 100, 45, 40, 35]`. Compute the **mean**, **median**, and **mode** and comment on any skewness.

In [None]:
# Task 1
x1 = [21, 23, 19, 25, 30, 21, 20, 19, 25, 23]
data1 = pd.Series(x1)

print(f"Mean: {data1.mean()}   Median: {data1.median()}   Mode: {data1.mode().tolist()}")

plt.boxplot(data1, vert=False)
plt.title('Boxplot of Data')
plt.xlabel('Values')
plt.grid(True)
plt.show()

# Mean (22.6) > median (22.0), which suggests a mild right skew.
# There are four modes (19, 21, 23, 25), each occurring twice.

In [None]:
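# Task 2
x2 = [2, 3, 1, 4, 2, 3, 3, 2, 1, 4, 5, 3, 2, 1, 4]
data2 = pd.Series(x2)

print(f"Mean: {round(data2.mean(), 2)}   Median: {data2.median()}   Mode: {data2.mode().tolist()}")

plt.boxplot(data2, vert=False)
plt.title('Boxplot of Data')
plt.xlabel('Values')
plt.grid(True)
plt.show()

# Median (3.0) > mean (2.67), which by the rule of thumb hints at left skew.
# The gap is small, though, and the sample skewness (data2.skew()) is slightly
# positive, so the distribution is better described as roughly symmetric.
# There are two modes (2 and 3), each occurring four times.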

In [None]:
# Task 3
x3 = [35, 40, 45, 50, 40, 35, 100, 45, 40, 35]
data3 = pd.Series(x3)

print(f"Mean: {data3.mean()}   Median: {data3.median()}   Mode: {data3.mode().tolist()}")

plt.boxplot(data3, vert=False)
plt.title('Boxplot of Data')
plt.xlabel('Values')
plt.grid(True)
plt.show()

# Mean (46.5) > median (40.0), so the data is right skewed.
# There are two modes (35 and 40); the outlier 100 pulls the mean upward.
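
Comparing the mean with the median is only a rule of thumb for skewness, and it can mislead on small or discrete datasets. As a hedged cross-check, the sketch below computes the moment-based skewness coefficient (third central moment divided by the 1.5 power of the second) for the three homework datasets. The function name `moment_skewness` is hypothetical, and this is the simple population form, so it will differ slightly from the bias-adjusted value that `pandas.Series.skew()` reports.

```python
def moment_skewness(values):
    """Population skewness: m3 / m2**1.5, using central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

x1 = [21, 23, 19, 25, 30, 21, 20, 19, 25, 23]
x2 = [2, 3, 1, 4, 2, 3, 3, 2, 1, 4, 5, 3, 2, 1, 4]
x3 = [35, 40, 45, 50, 40, 35, 100, 45, 40, 35]

skew1 = moment_skewness(x1)
skew2 = moment_skewness(x2)
skew3 = moment_skewness(x3)

# A positive value indicates a longer right tail, a negative value a longer left tail
print(skew1, skew2, skew3)
```

All three coefficients come out positive. For the number-of-children data the value is small and positive even though the median exceeds the mean, which is a reminder that the mean-versus-median comparison is a hint rather than a definition of skew.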

### 📌 Homework: Measures of Dispersion
#### Task 1:
Use the dataset: `[5, 10, 10, 10, 15, 20, 25, 25, 30]` to calculate **variance** and **standard deviation**.

#### Task 2:
Income levels (in AZN) for 8 families are: `[500, 700, 800, 600, 1200, 1000, 950, 1100]`. Calculate **variance** and **standard deviation**.

#### Task 3:
Evaluate the spread of scores in a classroom: `[60, 65, 70, 75, 80, 85, 90, 95, 100]`. What are the **sample variance** and **standard deviation**?

In [None]:
# Task 1
x4 = [5, 10, 10, 10, 15, 20, 25, 25, 30]
data4 = pd.Series(x4)

# pandas uses ddof=1 by default, so these are the sample statistics
print(f"Standard deviation: {round(data4.std(), 2)}   Variance: {round(data4.var(), 2)}")

In [None]:
# Task 2
x5 = [500, 700, 800, 600, 1200, 1000, 950, 1100]
data5 = pd.Series(x5)

print(f"Standard deviation: {round(data5.std(), 2)}   Variance: {round(data5.var(), 2)}")

In [None]:
# Task 3
x6 = [60, 65, 70, 75, 80, 85, 90, 95, 100]
data6 = pd.Series(x6)

print(f"Standard deviation: {round(data6.std(), 2)}   Variance: {round(data6.var(), 2)}")
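
One detail worth checking when mixing libraries: NumPy and pandas use different defaults for the divisor. The sketch below, which assumes both libraries are installed, runs the Task 1 data through each so the difference is visible. `np.var` defaults to `ddof=0` (population), while `pd.Series.var` defaults to `ddof=1` (sample).

```python
import numpy as np
import pandas as pd

x4 = [5, 10, 10, 10, 15, 20, 25, 25, 30]

numpy_default_var = float(np.var(x4))          # ddof=0: divides by n
numpy_sample_var = float(np.var(x4, ddof=1))   # ddof=1: divides by n - 1
pandas_default_var = float(pd.Series(x4).var())  # ddof=1 by default

print(numpy_default_var, numpy_sample_var, pandas_default_var)
```

Passing `ddof` explicitly in both libraries, as the earlier exercise cell does with NumPy, removes any ambiguity about which variance is being reported.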

### 📌 Homework: Five Number Summary
#### Task 1:
Compute the **five-number summary** for `[100, 102, 105, 107, 110, 113, 115, 117, 120]`.

#### Task 2:
Given household expenses in a community: `[230, 250, 270, 290, 310, 330, 350, 370, 390, 410]`, calculate the **minimum**, **Q1**, **Q2**, **Q3**, and **maximum**.

#### Task 3:
Analyze the five-number summary for the number of working hours per week: `[35, 36, 38, 40, 42, 44, 46, 48, 50, 55]`.

In [None]:
# Task 1
# Named data7 so the data1 Series from the central tendency homework is not overwritten
data7 = [100, 102, 105, 107, 110, 113, 115, 117, 120]
min_val = np.min(data7)
q1 = np.percentile(data7, 25)
q2 = np.percentile(data7, 50)
q3 = np.percentile(data7, 75)
max_val = np.max(data7)
min_val, q1, q2, q3, max_val

In [None]:
# Task 2
# Named data8 so the data2 Series from the central tendency homework is not overwritten
data8 = [230, 250, 270, 290, 310, 330, 350, 370, 390, 410]

min_val = np.min(data8)
q1 = np.percentile(data8, 25)
q2 = np.percentile(data8, 50)
q3 = np.percentile(data8, 75)
max_val = np.max(data8)

min_val, q1, q2, q3, max_val



In [None]:
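# Task 3
# Named data9 so the data3 Series from the central tendency homework is not overwritten
data9 = [35, 36, 38, 40, 42, 44, 46, 48, 50, 55]

min_val = np.min(data9)
q1 = np.percentile(data9, 25)
q2 = np.percentile(data9, 50)
q3 = np.percentile(data9, 75)
max_val = np.max(data9)

# Analysis: the summary is 35, 38.5, 43.0, 47.5, 55, so IQR = 9.
# The 1.5 * IQR fences are 25 and 61, so there are no outliers.
# The upper tail (47.5 to 55) is longer than the lower tail (35 to 38.5),
# which suggests a mild right skew.
min_val, q1, q2, q3, max_val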
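
The three cells above repeat the same five calls. As a sketch of how the pattern could be wrapped up, the helper below returns the five-number summary together with the conventional 1.5 × IQR outlier fences. The name `five_number_summary` and the dictionary keys are illustrative choices, not an existing NumPy or pandas API.

```python
import numpy as np

def five_number_summary(values):
    """Five-number summary plus the 1.5 * IQR fences used to flag outliers."""
    minimum, q1, median, q3, maximum = np.percentile(values, [0, 25, 50, 75, 100])
    iqr = q3 - q1
    return {
        "min": float(minimum),
        "q1": float(q1),
        "median": float(median),
        "q3": float(q3),
        "max": float(maximum),
        "lower_fence": float(q1 - 1.5 * iqr),
        "upper_fence": float(q3 + 1.5 * iqr),
    }

hours = [35, 36, 38, 40, 42, 44, 46, 48, 50, 55]
summary = five_number_summary(hours)

# Values outside the fences are the points a boxplot would draw individually
outliers = [h for h in hours
            if h < summary["lower_fence"] or h > summary["upper_fence"]]

print(summary)
print("Outliers:", outliers)
```

The fences are the same limits a default matplotlib boxplot uses for its whiskers, so this gives a numeric counterpart to the plots earlier in the notebook.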
