# Introduction to Statistics and Descriptive Metrics

## Theory
### Overview of Statistics:

Statistics is the science of collecting, analyzing, interpreting, and presenting data.
It is essential for ***data-driven decision-making***, especially in fields like `machine learning (ML)` and `deep learning (DL)`, where understanding data patterns and variability is crucial for building predictive models.
### Importance in ML/DL:

- **Data Understanding**: Helps in data exploration to uncover insights, detect outliers, and understand data distribution.
- **Model Performance**: Statistical knowledge is key to evaluating model results, selecting features, and optimizing parameters.
### Types of Data:

- **Nominal**: Categorical data without any order (e.g., gender, colors).
- **Ordinal**: Categorical data with a meaningful order, but without consistent intervals between categories (e.g., ratings like poor, fair, good).
- **Interval**: Numeric data with equal intervals, but no true zero (e.g., temperature in Celsius).
- **Ratio**: Numeric data with equal intervals and a true zero, allowing meaningful ratios (e.g., age, weight).
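
As a rough illustration, the four data types above might be represented in pandas as follows. The column names and values (`color`, `rating`, `temp_c`, `weight_kg`) are made up for this sketch and are not part of any dataset used in this tutorial. An ordered `Categorical` is one way to model ordinal data: comparisons and `min`/`max` are meaningful, but arithmetic is not.

```python
import pandas as pd

# Hypothetical example data, one column per measurement level
df = pd.DataFrame({
    "color": ["red", "blue", "red", "green"],     # nominal
    "rating": ["poor", "good", "fair", "good"],   # ordinal
    "temp_c": [21.5, 19.0, 25.0, 22.5],           # interval
    "weight_kg": [60.0, 72.5, 80.0, 55.0],        # ratio
})

# Nominal: categories with no order
df["color"] = df["color"].astype("category")

# Ordinal: categories with an explicit order
rating_levels = ["poor", "fair", "good"]
df["rating"] = pd.Categorical(df["rating"], categories=rating_levels, ordered=True)

# Ordered categoricals support min/max and comparisons
lowest_rating = df["rating"].min()
n_at_least_fair = int((df["rating"] >= "fair").sum())

# Ratio data has a true zero, so ratios are meaningful
weight_ratio = df["weight_kg"].max() / df["weight_kg"].min()

print(lowest_rating, n_at_least_fair, round(weight_ratio, 2))
```

A ratio such as "25 °C is 1.3 times as hot as 19 °C" would not be meaningful, because the Celsius zero point is arbitrary.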
### Descriptive Statistics:

- **Mean**: The average of all values.
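- **Median**: The middle value when the data is ordered; with an even number of values, the average of the two middle values.
- **Mode**: The most frequently occurring value; a dataset can have more than one mode.
- **Range**: The difference between the highest and lowest values.
- **Variance**: The average squared deviation of data points from the mean, measuring how spread out the data is.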
- **Standard Deviation**: The square root of variance, indicating the typical distance from the mean.
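
To make these definitions concrete, here is a minimal from-scratch sketch in plain Python. The helper names (`mean`, `median`, `sample_variance`, `sample_std`) and the `values` list are hypothetical, chosen only for illustration. The variance helper divides by `n - 1` (the sample variance), which is an assumption made to match the `ddof=1` setting used in the practical section.

```python
import math

def mean(values):
    """Sum of the values divided by how many there are."""
    return sum(values) / len(values)

def median(values):
    """Middle value of the sorted data (average of the two middle values if n is even)."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values) / (len(values) - 1)

def sample_std(values):
    """Square root of the sample variance."""
    return math.sqrt(sample_variance(values))

values = [2, 4, 4, 5, 10]  # hypothetical data

print(mean(values))                # 5.0
print(median(values))              # 4
print(max(values) - min(values))   # 8 (range)
print(sample_variance(values))     # 9.0
print(sample_std(values))          # 3.0
```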

---

## Practical: Calculating Descriptive Statistics in Python
Let's apply these concepts using Python with NumPy and Pandas.

#### Step 1: Import Required Libraries

In [1]:
import numpy as np
import pandas as pd

#### Step 2: Create a Sample Dataset

In [2]:
data = [12, 15, 14, 10, 18, 20, 16, 15, 19, 11]

#### Step 3: Calculate Descriptive Statistics

#### 1. Mean

In [3]:
mean_value = np.mean(data)
print(f"Mean: {mean_value}")

Mean: 15.0


#### 2. Median 

In [4]:
median_value = np.median(data)
print(f"Median: {median_value}")

Median: 15.0


#### 3. Mode

In [5]:
mode_value = pd.Series(data).mode()[0]  # Using Pandas for mode
print(f"Mode: {mode_value}")

Mode: 15
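
Note that `Series.mode()` returns every value tied for the highest frequency, sorted in ascending order, so indexing with `[0]` keeps only the smallest of them. The short sketch below uses a made-up list (`bimodal`, not part of the dataset above) to show what that looks like when there are two modes.

```python
import pandas as pd

bimodal = [3, 7, 7, 3, 5]  # hypothetical data: 3 and 7 each appear twice

modes = pd.Series(bimodal).mode()  # a Series containing all modes
all_modes = modes.tolist()
first_mode = modes[0]              # only the smallest mode

print(f"All modes: {all_modes}")    # All modes: [3, 7]
print(f"First mode: {first_mode}")  # First mode: 3
```

If ties matter for the analysis, it may be safer to keep the full result of `mode()` rather than just the first element.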


#### 4. Range

In [6]:
range_value = np.max(data) - np.min(data)
print(f"Range: {range_value}")

Range: 10


#### 5. Variance

In [7]:
variance_value = np.var(data, ddof=1)  # Using sample variance (ddof=1)
print(f"Variance: {variance_value}")

Variance: 11.333333333333334
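
The `ddof` argument controls the divisor: NumPy's default (`ddof=0`) divides by `n` and gives the population variance, while `ddof=1` divides by `n - 1` and gives the sample variance. The sketch below compares the two on the same dataset; the variable names are illustrative. pandas' `Series.var()` defaults to `ddof=1`, so the two libraries can disagree if the defaults are left in place.

```python
import numpy as np
import pandas as pd

data = [12, 15, 14, 10, 18, 20, 16, 15, 19, 11]

population_variance = np.var(data)        # ddof=0 (default): divide by n
sample_variance = np.var(data, ddof=1)    # divide by n - 1
pandas_variance = pd.Series(data).var()   # pandas defaults to ddof=1

print(f"Population variance: {population_variance:.4f}")  # 10.2000
print(f"Sample variance:     {sample_variance:.4f}")      # 11.3333
print(f"pandas default:      {pandas_variance:.4f}")      # 11.3333
```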


#### 6. Standard Deviation
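
The original cell for this step is empty. A minimal sketch that follows the pattern of the earlier steps might look like the following; it assumes the sample standard deviation (`ddof=1`) is wanted, to stay consistent with the variance step.

```python
import numpy as np

data = [12, 15, 14, 10, 18, 20, 16, 15, 19, 11]

# Sample standard deviation (ddof=1), the square root of the sample variance
std_value = np.std(data, ddof=1)
print(f"Standard Deviation: {std_value:.4f}")  # Standard Deviation: 3.3665
```

Because the standard deviation is expressed in the same units as the data, a value of about 3.37 here means a typical observation lies roughly 3.4 units from the mean of 15.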