# 📊 Descriptive Statistics Reference


## 🔢 Types of Data

- **Qualitative Data**: Non-numerical data describing categories or qualities, such as colors, names, or labels.
- **Quantitative Data**: Numerical data representing quantities, categorized as:
  - **Discrete**: Countable values (e.g., number of students).
  - **Continuous**: Measurable values within a range (e.g., height, weight).

---

## 📐 Measures of Central Tendency

### ✅ Mean (Arithmetic Average)
The average of all data points, calculated by dividing their sum by the number of data points.

**Formula:**
$$ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} $$

Where:
- $ \bar{x} $: Sample mean
- $ x_i $: Individual data point
- $ n $: Number of data points
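
As a minimal sketch of the formula above, the mean can be computed in plain Python. The function name `mean` and the `scores` list are hypothetical and used only for illustration.

```python
def mean(values):
    """Arithmetic mean: sum of the values divided by their count."""
    if not values:
        raise ValueError("mean requires at least one data point")
    return sum(values) / len(values)


scores = [4, 8, 15, 16, 23, 42]  # hypothetical sample, sum = 108
x_bar = mean(scores)             # 108 / 6 = 18.0
```

In practice, `statistics.mean` from the Python standard library does the same job.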

### ✅ Median
The middle value when data points are arranged in ascending order.

**Definition:**
- For odd $ n $: Median = Middle value
- For even $ n $: Median = $ \frac{\text{Sum of two middle values}}{2} $
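
The odd and even cases above can be sketched as follows. This is an illustrative implementation with a hypothetical function name, not the only way to compute a median.

```python
def median(values):
    """Middle value of the sorted data; mean of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median requires at least one data point")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd n: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: average the two middle values


odd_median = median([7, 1, 5])      # sorted: [1, 5, 7] -> 5
even_median = median([7, 1, 5, 3])  # sorted: [1, 3, 5, 7] -> (3 + 5) / 2 = 4.0
```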

### ✅ Mode
The value(s) that appear most frequently in the dataset.

**Definition:**
- Unimodal: One mode
- Bimodal/Multimodal: Multiple modes
- No mode: No repeated values
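
A small sketch of the three cases above, assuming that "no mode" means no value occurs more than once. The helper name `modes` is hypothetical.

```python
from collections import Counter


def modes(values):
    """Return every value tied for the highest frequency, or [] if nothing repeats."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        return []  # no repeated values, so no mode under the definition above
    return [value for value, count in counts.items() if count == top]


unimodal = modes([1, 2, 2, 3])    # [2]
bimodal = modes([1, 1, 2, 2, 3])  # [1, 2]
no_mode = modes([1, 2, 3])        # []
```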

---

## 📊 Measures of Dispersion

### ✅ Range
The difference between the maximum and minimum data values.

**Formula:**
$$ \text{Range} = \max(x_i) - \min(x_i) $$
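
A short illustration of the range formula; the name `value_range` and the `temps` data are hypothetical.

```python
def value_range(values):
    """Range: largest value minus smallest value."""
    if not values:
        raise ValueError("range requires at least one data point")
    return max(values) - min(values)


temps = [12.5, 9.0, 15.5, 11.0]  # hypothetical measurements
spread = value_range(temps)      # 15.5 - 9.0 = 6.5
```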

### ✅ Variance
The average of the squared differences from the mean, measuring data spread.

- **Population Variance:**
  $$ \sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} $$
- **Sample Variance:**
  $$ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} $$

Where:
- $ \sigma^2 $: Population variance
- $ s^2 $: Sample variance
- $ \mu $: Population mean
- $ \bar{x} $: Sample mean
- $ N $: Population size
- $ n $: Sample size
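
The two formulas differ only in the denominator, which the sketch below makes explicit. The function name, the `sample` flag, and the data are assumptions chosen for illustration.

```python
def variance(values, sample=True):
    """Variance with denominator n - 1 (sample) or N (population)."""
    n = len(values)
    if n < (2 if sample else 1):
        raise ValueError("not enough data points")
    m = sum(values) / n
    squared_diffs = sum((x - m) ** 2 for x in values)
    return squared_diffs / (n - 1 if sample else n)


data = [2, 4, 4, 4, 5, 5, 7, 9]         # hypothetical data, mean = 5
pop_var = variance(data, sample=False)  # 32 / 8 = 4.0
samp_var = variance(data, sample=True)  # 32 / 7, about 4.571
```

The sample variance is always the larger of the two for the same data, because it divides the same sum of squares by a smaller number.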

### ✅ Standard Deviation
The square root of the variance, indicating the spread of data points in the same units as the data.

- **Population Standard Deviation:**
  $$ \sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}} $$
- **Sample Standard Deviation:**
  $$ s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}} $$
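
For a quick check of both formulas, Python's standard `statistics` module provides `pstdev` (population) and `stdev` (sample). The data below is hypothetical.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data, mean = 5

pop_sd = statistics.pstdev(data)  # divides by N: sqrt(32 / 8) = 2.0
samp_sd = statistics.stdev(data)  # divides by n - 1: sqrt(32 / 7), about 2.138
```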

### ✅ Interquartile Range (IQR)
The range of the middle 50% of the data, calculated as the difference between the third and first quartiles.

**Formula:**
$$ \text{IQR} = Q3 - Q1 $$

Where:
- $ Q1 $: 25th percentile (first quartile)
- $ Q3 $: 75th percentile (third quartile)
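
Quartiles can be computed under several conventions, and the text above does not fix one. The sketch below assumes the `"inclusive"` method of `statistics.quantiles` (Python 3.8+); other methods or libraries may give slightly different values for the same data. The helper name `iqr` is hypothetical.

```python
import statistics


def iqr(values):
    """Interquartile range Q3 - Q1, using the stdlib 'inclusive' quartile rule."""
    q1, _q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


readings = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical data
spread = iqr(readings)                  # Q1 = 3.0, Q3 = 7.0, so IQR = 4.0
```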

---

## 🔁 Skewness
Measures the asymmetry of the data distribution.

**Formula:**
$$ \text{Skewness} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^3 / n}{s^3} $$

Where:
- $ s $: Sample standard deviation

- Positive skew: Tail extends to the right (higher values)
- Negative skew: Tail extends to the left (lower values)
- Zero skew: No net asymmetry, as in a symmetric distribution
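
The sketch below implements the skewness formula exactly as written above: the third central moment is averaged over $ n $, while $ s $ uses the $ n-1 $ denominator. Statistical libraries may normalize differently (for example, by using the population standard deviation), so their results can differ slightly. The function name and data are hypothetical.

```python
import math


def skewness(values):
    """Third central moment (divided by n) over the cubed sample standard deviation."""
    n = len(values)
    if n < 2:
        raise ValueError("skewness requires at least two data points")
    m = sum(values) / n
    s = math.sqrt(sum((x - m) ** 2 for x in values) / (n - 1))
    if s == 0:
        raise ValueError("skewness is undefined for constant data")
    third_moment = sum((x - m) ** 3 for x in values) / n
    return third_moment / s ** 3


symmetric = skewness([1, 2, 3, 4, 5])          # 0.0: deviations cancel out
right_tailed = skewness([1, 2, 3, 4, 20])      # positive: long right tail
left_tailed = skewness([-20, -4, -3, -2, -1])  # negative: long left tail
```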

---

## 🔺 Kurtosis
Measures the tailedness of the data distribution, that is, how much weight lies in the tails; it is often loosely described as peakedness.

**Formula:**
$$ \text{Kurtosis} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^4 / n}{s^4} - 3 $$

Where:
- $ s $: Sample standard deviation

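- Subtracting 3 gives excess kurtosis relative to a normal distribution, whose kurtosis is 3
- Positive excess kurtosis: Heavier tails than a normal distribution, often with a sharper peak (leptokurtic)
- Negative excess kurtosis: Lighter tails than a normal distribution, often with a flatter peak (platykurtic)
- Zero excess kurtosis: Tails comparable to a normal distribution (mesokurtic)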
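
As an illustration, the following sketch follows the kurtosis formula above term by term: the fourth central moment is averaged over $ n $, $ s^4 $ is the squared sample variance, and 3 is subtracted at the end. Library implementations may use other normalizations. The function name and both datasets are hypothetical.

```python
def excess_kurtosis(values):
    """Fourth central moment (divided by n) over s**4, minus 3."""
    n = len(values)
    if n < 2:
        raise ValueError("kurtosis requires at least two data points")
    m = sum(values) / n
    sample_var = sum((x - m) ** 2 for x in values) / (n - 1)  # s squared
    if sample_var == 0:
        raise ValueError("kurtosis is undefined for constant data")
    fourth_moment = sum((x - m) ** 4 for x in values) / n
    return fourth_moment / sample_var ** 2 - 3                # s**4 == (s**2)**2


flat = excess_kurtosis([1, 2, 3, 4, 5])                   # 6.8 / 6.25 - 3, about -1.912
heavy = excess_kurtosis([0, 0, 0, 0, 0, 0, 0, 0, 0, 10])  # 657 / 100 - 3, about 3.57
```

The evenly spread data gives a negative value, while the dataset with a single extreme value gives a positive one, matching the interpretation above.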

---

## 🛠️ Usage

- This document serves as a reference for statistical analysis in research, data science, or academic studies.
- Formulas can be implemented in programming languages like Python (using libraries like NumPy or SciPy), R, or statistical software.
- Ensure your Markdown viewer supports LaTeX rendering to view formulas correctly (e.g., GitHub, VS Code with Markdown+Math extension, or Jupyter Notebook).

---

## 📎 Notes

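- Use the population formulas when the dataset covers the entire population, and the sample formulas when it is a subset used to estimate population values (this matters for variance and standard deviation).
- The skewness and kurtosis formulas above average the third and fourth powers over $ n $ but standardize with the sample standard deviation $ s $; for population data, use $ \sigma $ and $ \mu $ instead.
- For practical implementation, use statistical libraries such as NumPy or SciPy in Python, or the built-in functions in R, and check each function's default: NumPy's `var` and `std` divide by $ N $ unless `ddof=1` is passed, whereas R's `var` and `sd` divide by $ n-1 $.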

---




