# Descriptive Statistics
## Descriptive statistics provide a summary of a dataset by describing its main features, such as central tendency, dispersion, and shape.
## These statistical measures are essential for understanding the underlying structure and patterns in data before conducting further analysis.

### Measures of Central Tendency:
#### Describe the center or typical value of the data.
#### Common metrics: Mean, Median, Mode.

In [1]:
import numpy as np

In [3]:
data = np.array([10, 20, 30, 40, 50])

# Mean
print("Mean:", np.mean(data))

Mean: 30.0


In [4]:
# Median
print("Median: ", np.median(data))

Median:  30.0
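The mode is listed above but NumPy has no dedicated mode function. The sketch below uses `statistics.multimode` from the Python standard library (Python 3.8+) instead. The `sample` array is a hypothetical dataset with one repeated value, chosen only so that a single mode exists.

```python
import statistics

import numpy as np

# Hypothetical sample (not the notebook's data): 20 appears twice
sample = np.array([10, 20, 20, 30, 40])

# multimode returns a list of every value that shares the highest count,
# so ties are reported rather than silently dropped
modes = statistics.multimode(sample.tolist())
print("Mode(s):", modes)
```

On the original `data` array every value occurs exactly once, so all five values would be returned as modes.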


### Measures of Dispersion:
#### Indicate the spread or variability in the data.
#### Common metrics: Variance, Standard Deviation, Range, Interquartile Range (IQR).

In [5]:
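# Standard Deviation (NumPy defaults to the population formula, ddof=0)
print("Standard Deviation:", np.std(data)) # Output: 14.14

# Variance (population variance, ddof=0)
print("Variance:", np.var(data)) # Output: 200.0

# Range (max - min; "ptp" stands for peak-to-peak)
print("Range:", np.ptp(data)) # Output: 40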

Standard Deviation: 14.142135623730951
Variance: 200.0
Range: 40
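The interquartile range (IQR) is listed among the dispersion metrics but not computed above. A minimal sketch, assuming the same five-value array and NumPy's default linear interpolation for percentiles:

```python
import numpy as np

data = np.array([10, 20, 30, 40, 50])

# Passing a list of percentiles returns both quartiles in one call
q1, q3 = np.percentile(data, [25, 75])

# IQR is the width of the middle 50% of the data
iqr = q3 - q1
print("IQR:", iqr)
```

Unlike the range, the IQR ignores the smallest and largest quarter of the values, so it is less sensitive to outliers.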


### Other Key Metrics:
#### Percentiles, quantiles, and frequencies.

In [7]:
# 25th and 75th Percentiles
print("25th Percentile:", np.percentile(data, 25)) # Output: 20.0
print("75th Percentile:", np.percentile(data, 75)) # Output: 40.0

25th Percentile: 20.0
75th Percentile: 40.0
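Frequencies are mentioned above without an example. One way to count them in NumPy is `np.unique` with `return_counts=True`. The `sample` array here is a hypothetical dataset with repeated values, since every value in `data` occurs only once.

```python
import numpy as np

# Hypothetical sample (not the notebook's data) with repeated values
sample = np.array([10, 20, 20, 30, 30, 30])

# values: sorted distinct entries; counts: how often each one occurs
values, counts = np.unique(sample, return_counts=True)

# Relative frequency = count divided by the total number of observations
relative = counts / counts.sum()

for v, c, r in zip(values, counts, relative):
    print(f"value={v} count={c} relative={r:.3f}")
```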


### Using Pandas for Descriptive Statistics

In [10]:
# Statistical Summary with describe()
import pandas as pd

data = pd.Series([10, 20, 30, 40, 50])

# Summary statistics (std is the sample standard deviation, ddof=1)
print(data.describe())

count     5.000000
mean     30.000000
std      15.811388
min      10.000000
25%      20.000000
50%      30.000000
75%      40.000000
max      50.000000
dtype: float64


In [11]:
# Central Tendency
# Mean
print("Mean:", data.mean()) # Output: 30.0

# Median
print("Median:", data.median()) # Output: 30.0

# Mode
print("Mode:", data.mode()) # Output: all five values, since each occurs once (mode() returns every tied value as a Series)

Mean: 30.0
Median: 30.0
Mode: 0    10
1    20
2    30
3    40
4    50
dtype: int64


In [12]:
# Dispersion
# Variance (pandas defaults to the sample formula, ddof=1)
print("Variance:", data.var()) # Output: 250.0

# Standard Deviation (sample, ddof=1)
print("Standard Deviation:", data.std()) # Output: 15.81

Variance: 250.0
Standard Deviation: 15.811388300841896
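The pandas results (250.0 and 15.81) differ from the NumPy results earlier (200.0 and 14.14) because the two libraries use different default divisors: NumPy divides by n (`ddof=0`), pandas by n - 1 (`ddof=1`). The sketch below uses the same five values, with illustrative variable names `arr` and `ser`, to show that setting `ddof` explicitly makes the two agree.

```python
import numpy as np
import pandas as pd

values = [10, 20, 30, 40, 50]
arr = np.array(values)
ser = pd.Series(values)

# Population variance: sum of squared deviations divided by n
population_var = np.var(arr)          # NumPy default, ddof=0
# Sample variance: sum of squared deviations divided by n - 1
sample_var = np.var(arr, ddof=1)

print("NumPy  ddof=0:", population_var)
print("NumPy  ddof=1:", sample_var)
print("pandas ddof=1:", ser.var())        # pandas default
print("pandas ddof=0:", ser.var(ddof=0))
```

Which divisor is appropriate depends on whether the data are treated as the whole population or as a sample drawn from it.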


In [13]:
# Percentiles via quantile(), which takes fractions in [0, 1] rather than 0-100
print("25th Percentile:", data.quantile(0.25)) # Output: 20.0
print("75th Percentile:", data.quantile(0.75)) # Output: 40.0

25th Percentile: 20.0
75th Percentile: 40.0


### Using SciPy for Advanced Descriptive Statistics

### Shape of Data Distribution:
#### Describe the shape or symmetry of the data.
#### Metrics include Skewness and Kurtosis.
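A minimal sketch of both shape metrics using `scipy.stats.skew` and `scipy.stats.kurtosis`, assuming SciPy is installed and reusing the five-value array from the earlier cells. Both functions use biased (population) estimators by default, and `kurtosis` reports excess kurtosis (Fisher's definition) unless `fisher=False` is passed.

```python
import numpy as np
from scipy import stats

data = np.array([10, 20, 30, 40, 50])

# Skewness: 0 for perfectly symmetric data, positive for a longer right tail
skewness = stats.skew(data)

# Excess kurtosis (Fisher): a normal distribution scores 0
excess_kurtosis = stats.kurtosis(data)

# Pearson kurtosis: a normal distribution scores 3
pearson_kurtosis = stats.kurtosis(data, fisher=False)

print("Skewness:", skewness)
print("Excess kurtosis:", excess_kurtosis)
print("Pearson kurtosis:", pearson_kurtosis)
```

Because the values are evenly spaced and symmetric around 30, the skewness is zero, and the negative excess kurtosis (about -1.3) reflects lighter tails than a normal distribution.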