# Data Normalization

Normalization is a common preprocessing step in data analysis and machine learning that rescales features to a comparable range. It matters when features are measured on different scales and you want each one to contribute equally to the analysis. Of the many normalization methods, two of the most widely used are Min-Max normalization and Z-score normalization.

Here is our original dataset:

|feature1|feature2|
|---|---|
|10|25|
|5|15|
|20|30|
|15|20|
|8|22|

For feature1:
- `min_value` = 5
- `max_value` = 20
- `mean` = (10 + 5 + 20 + 15 + 8) / 5 = 11.6
- `standard_deviation` = sqrt(((10-11.6)^2 + (5-11.6)^2 + (20-11.6)^2 + (15-11.6)^2 + (8-11.6)^2) / 5) = sqrt(141.2 / 5) = 5.31 (approx)

For feature2:
- `min_value` = 15
- `max_value` = 30
- `mean` = (25 + 15 + 30 + 20 + 22) / 5 = 22.4
- `standard_deviation` = sqrt(((25-22.4)^2 + (15-22.4)^2 + (30-22.4)^2 + (20-22.4)^2 + (22-22.4)^2) / 5) = sqrt(125.2 / 5) = 5.00 (approx)
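
The summary statistics listed above can be computed with a few lines of plain Python. This is a minimal sketch: the helper name `summarize` and the dictionary layout are illustrative choices, not part of any library, and the standard deviation is the population form (dividing by the number of values, matching the `/ 5` in the lists above).

```python
import math


def summarize(values):
    """Return min, max, mean, and population standard deviation of a list."""
    n = len(values)
    mean = sum(values) / n
    # Population variance: average squared deviation (divide by n, not n - 1).
    variance = sum((x - mean) ** 2 for x in values) / n
    return {
        "min_value": min(values),
        "max_value": max(values),
        "mean": mean,
        "standard_deviation": math.sqrt(variance),
    }


feature1 = [10, 5, 20, 15, 8]
feature2 = [25, 15, 30, 20, 22]

stats1 = summarize(feature1)
stats2 = summarize(feature2)
print(stats1)
print(stats2)
```

Dividing by `n - 1` instead would give the sample standard deviation, which is slightly larger for the same data.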

## Min-Max Normalization

Min-Max normalization scales the data to a fixed range, typically between 0 and 1.

The formula to normalize a single feature (column) is as follows:

```
normalized_value = (x - min_value) / (max_value - min_value)
```

where:

- `x` is the original value of the data point
- `min_value` is the minimum value of the feature in the dataset
- `max_value` is the maximum value of the feature in the dataset
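
The formula above can be sketched in plain Python as follows. The function name `min_max_normalize` is a hypothetical helper, and mapping a constant feature to 0.0 is an assumption made here to avoid dividing by zero; the original formula does not define that case.

```python
def min_max_normalize(values):
    """Scale values linearly so the minimum maps to 0 and the maximum to 1."""
    min_value = min(values)
    max_value = max(values)
    span = max_value - min_value
    if span == 0:
        # Assumption: a constant feature maps to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(x - min_value) / span for x in values]


feature1 = [10, 5, 20, 15, 8]
feature2 = [25, 15, 30, 20, 22]

scaled1 = min_max_normalize(feature1)
scaled2 = min_max_normalize(feature2)
print([round(v, 3) for v in scaled1])
print([round(v, 3) for v in scaled2])
```

Each feature is scaled independently, so the smallest value in every column becomes 0 and the largest becomes 1.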

These are the results for Min-Max Normalization:

|feature1|feature2|
|---|---|
|0.333|0.667|
|0.0|0.0|
|1.0|1.0|
|0.667|0.333|
|0.2|0.467|

## Z-score Normalization (Standardization)

Z-score normalization (also known as standardization) scales the data to have a mean of 0 and a standard deviation of 1.

The formula for Z-score normalization of a single feature is as follows:

```
normalized_value = (x - mean) / standard_deviation
```

where:

- `x` is the original value of the data point
- `mean` is the mean of the feature in the dataset
- `standard_deviation` is the standard deviation of the feature in the dataset (the population form, which divides by the number of values, is used here)

Z-score normalization centers the data around 0 and expresses each value as the number of standard deviations it lies from the mean.
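
As a minimal sketch of the formula above, the following plain-Python helper standardizes one feature and then checks that the result has mean 0 and standard deviation 1. The name `z_score_normalize` is hypothetical, the population standard deviation (dividing by the number of values) is assumed, and returning zeros for a constant feature is an assumption added to avoid dividing by zero.

```python
import math


def population_std(values):
    """Population standard deviation: divide the squared deviations by n."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)


def z_score_normalize(values):
    """Subtract the mean and divide by the population standard deviation."""
    mean = sum(values) / len(values)
    std = population_std(values)
    if std == 0:
        # Assumption: a constant feature maps to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(x - mean) / std for x in values]


feature1 = [10, 5, 20, 15, 8]
feature2 = [25, 15, 30, 20, 22]

z1 = z_score_normalize(feature1)
z2 = z_score_normalize(feature2)
print([round(v, 2) for v in z1])
print([round(v, 2) for v in z2])

# Sanity check: the standardized feature has mean 0 and standard deviation 1
# (up to floating-point rounding).
print(sum(z1) / len(z1), population_std(z1))
```

If the sample standard deviation (dividing by `n - 1`) were used instead, the z-scores would be slightly smaller in magnitude.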

These are the results for Z-score Normalization:

|feature1|feature2|
|---|---|
|-0.30|0.52|
|-1.24|-1.48|
|1.58|1.52|
|0.64|-0.48|
|-0.68|-0.08|