# Two ways to normalize a column in a dataset

## Using Min-Max Normalization

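Min-max normalization rescales the values linearly so that the column minimum maps to 0 and the column maximum maps to 1. The output range will be [0, 1].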

import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Example DataFrame
data = {'column_to_normalize': [10, 20, 30, 40, 33, 50]}
df = pd.DataFrame(data)

# Initialize the MinMaxScaler
scaler = MinMaxScaler()

# Fit and transform the data
df['column_to_normalize'] = scaler.fit_transform(df[['column_to_normalize']])

print(df)


   column_to_normalize
0                0.000
1                0.250
2                0.500
3                0.750
4                0.575
5                1.000
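
The scaler's result can be cross-checked by applying the min-max formula $x' = (x - x_{min}) / (x_{max} - x_{min})$ directly with pandas. This is a minimal sketch rather than part of the original example: the column name `manual_min_max` is introduced here only for illustration, and the code assumes the column is not constant (otherwise the denominator is zero).

```python
import pandas as pd

# Same example values as in the MinMaxScaler example
df = pd.DataFrame({'column_to_normalize': [10, 20, 30, 40, 33, 50]})

col = df['column_to_normalize']

# Min-max formula: (x - min) / (max - min)
# 'manual_min_max' is a hypothetical column name used for this check
df['manual_min_max'] = (col - col.min()) / (col.max() - col.min())

print(df)
```

For these values the manual column should match the scaler output, for example (33 - 10) / (50 - 10) = 0.575.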


## Using Standardization (Z-score Normalization)


Z-score normalization, also known as standard score normalization, scales data so that it has a mean of 0 and a standard deviation of 1. This makes different datasets or features comparable and is especially useful when features have different units or very different variances. Unlike min-max normalization, the output is not confined to a fixed range: most values typically fall within a few standard deviations of 0, but outliers can lie arbitrarily far from it.

**Formula**
$$
z = \frac{(x-\mu)}{\sigma}
$$

Where:

$x$ is the original data point.

$\mu$ is the mean of the dataset.

$\sigma$ is the standard deviation of the dataset.
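
As a quick worked instance of the formula, take the values 10, 20, 30, 40, 50 and assume the population standard deviation (dividing by $n$ rather than $n-1$):

$$
\mu = \frac{10+20+30+40+50}{5} = 30, \qquad
\sigma = \sqrt{\frac{(-20)^2+(-10)^2+0^2+10^2+20^2}{5}} = \sqrt{200} \approx 14.142
$$

$$
z_{x=10} = \frac{10-30}{14.142} \approx -1.414, \qquad
z_{x=50} = \frac{50-30}{14.142} \approx 1.414
$$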

import pandas as pd
from sklearn.preprocessing import StandardScaler

# Example DataFrame
data = {'column_to_normalize': [10, 20, 30, 40, 50]}
df = pd.DataFrame(data)

# Initialize the StandardScaler
scaler = StandardScaler()

# Fit and transform the data
df['column_to_normalize'] = scaler.fit_transform(df[['column_to_normalize']])

print(df)


   column_to_normalize
0            -1.414214
1            -0.707107
2             0.000000
3             0.707107
4             1.414214
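
The same z-scores can be reproduced by hand, which also exposes one detail worth knowing: `StandardScaler` divides by the population standard deviation (`ddof=0`), whereas pandas' `Series.std()` defaults to the sample standard deviation (`ddof=1`). The sketch below is an illustrative check, not part of the original example; the names `z_population` and `z_sample` are introduced here for clarity.

```python
import pandas as pd

# Same example values as in the StandardScaler example
df = pd.DataFrame({'column_to_normalize': [10, 20, 30, 40, 50]})
col = df['column_to_normalize']

# Population standard deviation (ddof=0), matching StandardScaler
z_population = (col - col.mean()) / col.std(ddof=0)

# pandas' default is the sample standard deviation (ddof=1),
# so these values come out slightly smaller in magnitude
z_sample = (col - col.mean()) / col.std()

print(pd.DataFrame({'z_population': z_population, 'z_sample': z_sample}))
```

With `ddof=0` the first value is about -1.414214, in line with the scaler output; with the pandas default it is about -1.264911. The gap shrinks as the number of rows grows.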
