# 1. Standard Scaling

`Standard scaling` (also called standardization or z-score scaling) rescales each feature so that it has a mean of 0 and a standard deviation of 1. This is done by subtracting the feature's mean from each value and then dividing by the feature's standard deviation. It is a common preprocessing step for machine learning algorithms that are sensitive to feature scale, such as those based on distances or gradient descent.

For a value x in a feature with mean μ and standard deviation σ, the scaled value z is:

z = (x - μ) / σ


In [1]:
# import libraries
import pandas as pd
from sklearn.preprocessing import StandardScaler, MinMaxScaler, MaxAbsScaler

In [2]:
# make an example dataset
data = {
    'age': [18, 45, 32, 56, 28],
    'salary': [35000, 75000, 52000, 85000, 48000],
    'experience': [1, 15, 7, 25, 5],
    'credit_score': [650, 800, 720, 780, 700]
}

# convert this data to a pandas DataFrame
df = pd.DataFrame(data)
df.head()

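|   | age | salary | experience | credit_score |
|---|-----|--------|------------|--------------|
| 0 | 18 | 35000 | 1 | 650 |
| 1 | 45 | 75000 | 15 | 800 |
| 2 | 32 | 52000 | 7 | 720 |
| 3 | 56 | 85000 | 25 | 780 |
| 4 | 28 | 48000 | 5 | 700 |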


In [4]:
# create the scaler
scaler = StandardScaler()

# fit the scaler on the data and transform it (returns a NumPy array)
scaled_df = scaler.fit_transform(df)
# convert this data into a pandas DataFrame
scaled_df = pd.DataFrame(scaled_df, columns=df.columns)
scaled_df.head()

|   | age | salary | experience | credit_score |
|---|-----|--------|------------|--------------|
| 0 | -1.338081 | -1.310087 | -1.126376 | -1.470429 |
| 1 | 0.691592 | 0.873392 | 0.516256 | 1.286626 |
| 2 | -0.285658 | -0.382109 | -0.422391 | -0.183804 |
| 3 | 1.518497 | 1.419261 | 1.689564 | 0.919018 |
| 4 | -0.58635 | -0.600457 | -0.657053 | -0.551411 |
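
As a sanity check, the `age` column of the output above can be reproduced by applying the formula by hand. The sketch below uses NumPy only and assumes the population standard deviation (`ddof=0`), which is what `StandardScaler` uses; the variable names are illustrative.

```python
import numpy as np

# The 'age' column from the example dataset
age = np.array([18, 45, 32, 56, 28], dtype=float)

mu = age.mean()          # column mean
sigma = age.std(ddof=0)  # population standard deviation (ddof=0)

# z = (x - mu) / sigma, applied element-wise
z = (age - mu) / sigma
print(np.round(z, 6))
```

Note that pandas' `Series.std()` defaults to the sample standard deviation (`ddof=1`), so a manual check done with pandas needs `ddof=0` to match.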


# 2. Min-Max Scaling


In [5]:
# create the scaler
scaler = MinMaxScaler()

# fit the scaler on the data and transform it
scaled_df = scaler.fit_transform(df)
# convert this data into a pandas DataFrame
scaled_df = pd.DataFrame(scaled_df, columns=df.columns)
scaled_df.head()

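|   | age | salary | experience | credit_score |
|---|-----|--------|------------|--------------|
| 0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1 | 0.710526 | 0.8 | 0.583333 | 1.0 |
| 2 | 0.368421 | 0.34 | 0.25 | 0.466667 |
| 3 | 1.0 | 1.0 | 1.0 | 0.866667 |
| 4 | 0.263158 | 0.26 | 0.166667 | 0.333333 |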
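
With its default settings, `MinMaxScaler` maps each column onto the range [0, 1] using (x - min) / (max - min). The following minimal sketch reproduces the `salary` column of the output above by hand; it uses NumPy only, and the variable names are illustrative.

```python
import numpy as np

# The 'salary' column from the example dataset
salary = np.array([35000, 75000, 52000, 85000, 48000], dtype=float)

x_min = salary.min()
x_max = salary.max()

# (x - min) / (max - min): the smallest value maps to 0, the largest to 1
scaled = (salary - x_min) / (x_max - x_min)
print(scaled)
```

Because the result depends only on the minimum and maximum, a single extreme value can compress all the other values into a narrow part of the range.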


# 3. Max Abs Scaling


In [6]:
# create the scaler
scaler = MaxAbsScaler()

# fit the scaler on the data and transform it
scaled_df = scaler.fit_transform(df)
# convert this data into a pandas DataFrame
scaled_df = pd.DataFrame(scaled_df, columns=df.columns)
scaled_df.head()

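|   | age | salary | experience | credit_score |
|---|-----|--------|------------|--------------|
| 0 | 0.321429 | 0.411765 | 0.04 | 0.8125 |
| 1 | 0.803571 | 0.882353 | 0.6 | 1.0 |
| 2 | 0.571429 | 0.611765 | 0.28 | 0.9 |
| 3 | 1.0 | 1.0 | 1.0 | 0.975 |
| 4 | 0.5 | 0.564706 | 0.2 | 0.875 |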
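
`MaxAbsScaler` divides each column by its largest absolute value, so the result lies in [-1, 1] and the data is not shifted (zeros stay zeros). The sketch below reproduces the `experience` column of the output above by hand with NumPy. The second array, `signed`, is a hypothetical example that is not part of the dataset; it is included only to show that signs are preserved.

```python
import numpy as np

# The 'experience' column from the example dataset
experience = np.array([1, 15, 7, 25, 5], dtype=float)

# x / max(|x|): the value with the largest magnitude maps to 1 (or -1)
scaled = experience / np.abs(experience).max()
print(scaled)

# Hypothetical column with a negative value (not from the dataset)
signed = np.array([-4.0, 2.0, 1.0])
signed_scaled = signed / np.abs(signed).max()
print(signed_scaled)
```

For all-positive data such as this dataset, the output lies in (0, 1], and unlike min-max scaling the smallest value is not mapped to 0.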


# 4. Robust Scaling


In [7]:
from sklearn.preprocessing import RobustScaler

# create the scaler
scaler = RobustScaler()

# fit the scaler on the data and transform it
scaled_df = scaler.fit_transform(df)
# convert this data into a pandas DataFrame
scaled_df = pd.DataFrame(scaled_df, columns=df.columns)
scaled_df.head()

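|   | age | salary | experience | credit_score |
|---|-----|--------|------------|--------------|
| 0 | -0.823529 | -0.62963 | -0.6 | -0.875 |
| 1 | 0.764706 | 0.851852 | 0.8 | 1.0 |
| 2 | 0.0 | 0.0 | 0.0 | 0.0 |
| 3 | 1.411765 | 1.222222 | 1.8 | 0.75 |
| 4 | -0.235294 | -0.148148 | -0.2 | -0.25 |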
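
With its default settings, `RobustScaler` subtracts the column median and divides by the interquartile range (the 75th percentile minus the 25th percentile), which makes it less sensitive to outliers than scaling by the mean and standard deviation. The sketch below reproduces the `credit_score` column of the output above by hand; it uses NumPy only, assumes NumPy's default linear interpolation for percentiles, and the variable names are illustrative.

```python
import numpy as np

# The 'credit_score' column from the example dataset
credit_score = np.array([650, 800, 720, 780, 700], dtype=float)

median = np.median(credit_score)
q25, q75 = np.percentile(credit_score, [25, 75])
iqr = q75 - q25  # interquartile range

# (x - median) / IQR: the median value maps to 0
scaled = (credit_score - median) / iqr
print(scaled)
```

This explains why row 2 is 0.0 in every column of the output above: in this dataset, row 2 happens to hold the median of each feature.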
