In [None]:
import pandas as pd

In [None]:
data = pd.read_csv("Social_Network_Ads.csv")

In [None]:
def scale_data(feature_scaler):
    # Select the numeric columns; text and datetime columns are left out.
    columns_to_scale = list(data.select_dtypes(exclude=["object", "datetime64"]).columns)
    # Fit the transformer and apply it exactly once.
    scaled_array = feature_scaler.fit_transform(data.loc[:, columns_to_scale])
    # Rebuild a DataFrame with the same column names and row index as the input.
    df_scaled = pd.DataFrame(scaled_array, columns=columns_to_scale, index=data.index)
    return df_scaled

## Min Max Scaler

Min-max scaling transforms every feature into the range [0, 1], so that the minimum and maximum value of each feature become 0 and 1, respectively.

$$X_{sc} = \frac{X - X_{min}}{X_{max} - X_{min}}$$
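
As a quick check of the formula, the sketch below applies it by hand to a small made-up array (the values are illustrative, not taken from the dataset) and compares the result with `MinMaxScaler`.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy data: two features on very different scales.
X = np.array([[18.0, 15000.0],
              [35.0, 60000.0],
              [60.0, 150000.0]])

# Apply the min-max formula column by column.
col_min = X.min(axis=0)
col_max = X.max(axis=0)
manual = (X - col_min) / (col_max - col_min)

# The same transformation through scikit-learn.
from_sklearn = MinMaxScaler().fit_transform(X)

print(manual)
print(np.allclose(manual, from_sklearn))
```

Both features end up between 0 and 1, whatever their original units were.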

In [None]:
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
df_scaled=scale_data(scaler)
df_scaled.head()

## Standard Scaler

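StandardScaler transforms each feature so that it has a mean of 0 and a standard deviation of 1. The shape of the distribution is not changed; it is only shifted and rescaled.

$$z = \frac{x - \mu}{\sigma}$$

where $\mu$ is the mean and $\sigma$ is the standard deviation of the feature.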
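
The sketch below reproduces the z-score by hand on a small made-up array and compares it with `StandardScaler`. One detail worth knowing: `StandardScaler` uses the population standard deviation (`ddof=0`, which is also NumPy's default), whereas `pandas.DataFrame.std` defaults to `ddof=1`.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data with two features.
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 60.0]])

# z = (x - mean) / std, computed per column with the population std (ddof=0).
manual = (X - X.mean(axis=0)) / X.std(axis=0)

from_sklearn = StandardScaler().fit_transform(X)

print(np.allclose(manual, from_sklearn))
print(from_sklearn.mean(axis=0), from_sklearn.std(axis=0))
```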

In [None]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
df_scaled=scale_data(scaler)
df_scaled.head()

## Normalizer

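Normalizer rescales each sample individually to unit norm.
Each sample (i.e. each row of the data matrix) with at least one non-zero component is rescaled independently of the other samples so that its norm (l1, l2 or max, selected with the `norm` parameter) equals one. Unlike the scalers above, it works row by row rather than column by column.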
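
To make the row-wise behaviour concrete, here is a minimal sketch on made-up rows. After the transformation every row has an l2 norm of 1, so rows that point in the same direction become identical.

```python
import numpy as np
from sklearn.preprocessing import Normalizer

# Hypothetical toy data: the first and third rows point in the same direction.
X = np.array([[3.0, 4.0],
              [1.0, 0.0],
              [6.0, 8.0]])

X_l2 = Normalizer(norm="l2").fit_transform(X)

# Each row is divided by its own Euclidean length.
row_norms = np.linalg.norm(X_l2, axis=1)

print(X_l2)
print(row_norms)
```

Because each row is scaled on its own, the relative size of different rows is lost, which is why this transformer is not a drop-in replacement for the column-wise scalers.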

In [None]:
from sklearn.preprocessing import Normalizer
# norm='l2' is the default
scaler = Normalizer(norm='l2')
df_scaled = scale_data(scaler)
df_scaled.head()

## MaxAbs Scaler

Maximum absolute scaling divides every observation by the maximum absolute value of its feature.

$$x_{scaled} = \frac{x}{\max(|x|)}$$

After this transformation every value lies in the range [-1, 1]. The data is not shifted or centred, so zeros stay zeros and sparsity is preserved.
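
The sketch below checks this on a small made-up array that contains a negative value, which is the case where dividing by the plain maximum and dividing by the maximum absolute value give different results.

```python
import numpy as np
from sklearn.preprocessing import MaxAbsScaler

# Hypothetical toy data; the first column has its largest magnitude at -4.
X = np.array([[-4.0, 2.0],
              [2.0, 5.0],
              [1.0, 10.0]])

# Divide each column by its maximum absolute value.
manual = X / np.abs(X).max(axis=0)

from_sklearn = MaxAbsScaler().fit_transform(X)

print(from_sklearn)
print(np.allclose(manual, from_sklearn))
```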

In [None]:
from sklearn.preprocessing import MaxAbsScaler
scaler = MaxAbsScaler()

df_scaled=scale_data(scaler)
df_scaled.head()

## Robust Scaler

RobustScaler transforms each feature by subtracting its median and then dividing by its interquartile range (the 75th percentile minus the 25th percentile).

Because the median and the interquartile range are barely affected by extreme values, the scaling is robust to outliers.

![image.png](attachment:image.png)
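
As an illustration, the sketch below scales a made-up column containing one extreme value, first by hand and then with `RobustScaler` using its default quantile range of 25 to 75.

```python
import numpy as np
from sklearn.preprocessing import RobustScaler

# Hypothetical toy data: a single feature with one clear outlier.
X = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])

# Median and interquartile range, computed per column.
median = np.median(X, axis=0)
q25, q75 = np.percentile(X, [25, 75], axis=0)
manual = (X - median) / (q75 - q25)

from_sklearn = RobustScaler().fit_transform(X)

print(from_sklearn.ravel())
print(np.allclose(manual, from_sklearn))
```

The four ordinary values stay within a narrow band around zero, while the outlier remains far away instead of compressing everything else.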


In [None]:
from sklearn.preprocessing import RobustScaler
scaler = RobustScaler()
df_scaled=scale_data(scaler)
df_scaled.head()

## Quantile Transformer

Quantile transformation is a non-parametric technique that maps the distribution of a numerical feature onto a chosen target distribution, using the feature's estimated quantiles. In Scikit-Learn, `QuantileTransformer` can map the data to a uniform distribution (the default) or to a normal distribution; the target is selected with the `output_distribution` parameter.
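
The sketch below shows both target distributions on synthetic, right-skewed data (generated here for illustration only). `n_quantiles` is set explicitly because it cannot exceed the number of samples; the default is 1000.

```python
import numpy as np
from sklearn.preprocessing import QuantileTransformer

# Synthetic right-skewed data, used only for illustration.
rng = np.random.default_rng(0)
X = rng.exponential(scale=2.0, size=(500, 1))

# Map to a uniform distribution on [0, 1] (the default target).
qt_uniform = QuantileTransformer(n_quantiles=100, output_distribution="uniform", random_state=0)
X_uniform = qt_uniform.fit_transform(X)

# Map to a standard normal distribution instead.
qt_normal = QuantileTransformer(n_quantiles=100, output_distribution="normal", random_state=0)
X_normal = qt_normal.fit_transform(X)

print(X_uniform.min(), X_uniform.max())
print(np.median(X_normal))
```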



In [None]:
from sklearn.preprocessing import QuantileTransformer
scaler = QuantileTransformer()
df_scaled=scale_data(scaler)
df_scaled.head()

## Power Transformer

While the Quantile Transformer is a non-parametric transformer based on the quantile function, the Power Transformer is a parametric transformer based on a power function. Like the Quantile Transformer, the Power Transformer is often used to make data follow a distribution closer to the normal distribution.

Scikit-Learn offers two methods within the `PowerTransformer` class: the Yeo-Johnson transform and the Box-Cox transform. The basic difference between them is the data they accept: Box-Cox requires strictly positive data, while Yeo-Johnson accepts both negative and positive values.
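
The difference in accepted input can be seen in the sketch below, which uses synthetic data generated only for illustration. Box-Cox works on the strictly positive sample and raises a `ValueError` once the sample is shifted to include negative values, whereas Yeo-Johnson handles the shifted sample.

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer

# Synthetic strictly positive, right-skewed data.
rng = np.random.default_rng(0)
X_pos = rng.lognormal(size=(200, 1))
# Shifted copy that contains negative values.
X_shifted = X_pos - 1.0

# Box-Cox is fine on strictly positive data.
X_bc = PowerTransformer(method="box-cox").fit_transform(X_pos)

# Yeo-Johnson also accepts zero and negative values.
X_yj = PowerTransformer(method="yeo-johnson").fit_transform(X_shifted)

# Box-Cox rejects data that is not strictly positive.
box_cox_error = None
try:
    PowerTransformer(method="box-cox").fit_transform(X_shifted)
except ValueError as err:
    box_cox_error = str(err)

print(box_cox_error)
```

With the default `standardize=True`, both outputs are also rescaled to zero mean and unit variance.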

In [None]:
from sklearn.preprocessing import PowerTransformer
# method can be 'yeo-johnson' (the default) or 'box-cox'
scaler = PowerTransformer(method='yeo-johnson')
df_scaled = scale_data(scaler)
df_scaled.head()

## Function Transformer

Scikit-Learn provides many transformation methods for the data preprocessing pipeline. Sometimes, however, we want to apply our own function that Scikit-Learn does not offer. For this case Scikit-Learn provides the `FunctionTransformer` class, which wraps an arbitrary function so that it can be used like any other transformer.
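
A minimal sketch of this idea follows, on made-up data. It uses `np.log1p` rather than `np.log2` so that zeros are handled, and it passes the optional `inverse_func` so the transformation can be undone; both choices are illustrative assumptions, not part of the notebook's own cell.

```python
import numpy as np
from sklearn.preprocessing import FunctionTransformer

# Hypothetical toy data that includes a zero.
X = np.array([[0.0, 9.0],
              [99.0, 999.0]])

# Wrap log(1 + x) and its inverse exp(x) - 1 as a transformer.
log_transformer = FunctionTransformer(func=np.log1p, inverse_func=np.expm1, validate=True)

X_log = log_transformer.fit_transform(X)
X_back = log_transformer.inverse_transform(X_log)

print(X_log)
print(np.allclose(X_back, X))
```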

In [None]:
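import numpy as np
from sklearn.preprocessing import FunctionTransformer
# np.log2 returns -inf for zeros and NaN for negative values, so it suits strictly positive columns
transformer = FunctionTransformer(np.log2, validate=True)
df_scaled = scale_data(transformer)
df_scaled.head()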

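## K-Bins Discretization

Discretization is the process of transforming a continuous feature into a categorical feature by partitioning its value range into several bins (intervals). The `strategy` parameter controls how the bin edges are chosen: equal-width bins (`'uniform'`), bins with roughly equal numbers of samples (`'quantile'`), or bins based on one-dimensional k-means clustering (`'kmeans'`).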
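
The effect of the binning strategy is easiest to see with an outlier. The sketch below, on made-up data, compares equal-width bins with quantile bins and counts how many samples land in each bin.

```python
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer

# Hypothetical toy data: nine small values and one outlier.
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0],
              [6.0], [7.0], [8.0], [9.0], [100.0]])

# Equal-width bins: the outlier stretches the range, so most samples share one bin.
uniform_bins = KBinsDiscretizer(n_bins=5, encode="ordinal", strategy="uniform").fit_transform(X)

# Quantile bins: edges are placed so that each bin holds about the same number of samples.
quantile_bins = KBinsDiscretizer(n_bins=5, encode="ordinal", strategy="quantile").fit_transform(X)

uniform_counts = np.bincount(uniform_bins.astype(int).ravel(), minlength=5)
quantile_counts = np.bincount(quantile_bins.astype(int).ravel(), minlength=5)

print(uniform_counts)
print(quantile_counts)
```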

In [None]:
from sklearn.preprocessing import KBinsDiscretizer

scaler = KBinsDiscretizer(n_bins = 5, encode = 'ordinal', strategy='quantile')
df_scaled=scale_data(scaler)
df_scaled.head()

## Feature Binarization

Feature binarization is a simple discretization process that uses a threshold to transform a continuous feature into a binary one. Values greater than the threshold become 1 and all other values become 0. Let's use the `Binarizer` class from Scikit-Learn to understand the concept.
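
The sketch below shows the thresholding rule on made-up values. Note that a value exactly equal to the threshold maps to 0, because the comparison is strictly greater than.

```python
import numpy as np
from sklearn.preprocessing import Binarizer

# Hypothetical toy data around the threshold of 20.
X = np.array([[19.0, 20.0, 21.0],
              [5.0, 40.0, 20.5]])

binary = Binarizer(threshold=20).fit_transform(X)

# Equivalent manual rule: 1 where the value is strictly above the threshold.
manual = (X > 20).astype(float)

print(binary)
print(np.array_equal(binary, manual))
```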

In [None]:
from sklearn.preprocessing import Binarizer

# Set the threshold to 20: values above 20 become 1, all others become 0
transformer = Binarizer(threshold=20)
df_scaled = scale_data(transformer)
df_scaled.head()