In [5]:
"""
Feature scaling is a technique for bringing the independent features in a dataset onto a comparable scale.
It is performed during data pre-processing to handle features whose values vary widely in magnitude.
Without feature scaling, many machine learning algorithms treat numerically larger values as more
important and smaller values as less important, regardless of the units involved.
For example, a model sees 10 m and 10 cm as the same number, 10, because it has no notion of units.
The sections below cover different techniques for performing feature scaling.
"""

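To make the unit problem concrete, here is a minimal sketch with made-up height and weight values (the points `q`, `a` and `b` and the helper `nearest_to_q` are hypothetical, not part of the notebook's data). It suggests that the nearest neighbour of the same query point can change when height is expressed in centimetres instead of metres.

```python
import math

# Hypothetical people described as (height, weight in kg).
# Same physical measurements; only the unit of height differs.
in_metres = {"q": (1.70, 70), "a": (1.90, 71), "b": (1.71, 75)}
in_centimetres = {"q": (170, 70), "a": (190, 71), "b": (171, 75)}

def nearest_to_q(points):
    """Return the name of the point closest to 'q' by Euclidean distance."""
    q = points["q"]
    others = {name: p for name, p in points.items() if name != "q"}
    return min(others, key=lambda name: math.dist(q, others[name]))

nearest_m = nearest_to_q(in_metres)        # weight differences dominate the distance
nearest_cm = nearest_to_q(in_centimetres)  # height differences dominate the distance
print(nearest_m, nearest_cm)
```

The data is identical in both cases; only the choice of unit decides which feature drives the distance.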

In [77]:
"""
1. Absolute Maximum Scaling
1. First, find the maximum absolute value among all the entries of a given column.
2. Then divide each entry of the column by this maximum value.
X_scaled = Xi / max(|X|)

After scaling, each entry of the column lies in the range -1 to 1.
This method is not used very often because it is highly sensitive to outliers:
a single extreme value compresses all the other entries towards 0.
"""
import pandas as pd
import numpy as np

df = pd.read_csv('downloads/SampleFile.csv')
print(df.head())

# Largest absolute value of each column
max_vals = df.abs().max()
print(max_vals)
# Divide each column by its largest absolute value
df_scaled = df / max_vals
print(df_scaled.head())
type(df_scaled)

   LotArea  MSSubClass
0     8450          60
1     9600          20
2    11250          60
3     9550          70
4    14260          60
LotArea       215245
MSSubClass       190
dtype: int64
    LotArea  MSSubClass
0  0.039258    0.315789
1  0.044600    0.105263
2  0.052266    0.315789
3  0.044368    0.368421
4  0.066250    0.315789


pandas.core.frame.DataFrame
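The outlier sensitivity of absolute maximum scaling can be sketched with the five LotArea values printed above. The helper name `max_abs_scale` is an assumption made for this illustration; the value 215245 is the column maximum reported in the output.

```python
import numpy as np

def max_abs_scale(x):
    """Divide every entry by the largest absolute value in the array."""
    x = np.asarray(x, dtype=float)
    return x / np.abs(x).max()

# First five LotArea values from the output above.
lot_area = np.array([8450, 9600, 11250, 9550, 14260])
scaled = max_abs_scale(lot_area)

# Same values plus the column maximum, which acts as an outlier here.
with_outlier = max_abs_scale(np.append(lot_area, 215245))

print(scaled.round(3))        # values spread across the range up to 1
print(with_outlier.round(3))  # the first five values are squeezed towards 0
```

With the extreme value included, the five ordinary entries all end up below 0.07, so most of the available range is spent on a single point.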

In [82]:
"""  
2. Min-Max Scaling
1. First, find the minimum and the maximum value of the column.
2. Then subtract the minimum value from each entry and
   divide the result by the difference between the maximum and the minimum value.
X_scaled = (Xi - Xmin) / (Xmax - Xmin)
This method is also sensitive to outliers, but after these two steps
every value lies in the range 0 to 1.
"""
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()
scaled_data = scaler.fit_transform(df)
df_scaled = pd.DataFrame(scaled_data,columns=df.columns)
df_scaled.head()

    LotArea  MSSubClass
0  0.033420    0.235294
1  0.038795    0.000000
2  0.046507    0.235294
3  0.038561    0.294118
4  0.060576    0.235294
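The two steps of min-max scaling can be written out by hand as a minimal sketch. The helper `min_max_scale` is a hypothetical name, and the input is only the first five MSSubClass values shown earlier, so the results differ from the full-column output above (the minimum and maximum of this small sample are not those of the whole column).

```python
import numpy as np

def min_max_scale(x):
    """Apply (x - min) / (max - min) to a 1-D array.

    Assumes the array is not constant; otherwise max - min would be 0.
    """
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

# First five MSSubClass values from the earlier output.
ms_sub_class = np.array([60, 20, 60, 70, 60])
scaled = min_max_scale(ms_sub_class)
print(scaled)
```

The smallest entry maps to 0 and the largest to 1, with everything else placed proportionally in between.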


In [84]:
"""
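3. Normalization
Mean normalization subtracts the column mean from each entry and
then divides the result by the difference between the maximum and the minimum value.
X_scaled = (Xi - Xmean) / (Xmax - Xmin)

Note: sklearn's Normalizer, used below, does not implement this formula.
It rescales each row (sample) to unit norm (L2 by default), which is why
every row in the output has a LotArea value close to 1.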
"""
from sklearn.preprocessing import Normalizer

scaler = Normalizer()
scaled_data = scaler.fit_transform(df)
scaled_df = pd.DataFrame(scaled_data, columns=df.columns)
print(scaled_df.head())

    LotArea  MSSubClass
0  0.999975    0.007100
1  0.999998    0.002083
2  0.999986    0.005333
3  0.999973    0.007330
4  0.999991    0.004208
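The Normalizer output above is consistent with a row-wise rescaling to unit L2 norm rather than with the mean-normalization formula. The sketch below, using only the five rows printed earlier, reproduces the row-wise result with NumPy and then applies the column-wise formula (x - mean) / (max - min) for comparison. Variable names are illustrative, and the column statistics come from these five rows only, not from the full file.

```python
import numpy as np

# First five rows (LotArea, MSSubClass) from the earlier output.
rows = np.array(
    [[8450, 60], [9600, 20], [11250, 60], [9550, 70], [14260, 60]],
    dtype=float,
)

# Row-wise: divide each ROW by its Euclidean length (unit L2 norm).
row_norms = np.linalg.norm(rows, axis=1, keepdims=True)
unit_rows = rows / row_norms

# Column-wise mean normalization: (x - mean) / (max - min) per COLUMN.
col_range = rows.max(axis=0) - rows.min(axis=0)
mean_normalized = (rows - rows.mean(axis=0)) / col_range

print(unit_rows.round(6))
print(mean_normalized.round(6))
```

The row-wise result reproduces the printed Normalizer values, while the column-wise version yields columns centred on 0 with a total spread of 1.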


In [88]:
"""
4. Standardization
This method of scaling is based on the mean (central tendency) and the standard deviation (spread) of the data.
1. Calculate the mean and standard deviation of the column to be scaled.
2. Then subtract the mean from each entry and
   divide the result by the standard deviation.
(This gives the data a mean of zero and a standard deviation of 1.
 It does not change the shape of the distribution, so skewed data stays skewed.)
X_scaled = (Xi - Xmean) / sd
"""
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
scaled_data = scaler.fit_transform(df)
scaled_df = pd.DataFrame(scaled_data,columns=df.columns)
print(scaled_df.head())

    LotArea  MSSubClass
0 -0.207142    0.073375
1 -0.091886   -0.872563
2  0.073480    0.073375
3 -0.096897    0.309859
4  0.375148    0.073375
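As a minimal sketch of the formula, standardization can be computed by hand with NumPy. The helper `standardize` and the small right-skewed sample are assumptions made for illustration. `np.std` uses the population standard deviation (ddof=0) by default, which is also what StandardScaler uses; pandas' `.std()` defaults to ddof=1 and would give slightly different numbers.

```python
import numpy as np

def standardize(x):
    """Apply (x - mean) / sd using the population standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

# Hypothetical right-skewed sample with one large value.
skewed = np.array([1, 1, 2, 2, 3, 4, 20], dtype=float)
z = standardize(skewed)

# Skewness computed as the mean of the cubed z-scores.
skewness = (z ** 3).mean()

print(z.round(3))
print(z.mean().round(6), z.std().round(6), skewness.round(3))
```

The scaled values have mean 0 and standard deviation 1, yet the skewness stays clearly positive: standardization shifts and rescales the data but does not make it normally distributed.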


In [90]:
"""
5. Robust Scaling
This method of scaling uses two statistical measures of the data:
the median and the inter-quartile range (IQR).
After calculating these two values, subtract the median from each entry
and then divide the result by the interquartile range.
Because the median and the IQR are barely affected by extreme values,
this method is much less sensitive to outliers than the previous ones.
X_scaled = (Xi - Xmedian) / IQR
"""
from sklearn.preprocessing import RobustScaler

scaler = RobustScaler()
scaled_data = scaler.fit_transform(df)
scaled_df = pd.DataFrame(scaled_data,columns=df.columns)
print(scaled_df.head())

    LotArea  MSSubClass
0 -0.254076         0.2
1  0.030015        -0.6
2  0.437624         0.2
3  0.017663         0.4
4  1.181201         0.2
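The following sketch applies the median and IQR formula by hand and compares it with min-max scaling on a small made-up array containing one outlier. The helper `robust_scale` and the sample values are hypothetical; the quartiles are taken as the 25th and 75th percentiles with NumPy's default linear interpolation.

```python
import numpy as np

def robust_scale(x):
    """Apply (x - median) / IQR, with IQR = 75th minus 25th percentile."""
    x = np.asarray(x, dtype=float)
    q1, median, q3 = np.percentile(x, [25, 50, 75])
    return (x - median) / (q3 - q1)

# Hypothetical sample: four ordinary values and one outlier.
values = np.array([1, 2, 3, 4, 100], dtype=float)

robust = robust_scale(values)
min_max = (values - values.min()) / (values.max() - values.min())

print(robust)
print(min_max.round(4))
```

With robust scaling the four ordinary values keep an even spacing around 0 and only the outlier lands far away, whereas min-max scaling pushes the ordinary values into a narrow band near 0.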


In [92]:
"""
Scaling, normalization and standardization are essential feature engineering techniques that ensure data is well-prepared for machine learning models.
They help improve model performance, enhance convergence and reduce biases.
The right method depends on your data and your algorithm.
"""
"""
Why use Feature Scaling?
In machine learning, feature scaling serves a number of purposes:

Range: 
    Scaling puts all features on a comparable scale with comparable ranges. 
    This process is also known as feature normalisation.
    This matters because the magnitude of the features affects many machine learning techniques. 
    Larger-scale features may dominate the learning process and have an excessive impact on the outcomes.
Algorithm performance improvement: 
    When the features are scaled, several machine learning methods, including
    gradient descent-based algorithms, distance-based algorithms (such as k-nearest neighbours) and
    support vector machines, perform better or converge more quickly. 
    Unscaled features can slow down or even prevent the convergence of the algorithm to the ideal outcome.
Preventing numerical instability: 
    Numerical instability can be prevented by avoiding large scale disparities between features.
    Examples include distance calculations, where features with very different scales can cause numerical overflow or underflow. 
    Scaling the features keeps these computations stable.
Equal importance: 
    Scaling gives each feature comparable weight during the learning process. 
    Without scaling, larger-scale features could dominate the learning and produce skewed outcomes. 
    Scaling removes this bias, so that no feature dominates the model's predictions purely because of its units.
"""

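The claim that gradient descent-based algorithms converge more quickly on scaled features can be illustrated with a small experiment. This is a sketch under stated assumptions, not a benchmark: the data is synthetic, the function `gd_iterations` is a hypothetical helper, and the step size is set to 1 divided by the largest eigenvalue of the Hessian, which is one common safe choice.

```python
import numpy as np

def gd_iterations(X, y, tol=1e-6, max_iter=20_000):
    """Gradient descent on 0.5 * mean squared error; returns iterations used."""
    n = len(y)
    hessian = X.T @ X / n
    lr = 1.0 / np.linalg.eigvalsh(hessian).max()  # step size 1/L
    w = np.zeros(X.shape[1])
    for i in range(max_iter):
        grad = X.T @ (X @ w - y) / n
        if np.linalg.norm(grad) < tol:
            return i
        w -= lr * grad
    return max_iter  # did not converge within the iteration budget

# Synthetic data: two features whose scales differ by a factor of 1000.
rng = np.random.default_rng(0)
n = 200
X_raw = np.column_stack([rng.uniform(0, 10, n), rng.uniform(0, 10_000, n)])
y = 3.0 * X_raw[:, 0] + 0.002 * X_raw[:, 1] + rng.normal(0, 0.1, n)
y = y - y.mean()  # centre the target so no intercept term is needed

# Standardize each column: zero mean, unit standard deviation.
X_std = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0)

iters_raw = gd_iterations(X_raw, y)
iters_std = gd_iterations(X_std, y)
print(iters_raw, iters_std)
```

On the raw features the loss surface is a long, narrow valley, so the step size that is safe for the large-scale feature is far too small for the other one. After standardization both directions have similar curvature and far fewer iterations are needed.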