# Why should we perform Feature Scaling?

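**Gradient descent based algorithms**<br>
*In gradient descent, the update for each weight is proportional to the value of the corresponding feature, so features on very different scales produce updates of very different sizes. When all features share a similar range, such as [0, 1] or [-1, 1], the algorithm usually reaches the minimum in fewer steps.*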
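
The effect can be illustrated with a minimal sketch. It assumes plain batch gradient descent on a least-squares problem with synthetic data (not the telecom dataset), and the helper names `gd_steps`, `with_intercept`, and `min_max` are hypothetical. Each run uses a step size derived from its own data, so the comparison is not biased by a badly chosen learning rate.

```python
import numpy as np

def gd_steps(X, y, tol=1e-8, max_steps=20_000):
    """Batch gradient descent on mean squared error.

    Returns the number of steps until the loss drops below `tol`,
    or `max_steps` if that never happens.
    """
    n = len(y)
    # Safe step size: 1 / (largest eigenvalue of the loss Hessian).
    hessian = (2.0 / n) * (X.T @ X)
    lr = 1.0 / np.linalg.eigvalsh(hessian).max()
    w = np.zeros(X.shape[1])
    for step in range(1, max_steps + 1):
        w -= lr * (2.0 / n) * (X.T @ (X @ w - y))
        if np.mean((X @ w - y) ** 2) < tol:
            return step
    return max_steps

rng = np.random.default_rng(0)
n = 200
small = rng.uniform(0, 1, n)         # feature already in [0, 1]
large = rng.uniform(0, 1000, n)      # feature on a much larger scale
y = 1.0 + 2.0 * small + 0.5 * large  # noiseless linear target (assumption)

def with_intercept(*cols):
    """Stack the given columns next to a column of ones."""
    return np.column_stack([np.ones(n), *cols])

def min_max(col):
    """Rescale one column to [0, 1]."""
    return (col - col.min()) / (col.max() - col.min())

X_raw = with_intercept(small, large)
X_scaled = with_intercept(min_max(small), min_max(large))

steps_raw = gd_steps(X_raw, y)
steps_scaled = gd_steps(X_scaled, y)
print("steps without scaling:", steps_raw)
print("steps with scaling:   ", steps_scaled)
```

In this setup the unscaled run is expected to hit the step limit, while the scaled run stops much earlier. The exact counts depend on the synthetic data and the tolerance.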

**Distance based algorithms**<br>
*The Euclidean distance between two data points is strongly affected by the magnitude of their features. When the features are on different scales, those with larger magnitudes dominate the distance and effectively receive a higher weight.*
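
A small sketch of this effect, using three hypothetical customers described by two features. The values and the feature ranges used for scaling (`lo`, `hi`) are assumptions chosen for illustration, not taken from the dataset.

```python
import numpy as np

def euclidean(p, q):
    """Euclidean distance between two points."""
    return np.sqrt(np.sum((p - q) ** 2))

# Hypothetical customers: [DataUsage, DayMins]
a = np.array([0.0, 200.0])
b = np.array([5.0, 205.0])  # very different data usage, similar minutes
c = np.array([0.1, 230.0])  # similar data usage, somewhat different minutes

# Assumed feature ranges used for min-max scaling
lo = np.array([0.0, 0.0])
hi = np.array([5.0, 350.0])

def scale(p):
    """Min-max scale a point with the assumed ranges."""
    return (p - lo) / (hi - lo)

raw_ab, raw_ac = euclidean(a, b), euclidean(a, c)
scaled_ab = euclidean(scale(a), scale(b))
scaled_ac = euclidean(scale(a), scale(c))

print("raw:    d(a, b) =", raw_ab, " d(a, c) =", raw_ac)
print("scaled: d(a, b) =", scaled_ab, " d(a, c) =", scaled_ac)
```

On the raw values, `b` looks closer to `a` because `DayMins` dominates the distance. After scaling, `c` is the closer point, since the large gap in `DataUsage` between `a` and `b` now counts.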

# Algorithms that need feature scaling
### Gradient descent based:
- Linear Regression
- Logistic Regression
- Neural Networks

### Distance based:
- KNN
- K-means clustering
- SVM

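# Algorithms that do not need feature scaling
- Decision Tree
- Bagging algorithms
- Boosting algorithms

*These are typically tree-based. A tree splits on one feature at a time by comparing it with a threshold, so rescaling a feature does not change the resulting splits.*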
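
Why threshold-based splits are unaffected by scaling can be sketched with a single decision stump. This is a simplified illustration, not how any particular library implements trees. The data and the helpers `gini` and `best_split_mask` are hypothetical.

```python
import numpy as np

def gini(labels):
    """Gini impurity of a 0/1 label array."""
    if len(labels) == 0:
        return 0.0
    p = labels.mean()
    return 2.0 * p * (1.0 - p)

def best_split_mask(x, y):
    """Return which samples go left under the best single-threshold split."""
    values = np.unique(x)
    thresholds = (values[:-1] + values[1:]) / 2.0  # midpoints between neighbours
    best_mask, best_score = None, np.inf
    for t in thresholds:
        left = x <= t
        # Weighted Gini impurity of the two child nodes
        score = (left.sum() * gini(y[left]) + (~left).sum() * gini(y[~left])) / len(y)
        if score < best_score:
            best_mask, best_score = left, score
    return best_mask

# Hypothetical feature values and class labels
x = np.array([50.0, 120.0, 180.0, 240.0, 300.0, 330.0])
y = np.array([0, 0, 0, 1, 1, 1])

x_scaled = (x - x.min()) / (x.max() - x.min())  # min-max scaling

mask_raw = best_split_mask(x, y)
mask_scaled = best_split_mask(x_scaled, y)
print(mask_raw)
print(mask_scaled)
```

The threshold value changes with the scale, but the samples end up on the same side of the split, so the tree structure and its predictions stay the same.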

In [None]:
import pandas as pd

In [None]:
# Load the telecom churn dataset (the path points to the author's mounted Google Drive)
data = pd.read_csv("/content/drive/MyDrive/Data Science /datasets/telecom_churn.csv")
data.head()

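|   | Churn | AccountWeeks | ContractRenewal | DataPlan | DataUsage | CustServCalls | DayMins | DayCalls | MonthlyCharge | OverageFee | RoamMins |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 128 | 1 | 1 | 2.7 | 1 | 265.1 | 110 | 89.0 | 9.87 | 10.0 |
| 1 | 0 | 107 | 1 | 1 | 3.7 | 1 | 161.6 | 123 | 82.0 | 9.78 | 13.7 |
| 2 | 0 | 137 | 1 | 0 | 0.0 | 0 | 243.4 | 114 | 52.0 | 6.06 | 12.2 |
| 3 | 0 | 84 | 0 | 0 | 0.0 | 2 | 299.4 | 71 | 57.0 | 3.1 | 6.6 |
| 4 | 0 | 75 | 0 | 0 | 0.0 | 3 | 166.7 | 113 | 41.0 | 7.42 | 10.1 |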


In [None]:
# The first column (Churn) is the target; the remaining columns are the features
features = data.iloc[:, 1:]
target = data.iloc[:, 0]

## Normalization: Min-Max Scaler

In [None]:
from sklearn.preprocessing import MinMaxScaler

# Rescale every feature to the range [0, 1] (the default feature_range)
scaling = MinMaxScaler()
scaled_features = scaling.fit_transform(features)
scaled_features

array([[0.52479339, 1.        , 1.        , ..., 0.77081192, 0.54260583,
        0.5       ],
       [0.43801653, 1.        , 1.        , ..., 0.69886948, 0.53765805,
        0.685     ],
       [0.56198347, 1.        , 0.        , ..., 0.39054471, 0.33315008,
        0.61      ],
       ...,
       [0.11157025, 1.        , 0.        , ..., 0.43165468, 0.79384277,
        0.705     ],
       [0.75619835, 0.        , 0.        , ..., 0.36998972, 0.43870258,
        0.25      ],
       [0.30165289, 1.        , 1.        , ..., 0.88386434, 0.73117097,
        0.685     ]])

In [None]:
# fit_transform returns a NumPy array; wrap it in a DataFrame to restore the column names
features_scaled_df = pd.DataFrame(scaled_features, columns=features.columns)
features_scaled_df.head()

|   | AccountWeeks | ContractRenewal | DataPlan | DataUsage | CustServCalls | DayMins | DayCalls | MonthlyCharge | OverageFee | RoamMins |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.524793 | 1.0 | 1.0 | 0.5 | 0.111111 | 0.755701 | 0.666667 | 0.770812 | 0.542606 | 0.5 |
| 1 | 0.438017 | 1.0 | 1.0 | 0.685185 | 0.111111 | 0.460661 | 0.745455 | 0.698869 | 0.537658 | 0.685 |
| 2 | 0.561983 | 1.0 | 0.0 | 0.0 | 0.0 | 0.693843 | 0.690909 | 0.390545 | 0.33315 | 0.61 |
| 3 | 0.342975 | 0.0 | 0.0 | 0.0 | 0.222222 | 0.853478 | 0.430303 | 0.441932 | 0.170423 | 0.33 |
| 4 | 0.305785 | 0.0 | 0.0 | 0.0 | 0.333333 | 0.4752 | 0.684848 | 0.277492 | 0.407916 | 0.505 |
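
With its default settings, `MinMaxScaler` maps each column through `(x - min) / (max - min)`. The sketch below applies that formula by hand with NumPy. It uses only the `AccountWeeks` and `DayMins` values of the five rows shown by `data.head()`, so the results differ from the table above, which uses the minimum and maximum of the full dataset. The helper `min_max_scale` is hypothetical and does not handle constant columns.

```python
import numpy as np

def min_max_scale(X):
    """Scale each column of X to [0, 1] using its own min and max."""
    X = np.asarray(X, dtype=float)
    col_min = X.min(axis=0)
    col_max = X.max(axis=0)
    return (X - col_min) / (col_max - col_min)

# AccountWeeks and DayMins for the five rows shown by data.head()
sample = np.array([
    [128, 265.1],
    [107, 161.6],
    [137, 243.4],
    [84, 299.4],
    [75, 166.7],
])

scaled_sample = min_max_scale(sample)
print(scaled_sample)
```

In each column, the smallest value becomes 0 and the largest becomes 1; everything else falls in between.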


## Standardization: Z-Score Normalization

In [None]:
from sklearn.preprocessing import StandardScaler

# Rescale every feature to zero mean and unit variance
scaler = StandardScaler()
scaled_features2 = scaler.fit_transform(features)
scaled_features2

array([[ 0.67648946,  0.32758048,  1.6170861 , ...,  1.99072703,
        -0.0715836 , -0.08500823],
       [ 0.14906505,  0.32758048,  1.6170861 , ...,  1.56451025,
        -0.10708191,  1.24048169],
       [ 0.9025285 ,  0.32758048, -0.61839626, ..., -0.26213309,
        -1.57434567,  0.70312091],
       ...,
       [-1.83505538,  0.32758048, -0.61839626, ..., -0.01858065,
         1.73094204,  1.3837779 ],
       [ 2.08295458, -3.05268496, -0.61839626, ..., -0.38390932,
        -0.81704825, -1.87621082],
       [-0.67974475,  0.32758048,  1.6170861 , ...,  2.66049626,
         1.28129669,  1.24048169]])

In [None]:
# Wrap the standardized array in a DataFrame to restore the column names
features_scaled_df2 = pd.DataFrame(scaled_features2, columns=features.columns)
features_scaled_df2.head()

|   | AccountWeeks | ContractRenewal | DataPlan | DataUsage | CustServCalls | DayMins | DayCalls | MonthlyCharge | OverageFee | RoamMins |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.676489 | 0.32758 | 1.617086 | 1.480204 | -0.427932 | 1.566767 | 0.476643 | 1.990727 | -0.071584 | -0.085008 |
| 1 | 0.149065 | 0.32758 | 1.617086 | 2.266072 | -0.427932 | -0.333738 | 1.124503 | 1.56451 | -0.107082 | 1.240482 |
| 2 | 0.902529 | 0.32758 | -0.618396 | -0.641642 | -1.188218 | 1.168304 | 0.675985 | -0.262133 | -1.574346 | 0.703121 |
| 3 | -0.42859 | -3.052685 | -0.618396 | -0.641642 | 0.332354 | 2.196596 | -1.466936 | 0.042307 | -2.741846 | -1.303026 |
| 4 | -0.654629 | -3.052685 | -0.618396 | -0.641642 | 1.092641 | -0.24009 | 0.626149 | -0.931902 | -1.037927 | -0.049184 |
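
`StandardScaler` maps each column through `(x - mean) / std`, where `std` is the population standard deviation (`ddof=0`). The sketch below applies that formula by hand with NumPy on the `AccountWeeks` and `DayMins` values of the five rows shown by `data.head()`. The results differ from the table above because the mean and standard deviation here come from five rows only. The helper `standardize` is hypothetical.

```python
import numpy as np

def standardize(X):
    """Give each column zero mean and unit variance (population std, ddof=0)."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0)

# AccountWeeks and DayMins for the five rows shown by data.head()
sample = np.array([
    [128, 265.1],
    [107, 161.6],
    [137, 243.4],
    [84, 299.4],
    [75, 166.7],
])

z_sample = standardize(sample)
print(z_sample)
print(z_sample.mean(axis=0))  # close to 0 for each column
print(z_sample.std(axis=0))   # close to 1 for each column
```

Note that `DataFrame.std()` in pandas defaults to `ddof=1`, so standardizing with it gives slightly different values than `StandardScaler`.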
