# Feature Scaling

In [1]:
# Import the scalers and supporting libraries

from sklearn.preprocessing import StandardScaler, MinMaxScaler, RobustScaler
import numpy as np
import pandas as pd

In [2]:
# Load sample data: view counts for six videos
views = pd.DataFrame([1295., 25., 19000., 5., 1., 300.], columns=['VIEW'])
views

      VIEW
0   1295.0
1     25.0
2  19000.0
3      5.0
4      1.0
5    300.0


# Standardized Scaling

![formulae standard scaler](datasets_n_images/images/formulae_standard_scalar.png 'formulae_standard_scalar')

In [3]:
ss = StandardScaler()
views['zscore'] = ss.fit_transform(views[['VIEW']])
print(views)

      VIEW    zscore
0   1295.0 -0.307214
1     25.0 -0.489306
2  19000.0  2.231317
3      5.0 -0.492173
4      1.0 -0.492747
5    300.0 -0.449877
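To see what `StandardScaler` computes, the z-score can be reproduced by hand. The sketch below assumes the usual definition $z = (x - \mu) / \sigma$, where $\mu$ is the column mean and $\sigma$ is the population standard deviation (`ddof=0`, the convention `StandardScaler` follows). The variable names are illustrative only.

```python
import numpy as np

# Same view counts as in the notebook above
x = np.array([1295., 25., 19000., 5., 1., 300.])

mu = x.mean()          # column mean
sigma = x.std(ddof=0)  # population standard deviation (divide by n, not n - 1)

# Center on the mean, then divide by the standard deviation
zscore_manual = (x - mu) / sigma
print(np.round(zscore_manual, 6))
```

The printed values should agree with the `zscore` column. Using the sample standard deviation (`ddof=1`) instead would give slightly smaller magnitudes, so the two conventions should not be mixed.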


# Min-Max Scaling

![formulae minMax scaler](datasets_n_images/images/min_max_scalar.png 'formulae_minMax_scalar')

In [4]:
# Apply min-max scaling

mms = MinMaxScaler()
views['minmax'] = mms.fit_transform(views[['VIEW']])
print(views)

# Note: in the minmax column, the most-viewed video (row index 2)
# maps to 1 and the least-viewed video (row index 4) maps to 0.

      VIEW    zscore    minmax
0   1295.0 -0.307214  0.068109
1     25.0 -0.489306  0.001263
2  19000.0  2.231317  1.000000
3      5.0 -0.492173  0.000211
4      1.0 -0.492747  0.000000
5    300.0 -0.449877  0.015738
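The `minmax` column can be checked by hand as well. This is a minimal sketch assuming the usual min-max definition $x' = (x - x_{\min}) / (x_{\max} - x_{\min})$ with the default target range of 0 to 1. The variable names are illustrative.

```python
import numpy as np

# Same view counts as in the notebook above
x = np.array([1295., 25., 19000., 5., 1., 300.])

x_min, x_max = x.min(), x.max()

# Shift so the minimum is 0, then divide by the range so the maximum is 1
minmax_manual = (x - x_min) / (x_max - x_min)
print(np.round(minmax_manual, 6))
```

Because the range is set entirely by the smallest and largest values, the single 19,000-view video pushes the other five values below 0.07.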


# Robust Scaling

The main disadvantage of min-max scaling is that outliers distort the scaled values: a single extreme value sets the range and compresses every other value toward 0, as the 19,000-view video does above. Robust scaling instead centers and scales each feature with statistics that outliers barely affect. Mathematically, this scaler can be represented as

![formulae Robust scaler](datasets_n_images/images/robust_scalar.png 'formulae_robust_scalar')

where each value of feature X is scaled by subtracting the median of X and dividing the result by the IQR (interquartile range) of X, which is the difference between the third quartile (75th percentile) and the first quartile (25th percentile).

In [5]:
# Apply robust scaling

rs = RobustScaler()
views['robust'] = rs.fit_transform(views[['VIEW']])
print(views)

      VIEW    zscore    minmax     robust
0   1295.0 -0.307214  0.068109   1.092883
1     25.0 -0.489306  0.001263  -0.132690
2  19000.0  2.231317  1.000000  18.178528
3      5.0 -0.492173  0.000211  -0.151990
4      1.0 -0.492747  0.000000  -0.155850
5    300.0 -0.449877  0.015738   0.132690
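The description above (subtract the median, divide by the IQR) can be sketched directly in NumPy. This assumes the 25th and 75th percentiles are computed with NumPy's default linear interpolation, which reproduces the values printed above. The variable names are illustrative.

```python
import numpy as np

# Same view counts as in the notebook above
x = np.array([1295., 25., 19000., 5., 1., 300.])

median = np.median(x)                # middle of the sorted values
q1, q3 = np.percentile(x, [25, 75])  # first and third quartiles
iqr = q3 - q1                        # interquartile range

# Center on the median, then divide by the IQR
robust_manual = (x - median) / iqr
print(median, iqr)
print(np.round(robust_manual, 6))
```

Unlike min-max scaling, the result is not confined to a fixed range: the outlier stays far out at roughly 18.18, while the remaining values keep a usable spread around 0.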


There are several other techniques for feature scaling and normalization, but these three are used extensively in building machine learning systems and should be sufficient to get you started.

Important Takeaways:
--
![Scaling](datasets_n_images/images/Scaling.png 'Scaling')