In [7]:
import pandas as pd

## Feature Scaling 
The two most common techniques are standardization and normalization.

In [10]:
data = pd.DataFrame([[-1, 2], [-0.5, 6], [0, 10], [1, 18]])

### Normalization
Rescale each column so that its values fall in the range 0 to 1.
<br />Also known as min-max scaling.
![norm](https://miro.medium.com/max/506/1*ii46F2WDo9mvFvdxzUvGbQ.png) 



In [14]:
from sklearn.preprocessing import MinMaxScaler

min_max_scaler = MinMaxScaler()
data_new = min_max_scaler.fit_transform(data)
print(data_new)

[[0.   0.  ]
 [0.25 0.25]
 [0.5  0.5 ]
 [1.   1.  ]]


In [12]:
def normalize(values):
    # Column-wise min-max scaling: (x - min) / (max - min)
    return (values - values.min()) / (values.max() - values.min())

In [13]:
normalize(data)

Unnamed: 0,0,1
0,0.0,0.0
1,0.25,0.25
2,0.5,0.5
3,1.0,1.0
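To make the min-max formula concrete, here is a minimal sketch that works through the second column by hand and checks it against `MinMaxScaler`. The names `col`, `by_hand`, and `scaled` are illustrative and not part of the notebook above.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

data = pd.DataFrame([[-1, 2], [-0.5, 6], [0, 10], [1, 18]])

# Column 1 by hand: min = 2, max = 18, so the range is 16.
col = data[1]
col_min, col_max = col.min(), col.max()
by_hand = (col - col_min) / (col_max - col_min)
print(by_hand.tolist())  # [0.0, 0.25, 0.5, 1.0]

# MinMaxScaler should give the same values for that column.
scaled = MinMaxScaler().fit_transform(data)
print(np.allclose(scaled[:, 1], by_hand))  # True
```

Both columns end up with identical scaled values because the second column is a linear function of the first (4x + 6).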


### Standardization
Rescale each column so that its values have zero mean and unit standard deviation.
<br />The values are not restricted to a particular range.
<img src="https://miro.medium.com/max/920/1*YSAAU_v--I8OlHQzG5A1Sg.png" width="200" height="700"/>


In [16]:
from sklearn.preprocessing import StandardScaler

standard_scaler = StandardScaler()
data_new2 = standard_scaler.fit_transform(data)
print(data_new2)

[[-1.18321596 -1.18321596]
 [-0.50709255 -0.50709255]
 [ 0.16903085  0.16903085]
 [ 1.52127766  1.52127766]]


In [17]:
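def standardize(values):
    # pandas .std() defaults to the sample standard deviation (ddof=1), while
    # StandardScaler divides by the population one (ddof=0), so results differ.
    return (values - values.mean()) / values.std()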

In [18]:
standardize(data)

Unnamed: 0,0,1
0,-1.024695,-1.024695
1,-0.439155,-0.439155
2,0.146385,0.146385
3,1.317465,1.317465
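The hand-written `standardize` output differs from the `StandardScaler` output because the two use different standard deviation conventions: pandas `.std()` defaults to `ddof=1`, while `StandardScaler` uses the population standard deviation (`ddof=0`). A minimal sketch of a variant that matches scikit-learn follows; `standardize_population` is a hypothetical helper name, not something defined in the notebook.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

data = pd.DataFrame([[-1, 2], [-0.5, 6], [0, 10], [1, 18]])

def standardize_population(values):
    # ddof=0 divides by n rather than n - 1 when computing the deviation.
    return (values - values.mean()) / values.std(ddof=0)

manual = standardize_population(data)
sk = StandardScaler().fit_transform(data)
print(np.allclose(manual.to_numpy(), sk))  # True

# The two conventions differ by a constant factor sqrt(n / (n - 1)); here n = 4.
sample = (data - data.mean()) / data.std()
print(np.allclose(manual.to_numpy(), sample.to_numpy() * np.sqrt(4 / 3)))  # True
```

The gap between the two conventions shrinks as the number of rows grows, so it is mostly visible on small examples like this one.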


### Normalization vs. Standardization
* Normalization:<br />    Useful when the data does not follow a Gaussian distribution, and in algorithms that do not assume any particular distribution of the data, such as K-Nearest Neighbors and neural networks.
* Standardization:<br />    Useful when the data follows a Gaussian distribution. Unlike normalization, standardization has no bounding range, so outliers are not compressed into a fixed interval. They do still influence the mean and standard deviation used for scaling.
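To see how a single outlier interacts with each transform, here is a small sketch on made-up values (the array below is an assumption chosen for illustration, not data from the notebook). It uses plain NumPy, whose `std` defaults to `ddof=0`.

```python
import numpy as np

# Four ordinary values and one outlier (illustrative data only).
values = np.array([1.0, 2.0, 3.0, 4.0, 100.0])

# Min-max scaling: the outlier defines the maximum, so it maps to exactly 1.
min_max = (values - values.min()) / (values.max() - values.min())

# Standardization: the result has no fixed upper or lower bound.
z_scores = (values - values.mean()) / values.std()

print(min_max.round(3))
print(z_scores.round(3))
```

In both outputs the four ordinary values end up close together, since the outlier shifts the minimum/maximum as well as the mean and standard deviation. The difference is that min-max scaling pins the outlier at 1, while the standardized value is not tied to a fixed interval.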

### Quiz
* for Gradient Descent Based Algorithms:

* for Distance Based Algorithms:

* for Tree Based Algorithms:

#### Reference:
1. https://www.analyticsvidhya.com/blog/2020/04/feature-scaling-machine-learning-normalization-standardization/
2. https://www.youtube.com/watch?v=mnKm3YP56PY