# Rescaling Features

In [44]:
%matplotlib inline

import numpy as np
import pandas as pd
import seaborn as sns

from sklearn.preprocessing import StandardScaler, MinMaxScaler

### Create DataFrame

In [38]:
# Create a DataFrame with four random integer features.
# No random seed is set, so the values change on every run.
df = pd.DataFrame(data={
    'Feature 1': np.random.randint(50, 100, size=10),
    'Feature 2': np.random.randint(0, 100, size=10),
    'Feature 3': np.random.randint(0, 100, size=10),
    'Feature 4': np.random.randint(0, 100, size=10),
})

df

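| | Feature 1 | Feature 2 | Feature 3 | Feature 4 |
|---|---|---|---|---|
| 0 | 50 | 79 | 63 | 82 |
| 1 | 57 | 18 | 41 | 1 |
| 2 | 82 | 68 | 55 | 75 |
| 3 | 93 | 25 | 18 | 52 |
| 4 | 57 | 41 | 57 | 62 |
| 5 | 54 | 34 | 87 | 92 |
| 6 | 93 | 91 | 11 | 18 |
| 7 | 96 | 94 | 48 | 80 |
| 8 | 61 | 80 | 49 | 25 |
| 9 | 78 | 56 | 9 | 7 |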


### Standardizing Features
StandardScaler transforms each feature (column) independently so that it has a mean of 0 and a standard deviation of 1. Every value is replaced by its z-score, z = (x - mean) / std, where the mean and standard deviation are computed per column.

This is useful when features are measured in different units or on very different scales. Standardizing removes the units and puts every feature on a common scale with zero mean and unit variance, so the columns can be compared directly.
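
To illustrate why standardizing "removes the units", here is a minimal sketch that standardizes the same measurements expressed in centimetres and in metres. The `standardize` helper and the height values are hypothetical and used for illustration only; the point is that both versions give the same z-scores.

```python
import numpy as np

def standardize(x):
    """Return z-scores: subtract the mean, divide by the population std."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()  # np.std uses ddof=0 by default

# Hypothetical measurements of the same quantity in two different units
heights_cm = np.array([150.0, 160.0, 170.0, 180.0, 190.0])
heights_m = heights_cm / 100.0

z_cm = standardize(heights_cm)
z_m = standardize(heights_m)

print(z_cm)
print(np.allclose(z_cm, z_m))  # True when the unit no longer matters
```

Any positive rescaling of a feature cancels out in the z-score, which is why standardized columns can be compared regardless of their original units.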

In [45]:
# Call and Transform the dataframe using standard scaler

scaler = StandardScaler()
std_df = pd.DataFrame(data=scaler.fit_transform(df), columns=df.columns)
std_df

| | Feature 1 | Feature 2 | Feature 3 | Feature 4 |
|---|---|---|---|---|
| 0 | -1.281746 | 0.773016 | 0.815759 | 1.013009 |
| 1 | -0.875763 | -1.538453 | -0.118965 | -1.503976 |
| 2 | 0.574176 | 0.356194 | 0.47586 | 0.795492 |
| 3 | 1.212149 | -1.273202 | -1.096177 | 0.080792 |
| 4 | -0.875763 | -0.666916 | 0.560835 | 0.391531 |
| 5 | -1.049756 | -0.932166 | 1.835458 | 1.323748 |
| 6 | 1.212149 | 1.227731 | -1.393589 | -0.97572 |
| 7 | 1.386142 | 1.34141 | 0.178447 | 0.950861 |
| 8 | -0.643773 | 0.810909 | 0.220935 | -0.758203 |
| 9 | 0.342186 | -0.098522 | -1.478564 | -1.317533 |
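
The standardized values can be reproduced by hand. The sketch below rebuilds two columns of the example DataFrame (values copied from the first output table) and compares a manual z-score with `StandardScaler`. One detail worth noting: `StandardScaler` divides by the population standard deviation, so the pandas call needs `ddof=0` (the pandas default is `ddof=1`).

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Two columns copied from the example DataFrame shown earlier
df = pd.DataFrame({
    'Feature 1': [50, 57, 82, 93, 57, 54, 93, 96, 61, 78],
    'Feature 2': [79, 18, 68, 25, 41, 34, 91, 94, 80, 56],
})

# Manual z-score per column; ddof=0 matches StandardScaler's population std
manual = (df - df.mean()) / df.std(ddof=0)

# Same transformation through scikit-learn
sk = pd.DataFrame(StandardScaler().fit_transform(df), columns=df.columns)

print(manual.head())
print(np.allclose(manual, sk))  # True when both approaches agree
```

After scaling, each column of `sk` has a mean of approximately 0 and a population standard deviation of approximately 1, up to floating-point error.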


### MinMax Scaling Features

MinMaxScaler transforms features by scaling each feature to a given range.

This estimator scales and translates each feature individually so that it lies in the given range on the training set, e.g. between zero and one. The transformation is:

`X_std = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))`

`X_scaled = X_std * (max - min) + min`

where min, max = feature_range.

This transformation is often used as an alternative to zero mean, unit variance scaling.
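
As a rough sketch of the arithmetic, min-max scaling first maps each column to [0, 1] using that column's minimum and maximum, then stretches and shifts the result into `feature_range`. The `min_max_scale` helper and the sample values below are hypothetical. The sketch assumes no column is constant (which would divide by zero here) and checks itself against `MinMaxScaler`.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

def min_max_scale(X, feature_range=(0, 1)):
    """Scale each column of X into feature_range (assumes no constant column)."""
    X = np.asarray(X, dtype=float)
    lo, hi = feature_range
    # Map every column to [0, 1] using its own min and max
    X_std = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
    # Stretch and shift into the requested range
    return X_std * (hi - lo) + lo

# Illustrative values only
X = np.array([[50, 79],
              [57, 18],
              [96, 94]])

manual = min_max_scale(X, feature_range=(-1, 1))
sk = MinMaxScaler(feature_range=(-1, 1)).fit_transform(X)

print(manual)
print(np.allclose(manual, sk))  # True when both approaches agree
```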


In [49]:
# Fit the min-max scaler and transform the DataFrame (default range is 0 to 1)
scaler = MinMaxScaler()
minmax_df = pd.DataFrame(data=scaler.fit_transform(df), columns=df.columns)
minmax_df

| | Feature 1 | Feature 2 | Feature 3 | Feature 4 |
|---|---|---|---|---|
| 0 | 0.0 | 0.802632 | 0.692308 | 0.89011 |
| 1 | 0.152174 | 0.0 | 0.410256 | 0.0 |
| 2 | 0.695652 | 0.657895 | 0.589744 | 0.813187 |
| 3 | 0.934783 | 0.092105 | 0.115385 | 0.56044 |
| 4 | 0.152174 | 0.302632 | 0.615385 | 0.67033 |
| 5 | 0.086957 | 0.210526 | 1.0 | 1.0 |
| 6 | 0.934783 | 0.960526 | 0.025641 | 0.186813 |
| 7 | 1.0 | 1.0 | 0.5 | 0.868132 |
| 8 | 0.23913 | 0.815789 | 0.512821 | 0.263736 |
| 9 | 0.608696 | 0.5 | 0.0 | 0.065934 |
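
The range guarantee holds only for the data the scaler was fitted on. The sketch below uses a hypothetical training array and a hypothetical set of new observations to show that values outside the training minimum and maximum land outside [0, 1] when transformed with the default settings.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical data: the scaler only ever sees the training rows
X_train = np.array([[50.0], [60.0], [100.0]])
X_new = np.array([[40.0], [75.0], [120.0]])

scaler = MinMaxScaler()
scaler.fit(X_train)  # learns min=50 and max=100 from the training data

train_scaled = scaler.transform(X_train)
new_scaled = scaler.transform(X_new)

print(train_scaled.ravel())  # stays within [0, 1]
print(new_scaled.ravel())    # 40 and 120 fall outside [0, 1]
```

Fitting on the training data alone and reusing the fitted scaler for later data keeps the transformation consistent. Out-of-range inputs simply map to values below 0 or above 1.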


Author: Kavi Sekhon