## Scaling to Minimum and Maximum values - MinMaxScaling

Min-max scaling squeezes the values of a variable into the range 0 to 1. It subtracts the minimum value from every observation and then divides the result by the value range (maximum minus minimum):

X_scaled = (X - X.min) / (X.max - X.min)
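
To make the arithmetic concrete, here is a minimal sketch that applies the formula by hand with NumPy. The array `x` is a hypothetical toy feature, not taken from the dataset used below.

```python
import numpy as np

# Hypothetical toy feature, chosen only to make the arithmetic easy to follow.
x = np.array([10.0, 20.0, 30.0, 50.0])

# Subtract the minimum, then divide by the value range (max - min).
x_scaled = (x - x.min()) / (x.max() - x.min())

# The scaled values are 0, 0.25, 0.5 and 1.
print(x_scaled)
```

The smallest observation maps to 0, the largest to 1, and everything else lands proportionally in between.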


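The result of the above transformation is a distribution whose values lie within the range 0 to 1. The mean, however, is not centered at zero, and the standard deviation still differs from one variable to another. Because the transformation is linear, the shape of a min-max scaled distribution follows that of the original variable, but the values and the variance change, so the two are not identical. This scaling technique is also sensitive to outliers, since a single extreme value sets the minimum or maximum and compresses the remaining observations into a narrow part of the range.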
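
The sensitivity to outliers can be illustrated with a short sketch. The helper `min_max` and both arrays are hypothetical, introduced only for this example.

```python
import numpy as np

def min_max(x):
    """Scale a 1-D array to the [0, 1] range."""
    return (x - x.min()) / (x.max() - x.min())

# Hypothetical values: the second array adds one extreme observation.
clean = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
with_outlier = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 100.0])

scaled_clean = min_max(clean)
scaled_outlier = min_max(with_outlier)

# Without the outlier, the five values are spread evenly over [0, 1].
print(scaled_clean)
# With the outlier, the same five values are squeezed below 0.05.
print(scaled_outlier[:5])
```

One extreme value is enough to push the rest of the observations into a small fraction of the available range.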

This technique does not **normalize the distribution of the data**; if that is the desired outcome, we should use one of the techniques discussed in section 7 of the course.
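
As a rough check of this point, the sketch below scales a synthetic right-skewed variable and compares its skewness before and after. The data are randomly generated for illustration and are an assumption, not part of the course material.

```python
import numpy as np
import pandas as pd

# Synthetic right-skewed data (an assumption for illustration only).
rng = np.random.default_rng(0)
s = pd.Series(rng.exponential(scale=2.0, size=1000))

# Min-max scale the series by hand.
s_scaled = (s - s.min()) / (s.max() - s.min())

# Skewness measures asymmetry; a linear rescaling leaves it unchanged.
skew_before = s.skew()
skew_after = s_scaled.skew()
print(skew_before, skew_after)
```

The two skewness values agree up to floating-point error, so a skewed variable stays skewed after min-max scaling.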

In a nutshell, MinMaxScaling:

- does not center the mean at 0
- leaves the variance different across variables
- keeps the overall shape of the original distribution but changes its scale, so the two are not identical
- sets the minimum and maximum values to 0 and 1
- is sensitive to outliers

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

In [2]:
data = pd.read_csv('../Data/mobile_dataset.csv')

In [3]:
# Keep a subset of numerical features plus the target (price_range)
data = data[['int_memory', 'mobile_wt', 'px_height', 'px_width', 'ram', 'battery_power', 'price_range']]
data

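| | int_memory | mobile_wt | px_height | px_width | ram | battery_power | price_range |
|---|---|---|---|---|---|---|---|
| 0 | 7 | 188 | 20 | 756 | 2549 | 842 | 1 |
| 1 | 53 | 136 | 905 | 1988 | 2631 | 1021 | 2 |
| 2 | 41 | 145 | 1263 | 1716 | 2603 | 563 | 2 |
| 3 | 10 | 131 | 1216 | 1786 | 2769 | 615 | 2 |
| 4 | 44 | 141 | 1208 | 1212 | 1411 | 1821 | 1 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 1995 | 2 | 106 | 1222 | 1890 | 668 | 794 | 0 |
| 1996 | 39 | 187 | 915 | 1965 | 2032 | 1965 | 2 |
| 1997 | 36 | 108 | 868 | 1632 | 3057 | 1911 | 3 |
| 1998 | 46 | 145 | 336 | 670 | 869 | 1512 | 0 |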


In [5]:
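# Separate the features from the target and split into train and test sets
# (train_test_split uses a default test_size of 0.25)
X_train, X_test, y_train, y_test = train_test_split(
    data.drop(['price_range'], axis=1),
    data['price_range'],
    random_state=24,
)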

In [6]:
scale = MinMaxScaler()

In [7]:
# Learn each feature's minimum and maximum from the training set only
scale.fit(X_train)

MinMaxScaler()
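
After fitting, the scaler stores the per-feature statistics it learned. The sketch below inspects them on a hypothetical stand-in for `X_train`. The column names are reused from the dataset, but the values are invented for illustration.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical stand-in for X_train, with two of the columns used above.
toy_train = pd.DataFrame({"ram": [500, 1500, 3500], "mobile_wt": [80, 140, 200]})

scaler = MinMaxScaler()
scaler.fit(toy_train)

print(scaler.data_min_)    # per-column minimum seen during fit
print(scaler.data_max_)    # per-column maximum seen during fit
print(scaler.data_range_)  # data_max_ - data_min_
```

These stored values are what `transform` later applies to any data passed to it.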

In [8]:
# Apply the minimum and maximum learned from the training set to both sets
train_scale = scale.transform(X_train)
test_scale = scale.transform(X_test)
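
Because the minimum and maximum come from the training set, test values outside the training range are mapped outside 0 to 1. The following sketch shows this with hypothetical toy frames. The values are invented for illustration.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical train and test frames; the test set contains values
# below and above the range seen during training.
toy_train = pd.DataFrame({"ram": [500.0, 1500.0, 3500.0]})
toy_test = pd.DataFrame({"ram": [250.0, 2000.0, 4000.0]})

# Fit on the training data only, then transform the test data.
scaler = MinMaxScaler().fit(toy_train)
toy_test_scaled = scaler.transform(toy_test)

# 250 maps below 0 and 4000 maps above 1; 2000 maps to 0.5.
print(toy_test_scaled.ravel())
```

This is expected behavior rather than a bug: fitting the scaler on the test set as well would leak information from it into the preprocessing step.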

In [9]:
# transform returns NumPy arrays; wrap them in DataFrames to restore the column names
train_scale = pd.DataFrame(train_scale, columns=X_train.columns)
test_scale = pd.DataFrame(test_scale, columns=X_test.columns)
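
Rebuilding the DataFrame from the array this way resets the row index to 0, 1, 2, and so on. If the original row labels are needed later, for example to align with `y_train`, one option is to pass the index explicitly. This is a sketch on a hypothetical toy frame, not a change to the workflow above.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical training frame with a shuffled index, as a split would produce.
toy_train = pd.DataFrame({"ram": [500.0, 1500.0, 3500.0]}, index=[42, 7, 19])

scaler = MinMaxScaler().fit(toy_train)
scaled = scaler.transform(toy_train)  # plain NumPy array, no index or column names

# Without index=..., pandas assigns a fresh 0..n-1 index.
reset_index = pd.DataFrame(scaled, columns=toy_train.columns)
# Passing the original index keeps the row labels aligned with the source data.
kept_index = pd.DataFrame(scaled, columns=toy_train.columns, index=toy_train.index)

print(list(reset_index.index))
print(list(kept_index.index))
```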

In [10]:
train_scale

| | int_memory | mobile_wt | px_height | px_width | ram | battery_power |
|---|---|---|---|---|---|---|
| 0 | 0.645161 | 0.341667 | 0.400510 | 0.412550 | 0.431053 | 0.056780 |
| 1 | 0.225806 | 0.741667 | 0.438265 | 0.244993 | 0.531801 | 0.625919 |
| 2 | 0.064516 | 0.508333 | 0.076020 | 0.348465 | 0.551844 | 0.859719 |
| 3 | 0.451613 | 0.358333 | 0.314796 | 0.591455 | 0.050508 | 0.867067 |
| 4 | 0.096774 | 0.091667 | 0.417857 | 0.999332 | 0.730893 | 0.098196 |
| ... | ... | ... | ... | ... | ... | ... |
| 1495 | 0.274194 | 0.066667 | 0.423980 | 0.809746 | 0.246660 | 0.734135 |
| 1496 | 0.564516 | 0.333333 | 0.360714 | 0.466622 | 0.900321 | 0.356045 |
| 1497 | 1.000000 | 0.583333 | 0.722959 | 0.643525 | 0.893640 | 0.660655 |
| 1498 | 0.161290 | 0.916667 | 0.396429 | 0.413218 | 0.814003 | 0.408150 |
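
A quick way to confirm the properties listed earlier is to summarize each scaled column. The sketch below does so on a synthetic stand-in for the training set. The column names are reused, but the values are randomly generated and are an assumption.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the training set (column names reused, values invented).
rng = np.random.default_rng(24)
toy = pd.DataFrame({
    "ram": rng.uniform(256, 4000, size=500),
    "mobile_wt": rng.normal(140, 10, size=500),
})

scaled = pd.DataFrame(MinMaxScaler().fit_transform(toy), columns=toy.columns)

# Every column has min 0 and max 1, but the means are not 0
# and the standard deviations differ between columns.
summary = scaled.agg(["min", "max", "mean", "std"])
print(summary)
```

The same `agg` call can be run on `train_scale` to check the real data.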
