# Min-Max Normalization


Another way to scale your data is min-max normalization. The name says it all: we find the minimum and maximum values in the data set and map them to 0 and 1, respectively. Every other data point is transformed to a number between 0 and 1, according to where it falls between the minimum and the maximum. To compute the transformed value, we subtract the minimum from the data point and then divide by the range, that is, the maximum minus the minimum.

Mathematically a min-max normalization looks like this:

$$X_{norm} = \frac{X - X_{min}}{X_{max} - X_{min}}$$
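
To make the formula concrete, here is a minimal sketch in plain Python. The list `values` and its numbers are made up for illustration and are not part of the income data set.

```python
# Hypothetical sample values, chosen so the arithmetic is easy to follow.
values = [10, 20, 30, 50]

x_min = min(values)          # 10
x_max = max(values)          # 50
value_range = x_max - x_min  # 40

# Apply X_norm = (X - X_min) / (X_max - X_min) to every value.
normalized = [(x - x_min) / value_range for x in values]

print(normalized)  # [0.0, 0.25, 0.5, 1.0]
```

The smallest value maps to 0, the largest to 1, and everything else lands proportionally in between.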

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [6]:
df = pd.read_csv('income.csv')
df.columns

Index(['Name', 'Age', 'Income($)'], dtype='object')

In [7]:
income = df['Income($)']
print(income)

0      70000
1      90000
2      61000
3      60000
4     150000
5     155000
6     160000
7     162000
8     156000
9     130000
10    137000
11     45000
12     48000
13     51000
14     49500
15     53000
16     65000
17     63000
18     64000
19     80000
20     82000
21     58000
Name: Income($), dtype: int64


In [8]:
min_income = np.min(income)
print(min_income)

45000


In [9]:
max_income = np.max(income)
print(max_income)


162000


In [10]:
income_range = max_income-min_income
print(income_range)

117000


In [11]:
# Apply the min-max formula to the whole Series at once (element-wise).
Norm_income = (income - min_income) / income_range
print(Norm_income)

0     0.213675
1     0.384615
2     0.136752
3     0.128205
4     0.897436
5     0.940171
6     0.982906
7     1.000000
8     0.948718
9     0.726496
10    0.786325
11    0.000000
12    0.025641
13    0.051282
14    0.038462
15    0.068376
16    0.170940
17    0.153846
18    0.162393
19    0.299145
20    0.316239
21    0.111111
Name: Income($), dtype: float64
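
The manual steps above can be wrapped in a small reusable function. The sketch below is only one way to do it: the function name `min_max_normalize` is hypothetical, and returning zeros for a constant column is an assumed convention, chosen because the formula would otherwise divide by zero when the maximum equals the minimum.

```python
import pandas as pd


def min_max_normalize(series: pd.Series) -> pd.Series:
    """Return a copy of `series` rescaled to the range [0, 1]."""
    value_range = series.max() - series.min()
    if value_range == 0:
        # All values are identical, so the formula would divide by zero.
        # Returning zeros is an assumption made for this sketch.
        return pd.Series(0.0, index=series.index, name=series.name)
    return (series - series.min()) / value_range


# A few incomes from the data set above, including its minimum and maximum.
incomes = pd.Series([70000, 90000, 45000, 162000], name="Income($)")
normalized = min_max_normalize(incomes)
print(normalized)
```

Because the sample includes the data set's minimum (45000) and maximum (162000), the first two values come out the same as in the full calculation above (about 0.213675 and 0.384615).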


In [15]:
print(np.min(Norm_income))

0.0


In [16]:
print(np.max(Norm_income))

1.0


# Min-Max Normalization with Sklearn

In [14]:
from sklearn.preprocessing import MinMaxScaler
SCALER = MinMaxScaler()


In [17]:
income

0      70000
1      90000
2      61000
3      60000
4     150000
5     155000
6     160000
7     162000
8     156000
9     130000
10    137000
11     45000
12     48000
13     51000
14     49500
15     53000
16     65000
17     63000
18     64000
19     80000
20     82000
21     58000
Name: Income($), dtype: int64

In [18]:
print(np.max(income))

162000


In [19]:
print(np.min(income))

45000


In [20]:
# Convert the Series to a single-column 2D array, then fit and scale in one step.
reshaped_income = np.array(income).reshape(-1, 1)
scaled_reshaped = SCALER.fit_transform(reshaped_income)
print(scaled_reshaped)

[[0.21367521]
 [0.38461538]
 [0.13675214]
 [0.12820513]
 [0.8974359 ]
 [0.94017094]
 [0.98290598]
 [1.        ]
 [0.94871795]
 [0.72649573]
 [0.78632479]
 [0.        ]
 [0.02564103]
 [0.05128205]
 [0.03846154]
 [0.06837607]
 [0.17094017]
 [0.15384615]
 [0.16239316]
 [0.2991453 ]
 [0.31623932]
 [0.11111111]]
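
The `reshape(-1, 1)` call is needed because `MinMaxScaler` expects a 2D array shaped `(n_samples, n_features)`, while a single column pulled out of a DataFrame is 1D. The sketch below uses only NumPy and the first three incomes from the data set to show what the reshape does; the variable names are illustrative.

```python
import numpy as np

# First three incomes from the series above, as a 1D array.
income_1d = np.array([70000, 90000, 61000])
print(income_1d.shape)  # (3,)

# -1 asks NumPy to infer the number of rows; 1 fixes a single column.
income_2d = income_1d.reshape(-1, 1)
print(income_2d.shape)  # (3, 1)

# ravel() flattens a single-column result back to 1D when needed.
flat_again = income_2d.ravel()
print(flat_again.shape)  # (3,)
```

This also explains why the scaled output is printed as a column of one-element rows rather than as a flat list.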


In [21]:
print(np.min(scaled_reshaped))

0.0


In [22]:
print(np.max(scaled_reshaped))

1.0000000000000002
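
The maximum here is 1.0000000000000002 rather than exactly 1.0. This is most likely ordinary floating-point rounding inside the scaler's arithmetic, not a sign of a wrong result. A minimal sketch, using only NumPy, of how to check that such a value is equal to 1.0 within tolerance:

```python
import numpy as np

# The maximum reported by the scaler above.
scaled_max = 1.0000000000000002

# An exact comparison fails because the value is one rounding step above 1.0.
print(scaled_max == 1.0)  # False

# The gap is exactly one machine epsilon for 64-bit floats.
print(scaled_max - 1.0 == np.finfo(float).eps)  # True

# Comparing with a tolerance is the safer check for floating-point results.
print(np.isclose(scaled_max, 1.0))  # True
```

For the same reason, `np.allclose` is a better fit than `==` when comparing the manual result with the scaler's output.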


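Cool! The manual calculation and `MinMaxScaler` give the same normalized values, apart from a tiny floating-point difference in the maximum.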