# **Normalization**

Normalization is the process of scaling individual samples to have unit norm. This process can be useful if you plan to use a quadratic form such as the dot-product or any other kernel to quantify the similarity of any pair of samples.

This assumption is the basis of the Vector Space Model, which is often used in text classification and clustering.
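
To illustrate why unit norm matters for dot-product similarity, here is a minimal sketch using NumPy only. The helper `l2_normalize` and the vectors `a` and `b` are hypothetical names chosen for this example; they are not part of scikit-learn. The sketch checks that the dot product of two L2-normalized vectors equals their cosine similarity.

```python
import numpy as np

def l2_normalize(v):
    """Scale a 1-D vector to unit L2 norm (assumes v is not all zeros)."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)

# Two example samples (illustrative values)
a = np.array([1., -1., 2.])
b = np.array([2., 0., 0.])

# Cosine similarity computed directly from the raw vectors
cosine = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Dot product of the unit-norm versions of the same vectors
dot_of_units = l2_normalize(a) @ l2_normalize(b)

print(np.isclose(cosine, dot_of_units))
```

Once samples have unit norm, a plain dot product compares their direction only, which is the behaviour that similarity measures in the Vector Space Model typically rely on.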

The function `normalize` provides a quick and easy way to perform this operation on a single array-like dataset, using the L1, L2, or max norm.

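Because each sample (row) is rescaled independently, normalization removes differences in overall magnitude between samples, so similarity measures such as the dot product compare direction rather than scale. This can help models that rely on such measures, although it is not guaranteed to improve accuracy.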


In [None]:
import numpy as np
from sklearn import preprocessing

In [None]:
data = [[ 1., -1.,  2.],  [ 2.,  0.,  0.],  [ 0.,  1., -1.]]
data

[[1.0, -1.0, 2.0], [2.0, 0.0, 0.0], [0.0, 1.0, -1.0]]

**1) L1 Normalization**

The L1 norm is calculated as the sum of the absolute values of the vector's components. Each sample (row) is divided by its L1 norm, so the absolute values in each row sum to 1.


In [None]:
# L1: scale each row so its absolute values sum to 1
normalized = preprocessing.normalize(data, norm='l1')
normalized

array([[ 0.25, -0.25,  0.5 ],
       [ 1.  ,  0.  ,  0.  ],
       [ 0.  ,  0.5 , -0.5 ]])
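
As a cross-check, the L1 result can be reproduced by hand with NumPy. This is only a sketch: the names `l1_norms` and `l1_manual` are made up for the example, and it does not guard against all-zero rows, which would cause a division by zero here.

```python
import numpy as np

data = np.array([[1., -1., 2.], [2., 0., 0.], [0., 1., -1.]])

# L1 norm of each row: sum of absolute values, kept as a column for broadcasting
l1_norms = np.abs(data).sum(axis=1, keepdims=True)

# Divide every row by its own norm
l1_manual = data / l1_norms

# After scaling, the absolute values in every row should sum to 1
row_sums = np.abs(l1_manual).sum(axis=1)

print(l1_manual)
print(row_sums)
```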

**2) L2 Normalization**

The L2 norm is calculated as the square root of the sum of the squared components of the vector. Each sample (row) is divided by its L2 norm, so every row has Euclidean length 1.

In [None]:
# L2: scale each row to Euclidean length 1
normalized = preprocessing.normalize(data, norm='l2')
normalized

array([[ 0.40824829, -0.40824829,  0.81649658],
       [ 1.        ,  0.        ,  0.        ],
       [ 0.        ,  0.70710678, -0.70710678]])
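
The same kind of manual check works for the L2 norm. The sketch below uses `np.linalg.norm` with `axis=1` to get one norm per row; `l2_norms` and `l2_manual` are illustrative names, and all-zero rows are again not handled.

```python
import numpy as np

data = np.array([[1., -1., 2.], [2., 0., 0.], [0., 1., -1.]])

# L2 (Euclidean) norm of each row, kept as a column for broadcasting
l2_norms = np.linalg.norm(data, axis=1, keepdims=True)

# Divide every row by its own norm
l2_manual = data / l2_norms

# After scaling, every row should have Euclidean length 1
row_lengths = np.linalg.norm(l2_manual, axis=1)

print(l2_manual)
print(row_lengths)
```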

**3) Max Normalization**

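The max norm is calculated as the largest absolute value among the vector's components. Each sample (row) is divided by its max norm, so the largest absolute value in each row becomes 1.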

In [None]:
# max: scale each row so its largest absolute value is 1
normalized = preprocessing.normalize(data, norm='max')
normalized

array([[ 0.5, -0.5,  1. ],
       [ 1. ,  0. ,  0. ],
       [ 0. ,  1. , -1. ]])
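
The max-norm result can also be reproduced by hand. This sketch assumes the norm is the largest absolute value in each row, which matches the output above for this data. `max_norms` and `max_manual` are illustrative names, and all-zero rows are not handled.

```python
import numpy as np

data = np.array([[1., -1., 2.], [2., 0., 0.], [0., 1., -1.]])

# Max norm of each row: largest absolute value, kept as a column for broadcasting
max_norms = np.abs(data).max(axis=1, keepdims=True)

# Divide every row by its own norm
max_manual = data / max_norms

# After scaling, the largest absolute value in every row should be 1
row_peaks = np.abs(max_manual).max(axis=1)

print(max_manual)
print(row_peaks)
```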