# Normalization

The most common forms of normalization rescale the values in each row so that they sum to 1 under some measure, such as the sum of absolute values or the sum of squares.

__L1 Normalization__, which refers to __Least Absolute Deviations__, works by making sure that the sum of absolute values is 1 in each row.

__L2 Normalization__, which refers to __Least Squares__, works by making sure that the sum of squares is 1 in each row.
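
As a minimal sketch of the two definitions above, the snippet below normalizes a single row by hand with NumPy. The variable names (`row`, `row_l1`, `row_l2`) are illustrative and not part of any library API; the row itself is the first row of the example data used later in this section.

```python
import numpy as np

# One illustrative row (first row of the example data in this section)
row = np.array([5.2, -2.9, 3.3])

# L1 normalization: divide every value by the sum of absolute values
l1_norm = np.sum(np.abs(row))
row_l1 = row / l1_norm

# L2 normalization: divide every value by the square root of the sum of squares
l2_norm = np.sqrt(np.sum(row ** 2))
row_l2 = row / l2_norm

print('L1-normalized row:', row_l1)
print('sum of absolute values:', np.sum(np.abs(row_l1)))
print('L2-normalized row:', row_l2)
print('sum of squares:', np.sum(row_l2 ** 2))
```

Both printed sums should come out as 1 up to floating-point rounding, which is the property each normalization is meant to enforce.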

Which one to pick? Generally, L1 normalization is considered more robust than L2 because it is less sensitive to outliers in the data.
L1 normalization works with absolute values, whereas L2 normalization squares each value, so a single large value carries more weight under L2.

If we are solving a problem where outliers are important, then L2 normalization may be the better choice.
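
To make the outlier argument more concrete, here is a small sketch comparing how much one large value contributes to the quantity each norm is built from. The row is made up for illustration, and `l1_share` and `l2_share` are hypothetical names; this is one way to look at the effect, not a formal robustness measure.

```python
import numpy as np

# Hypothetical row: two ordinary values and one large outlier
row = np.array([1.0, 2.0, 100.0])

# Fraction of the sum of absolute values contributed by each entry (L1 view)
l1_share = np.abs(row) / np.sum(np.abs(row))

# Fraction of the sum of squares contributed by each entry (L2 view)
l2_share = row ** 2 / np.sum(row ** 2)

print('share of sum of absolute values:', l1_share)
print('share of sum of squares:', l2_share)
```

The outlier accounts for a larger share of the sum of squares than of the sum of absolute values, which is the sense in which L2 weights outliers more heavily than L1.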

import numpy as np
from sklearn.preprocessing import normalize

input_data = np.array([[5.2, -2.9, 3.3],
                       [-1.2, 7.8, -6.1],
                       [3.9, 0.4, 2.1],
                       [7.3, -9.9, -4.5]])

# Normalize data
data_normalized_l1 = normalize(input_data, norm='l1')
data_normalized_l2 = normalize(input_data, norm='l2')

print('data_normalized_l1:\n', data_normalized_l1)
print('data_normalized_l2:\n', data_normalized_l2)

data_normalized_l1:
 [[ 0.45614035 -0.25438596  0.28947368]
 [-0.0794702   0.51655629 -0.40397351]
 [ 0.609375    0.0625      0.328125  ]
 [ 0.33640553 -0.4562212  -0.20737327]]
data_normalized_l2:
 [[ 0.76388033 -0.42601019  0.48477021]
 [-0.12030718  0.78199664 -0.61156148]
 [ 0.87690281  0.08993875  0.47217844]
 [ 0.55734935 -0.75585734 -0.34357152]]
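
The output above can be sanity-checked against the definitions: every row of the L1 result should have absolute values that sum to 1, and every row of the L2 result should have squares that sum to 1. The sketch below repeats the same `normalize` calls and computes those row sums; `l1_row_sums` and `l2_row_sums` are names introduced here for illustration.

```python
import numpy as np
from sklearn.preprocessing import normalize

input_data = np.array([[5.2, -2.9, 3.3],
                       [-1.2, 7.8, -6.1],
                       [3.9, 0.4, 2.1],
                       [7.3, -9.9, -4.5]])

data_normalized_l1 = normalize(input_data, norm='l1')
data_normalized_l2 = normalize(input_data, norm='l2')

# Sum of absolute values in each row of the L1 result
l1_row_sums = np.abs(data_normalized_l1).sum(axis=1)

# Sum of squares in each row of the L2 result
l2_row_sums = (data_normalized_l2 ** 2).sum(axis=1)

print('L1 row sums:', l1_row_sums)
print('L2 row sums of squares:', l2_row_sums)
```

Note that `normalize` works row by row by default (`axis=1`), which matches the "in each row" wording used earlier.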
