In [1]:
import numpy as np
import sklearn.preprocessing

# L2 normalization
So far, each of the scaling techniques we've used has been applied to the data features (i.e. columns). However, in certain cases we want to scale the individual data observations (i.e. rows). For instance, when clustering data we often apply L2 normalization to each row, because the dot product of two L2-normalized rows is exactly their cosine similarity. The Clustering section will cover data clustering and cosine similarities in greater depth.

L2 normalization applied to a particular row of a data array will divide each value in that row by the row's L2 norm. In general terms, the L2 norm of a row is just the square root of the sum of squared values for the row.

$$X = [x_1, x_2, x_3, \ldots, x_m]$$

$$X_{L2} = \left[\frac{x_1}{\ell}, \frac{x_2}{\ell}, \frac{x_3}{\ell}, \ldots, \frac{x_m}{\ell}\right]$$

where

$$\ell = \sqrt{\sum_{i=1}^{m} x_i^2}$$

The above formula demonstrates L2 normalization applied to row $X$ to obtain the normalized row of values, $X_{L2}$.
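The formula can be checked by hand before reaching for a library. The following is a minimal NumPy sketch, not the scikit-learn implementation; the names `row_norms` and `manual_l2` are illustrative, and the array is the same one used in the Normalizer example in this section.

```python
import numpy as np

data = np.array([[4, 1, 2, 2],
                 [3, 4, 0, 0],
                 [7, 5, 9, 2]], dtype=float)

# L2 norm of each row: square root of the sum of squared values.
# keepdims=True keeps the shape (3, 1) so the division below
# broadcasts across the columns of each row.
row_norms = np.sqrt((data ** 2).sum(axis=1, keepdims=True))

# Divide every value in a row by that row's L2 norm.
manual_l2 = data / row_norms

print(row_norms.ravel())
print(manual_l2)
```

The first two rows both have an L2 norm of 5, so the first row becomes 4/5, 1/5, 2/5, 2/5. Note that this sketch assumes no row is all zeros, since that would divide by zero.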

In scikit-learn, the transformer that implements L2 normalization is Normalizer, from the sklearn.preprocessing module.

The code below shows how to use the Normalizer.

In [3]:
data = np.array([[4, 1, 2, 2],
                 [3, 4, 0, 0],
                 [7, 5, 9, 2]])

# predefined data
print('{}\n'.format(repr(data)))

from sklearn.preprocessing import Normalizer
# Normalizer operates on rows: each row is divided by its own L2 norm
# (norm='l2' is the default)
normalizer = Normalizer()
transformed = normalizer.fit_transform(data)
print('{}\n'.format(repr(transformed)))

array([[4, 1, 2, 2],
       [3, 4, 0, 0],
       [7, 5, 9, 2]])

array([[0.8       , 0.2       , 0.4       , 0.4       ],
       [0.6       , 0.8       , 0.        , 0.        ],
       [0.55513611, 0.39652579, 0.71374643, 0.15861032]])
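To connect this output back to the cosine similarity motivation, the sketch below checks two properties: every normalized row has length 1, and the dot product of two normalized rows matches the cosine similarity computed directly from the raw rows. It uses plain NumPy to reproduce the normalization so the snippet is self-contained; the variable names are illustrative.

```python
import numpy as np

data = np.array([[4, 1, 2, 2],
                 [3, 4, 0, 0],
                 [7, 5, 9, 2]], dtype=float)

# Row-wise L2 normalization, equivalent to the transformed array above.
unit_rows = data / np.linalg.norm(data, axis=1, keepdims=True)

# After normalization, every row has an L2 norm of 1.
print(np.linalg.norm(unit_rows, axis=1))

# Pairwise dot products of unit-length rows are cosine similarities.
cosine_from_unit = unit_rows @ unit_rows.T

# Cosine similarity of rows 0 and 1 computed directly from the raw data.
a, b = data[0], data[1]
cosine_direct = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine_from_unit[0, 1], cosine_direct)
```

For rows 0 and 1 the raw dot product is 16 and both norms are 5, so both approaches give 16/25 = 0.64 (up to floating-point rounding).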

