<p>Please see the corresponding writeup here: http://kawahara.ca/how-to-normalize-vectors-to-unit-norm-in-python/</p>

In [21]:
import numpy as np
from sklearn import preprocessing

# 2 samples, with 3 dimensions.
# The 2 rows indicate 2 samples.
# The 3 columns indicate 3 features for each sample.
X = np.asarray([
    [-1, 0, 1],
    [0, 1, 2]],
    dtype=float)  # Float is needed. The np.float alias was removed in NumPy 1.24.

# Before-normalization.
print(X)
# Output,
# [[-1.  0.  1.]
#  [ 0.  1.  2.]]

[[-1.  0.  1.]
 [ 0.  1.  2.]]


<h1>L2 Normalization</h1>

In [22]:
# l2-normalize the samples (rows).
X_normalized = preprocessing.normalize(X, norm='l2')

# After normalization.
print(X_normalized)
# Output,
# [[-0.70710678  0.          0.70710678]
#  [ 0.          0.4472136   0.89442719]]

[[-0.70710678  0.          0.70710678]
 [ 0.          0.4472136   0.89442719]]
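
To see what the call above is doing, here is a minimal NumPy-only sketch of row-wise L2 normalization: divide each row by its Euclidean length. The helper name `l2_normalize_rows` is made up for this illustration, and leaving all-zero rows unchanged is an assumption of the sketch, not a claim about how scikit-learn is implemented internally.

```python
import numpy as np


def l2_normalize_rows(X):
    """Hypothetical helper: divide each row by its Euclidean (L2) norm."""
    X = np.asarray(X, dtype=float)
    # One norm per row; keepdims=True keeps shape (n_samples, 1) for broadcasting.
    norms = np.linalg.norm(X, ord=2, axis=1, keepdims=True)
    # Assumption of this sketch: all-zero rows are left as they are.
    norms[norms == 0] = 1.0
    return X / norms


X = np.asarray([[-1, 0, 1], [0, 1, 2]], dtype=float)
X_manual = l2_normalize_rows(X)
print(X_manual)
```

For the first row the norm is sqrt(1 + 0 + 1) = sqrt(2), so -1 becomes about -0.7071, which matches the printed result above.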


In [23]:
# Square all the elements/features.
X_squared = X_normalized ** 2
print(X_squared)
# Output,
# [[ 0.5  0.   0.5]
#  [ 0.   0.2  0.8]]

# Sum over the rows.
X_sum_squared = np.sum(X_squared, axis=1)
print(X_sum_squared)
# Output,
# [ 1.  1.]

# Yay! The squared elements of each row sum to 1 after being normalized.

[[0.5 0.  0.5]
 [0.  0.2 0.8]]
[1. 1.]


<h1>L1 Normalization</h1>

In [24]:
X_normalized_l1 = preprocessing.normalize(X, norm='l1')
print(X_normalized_l1)
# Output,
# [[-0.5         0.          0.5       ]
#  [ 0.          0.33333333  0.66666667]]

[[-0.5         0.          0.5       ]
 [ 0.          0.33333333  0.66666667]]


In [25]:
# Absolute value of all elements/features.
X_abs = np.abs(X_normalized_l1)
print(X_abs)
# Output,
# [[0.5        0.         0.5       ]
#  [0.         0.33333333 0.66666667]]

[[0.5        0.         0.5       ]
 [0.         0.33333333 0.66666667]]


In [26]:
# Sum over the rows.
X_sum_abs = np.sum(X_abs, axis=1)
print(X_sum_abs)
# Output,
# [ 1.  1.]

# Yay! The absolute values in each row sum to 1 after being normalized.

[1. 1.]


In [None]:
<h2>How to l1-normalize vectors to a unit vector in Python</h2>
Now you might ask yourself: that worked for L2 normalization, but what about L1 normalization?

In <code>L2 normalization</code>, we scale each sample (row) so that its <strong>squared</strong> elements sum to 1, whereas in <code>L1 normalization</code> we scale each sample (row) so that the <strong>absolute values</strong> of its elements sum to 1.
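
As a rough sketch of that definition, L1 normalization can be written by hand with NumPy alone: divide each row by the sum of its absolute values. The helper name `l1_normalize_rows` is hypothetical, and leaving all-zero rows unchanged is an assumption made only for this illustration.

```python
import numpy as np


def l1_normalize_rows(X):
    """Hypothetical helper: divide each row by the sum of its absolute values."""
    X = np.asarray(X, dtype=float)
    # L1 norm of each row, kept as a column so it broadcasts across the features.
    norms = np.abs(X).sum(axis=1, keepdims=True)
    # Assumption of this sketch: all-zero rows are left as they are.
    norms[norms == 0] = 1.0
    return X / norms


X = np.asarray([[-1, 0, 1], [0, 1, 2]], dtype=float)
X_manual_l1 = l1_normalize_rows(X)
print(X_manual_l1)
```

The second row has absolute values 0, 1 and 2, which sum to 3, so its elements become 0, 1/3 and 2/3.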

Let's do another example for L1 normalization (where <code>X</code> is the same as above)!

[python]
X_normalized_l1 = preprocessing.normalize(X, norm='l1')
print(X_normalized_l1)
# Output,
# [[-0.5         0.          0.5       ]
#  [ 0.          0.33333333  0.66666667]]
[/python]

Okay, that looks promising! Let's do a quick sanity check.

[python]
# Absolute value of all elements/features.
X_abs = np.abs(X_normalized_l1)
print(X_abs)
# Output,
# [[0.5        0.         0.5       ]
#  [0.         0.33333333 0.66666667]]

# Sum over the rows.
X_sum_abs = np.sum(X_abs, axis=1)
print(X_sum_abs)
# Output,
# [ 1.  1.]

# Yay! The absolute values in each row sum to 1 after being normalized.
[/python]

We can now see that taking the absolute value of each element, and then summing across each row, gives the expected value of "1" for each row.
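
If you would rather not square or take absolute values by hand, the same sanity check can be sketched with `np.linalg.norm`, which computes the row norms directly when given `ord` and `axis=1`. This version uses only NumPy and rebuilds the normalized rows itself so that it runs on its own; `results` and `ord_` are names chosen just for this example. Comparing with `np.allclose` rather than `==` is a deliberate choice, since floating-point rounding can leave a norm a hair away from exactly 1.

```python
import numpy as np

X = np.asarray([[-1, 0, 1], [0, 1, 2]], dtype=float)

results = {}
for ord_ in (1, 2):  # 1 -> L1 norm, 2 -> L2 norm
    # Normalize each row by its own norm of the chosen order.
    norms = np.linalg.norm(X, ord=ord_, axis=1, keepdims=True)
    X_unit = X / norms
    # Recompute the norm of every normalized row; each should be (close to) 1.
    row_norms = np.linalg.norm(X_unit, ord=ord_, axis=1)
    results[ord_] = row_norms
    print(ord_, row_norms, np.allclose(row_norms, 1.0))
```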