# Normalization techniques

In [22]:
import torch

The following sample tensor represents a collection of three-dimensional $(x,y,z)$ points, one point per row, as they appear in the Messenger data.

In [23]:
data = torch.tensor([
    [1, 10, 100],
    [2, 20, 200]
], dtype=torch.float32)

In [24]:
print(data)

tensor([[  1.,  10., 100.],
        [  2.,  20., 200.]])


## Naïve normalization

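If we simply normalize each coordinate separately, every coordinate is rescaled by its own standard deviation, so the relative proportions between the coordinates are completely lost: all three columns end up with identical values.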

In [25]:
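# Per-coordinate mean and standard deviation (one value per column).
# Note: torch.std applies Bessel's correction (divides by N - 1) by default.
mean = torch.mean(data, dim=0)
stdev = torch.std(data, dim=0)

# Each column is centered and scaled independently.
naive = (data - mean) / stdev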

In [26]:
print(naive)

tensor([[-0.7071, -0.7071, -0.7071],
        [ 0.7071,  0.7071,  0.7071]])
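
To make the loss of proportions explicit, the following minimal sketch compares the spread (maximum minus minimum) of each coordinate before and after the naïve normalization. The names `range_before` and `range_after` are introduced here for illustration only and are not part of the original notebook.

```python
import torch

data = torch.tensor([
    [1, 10, 100],
    [2, 20, 200]
], dtype=torch.float32)

# Per-coordinate normalization, as in the cell above.
naive = (data - torch.mean(data, dim=0)) / torch.std(data, dim=0)

# Spread of each coordinate: maximum minus minimum along the point axis.
range_before = data.max(dim=0).values - data.min(dim=0).values
range_after = naive.max(dim=0).values - naive.min(dim=0).values

print(range_before)  # spreads in the ratio 1 : 10 : 100
print(range_after)   # all three spreads are equal
```

Before normalization the spreads stand in the ratio 1 : 10 : 100; afterwards they are all the same, so the information about how the coordinates relate to each other is gone.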


## Ratio-preserving normalization

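When we instead divide all coordinates by a single deviation measure that takes every coordinate into account (here, the root-mean-square distance of the points from their mean), the relative proportions are preserved.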
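
Written out, the deviation measure computed in the next cell is the following. The symbols $p_i$ (the $i$-th point), $\bar{p}$ (the per-coordinate mean), $N$ (the number of points) and $\sigma_d$ are notation assumed here for illustration; they do not appear in the original notebook.

$$
\sigma_d = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \lVert p_i - \bar{p} \rVert^2},
\qquad
\hat{p}_i = \frac{p_i - \bar{p}}{\sigma_d}
$$

Because $\sigma_d$ is a single scalar, every coordinate is divided by the same number, which is why the proportions between coordinates survive.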

In [27]:
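mean = torch.mean(data, dim=0)
# Squared Euclidean distance of each point from the mean point.
# (torch.linalg.vector_norm replaces the deprecated torch.norm.)
sqdists = torch.linalg.vector_norm(data - mean, dim=1) ** 2
# Root-mean-square distance: a single scalar shared by all coordinates.
distdev = torch.sqrt(torch.mean(sqdists))

preserving = (data - mean) / distdev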

In [28]:
print(preserving)

tensor([[-0.0099, -0.0995, -0.9950],
        [ 0.0099,  0.0995,  0.9950]])
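
As a quick sanity check, the sketch below recomputes the ratio-preserving normalization and confirms two properties: the coordinate spreads keep their 1 : 10 : 100 ratio, and the root-mean-square distance of the normalized points from the origin is 1. The names `spread`, `ratios` and `rms` are illustrative and not taken from the original notebook.

```python
import torch

data = torch.tensor([
    [1, 10, 100],
    [2, 20, 200]
], dtype=torch.float32)

# Ratio-preserving normalization, as in the cell above.
centered = data - torch.mean(data, dim=0)
distdev = torch.sqrt(torch.mean(torch.sum(centered ** 2, dim=1)))
preserving = centered / distdev

# Spread of each coordinate, expressed relative to the x coordinate.
spread = preserving.max(dim=0).values - preserving.min(dim=0).values
ratios = spread / spread[0]

# Root-mean-square distance of the normalized points from the origin.
rms = torch.sqrt(torch.mean(torch.sum(preserving ** 2, dim=1)))

print(ratios)  # approximately 1, 10, 100
print(rms)     # approximately 1
```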


The drawback is that, if the coordinates differ too much in scale, the numerical issues that normalization was meant to avoid in the first place reappear.
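
The following sketch illustrates this drawback with deliberately exaggerated, hypothetical values (they are not taken from the Messenger data): the $z$ coordinate is made $10^8$ times larger than $x$. After ratio-preserving normalization, $z$ is of order 1 while $x$ stays about eight orders of magnitude smaller, which is beyond the roughly seven significant decimal digits that `float32` carries.

```python
import torch

# Hypothetical points with an extreme scale difference between x and z.
extreme = torch.tensor([
    [1, 10, 1e8],
    [2, 20, 2e8]
], dtype=torch.float32)

# Same ratio-preserving normalization as above.
centered = extreme - torch.mean(extreme, dim=0)
distdev = torch.sqrt(torch.mean(torch.sum(centered ** 2, dim=1)))
scaled = centered / distdev

# Largest absolute value per coordinate after normalization.
max_abs = scaled.abs().max(dim=0).values
print(max_abs)  # x is many orders of magnitude smaller than z
```

Whether such small values actually cause problems depends on what is done with the normalized data afterwards; the sketch only shows that the scale difference is carried over unchanged.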
