# Hypertools Normalization

The normalize function z-scores your data, over either the columns or the rows of an array or list of arrays.

By default, the function normalizes 'across' lists: each column is z-scored using the values from all arrays in the list combined. It can also z-score the columns 'within' each individual array, or z-score each 'row' of each array. It returns an array or list of arrays (the output type matches the input type) whose columns or rows are z-scored.

This feature is especially useful for data reduction and machine learning techniques that are sensitive to scaling differences between features.
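To make the scaling point concrete, here is a minimal NumPy-only sketch. The two features, their scales, and the seed are hypothetical choices for illustration, and the z-score is computed by hand rather than with hypertools. It shows how one large-scale feature can carry almost all of the variance until the columns are z-scored.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, chosen only for reproducibility

# Two hypothetical features on very different scales
small = rng.normal(loc=0.0, scale=1.0, size=500)
large = rng.normal(loc=0.0, scale=100.0, size=500)
X = np.column_stack([small, large])


def variance_share(a):
    """Return the fraction of total variance carried by each column."""
    v = a.var(axis=0)
    return v / v.sum()


# Plain column-wise z-score using the population standard deviation (ddof=0)
Z = (X - X.mean(axis=0)) / X.std(axis=0)

raw_share = variance_share(X)  # dominated by the large-scale column
z_share = variance_share(Z)    # equal shares once each column has unit variance
print("raw:", raw_share)
print("z-scored:", z_share)
```

A variance-based method such as PCA would be driven almost entirely by the second column of the raw data, whereas both columns contribute equally after z-scoring.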

## Import Packages

In [6]:
import hypertools as hyp
import numpy as np
import matplotlib.pyplot as plt

## Generate synthetic data

In [7]:
cluster1 = np.random.multivariate_normal(np.zeros(3), np.eye(3), size=100)
cluster2 = np.random.multivariate_normal(np.zeros(3)+10, np.eye(3), size=100)

data = [cluster1, cluster2]
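The cell above draws from NumPy's global random state without a seed, so re-running it produces different numbers from the outputs printed in this document. If reproducible values are wanted, one option is a seeded generator, sketched below. The seed value is an arbitrary assumption and is not part of the original notebook.

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed, not from the original notebook

# Same construction as above: two 3-D Gaussian clusters with identity covariance,
# one centered at 0 and one centered at 10 on every axis
cluster1 = rng.multivariate_normal(np.zeros(3), np.eye(3), size=100)
cluster2 = rng.multivariate_normal(np.zeros(3) + 10, np.eye(3), size=100)

data = [cluster1, cluster2]

# Quick look at what was generated: 100 rows x 3 columns per cluster
for i, arr in enumerate(data, start=1):
    print(f"cluster{i}: shape={arr.shape}, column means={arr.mean(axis=0).round(2)}")
```

The column means should land near 0 for the first cluster and near 10 for the second, which is the offset that the 'across' and 'within' modes treat differently.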

## Normalizing

Simply pass data to normalize with no additional arguments to z-score across lists.

In [8]:
hyp.normalize(data)

[array([[-1.04209273, -0.90978167, -0.61590356],
        [-1.01460522, -1.32797426, -1.08567012],
        [-0.90147856, -0.8907131 , -1.15345733],
        [-1.01541389, -0.9091629 , -0.83977903],
        [-0.82061808, -1.33803953, -0.9737548 ],
        [-1.3192241 , -0.93583037, -1.12612115],
        [-0.73429618, -0.91164124, -0.96017266],
        [-1.07738035, -0.97603518, -1.50867896],
        [-1.26533296, -0.71477688, -1.02485484],
        [-0.83803208, -0.86504471, -0.99183391],
        [-0.91161786, -0.56342652, -1.09458544],
        [-1.03062   , -1.1192014 , -0.85773923],
        [-0.77913749, -0.53846212, -0.92471084],
        [-0.95675897, -1.08826249, -1.00969289],
        [-1.04369948, -1.09329553, -1.15045321],
        [-0.759182  , -1.3153986 , -0.93039052],
        [-0.90588574, -1.13559308, -0.95299039],
        [-0.84913179, -1.26439212, -0.81728596],
        [-0.84402177, -0.81783601, -1.21492322],
        [-1.1237754 , -1.07433053, -0.7480133 ],
        [-1.04258083

Alternatively, pass one of the following values as the normalize argument:

+ 'across' - columns z-scored across passed lists (default)
+ 'within' - columns z-scored within passed lists
+ 'row' - rows z-scored 
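The three options listed above can be sketched in plain NumPy as follows. This is an illustration of the definitions, not the hypertools source: the helper names are hypothetical, and the use of the population standard deviation (ddof=0) is an assumption, although it is consistent with the 'row' output shown in this document, where the squared values in the first row sum to 3.

```python
import numpy as np


def zscore_across(arrays):
    """Z-score columns using the mean/std of all arrays stacked together."""
    stacked = np.vstack(arrays)
    mu, sigma = stacked.mean(axis=0), stacked.std(axis=0)
    return [(a - mu) / sigma for a in arrays]


def zscore_within(arrays):
    """Z-score columns of each array using that array's own mean/std."""
    return [(a - a.mean(axis=0)) / a.std(axis=0) for a in arrays]


def zscore_row(arrays):
    """Z-score every row of every array independently."""
    return [
        (a - a.mean(axis=1, keepdims=True)) / a.std(axis=1, keepdims=True)
        for a in arrays
    ]


rng = np.random.default_rng(0)  # arbitrary seed for a reproducible illustration
data = [rng.normal(0, 1, size=(100, 3)), rng.normal(10, 1, size=(100, 3))]

across = zscore_across(data)
within = zscore_within(data)
rows = zscore_row(data)

# 'across' keeps the offset between the clusters; 'within' removes it
print("across, cluster means:", across[0].mean(axis=0), across[1].mean(axis=0))
print("within, cluster means:", within[0].mean(axis=0), within[1].mean(axis=0))
print("row, first three row means:", rows[0].mean(axis=1)[:3])
```

With two equal-sized, unit-variance clusters centered at 0 and 10, the stacked column mean is about 5 and the stacked standard deviation is about the square root of 26, roughly 5.1. Under 'across' the first cluster therefore sits near -1 and the second near +1, while under 'within' each cluster is centered at 0 on its own.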

In [13]:
hyp.normalize(data, normalize='across')

[array([[-1.04209273, -0.90978167, -0.61590356],
        [-1.01460522, -1.32797426, -1.08567012],
        [-0.90147856, -0.8907131 , -1.15345733],
        [-1.01541389, -0.9091629 , -0.83977903],
        [-0.82061808, -1.33803953, -0.9737548 ],
        [-1.3192241 , -0.93583037, -1.12612115],
        [-0.73429618, -0.91164124, -0.96017266],
        [-1.07738035, -0.97603518, -1.50867896],
        [-1.26533296, -0.71477688, -1.02485484],
        [-0.83803208, -0.86504471, -0.99183391],
        [-0.91161786, -0.56342652, -1.09458544],
        [-1.03062   , -1.1192014 , -0.85773923],
        [-0.77913749, -0.53846212, -0.92471084],
        [-0.95675897, -1.08826249, -1.00969289],
        [-1.04369948, -1.09329553, -1.15045321],
        [-0.759182  , -1.3153986 , -0.93039052],
        [-0.90588574, -1.13559308, -0.95299039],
        [-0.84913179, -1.26439212, -0.81728596],
        [-0.84402177, -0.81783601, -1.21492322],
        [-1.1237754 , -1.07433053, -0.7480133 ],
        [-1.04258083

In [14]:
hyp.normalize(data, normalize='within')

[array([[-0.31994902,  0.40577034,  2.07836636],
        [-0.17730156, -1.91493054, -0.58952666],
        [ 0.40977343,  0.51158871, -0.97450299],
        [-0.18149819,  0.40920416,  0.80693531],
        [ 0.82940188, -1.97078636,  0.04606162],
        [-1.75813258,  0.26121677, -0.81925567],
        [ 1.2773726 ,  0.39545095,  0.12319711],
        [-0.50307546,  0.03810584, -2.99187373],
        [-1.47846251,  1.48792201, -0.2441452 ],
        [ 0.73903129,  0.65403183, -0.05661311],
        [ 0.35715513,  2.32781954, -0.6401584 ],
        [-0.26041088, -0.75637494,  0.70493591],
        [ 1.04466693,  2.46635592,  0.32459153],
        [ 0.12289366, -0.58468381, -0.15803761],
        [-0.32828735, -0.61261399, -0.95744202],
        [ 1.14822672, -1.84514368,  0.29233552],
        [ 0.38690221, -0.84733829,  0.16398662],
        [ 0.68142893, -1.56209041,  0.9346777 ],
        [ 0.70794758,  0.91600986, -1.32357942],
        [-0.7438443 , -0.5073704 ,  1.32809019],
        [-0.32248203

In [15]:
hyp.normalize(data, normalize='row')

[array([[-1.01484267, -0.34555626,  1.36039892],
        [ 0.96954914, -1.37639156,  0.40684242],
        [ 0.71624934,  0.69792465, -1.41417399],
        [-1.26246597,  0.07929571,  1.18317026],
        [ 1.03344962, -1.35277888,  0.31932926],
        [-1.22259463,  1.22688385, -0.00428922],
        [ 1.39320676, -0.48629126, -0.9069155 ],
        [ 0.51186705,  0.88577314, -1.39764018],
        [-1.16387753,  1.2776699 , -0.11379237],
        [ 0.97547526,  0.39901826, -1.37449352],
        [-0.22463021,  1.32151151, -1.09688131],
        [-0.1833003 , -1.1227636 ,  1.3060639 ],
        [-0.16875946,  1.30037323, -1.13161377],
        [ 1.14804075, -1.28921393,  0.14117318],
        [ 1.27408674, -0.10551047, -1.16857628],
        [ 1.04462949, -1.34788238,  0.3032529 ],
        [ 0.94911418, -1.38251453,  0.43340035],
        [ 0.64484545, -1.41243683,  0.76759139],
        [ 0.66779626,  0.74570185, -1.41349811],
        [-0.8053648 , -0.60406516,  1.40942996],
        [ 1.22090296