# Linear Normalization
Most normalizations are linear normalizations.  
A linear normalization applies this pattern:

`xNorm = (x - offset)/spread` 

Where: 
- x is a numeric variable
- offset is a scalar that shifts variable x lower or higher
- spread is a scalar that re-scales variable x to a smaller or larger spread
- xNorm is the normalized variable
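
The pattern can be wrapped in a small helper function. This is only a minimal sketch: the name `linear_normalize` and the sample values are hypothetical and are not used elsewhere in this notebook.

```python
import numpy as np

def linear_normalize(x, offset, spread):
    """Apply xNorm = (x - offset) / spread element-wise."""
    x = np.asarray(x, dtype=float)
    if spread == 0:
        raise ValueError("spread must be non-zero")
    return (x - offset) / spread

values = np.array([2.0, 4.0, 6.0])
# offset=2 shifts the values down by 2; spread=4 divides their range by 4
normalized = linear_normalize(values, offset=2.0, spread=4.0)
print(normalized)  # the values 0, 0.5 and 1
```

Every method below is this same function with a different choice of `offset` and `spread`.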


In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
#import seaborn as sns
%matplotlib inline

## Different Linear Normalization Methods
First, we need an array that we can normalize. We call this array a variable.

In [None]:
x = np.array([1,11,5,3,15,3,5,9,7,9,3,5,7,3,5,21.])
X = pd.DataFrame(data = x, columns = ['x'])
print(" The original variable:")
X

### Trivial Normalization
- offset is 0
- spread is 1

In math, a trivial process is one that doesn't change anything.

In [None]:
offset = 0
spread = 1 
X['xNormTrivial'] = (x - offset)/spread
print(" Trivial normalization doesn't change values:")
X

### Min-max or Feature scaling
- offset is min of x
- spread is the range of x or the max of x minus the min of x
- The min of a min-max-normalized variable is zero
- The max of a min-max-normalized variable is one

In [None]:
offset = min(x)
spread = max(x) - min(x)
X['xNormMinMax'] = (x - offset)/spread
print(" The min-max-normalized variable has values from 0 to 1")
X

### Z-Normalization or Standard Normalization or Standard Scoring
- offset is mean of x (The mean of a z-normalized variable is zero)
- spread is standard deviation of x (The standard deviation of a z-normalized variable is one)
- Most of the values are between -2 and +2
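
As a quick check of the last bullet, the fraction of values inside the interval from -2 to +2 can be counted directly. This sketch repeats the array `x` defined above so that it runs on its own; the name `fraction_within_2` is made up for illustration.

```python
import numpy as np

x = np.array([1, 11, 5, 3, 15, 3, 5, 9, 7, 9, 3, 5, 7, 3, 5, 21.])
z = (x - np.mean(x)) / np.std(x)  # np.std defaults to the population std (ddof=0)

# Fraction of z-normalized values inside the interval [-2, 2]
fraction_within_2 = np.mean(np.abs(z) <= 2)
print(fraction_within_2)  # 0.9375, i.e. 15 of the 16 values
```

Only the largest value, 21, lands outside the interval for this particular array.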

In [None]:
offset = np.mean(x)
spread = np.std(x)
X['xNormZ'] = (x - offset)/spread
print(" The Z-normalized variable has most values within -2 and + 2")
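X

### Median-Median Absolute Deviation (MMAD) Normalization
- offset is the median of x (the median of an MMAD-normalized variable is zero)
- spread is the median absolute deviation of x, i.e. the median of |x - median of x|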

In [None]:
offset = np.median(x)
spread = np.median(np.absolute(x - np.median(x)))
X['xNormMad'] = (x - offset)/spread
print(" The MMAD-normalized variable is zeroed around the median")
X

### Comparisons of linear Normalizations
The normalizations above all produce different values.
How do their histograms compare?

In [None]:
fig, axes = plt.subplots(2, 2, figsize = (14,7))
axes[0, 0].hist(x = X.xNormTrivial, bins=8)
axes[0, 0].set_title('Trivial Normalization', y=1.0, pad=-14)
axes[0, 1].hist(x = X.xNormMinMax,  bins=8)
axes[0, 1].set_title('MinMax Normalization', y=1.0, pad=-14)
axes[1, 0].hist(x = X.xNormZ,       bins=8)
axes[1, 0].set_title('Z Normalization', y=1.0, pad=-14)
axes[1, 1].hist(x = X.xNormMad,     bins=8)
axes[1, 1].set_title('MMAD Normalization', y=1.0, pad=-14);
fig.suptitle('Compare linear normalization methods');

### What is the difference between these normalizations?
The histograms (distributions) all look alike except for the x-axis values. A linear normalization will not change the shape of the histogram (distribution).  The normalization will only shift and rescale the x-axis of the histogram.
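
The claim can also be checked numerically instead of by eye. The sketch below, which assumes the same array `x` as above and uses `np.histogram` rather than a plot, compares the bin counts before and after min-max normalization.

```python
import numpy as np

x = np.array([1, 11, 5, 3, 15, 3, 5, 9, 7, 9, 3, 5, 7, 3, 5, 21.])
x_min_max = (x - x.min()) / (x.max() - x.min())

# Histogram each version with the same number of equal-width bins
counts_original, edges_original = np.histogram(x, bins=8)
counts_min_max, edges_min_max = np.histogram(x_min_max, bins=8)

print(counts_original)   # bin counts of the original variable
print(counts_min_max)    # bin counts after min-max normalization
print(edges_original[[0, -1]], edges_min_max[[0, -1]])  # only the axis range differs
```

The counts per bin match; only the bin edges, that is the x-axis, are shifted and rescaled.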

## Min-Max-Normalization versus Z-Normalization
The two most common normalizations are min-max and Z-normalization. One major purpose of normalizing variables is to put their values on par with each other. In other words, normalization should adjust the scales of the variables so that they are similar.
- Create two variables, one with a high outlier, one with a low outlier
- Overlay histograms
    - Overlay the histograms of the two original variables
    - Overlay the histograms of the two Min-Max Normalized variables
    - Overlay the histograms of the two Z-normalized variables

In [None]:
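# Variable 1: two clusters plus one high outlier
sigma1 = 1
mu1a = 3
mu1b = 7
x1 = np.array(15)  # high outlier
x1 = np.append(x1, mu1a + sigma1*np.random.randn(100))
x1 = np.append(x1, mu1b + sigma1*np.random.randn(50))
x1 = x1.reshape(-1,1)  # column vector, as scikit-learn expects

# Variable 2: one cluster plus one low outlier
sigma2 = 0.5
mu2a = 8.9  # defined but not used below
mu2b = 10.1
x2 = np.array(2.5)  # low outlier
x2 = np.append(x2, mu2b + sigma2*np.random.randn(150))
x2 = x2.reshape(-1,1)  # column vector, same shape as x1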

### Overlay Histograms of Original Variables

In [None]:
# Compare the original variables by overlaying histograms
plt.hist(x1, bins = 20, color=[0, 0, 1, 0.5])
plt.hist(x2, bins = 20, color=[1, 0.7, 0, 0.5])
plt.title("Compare variables \n without normalization", y=1.0, pad=-28, loc='left')
plt.show()

#### Compare Distributions
- Are the scales of the two variables significantly different?  
- If so, how are they different?
- Compare the values of each histogram's x-coordinate

### Overlay Histograms of Min-Max Normalized Variables

In [None]:
# Change both variables by Min-Max Normalization
x1NormMinMax = (x1 - np.min(x1))/(np.max(x1) - np.min(x1))
x2NormMinMax = (x2 - np.min(x2))/(np.max(x2) - np.min(x2))
# Compare the Min-Max-normalized variables by overlaying histograms
plt.hist(x1NormMinMax, bins = 20, color=[0, 0, 1, 0.5])
plt.hist(x2NormMinMax, bins = 20, color=[1, 0.7, 0, 0.5])
plt.title("Compare variables after \n Min-Max Normalization", y=1.0, pad=-28, loc='left')
plt.show()

#### Compare Min-Max-Normalized Distributions
- Are the Min-Max normalized scales of the variables significantly different?  
- If so, how are they different?
- Compare the values of each histogram's x-coordinate
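
One way to put a number on the comparison is to look at where the bulk of each variable lands after min-max normalization. The sketch below uses two tiny made-up arrays (not `x1` and `x2`) and a hypothetical helper `min_max`, with the median standing in for "the bulk".

```python
import numpy as np

def min_max(v):
    """Min-max normalize a 1-D array to the range 0..1."""
    v = np.asarray(v, dtype=float)
    return (v - v.min()) / (v.max() - v.min())

high_outlier = np.array([2.0, 3.0, 4.0, 15.0])   # bulk is low, one high outlier
low_outlier = np.array([2.5, 9.0, 10.0, 11.0])   # bulk is high, one low outlier

med_high = np.median(min_max(high_outlier))
med_low = np.median(min_max(low_outlier))
print(med_high, med_low)
```

Both normalized variables span 0 to 1, but in this example the outliers push the two medians toward opposite ends of that range.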

### Overlay Histograms of Z-Normalized Variables

In [None]:
# Change both variables by Z-Normalization
x1NormZ = (x1 - np.mean(x1))/np.std(x1)
x2NormZ = (x2 - np.mean(x2))/np.std(x2)
# Compare the Z-normalized variables by overlaying histograms
plt.hist(x1NormZ, bins = 20, color=[0, 0, 1, 0.5])
plt.hist(x2NormZ, bins = 20, color=[1, 0.7, 0, 0.5])
plt.title("Compare variables \n after Z-normalization", y=1.0, pad=-28, loc='left')
plt.show()

#### Compare Z-Normalized Distributions
- Are the scales of the variables significantly different?  
- If so, how are they different?
- Compare the values of each histogram's x-coordinate
  
By the way, **Median-Median Absolute Deviation (MMAD)** normalization works just as well as, or even better than, Z-normalization in cases like these, where outliers dominate the scale.
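
A small sketch of why that can be the case: a single extreme point inflates the standard deviation far more than it moves the median absolute deviation. The data and the helper name `mad` are illustrative assumptions, not part of the notebook's variables.

```python
import numpy as np

def mad(v):
    """Median absolute deviation from the median."""
    v = np.asarray(v, dtype=float)
    return np.median(np.abs(v - np.median(v)))

base = np.arange(1.0, 10.0)            # 1, 2, ..., 9
with_outlier = np.append(base, 100.0)  # same values plus one extreme point

print('std:', np.std(base), '->', np.std(with_outlier))
print('MAD:', mad(base), '->', mad(with_outlier))  # 2.0 -> 2.5
```

Because the spread used by MMAD barely moves, the non-outlier values keep roughly the same normalized scale whether or not the outlier is present.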

## Use scikit-learn (sklearn) for Normalization

In [None]:
from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import MinMaxScaler

# Z-Normalize with scikit-learn
Scaler = StandardScaler()
Scaler.fit(x1)
Z_sklearn = Scaler.transform(x1)
print('scikit-learn std is:', Scaler.scale_[0], '; scikit-learn mean is: ', Scaler.mean_[0])

# Z-Normalize with numpy
Z_numpy = (x1 - np.mean(x1))/np.std(x1) # np.std(x1) is spread;  np.mean(x1) is offset
print('       numpy std is:', np.std(x1),';        numpy mean is: ', np.mean(x1))

# Compare the scikit-learn normalization with the numpy normalization
fig, axes = plt.subplots(1, 2, figsize = (14,3))
axes[0].hist(Z_sklearn, bins = 20)
axes[1].hist(Z_numpy,   bins = 20)
axes[0].set_title('sklearn StandardScaler', y=1.0, pad=-14)
axes[1].set_title('(x1 - np.mean(x1))/np.std(x1)', y=1.0, pad=-14);
fig.suptitle(' Compare normalization using numpy vs sklearn ');

#### Conclusion: scikit-learn and numpy normalizations are equivalent
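
The equivalence can also be confirmed numerically rather than from the histograms alone. This sketch uses a small seeded random column as a stand-in for `x1`. It relies on `StandardScaler` and `np.std` both using the population standard deviation (ddof=0) by default.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)  # fixed seed so the check is repeatable
data = rng.normal(loc=5.0, scale=2.0, size=(50, 1))  # stand-in for x1

z_sklearn = StandardScaler().fit_transform(data)
z_numpy = (data - np.mean(data)) / np.std(data)  # np.std uses ddof=0 by default

print(np.allclose(z_sklearn, z_numpy))  # True
```

Note that pandas' `.std()` defaults to ddof=1, so a Z-normalization computed with pandas defaults would differ slightly.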

## Compounding Linear Normalizations

In [None]:
# Z-Normalize the variable
Z = StandardScaler().fit_transform(x1)

# Min-Max-Normalize the variable
MinMax = MinMaxScaler().fit_transform(x1)

# Double Normalized:  Z-Normalize the Min-Max-Normalized variable
MinMax_Z = StandardScaler().fit_transform(MinMax)

# Double Normalized:  Z-Normalize the Z-Normalized variable
Z_Z = StandardScaler().fit_transform(Z)

# Double Normalized:  Min-Max-Normalize the Min-Max-normalized variable
MinMax_MinMax = MinMaxScaler().fit_transform(MinMax)

# Double Normalized:  Min-Max-Normalize the Z-normalized variable
Z_MinMax = MinMaxScaler().fit_transform(Z)

In [None]:
fig, axes = plt.subplots(3, 2, figsize = (14,9))
axes[0, 0].hist(Z, bins = 20)
axes[0, 0].set_title('only Z', y=1.0, pad=-28)
axes[0, 1].hist(MinMax, bins = 20)
axes[0, 1].set_title('only MinMax', y=1.0, pad=-28)
axes[1, 0].hist(MinMax_Z, bins = 20)
axes[1, 0].set_title('1st MinMax\n2nd Z', y=1.0, pad=-28)
axes[1, 1].hist(Z_MinMax, bins = 20)
axes[1, 1].set_title('1st Z\n2nd MinMax', y=1.0, pad=-28)
axes[2, 1].hist(MinMax_MinMax, bins = 20)
axes[2, 1].set_title('1st MinMax\n2nd MinMax', y=1.0, pad=-28)
axes[2, 0].hist(Z_Z, bins = 20)
axes[2, 0].set_title('1st Z\n2nd Z', y=1.0, pad=-28);
fig.suptitle('Compare the x-axis values!');

#### Conclusion: Only the last normalization method counts!
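
The same conclusion can be checked without plots. In this sketch, `min_max` and `z_norm` are hypothetical numpy helpers standing in for `MinMaxScaler` and `StandardScaler`, and `v` is a small made-up array.

```python
import numpy as np

def min_max(v):
    return (v - np.min(v)) / (np.max(v) - np.min(v))

def z_norm(v):
    return (v - np.mean(v)) / np.std(v)

v = np.array([1.0, 3.0, 3.0, 5.0, 7.0, 15.0])

# Whatever linear normalization comes first, the result equals the last one alone
print(np.allclose(min_max(z_norm(v)), min_max(v)))   # True
print(np.allclose(z_norm(min_max(v)), z_norm(v)))    # True
print(np.allclose(z_norm(z_norm(v)), z_norm(v)))     # True
print(np.allclose(min_max(min_max(v)), min_max(v)))  # True
```

This works because min-max and Z-normalization each recompute their offset and spread from their input, which cancels any earlier shift or positive rescaling.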

## De-normalize

Reverse a normalization whenever you want to know a value in its original context. For instance, you may want the original value of a point in normalized space, or the original value at the 75th percentile.
 
Remember that linear normalization applies this pattern: 

`xNorm = (x - offset)/spread` 

Where: 
- x is a numeric variable
- offset is a scalar that shifts variable x lower or higher
- spread is a scalar that re-scales variable x to a smaller or larger spread
- xNorm is the normalized variable

Reverse normalization is:

`x = xNorm*spread + offset`
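
As an illustration of the 75th-percentile case mentioned above, the sketch below finds the percentile in normalized space and maps it back to original units with the reverse formula. The seeded random data and the variable names are assumptions made for this example only.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed so the example is repeatable
x = rng.normal(loc=50.0, scale=10.0, size=200)

# Z-normalize, keeping the parameters taken from the original variable
offset = np.mean(x)
spread = np.std(x)
x_norm = (x - offset) / spread

# 75th percentile in normalized space, then back to original units
p75_norm = np.percentile(x_norm, 75)
p75_original_units = p75_norm * spread + offset

print(p75_original_units, np.percentile(x, 75))  # the two values agree
```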

In [None]:
print('Original    ', 'min:{:0.2f}; mean:{:0.2f}; max:{:0.2f}; std:{:0.2f}'.
      format(np.min(x1), np.mean(x1), np.max(x1), np.std(x1)))

# Z-Normalize the variable
offset = np.mean(x1)
spread = np.std(x1)
xNorm = (x1 - offset)/spread

print('Z-Normalized', 'min:{:0.2f}; mean:{:0.2f}; max:{:0.2f}; std:{:0.2f}'.
      format(np.min(xNorm), np.mean(xNorm), np.max(xNorm), np.std(xNorm)))

### The wrong way to denormalize
Get the offset and spread from the normalized variable and apply them to the de-normalization formula.

In [None]:
# Incorrect denormalization
offset = np.mean(xNorm)  # taken from the normalized variable
spread = np.std(xNorm)   # taken from the normalized variable
x1_wrong = xNorm*spread + offset

print('Wrong way De-normalized', 'min:{:0.2f}; mean:{:0.2f}; max:{:0.2f}; std:{:0.2f}'.
      format(np.min(x1_wrong), np.mean(x1_wrong), np.max(x1_wrong), np.std(x1_wrong)))

#### Why is the above denormalization attempt incorrect?
We want the denormalized variable to be the same as the original variable.
Compare the summary statistics (min, mean, max, std) of this denormalization attempt with those of the original and the normalized variables.
What are the values of the spread and offset?

### Correct Denormalization
Get the offset and spread from the original variable and apply them to the de-normalization formula.

In [None]:
# Correct denormalization
offset = np.mean(x1)
spread = np.std(x1)
x1_correct = xNorm*spread + offset

print('Correctly De-normalized', 'min:{:0.2f}; mean:{:0.2f}; max:{:0.2f}; std:{:0.2f}'.
      format(np.min(x1_correct), np.mean(x1_correct), np.max(x1_correct), np.std(x1_correct)))

#### Why is the above denormalization attempt correct?
We want the denormalized variable to be the same as the original variable. Therefore, we need to use the same parameters in the de-normalization formula that we used in the normalization formula.

Compare the summary statistics (min, mean, max, std) of this denormalization attempt with those of the original and the normalized variables.
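
When the normalization was done with scikit-learn, the fitted scaler already stores the original offset and spread (`mean_` and `scale_`), so its `inverse_transform` method applies the reverse formula. A minimal sketch, using a small made-up column vector in place of `x1`:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

data = np.array([[1.0], [3.0], [5.0], [7.0], [15.0]])  # illustrative column vector

scaler = StandardScaler().fit(data)  # stores mean_ and scale_ of the original data
normalized = scaler.transform(data)
restored = scaler.inverse_transform(normalized)

print(np.allclose(restored, data))  # True
```

Keeping the fitted scaler around avoids the mistake shown earlier of recomputing the parameters from the normalized variable.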