# Feature scaling or feature normalization

Feature scaling is a method used to standardize the range of independent variables or features of data. In data processing, it is also known as data normalization and is generally performed during the data preprocessing step.

## Feature standardization

Machine learning data comes in many forms, e.g. audio signals or pixel values for images, and it can have many dimensions. Feature standardization rescales each feature so that its values have zero mean and unit variance. This method is widely used for normalization in many machine learning algorithms (e.g., support vector machines, logistic regression, and neural networks).[2] It is typically done by calculating standard scores.[3] The general procedure is to compute the mean and standard deviation of each feature, subtract the mean from every value of that feature, and then divide the mean-centered values by the feature's standard deviation.
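As a compact restatement of the procedure described above, the standard score can be written per feature. The notation is an assumption introduced here for illustration: $x_{ij}$ is the value of feature $j$ in sample $i$, $m$ is the number of samples, and $\mu_j$ and $\sigma_j$ are the mean and standard deviation of feature $j$. The $1/m$ factor corresponds to the population standard deviation, which is what `np.std` computes by default.

$$
\mu_j = \frac{1}{m}\sum_{i=1}^{m} x_{ij}, \qquad
\sigma_j = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(x_{ij}-\mu_j\right)^2}, \qquad
z_{ij} = \frac{x_{ij}-\mu_j}{\sigma_j}
$$

After this transformation each column of $z$ has mean 0 and standard deviation 1, provided $\sigma_j \neq 0$.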

In [10]:
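import numpy as np

# Load the comma-separated data file into a 2-D float array
data = np.genfromtxt('ex1data2', delimiter=',')
# First column, kept 2-D with shape (m, 1)
x = data[:, [0]]
# Second column as a 1-D array
y = data[:, 1]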

In [16]:
def standardization(x):
    # Work on a float copy so the caller's array is not modified in place
    standardized = np.array(x, dtype=float)

    dimension = standardized.shape[1]
    for i in range(dimension):
        # Per-feature (column) mean and population standard deviation
        mean = np.mean(standardized[:, i])
        std  = np.std(standardized[:, i])

        # Subtract the mean, then divide by the standard deviation
        standardized[:, i] = (standardized[:, i] - mean) / std

    return standardized
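
The same per-column computation can also be written without an explicit loop by letting NumPy broadcast over the columns. The following is a minimal sketch, not part of the original notebook: the helper name `standardize_columns` and the `toy` array are hypothetical and exist only for illustration. It assumes that no column is constant, since a zero standard deviation would cause a division by zero.

```python
import numpy as np

def standardize_columns(x):
    """Return a standardized copy of x (hypothetical helper for illustration)."""
    # Convert to a float array so integer input is handled correctly
    x = np.asarray(x, dtype=float)
    # Column-wise mean and population standard deviation (ddof=0)
    mean = x.mean(axis=0)
    std = x.std(axis=0)
    # Broadcasting applies the per-column statistics to every row;
    # the result is a new array, so the input is left unchanged
    return (x - mean) / std

# Toy data made up for this example: 3 samples, 2 features
toy = np.array([[1.0, 10.0],
                [2.0, 20.0],
                [3.0, 60.0]])
z = standardize_columns(toy)

# Each column of z should now have mean ~0 and standard deviation ~1
print(z.mean(axis=0))
print(z.std(axis=0))
```

Checking the column means and standard deviations of the result, as the last two lines do, is a quick sanity test that can be applied to the output of any standardization routine.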

In [17]:
print(standardization(data))

[[  1.31415422e-01  -2.26093368e-01   4.80890225e-01]
 [ -5.09640698e-01  -2.26093368e-01  -8.49833796e-02]
 [  5.07908699e-01  -2.26093368e-01   2.31097448e-01]
 [ -7.43677059e-01  -1.55439190e+00  -8.76398036e-01]
 [  1.27107075e+00   1.10220517e+00   1.61263744e+00]
 [ -1.99450507e-02   1.10220517e+00  -3.27500639e-01]
 [ -5.93588523e-01  -2.26093368e-01  -2.06242009e-01]
 [ -7.29685755e-01  -2.26093368e-01  -1.14317510e+00]
 [ -7.89466782e-01  -2.26093368e-01  -1.03807621e+00]
 [ -6.44465993e-01  -2.26093368e-01  -7.91516995e-01]
 [ -7.71822042e-02   1.10220517e+00  -8.11734851e-01]
 [ -8.65999486e-04  -2.26093368e-01   5.32514582e-02]
 [ -1.40779041e-01  -2.26093368e-01  -8.41830726e-02]
 [  3.15099326e+00   2.43050370e+00   2.90606282e+00]
 [ -9.31923697e-01  -2.26093368e-01  -6.50856985e-01]
 [  3.80715024e-01   1.10220517e+00   8.85085658e-01]
 [ -8.65782986e-01  -1.55439190e+00  -3.27500639e-01]
 [ -9.72625673e-01  -2.26093368e-01  -1.13589150e+00]
 [  7.73743478e-01   1.10220