# Feature Scaling Algorithms (Algoritmos de Feature Scaling)
**[EN-US]**

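Feature scaling puts features with very different ranges on a comparable scale, which helps gradient descent converge faster.

**[PT-BR]**

O feature scaling coloca features com intervalos muito diferentes em uma escala comparável, o que ajuda o gradient descent a convergir mais rápido.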

## Table of Contents
* [Libraries](#Libraries)
* [Z-score Normalization](#Z-score-Normalization-(Normalização-Z-score))
* [Mean Normalization](#Mean-Normalization-(Normalização-da-Média))
* [Feature Scaling](#Feature-Scaling)

## Libraries

In [2]:
import numpy as np

In [3]:
X_train = np.array([[2104, 5, 1, 45], [1416, 3, 2, 40], [852, 2, 1, 35]])
y_train = np.array([460, 232, 178])
X_train.shape

(3, 4)

## Z-score Normalization (Normalização Z-score)
$$\frac{x_i - \mu_i}{\sigma_i}$$
**[EN-US]**

We compute the mean $\mu_i$ and the standard deviation $\sigma_i$ of each feature $i$. Then we subtract the mean from each feature value and divide the result by the standard deviation, so that each feature has a mean of 0 and a standard deviation of 1.

**[PT-BR]**

Calculamos a média $\mu_i$ e o desvio padrão $\sigma_i$ de cada feature $i$. Então subtraímos a média de cada valor da feature e dividimos o resultado pelo desvio padrão, para que cada feature tenha média 0 e desvio padrão 1.

$\frac{\vec{x}_i - \mu_i}{\sigma_i}$

In [1]:
def zscore_normalize_features(X):
    # Per-feature (column-wise) mean and standard deviation
    mu = np.mean(X, axis=0)
    sigma = np.std(X, axis=0)  # population standard deviation (ddof=0)
    X_norm = (X - mu) / sigma

    return X_norm, mu, sigma
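
The normalization described above can be checked numerically. The following is a minimal sketch, not part of the original notebook: it confirms that each normalized column has mean 0 and standard deviation 1, and it reuses the training `mu` and `sigma` to normalize a new example. The values in `x_new` are hypothetical and chosen only for illustration.

```python
import numpy as np

def zscore_normalize_features(X):
    # Same logic as the function above: column-wise mean and std
    mu = np.mean(X, axis=0)
    sigma = np.std(X, axis=0)
    return (X - mu) / sigma, mu, sigma

X_train = np.array([[2104, 5, 1, 45], [1416, 3, 2, 40], [852, 2, 1, 35]])
X_norm, mu, sigma = zscore_normalize_features(X_train)

# Sanity check: every column should now have mean 0 and std 1
print(np.allclose(X_norm.mean(axis=0), 0.0))
print(np.allclose(X_norm.std(axis=0), 1.0))

# Hypothetical new example (illustrative values, not from the dataset).
# It is normalized with the statistics computed on the training set.
x_new = np.array([1200, 3, 1, 40])
x_new_norm = (x_new - mu) / sigma
print(x_new_norm)
```

Reusing the training statistics keeps new inputs on the same scale the model was trained on.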

## Mean Normalization (Normalização da Média)
$$\frac{x_i - \mu_i}{\text{max} - \text{min}}$$
**[EN-US]**

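We subtract the mean from each feature value and divide by the feature's range (max - min). The rescaled feature has mean 0, and its values lie between -1 and 1.

**[PT-BR]**

Subtraímos a média de cada valor da feature e dividimos pela amplitude da feature (max - min). A feature redimensionada tem média 0, e seus valores ficam entre -1 e 1.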

In [2]:
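def mean_normalization(X):
    # Per-feature (column-wise) statistics
    mu = np.mean(X, axis=0)
    x_max = np.max(X, axis=0)  # named x_max to avoid shadowing the built-in max
    x_min = np.min(X, axis=0)  # named x_min to avoid shadowing the built-in min
    X_norm = (X - mu) / (x_max - x_min)

    return X_norm, mu, x_max, x_min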
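
As a quick check of the formula above, the sketch below (an illustration, not part of the original notebook) verifies three properties of mean normalization on the same training data: each column has mean 0, the spread between its largest and smallest value is exactly 1, and every value lies within [-1, 1]. The names `x_max`, `x_min`, `col_means`, and `col_spread` are choices made for this example.

```python
import numpy as np

def mean_normalization(X):
    # Column-wise mean, maximum, and minimum
    mu = np.mean(X, axis=0)
    x_max = np.max(X, axis=0)
    x_min = np.min(X, axis=0)
    return (X - mu) / (x_max - x_min), mu, x_max, x_min

X_train = np.array([[2104, 5, 1, 45], [1416, 3, 2, 40], [852, 2, 1, 35]])
X_norm, mu, x_max, x_min = mean_normalization(X_train)

col_means = X_norm.mean(axis=0)
col_spread = X_norm.max(axis=0) - X_norm.min(axis=0)

print(np.allclose(col_means, 0.0))    # columns are centered at 0
print(np.allclose(col_spread, 1.0))   # max minus min is 1 per column
print(np.all(np.abs(X_norm) <= 1.0))  # values stay within [-1, 1]
```

Note that a feature with the same value in every row has a range of 0, so this formula would divide by zero for it.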

## Feature Scaling
$$\frac{x_i}{\text{max}}$$
Or (ou):
$$\frac{(x_i - \text{min})}{(\text{max} - \text{min})}$$
**[EN-US]**

Divide each positive feature by its maximum value, or rescale each feature using its minimum and maximum values with $\frac{(x_i - \text{min})}{(\text{max} - \text{min})}$. The first method maps positive features to the range (0, 1] and only works for positive features; the second method maps any feature to the range [0, 1].

**[PT-BR]**

Divide cada feature positiva por seu valor máximo ou redimensiona cada feature usando seus valores mínimo e máximo com $\frac{(x_i - \text{min})}{(\text{max} - \text{min})}$. O primeiro método leva features positivas ao intervalo (0, 1] e só funciona para features positivas; o segundo método leva qualquer feature ao intervalo [0, 1].

In [3]:
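def feature_scaling(X):
    # Min-max scaling: maps each feature (column) to the range [0, 1]
    x_max = np.max(X, axis=0)  # named x_max to avoid shadowing the built-in max
    x_min = np.min(X, axis=0)  # named x_min to avoid shadowing the built-in min
    X_norm = (X - x_min) / (x_max - x_min)

    return X_norm, x_max, x_min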
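
The function above implements the min-max formula. The first formula, dividing each feature by its maximum, could be sketched as follows. `max_scaling` is a hypothetical helper name introduced here for illustration, and the sketch assumes every feature value is strictly positive.

```python
import numpy as np

def max_scaling(X):
    """Hypothetical helper: divide each feature (column) by its maximum.

    Assumes all values are strictly positive, so results fall in (0, 1].
    """
    x_max = np.max(X, axis=0)
    return X / x_max, x_max

X_train = np.array([[2104, 5, 1, 45], [1416, 3, 2, 40], [852, 2, 1, 35]])
X_scaled, x_max = max_scaling(X_train)

# The largest value in each column becomes exactly 1,
# and all other values stay above 0.
print(X_scaled.max(axis=0))
print(X_scaled.min() > 0)
```

Unlike min-max scaling, this variant does not send the smallest value of a column to 0; it only preserves the ratios between values.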

In [18]:
X_norm, mu, sigma = zscore_normalize_features(X_train)
X_norm, mu, sigma

(array([[ 1.26311506,  1.33630621, -0.70710678,  1.22474487],
        [-0.08073519, -0.26726124,  1.41421356,  0.        ],
        [-1.18237987, -1.06904497, -0.70710678, -1.22474487]]),
 array([1.45733333e+03, 3.33333333e+00, 1.33333333e+00, 4.00000000e+01]),
 array([5.11961804e+02, 1.24721913e+00, 4.71404521e-01, 4.08248290e+00]))

In [19]:
X_norm, mu, x_max, x_min = mean_normalization(X_train)
X_norm, mu, x_max, x_min

(array([[ 0.51650692,  0.55555556, -0.33333333,  0.5       ],
        [-0.03301384, -0.11111111,  0.66666667,  0.        ],
        [-0.48349308, -0.44444444, -0.33333333, -0.5       ]]),
 array([1.45733333e+03, 3.33333333e+00, 1.33333333e+00, 4.00000000e+01]),
 array([2104,    5,    2,   45]),
 array([852,   2,   1,  35]))

In [22]:
X_norm, x_max, x_min = feature_scaling(X_train)
X_norm, x_max, x_min

(array([[1.        , 1.        , 0.        , 1.        ],
        [0.45047923, 0.33333333, 1.        , 0.5       ],
        [0.        , 0.        , 0.        , 0.        ]]),
 array([2104,    5,    2,   45]),
 array([852,   2,   1,  35]))