# 9. Feature Scaling

Feature scaling is a pre-processing step.
E.g. choosing a T-shirt size based on the feature `height + weight`, with height in feet and weight in pounds.
-> The features are imbalanced: height is in the range [5, 7] while weight is in [115, 175], so weight dominates the sum.
-> **Rescale features so they span comparable ranges, usually between 0 and 1.**

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
where $x'$ is the rescaled feature.

$$0\leq x' \leq 1$$
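- Adv: Standardised. Every rescaled feature lands in the same, predictable range [0, 1].
- Disadv: Outliers can skew the range, because $x_{\min}$ and $x_{\max}$ are set by the most extreme values.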
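
The outlier disadvantage can be seen in a minimal plain-Python sketch. The helper name `min_max_rescale` and the 400 lb outlier are assumptions made up for illustration, not part of any library.

```python
def min_max_rescale(values):
    """Apply x' = (x - x_min) / (x_max - x_min) to a list of numbers."""
    x_min, x_max = min(values), max(values)
    # Assumes x_max > x_min; a constant feature would divide by zero.
    return [(x - x_min) / (x_max - x_min) for x in values]


weights = [115.0, 140.0, 175.0]
with_outlier = weights + [400.0]  # hypothetical outlier, for illustration only

print(min_max_rescale(weights))
print(min_max_rescale(with_outlier))
```

With the outlier included, the three ordinary weights all land below about 0.22, so most of the [0, 1] range is spent on a single point.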



In [2]:
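import numpy

def rescale(input_features):
    # Convert to a float array so the arithmetic below is element-wise.
    features = numpy.asarray(input_features, dtype=float)
    x_min = features.min()
    x_max = features.max()
    # Assumes x_max > x_min; a constant feature would divide by zero.
    output_features = (features - x_min) / (x_max - x_min)
    return output_features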

## MinMaxScaler in `scikit-learn`

In [4]:
from sklearn.preprocessing import MinMaxScaler
import numpy

# Each element of the outer numpy array is a different training point.
# Each element within a training point is a feature.
# There is only one feature here (weight), stored as floats.
weights = numpy.array([[115.], [140.], [175.]])

# Our scaler
scaler = MinMaxScaler()

# Compute the rescaled feature in one call:
# fit finds x_min and x_max,
# transform applies the formula to every element in the data set.
rescaled_weight = scaler.fit_transform(weights)

rescaled_weight

array([[ 0.        ],
       [ 0.41666667],
       [ 1.        ]])
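
As a check on the middle value, the rescaling formula can be applied by hand, with $x_{\min} = 115$ and $x_{\max} = 175$ taken from the `weights` array:

$$x' = \frac{140 - 115}{175 - 115} = \frac{25}{60} \approx 0.41666667$$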

Algorithms affected by feature rescaling:
- SVM with RBF kernel
- K-means clustering
    - K-means computes the distance between each cluster center and the points, which mixes all features into one number. A feature whose range is twice as large counts for twice as much.
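
The effect on distance-based methods can be sketched with a plain Euclidean distance. The three people, their heights and weights, and the assumed feature ranges are all invented for illustration; the helper names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def rescale_point(point, mins, maxs):
    """Min-max rescale each coordinate using per-feature ranges."""
    return tuple((x - lo) / (hi - lo) for x, lo, hi in zip(point, mins, maxs))


# Points are (height in feet, weight in pounds); values are made up.
query = (5.0, 140.0)
much_taller = (7.0, 142.0)      # very different height, similar weight
slightly_heavier = (5.1, 150.0)  # similar height, somewhat heavier

# Assumed feature ranges: height in [5, 7], weight in [115, 175].
mins, maxs = (5.0, 115.0), (7.0, 175.0)

raw_taller = euclidean(query, much_taller)
raw_heavier = euclidean(query, slightly_heavier)

scaled_query = rescale_point(query, mins, maxs)
scaled_taller = euclidean(scaled_query, rescale_point(much_taller, mins, maxs))
scaled_heavier = euclidean(scaled_query, rescale_point(slightly_heavier, mins, maxs))

print("raw:   ", raw_taller, raw_heavier)
print("scaled:", scaled_taller, scaled_heavier)
```

On the raw features the much taller person looks closer, because the 2 ft height gap is numerically small next to a 10 lb weight gap. After rescaling, the ordering flips and the person of similar height is the nearer one.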

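Algorithms not affected: Decision Trees and Linear Regression.
- DT: The decision boundary is a series of vertical and horizontal lines (not diagonal), each cutting on one feature at a time, so there is no trade-off between features.
- LR: Each feature has its own coefficient, so rescaling one feature only changes that coefficient and doesn't affect the other variables.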
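
The linear regression point can be illustrated with a minimal one-feature least-squares fit in plain Python. This is only a sketch of the single-feature case; the helper `fit_line` and the target values in `sizes` are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


weights = [115.0, 140.0, 175.0]
sizes = [1.0, 2.0, 3.0]  # hypothetical target values

# Min-max rescale the single feature.
x_min, x_max = min(weights), max(weights)
scaled = [(w - x_min) / (x_max - x_min) for w in weights]

slope_raw, intercept_raw = fit_line(weights, sizes)
slope_scaled, intercept_scaled = fit_line(scaled, sizes)

pred_raw = [slope_raw * w + intercept_raw for w in weights]
pred_scaled = [slope_scaled * s + intercept_scaled for s in scaled]

print("coefficients:", slope_raw, slope_scaled)
print("predictions: ", pred_raw, pred_scaled)
```

The coefficient grows by a factor of $x_{\max} - x_{\min} = 60$ to absorb the rescaling, while the predictions stay the same up to floating-point rounding.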