In [16]:
import numpy as np

### Feature Scaling - Important Preprocessing Step

Numerical features on different scales lead to slower convergence of iterative optimization procedures.

It is good practice to bring all numerical features onto a similar scale, so that no single feature has a disproportionately large influence on the predictions.

+ Feature scaling **enables faster convergence** in iterative optimization algorithms like gradient descent and its variants
+ The performance of algorithms such as SVM, K-NN, and K-means, which compute Euclidean distances between input samples, degrades if the features are not scaled
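To make the distance argument concrete, here is a minimal sketch using three made-up samples with `age` and `income` columns (the values are hypothetical, chosen only for illustration). It compares the per-feature contribution to the Euclidean distance before and after standardization.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical samples with columns [age, income]
X = np.array([[25, 50000], [60, 50500], [26, 58000]], dtype=float)

# Raw distance between sample 0 and sample 1:
# the ages differ by 35 years, the incomes by only 500
raw_diff = X[0] - X[1]
raw_dist = np.linalg.norm(raw_diff)
print(raw_diff, raw_dist)  # the distance is almost entirely the income gap

# After standardization, both features are measured in standard deviations
X_scaled = StandardScaler().fit_transform(X)
scaled_diff = X_scaled[0] - X_scaled[1]
scaled_dist = np.linalg.norm(scaled_diff)
print(scaled_diff, scaled_dist)  # the age gap is now the larger component
```

On the raw data the 35-year age gap barely changes the distance, because the income column has a much larger numeric range. After scaling, the age difference becomes the dominant term for this pair.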

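Note: **Tree-based** ML algorithms are not affected by feature scaling, because each split compares a single feature against a threshold and the resulting partition does not change when that feature is rescaled monotonically. In other words, feature scaling is not required for tree-based ML algorithms.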
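The note about tree-based models can be checked with a small experiment. The sketch below assumes a toy dataset with hypothetical labels `y` and two hypothetical query points `X_new`; it fits one decision tree on the raw features and one on standardized features and compares their predictions.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Toy data with columns [age, income]; the labels are hypothetical
X = np.array([[30, 50000], [40, 60000], [50, 70000], [60, 80000]], dtype=float)
y = np.array([0, 0, 1, 1])
X_new = np.array([[35, 55000], [55, 75000]], dtype=float)

scaler = StandardScaler().fit(X)

# One tree trained on raw features, one on standardized features
tree_raw = DecisionTreeClassifier(random_state=0).fit(X, y)
tree_scaled = DecisionTreeClassifier(random_state=0).fit(scaler.transform(X), y)

pred_raw = tree_raw.predict(X_new)
pred_scaled = tree_scaled.predict(scaler.transform(X_new))
print(pred_raw, pred_scaled)  # both trees make the same predictions
```

The split thresholds differ between the two trees, but they separate the samples in the same way, so the predictions agree.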

Feature Scaling is only performed on numerical attributes.
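One possible way to scale only the numerical columns is sklearn's `ColumnTransformer`. The sketch below assumes a hypothetical third column, `city_code`, holding an integer-coded category that should be left untouched.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler

# Columns: age, income, city_code (hypothetical integer-coded category)
X = np.array([[30, 50000, 0],
              [40, 60000, 2],
              [50, 70000, 1],
              [60, 80000, 2]], dtype=float)

# Scale columns 0 and 1 only; pass the remaining column through unchanged
ct = ColumnTransformer([("num", StandardScaler(), [0, 1])],
                       remainder="passthrough")
Xt = ct.fit_transform(X)
print(Xt)
```

The transformed columns come first in the output, followed by the passed-through columns.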

Three commonly used feature scalers in sklearn are:
+ StandardScaler
+ MinMaxScaler
+ MaxAbsScaler


#### Standard Scaler

The standard scaler transforms the original feature vector $x$ into a new scaled feature $x'$ using the following formula:
$$ x' = \frac{x - \mu}{\sigma} $$

Here $\mu$ and $\sigma$ are the mean and standard deviation of the original feature $x$. The transformed feature $x'$ has mean 0 and standard deviation 1.

The standard scaler does not restrict the transformed values to any particular range.

In [17]:
from sklearn.preprocessing import StandardScaler

# Assume we have a dataset with two numerical features: 'age' and 'income'
data = [[30, 50000], [40, 60000], [50, 70000], [60, 80000]]

# Create an instance of StandardScaler
scaler = StandardScaler()

# Fit the scaler to the dataset and transform the dataset
scaled_data = scaler.fit_transform(data)

# Print the scaled dataset
print(scaled_data)


[[-1.34164079 -1.34164079]
 [-0.4472136  -0.4472136 ]
 [ 0.4472136   0.4472136 ]
 [ 1.34164079  1.34164079]]
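The scaled values can be reproduced by hand from the formula. This is a minimal sketch that recomputes them with NumPy and compares against `StandardScaler`. Note that `StandardScaler` uses the population standard deviation (`ddof=0`), which is also NumPy's default.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

data = np.array([[30, 50000], [40, 60000], [50, 70000], [60, 80000]], dtype=float)

# Column-wise mean and population standard deviation (ddof=0)
mu = data.mean(axis=0)
sigma = data.std(axis=0)

# Apply x' = (x - mu) / sigma to every column
manual = (data - mu) / sigma

scaler = StandardScaler().fit(data)
print(scaler.mean_)   # learned per-feature means
print(scaler.scale_)  # learned per-feature standard deviations
print(np.allclose(manual, scaler.transform(data)))
```

The fitted statistics are stored in `mean_` and `scale_`, so the same transformation can later be applied to new data with `scaler.transform`.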


#### MinMaxScaler

It transforms the original feature vector $x$ into a new feature vector $x'$ so that all the values fall within the range [0, 1], using the formula:

$$ x' = \frac{x - \min(x)}{\max(x) - \min(x)} $$

The smallest value is mapped to 0, the largest to 1, and the rest lie between 0 and 1.

In [18]:
from sklearn.preprocessing import MinMaxScaler

print(np.array(data))

# create a MinMaxScaler object and transform the data using it
scaler = MinMaxScaler()
scaler.fit_transform(data)

[[   30 50000]
 [   40 60000]
 [   50 70000]
 [   60 80000]]


array([[0.        , 0.        ],
       [0.33333333, 0.33333333],
       [0.66666667, 0.66666667],
       [1.        , 1.        ]])
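As a sanity check, the sketch below recomputes the min-max formula with NumPy and compares it with `MinMaxScaler`. It also transforms a hypothetical new sample, `new_point`, that lies outside the fitted range, to show that the [0, 1] bound holds for the data the scaler was fitted on but not necessarily for unseen data (with the default `clip=False`).

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

data = np.array([[30, 50000], [40, 60000], [50, 70000], [60, 80000]], dtype=float)

# Apply x' = (x - min) / (max - min) column by column
col_min = data.min(axis=0)
col_max = data.max(axis=0)
manual = (data - col_min) / (col_max - col_min)

scaler = MinMaxScaler().fit(data)
print(np.allclose(manual, scaler.transform(data)))

# A hypothetical new sample larger than anything seen during fit
new_point = np.array([[70, 90000]], dtype=float)
new_scaled = scaler.transform(new_point)
print(new_scaled)  # values above 1, since 70 > 60 and 90000 > 80000
```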

#### MaxAbsScaler

It transforms the original feature vector $x$ into a new feature vector $x'$ so that all the values fall within the range [-1, 1], using the formula:

$$ x' = \frac{x}{\max(|x|)} $$

That is, each value of the feature is divided by the maximum absolute value of that feature.

In [19]:
from sklearn.preprocessing import MaxAbsScaler

print(np.array(data))

scaler = MaxAbsScaler()
scaler.fit_transform(data)

[[   30 50000]
 [   40 60000]
 [   50 70000]
 [   60 80000]]


array([[0.5       , 0.625     ],
       [0.66666667, 0.75      ],
       [0.83333333, 0.875     ],
       [1.        , 1.        ]])
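The data above contains only positive values, so the output stays within [0, 1]. The following sketch uses a small hypothetical matrix `X_signed` with negative entries to show the full [-1, 1] range and that signs are preserved.

```python
import numpy as np
from sklearn.preprocessing import MaxAbsScaler

# Hypothetical data with both positive and negative values
X_signed = np.array([[-4.0, 2.0],
                     [2.0, -10.0],
                     [1.0, 5.0]])

scaler = MaxAbsScaler()
X_maxabs = scaler.fit_transform(X_signed)

print(scaler.max_abs_)  # per-feature maximum absolute values: 4 and 10
print(X_maxabs)         # each column divided by its maximum absolute value
```

Because the scaler only divides and never shifts the data, zeros stay zeros, which is why it is often suggested for sparse inputs.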

#### FunctionTransformer

Constructs transformed features by applying a user-defined function.

In [20]:
from sklearn.preprocessing import FunctionTransformer

X = np.array([[128, 2], [2, 256], [4, 1], [512, 64]])

print(X)

[[128   2]
 [  2 256]
 [  4   1]
 [512  64]]


In [21]:
# Create a transformer that applies the base-2 logarithm element-wise
ft = FunctionTransformer(np.log2)

ft.fit_transform(X)

array([[7., 1.],
       [1., 8.],
       [2., 0.],
       [9., 6.]])
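`FunctionTransformer` also accepts an `inverse_func`, which makes the transformation reversible. The sketch below is an illustration rather than part of the original example: it swaps in `np.log1p` (which is defined at zero, unlike `np.log2`) paired with `np.expm1` on a small hypothetical matrix `X_counts`.

```python
import numpy as np
from sklearn.preprocessing import FunctionTransformer

# Hypothetical non-negative data that includes a zero
X_counts = np.array([[0.0, 1.0],
                     [3.0, 7.0]])

# log1p(x) = log(1 + x); expm1 is its exact inverse
ft_log1p = FunctionTransformer(func=np.log1p, inverse_func=np.expm1)

X_log = ft_log1p.fit_transform(X_counts)
X_back = ft_log1p.inverse_transform(X_log)

print(X_log)
print(X_back)  # recovers the original values
```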

#### Polynomial Transformation

Generates a new feature matrix consisting of all polynomial combinations of the features with degree less than or equal to the specified degree.

For example, `PolynomialFeatures` with degree 2 transforms $X = [x_1 , x_2 ]$ into
$$X' = [1, x_1, x_2, x_1^2, x_1 x_2, x_2^2]$$


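`PolynomialFeatures` is used to add nonlinear and interaction terms to the dataset, so that a linear model can capture nonlinear relationships between the features and the target.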

In [22]:
X = np.array([[2, 3],[4, 5],[6,7]])
X

array([[2, 3],
       [4, 5],
       [6, 7]])

In [23]:
from sklearn.preprocessing import PolynomialFeatures

pf = PolynomialFeatures(degree=2)
pf.fit_transform(X)

array([[ 1.,  2.,  3.,  4.,  6.,  9.],
       [ 1.,  4.,  5., 16., 20., 25.],
       [ 1.,  6.,  7., 36., 42., 49.]])