# Feature Scaling 

**Feature Scaling Formula**

$ x' = \frac{x - x_{min}}{x_{max} - x_{min}} $

* $x$ is the old value, before rescaling.
* $x_{max}$ is the maximum value of the feature before rescaling.
* $x_{min}$ is the minimum value of the feature before rescaling.

### Example

Let the old feature values be $[115, 140, 175]$. Then $x_{min} = 115$ and $x_{max} = 175$. Therefore,
applying feature scaling to $x = 140$ gives $x_{140}' = \frac{140 - 115}{175 - 115} \approx 0.42$.

Notice that the transformed feature will always lie in the range $0 \leq x' \leq 1$.
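
A short justification of that range, as a sketch that assumes $x_{max} > x_{min}$ so the denominator is positive: every value satisfies $x_{min} \leq x \leq x_{max}$. Subtracting $x_{min}$ gives

$ 0 \leq x - x_{min} \leq x_{max} - x_{min} $

and dividing through by $x_{max} - x_{min}$ gives

$ 0 \leq \frac{x - x_{min}}{x_{max} - x_{min}} \leq 1 $

that is, $0 \leq x' \leq 1$, with $x' = 0$ at $x = x_{min}$ and $x' = 1$ at $x = x_{max}$.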


### Which algorithms would be affected by feature rescaling?

* Decision Trees (NO) 
* SVM with RBF kernel (YES) 
* Linear Regression (NO) 
* k-means clustering (YES) 

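* **SVM** - the separating boundary maximizes the distance to the points, and that distance trades off one dimension against another, so rescaling a feature changes the result.
* **k-means** - you compute the distance from each cluster center to all the data points, so the same trade-off between dimensions applies.

* **DT** - gives you vertical and horizontal cuts, each on a single feature, so rescaling one feature moves the cut without changing which points fall on each side.
* **LR** - each feature always goes together with its coefficient, so rescaling a feature only rescales its coefficient and leaves the predictions unchanged.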
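
The distance argument for SVM and k-means can be checked with a small sketch. The data points, labels, and helper names below are made up for illustration (they are not from the course material), and the code uses plain Euclidean distance rather than any particular library. It shows that the nearest neighbour of a query point can change once both features are min-max scaled.

```python
import math


def min_max_scale(values):
    """Rescale a list of numbers to [0, 1]; assumes max(values) > min(values)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / float(hi - lo) for v in values]


def distance(p, q):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Hypothetical data: index 0 is the query point, the rest are candidates.
labels = ["query", "A", "B", "C"]
weights = [140, 150, 115, 175]   # large numeric range
heights = [6.1, 5.2, 6.0, 5.9]   # small numeric range


def nearest_to_query(ws, hs):
    """Return the label of the candidate closest to the query point."""
    points = list(zip(ws, hs))
    query = points[0]
    best = min(range(1, len(points)), key=lambda i: distance(query, points[i]))
    return labels[best]


raw_neighbor = nearest_to_query(weights, heights)
scaled_neighbor = nearest_to_query(min_max_scale(weights), min_max_scale(heights))

print(raw_neighbor)     # A: weight dominates the unscaled distance
print(scaled_neighbor)  # B: after scaling, height counts as much as weight
```

Before scaling, the weight column dominates the distance because its numbers are larger. After scaling, both features contribute on the same footing, and a different point becomes the closest one.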

In [17]:
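def featureScaling(arr):
    """Rescale the values in arr to the range [0, 1] with min-max scaling."""
    # Compute the extremes once, from the argument rather than a global.
    x_min = min(arr)
    x_max = max(arr)
    try:
        x_prime = []
        for x in arr:
            feature_scaling = (x - x_min) / float(x_max - x_min)
            x_prime.append(feature_scaling)
            print(feature_scaling)
    except ZeroDivisionError:
        # All values are identical, so there is no range to rescale over.
        return "Max and Min are the same, you don't need to rescale", arr[0]
    else:
        return x_prime

data = [115, 140, 175]  # the example values from above
featureScaling(data)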

In [26]:
# MinMaxScaler in sklearn
from sklearn.preprocessing import MinMaxScaler
import numpy as np

# One feature (column) with two samples, stored as floats.
weights = np.array([[200000.0], [1000000.0]])
scaler = MinMaxScaler()  # create the scaler instance
rescaled_weight = scaler.fit_transform(weights)  # fit the min/max, then rescale
print(rescaled_weight)  # prints the rescaled feature

[[ 0.]
 [ 1.]]


# How this all works

Once you understand the answer, most of the rest of the syntax used in the project will become much clearer (so bear with me for a while as I take a few steps back and then answer your question).

1. Create an Instance

All of the data processors and classifiers that you will be using in this course begin life as a set of instructions called a class (they are 'blueprints' for what the processor or classifier can do).

You create an instance from that set of instructions. In the case of scaling, the following statement creates a specific instance of the class 'MinMaxScaler':

``scaler = MinMaxScaler()``
where 'scaler' is any name that you choose.

'scaler' is now a data object that has attributes (i.e. variables) and methods (i.e. functions) inside it. Attributes and methods are accessed using 'dot' notation:

``scaler.fit()``
for example, accesses the function/method '.fit()'

2. Accessing attributes and calling methods of that Instance:

I will change the code outlined in previous posts a little to make things clearer (and explain the changes that I made later):

Call the '.fit()' method on the data:

``scaler.fit(finance_features)``
All data processors and classifiers/regressors have a '.fit()' method. That '.fit()' method performs different calculations depending on the purpose of the process: you would expect an instance of 'MinMaxScaler' to calculate the minimum and maximum of each variable passed to '.fit()', whereas you would expect an instance of 'PCA' to calculate the principal components of the variables passed to '.fit()'. The method name is the same; the calculations differ.

Data processors have a '.transform()' method (which creates the scaled variables for 'MinMaxScaler' and the data projected onto the principal components for 'PCA').

(Classifiers don't have a '.transform()' method; they have a '.predict()' method.)

Why is all of that relevant?

Once you call either:

``scaler.fit(finance_features)``
(which calls the '.fit()' method on the data) or

``rescaled_finance_features = scaler.fit_transform(finance_features)``
(which calls '.fit()' on the data and then transforms it), the instance 'scaler' stores the results of '.fit()' as attributes. You can access these attributes (see the documentation), for example using 'scaler.data_min_' to see the minimum values.

So, that 'instance' is now primed and ready to use those values on any data that you want to transform. That is, the statements:

``financial_features_test = numpy.array([[200000., 1000000.]])
financial_features_test_transformed = scaler.transform(financial_features_test)``
create data and then transform (i.e. scale) it by calling the '.transform()' method, using the min and the max of the data passed to '.fit()'. The new data must be a 2-D array with the same number of columns as the fitted data; the single row of two values shown here assumes 'finance_features' has two columns.

3. Wrap-up

You can use data processors to fit and transform in the same step (using '.fit_transform()'), but in trying to understand how the process you are asking about works, it is easier to split the steps into '.fit()' and '.transform()'.

'.fit()' performs the calculations that allow you to '.transform()' (you can see this if you try to call '.transform()' before you call '.fit()', as an error will be thrown). The results of those calculations reside in the instance until you '.fit()' again, at which point those values/attributes are overwritten.
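
To make the fit/transform split concrete, here is a minimal sketch of a single-feature scaler written in plain Python. 'ToyMinMaxScaler' is a hypothetical class invented for this illustration. It is not how sklearn implements 'MinMaxScaler'; it only mimics the idea that '.fit()' stores values on the instance and '.transform()' reuses them.

```python
class ToyMinMaxScaler(object):
    """Hypothetical single-feature scaler, for illustration only."""

    def fit(self, values):
        # Learn and store the statistics; nothing is transformed yet.
        self.data_min_ = min(values)
        self.data_max_ = max(values)
        return self

    def transform(self, values):
        # Refuse to run until fit() has stored the min and max.
        if not hasattr(self, "data_min_"):
            raise RuntimeError("call fit() before transform()")
        span = float(self.data_max_ - self.data_min_)
        return [(v - self.data_min_) / span for v in values]

    def fit_transform(self, values):
        # Both steps in one call, on the same data.
        return self.fit(values).transform(values)


scaler = ToyMinMaxScaler()

# Calling transform() first fails, because nothing has been fitted yet.
try:
    scaler.transform([140])
    failed_before_fit = False
except RuntimeError:
    failed_before_fit = True

# fit() stores the min and max as attributes on the instance.
scaler.fit([115, 140, 175])
fitted_min, fitted_max = scaler.data_min_, scaler.data_max_

# transform() scales new data with the stored min and max.
new_scaled = scaler.transform([145, 205])
print(new_scaled)  # [0.5, 1.5]

# Fitting again overwrites the stored values.
scaler.fit([0, 10])
refit_max = scaler.data_max_
```

Note that 205 maps to 1.5, outside the 0 to 1 range, because it is larger than the maximum that was seen during '.fit()'. The stored values are not updated by '.transform()'.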

