## Feature scaling

A technique for rescaling the dataset to ensure that numerical features have a similar range.

> Feature scaling is a fundamental preprocessing step in machine learning that plays a pivotal role in ensuring fair comparisons and accurate predictions by standardizing the scale of numerical features. Many machine learning algorithms perform better or converge faster when the input numerical features are on a similar scale.

* Rescale (with whichever method) when some features take very small or very large values compared with the others.
* Rescaling rarely hurts &rarr; do it when in doubt.
* Good candidates for rescaling are features with very different ranges:
<table>
    <tbody>
        <tr>
            <th>feature $x_{1}$</th> 
            <th>feature $x_{2}$</th>
            <th> </th>
        </tr>        
        <tr>
            <td>0 $ \leq x_{1} \leq$ 3</td> 
            <td>-2 $ \leq x_{2} \leq$ 0.5</td>
            <td>similar ranges, no need to rescale</td>
        </tr>
        <tr>
            <td>-100 $ \leq x_{1} \leq$ 100</td> 
            <td>-0.001 $ \leq x_{2} \leq$ 0.001</td>
            <td>very different ranges, must rescale</td>
        </tr>
    </tbody>
</table> 
<br>
* A model is more likely to learn a relatively small parameter $w$ when the range of a feature is relatively large, and vice versa.
* <u>Example</u>: $price^{(i)}$ = $w_{1}x_{1}^{(i)}$ + $w_{2}x_{2}^{(i)}$ + $b$
* If $x_{1}$ is in the range 300 - 2000 [area in sq.m.] => the model would probably learn a small $w_{1}$ such as 0.1
* If $x_{2}$ is in the range 0 - 5 [# of rooms] => the model would probably learn a large $w_{2}$ such as 100
![piai2.png](attachment:piai2.png)
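
A quick sketch of why the weights end up inversely sized to the feature ranges: with the example weights above, both terms contribute amounts of a comparable order to the price. The bias `b = 50` and the chosen input values are assumptions made only for illustration.

```python
# Weights follow the example above; b is an assumed value for illustration.
w1, w2, b = 0.1, 100.0, 50.0

area_sqm = 2000.0   # x1, taken from the 300 - 2000 range
rooms = 5.0         # x2, taken from the 0 - 5 range

# Contribution of each feature to the predicted price.
area_term = w1 * area_sqm    # small weight * large value
rooms_term = w2 * rooms      # large weight * small value
price = area_term + rooms_term + b

print(area_term, rooms_term, price)
```

The small $w_{1}$ compensates for the large values of $x_{1}$, and the large $w_{2}$ compensates for the small values of $x_{2}$.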


## Methods for feature scaling

* <b>Divide all values by the max value in the range</b>
<table>
    <tbody>
        <tr>
            <th>feature $x_{1}$</th> 
            <th>feature $x_{2}$</th>
            <th> </th>
        </tr>        
        <tr>
            <td>300 $ \leq x_{1} \leq$ 2000</td> 
            <td>0 $ \leq x_{2} \leq$ 5</td> 
        </tr>
        <tr>
            <td>$x_{1}$_scaled = $x_{1} / 2000$</td> 
            <td>$x_{2}$_scaled = $x_{2} / 5$</td> 
        </tr>
        <tr>
            <td>0.15 $ \leq x_{1}$_scaled $\leq$ 1</td> 
            <td>0 $ \leq x_{2}$_scaled $\leq$ 1</td> 
        </tr>        
    </tbody>
</table> 
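
Dividing by the maximum can be sketched with NumPy as follows. The sample matrix `X` is hypothetical (column 0 = area in sq.m., column 1 = number of rooms) and only reuses the ranges from the table above.

```python
import numpy as np

# Hypothetical sample: column 0 = area in sq.m., column 1 = number of rooms.
X = np.array([
    [300.0, 0.0],
    [850.0, 2.0],
    [1200.0, 3.0],
    [2000.0, 5.0],
])

col_max = X.max(axis=0)     # per-feature maximum, one value per column
X_scaled = X / col_max      # broadcasting divides each column by its own max

print(X_scaled.min(axis=0))  # smallest scaled value per feature
print(X_scaled.max(axis=0))  # largest scaled value per feature
```

This sketch assumes non-negative features with a positive maximum; for features that can be negative, dividing by the maximum absolute value is a common variant.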

* <b>Mean normalization</b>
* Find the average value $mean_{j}$ of each feature $x_{j}$ (i.e. per column $j$)
* Rescale each value by subtracting the mean and dividing by the feature's range: scaled $x_{j}^{(i)}$ = $\frac{ x_{j}^{(i)} - mean_{j} }{ max_{j} - min_{j} }$
* Note: "average" is the arithmetic mean of a group of values: $\frac{ sum\;of\;all\;values }{ num\;of\;all\;values }$
<table>
    <tbody>
        <tr>
            <th>feature $x_{1}$</th> 
            <th>feature $x_{2}$</th>
            <th> </th>
        </tr>        
        <tr>
            <td>300 $ \leq x_{1} \leq$ 2000</td> 
            <td>0 $ \leq x_{2} \leq$ 5</td> 
        </tr>
        <tr>
            <td>$mean_{1}$ = 600</td> 
            <td>$mean_{2}$ = 2.3</td> 
        </tr>
        <tr>
            <td>$x_{1}$_scaled = $\frac{ x_{1} - 600 }{ 2000 - 300 }$</td> 
            <td>$x_{2}$_scaled = $\frac{ x_{2} - 2.3 }{ 5 - 0 }$</td> 
        </tr>    
        <tr>
            <td>-0.18 $ \leq x_{1}$_scaled $\leq$ 0.82</td> 
            <td>-0.46 $ \leq x_{2}$_scaled $\leq$ 0.54</td> 
        </tr>          
    </tbody>
</table>  
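
A minimal NumPy sketch of mean normalization, applied per column. The helper name `mean_normalize` and the sample data are hypothetical, so the sample means differ from the 600 and 2.3 used in the table above. The sketch assumes no column is constant (otherwise `max - min` would be zero).

```python
import numpy as np

def mean_normalize(X):
    """Scale each column to (x - mean) / (max - min)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)                        # mean_j per column
    value_range = X.max(axis=0) - X.min(axis=0)  # max_j - min_j per column
    return (X - mean) / value_range

# Hypothetical sample: column 0 = area in sq.m., column 1 = number of rooms.
X = np.array([
    [300.0, 0.0],
    [400.0, 2.0],
    [500.0, 3.0],
    [2000.0, 5.0],
])
X_scaled = mean_normalize(X)

print(X_scaled.min(axis=0), X_scaled.max(axis=0))
```

After this transformation each column has a mean of 0, and the difference between its largest and smallest scaled value is 1.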

* <b>Z-score normalization</b>
* Find the mean $\mu_{j}$ and the standard deviation $\sigma_{j}$ of each feature (i.e. of each column in the dataset)
* $\sigma_{j}$ = $\sqrt{ \frac { \sum_{i=1}^{N} (x_{j}^{(i)} - \mu_{j})^2 } { N } }$ , where:
* * $N$ = size of the population (number of examples)
* * $x_{j}^{(i)}$ = value of feature $j$ in example $i$
* * $\mu_{j}$ = mean of feature $j$ over the population
* Normalize by the formula: scaled $x_{j}^{(i)}$ = $\frac {x_{j}^{(i)} - \mu_{j}} { \sigma_{j} }$
* After z-score normalization, all features will have a mean of 0 and a standard deviation of 1.
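
The z-score steps above can be sketched with NumPy. The helper name `zscore_normalize` and the sample data are hypothetical; `np.std` with its default `ddof=0` divides by $N$, matching the population formula. The sketch assumes every column has a non-zero standard deviation.

```python
import numpy as np

def zscore_normalize(X):
    """Return the z-scored columns plus the per-column mean and std."""
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)        # mean per column
    sigma = X.std(axis=0)      # population std per column (divides by N)
    return (X - mu) / sigma, mu, sigma

# Hypothetical sample: column 0 = area in sq.m., column 1 = number of rooms.
X = np.array([
    [300.0, 0.0],
    [400.0, 2.0],
    [500.0, 3.0],
    [2000.0, 5.0],
])
X_scaled, mu, sigma = zscore_normalize(X)

print(X_scaled.mean(axis=0))  # close to 0 for every column
print(X_scaled.std(axis=0))   # close to 1 for every column
```

Returning `mu` and `sigma` is a design choice of this sketch: it lets the same values be reused to scale new examples in the same way.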
<br>
<br>
<u>Note</u>
* median = middle (central) element of a sorted collection
* mean = arithmetic mean (average)
* mode = the most frequent element 
![piai3-2.png](attachment:piai3-2.png)
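
The three measures from the note can be compared with Python's standard `statistics` module. The list `values` is a made-up sample containing one large value.

```python
import statistics

values = [1, 2, 2, 3, 10]  # made-up sample with one large value

mean_value = statistics.mean(values)      # arithmetic mean (average)
median_value = statistics.median(values)  # middle element of the sorted values
mode_value = statistics.mode(values)      # most frequent element

print(mean_value, median_value, mode_value)
```

In this sample the single large value pulls the mean above the median and the mode, which both stay at the typical value.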