# Evaluating Regression Performance

## R-Squared Intuition

* Recall that the best regression line through a collection of data points is the one with the smallest sum of squared residuals: $SS_{res}=\sum (y_i-\hat{y}_i)^2$.
* Now consider the squared distances between the data points and the average of $y$ instead. Their sum is known as the total sum of squares: $SS_{tot}=\sum (y_i-\bar{y})^2$.
* $R^2$ is then defined as $R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$
* We can think of the line at $\bar y$ as a 'default' line of best fit. The $R^2$ value is essentially checking how good your regression line (with its minimised $SS_{res}$) is compared to the 'default' line of best fit. 
  * As you minimise $SS_{res}$, $\frac{SS_{res}}{SS_{tot}}$ will become smaller; and therefore $R^2$ will become larger.
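  * For a least-squares line with an intercept, evaluated on the data it was fitted to, $R^2$ is bounded between 0% and 100%. It can fall below 0% when a model fits worse than the 'default' line, for example on new data.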
  * Another way to think of $R^2$ is that it is the percentage of the total variation that is being accounted for (i.e. how much variation in y can be explained by x?):
  
    $R^2 = 1 - \frac{SS_{res}}{SS_{tot}} = \frac{SS_{tot}-SS_{res}}{SS_{tot}} = \frac{\text{explained variation}}{\text{total variation}}$
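
The definition above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the data values and the helper name `r_squared` are made up for illustration. It fits a least-squares line with `np.polyfit` and computes $R^2$ directly from $SS_{res}$ and $SS_{tot}$.

```python
import numpy as np

def r_squared(y, y_hat):
    """Return R^2 = 1 - SS_res / SS_tot for observed y and fitted values y_hat."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)      # squared distances to the fitted line
    ss_tot = np.sum((y - y.mean()) ** 2)   # squared distances to the mean line
    return 1.0 - ss_res / ss_tot

# Illustrative data (hypothetical values, roughly y = 2x)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Least-squares line through the points
slope, intercept = np.polyfit(x, y, deg=1)
y_hat = slope * x + intercept

r2_line = r_squared(y, y_hat)
# Using the mean itself as the prediction gives SS_res == SS_tot, so R^2 is 0
r2_mean = r_squared(y, np.full_like(y, y.mean()))

print(f"R^2 of fitted line:  {r2_line:.4f}")
print(f"R^2 of 'default' line: {r2_mean:.4f}")
```

For a simple linear regression with an intercept, `r2_line` should also match the squared correlation between `x` and `y`, which is a quick sanity check on the calculation.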
    
## Adjusted R-Squared Intuition

* The same concepts can be applied to multiple linear regression. However, whenever you add a variable to your linear regression, $R^2$ will never decrease. This is because either:
    * The regression process finds that the new variable is useful and assigns it a non-zero coefficient. In doing so, $SS_{res}$ is reduced further, causing $R^2$ to increase; or,
    * The regression process finds that the new variable is not useful and assigns it a zero (or close to zero) coefficient, thereby leaving $R^2$ unchanged. In practice, the coefficient is rarely exactly zero, because of small random correlation between $y$ and the new variable.
* Adjusted R-Squared aims to account for this:

    $\text{Adj }R^2 = 1 - (1 - R^2)\frac{n - 1}{n - p - 1}$
    
    Where:
    - $p$: number of regressors (not counting the intercept)
    - $n$: sample size (number of observations)
<br><br>
* Adj. R-Squared includes a penalty for every variable you add to your model. When variables are added, $p$ increases, which decreases the denominator of the last fraction. For a given $R^2$, this increases $(1 - R^2)\frac{n - 1}{n - p - 1}$ and thereby reduces Adj. R-Squared.
  * If your new variable improves the model a lot, the resulting rise in $R^2$ will outweigh the penalty, allowing Adj. R-Squared to increase.
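
The behaviour described in this section can be sketched in code. This is a minimal illustration, assuming NumPy; the synthetic data, the random seed, and the helper names `r_squared`, `adjusted_r_squared`, and `fit_ols` are all hypothetical choices made for the example. It fits one model with a useful regressor and a second model that also includes an unrelated random regressor, then compares $R^2$ and Adj. R-Squared.

```python
import numpy as np

def r_squared(y, y_hat):
    """R^2 = 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(r2, n, p):
    """Adj R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    if n - p - 1 <= 0:
        raise ValueError("need more observations than regressors plus one")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

def fit_ols(X, y):
    """Least-squares fit with an intercept column; returns fitted values."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A @ coef

# Synthetic data: y depends on x1 only; x_noise is unrelated by construction
rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x_noise = rng.normal(size=n)
y = 3.0 * x1 + rng.normal(size=n)

X_base = x1.reshape(-1, 1)                  # p = 1
X_more = np.column_stack([x1, x_noise])     # p = 2

r2_base = r_squared(y, fit_ols(X_base, y))
r2_more = r_squared(y, fit_ols(X_more, y))
adj_base = adjusted_r_squared(r2_base, n, p=1)
adj_more = adjusted_r_squared(r2_more, n, p=2)

print(f"p=1: R^2={r2_base:.4f}  Adj R^2={adj_base:.4f}")
print(f"p=2: R^2={r2_more:.4f}  Adj R^2={adj_more:.4f}")
```

With nested least-squares models like these, `r2_more` cannot be lower than `r2_base`. Whether `adj_more` ends up below `adj_base` depends on how much the extra regressor happens to reduce $SS_{res}$ in the particular sample.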