#### Total Sum of Squares

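The total sum of squares ($TSS$) measures the total variation of the response around its mean, $TSS = \sum_{i=1}^N \left( y_i - \bar{y} \right)^2$. For a least-squares fit that includes an intercept, it splits into an explained part and a residual part:

Total Sum of Squares = Explained Sum of Squares + Residual Sum of Squares

$$ TSS=ESS+RSS $$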
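As a quick numerical check of this decomposition, the sketch below fits a one-variable least-squares line in plain Python and compares $TSS$ with $ESS + RSS$. The data values and the helper name `fit_simple_ols` are made up for illustration, and the identity is only expected to hold for a least-squares fit that includes an intercept.

```python
def fit_simple_ols(x, y):
    """Closed-form least-squares fit of y = b0 + b1 * x (hypothetical helper)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Made-up illustrative data, not taken from any real dataset
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

b0, b1 = fit_simple_ols(x, y)
y_hat = [b0 + b1 * xi for xi in x]   # fitted values
y_bar = sum(y) / len(y)              # mean of the actual values

tss = sum((yi - y_bar) ** 2 for yi in y)                  # total
ess = sum((yh - y_bar) ** 2 for yh in y_hat)              # explained
rss = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))     # residual

# The two printed numbers should agree up to floating-point rounding
print(round(tss, 6), round(ess + rss, 6))
```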

#### Residual Sum-of-Squares (aka $RSS$)
The residual sum-of-squares is one of the basic ways of quantifying how much error exists in the fitted model.

$$ RSS = \sum_{i=1}^N r_i^2 = \sum_{i=1}^N \left(y_i - \left(\hat{\beta}_0 + \hat{\beta}_1 x_i\right)\right)^2 $$

**In English: RSS is the sum of the squared differences between each actual data point and its prediction.**

**In code:**

``` 
RSS = sum((Y - Yhat) ** 2) 
```

#### Explained Sum-of-Squares (aka $ESS$)

The explained sum-of-squares measures how much of the variation in the response is explained by the regression model.

$$ESS = \sum_{i=1}^N \left( \hat{y}_i - \bar{y} \right)^2 = \sum_{i=1}^N \left( \left( \hat{\beta}_0 + \hat{\beta}_1 x_i \right) - \bar{y} \right)^2$$

**In English: ESS is the sum of the squared differences between each predicted value and the mean of the actual values.**

**In code:**

``` 
# Yhat holds the fitted values; Ybar is the mean of Y
ESS = sum((Yhat - Ybar) ** 2)
```

#### The Coefficient of Determination ($R^2$)

The coefficient of determination, $R^2$, tells us the proportion of the variance in the response variable $Y$ that is explained by the linear regression model.

$$ R^2 = \frac{ESS}{TSS} $$

```
theRsquared = theESS/(theESS + theRSS)
```
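A small self-contained sketch of the same calculation, assuming plain Python lists. The helper name `r_squared` and the numbers are illustrative only; the fitted values are those of the least-squares line $\hat{y} = 2.2 + 0.6x$ evaluated at $x = 1, \dots, 5$.

```python
def r_squared(y, y_hat):
    """Return R^2 = ESS / (ESS + RSS) for actual values y and fitted values y_hat."""
    y_bar = sum(y) / len(y)
    ess = sum((yh - y_bar) ** 2 for yh in y_hat)             # explained
    rss = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))    # residual
    return ess / (ess + rss)

# Made-up illustrative data
y = [2.0, 4.0, 5.0, 4.0, 5.0]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]

r2 = r_squared(y, y_hat)
print(round(r2, 3))  # 0.6
```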

#### Mean Squared Error and the $F$-Statistic
***
The mean squares are the sums of squares divided by their respective degrees of freedom, where $N$ is the number of observations and $K$ is the number of predictor variables.

mean square of the regression (explained variability):  $$MSR = \frac{ESS}{K}$$

mean squared error of the residuals:           $$MSE = \frac{RSS}{N-K-1}$$

$$F = \frac{MSR}{MSE} = \left( \frac{R^2}{1 - R^2} \right) \left( \frac{N - K - 1}{K} \right)$$
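The two forms of the $F$-statistic can be checked against each other numerically. The sketch below takes the explained mean square as $ESS/K$ and the residual mean square as $RSS/(N-K-1)$; the function names and the input numbers are illustrative assumptions, not values from a real dataset.

```python
def f_statistic(ess, rss, n, k):
    """F from the sums of squares: (ESS / K) / (RSS / (N - K - 1))."""
    ms_explained = ess / k
    ms_residual = rss / (n - k - 1)
    return ms_explained / ms_residual

def f_from_r_squared(r2, n, k):
    """The same statistic written in terms of R^2."""
    return (r2 / (1.0 - r2)) * ((n - k - 1) / k)

# Illustrative values: N = 5 observations, K = 1 predictor
ess, rss, n, k = 3.6, 2.4, 5, 1

f1 = f_statistic(ess, rss, n, k)
f2 = f_from_r_squared(ess / (ess + rss), n, k)
print(round(f1, 6), round(f2, 6))  # both approximately 4.5
```

Turning $F$ into a p-value requires the $F$ distribution with $(K, N-K-1)$ degrees of freedom, for example via `scipy.stats.f.sf` if SciPy is available.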
 


### The $F$-Statistic Revisited

The $F$-statistic can also be used to compare two *nested* models, that is, two models fit to the same dataset where one of the models contains a *subset* of the variables of the other model. The *full* model contains $K$ variables and the *reduced* model contains a subset of these $K$ variables. This allows us to add variables to a base model and then test whether the added variables significantly improve the fit.

$$F = \frac{\left( \frac{RSS_{reduced} - RSS_{full}}{DF_{reduced} - DF_{full}} \right)}{\left( \frac{RSS_{full}}{DF_{full}} \right)}$$

where $DF_x = N - K_x - 1$ is the residual degrees of freedom of model $x$ and $K_x$ is the number of variables in model $x$.
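A minimal sketch of this nested-model formula in plain Python. The function name `nested_f_statistic` and the numbers are illustrative assumptions: the reduced model here is intercept-only ($K = 0$), so its $RSS$ equals the $TSS$, and the full model adds one predictor. In that special case the nested $F$ coincides with the overall $F$-statistic of the full model.

```python
def nested_f_statistic(rss_reduced, rss_full, k_reduced, k_full, n):
    """F-statistic comparing a reduced model against a full model that nests it."""
    df_reduced = n - k_reduced - 1   # residual degrees of freedom, reduced model
    df_full = n - k_full - 1         # residual degrees of freedom, full model
    numerator = (rss_reduced - rss_full) / (df_reduced - df_full)
    denominator = rss_full / df_full
    return numerator / denominator

# Illustrative values: N = 5, intercept-only model vs. one-predictor model
f = nested_f_statistic(rss_reduced=6.0, rss_full=2.4,
                       k_reduced=0, k_full=1, n=5)
print(round(f, 6))  # approximately 4.5
```

A large value of $F$ suggests that the drop in $RSS$ from the added variables is more than would be expected by chance, judged against the $F$ distribution with $(DF_{reduced} - DF_{full}, DF_{full})$ degrees of freedom.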