When selecting a suitable metric, we have to ask ourselves three key questions:

- What kind of data do we analyze?
- Do we need a percentage error metric?
- Is overestimation or underestimation more critical?

# <mark> MAE

# <mark> MSE

# <mark> RMSE

# <mark> R2

Problems with R2 that are corrected with an adjusted R2

- R2 never decreases when a predictor is added to a model, and in practice it almost always increases. A model can therefore appear to fit better simply because it has more terms, which can be completely misleading.
- Similarly, if your model has too many terms and too many high-order polynomials, you can run into the problem of overfitting the data. When you overfit the data, a misleadingly high R2 value can lead to misleading projections.
- R2 tends to reward you for including too many independent variables in a regression model, and it doesn’t provide any incentive to stop adding more. Adjusted R-squared and predicted R-squared use different approaches to help you fight that impulse to add too many.
- The protection that adjusted R-squared and predicted R-squared provide is critical because too many terms in a model can produce results that you can’t trust. These statistics help you include the correct number of independent variables in your regression model.

**Problem 1: R2 never decreases when you add new features**

R-squared never decreases when you add an independent variable to the model, not even when the new variable is related to the response only by chance. A regression model that contains more independent variables than another model can look like it provides a better fit merely because it contains more variables.

**Problem 2: R2 rewards overfitting, and an overfit model fails to generalize well**

When a model contains an excessive number of independent variables and polynomial terms, it becomes overly customized to fit the peculiarities and random noise in your sample rather than reflecting the entire population. Statisticians call this overfitting the model, and it produces deceptively high R-squared values and a decreased capability for precise predictions.
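Problem 1 can be checked numerically. The sketch below is only an illustration on synthetic data: the helper `r_squared`, the seed, and the sample sizes are assumptions made for this example, not part of the original notes. It fits nested least-squares models and records R2 as predictors unrelated to the response are added.

```python
import numpy as np

def r_squared(X, y):
    """R2 of an ordinary least-squares fit with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])      # add the constant term
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ coef
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)                     # arbitrary fixed seed
n = 50
x_useful = rng.normal(size=n)
y = 2.0 * x_useful + rng.normal(size=n)            # only x_useful drives y
noise = rng.normal(size=(n, 10))                   # 10 predictors unrelated to y

r2_values = []
for k in range(11):
    # Model k contains the useful predictor plus the first k noise columns.
    X = np.column_stack([x_useful, noise[:, :k]])
    r2_values.append(r_squared(X, y))

for k, r2 in enumerate(r2_values):
    print(f"{k:2d} noise predictors: R2 = {r2:.4f}")
```

Because each model contains all terms of the previous one, the printed R2 values cannot go down, even though the added columns carry no information about `y`.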

# <mark> Adjusted R2

Adjusted R2 is a special form of R2.
R2 shows how well terms (data points) fit a curve or line. Adjusted R2 also indicates how well terms fit a curve or line, but adjusts for the number of terms in a model. If you add more and more useless variables to a model, adjusted R2 will decrease. If you add more useful variables, adjusted R2 will increase.
Adjusted R2 will always be less than or equal to R2.

The formula is:
![image.png](attachment:04365bd1-2253-483c-bf5f-8aa1be3aea48.png)

where:

- N is the number of points in your data sample.
- K is the number of independent regressors, i.e. the number of variables in your model, excluding the constant.
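Written out with the symbols N and K, the standard definition of adjusted R2 (assumed here to be the one the image shows) is:

$$
R^2_{\text{adj}} = 1 - \frac{(1 - R^2)(N - 1)}{N - K - 1}
$$

Since $(N - 1)/(N - K - 1) \ge 1$ whenever $K \ge 0$, the penalty term can only grow with K, which is why adjusted R2 never exceeds R2.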

Both R2 and the adjusted R2 give you an idea of how closely the data points fall around the line of the regression equation. However, there is one main difference between them: R2 treats every variable in the model as if it explained variation in the dependent variable, whereas the adjusted R2 tells you **the percentage of variation explained by only the independent variables that actually affect the dependent variable.**

Let’s say you are comparing a model with five independent variables to a model with one variable, and the five-variable model has a higher R-squared. Is the model with five variables actually a better model, or does it just have more variables? To determine this, just compare the adjusted R-squared values!
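As a small worked example of that comparison, the sketch below applies the standard adjusted R2 formula to two models. The function name `adjusted_r2` and all the numbers (30 observations, R2 of 0.70 and 0.72) are hypothetical values picked for illustration.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R2 for n observations and k predictors (constant excluded)."""
    if n - k - 1 <= 0:
        raise ValueError("need more than k + 1 observations")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

# Hypothetical numbers, chosen only for illustration.
n = 30
adj_one = adjusted_r2(r2=0.70, n=n, k=1)     # one-variable model
adj_five = adjusted_r2(r2=0.72, n=n, k=5)    # five-variable model

print(f"one variable:   R2 = 0.70, adjusted R2 = {adj_one:.3f}")
print(f"five variables: R2 = 0.72, adjusted R2 = {adj_five:.3f}")
```

With these made-up inputs the five-variable model has the higher R2 but the lower adjusted R2 (about 0.662 versus 0.689), so the four extra variables do not earn their place.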

The adjusted R-squared adjusts for the number of terms in the model. Importantly, its value increases only when the new term improves the model fit more than expected by chance alone. The adjusted R-squared value actually decreases when the term doesn’t improve the model fit by a sufficient amount.

The example below shows how the adjusted R-squared increases up to a point and then decreases. On the other hand, R-squared blithely increases with each and every additional independent variable.

![image.png](attachment:357a34b4-58b8-431d-a8bd-bbc6ddf0ac18.png)

In this example, the researchers might want to include only three independent variables in their regression model. An under-specified model (too few terms) can produce biased estimates. However, an overspecified model (too many terms) can reduce the model’s precision. In other words, both the coefficient estimates and predicted values can have larger margins of error around them. That’s why you don’t want to include too many terms in the regression model!

# <mark> Predicted R2

Use predicted R-squared to determine how well a regression model makes predictions. This statistic helps you identify cases where the model provides a good fit for the existing data but isn’t as good at making predictions. However, even if you aren’t using your model to make predictions, predicted R-squared still offers valuable insights about your model.

Statistical software calculates predicted R-squared using the following procedure:

1. It removes a data point from the dataset.
2. It calculates the regression equation from the remaining data points.
3. It evaluates how well the model predicts the removed observation.
4. It repeats this for all data points in the dataset.
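The procedure above can be sketched directly in NumPy. This is a minimal illustration, not the implementation of any particular statistics package: it assumes the common PRESS-based definition (one minus the sum of squared leave-one-out prediction errors divided by the total sum of squares), and the function name `predicted_r2` and the synthetic data are made up for the example.

```python
import numpy as np

def predicted_r2(X, y):
    """Predicted R2 from leave-one-out refits of an OLS model with intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    n = len(y)
    press = 0.0                                    # sum of squared held-out errors
    for i in range(n):                             # repeat for every data point
        keep = np.arange(n) != i                   # remove one data point
        coef, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)  # refit
        press += (y[i] - A[i] @ coef) ** 2         # error on the removed point
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - press / ss_tot

rng = np.random.default_rng(1)                     # arbitrary fixed seed
x = rng.normal(size=40)
y = 1.5 * x + rng.normal(size=40)
print(f"predicted R2 = {predicted_r2(x, y):.3f}")
```

Refitting the model once per observation is the most literal translation of the steps; for linear least squares the same quantity can also be obtained from a single fit using the leverages of the hat matrix.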

Predicted R-squared helps you determine whether you are overfitting a regression model. Again, an overfit model includes an excessive number of terms, and it begins to fit the random noise in your sample.

By its very definition, it is not possible to predict random noise. Consequently, if your model fits a lot of random noise, the predicted R-squared value must fall. A predicted R-squared that is distinctly smaller than R-squared is a warning sign that you are overfitting the model. Try reducing the number of terms (features).
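To see this warning sign on data, the sketch below compares R2 with predicted R2 as noise terms are added. Everything here is an assumption for illustration: the data is synthetic, the helper `r2_and_predicted_r2` is hypothetical, and instead of refitting once per point it uses the leverage identity for linear least squares (leave-one-out residual = ordinary residual divided by one minus the leverage).

```python
import numpy as np

def r2_and_predicted_r2(X, y):
    """Return (R2, predicted R2) for an OLS fit with intercept.

    Leave-one-out residuals come from the identity e_i / (1 - h_ii),
    where h_ii is the i-th diagonal entry of the hat matrix.
    """
    A = np.column_stack([np.ones(len(y)), X])
    Q, _ = np.linalg.qr(A)                   # orthonormal basis of A's columns
    leverage = np.sum(Q ** 2, axis=1)        # diagonal of the hat matrix
    resid = y - Q @ (Q.T @ y)                # ordinary residuals
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - np.sum(resid ** 2) / ss_tot
    press = np.sum((resid / (1.0 - leverage)) ** 2)
    return r2, 1.0 - press / ss_tot

rng = np.random.default_rng(2)               # arbitrary fixed seed
n = 40
x = rng.normal(size=n)
y = 1.5 * x + rng.normal(size=n)
noise = rng.normal(size=(n, 20))             # terms unrelated to y

results = {}
for k in (0, 5, 10, 20):
    X = np.column_stack([x, noise[:, :k]])
    results[k] = r2_and_predicted_r2(X, y)
    print(f"{k:2d} noise terms: R2 = {results[k][0]:.3f}, "
          f"predicted R2 = {results[k][1]:.3f}")
```

In a setup like this, R2 can only rise as noise terms are added, while the gap between R2 and predicted R2 is expected to widen; that widening gap is the signal to remove terms.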

All study areas involve a certain amount of variability that you can’t explain. If you chase a high R-squared by including an excessive number of variables, you force the model to explain the unexplainable. This is not good. While this approach can obtain higher R-squared values, it comes at the cost of misleading regression coefficients, p-values, and R-squared values, as well as imprecise predictions.

Adjusted R-squared and predicted R-squared help you resist the urge to add too many independent variables to your model.

- Adjusted R-squared compares models with different numbers of variables.
- Predicted R-squared can guard against models that are too complicated.

# <mark> Regression Vs. Correlation
    
|Property|Correlation|Regression|
|:------:|:---------:|:--------:|
|When to use|When summarizing the direct relationship between two variables|To predict or explain the numeric response|
|Able to quantify the direction of the relationship?|Yes|Yes|
|Able to quantify the strength of the relationship?|Yes|Yes|
|Able to show cause and effect?|No|Only with a suitable study design (e.g. a controlled experiment)|
|Able to predict and optimize?|No|Yes|
|X and Y are interchangeable?|Yes|No|
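The last row of the table can be illustrated on synthetic data. The sketch below is only an example with made-up variables `x` and `y`; it uses `np.corrcoef` for the correlation and `np.polyfit` with degree 1 for the simple regression slope.

```python
import numpy as np

rng = np.random.default_rng(3)               # arbitrary fixed seed
x = rng.normal(size=200)
y = 3.0 * x + rng.normal(size=200)

# Correlation is symmetric: swapping the variables gives the same value.
r_xy = np.corrcoef(x, y)[0, 1]
r_yx = np.corrcoef(y, x)[0, 1]

# Regression is not: the slope of y on x differs from the slope of x on y.
slope_y_on_x = np.polyfit(x, y, 1)[0]        # polyfit returns [slope, intercept]
slope_x_on_y = np.polyfit(y, x, 1)[0]

print(f"corr(x, y) = {r_xy:.3f}, corr(y, x) = {r_yx:.3f}")
print(f"slope y~x  = {slope_y_on_x:.3f}, slope x~y  = {slope_x_on_y:.3f}")
```

The two slopes are linked to the correlation by the identity that their product equals the squared correlation coefficient, which is one way to see that regression adds a direction (predictor versus response) that correlation does not have.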