## Defining performance metrics on Regression


## Regression Evaluation Metrics


Here are three common evaluation metrics for regression problems:

Mean Absolute Error (MAE) is the mean of the absolute value of the errors:

$$\frac 1n\sum_{i=1}^n|y_i-\hat{y}_i|$$

Mean Squared Error (MSE) is the mean of the squared errors:

$$\frac 1n\sum_{i=1}^n(y_i-\hat{y}_i)^2$$

Root Mean Squared Error (RMSE) is the square root of the mean of the squared errors:

$$\sqrt{\frac 1n\sum_{i=1}^n(y_i-\hat{y}_i)^2}$$

Comparing these metrics:

- MAE is the easiest to understand, because it's the average error.
- MSE is more popular than MAE, because MSE "punishes" larger errors, which tends to be useful in the real world.
- RMSE is even more popular than MSE, because RMSE is interpretable in the "y" units.

All of these are loss functions, because we want to minimize them.

### Mean Absolute Error (MAE)

The Mean Absolute Error (MAE) is an evaluation metric that measures the average
magnitude of residuals in a regression model. Residuals are simply the differences
between the true values of the target variable Y and its corresponding predicted
values.

The MAE score describes the average absolute value of the residuals,
regardless of whether they are positive or negative. This is achieved by using the
absolute value function to calculate the distances between the true values and the
predicted values. Here is the formula for the MAE:

$$\text{MAE} = \frac{1}{n}\sum_{i=1}^{n}|Y_i-\hat{Y}_i|$$

Where:
- Yᵢ = Represents the true value for the iᵗʰ sample
- Ŷᵢ = Represents the predicted value for iᵗʰ sample
- |…| = The absolute value function, i.e., the distance between two values
- Σ = The sum of the absolute differences over all the samples
- n = The total number of data samples

To obtain the Mean Absolute Error, we compute the absolute differences between
the true values Yᵢ and the predicted values Ŷᵢ, sum these differences, and then
divide the sum by the total number n of samples in our dataset.

Unlike metrics that square the residuals, the MAE score is linear in the residuals.
This means that each residual contributes to the score in direct proportion to its
magnitude, which makes the MAE less sensitive to outliers than squared-error metrics.

By not squaring the residuals, the MAE score retains the same unit as the target
variable Y, which makes it easier to interpret. For instance, if we are
predicting house prices in U.S. dollars, the Mean Absolute Error will also be
expressed in U.S. dollars. This characteristic makes it a much more
straightforward metric for explaining the model’s performance to non-technical
stakeholders.
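
The steps described above can be sketched with NumPy. The house prices below are made-up illustrative values, not data from this notebook.

```python
import numpy as np

# Hypothetical true and predicted house prices, in U.S. dollars
y_true = np.array([200_000.0, 150_000.0, 320_000.0, 275_000.0])
y_pred = np.array([210_000.0, 145_000.0, 300_000.0, 280_000.0])

# Residuals: true value minus predicted value
residuals = y_true - y_pred

# MAE: mean of the absolute residuals, expressed in the same unit as y
mae = np.mean(np.abs(residuals))
print(mae)  # 10000.0
```

The result reads directly as "the predictions are off by 10,000 dollars on average".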

### Mean Absolute Percentage Error (MAPE)

The Mean Absolute Percentage Error (MAPE) is very similar to the Mean Absolute
Error; the main difference is that it expresses the average residual as a
percentage of the true value. The formula for the MAPE score is as follows:

$$\text{MAPE} = \frac{100}{n}\sum_{i=1}^{n}\frac{|Y_i-\hat{Y}_i|}{|Y_i|}$$

Where:

- Yᵢ = Represents the true value for the iᵗʰ sample
- Ŷᵢ = Represents the predicted value for iᵗʰ sample
- |…| = The absolute value function, i.e., the distance between two values
- Σ = Is the sum of the absolute percentage errors over all the samples
- n = Is the total number of data samples

To compute the MAPE score, we follow an approach very similar to the one used for
the MAE score. We start by taking the absolute value of the difference between the
true value Yᵢ and the predicted value Ŷᵢ, and then we divide it by the absolute
value of the true value |Yᵢ|. This gives us the absolute percentage error (as a
fraction) for each data point. We sum these values and divide the sum by the total
number of data points n. To obtain percentage values, we then multiply the result
by 100.

By measuring the error of our model as a percentage, the MAPE makes it
easy for non-technical stakeholders to understand what it conveys. If our
house price prediction model has a MAPE of 4, it implies that, on average, our
predictions deviate from the true price by 4%.
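
As a minimal sketch of this computation, the snippet below applies the same steps with NumPy to made-up values (they are illustrative only).

```python
import numpy as np

# Hypothetical true and predicted values
y_true = np.array([100.0, 200.0, 400.0, 500.0])
y_pred = np.array([110.0, 190.0, 380.0, 520.0])

# Absolute percentage error of each sample, as a fraction: 0.10, 0.05, 0.05, 0.04
ape = np.abs(y_true - y_pred) / np.abs(y_true)

# MAPE: mean of those fractions, multiplied by 100
mape = ape.mean() * 100
print(round(mape, 2))  # 6.0
```

A MAPE of 6 means that the predictions are off by 6% of the true value on average.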

### Mean Squared Error (MSE)

The Mean Squared Error (MSE) is another extremely popular metric used to
evaluate regression models. Here’s its formula:

$$\text{MSE} = \frac{1}{n}\sum_{i=1}^{n}(Y_i-\hat{Y}_i)^2$$

Where:
- Yᵢ = Represents the true value for the iᵗʰ sample
- Ŷᵢ = Represents the predicted value for iᵗʰ sample
- (…)² = Represents the squared difference between true and predicted values
- Σ = Is the sum of the squared differences between true and predicted values
- n = Is the total number of data samples

The MSE score is calculated by computing the differences between each true value
Yᵢ and predicted value Ŷᵢ. The differences are then squared, which removes the sign
of the residuals, and the squared differences are summed. We then divide the sum
by the total number n of data samples.

By squaring the errors, we ensure that the MSE score is never negative; it is zero
only when every prediction is exact. Squaring also gives more weight to large
errors, so outliers have a strong influence on the final score. This is
particularly useful in situations where large errors are especially costly.

It is important to notice that the MSE is not in the same unit as the true values in Y.
It is expressed in the square of the original unit, which can make it less intuitive to
interpret, especially for those who do not have a background in statistics.
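
To illustrate how squaring weights large errors, here is a small sketch on made-up numbers. The helper names `mae` and `mse` are defined only for this example. Both sets of predictions have the same MAE, but the one containing a single large error has a much higher MSE.

```python
import numpy as np

def mae(y, y_hat):
    """Mean of the absolute errors."""
    return np.mean(np.abs(y - y_hat))

def mse(y, y_hat):
    """Mean of the squared errors."""
    return np.mean((y - y_hat) ** 2)

y_true = np.array([10.0, 12.0, 14.0, 16.0])
y_pred_even = np.array([11.0, 13.0, 13.0, 15.0])     # every error has magnitude 1
y_pred_outlier = np.array([10.0, 12.0, 14.0, 20.0])  # a single error of magnitude 4

print(mae(y_true, y_pred_even), mse(y_true, y_pred_even))        # 1.0 1.0
print(mae(y_true, y_pred_outlier), mse(y_true, y_pred_outlier))  # 1.0 4.0
```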

### Root Mean Squared Error (RMSE)

The Root Mean Squared Error (RMSE) can be seen as the standard deviation of the
residuals in our regression model (exactly so when the residuals average to zero).
It is simply the square root of the MSE score, so it can be obtained from the output
of the same scikit-learn function used for the MSE. Here is its formula:

$$\text{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(Y_i-\hat{Y}_i)^2}$$

Where:
- Yᵢ = Represents the true value for the iᵗʰ sample
- Ŷᵢ = Represents the predicted value for iᵗʰ sample
- (…)² = Represents the squared difference between true and predicted values
- Σ = Is the sum of the squared differences between true and predicted values
- n = Is the total number of data samples
- √ = Represents the square root function

To put the formula above in simple terms, we obtain the squared differences between Yᵢ
and Ŷᵢ, sum them up, divide by the total number n of samples, and then take the
square root of the result.

Since squaring is also present in the RMSE formula, this metric is also
sensitive to outliers, and it gives more weight to larger errors.

Its interpretability, however, is easier than that of the MSE, since its values are
expressed in the same unit as the target variable Y. For instance, if we’re building a
regression model that predicts the daily returns of a certain stock, an RMSE value
of 2.8 would indicate that the predictions of the model typically differ from the
actual daily returns by about 2.8 percentage points, since those returns are given
in percentage values.

Because it is easy to interpret, the Root Mean Squared Error is
considered the standard evaluation metric for regression models in many fields.
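
A short sketch on made-up values shows the relationship between MSE and RMSE, and the link to the standard deviation of the residuals. The residuals were chosen so that they average to zero, which is the case where the two quantities coincide.

```python
import numpy as np

# Hypothetical values; the residuals are 1, -1, 7, -7 and average to zero
y_true = np.array([10.0, 20.0, 30.0, 40.0])
y_pred = np.array([9.0, 21.0, 23.0, 47.0])
residuals = y_true - y_pred

mse = np.mean(residuals ** 2)      # 25.0, in squared units
rmse = np.sqrt(mse)                # 5.0, back in the unit of y
mae = np.mean(np.abs(residuals))   # 4.0

print(mse, rmse, mae)              # 25.0 5.0 4.0
print(np.std(residuals))           # 5.0, equal to the RMSE because the mean residual is 0
```

The RMSE (5.0) is larger than the MAE (4.0) here because the two large residuals dominate once they are squared.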

### Coefficient of Determination (R²)

The Coefficient of Determination (R² score) is a measure that represents the
proportion of variance of the target variable that is explained by the independent
variables. It is an indication of how well our model fits the data. The formula is
as follows:

$$R^2 = 1 - \frac{\sum_{i=1}^{n}(Y_i-\hat{Y}_i)^2}{\sum_{i=1}^{n}(Y_i-\bar{Y})^2}$$

Where:

- Yᵢ = Represents the true value for the iᵗʰ sample
- Ŷᵢ = Represents the predicted value for iᵗʰ sample
- Ȳ = Represents the mean of the actual values
- Σ = Is the sum over all samples: in the first sum, of the squared differences between true and predicted values; in the second sum, of the squared differences between true values and their mean

In this formula, we calculate the ratio between the sum of squared residuals and
the total sum of squares, which measures the squared distance between each
sample and the mean of all samples, and we subtract that ratio from 1.

The R² score usually ranges between 0 and 1, although it can be negative when the
model fits worse than simply predicting the mean. Generally, a value closer to 1
implies a better-fitting model. However, it’s important to consider other metrics
such as MAE, MSE, MAPE, or RMSE alongside R² to gain a deeper understanding of the
model’s performance, since R² alone can be misleading: it says nothing about the
magnitude of the deviations between predicted and actual values.
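
The following sketch computes R² by hand on made-up values, including a case where the score is exactly 0 and one where it is negative. The helper name `r2` is defined only for this example.

```python
import numpy as np

def r2(y, y_hat):
    """1 minus the ratio of the residual sum of squares to the total sum of squares."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1 - ss_res / ss_tot

y_true = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

good = np.array([1.5, 2.0, 3.0, 4.0, 4.5])   # close to the true values
mean_only = np.full(5, y_true.mean())        # always predicts the mean (3.0)
reversed_ = np.array([5.0, 4.0, 3.0, 2.0, 1.0])  # worse than predicting the mean

print(round(r2(y_true, good), 2))        # 0.95
print(round(r2(y_true, mean_only), 2))   # 0.0
print(round(r2(y_true, reversed_), 2))   # -3.0
```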

### Adjusted R squared

Adjusted R² is a corrected goodness-of-fit (model accuracy) measure for linear models. It estimates the percentage of variance in the target field that is explained by the input or inputs, while penalizing the number of inputs used.

R² tends to optimistically estimate the fit of the linear regression. It never decreases as more effects are included in the model, even when they are irrelevant. Adjusted R² attempts to correct for this overestimation, and it might decrease if a specific effect does not improve the model.

Adjusted R² is calculated by dividing the residual mean square error by the total mean square error (which is the sample variance of the target field). The result is then subtracted from 1.

Adjusted R² is always less than or equal to R². A value of 1 indicates a model that perfectly predicts values in the target field. A value that is less than or equal to 0 indicates a model that has no predictive value. In practice, adjusted R² usually lies between these values.
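
With n samples and p inputs, the residual mean square divides the residual sum of squares by n - p - 1 and the total mean square divides the total sum of squares by n - 1, so the calculation is commonly written as $1-(1-R^2)\frac{n-1}{n-p-1}$. The sketch below applies it to made-up numbers; the function name `adjusted_r2` is hypothetical.

```python
def adjusted_r2(r2, n_samples, n_features):
    """Adjusted R² from a plain R², the sample count n and the feature count p."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_features - 1)

# Hypothetical case: R² = 0.9 measured on 10 samples
with_3_features = adjusted_r2(0.9, n_samples=10, n_features=3)
with_4_features = adjusted_r2(0.9, n_samples=10, n_features=4)

print(round(with_3_features, 2))  # 0.85
print(round(with_4_features, 2))  # 0.82
```

Adding a fourth input that leaves R² unchanged lowers the adjusted score, which is the penalty described above.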

### Which Metric is the Best?

Overall, there is no definitive answer as to which metric is the best. Each of the
metrics covered here has its own advantages and disadvantages.

The Mean Absolute Error (MAE) is a straightforward metric that is very easy to
interpret. It’s less sensitive to outliers, which can be useful in cases where extreme
values should have limited influence on the overall score.

The Mean Absolute Percentage Error (MAPE) also offers easy interpretability by
expressing errors as a percentage of the true values. This feature makes it more
intuitive for non-technical stakeholders to grasp the performance of the model.
However, the MAPE score is undefined when a true value is zero, because the formula
divides by |Yᵢ|, and it becomes very large when true values are close to zero.
Be cautious when using it on such data.
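
The problem with zero targets can be seen in a tiny sketch on made-up values. The masking workaround at the end is only one possible choice, shown as an assumption rather than a recommended standard.

```python
import numpy as np

# Hypothetical data in which one true value is exactly zero
y_true = np.array([0.0, 50.0])
y_pred = np.array([5.0, 45.0])

# Dividing by |0| yields infinity; silence NumPy's divide-by-zero warning for the demo
with np.errstate(divide="ignore"):
    ape = np.abs(y_true - y_pred) / np.abs(y_true)

mape = ape.mean() * 100
print(np.isinf(mape))  # True

# One possible workaround (an assumption): score only the samples with non-zero targets
mask = y_true != 0
mape_nonzero = (np.abs(y_true[mask] - y_pred[mask]) / np.abs(y_true[mask])).mean() * 100
print(round(mape_nonzero, 2))  # 10.0
```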

The Mean Squared Error (MSE) places a higher penalty on larger errors compared
to the MAE. By squaring the errors, the MSE score amplifies the impact of these
errors, which makes it more suitable for situations where larger errors should be
heavily penalized. The main drawback of the MSE score, however, is the fact that
its unit of measurement is different from the original data, which makes it less
intuitive to interpret.

To address the issue of interpretability, the Root Mean Squared Error (RMSE)
retains the beneficial properties of MSE while maintaining the same unit as the
dependent variable Y. This characteristic makes the RMSE score much more easily
interpretable and allows for direct comparison with the original values, which is
why it is a common default choice.

In addition to these metrics, the R² score provides a measure of how well the
model fits the data. It is unitless and easy to interpret, which makes it convenient
for comparing different models on the same data. However, it can be misleading when
assessing complex models, since a high R² on the training data does not account for
overfitting or the model’s ability to generalize to new data. To gain a deeper
understanding of the model’s performance, it is advisable to supplement the R² with
other evaluation metrics.

In [1]:
import pandas as pd

import warnings
warnings.filterwarnings('ignore')

path = "https://frenzy86.s3.eu-west-2.amazonaws.com/python/data/Startup.csv"
df = pd.read_csv(path)
df

Unnamed: 0,R&D Spend,Administration,Marketing Spend,Profit
0,165349.2,136897.8,471784.1,192261.83
1,162597.7,151377.59,443898.53,191792.06
2,153441.51,101145.55,407934.54,191050.39
3,144372.41,118671.85,383199.62,182901.99
4,142107.34,91391.77,366168.42,166187.94
5,131876.9,99814.71,362861.36,156991.12
6,134615.46,147198.87,127716.82,156122.51
7,130298.13,145530.06,323876.68,155752.6
8,120542.52,148718.95,311613.29,152211.77
9,123334.88,108679.17,304981.62,149759.96


In [2]:
## 1 - Declare Features and target
X = df.drop(columns='Profit')
y = df['Profit']

In [3]:
## 2 - Split the data into training and test sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, 
                                                    test_size = 0.2, 
                                                    random_state = 667
                                                    )

In [4]:
## 3 - Create the model and train (fit) it on the training set
from sklearn.linear_model import LinearRegression

model = LinearRegression()
model.fit(X_train, y_train)

In [5]:
## 4 - Predict on the test set
y_pred = model.predict(X_test)

In [6]:
## 5 - Measure the model's error
import numpy as np
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error

mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
# Square root of the MSE; avoids the `squared=False` argument, which newer scikit-learn releases no longer accept
rmse = np.sqrt(mse)
r2score = r2_score(y_test, y_pred)
# Adjusted R²: n = number of test samples, p = number of features
n, p = X_test.shape
ad_r2score = 1 - (1 - r2score) * (n - 1) / (n - p - 1)

print('MAE: ', mae)
print('MSE: ', mse)
print('RMSE: ', rmse)
print('R2_score: ', r2score)
print('Adjusted_R2_score: ', ad_r2score)

MAE:  6863.327578772788
MSE:  81932298.4533166
RMSE:  9051.646173670102
R2_score:  0.9441590602423614
Adjusted_R2_score:  0.9162385903635422
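
As a sanity check, the printed values can be cross-checked against each other using only the standard library. The numbers below are copied from the output above; the test-set size `n = 10` is an assumption based on the 10 predictions the notebook later prints for `X_test`, and `p = 3` is inferred from the three inputs passed to `predict` in the next cell.

```python
import math

# Values copied from the printed output above
mse = 81932298.4533166
rmse_reported = 9051.646173670102
r2 = 0.9441590602423614
adj_r2_reported = 0.9162385903635422

n = 10  # assumed number of test rows
p = 3   # assumed number of features

# RMSE should be the square root of the MSE
rmse = math.sqrt(mse)
print(math.isclose(rmse, rmse_reported, rel_tol=1e-9))

# Adjusted R² recomputed from R², n and p
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
print(math.isclose(adj_r2, adj_r2_reported, rel_tol=1e-9))
```

If both checks print `True`, the reported RMSE and adjusted R² are consistent with the reported MSE and R² under these assumptions.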


In [7]:
model.predict([[324,34,56]])[0]

46990.028023760206

<img src="https://frenzy86.s3.eu-west-2.amazonaws.com/python/savemodel.png" width=600>

## JOBLIB
<img src='https://frenzy86.s3.eu-west-2.amazonaws.com/python/joblib.png' width=2000>

In [8]:
import joblib

## to save a model
joblib.dump(model,'regression_test.pkl')

['regression_test.pkl']

In [9]:
## to load model
newmodel = joblib.load('regression_test.pkl')
newmodel

In [10]:
newmodel.predict([[324,34,56]])[0]

46990.028023760206

## MLEM
<img src='https://frenzy86.s3.eu-west-2.amazonaws.com/python/mlem.png' width=600>

In [11]:
!pip install mlem -q



In [12]:
import mlem

In [13]:
#Save model

mlem.api.save(model,
              'model_', # model_.mlem
              sample_data = X_train #features
              )

MlemModel(location=Location(path='c:/Users/danie/My Drive/lessons/IFOA/IFTS-23/08Boston/model_.mlem', project=None, rev=None, uri='file://c:/Users/danie/My Drive/lessons/IFOA/IFTS-23/08Boston/model_.mlem', project_uri=None, fs=<fsspec.implementations.local.LocalFileSystem object at 0x00000201660EF9D0>), params={}, artifacts={'data': LocalArtifact(uri='model_', size=585, hash='83d8255e757511a07e06888cb538882a')}, requirements=Requirements(__root__=[InstallableRequirement(module='sklearn', version='1.2.1', package_name='scikit-learn', extra_index=None), InstallableRequirement(module='pandas', version='1.5.2', package_name=None, extra_index=None), InstallableRequirement(module='numpy', version='1.23.5', package_name=None, extra_index=None)]), processors_cache={'model': SklearnModel(model=LinearRegression(), io=SimplePickleIO(), methods={'predict': Signature(name='predict', args=[Argument(name='X', type_=DataFrameType(value=None, columns=['', 'R&D Spend', 'Administration', 'Marketing Spend

In [14]:
## Show the saved metadata file (`cat` is not available in the Windows command prompt)
print(open('model_.mlem').read())


In [15]:
## Load Model

new_model = mlem.api.load('model_.mlem')

new_model.predict(X_test)

array([113783.74732396, 161047.15549438,  95980.28965584, 164871.12705721,
        44330.77114766,  67888.27214267, 191528.21340132, 112251.03459869,
        73092.82547184, 116557.72221927])

In [16]:
new_model.predict([[1,1,1],
                   [2,3,4]
                   ])

array([46723.24017642, 46724.11164189])