# 1. Performance Measures

* [1 - Regression Problems](#regression)
* [2 - Classification Problems](#classification)

<a class="anchor" id="regression">

## 1. Regression Problems
</a>

* [1.1. - $R^{2}$ Score](#rsquare)
* [1.2. - Adjusted $R^{2}$ Score](#adjusted)
* [1.3. - MAE](#mae)
* [1.4. - RMSE](#mse)
* [1.5. - MedAE](#medae)

Import the needed libraries

In [17]:
import pandas as pd
from sklearn.linear_model import LinearRegression

__`Step 1`__ Import the dataset __Boston.csv__ and define the independent variables as data and the dependent variable (last column) as target.

In [18]:
boston = pd.read_csv(r'./Datasets/Boston.csv')
data_boston = boston.iloc[:,:-1]
target_boston = boston.iloc[:,-1]

__`Step 2`__ By using the method train_test_split from sklearn.model_selection, split your dataset into train(80%) and validation(20%).

In [19]:
from sklearn.model_selection import train_test_split
X_train, X_val, y_train, y_val = train_test_split(data_boston, 
                                                    target_boston, 
                                                    test_size=0.2, 
                                                    random_state=15, 
                                                    shuffle=True, # shuffle the data before splitting
                                                    # don't use stratify, because there's no classes (this is a regression problem, not a classification problem)
                                                   )

__`Step 3`__ Create an instance of LinearRegression named lr with the default parameters and fit it to your train data.

In [20]:
lr = LinearRegression().fit(X_train,y_train)

__`Step 4`__ Now that you have your model created, assign the predictions to y_pred, using the method predict().

In [21]:
y_pred = lr.predict(X_val)

__`Step 5`__ From __sklearn.metrics__ import r2_score, mean_absolute_error, mean_squared_error, median_absolute_error

In [22]:
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error, median_absolute_error

<a class="anchor" id="rsquare">
    
### 1.1. $R^{2}$ Score

</a>

<div class="alert alert-block alert-info">
<a href = 'https://scikit-learn.org/stable/modules/generated/sklearn.metrics.r2_score.html#sklearn.metrics.r2_score'>sklearn.metrics.r2_score(y_true, y_pred, ... )</a>

__Definition:__ <br>
R^2 (coefficient of determination) regression score function.

__Interpretation:__ <br>
Best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). 

__Parameters:__ <br>
_y_true_: Ground truth (correct) target values; <br>
_y_pred_: Estimated target values; <br>
...
</div>

__`Step 6`__ Check the R^2 score of the model you created previously

In [23]:
r2_score(y_val, y_pred)

0.692074903865215

__When to use?__ <br>
When we want to measure how much of the variance in the target variable is explained by the model, that is, by the independent variables. <br>
For example, a value of 0.7 means that the independent variables explain 70% of the variation in the target variable.
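To make this interpretation concrete, here is a minimal sketch that computes R^2 by hand as 1 - SS_res / SS_tot, the usual definition of the coefficient of determination. The helper `r2_by_hand` and the toy values are illustrative assumptions and are not part of the exercise data.

```python
def r2_by_hand(y_true, y_pred):
    """R^2 = 1 - (residual sum of squares) / (total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Toy data (assumed for illustration only)
y_true_toy = [3.0, 5.0, 7.0, 9.0]
good_pred = [2.5, 5.5, 6.5, 9.5]   # close to the true values
bad_pred = [9.0, 7.0, 5.0, 3.0]    # worse than always predicting the mean

r2_good = r2_by_hand(y_true_toy, good_pred)
r2_bad = r2_by_hand(y_true_toy, bad_pred)
print(r2_good, r2_bad)
```

The second set of predictions does worse than simply predicting the mean of the target, which is why its score is negative.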

<a class="anchor" id="adjusted">
    
### 1.2. Adjusted $R^{2}$ Score

</a>

There is no direct way to obtain the adjusted R^2 using sklearn, but we can apply the formula:
<img src="adj_r2.png" alt="Drawing" style="width: 300px;"/> <br>


where n stands for the sample size and p for the number of the regressors.
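For reference, the formula as implemented in the `adj_r2` function of Step 7 can be written out as follows, with $n$ and $p$ as defined above:

$$R^{2}_{adj} = 1 - \left(1 - R^{2}\right)\frac{n - 1}{n - p - 1}$$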

__`Step 7`__ Calculate the Adjusted R^2 Score for your model.

In [24]:
# DO IT
r2 = r2_score(y_val, y_pred)
n = len(y_val)
p = len(X_train.columns)

def adj_r2 (r2,n,p):
    return 1-(1-r2)*(n-1)/(n-p-1)

adj_r2(r2,n,p)

0.64658596920894

__When to use?__ <br>
When we want to measure the amount of variance in the target variable explained by the model while accounting for the number of terms in the model. <br>
It addresses a known problem of R-squared: adding variables to the model, even redundant ones, never decreases R-squared on the data used to fit it. The value either stays the same or increases with each new independent variable.
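The penalty can be seen in a small sketch that reuses the `adj_r2` formula from Step 7. The R^2 value, sample size and numbers of regressors below are assumed toy values, chosen only to show the effect of `p`.

```python
def adj_r2(r2, n, p):
    """Adjusted R^2 for n samples and p regressors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

r2_toy = 0.70   # assumed R^2, identical for both models
n_toy = 100     # assumed sample size

adj_few = adj_r2(r2_toy, n_toy, p=5)     # model with 5 regressors
adj_many = adj_r2(r2_toy, n_toy, p=20)   # model with 20 regressors

print(f"p=5:  {adj_few:.3f}")
print(f"p=20: {adj_many:.3f}")
```

With the same R^2, the model that needs more regressors gets the lower adjusted score.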

__Then what is the advantage of $R^{2}$?__ <br>
It has a direct interpretation as the proportion of variance in the dependent variable that is accounted for by the model.



***
    
However, in some cases we are more interested in quantifying the error in the same unit of measurement as the target variable.
Metrics like MAE, RMSE and MedAE can be used for that.
    
***

<a class="anchor" id="mae">
    
### 1.3. MAE (Mean absolute error)

</a>

<img src="mae.png" alt="Drawing" style="width: 200px;"/>

__`Step 8`__ Check the MAE of the model you created previously

<div class="alert alert-block alert-info">
<a href = 'https://scikit-learn.org/stable/modules/generated/sklearn.metrics.mean_absolute_error.html#sklearn.metrics.mean_absolute_error'>sklearn.metrics.mean_absolute_error(y_true, y_pred, ... )</a>

__Definition:__ <br>
Mean absolute error regression loss.

__Interpretation:__ <br>
Best possible value is 0.0. MAE is always non-negative.

__Parameters:__ <br>
_y_true_: Ground truth (correct) target values; <br>
_y_pred_: Estimated target values; <br>
...
</div>

In [25]:
mean_absolute_error(y_val, y_pred)

3.6860868233802537

__When to use?__ <br>
It measures the average magnitude of the errors in a set of predictions, without considering their direction.
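As a minimal sketch of what "without considering their direction" means, the toy example below (values assumed for illustration) has errors that cancel out when averaged with their sign but not when averaged in absolute value.

```python
# Toy values (assumed for illustration only)
y_true_toy = [3.0, 5.0, 7.0, 9.0]
y_pred_toy = [2.5, 5.5, 6.5, 9.5]

errors = [t - p for t, p in zip(y_true_toy, y_pred_toy)]

# Signed errors cancel each other out
mean_error = sum(errors) / len(errors)

# Absolute errors do not, so MAE reflects the typical size of a miss
mae = sum(abs(e) for e in errors) / len(errors)

print(mean_error, mae)
```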

<a class="anchor" id="mse">
    
### 1.4. RMSE (Root Mean squared error)

</a>

<img src="rmse.png" alt="Drawing" style="width: 250px;"/>

__`Step 9`__ Check the RMSE of the model you created previously

<div class="alert alert-block alert-info">
<a href = 'https://scikit-learn.org/stable/modules/generated/sklearn.metrics.mean_squared_error.html#sklearn.metrics.mean_squared_error'>sklearn.metrics.mean_squared_error(y_true, y_pred, ... )</a>

__Definition:__ <br>
Mean squared error regression loss.

__Interpretation:__ <br>
Best possible value is 0.0. MSE is always non-negative.

__Parameters:__ <br>
_y_true_: Ground truth (correct) target values; <br>
_y_pred_: Estimated target values; <br>
...
</div>

In [26]:
# RMSE = sqrt(MSE); avoids `squared=False`, which was deprecated in scikit-learn 1.4
mean_squared_error(y_val, y_pred) ** 0.5



4.879779243478196

__When to use?__ <br>
Since the errors are squared before they are averaged, the RMSE gives a relatively high weight to large errors. This means the RMSE should be more useful when large errors are particularly undesirable.

__MAE vs. RMSE__ <br>
RMSE penalizes large errors more heavily, so it can be more appropriate in some cases, for example if being off by 20 is more than twice as bad as being off by 10. If being off by 20 is just twice as bad as being off by 10, then MAE is more appropriate.
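The difference can be illustrated with two sets of errors that have the same MAE but different RMSE. This is a minimal sketch in plain Python; the helper names and the error values are assumptions made for illustration.

```python
import math

def mae(errors):
    """Mean of the absolute errors."""
    return sum(abs(e) for e in errors) / len(errors)

def rmse(errors):
    """Square root of the mean of the squared errors."""
    return math.sqrt(sum(e ** 2 for e in errors) / len(errors))

errors_even = [2.0, 2.0, 2.0, 2.0]    # four moderate misses
errors_spiky = [0.0, 0.0, 0.0, 8.0]   # three perfect predictions, one large miss

print(mae(errors_even), rmse(errors_even))
print(mae(errors_spiky), rmse(errors_spiky))
```

Both sets have the same MAE, but the RMSE of the second set is twice as large because of its single large error.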

<a class="anchor" id="medae">
    
### 1.5. MedAE (Median absolute error)

</a>

__`Step 10`__ Check the MedAE score of the model you created previously

<div class="alert alert-block alert-info">
<a href = 'https://scikit-learn.org/stable/modules/generated/sklearn.metrics.median_absolute_error.html#sklearn.metrics.median_absolute_error'>sklearn.metrics.median_absolute_error(y_true, y_pred, ... )</a>

__Definition:__ <br>
Median absolute error regression loss.

__Interpretation:__ <br>
Best possible value is 0.0. MedAE is always non-negative.

__Parameters:__ <br>
_y_true_: Ground truth (correct) target values; <br>
_y_pred_: Estimated target values; <br>
...
</div>

In [27]:
median_absolute_error(y_val, y_pred)

2.8209646896270435

__When to use?__ <br>
Using the median instead of the mean makes the metric robust to outliers: a few very large errors have little or no effect on the result.
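A small sketch of this behaviour, using the standard library only. The absolute errors below are assumed toy values with one deliberate outlier.

```python
from statistics import mean, median

# Absolute errors of five predictions; the last one is an outlier (toy values)
abs_errors = [1.0, 2.0, 3.0, 4.0, 100.0]

mae_toy = mean(abs_errors)      # pulled upwards by the outlier
medae_toy = median(abs_errors)  # unaffected by how large the outlier is

print(mae_toy, medae_toy)
```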

<a class="anchor" id="classification">

## 2. Classification Problems
</a>

* [2.1. - The confusion matrix](#confusion)
* [2.2. - The accuracy Score](#accuracy)
* [2.3. - The precision](#precision)
* [2.4. - The recall](#recall)
* [2.5. - The F1 Score](#f1)


__`Step 11`__ Import the needed libraries to apply logistic regression

In [28]:
# DO IT
from sklearn.linear_model import LogisticRegression

__`Step 12`__ Import the dataset __final_tugas.csv__ and define the independent variables as __data__ and the dependent variable ('DepVar') as __target__. 

In [29]:
tugas = pd.read_csv(r'./Datasets/final_tugas.csv')
data_tugas = tugas.drop(['DepVar'], axis=1)
target_tugas = tugas['DepVar']

__`Step 13`__ By using the method train_test_split from sklearn.model_selection, split your dataset into train(80%) and validation(20%).

In [30]:
X_train, X_val, y_train, y_val = train_test_split(data_tugas, 
                                                  target_tugas, 
                                                  test_size = 0.2, 
                                                  random_state=5, 
                                                  stratify = target_tugas)

__`Step 14`__ Create an instance of LogisticRegression named __log_model__ with the default parameters and fit it to your train data.

In [31]:
log_model = LogisticRegression()

In [32]:
log_model.fit(X_train, y_train)

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(


__`Step 15`__ Now that you have your model created, assign the predictions to y_pred, using the method predict().

In [33]:
# DO IT
y_pred = log_model.predict(X_val)

__`Step 16`__ From __sklearn.metrics__ import confusion_matrix, accuracy_score, precision_score, recall_score, f1_score

The metrics used for classification differ from the ones used for regression. <br>The sklearn library offers a wide range of metrics for this situation. We are going to see the most used ones. 

In [34]:
from sklearn.metrics import confusion_matrix, accuracy_score, precision_score, recall_score, f1_score

<a class="anchor" id="confusion">
    
### 2.1. The confusion matrix

</a>

<div class="alert alert-block alert-info">
<a href = 'https://scikit-learn.org/stable/modules/generated/sklearn.metrics.confusion_matrix.html#sklearn.metrics.confusion_matrix'>sklearn.metrics.confusion_matrix(y_true, y_pred, ...)</a>

__Definition:__ <br>
Compute confusion matrix to evaluate the accuracy of a classification.

__Parameters:__ <br>
_y_true_: Ground truth (correct) target values; <br>
_y_pred_: Estimated targets as returned by a classifier; <br>
...
</div>

__`Step 17`__ Obtain the confusion matrix

In [35]:
confusion_matrix(y_val, y_pred)

array([[455,  10],
       [ 29,   6]])

The confusion matrix in sklearn is presented in the following format, where rows are the actual classes and columns are the predicted classes: <br>
[ [ TN  FP  ] <br>
    [ FN  TP ] ]
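As a minimal sketch, the four cells of the matrix obtained in Step 17 can be unpacked into named counts. The values are copied from the output above into a plain nested list; with the NumPy array returned by `confusion_matrix`, `tn, fp, fn, tp = confusion_matrix(y_val, y_pred).ravel()` gives the same four values for a binary problem.

```python
# Confusion matrix from Step 17, copied by hand: [[TN, FP], [FN, TP]]
cm = [[455, 10],
      [29, 6]]

(tn, fp), (fn, tp) = cm

actual_negatives = tn + fp      # first row: samples whose true class is 0
actual_positives = fn + tp      # second row: samples whose true class is 1
predicted_positives = fp + tp   # second column: samples predicted as 1
total = tn + fp + fn + tp

print(actual_negatives, actual_positives, predicted_positives, total)
```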

<a class="anchor" id="accuracy">
    
### 2.2. The accuracy score

</a>

<img src="accuracy.png" alt="Drawing" style="width: 300px;"/>

<div class="alert alert-block alert-info">
<a href = 'https://scikit-learn.org/stable/modules/generated/sklearn.metrics.accuracy_score.html#sklearn.metrics.accuracy_score'>sklearn.metrics.accuracy_score(y_true, y_pred, normalize=True,...)</a>

__Definition:__ <br>
Accuracy classification score.

__Interpretation:__ <br>
If normalize is True, then the best performance is 1. When normalize = False, then the best performance is the number of samples.

__Parameters:__ <br>
_y_true_: Ground truth (correct) target values; <br>
_y_pred_: Estimated targets as returned by a classifier; <br>
_normalize_: If False, return the number of correctly classified samples. Otherwise, return the fraction of correctly classified samples. <br>
...
</div>

__`Step 18`__ Get the accuracy score

In [36]:
# DO IT
accuracy_score(y_val, y_pred)

0.922

Is accuracy always a good option? Let's check with an example:

<img src="example_1.png" alt="Drawing" style="width: 400px;"/>

In this case, what is the accuracy?

<img src="example_2.png" alt="Drawing" style="width: 300px;"/>

We have an accuracy of 99.1%, which is very high! That is great, right? <br>
Well, not really...<br>
Imagine that we are testing people for Covid. A positive person is someone who is sick and carrying a virus that can spread very quickly, so the cost of misclassifying an actual positive (a false negative) is very high!
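The same effect shows up with the class counts of our own validation set (465 negatives and 35 positives, taken from the rows of the confusion matrix in Step 17). The sketch below scores a hypothetical baseline that always predicts the negative class; the baseline is an assumption introduced for illustration and is not part of the exercise.

```python
# Class counts taken from the confusion matrix in Step 17
n_negative, n_positive = 465, 35
y_true = [0] * n_negative + [1] * n_positive

# Hypothetical baseline: always predict the negative class
y_baseline = [0] * len(y_true)

accuracy = sum(t == p for t, p in zip(y_true, y_baseline)) / len(y_true)
true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_baseline))
recall = true_positives / n_positive

print(accuracy, recall)
```

This baseline reaches an accuracy of 0.93, slightly above the 0.922 of the fitted model, while detecting none of the actual positives.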

<a class="anchor" id="precision">
    
### 2.3. The precision

</a>

<img src="precision.png" alt="Drawing" style="width: 200px;"/>

<div class="alert alert-block alert-info">
<a href = 'https://scikit-learn.org/stable/modules/generated/sklearn.metrics.precision_score.html#sklearn.metrics.precision_score'>sklearn.metrics.precision_score(y_true, y_pred, ...)</a>

__Definition:__ <br>
Compute the precision.

__Interpretation:__ <br>
The best value is 1, and the worst value is 0.

__Parameters:__ <br>
_y_true_: Ground truth (correct) target values; <br>
_y_pred_: Estimated targets as returned by a classifier; <br>
...
</div>

__`Step 19`__ Get the precision score

In [37]:
precision_score(y_val, y_pred)

0.375

If you look at the confusion matrix, you can verify that precision is concerned only with the observations that were predicted as positive:
    
<img src="example_3.png" alt="Drawing" style="width: 400px;"/>

So precision tells us how precise our model is: out of all the observations predicted as positive, how many are actually positive.

__When to use?__

`When the cost of False Positives is high.` <br>
For example, in email spam detection, where a negative is considered not spam and a positive is a spam email. <br>
A false positive will be an email that is considered spam when in reality it was not. The user will potentially lose important information if the precision of the spam detection model is not high.
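As a quick check, precision can be recomputed by hand from the counts in the confusion matrix of Step 17. The second scenario, with fewer false positives, uses an assumed count to show how the score reacts.

```python
# Counts from the confusion matrix in Step 17
tp, fp = 6, 10

# Precision = TP / (TP + FP): share of predicted positives that are correct
precision = tp / (tp + fp)

# Hypothetical scenario (assumed): same true positives, only 2 false positives
precision_fewer_fp = tp / (tp + 2)

print(precision, precision_fewer_fp)
```

The first value matches the output of `precision_score` in Step 19.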

<a class="anchor" id="recall">
    
### 2.4. The recall

</a>
<img src="recall.png" alt="Drawing" style="width: 180px;"/>

<div class="alert alert-block alert-info">
<a href = 'https://scikit-learn.org/stable/modules/generated/sklearn.metrics.recall_score.html#sklearn.metrics.recall_score'>sklearn.metrics.recall_score(y_true, y_pred, ...)</a>

__Definition:__ <br>
Compute the recall.

__Interpretation:__ <br>
The best value is 1 and the worst value is 0.

__Parameters:__ <br>
_y_true_: Ground truth (correct) target values; <br>
_y_pred_: Estimated targets as returned by a classifier; <br>
...
</div>

__`Step 20`__ Get the recall score

In [38]:
recall_score(y_val, y_pred)

0.17142857142857143

Looking at the confusion matrix:
    
<img src="example_4.png" alt="Drawing" style="width: 400px;"/>

Recall calculates how many of the actual positives our model is able to capture by labeling them as positive (true positives).

__When to use?__

`When the cost of False Negatives is high.` <br>
Take the Covid test example from before: if a sick patient (actual positive) takes the test and is predicted as not sick (predicted negative), the risk is extremely high because the sickness is contagious.
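Recall can also be recomputed by hand from the counts in the confusion matrix of Step 17. The sketch below additionally derives the share of actual positives that the model misses; the variable names are illustrative.

```python
# Counts from the confusion matrix in Step 17
tp, fn = 6, 29

# Recall = TP / (TP + FN): share of actual positives that were detected
recall = tp / (tp + fn)

# Share of actual positives that were missed (false negatives)
miss_rate = fn / (tp + fn)

print(round(recall, 4), round(miss_rate, 4))
```

The model detects about 17% of the actual positives and misses roughly 83% of them, which would be unacceptable in a setting like the Covid test.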

<a class="anchor" id="f1">
    
### 2.5. The F1 Score

</a>

<img src="f1.png" alt="Drawing" style="width: 270px;"/>

<div class="alert alert-block alert-info">
<a href = 'https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html#sklearn.metrics.f1_score'>sklearn.metrics.f1_score(y_true, y_pred, ...)</a>

__Definition:__ <br>
Compute the F1 score, also known as balanced F-score or F-measure.

__Interpretation:__ <br>
F1 score reaches its best value at 1 and worst score at 0.

__Parameters:__ <br>
_y_true_: Ground truth (correct) target values; <br>
_y_pred_: Estimated targets as returned by a classifier; <br>
...
</div>

__`Step 21`__ Get the F1 Score

In [39]:
f1_score(y_val, y_pred)

0.23529411764705882

__When to use?__

F1 Score should be used when you want a balance between precision and recall, especially when the class distribution is uneven (a large number of actual negatives).
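The F1 score is the harmonic mean of precision and recall, so it stays close to the lower of the two. A minimal sketch with the counts from the confusion matrix of Step 17, compared with a plain arithmetic mean (variable names are illustrative):

```python
# Counts from the confusion matrix in Step 17
tp, fp, fn = 6, 10, 29

precision = tp / (tp + fp)
recall = tp / (tp + fn)

# Harmonic mean of precision and recall
f1 = 2 * precision * recall / (precision + recall)

# Arithmetic mean, shown only for comparison
arithmetic_mean = (precision + recall) / 2

print(round(f1, 4), round(arithmetic_mean, 4))
```

The harmonic mean is lower than the arithmetic mean here because it weighs the weak recall more heavily.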

__`Step 22`__ To evaluate the results, we are also going to use the classification report method. <br>
Import __classification_report__ from __sklearn.metrics__

In [40]:
# DO IT
from sklearn.metrics import classification_report

__`Step 23`__ Create a function named `metrics` that prints the classification report and the confusion matrix for both datasets (train and validation) _(written for you)_

In [41]:
def metrics(y_train, pred_train , y_val, pred_val):
    print('___________________________________________________________________________________________________________')
    print('                                                     TRAIN                                                 ')
    print('-----------------------------------------------------------------------------------------------------------')
    print(classification_report(y_train, pred_train))
    print(confusion_matrix(y_train, pred_train))


    print('___________________________________________________________________________________________________________')
    print('                                                VALIDATION                                                 ')
    print('-----------------------------------------------------------------------------------------------------------')
    print(classification_report(y_val, pred_val))
    print(confusion_matrix(y_val, pred_val))

__`Step 24`__ Create an object named __labels_train__ that will contain the predicted values for the train set and another one named __labels_val__ that will contain the predicted values for the validation set.

In [42]:
labels_train = log_model.predict(X_train)
labels_val = log_model.predict(X_val)

__`Step 25`__ Call the function metrics() defined previously, and define the arguments: <br> (`y_train = y_train`, `pred_train = labels_train` , `y_val = y_val`, `pred_val = labels_val`)

In [43]:
# DO IT
metrics(y_train = y_train, pred_train = labels_train, y_val = y_val, pred_val = labels_val)

___________________________________________________________________________________________________________
                                                     TRAIN                                                 
-----------------------------------------------------------------------------------------------------------
              precision    recall  f1-score   support

           0       0.95      0.99      0.97      1860
           1       0.64      0.26      0.37       140

    accuracy                           0.94      2000
   macro avg       0.79      0.63      0.67      2000
weighted avg       0.93      0.94      0.93      2000

[[1839   21]
 [ 103   37]]
___________________________________________________________________________________________________________
                                                VALIDATION                                                 
----------------------------------------------------------------------------------------------------------
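The `macro avg` and `weighted avg` rows of the TRAIN report above can be reproduced by hand from the TRAIN confusion matrix. The sketch below does this for the precision column only, in plain Python; the variable names are illustrative.

```python
# TRAIN confusion matrix printed above: [[TN, FP], [FN, TP]]
tn, fp, fn, tp = 1839, 21, 103, 37

# Per-class precision: correct predictions of a class / all predictions of it
precision_0 = tn / (tn + fn)
precision_1 = tp / (tp + fp)

# Support: number of actual samples in each class
support_0 = tn + fp
support_1 = fn + tp

# Macro average: plain mean over classes, every class counts the same
macro_precision = (precision_0 + precision_1) / 2

# Weighted average: mean over classes weighted by support
weighted_precision = (
    precision_0 * support_0 + precision_1 * support_1
) / (support_0 + support_1)

print(round(macro_precision, 2), round(weighted_precision, 2))
```

Because class 0 dominates the data, the weighted average stays close to the class 0 score, while the macro average exposes the weaker performance on class 1.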

Sources: <br>
https://medium.com/human-in-a-machine-world/mae-and-rmse-which-metric-is-better-e60ac3bde13d <br>
https://towardsdatascience.com/accuracy-precision-recall-or-f1-331fb37c5cb9