# Model Evaluation Metrics

Model evaluation metrics measure how good or bad a regression model's predictions are.

In [3]:
# Supervised Learning
#           └── Regression
#                  └── Model Evaluation Metrics
#                         ├── R (Correlation Coefficient)
#                         ├── R² (R-Squared Score)
#                         ├── MSE
#                         ├── MAE
#                         └── RMSE

## R (Correlation Coefficient):

R measures the strength and direction of the linear relationship between two variables, such as an input feature and the output.
- It tells how strongly the two variables move together.

## Range of R
### -1 ≤ R ≤ 1

- +1 = Perfect positive relationship
- 0 = No relationship
- -1 = Perfect negative relationship

### Example
- Experience ↑ → Salary ↑ → R close to +1
- Speed ↑ → Time ↓ → R close to −1
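The two examples above can be checked numerically. This is a minimal sketch with made-up numbers (the experience, salary and speed values are hypothetical), using `np.corrcoef` to compute R. Travel time is not an exactly linear function of speed, so that R comes out close to −1 rather than exactly −1.

```python
import numpy as np

# Hypothetical data: years of experience vs. salary (in thousands)
experience = np.array([1, 2, 3, 4, 5])
salary = np.array([30, 35, 42, 48, 55])

# Hypothetical data: speed (km/h) vs. travel time (hours) over a fixed 120 km
speed = np.array([40, 60, 80, 100, 120])
travel_time = 120 / speed

# np.corrcoef returns a 2x2 correlation matrix; the off-diagonal entry is R
r_pos = np.corrcoef(experience, salary)[0, 1]
r_neg = np.corrcoef(speed, travel_time)[0, 1]

print("Experience vs. salary R:", round(r_pos, 3))
print("Speed vs. time R:", round(r_neg, 3))
```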

## R² (R-Squared Score)

R² measures how well the regression model explains the variation in the target variable.

### Formula

![image.png](attachment:image.png)

![image-2.png](attachment:image-2.png)

![image.png](attachment:image.png)

or it can be said:

![image.png](attachment:image.png)
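As a minimal sketch, assuming the formula is the standard definition

$$
R^2 = 1 - \frac{SS_{res}}{SS_{tot}} = 1 - \frac{\sum_{i}(y_i - \hat{y}_i)^2}{\sum_{i}(y_i - \bar{y})^2}
$$

R² can be computed by hand with NumPy. The helper name `r_squared` and the sample values are only illustrative.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """R² = 1 - SS_res / SS_tot, computed by hand."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # variation the model leaves unexplained
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total variation around the mean
    return 1 - ss_res / ss_tot

y_true = [50, 60, 70, 80]
y_pred = [48, 63, 68, 79]

print("R² by hand:", r_squared(y_true, y_pred))
```

Here SS_res = 4 + 9 + 4 + 1 = 18 and SS_tot = 500, so R² = 1 − 18/500 = 0.964.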

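## Range of R²
### Typically 0 ≤ R² ≤ 1

- 1 : Perfect prediction
- 0 : Model explains nothing (no better than always predicting the mean)
- 0.80 : Model explains 80% of the variation
- Below 0 : Possible when the model predicts worse than the mean, for example on unseen data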
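The range can be illustrated with three sets of predictions. This is a small sketch with hypothetical values and a hand-written helper (`r_squared` is not a library function). It also shows that R² drops below 0 when predictions are worse than simply predicting the mean.

```python
import numpy as np

def r_squared(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1 - ss_res / ss_tot

y_true = [50, 60, 70, 80]

perfect = r_squared(y_true, [50, 60, 70, 80])     # every prediction exact
mean_only = r_squared(y_true, [65, 65, 65, 65])   # always predict the mean (65)
worse = r_squared(y_true, [80, 70, 60, 50])       # predictions in reversed order

print("Perfect:", perfect)
print("Mean only:", mean_only)
print("Worse than the mean:", worse)
```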

### Example
Suppose house prices depend on many factors and the model scores R² = 0.85.

It means:

- 85% of the variation in price is explained by the model; the remaining 15% is left unexplained (noise, or factors the model does not capture).

## Relationship Between R and R²

- In simple linear regression (one feature, with an intercept), R² is the square of R, so it drops the direction and keeps only the strength.

- Why We Use R² Instead of R:
    
    - R only describes the correlation between two variables.
    
    - R² tells:
        - how much of the variance the model explains
        - goodness of fit
        - model usefulness

That's why R² is widely used in regression evaluation.

R measures correlation strength between variables, while R² measures how well a regression model explains the variance in the target variable and is used as a regression evaluation metric.
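The link between the two can be checked on a small example. This is a sketch with hypothetical data, assuming a straight-line least-squares fit with an intercept (done here with `np.polyfit`). In that setting the square of R equals R², and the negative sign of R is lost.

```python
import numpy as np

# Hypothetical data with a negative trend
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([9.8, 8.1, 6.2, 3.9, 2.0])

# Fit a straight line y = slope * x + intercept by least squares
slope, intercept = np.polyfit(x, y, 1)
y_pred = slope * x + intercept

# Correlation coefficient between x and y
r = np.corrcoef(x, y)[0, 1]

# R² of the fitted line
ss_res = np.sum((y - y_pred) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

print("R:", round(r, 4))          # negative, because y falls as x rises
print("R * R:", round(r * r, 4))
print("R²:", round(r2, 4))
```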

## Adjusted R²

Adjusted R² is a corrected version of R².

- It adjusts the score based on:
    
    - number of features
    - dataset size

![image.png](attachment:image.png)

### What Adjusted R² Checks

- Does the new feature really improve prediction?
- Or is it just increasing complexity?

If a feature is useless:
- R² stays the same or rises slightly, but Adjusted R² decreases.

### Example

Imagine hiring workers:

- R² → judges the team by total output only.

- Adjusted R² → checks whether each new worker actually contributes.

If someone adds no value → the score drops.
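The same idea in numbers, as a minimal sketch: the formula used is Adjusted R² = 1 − (1 − R²)(n − 1) / (n − p − 1), with n samples and p features. The scores and sample size below are hypothetical, chosen so that a fourth feature barely improves R².

```python
def adjusted_r2(r2, n, p):
    """Adjusted R² for n samples and p features."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

n = 50  # hypothetical number of samples

# Hypothetical scores: a 4th feature raises R² only from 0.900 to 0.901
before = adjusted_r2(0.900, n, p=3)
after = adjusted_r2(0.901, n, p=4)

print("Adjusted R² with 3 features:", round(before, 4))
print("Adjusted R² with 4 features:", round(after, 4))
```

R² went up slightly, but the adjusted score goes down because the small gain does not justify the extra feature.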

### R² Using Scikit-Learn

In [50]:
from sklearn.metrics import r2_score

# actual and predicted values
y_true = [50, 60, 70, 80]
y_pred = [48, 63, 68, 79]

# R² score
r2 = r2_score(y_true, y_pred)

print("R² score:", r2)

R² score: 0.964


### The magnitude of R can be derived from R² (simple linear regression only):

In [53]:
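import numpy as np
from sklearn.metrics import r2_score

# The square root gives only a magnitude, never the sign of R, and it is
# undefined when R² is negative. It matches |R| only when the predictions
# come from a least-squares linear fit with an intercept.
r2 = r2_score(y_true, y_pred)
R = np.sqrt(r2)

print("R score:", R)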

R score: 0.9818350166906862


### Adjusted R²:

In [54]:
from sklearn.metrics import r2_score
import numpy as np

# actual and predicted values
y_true = [50, 60, 70, 80]
y_pred = [48, 63, 68, 79]

# number of samples
n = len(y_true)

# number of features (example: assume 2 input features)
p = 2

# calculate R²
r2 = r2_score(y_true, y_pred)

# calculate Adjusted R²
adjusted_r2 = 1 - ((1 - r2) * (n - 1)) / (n - p - 1)

print("R² score:", r2)
print("Adjusted R²:", adjusted_r2)


R² score: 0.964
Adjusted R²: 0.8919999999999999


### Measuring the error of the model's predictions against the true values:

![image.png](attachment:image.png)

        R, R², MSE, MAE, RMSE

## MSE : Mean Squared Error

The average of the squared differences between actual values and predicted values.

### Formula

Squaring each error does two things:

1. It removes negative signs, so positive and negative errors cannot cancel out.
2. It penalizes large mistakes more, because large errors become very large after squaring.

![image.png](attachment:image.png)
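Written out, with $y_i$ the actual value, $\hat{y}_i$ the predicted value and $n$ the number of samples, the standard definition is $\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2$. The steps can be sketched by hand with NumPy; the predictions below are hypothetical, with one deliberately large miss to show the effect of squaring.

```python
import numpy as np

y_true = np.array([50, 60, 70])
y_pred = np.array([51, 59, 80])   # hypothetical predictions; the last one is far off

errors = y_true - y_pred          # [-1, 1, -10]
squared_errors = errors ** 2      # [1, 1, 100]: signs are gone, the big miss dominates
mse = squared_errors.mean()       # (1 + 1 + 100) / 3 = 34.0

print("Squared errors:", squared_errors)
print("MSE:", mse)
```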

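### Advantages :

- Penalizes large errors more heavily than small ones.
- Mathematically convenient to work with.

### Disadvantages :

- Sensitive to outliers.
- Expressed in squared units of the target, so harder to read directly.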

### Summary:
Mean Squared Error is the average of squared differences between actual and predicted values, used to measure regression model accuracy while penalizing large errors.

### Significance:

- It provides a quantitative measure of the accuracy of predictive models.
- It penalizes large errors more heavily than small errors, making it sensitive to outliers.
- It is mathematically convenient (smooth and differentiable), making it a common choice for training and evaluating models.

### MSE Value validation
    Smaller MSE → Better model
    Larger MSE → Poor predictions
    MSE = 0 means perfect prediction.

### MSE Using Scikit-Learn

In [44]:
# example:
from sklearn.metrics import mean_squared_error

y_true = [50, 60, 70]   # actual values
y_pred = [10, 40, 80]   # predicted values

mse = mean_squared_error(y_true, y_pred)
print('MSE:', mse)

# The cutoff of 5 is only illustrative: MSE depends on the scale of the
# target, so a useful threshold has to be chosen per problem.
if mse == 0:
    print('Perfect prediction, the model is great')
elif mse < 5:
    print('Model is good, the predictions are acceptable')
else:
    print("Predictions are far from the actual values, don't consider this model")

MSE: 700.0
Predictions are far from the actual values, don't consider this model


## MAE : Mean Absolute Error

The average of absolute differences between actual values and predicted values.

### Formula

![image.png](attachment:image.png)

### MAE Value Validation
    Smaller MAE → Better model
    MAE = 0 → Perfect prediction
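With $y_i$ the actual value, $\hat{y}_i$ the predicted value and $n$ the number of samples, the standard definition is $\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}|y_i - \hat{y}_i|$. A minimal sketch of the calculation by hand, using hypothetical predictions:

```python
import numpy as np

y_true = np.array([50, 60, 70])
y_pred = np.array([51, 59, 80])         # hypothetical predictions

abs_errors = np.abs(y_true - y_pred)    # [1, 1, 10]: size of each mistake, sign dropped
mae = abs_errors.mean()                 # (1 + 1 + 10) / 3 = 4.0

print("Absolute errors:", abs_errors)
print("MAE:", mae)
```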

## MAE VS MSE :

- MAE : Linear penalty; an error twice as large counts twice as much.
- MSE : Quadratic penalty; large mistakes are strongly punished.

Use cases:

1. Use MAE when every unit of error matters equally, however large the individual error.
    
    Example:
    - Business Forecasting : Being off by 10 or 20 units is acceptable, so there is no need to heavily punish large errors.

2. Use MSE when large mistakes are very dangerous.

     Example:
     - Safety-Critical Systems
        
        - Medical predictions
        - Autonomous driving
        - Risk prediction
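The difference in penalty can be seen with two hypothetical sets of errors that have the same MAE. This is only a sketch: one set spreads the error evenly, the other puts it all into a single large mistake.

```python
import numpy as np

# Two hypothetical sets of prediction errors with the same total absolute error
errors_even = np.array([5, 5, 5, 5])     # four moderate mistakes
errors_spiky = np.array([0, 0, 0, 20])   # one large mistake

for name, errors in [("even", errors_even), ("spiky", errors_spiky)]:
    mae = np.abs(errors).mean()   # linear penalty
    mse = (errors ** 2).mean()    # quadratic penalty
    print(name, "MAE:", mae, "MSE:", mse)
```

Both sets have MAE = 5, but MSE is 25 for the even errors and 100 for the single large one, which is why MSE suits cases where large mistakes are dangerous.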

### Why absolute value | | is used :

The absolute value measures the size of a mistake, not its direction: only the distance of the predicted value from the true value counts, not the sign. Without it, positive and negative errors would cancel out in the average.

### Significance of mean absolute error
- Calculated using absolute differences
- Simple to compute and interpret
- Weights every error in proportion to its size
- Less sensitive to large errors than MSE
- Commonly used to evaluate regression models

### MAE Using Scikit-Learn

In [41]:
from sklearn.metrics import mean_absolute_error

y_true = [50, 60, 70]   # actual values
y_pred = [40, 50, 60]   # predicted values

mae = mean_absolute_error(y_true, y_pred)
print('MAE:', mae)

MAE: 10.0


## RMSE : Root Mean Squared Error
The square root of the average squared prediction error.
- It expresses the typical size of the prediction error in the same unit as the target value.

### Formula

![image.png](attachment:image.png)

![image-2.png](attachment:image-2.png)
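For reference, assuming the standard notation where $y_i$ is the actual value, $\hat{y}_i$ the predicted value and $n$ the number of samples, RMSE can be written as:

$$
\mathrm{RMSE} = \sqrt{\mathrm{MSE}} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}
$$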

## MSE vs RMSE
- MSE squares the errors, so the units change:
    - price unit = ₹
    - MSE unit = ₹²
- Taking the square root brings the value back to the original unit, so RMSE is in the same unit as the target value and is easier to interpret.

### RMSE Using Scikit-Learn

In [2]:
from sklearn.metrics import mean_squared_error
import numpy as np

y_true = [50, 60, 70]
y_pred = [47, 65, 68]

mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)

print(mse)
print(rmse)

12.666666666666666
3.559026084010437
