#### Regression Evaluation Metrics

Model evaluation is an integral part of the model development process. This notebook covers evaluation metrics for regression models. Regression is the task of predicting a numerical target variable from one or more predictors, which may be numerical or categorical. Regression models range from linear regression to artificial neural networks.

##### Code for Regression metrics

In [1]:
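from sklearn.metrics import mean_squared_error
from scipy.stats import pearsonr  # import from scipy.stats; scipy.stats.stats is deprecated
from sklearn.metrics import r2_score, mean_absolute_error
from sklearn.metrics import median_absolute_error
import numpy as np
import pandas as pd  # needed for the DataFrame returned by regression_metrics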

In [2]:
def percentage_error(y_true, y_pred):
    """
    Per-sample relative error used by MAPE.

    Where the actual value is zero, the prediction is divided by the mean
    of the actual values instead, to avoid dividing by zero.
    """
    res = np.empty(y_true.shape)
    for j in range(y_true.shape[0]):
        if y_true[j] != 0:
            res[j] = (y_true[j] - y_pred[j]) / y_true[j]
        else:
            res[j] = y_pred[j] / np.mean(y_true)
    return res

def mean_absolute_percentage_error(y_true, y_pred): 
    return np.mean(np.abs(percentage_error(np.asarray(y_true), np.asarray(y_pred)))) * 100
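
The zero-handling rule in `percentage_error` can be checked on a small worked example. The block below is a minimal, vectorized sketch of the same rule; the function name `mape_with_zero_fallback` and the sample values are hypothetical and used only for illustration. It assumes the mean of the actual values is non-zero.

```python
import numpy as np

def mape_with_zero_fallback(y_true, y_pred):
    """Hypothetical vectorized sketch of the MAPE rule used in this notebook."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    is_zero = y_true == 0
    # Where the actual value is zero, divide the prediction by the mean of
    # the actual values; elsewhere use the usual (actual - predicted) / actual.
    numerator = np.where(is_zero, y_pred, y_true - y_pred)
    denominator = np.where(is_zero, np.mean(y_true), y_true)
    return np.mean(np.abs(numerator / denominator)) * 100

actual = [100.0, 0.0, 200.0]      # mean of actual values is 100
predicted = [110.0, 10.0, 180.0]

# Per-sample absolute errors: |-10/100|, |10/100| (fallback), |20/200|
mape = mape_with_zero_fallback(actual, predicted)
print(round(mape, 6))  # 10.0
```

Each of the three samples contributes a relative error of 0.1, so the result is 10 percent. The fallback keeps the metric finite when an actual value is zero, but the value for those samples depends on the scale of the whole series.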


In [3]:
def regression_metrics(y_true, y_pred):
    
    '''
    Compute a set of metrics for evaluating the performance of a regression model.

    Returns a one-row pandas DataFrame with one column per metric. Comparing
    these statistics helps to assess model performance.
    '''
    
    reg_metric = {}
    
    # Mean Square Error
    reg_metric["mse"] = mean_squared_error(y_true,y_pred)
    # Root Mean Square Error
    reg_metric['rmse'] = np.sqrt(mean_squared_error(y_true,y_pred))
    # Normalized Root Mean Square Error (RMSE divided by the mean of the actual values)
    reg_metric['norm_rmse'] = reg_metric["rmse"]/np.mean(y_true)
    # Mean of the actual values
    reg_metric['actual_mean'] = np.mean(y_true)
    # Median of the actual values
    reg_metric['actual_median'] = np.median(y_true)
    # Standard deviation of the actual values
    reg_metric['actual_std'] = np.std(y_true)
    # Median of the predicted values
    reg_metric['predicted_median'] = np.median(y_pred)
    # Mean of the predicted values
    reg_metric['predicted_mean'] = np.mean(y_pred)
    # Standard deviation of the predicted values
    reg_metric['predicted_std'] = np.std(y_pred)
    # R2 (coefficient of determination): how the predictions compare with always predicting the mean of the actual values
    reg_metric['r2'] = r2_score(y_true,y_pred)
    # Mean of the absolute residuals
    reg_metric['mae'] = mean_absolute_error(y_true,y_pred)    
    # Correlation between actual and predicted values
    reg_metric['corr'] = pearsonr(y_true,y_pred)[0]
    # Median of the absolute residuals
    reg_metric['median_absolute_error'] = median_absolute_error(y_true,y_pred)
    # MAPE
    y_true, y_pred = np.array(y_true), np.array(y_pred)     
    reg_metric['mean_absolute_percentage_error'] = mean_absolute_percentage_error(y_true,y_pred)
    
    return pd.DataFrame([reg_metric])
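
As a minimal sketch of what several of the metrics above measure, the block below recomputes MSE, RMSE, MAE, median absolute error, and R2 directly from their definitions using only NumPy. The toy arrays are made up for illustration and are not data from this notebook.

```python
import numpy as np

# Toy data, chosen only for illustration
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

residuals = y_true - y_pred

# Mean and root mean square error
mse = np.mean(residuals ** 2)
rmse = np.sqrt(mse)

# Mean and median of the absolute residuals
mae = np.mean(np.abs(residuals))
median_ae = np.median(np.abs(residuals))

# R2 = 1 - (residual sum of squares) / (total sum of squares around the mean)
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
r2 = 1.0 - ss_res / ss_tot

print(f"mse={mse:.4f} rmse={rmse:.4f} mae={mae:.4f} "
      f"median_ae={median_ae:.4f} r2={r2:.4f}")
# mse=0.3750 rmse=0.6124 mae=0.5000 median_ae=0.5000 r2=0.9486
```

The single large residual (7.0 versus 8.0) dominates MSE and RMSE, while the median absolute error is unaffected by it. This is one reason to compare several metrics rather than rely on one.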
