<h1 align='center'> Performance Metrics for Clustering Problems </h1>

In this note book, I am going to illsutrate how we can evaluate the ML models deployed for clustering task using different kind of evaluation metrics.

For illsutareting purpose I have collected the data from kaggle. I will be doing all the analysis over them only. I have attach the link for each dataset, you can also download the same.


For the implemetations of these metrics, I am using following tools and frameworks:
- Python - as a primary language
- Pandas - as an analytical engine for processing the data
- numpy - for computation using numpy arrays
- matplotlib - for plotting the figures
- sklearn - for implememting the metrics
- seaborn - for graph plotting 

Note that I will be implementing all the metrices from scratch.



Let's write some generic code which will be used throughout this notebook.


## Imports


In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn
import sklearn

## Data Loader


In [3]:
#function for loading a csv file from specified path

def data_loader(path):
    
    #use pandas to load the data from path
    df = pd.read_csv(path)
    
    return df


<h2 align='center'>Market Basket Analysis</h2>


**Problem Statement :**

You own the mall and want to understand the customers like who can be easily converge [Target Customers] so that the sense can be given to marketing team and plan the strategy accordingly.

**About Data :** 

You are owing a supermarket mall and through membership cards , you have some basic data about your customers like Customer ID, age, gender, annual income and spending score. 
Spending Score is something you assign to the customer based on your defined parameters like customer behavior and purchasing data.

**Columns Info:**

 The following describes the dataset columns:

- **CustomerID:** Unique Id for each customer
- **Gender:** Gender of a customer
- **Age:** Age of a customer
- **Annual Income (k$):** Anuual Income of each customer in 1000's
- **Spending Score (1-100):** Score assigned by the mall based on customer behavior and spending nature

**Variable of Interest Seggregation**

This is a clustering problem we don't have any variable attach with data for prediction. 

**Algorithms to be used :**

Since this is a regression problem I will be using `K-Means`, `Hierarchical clustering`, `DBSCAN`, `Gausian Mixture Model(GMM)` and `Agglomerative clustering` which are based on Regression Algorithms for modeling pourpose.

**Evaluations metrics :**

Since this is a clustering problem, I will be using following metrics for evaluating output of the predictive models.

- Cluster Purity
- Homogeneity, completeness, and V-measur
- Rand Index
- Adjusted Rand Index
- Silhouette score
- Dun Index


`Always look at your data`
## Load the data

I have downloaded the data from above link and have stored the same in my local file system. 

In [4]:
#df contains all the data
path= "/Users/ajitkumarsingh/Desktop/Data-Science-Interview-Questions/performance-metrics/data/Mall_Customers.csv"
df = data_loader(path)

# show first row
df.head(1)

Unnamed: 0,CustomerID,Gender,Age,Annual Income (k$),Spending Score (1-100)
0,1,Male,19,15,39


### Basic Exploratory Data Analysis (EDA)

In [5]:
#columns name in churn df
print(f"columns name : {', '.join(df.columns.tolist())}\n")

#total number of columns
print(f"total number of columns : {len(df.columns)}\n")

#total count of the data
print(f"Number of rows in data : {len(df)}\n")



columns name : CustomerID, Gender, Age, Annual Income (k$), Spending Score (1-100)

total number of columns : 5

Number of rows in data : 200



We have 14 attributes and 10000 rows in the churn data.

In [6]:
#number of nulls per attributes
df.isnull().sum()

CustomerID                0
Gender                    0
Age                       0
Annual Income (k$)        0
Spending Score (1-100)    0
dtype: int64

We don't have any null values in of the columns.

In [9]:
# number of unique values
df.nunique()

CustomerID                200
Gender                      2
Age                        51
Annual Income (k$)         64
Spending Score (1-100)     84
dtype: int64

From unique values of `CustomerID` it seems like Primary key in the dataset. 



In [10]:
# data types
df.dtypes

CustomerID                 int64
Gender                    object
Age                        int64
Annual Income (k$)         int64
Spending Score (1-100)     int64
dtype: object

All columns seems to have `numeric` data type except `Gender`.



### Feature Selection

This is the crucial part of a ML modeling. 

For time being I am asssuming all the columns present in `df` are equaly important and are key driving factor in the target value prediction.

In [10]:
redundant_features = []

#drop redundant features(in this case it is empty)
df = df.drop(redundant_features, axis=1)

#show 1 row
df.head(1)

Unnamed: 0,CRIM,ZN,INDUS,CHAS,NOX,RM,AGE,DIS,RAD,TAX,PTRATIO,B,LSTAT,MEDV
0,0.00632,18.0,2.31,0.0,0.538,6.575,65.2,4.09,1,296,15.3,396.9,4.98,24.0


### Features Extraction

Feature extraction is a process in machine learning and data analysis that involves transforming raw data or input features into a set of representative features that are more meaningful and informative for a particular task or problem. It is an important step in the feature engineering process, which involves selecting, transforming, and creating new features from raw data to improve the performance and interpretability of machine learning models.

This this problem we noted that we do have some columns which have like high cardinality and we can reduce cardinality using `binning` process.

In `binning` method, we convert `numeric` columns to `categorical` columns to reduce overall cardinality of an attribute.

Let's say, if the number of distinct values in a column is greater than the `50%` of the total count it means that column has high cardinality

In [11]:
# number of unique values
df.nunique()

CRIM       485
ZN          27
INDUS       77
CHAS         3
NOX         81
RM         446
AGE        349
DIS        412
RAD          9
TAX         66
PTRATIO     46
B          357
LSTAT      439
MEDV       229
dtype: int64

In [12]:
#columns with high cardinality
total_count = df.shape[0]
unique_counts = df.nunique()

#filter for high cardinality
high_cardinality_columns = unique_counts[unique_counts>0.5*total_count].index.tolist()

print("Columns with high cardinalities are {}".format(', '.join(high_cardinality_columns)))

Columns with high cardinalities are CRIM, RM, AGE, DIS, B, LSTAT


Discreatize the categorical columns one by one.

In [13]:
#dicreatization based on quatiles
#classify crimes(CRIM) into HIGH, Medium and Low
df['CRIM_BINNED'] = pd.qcut(df['CRIM'],q=3,labels=['Low', 'Moderate', "High"])

#bin the age coliumn into Young, Adult and Old
df['AGE_BINNED'] = pd.qcut(df['AGE'],q=3,labels=['Young', 'Adult', "Old"])

#bin the average number of room per dwelling(RM) into Low, Moderate and High
df['RM_BINNED'] = pd.qcut(df['RM'], q=3, labels=['Low', 'Moderate', "High"])

#bin distance column (DIS) into Very Near, Near, Far and Very Far
df['DIS_BINNED'] = pd.qcut(df['DIS'], q=4, labels=['Very Near', 'Near', "Far", "Very Far"])

#bin Black population proportion
df['B_BINNED'] = pd.qcut(df['B'], q=2, labels=['Low', 'High'])

#bin % of lower status of the popultion (LSTAT)
df['LSTAT_BINNED'] = pd.qcut(df['LSTAT'], q=3, labels=['Low', 'Medium','High'])




**Note** that there might be many permutation and combinations here like what is optimal number of categories a column should be binned. For time being I have used common intution for transforming the above high cardinality columns.

Also note that while binning we are replacing the numeric values to some non-numeric data type mainly strings here. We need to encode them before we feed the data to our ML models.

There are several techniques for encoding the values into numeric once. Here I am using `One Hot Encoding` method to label the categorical values. In pandas we have `pd.get_dummies(df, columns=column_list)` to get dummies numeric values for non-numeric values. 

This method in pandas append new columns and the number of these new columns are only dependent on the number of `unique` values in the original columns. For example suppose we have a column name `Gender` and it has two distinct values `Male` and `Female`, this methods will create two new columns namely `Gender_Male` and `Gender_Female` and values of `Gender_Male` will be `0` where `Gender` column is `Female` and `1` if the value is `Male`. 

In [14]:
#remove the high catdinality columns now
df_binned = df.drop(high_cardinality_columns, axis=1)

Let's see how many unique values we have per attributes after binning high cardinality columns.

In [15]:
#number of unique values
df_binned.nunique()

ZN               27
INDUS            77
CHAS              3
NOX              81
RAD               9
TAX              66
PTRATIO          46
MEDV            229
CRIM_BINNED       3
AGE_BINNED        3
RM_BINNED         3
DIS_BINNED        4
B_BINNED          2
LSTAT_BINNED      3
dtype: int64

In [16]:
#Encode the categorica columns now using One Hot Encoding
suffix = '_BINNED'
binned_columns = list(filter(lambda x : x[-len(suffix):]==suffix, df.columns))

#use pd.get_dummies to encode the above columns
df_encoded = pd.get_dummies(df, columns=binned_columns)

#now let's see if we have new columns 

In [17]:
#now see the columns name

df_encoded.columns

Index(['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX',
       'PTRATIO', 'B', 'LSTAT', 'MEDV', 'CRIM_BINNED_Low',
       'CRIM_BINNED_Moderate', 'CRIM_BINNED_High', 'AGE_BINNED_Young',
       'AGE_BINNED_Adult', 'AGE_BINNED_Old', 'RM_BINNED_Low',
       'RM_BINNED_Moderate', 'RM_BINNED_High', 'DIS_BINNED_Very Near',
       'DIS_BINNED_Near', 'DIS_BINNED_Far', 'DIS_BINNED_Very Far',
       'B_BINNED_Low', 'B_BINNED_High', 'LSTAT_BINNED_Low',
       'LSTAT_BINNED_Medium', 'LSTAT_BINNED_High'],
      dtype='object')

Clearly `pd.get_dummies` method has appended some new columns in the data.

Just make sure all the columns are of `numeric` data type.

In [18]:
#data types of final df
df_encoded.dtypes

CRIM                    float64
ZN                      float64
INDUS                   float64
CHAS                    float64
NOX                     float64
RM                      float64
AGE                     float64
DIS                     float64
RAD                       int64
TAX                       int64
PTRATIO                 float64
B                       float64
LSTAT                   float64
MEDV                    float64
CRIM_BINNED_Low           uint8
CRIM_BINNED_Moderate      uint8
CRIM_BINNED_High          uint8
AGE_BINNED_Young          uint8
AGE_BINNED_Adult          uint8
AGE_BINNED_Old            uint8
RM_BINNED_Low             uint8
RM_BINNED_Moderate        uint8
RM_BINNED_High            uint8
DIS_BINNED_Very Near      uint8
DIS_BINNED_Near           uint8
DIS_BINNED_Far            uint8
DIS_BINNED_Very Far       uint8
B_BINNED_Low              uint8
B_BINNED_High             uint8
LSTAT_BINNED_Low          uint8
LSTAT_BINNED_Medium       uint8
LSTAT_BI

We don't have any non numeric data type columns.

## Modeling

### Train/Test Split

For evaluating a model, we need some data to test it on, once training part is done. We usually split the data into two parts i.e train and test. 
On train data we update the models parameters and on test data we see how the trained model is performing. 

For spliting the data, I am using `train_test_split` function of module `sklearn.model_selection`. 

Also we need to segregate the `label` from the rest of the data.


In [19]:
from sklearn.model_selection import train_test_split

#split the data into test and train in 85:15 ratio and drop the label from the rest of the data

train_data, test_data, train_labels, test_labels = train_test_split(
        df_encoded.drop(['MEDV'], axis=1), 
        df_encoded['MEDV'], 
        test_size=0.15, 
        random_state=42
    )

#count after split
print(f"Total count : {len(df_encoded)}\n")
print(f"Train Data count : {len(train_data)}\n")
print(f"Test Data count : {len(test_data)}\n")
print(f"Train Labels count : {len(train_labels)}\n")
print(f"Test Labels count : {len(test_labels)}\n")

# number of rows and labels should match
assert len(train_data)==len(train_labels)
assert len(test_data)==len(test_labels)


Total count : 506

Train Data count : 430

Test Data count : 76

Train Labels count : 430

Test Labels count : 76



We have stored train and test data in `train_data`, `test_data` and train and test labels in `train_labels`, `test_labels`. I will be using them during training and testing time accordingly.

### Model Selection

We have numerous models out there to solve same type of poblems. But the interesting part is like we don't know which model will be best fitting our dataset and perform well on the test dataset. We need to make choice here and to do that we need to evaluate these models one by one using some performance metrics.

I will be training following models and evaluating their performance on test data

- Linear Regressor
- Random Forest Regressor
- Gradient Boosting Regressor
- Neighrest Neighbour Regressor

Here, I will be using `sklearn` module of `scikit-learn` library to implement the above mentioned models.

In [20]:
#import the above models
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.ensemble import GradientBoostingRegressor

In [26]:
#create a list that contains one instances of each model type

model_instances = {
          'linear_regressor':LinearRegression(),
          'random_forest_regressor': RandomForestRegressor(n_estimators=4, max_depth=4, max_features=4), 
          'knn_regressor':KNeighborsRegressor(), 
          'gradient_boosting_regressor':GradientBoostingRegressor(n_estimators=4, max_depth=4)
          }


### Model Training

In the above step we have created instances of each model and stored in `model_instances` dict type variable. Now we need to train these instances by feeding the train distribution. 

Once the training is done save the trained instances in the same dict `model_instances`

In [27]:
#train the models on training set and save the trained model for evaluation

for model_name, model in model_instances.items():

    #fit the model with train_data and train_labels
    model.fit(train_data,train_labels)
    
    #save the trained model
    model_instances[model_name] = model
    

### Metrics For Model Evaluation

Now we have trained our models and they are ready to make predictions over test dataset. To make sure the model is predicting meaningfull values not random output we need some metrics to evaluate the output. This is where performance metrics come into picture. 

Let's implement some of the most widely used evaluation metrics for regresion problems from scratch.

#### Performance Metrics Implementation

##### Mean Squared Error (MSE)

It estimated as average squared differences of predicted values and the actual values. For an ideal model `mse` would be equal to `0`. Lower `mse` depicts more accurate predictions. It is expressed in squared units of the target variable

Mathematically it is expressed as :

$$
    MSE = \frac{1}{n} \sum_{i=1}^{n} (y_{\text{pred}, i} - y_{\text{true}, i})^2
$$

Note that the above expression is `differential` and hence can we calculate `gradient` for optimizing purposes.



In [28]:
"""
    Calculate Mean Squared Error (MSE) between true and predicted values.

    Parameters:
        -- y_true (numpy array or list): True values
        -- y_pred (numpy array or list): Predicted values

    Returns:
        -- mse (float): Mean Squared Error
    """

def mean_squared_error(y_true, y_pred):

    #convert then into umpy array if not already
    y_true = np.array(y_true)
    y_pred = np.array(y_pred)
    #squared the differences
    squared_diff = (y_true - y_pred)**2
    #find the avg of squared_diff
    mse = np.mean(squared_diff)
    return mse


**Cautions**

- **Sensitive to outliers:** MSE is sensitive to outliers because if a predicted value (`y_pred_i`) is an outlier, the squared difference with the corresponding true value (`y_true_i`) will be larger, potentially resulting in an inflated MSE.

- **Imbalanced errors:** MSE treats all errors equally, regardless of their magnitude or direction. This means that overestimation and underestimation errors are weighted equally, even though they may have different implications or costs in practice.

- **Not robust to non-Gaussian errors:** MSE assumes that the errors follow a Gaussian (normal) distribution, which may not always be the case in real-world scenarios. If the errors are not normally distributed, MSE may not accurately reflect the model's performance.

- **Lack of sensitivity to small errors:** MSE may not capture small errors well, as it squares the differences between predicted and true values. This can lead to a model with good MSE but poor performance in capturing small errors, which may be important in certain applications.



##### Root Mean Squared Error(RMSE)

This is root of mean squared error we talked about. The good thing about this metric is like the unit is in sync with target variable and hence can be used for comparing multiple models at once. Since `RMSE` is expressed in the same units as the dependent variable, making it easy to interpret in the context of the original data

The expression of `RMSE` is differentiable just like we have for `MSE`.

In [29]:
"""
    Calculate Root Mean Squared Error (RMSE) between true and predicted values.

    Parameters:
        -- y_true (numpy array or list): True values
        -- y_pred (numpy array or list): Predicted values

    Returns:
        -- mse (float): Mean Squared Error
    """

def root_mean_squared_error(y_true, y_pred):

    #convert then into umpy array if not already
    y_true = np.array(y_true)
    y_pred = np.array(y_pred)
    #squared the differences
    squared_diff = (y_true - y_pred)**2
    #find the avg of squared_diff
    mse = np.mean(squared_diff)
    return np.sqrt(mse)


**Cautions**

- Not robust to outliers
- Always positive so not usefull when negative errors make sense.
- Bias towards large values
- Can be affected by sample size

##### Mean Absolute Error (MAE)

It is the avearge of absolute difference of predicted and actual values. It is more robust to outliers unlike MSE and RMSE.

Mathematically, It can be expressed as:

$$
\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_{\text{true}_i} - y_{\text{pred}_i}|
$$

Note that it is not differential at zero because of the mod `|.|` function.



In [30]:
"""
    Calculate Mean Absolute Error (MAE) between true and predicted values.

    Parameters:
        -- y_true (numpy array or list): True values
        -- y_pred (numpy array or list): Predicted values

    Returns:
        -- mae (float): Mean Absolute Error
    """

def mean_absolute_error(y_true, y_pred):

    #convert then into umpy array if not already
    y_true = np.array(y_true)
    y_pred = np.array(y_pred)
    #squared the differences
    absolute_diff = np.absolute(y_true - y_pred)
    #find the avg of squared_diff
    mae = np.mean(absolute_diff)
    return mae

##### Coeffiecient Of Determination (R^2)

It is a statistical measure used to evaluate the `goodness of fit` of a regression model. It represents the proportion of the variance in the dependent variable that is explained by the independent variables in the model. It is more usefull when we have linear relationship between dependent and independent variables.

R-squared ranges from `0` to `1`, with `1` indicating a perfect fit and `0` indicating no fit at all.

Mathematially, It can be expressed as:

$$
\text{R-squared} = 1 - \frac{SSR}{SST}
$$

Where :

- *SSR(Sum of Squared Residuals)* represents the sum of the squared differences between the predicted values and the actual values of the dependent variable.

- *SST(Sum of Squared Total)* represents the sum of the squared differences between the actual values of the dependent variable and the mean of the dependent variable.





In [31]:
"""
    Calculate R-squared value given predicted value and actual label

    Parameters:
        -- y_true (numpy array or list): True values
        -- y_pred (numpy array or list): Predicted values

    Returns:
        -- r^2 (float): Coefficient of determination(R^2)
    """

def r2_score(y_true, y_pred):

    #convert then into umpy array if not already
    y_true = np.array(y_true)
    y_pred = np.array(y_pred)
    #sum of squared residuals
    SSR = np.sum((y_true - y_pred)**2)
    #sum of squared total
    y_avg = np.mean(y_true)
    SST = np.sum((y_avg-y_true)**2)
    return 1 - (float(SSR)/float(SST))

**Note :** `R-squared` value can be negative due to outliers i.e when `MSE(model) > MSE(Baseline)`

Limitations:

- **Lack of information on prediction accuracy:** R-squared does not provide information about the accuracy of the model in making predictions. A high R-squared value does not necessarily guarantee accurate predictions, as the model may still have residual errors or may be overfitting the data.

- **Sensitivity to outliers:** R-squared can be affected by outliers in the data. Outliers can disproportionately influence the sum of squared residuals (SSR) component of R-squared, leading to an inflated or deflated R-squared value. 

- **Inability to determine causality:** R-squared does not provide information about causality. Even if a regression model has a high R-squared value, it does not necessarily mean that the independent variables are causing the observed variation in the dependent variable. There may be other confounding variables or omitted variables that are driving the relationship.

- **Limited to linear relationships:** R-squared is a measure of the goodness of fit of a linear regression model, which assumes a linear relationship between the dependent and independent variables. If the relationship is non-linear, R-squared may not accurately reflect the goodness of fit.

##### Adjusted R-squared 

`R-squared` suffers from problem that the scores improve on increasing number of predictors even if the additional predictors do not significantly improve the model's ability to explain the variation in the dependent variable. This can result in overfitting, where the model appears to fit the data well but may not generalize well to new data.

To overcome the above problem associated with `R-squared`, `Adjusted R-squared` adjusts for the number of predictors in the model, penalizing models with more predictors if the additional predictors do not contribute significantly to the model's ability to explain the variation in the dependent variable.

Mathematically It can be expressed as:

$$
\text{Adjusted R-squared} = 1 - \left( \frac{{(1 - R^2) \cdot (n - 1)}}{{n - k - 1}} \right)

$$

Where:

- R-squared (Goodness of Fit): It represents the proportion of variance in the dependent variable explained by the regression model.

- n: Total number of observations in the dataset.

- k: number of predictors(independent variables) in the regression model.



In [32]:
"""
    Calculate Adjusted R-squared value given predicted and actual label and number of predictors

    Parameters:
        -- y_true (numpy array or list): True values
        -- y_pred (numpy array or list): Predicted values
        -- k (integer): Number of predictors

    Returns:
        -- adjusted r^2 (float): Coefficient of determination(R^2)
    """

def adjusted_r2_score(y_true, y_pred, k):

    #convert then into umpy array if not already
    y_true = np.array(y_true)
    y_pred = np.array(y_pred)

    #number of observations
    n = len(y_true)
    #sum of squared residuals
    SSR = np.sum((y_true - y_pred)**2)
    #sum of squared total
    y_avg = np.mean(y_true)
    SST = np.sum((y_avg-y_true)**2)
    
    #r2_score
    r2_score = 1 - (float(SSR)/float(SST))
    #adjusted r2 square
    r2_score_adj = 1 - float((1-r2_score**2)*(n-1))/float(n - k - 1)
    
    return r2_score_adj

**Note** `Adjusted R_squared` value is sensitive to sample size. It means the value may decrease as we increase the sample size even if the model is performing better.

##### 

### Model Prediction

Now we have defined the performance metrics, we can use them to evaluate the model predctions.

From Train/Test Split Section we have :

- `train_data`, `test_data`, `train_labels` and `test_labels` we can use them for models performance evaluations.

From Model Training section we have : 

- trained instances of `Linear Regressor`, `KNN Regressor`, `Random Forest Regressor` and `Gradient Booster Regressor` in `model_instances` dictionary. We need to feed the test data to these instances  to get the predictions

In [33]:
"""
    get predictions from test instances

    Args:
        trained_model () : trained instance
        test_data (pandas dataframe) : test data without label

    Returns:
        list of predictions against each row in test data
"""


def make_predictions(trained_model, test_data):

    return trained_model.predict(test_data)

    

In [34]:
#store predictions against each model

y_pred_per_model = {}

for model_name, model_instance in model_instances.items():
    
    y_pred_per_model[model_name] = make_predictions(model_instance, test_data) 


We have stored model name and it's predictions over test data in `y_pred_per_model` and we have also `test_labels` which is like ground truths.

### Model Evaluation

This like the last step we need to perform to compare which classification algorithm is performating relatively better over test distribution. In the above sctions, We are done with implementaion of performance metrics now we can use them to evaluate the trained models.

Let's create a report in form of a data frame where headers are `model_name` and the performance metrics we discussed above.

In [41]:
#test labels
y_true = np.array(test_labels)

#lets create a dict which contains model_name and the key pair of metric name and its value
result_dict = {}

#iterate over y_pred_per_model to get model name and corresponding predicted label
for model_name, y_pred in y_pred_per_model.items():

    #find mse
    mse = mean_squared_error(y_true, y_pred)
    
    #find rmse
    rmse = root_mean_squared_error(y_true, y_pred)

    #find mae
    mae = mean_absolute_error(y_true, y_pred)

    #find r2_score
    r2_value = r2_score(y_true, y_pred)

    #find adjusted r2_score
    adjusted_r2_value = adjusted_r2_score(y_true, y_pred, 31)

    result_dict[model_name] = {"mean_squared_error":mse, "root_mean_squared_error":rmse, "mean_absolute_error":mae, "r2_score":r2_value, "adjusted_r2_score":adjusted_r2_value}


#convert result_dict into pandas data frame
result_df = pd.DataFrame.from_dict(result_dict, orient="index")

#disply the result df
result_df.head(20)



Unnamed: 0,mean_squared_error,root_mean_squared_error,mean_absolute_error,r2_score,adjusted_r2_score
linear_regressor,10.948576,3.308863,2.515675,0.832344,0.476358
random_forest_regressor,13.64762,3.694269,2.595294,0.791013,0.361992
knn_regressor,23.958647,4.894757,3.780789,0.63312,-0.021293
gradient_boosting_regressor,34.745582,5.894538,4.333064,0.467939,-0.331306


##### Observations

From above table, we can deduce that `linear regression` is performing best among all the models across all the metrics. 2nd best model is `random forest regressor`.

Worst performance is given by `Gradient Boost Regressor` across all the metrics.


Note that here I am using the default settings of each model and not doing any hyper parameters tuning. So the above observation is done on default setup of each model and hence can change if we do hyper parameter tunings or optimizations.