# **Evaluation Metrics in Regression Models**

Regression is the task of building a model to predict a continuous variable, for example the house prices for the upcoming year.

In a regression problem we do the basic data processing and then split the data into training and testing sets. The training data is used to fit the model, whereas the testing data is used to check the model's predictions on data it has not seen. Many different algorithms can be used for regression problems, and the idea is to choose the one that works best on our data. We make that choice by evaluating each model with error metrics. Common error metrics for regression include mean squared error and mean absolute error. (Accuracy score is a classification metric and is not suitable for continuous targets.)
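
For reference, the two error metrics computed later in this notebook have the standard definitions below. The symbols are notation chosen here rather than taken from the original text: $y_i$ is the true value, $\hat{y}_i$ is the model's prediction, and $n$ is the number of test samples.

$$
\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2,
\qquad
\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left|y_i - \hat{y}_i\right|
$$

Because MSE squares each error, it penalises large errors more heavily than MAE does. In both cases lower values indicate a better fit.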

## **Implementation**

We will first import the required libraries and load the data set. We will be using the wine dataset for this problem, which can be downloaded directly from [Kaggle](https://www.kaggle.com/sgus1318/winedata). After loading the data we will pre-process it. The data set has a total of 1599 rows and 12 columns, and no missing values were found in it. Use the code below to do the same.

In [None]:
!python -m pip install pip --upgrade --user -q
!python -m pip install numpy pandas seaborn matplotlib scipy statsmodels scikit-learn --user -q

In [None]:
# Restart the kernel so the newly installed packages are picked up
import IPython
IPython.Application.instance().kernel.do_shutdown(True)

In [None]:
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, mean_absolute_error

# Load the wine data and take a first look at it
data = pd.read_csv("winequality_red.csv")
print(data.head(10))
print(data.shape)             # number of rows and columns
print(data.isnull().any())    # True for any column that has missing values

In [None]:
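# Separate the features from the target column
X = data.drop('quality', axis=1)
y = data['quality']

std = StandardScaler()

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

# Fit the scaler on the training data only, then apply the same
# scaling to the test data so no test-set statistics leak into it
X_train = std.fit_transform(X_train)
X_test = std.transform(X_test)

lr = LinearRegression()

lr.fit(X_train, y_train)

# Predictions on the held-out test set
y_pred_lr = lr.predict(X_test)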

We have stored the predictions for the testing data in y_pred_lr and will use this variable to evaluate the model. We will now compute two error metrics, mean squared error and mean absolute error, to check the model's performance.

In [None]:
# scikit-learn metrics take the true values first, then the predictions
print("Mean Squared Error: ", mean_squared_error(y_test, y_pred_lr))
print("Mean Absolute Error: ", mean_absolute_error(y_test, y_pred_lr))
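
To make the two metrics above more concrete, here is a minimal sketch that computes them by hand with NumPy. The arrays `y_true` and `y_pred` and the helper names `manual_mse` and `manual_mae` are hypothetical and chosen only for illustration; the values are toy numbers, not taken from the wine data.

```python
import numpy as np

# Hypothetical toy values for illustration only (not from the wine data)
y_true = np.array([5.0, 6.0, 5.0, 7.0])
y_pred = np.array([5.5, 5.0, 5.0, 6.0])


def manual_mse(actual, predicted):
    """Mean of the squared differences between actual and predicted values."""
    errors = np.asarray(actual) - np.asarray(predicted)
    return float(np.mean(errors ** 2))


def manual_mae(actual, predicted):
    """Mean of the absolute differences between actual and predicted values."""
    errors = np.asarray(actual) - np.asarray(predicted)
    return float(np.mean(np.abs(errors)))


mse_value = manual_mse(y_true, y_pred)
mae_value = manual_mae(y_true, y_pred)

print("Manual MSE:", mse_value)  # 0.5625
print("Manual MAE:", mae_value)  # 0.625
```

The same functions can be called with `y_test` and `y_pred_lr` from the notebook, in which case the results should agree with `mean_squared_error` and `mean_absolute_error` up to floating-point rounding.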

To learn more about the evaluation methods, please refer to [this guide](https://analyticsindiamag.com/practical-guide-to-machine-learning-model-evaluation-and-error-metrics/), and for the theory please refer to [this post](https://analyticsindiamag.com/hands-on-guide-to-loss-functions-used-to-evaluate-a-ml-algorithm/).