In [None]:
# Install the necessary dependencies

import os
import sys
!{sys.executable} -m pip install --quiet pandas scikit-learn numpy matplotlib jupyterlab_myst ipython


---
license:
    code: MIT
    content: CC-BY-4.0
github: https://github.com/ocademy-ai/machine-learning
venue: By Ocademy
open_access: true
bibliography:
  - https://raw.githubusercontent.com/ocademy-ai/machine-learning/main/open-machine-learning-jupyter-book/references.bib
---

# Linear Regression Metrics

Linear regression is a fundamental, widely used technique in machine learning and statistics for predicting a continuous value from one or more input variables. It is applied in domains ranging from finance and economics to healthcare and engineering. To use it well, you need to measure how closely the model's predictions match the observed data, and that is the job of linear regression metrics.

In this tutorial, we explore the key evaluation measures that tell us how well a linear regression model fits the data and how far its predictions fall from the observed values. Each metric summarizes the prediction errors in a different way, so together they give a fuller picture of model quality than any single number.

We will cover Mean Squared Error (MSE), Root Mean Squared Error (RMSE), the R-squared ($R^2$) score, and Mean Absolute Error (MAE). Understanding these metrics is essential for data scientists, machine learning practitioners, and anyone who uses linear regression for predictive modeling.

Whether you are building models for price prediction, sales forecasting, or any other regression task, these metrics let you compare candidate models on the same data and decide where a model needs tuning. Let's start with the most common one.

## Mean Squared Error (MSE)

The **Mean Squared Error (MSE)** is one of the most widely used measures of regression performance. It is the average of the squared differences between the predicted values and the observed values, so it summarizes in a single number how far the model's predictions are from the actual data points.

### The Formula

Mathematically, the MSE is computed using the following formula:

$$ MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 $$

Where:

- $n$ is the number of data points.
- $y_i$ represents the actual observed value for the $i^{th}$ data point.
- $\hat{y}_i$ represents the predicted value for the $i^{th}$ data point.
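
As a small worked example of the formula, assume three hypothetical data points (the values are made up for illustration) with observed values $y = (3, 5, 8)$ and predictions $\hat{y} = (2, 5, 10)$. The errors are $1$, $0$, and $-2$, so:

$$ MSE = \frac{1}{3} \left[ (3 - 2)^2 + (5 - 5)^2 + (8 - 10)^2 \right] = \frac{1 + 0 + 4}{3} = \frac{5}{3} \approx 1.67 $$

Note that the sign of each error disappears after squaring: over-predictions and under-predictions both increase the MSE.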

### Interpretation

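MSE is never negative, and it equals zero only when every prediction matches the observed value exactly. A lower MSE indicates that the model's predictions are closer to the actual values, signifying better performance; a higher MSE indicates that the predictions deviate more from the true values. Because the errors are squared, MSE is expressed in the squared units of the target variable, so it is most useful for comparing models evaluated on the same target and the same data.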
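
One consequence of squaring worth seeing in numbers: a single large error raises the MSE more than several small errors of the same total size. The sketch below uses made-up values and a hypothetical helper, `mean_squared_error_np`, that is not part of the original tutorial. Both sets of predictions have the same total absolute error (4), yet their MSE values differ by a factor of four.

```python
import numpy as np


def mean_squared_error_np(y_true, y_pred):
    """Hypothetical helper: average of squared differences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


# Made-up observed values for illustration
y_true = np.array([10.0, 12.0, 14.0, 16.0])

# Four small errors of size 1 each (total absolute error = 4)
pred_small = np.array([11.0, 11.0, 15.0, 15.0])

# One large error of size 4, the rest exact (total absolute error = 4)
pred_outlier = np.array([10.0, 12.0, 14.0, 20.0])

mse_small = mean_squared_error_np(y_true, pred_small)
mse_outlier = mean_squared_error_np(y_true, pred_outlier)

print("MSE with four errors of size 1:", mse_small)    # 1.0
print("MSE with one error of size 4:  ", mse_outlier)  # 4.0
```

If occasional large misses matter more to you than many small ones, this sensitivity is useful; if your data contains outliers you do not want to dominate the score, it is a reason to look at other metrics alongside MSE.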

### Python Implementation

Let's take a look at how to calculate MSE in Python. We'll use a simple example with sample data:

In [4]:
# Import necessary libraries
import numpy as np

# Sample data for demonstration (replace with your actual data)
actual_values = np.array([22.1, 19.9, 24.5, 20.1, 18.7])
predicted_values = np.array([23.5, 20.2, 23.9, 19.8, 18.5])

# Calculate the squared differences between actual and predicted values
squared_errors = (actual_values - predicted_values) ** 2

# Calculate the mean of squared errors to get MSE
mse = np.mean(squared_errors)

# Print the MSE
print("Mean Squared Error (MSE):", mse)

Mean Squared Error (MSE): 0.5079999999999996
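
The printed value is 0.508 up to floating-point rounding. As a cross-check, the same result can be obtained with scikit-learn's `mean_squared_error` function, which takes the true values first and the predictions second. The sketch below repeats the sample data so that it runs on its own, and formats the output to three decimal places to hide the rounding noise.

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Same sample data as in the manual calculation
actual_values = np.array([22.1, 19.9, 24.5, 20.1, 18.7])
predicted_values = np.array([23.5, 20.2, 23.9, 19.8, 18.5])

# Manual calculation: mean of the squared differences
mse_manual = np.mean((actual_values - predicted_values) ** 2)

# Library calculation: arguments are (y_true, y_pred)
mse_sklearn = mean_squared_error(actual_values, predicted_values)

print(f"Manual MSE:       {mse_manual:.3f}")
print(f"scikit-learn MSE: {mse_sklearn:.3f}")
```

In practice the library function is usually the safer choice, since it validates that both inputs have the same length before computing the score.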
