# MSE (Mean Squared Error)

<img src = "https://ambrapaliaidata.blob.core.windows.net/ai-storage/articles/2.png">

Mean Squared Error (MSE) is a common metric used to measure the performance of regression models. It measures the average squared difference between the predicted and actual values of a given dataset.

To calculate the MSE, you need to take the difference between the predicted and actual values for each data point in the dataset, square the difference, and then take the average of all these squared differences. This provides an overall measure of how well the model is performing in terms of predicting the target variable.

MSE is a useful metric because it penalizes large errors more heavily than small errors, since the errors are squared. This means that a model with a high MSE is likely making larger errors on average than a model with a lower MSE.
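
The effect of squaring can be seen in a minimal sketch with two made-up sets of errors. The values and variable names below are illustrative assumptions, not output from a real model. Both sets have the same mean absolute error, but the set that concentrates the error in a single large miss has a much higher MSE.

```python
import numpy as np

# Two hypothetical sets of errors (actual - predicted), chosen for illustration.
small_errors = np.array([2.0, 2.0, 2.0, 2.0])      # four moderate errors
one_large_error = np.array([0.0, 0.0, 0.0, 8.0])   # a single large error

# Mean absolute error is identical for both sets.
mae_small = np.abs(small_errors).mean()
mae_large = np.abs(one_large_error).mean()

# Mean squared error is not: squaring amplifies the large error.
mse_small = (small_errors ** 2).mean()
mse_large = (one_large_error ** 2).mean()

print(mae_small, mae_large)  # 2.0 2.0
print(mse_small, mse_large)  # 4.0 16.0
```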

However, MSE is sensitive to outliers in the data, as they can have a large impact on the overall score. In cases where outliers are a concern, other metrics such as Mean Absolute Error (MAE) may be more appropriate.
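
As a rough illustration of this sensitivity, the sketch below compares MSE and MAE before and after a single outlier is introduced. The data and the helper names `mse` and `mae` are assumptions made up for this example.

```python
import numpy as np

def mse(actual, predicted):
    # Mean of the squared differences
    return ((actual - predicted) ** 2).mean()

def mae(actual, predicted):
    # Mean of the absolute differences
    return np.abs(actual - predicted).mean()

# Hypothetical values: every prediction is off by exactly 1.
actual = np.array([10.0, 12.0, 14.0, 16.0, 18.0])
predicted = np.array([11.0, 11.0, 15.0, 15.0, 19.0])

print(mse(actual, predicted), mae(actual, predicted))  # 1.0 1.0

# Replace the last actual value with an outlier, so that prediction is off by 20.
actual_with_outlier = actual.copy()
actual_with_outlier[-1] = 39.0

mse_outlier = mse(actual_with_outlier, predicted)
mae_outlier = mae(actual_with_outlier, predicted)
print(mse_outlier)  # 80.8
print(mae_outlier)  # 4.8
```

In this made-up case, one outlier raises the MAE from 1.0 to 4.8 but raises the MSE from 1.0 to 80.8.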

$$MSE = \frac {1}{n}\sum\limits_{i = 1}^{n}(actual_i - predicted_i)^2$$

In [3]:
import numpy as np 

Let's assume we have these values:

In [1]:
predicted = 12
actual = 10

The error, that is, the difference between these values, is:

In [2]:
predicted - actual

2

Now let's assume we have an array of values:

In [4]:
predicted = np.arange(10)
actual = np.arange(0, 20, 2)

In [5]:
predicted

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [6]:
actual

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

The error for each data point is then:

In [8]:
error = actual - predicted

In [9]:
error

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

If we calculate the mean of these errors, we get:

In [11]:
error.mean()

4.5

This is called the `mean error`.

But in a real-world dataset, some errors are positive and some are negative, so it is quite possible that they cancel each other out when averaged.
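
A small sketch of this cancellation, using made-up values in which the predictions overshoot and undershoot by matching amounts:

```python
import numpy as np

# Hypothetical values for illustration: no prediction is correct,
# but the overshoots and undershoots balance out.
actual = np.array([10, 20, 30, 40])
predicted = np.array([12, 17, 33, 38])

error = actual - predicted
print(error)         # [-2  3 -3  2]
print(error.mean())  # 0.0
```

A mean error of 0 would suggest a perfect model, even though every single prediction here is wrong.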

That's why we `square` every term: squaring makes each error non-negative, so errors can no longer cancel each other out, while larger errors still contribute more to the result.

In [12]:
error = (actual - predicted) ** 2

In [13]:
error

array([ 0,  1,  4,  9, 16, 25, 36, 49, 64, 81])

And now the mean is:

In [14]:
error.mean()

28.5

Although this mean is higher, it is not vulnerable to positive and negative errors cancelling each other out.

Let's now wrap all of this in a separate function so it can be reused:

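In [None]:
def MSE(predicted, actual):
    """Return the mean squared error between two NumPy arrays of equal shape."""
    # Square each difference so errors of opposite sign cannot cancel, then average
    error = ((predicted - actual) ** 2).mean()

    return error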
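
As a quick usage check, the sketch below calls the function on the same arrays used earlier. The function is restated in compact form only so that the snippet runs on its own.

```python
import numpy as np

def MSE(predicted, actual):
    # Same computation as the function above, restated so this snippet is self-contained
    return ((predicted - actual) ** 2).mean()

predicted = np.arange(10)       # 0, 1, ..., 9
actual = np.arange(0, 20, 2)    # 0, 2, ..., 18

result = MSE(predicted, actual)
print(result)                   # 28.5

# A perfect prediction gives an MSE of zero.
print(MSE(actual, actual))      # 0.0
```

Because the difference is squared, the order of the two arguments does not affect the result.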