In [29]:
import numpy as np
import matplotlib.pyplot as plt

<p style="color: blue; font-size: 30px;"><b>Calculate the Mean Squared Error (MSE) step by step by hand</b></p>

**Mean Squared Error (MSE)** is a commonly used metric that measures the average squared difference between predicted values and actual values. It is widely used in regression analysis and machine learning to evaluate how well a model's predictions match the observed data.

### Formula
Given a set of predictions $\hat{y}_i$ and actual values $y_i$ for $n$ data points:

$$
\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
$$

Where:
- $n$ is the number of data points,
- $y_i$ is the actual value of the $i$-th data point,
- $\hat{y}_i$ is the predicted value for the $i$-th data point.

### Explanation

MSE calculates the mean of the squared differences between actual and predicted values, providing a measure of how well a model performs.

* **Error Term $(y_i - \hat{y}_i)$:** The difference between the actual value and the predicted value for each point.
* **Squaring the Error:** Squaring makes every term non-negative, so positive and negative errors cannot cancel each other out, and it penalizes larger errors more heavily.
* **Averaging:** The sum of squared errors is divided by the number of data points to get the mean.

### Properties
* **Non-negative:** MSE is always greater than or equal to 0. A value of 0 indicates perfect predictions.
* **Sensitive to Outliers:** Because errors are squared, larger errors have a disproportionately large effect on the MSE.
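
The two properties above can be checked numerically. The following is a minimal sketch, not part of the original notebook: the helper `mse_of` and the sample values are hypothetical and chosen only for illustration. Both sets of predictions have the same total absolute error (4), but one spreads it over four points while the other concentrates it in a single point.

```python
import numpy as np

def mse_of(actual, predicted):
    """Hypothetical helper: mean of the squared differences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.mean((actual - predicted) ** 2)

actual = [3, 5, 2, 4]

# Four predictions that are each off by 1
spread_out = [4, 6, 3, 5]
# Three perfect predictions and one that is off by 4
one_outlier = [3, 5, 2, 8]

print("Perfect predictions:", mse_of(actual, actual))       # 0.0
print("Four errors of 1:   ", mse_of(actual, spread_out))   # 1.0
print("One error of 4:     ", mse_of(actual, one_outlier))  # 4.0
```

The single large error contributes $4^2 = 16$ to the sum, while the four small errors contribute only $4 \times 1^2 = 4$, which is why the second MSE is four times larger.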

### Example
* Suppose you have actual values [3, 5, 2] and predictions [2, 5, 4]:

$$
\begin{align*}
\text{MSE} &= \frac{1}{3} \left( (3-2)^2 + (5-5)^2 + (2-4)^2 \right) \\
           &= \frac{1}{3} (1 + 0 + 4) \\
           &= \frac{5}{3} \\
           &\approx 1.67 \\
\end{align*}
$$

### Usage
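* **Model Evaluation:** When models are compared on the same data, a lower MSE indicates predictions that are, on average, closer to the actual values.
* **Optimization:** Many machine learning algorithms minimize MSE (or a closely related squared-error loss) during training.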
---
In summary, MSE provides a quantitative way to assess how close predictions are to actual outcomes, with a focus on penalizing larger errors.
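
To illustrate the optimization use mentioned above, here is a small sketch, assuming the simplest possible "model": a single constant `c` used as the prediction for every data point. The starting guess, learning rate, and step count are hypothetical values picked for this example. Plain gradient descent on the MSE moves `c` toward the mean of the actual values, which is the constant that minimizes the MSE.

```python
import numpy as np

observed = np.array([3.0, 5.0, 2.0])

def mse_for_constant(c):
    """MSE when every prediction equals the same constant c."""
    return np.mean((observed - c) ** 2)

c = 0.0              # hypothetical starting guess
learning_rate = 0.1  # hypothetical step size

for step in range(100):
    # Derivative of the MSE with respect to c: -2 * mean(y_i - c)
    gradient = -2 * np.mean(observed - c)
    c -= learning_rate * gradient

print("Fitted constant:       ", c)
print("Mean of actual values: ", observed.mean())
print("MSE at fitted constant:", mse_for_constant(c))
```

Real training loops apply the same idea to models with many parameters; this sketch only shows the mechanism on one parameter.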


## Calculating the Mean Squared Error

In [34]:
actual_values = [3, 5, 2]
predicted_values = [2, 5, 4]

squared_errors = []

# Walk through each (actual, predicted) pair and record its squared error
for i, (actual, predicted) in enumerate(zip(actual_values, predicted_values)):
    error = actual - predicted
    squared_errors.append(error ** 2)
    print(f"{i} : ({actual} - {predicted})^2 = {squared_errors[i]}")

# MSE is the mean of the squared errors
mse = sum(squared_errors) / len(squared_errors)
print("ANSWER")
print("MSE =", mse)

0 : (3 - 2)^2 = 1
1 : (5 - 5)^2 = 0
2 : (2 - 4)^2 = 4
ANSWER
MSE = 1.6666666666666667


#### Or (using NumPy)

In [37]:
# Arrays
actual_values = np.array([3, 5, 2])
predicted_values = np.array([2, 5, 4])

# Subtraction
difference = actual_values - predicted_values
print("Difference:", difference)

# Squared differences
squared_difference = difference ** 2
print("Squared Differences:", squared_difference)

# Mean Squared Error calculation
mse = np.mean(squared_difference)
print("Mean Squared Error (MSE):", mse)

Difference: [ 1  0 -2]
Squared Differences: [1 0 4]
Mean Squared Error (MSE): 1.6666666666666667
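
The NumPy steps above can be wrapped into a reusable function. This is a minimal sketch under stated assumptions: `compute_mse` is a hypothetical name introduced here, not a NumPy function, and the input checks are one possible design choice. The shape check matters because NumPy broadcasting would otherwise silently combine arrays of different lengths in some cases (for example, three actual values against a single prediction).

```python
import numpy as np

def compute_mse(actual, predicted):
    """Hypothetical helper: MSE of two equal-shaped, non-empty sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    if actual.shape != predicted.shape:
        raise ValueError("actual and predicted must have the same shape")
    if actual.size == 0:
        raise ValueError("at least one data point is required")
    # Subtract, square, and average in one vectorized expression
    return float(np.mean((actual - predicted) ** 2))

# Same data as the example above
print("MSE =", compute_mse([3, 5, 2], [2, 5, 4]))
```

Calling it with the example data should reproduce the value computed step by step above.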


# [Math Deep Dives](./README.md)