# Interactive Visualization of the Linear Regression Cost Function

## Introduction
In linear regression, we fit a line to data points to predict a continuous target value.
The quality of our model's fit is measured using a cost function, commonly the Mean Squared Error (MSE).
By tuning the parameters (the weight w and the bias b), we aim to minimize this cost, achieving the best-fitting line.
### This notebook:
- Introduces the concept of the cost function in linear regression.
- Explains the relationship between speed and fuel efficiency in a simple dataset.
- Provides an interactive visualization to observe how changing the weight w affects both the prediction line and the resulting cost.

## Understanding the Cost Function

For a linear regression model with parameters weight $w$ and bias $b$, and a training dataset $(x^{(i)}, y^{(i)})$ for $i = 1, 2, \dots, m$:

$$
f_{w,b}(x^{(i)}) = w x^{(i)} + b
$$

The cost function, based on the Mean Squared Error (MSE), is defined as:

$$
J(w, b) = \frac{1}{2m} \sum_{i=1}^{m} \bigl(f_{w,b}(x^{(i)}) - y^{(i)}\bigr)^2
$$

- This measures how far off predictions are from the actual values.
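- It sums the squared errors and divides by $2m$, returning a single number equal to half the mean squared error. The extra factor of $\tfrac{1}{2}$ is a convention that simplifies the derivative and does not change which parameters minimize the cost.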
- By adjusting $w$ and $b$, we want to minimize $J(w, b)$. A lower cost implies a better fit.

### Statistical Connection

The cost function is conceptually similar to the variance in statistics, which measures the spread of data points around their mean. Here, the squared deviations are taken between the predictions and the actual values instead. Minimizing the cost therefore shrinks the typical size of the prediction errors, improving the model's fit.
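The link to variance can be made concrete with a small sketch. It is an illustration rather than part of the original analysis, and it assumes the efficiency values that this notebook defines later as `y_train`. For a constant model ($w = 0$) whose bias equals the mean of the targets, the cost is exactly half the population variance:

```python
import numpy as np

# Efficiency values (km/l), the same numbers used as y_train later in this notebook
y = np.array([18, 16, 12, 8, 5], dtype=float)
m = len(y)

# Constant model: w = 0, so every prediction equals the bias b
b = y.mean()
cost = np.sum((b - y) ** 2) / (2 * m)

# Population variance (NumPy's default, ddof=0)
variance = np.var(y)

print(f"J(0, mean(y)) = {cost:.2f}")
print(f"variance / 2  = {variance / 2:.2f}")
```

Both lines print 11.68 for these values. Allowing a non-zero slope $w$ can only lower the cost further.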

## Dataset

In this example, we consider a simple dataset:

- **x_train**: Speeds of a car (in km/h).
- **y_train**: Fuel efficiency of the car (in km/l).

As the car's speed increases, its fuel efficiency typically decreases due to higher aerodynamic drag. We fix the bias (intercept) at $b = 20$, and we will adjust the weight $w$ to see how it affects our model’s predictions and the resulting cost.

## Setup and Imports

First, we import all the necessary libraries and define our dataset.

In [75]:
# Core numerical libraries
import numpy as np

# Plotting and visualization
import plotly.graph_objects as go
from plotly.subplots import make_subplots

# Interactivity
from ipywidgets import interact, FloatSlider

# Define the dataset
x_train = np.array([20, 40, 60, 80, 100])  # Speed (km/h)
y_train = np.array([18, 16, 12, 8, 5])     # Efficiency (km/l)

## Defining the Cost Function

We will define a function `calculate_cost` that returns the MSE-based cost for given parameters `w` and `b`.

In [76]:
def calculate_cost(x, y, w, b):
    """
    Calculate the Mean Squared Error-based cost for linear regression.
    
    Parameters:
        x (np.array): Feature values (e.g. car speeds)
        y (np.array): Target values (e.g. car fuel efficiency)
        w (float): Weight (slope) parameter
        b (float): Bias (intercept) parameter
    
    Returns:
        float: The calculated cost (sum of squared errors divided by 2m).
    """
    m = len(x)
    predictions = w * x + b
    errors = predictions - y
    total_cost = np.sum(errors ** 2) / (2 * m)
    return total_cost
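
As a quick sanity check, the cost can be evaluated at a few weights with the bias held at 20. This is a minimal sketch: the particular `w` values are arbitrary choices for illustration, and the function is repeated in compact form only so that the snippet runs on its own.

```python
import numpy as np

x_train = np.array([20, 40, 60, 80, 100])  # Speed (km/h)
y_train = np.array([18, 16, 12, 8, 5])     # Efficiency (km/l)

def calculate_cost(x, y, w, b):
    # Same formula as above, restated so this snippet is self-contained
    return np.sum((w * x + b - y) ** 2) / (2 * len(x))

# Arbitrary sample weights, chosen only for illustration
for w in (-0.30, -0.17, -0.14, 0.00):
    cost = calculate_cost(x_train, y_train, w, 20)
    print(f"w = {w:+.2f} -> cost = {cost:.3f}")
```

With `b = 20`, the cost drops from 54.9 at `w = -0.30` to 0.5 at `w = -0.14`, then rises again to 45.3 at `w = 0`, which is the shape the cost curve in the next section displays.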

## Interactive Visualization

We will create an interactive visualization to:
- Show the relationship between speed (x-axis) and fuel efficiency (y-axis), along with the predicted line.
- Display a cost vs. weight curve, highlighting how changing `w` affects the cost.

Use the slider to adjust the weight `w`. The bias `b` is fixed at 20. Notice how the prediction line and cost value change as `w` varies.

In [77]:
def update_plots(w):
    # Fixed bias (intercept)
    b = 20
    
    # Compute predictions
    y_pred = w * x_train + b
    
    # Compute current cost
    cost = calculate_cost(x_train, y_train, w, b)
    
    # Generate data for cost vs weight curve
    w_values = np.linspace(-0.5, 0.22, 100)  # same range as the slider
    cost_values = [calculate_cost(x_train, y_train, w_val, b) for w_val in w_values]
    
    # Create subplots: 1 row, 2 columns
    fig = make_subplots(rows=1, cols=2, subplot_titles=("Car Efficiency vs. Speed", "Cost vs. Weight"))
    
    # Plot 1: Car Efficiency vs. Speed
    fig.add_trace(
        go.Scatter(x=x_train, y=y_train, mode='markers', name='Actual Values', marker=dict(color='blue')),
        row=1, col=1
    )
    fig.add_trace(
        go.Scatter(x=x_train, y=y_pred, mode='lines', name=f'Prediction (w={w:.2f}, b={b})', line=dict(color='red')),
        row=1, col=1
    )
    
    # Plot 2: Cost vs. Weight
    fig.add_trace(
        go.Scatter(x=w_values, y=cost_values, mode='lines', name='Cost Curve', line=dict(color='teal')),
        row=1, col=2
    )
    fig.add_trace(
        go.Scatter(x=[w], y=[cost], mode='markers', name=f'Cost at w={w:.2f}',
                   marker=dict(color='red', size=10)),
        row=1, col=2
    )
    
    # Add a dynamic annotation showing current cost
    fig.add_annotation(
        x=w,
        y=cost,
        text=f"Cost: {cost:.2f}",
        showarrow=True,
        arrowhead=2,
        ax=40,
        ay=-40,
        row=1,
        col=2,
        font=dict(size=12, color="black"),
        arrowcolor="red"
    )
    
    # Update layout and axes
    fig.update_layout(
        width=900,
        height=400,
        title_text="Interactive Visualization of Cost Function",
    )
    fig.update_xaxes(title_text="Speed (km/h)", row=1, col=1)
    fig.update_yaxes(title_text="Efficiency (km/l)", row=1, col=1)
    fig.update_xaxes(title_text="Weight (w)", row=1, col=2)
    fig.update_yaxes(title_text="Cost", row=1, col=2)
    
    fig.show()

## Interact with the Weight Parameter

Use the slider below to adjust the weight parameter `w`. Notice how the prediction line tilts and how the red marker moves along the cost curve; the curve itself stays the same because the bias is fixed. The goal in a typical linear regression scenario is to find the `w` that minimizes the cost.

In [79]:
interact(update_plots, w=FloatSlider(min=-0.5, max=0.22, step=0.01, value=-0.3, description='Weight (w):'))
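
The weight that the slider search is looking for can also be computed directly. With $b$ held fixed, setting $\partial J / \partial w = 0$ gives $w^* = \sum_i x^{(i)} (y^{(i)} - b) \,/\, \sum_i (x^{(i)})^2$. The sketch below applies that formula to this notebook's data; the names `w_best` and `cost_best` are introduced here purely for illustration.

```python
import numpy as np

x_train = np.array([20, 40, 60, 80, 100], dtype=float)  # Speed (km/h)
y_train = np.array([18, 16, 12, 8, 5], dtype=float)     # Efficiency (km/l)
b = 20.0  # bias fixed, as in the interactive plot

# Minimizer of J(w, b) over w alone, from setting dJ/dw = 0
w_best = np.sum(x_train * (y_train - b)) / np.sum(x_train ** 2)

# Cost at that weight
cost_best = np.sum((w_best * x_train + b - y_train) ** 2) / (2 * len(x_train))

print(f"w* = {w_best:.4f}, J(w*, b) = {cost_best:.4f}")
```

For this dataset the result is roughly `w* = -0.1427` with a cost of about 0.4836, so the lowest point on the slider's cost curve should sit near `w = -0.14`.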


## Advanced 3D Visualization of the Cost Function J(w, b)

To gain a better understanding of the relationship between the parameters $w$ and $b$ and the cost $J(w,b)$, we will create an interactive 3D surface plot.

**Key Points:**

- We will plot $J(w,b)$ over a range of values for $w$ and $b$.
- The result should show a convex, bowl-shaped surface, demonstrating that there is a unique minimum where the cost is lowest.
- Depending on the data ranges and scales, the bowl may appear more like a valley or slope, but it remains convex.
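
The convexity claim can be checked numerically through the Hessian of $J(w,b)$, the matrix of its second derivatives. For this cost function the Hessian depends only on the inputs, not on $y$, $w$, or $b$. The following is a sketch for illustration, assuming the speed values defined earlier as `x_train`:

```python
import numpy as np

x_train = np.array([20, 40, 60, 80, 100], dtype=float)  # Speed (km/h)
m = len(x_train)

# Hessian of J(w, b) = (1 / 2m) * sum((w * x + b - y)^2)
# Rows and columns are ordered (w, b).
hessian = np.array([
    [np.sum(x_train ** 2), np.sum(x_train)],
    [np.sum(x_train),      m],
]) / m

# eigvalsh returns the eigenvalues of a symmetric matrix in ascending order
eigenvalues = np.linalg.eigvalsh(hessian)
print("Eigenvalues:", eigenvalues)
print("Ratio largest / smallest:", eigenvalues[-1] / eigenvalues[0])
```

Both eigenvalues are positive, so the surface is strictly convex and has a single minimum. They differ by a factor of roughly 24,000, which is why the bowl looks like a long, narrow valley when $w$ and $b$ are drawn on the same scale.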

In [81]:
# Define the ranges for w and b
w_values = np.linspace(-20, 20, 50)
b_values = np.linspace(-20, 20, 50)

W, B = np.meshgrid(w_values, b_values)

# Compute the cost surface for each (w, b) pair
Z = np.zeros_like(W)
for i in range(len(b_values)):
    for j in range(len(w_values)):
        Z[i, j] = calculate_cost(x_train, y_train, W[i, j], B[i, j])

**Explanation:**

- We set `w_values` and `b_values` to range from -20 to 20.
- Using `np.meshgrid`, we create a grid of (W, B) points.
- For each point (W[i,j], B[i,j]), we compute the cost function using `calculate_cost`.
- The matrix `Z` stores the cost values corresponding to each (w, b) pair.
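
The nested loop is easy to read, but the same grid can be computed in one step with NumPy broadcasting. The version below is an optional alternative, not the notebook's own method; `Z_vectorized` and `Z_loop` are names introduced here, and the loop is included only to confirm that both approaches agree.

```python
import numpy as np

x_train = np.array([20, 40, 60, 80, 100], dtype=float)  # Speed (km/h)
y_train = np.array([18, 16, 12, 8, 5], dtype=float)     # Efficiency (km/l)
m = len(x_train)

w_values = np.linspace(-20, 20, 50)
b_values = np.linspace(-20, 20, 50)
W, B = np.meshgrid(w_values, b_values)

# Add a trailing axis so each (w, b) pair broadcasts against all m data points
predictions = W[..., np.newaxis] * x_train + B[..., np.newaxis]  # shape (50, 50, 5)
Z_vectorized = np.sum((predictions - y_train) ** 2, axis=-1) / (2 * m)

# Loop version, kept only for comparison
Z_loop = np.zeros_like(W)
for i in range(W.shape[0]):
    for j in range(W.shape[1]):
        errors = W[i, j] * x_train + B[i, j] - y_train
        Z_loop[i, j] = np.sum(errors ** 2) / (2 * m)

print("Results match:", np.allclose(Z_vectorized, Z_loop))
```

For a 50 by 50 grid the speed difference is negligible, so the choice is mostly a matter of readability.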

In [82]:
# Create a 3D interactive surface plot using Plotly
fig = go.Figure(data=[go.Surface(
    x=W, y=B, z=Z, 
    colorscale='RdYlGn',
    showscale=True,
    colorbar=dict(title='Cost')
)])

fig.update_layout(
    title="Cost Surface J(w, b) [Rotate and Zoom for Better View]",
    scene=dict(
        xaxis_title='W',
        yaxis_title='B',
        zaxis_title='J(w,b)'
    ),
    width=700,
    height=600
)

fig.show()

With this 3D visualization, you can see that as we move away from the optimal parameters, the cost rises, forming a convex surface. The global minimum, where the bowl is at its lowest point, corresponds to the best-fitting parameters for our linear model.
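
The location of that lowest point can be computed exactly rather than read off the plot. The sketch below uses `np.polyfit` with degree 1, which performs an ordinary least-squares line fit and therefore minimizes the same sum of squared errors as $J(w,b)$. The names `w_opt`, `b_opt`, and `cost_opt` are introduced here for illustration.

```python
import numpy as np

x_train = np.array([20, 40, 60, 80, 100], dtype=float)  # Speed (km/h)
y_train = np.array([18, 16, 12, 8, 5], dtype=float)     # Efficiency (km/l)

# For deg=1, polyfit returns the coefficients as [slope, intercept]
w_opt, b_opt = np.polyfit(x_train, y_train, deg=1)

# Cost at the fitted parameters
cost_opt = np.sum((w_opt * x_train + b_opt - y_train) ** 2) / (2 * len(x_train))

print(f"w = {w_opt:.2f}, b = {b_opt:.2f}, J = {cost_opt:.2f}")
```

For this dataset the minimum lies at approximately $w = -0.17$ and $b = 22$, with a cost of about 0.12. This is lower than anything reachable in the earlier slider experiment, where the bias was held at 20.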

## Conclusion

By adjusting the weight `w` and visualizing the resulting predictions, we gain insight into how changes in parameters affect the cost. The 3D surface plot of $J(w,b)$ reinforces the idea that the cost function is convex, forming a bowl-shaped landscape. The lowest point on this surface corresponds to the parameters that best fit the data.

In practice, methods like gradient descent are used to find these parameter values, efficiently navigating the cost surface to arrive at the global minimum. These visualizations help build intuition about the nature of the cost function and guide our understanding of parameter tuning in linear regression.
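
As a closing illustration, here is a minimal gradient descent sketch. To stay close to the interactive example it is deliberately simplified: only `w` is updated, with the bias fixed at 20. The learning rate `alpha`, the starting value, and the number of steps are hypothetical choices made for this dataset, not recommended defaults.

```python
import numpy as np

x_train = np.array([20, 40, 60, 80, 100], dtype=float)  # Speed (km/h)
y_train = np.array([18, 16, 12, 8, 5], dtype=float)     # Efficiency (km/l)
b = 20.0  # bias held fixed, as in the slider example

w = 0.0        # arbitrary starting point
alpha = 1e-4   # learning rate; must stay below 2 / mean(x^2) (about 4.5e-4 here) to converge

for step in range(200):
    # dJ/dw = (1/m) * sum((w * x + b - y) * x)
    gradient = np.mean((w * x_train + b - y_train) * x_train)
    w -= alpha * gradient

print(f"w after gradient descent: {w:.4f}")
```

The loop settles at roughly `w = -0.1427`, the minimizer of the cost when `b = 20`. The small learning rate is needed because the speeds are large numbers; rescaling the feature would permit bigger steps.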