# Cost Function

In [None]:
import numpy as np
%matplotlib widget
import matplotlib.pyplot as plt
from matplotlib.widgets import Slider

# Problem Statement

The goal is to build a model that predicts housing prices from the size of the house.

Let's use the same example as before - a house with 1000 square feet sold for \\$300,000 and a house with 2000 square feet sold for \\$500,000.


| Size (1000 sqft) | Price (1000s of dollars) |
| ---------------- | ------------------------ |
| 1                | 300                      |
| 2                | 500                      |


In [None]:
x_train = np.array([1.0, 2.0])
y_train = np.array([300.0, 500.0])
print(x_train)
print(y_train)

# Computing Cost

The equation for cost with one variable is:
  $$J(w,b) = \frac{1}{2m} \sum\limits_{i = 0}^{m-1} (f_{w,b}(x^{(i)}) - y^{(i)})^2 \tag{1}$$ 
 
where 
  $$f_{w,b}(x^{(i)}) = wx^{(i)} + b \tag{2}$$
  
- $f_{w,b}(x^{(i)})$ is our prediction for example $i$ using parameters $w,b$.  
- $(f_{w,b}(x^{(i)}) -y^{(i)})^2$ is the squared difference between the target value and the prediction.   
- These squared differences are summed over all $m$ examples and divided by $2m$ to produce the cost, $J(w,b)$.

In [None]:
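def compute_cost(x, y, w, b):
    """Return the squared-error cost J(w,b) of f(x) = w*x + b over the examples in x, y."""
    m = len(x)                      # number of training examples
    cost_sum = 0
    for i in range(m):
        f_wb = w * x[i] + b         # prediction for example i
        cost = (f_wb - y[i]) ** 2   # squared error for example i
        cost_sum += cost
    total_cost = cost_sum / (2 * m)
    return total_cost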
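
The loop above follows equation (1) term by term. As a minimal sketch, the same cost can also be computed with NumPy array operations instead of an explicit loop. The name `compute_cost_vectorized` is hypothetical and is not used elsewhere in this notebook.

```python
import numpy as np

def compute_cost_vectorized(x, y, w, b):
    # Hypothetical helper: same formula as compute_cost, written without a Python loop
    m = x.shape[0]                  # number of training examples
    errors = w * x + b - y          # prediction minus target, for every example at once
    return np.sum(errors ** 2) / (2 * m)

x_train = np.array([1.0, 2.0])
y_train = np.array([300.0, 500.0])

print(compute_cost_vectorized(x_train, y_train, 200, 100))  # 0.0
print(compute_cost_vectorized(x_train, y_train, 100, 100))  # 12500.0
```

With `w = 100` and `b = 100` the predictions are 200 and 300, so the errors are -100 and -200 and the cost is $(10000 + 40000)/4 = 12500$.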

Your goal is to find a model $f_{w,b}(x) = wx + b$, with parameters $w,b$, that accurately predicts house values given an input $x$. The cost measures how far the model's predictions are from the targets in the training data: the lower the cost, the better the fit.

The cost equation (1) above shows that if $w$ and $b$ can be selected such that the predictions $f_{w,b}(x)$ match the target data $y$, every $(f_{w,b}(x^{(i)}) - y^{(i)})^2$ term will be zero and the cost will reach its minimum of zero. In this simple two-point example, you can achieve this!
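
A zero cost is reachable here because a straight line can always be drawn through two points with different $x$ values. The sketch below computes that line directly from the two training examples. The names `w_exact` and `b_exact` are illustrative and do not appear elsewhere in the notebook.

```python
import numpy as np

x_train = np.array([1.0, 2.0])      # size in 1000 sqft
y_train = np.array([300.0, 500.0])  # price in 1000s of dollars

# Slope and intercept of the line passing through both training points
w_exact = (y_train[1] - y_train[0]) / (x_train[1] - x_train[0])
b_exact = y_train[0] - w_exact * x_train[0]

# Predictions of that line at the training inputs
predictions = w_exact * x_train + b_exact

print(w_exact, b_exact)  # 200.0 100.0
print(predictions)
```

The predictions equal the targets, so every squared-error term in equation (1) is zero.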

For this instance, let's set $b$ to 100 and focus on $w$.

<br/>
Run the code below and use the slider control to select the value of $w$ that minimizes cost.

In [20]:
# Fix b at 100 so that the cost depends only on w
b = 100

In [None]:
# --------------------------
# Prepare w range
# --------------------------
w_vals = np.linspace(0, 400, 200)
cost_vals = [compute_cost(x_train, y_train, w, b) for w in w_vals]

# --------------------------
# Create Figure
# --------------------------
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12,5))
plt.subplots_adjust(bottom=0.25)

# --------------------------
# Left Plot
# --------------------------
ax1.scatter(x_train, y_train, color='red', label='Actual Value')

w0 = 200
pred_line, = ax1.plot(x_train, w0*x_train + b, lw=3, label='Prediction')

ax1.set_title("Housing Prices")
ax1.set_xlabel("Size (1000 sqft)")
ax1.set_ylabel("Price (1000s of dollars)")
ax1.legend()

# --------------------------
# Right Plot
# --------------------------
ax2.plot(w_vals, cost_vals)

cost_point, = ax2.plot(w0, compute_cost(x_train, y_train, w0, b),
                       'ro', markersize=10)

ax2.set_title("Cost vs w (b=100)")
ax2.set_xlabel("w")
ax2.set_ylabel("Cost")

# --------------------------
# Text Display
# --------------------------
cost_text = fig.text(0.5, 0.93,
                     f"Current Cost = {compute_cost(x_train, y_train, w0, b):.2f}",
                     ha="center",
                     fontsize=14,
                     fontweight="bold")

# --------------------------
# Slider
# --------------------------
ax_w = plt.axes([0.2, 0.1, 0.6, 0.04])
w_slider = Slider(ax=ax_w,
                  label="w",
                  valmin=0,
                  valmax=400,
                  valinit=w0)

# --------------------------
# Update Function
# --------------------------
def update(val):
    w = w_slider.val
    
    # Update line
    pred_line.set_ydata(w*x_train + b)
    
    # Update cost dot
    cost = compute_cost(x_train, y_train, w, b)
    cost_point.set_data([w], [cost])
    
    # Update text
    cost_text.set_text(f"Current Cost = {cost:.2f}")
    
    fig.canvas.draw_idle()

w_slider.on_changed(update)

plt.show()


- With $b = 100$, the cost is minimized, and equals zero, when $w = 200$.
- Because the difference between the target and the prediction is squared in the cost equation, the cost grows quadratically as $w$ moves away from 200 in either direction.
- Using the `w` and `b` selected by minimizing cost results in a line that passes through both training points, a perfect fit to the data.
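
These observations can be checked by hand. Substituting the two training examples and $b = 100$ into equation (1), with $m = 2$, gives:

$$J(w, 100) = \frac{1}{4}\left[(w \cdot 1 + 100 - 300)^2 + (w \cdot 2 + 100 - 500)^2\right] = \frac{1}{4}\left[(w - 200)^2 + 4(w - 200)^2\right] = \frac{5}{4}(w - 200)^2$$

This is a parabola in $w$ with its minimum value of zero at $w = 200$, which matches the bowl-shaped curve in the right-hand plot.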