In [2]:
import numpy as np
import pandas as pd 
import matplotlib.pyplot as plt

In [3]:
# small made-up housing dataset: x_train is the input feature, y_train is the target price
x_train = np.array([1.0, 2.0, 3.0, 3.5, 4.0, 6.5, 8.0])
y_train = np.array([300.0, 500.0, 730.0, 800.0, 950.0, 1300, 1900])

## Computing Cost
The term 'cost' in this assignment might be a little confusing, since the data is about housing cost. Here, cost is a measure of how well our model predicts the target price of the house. The term 'price' is used for the housing data.

The equation for cost with one variable is:
  $$J(w,b) = \frac{1}{2m} \sum\limits_{i = 0}^{m-1} (f_{w,b}(x^{(i)}) - y^{(i)})^2 \tag{1}$$ 
 
where 
  $$f_{w,b}(x^{(i)}) = wx^{(i)} + b \tag{2}$$
  
- $f_{w,b}(x^{(i)})$ is our prediction for example $i$ using parameters $w,b$.  
- $(f_{w,b}(x^{(i)}) -y^{(i)})^2$ is the squared difference between the target value and the prediction.   
- These squared differences are summed over all $m$ examples and divided by $2m$ to produce the cost, $J(w,b)$.  
> Note: in lecture, summation ranges typically run from 1 to $m$, while code indices run from 0 to $m-1$.
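
To make equations (1) and (2) concrete, here is a minimal sketch that prints each term of the sum for one choice of parameters. The values `w = 200` and `b = 100` are picked purely for illustration, and the list name `squared_errors` is introduced here for the walkthrough.

```python
import numpy as np

# Training data copied from the cell above.
x_train = np.array([1.0, 2.0, 3.0, 3.5, 4.0, 6.5, 8.0])
y_train = np.array([300.0, 500.0, 730.0, 800.0, 950.0, 1300, 1900])

# Illustrative parameter values (an assumption for this walkthrough).
w, b = 200, 100

m = x_train.shape[0]
squared_errors = []
for i in range(m):
    f_wb = w * x_train[i] + b            # equation (2): prediction for example i
    sq_err = (f_wb - y_train[i]) ** 2    # one term of the sum in equation (1)
    squared_errors.append(float(sq_err))
    print(f"i={i}  x={x_train[i]:4.1f}  f_wb={f_wb:7.1f}  "
          f"y={y_train[i]:7.1f}  squared error={sq_err:8.1f}")

# Sum the terms and divide by 2m, as in equation (1).
cost = sum(squared_errors) / (2 * m)
print(f"sum = {sum(squared_errors):.1f}, J(w,b) = {cost:.4f}")
```

With these values only four of the seven examples contribute a non-zero error. The squared errors sum to 53400, so $J(w,b) = 53400 / 14 \approx 3814.29$.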


The code below calculates the cost by looping over each example. In each iteration:
- the prediction `f_wb` is calculated
- the difference between the prediction and the target is calculated and squared
- the squared difference is added to the running total

After the loop, the running total is divided by `2 * m` to give the cost.

In [4]:
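def compute_cost(x, y, w, b):
    """Compute the cost J(w, b) for linear regression with one variable.

    x, y: 1-D arrays of m inputs and m targets; w, b: scalar parameters.
    """
    m = x.shape[0]  # number of training examples

    cost_sum = 0
    for i in range(m):
        f_wb = w * x[i] + b             # prediction for example i, equation (2)
        cost = (f_wb - y[i]) ** 2       # squared error for example i
        cost_sum = cost_sum + cost
    total_cost = (1 / (2 * m)) * cost_sum  # divide the sum by 2m, equation (1)

    return total_cost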
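
The explicit loop mirrors equation (1) term by term. As an alternative, the same quantity can be computed with NumPy array operations. The sketch below is not part of the original notebook; `compute_cost_vectorized` is a hypothetical name, and it assumes `x` and `y` are 1-D arrays of equal length.

```python
import numpy as np

def compute_cost_vectorized(x, y, w, b):
    """Hypothetical vectorized variant of compute_cost (an assumption, not the notebook's code)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    errors = (w * x + b) - y          # prediction minus target for all examples at once
    return np.sum(errors ** 2) / (2 * x.shape[0])

# Same training data as in the notebook.
x_train = np.array([1.0, 2.0, 3.0, 3.5, 4.0, 6.5, 8.0])
y_train = np.array([300.0, 500.0, 730.0, 800.0, 950.0, 1300, 1900])

print(compute_cost_vectorized(x_train, y_train, 100, 100))
```

For small datasets like this one the two versions behave the same. The vectorized form mainly pays off when `m` is large, since it avoids the Python-level loop.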


In [5]:
w = 100
b = 100
compute_cost(x_train, y_train, w, b)

np.float64(127600.0)

In [6]:
w = 200
b = 100
compute_cost(x_train, y_train, w, b)


np.float64(3814.285714285714)

In [7]:
w = 200
b = 0.1
compute_cost(x_train, y_train, w, b)

np.float64(11373.14785714286)

In [8]:
w = 214
b = 66
compute_cost(x_train, y_train, w, b)

np.float64(2961.5714285714284)
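
The four trials above can also be evaluated side by side. The sketch below uses NumPy broadcasting to compute the cost for each `(w, b)` pair in one pass. The array name `candidates` and the layout (one pair per row) are choices made for this illustration only.

```python
import numpy as np

# Same training data as in the notebook.
x_train = np.array([1.0, 2.0, 3.0, 3.5, 4.0, 6.5, 8.0])
y_train = np.array([300.0, 500.0, 730.0, 800.0, 950.0, 1300, 1900])

# The four (w, b) pairs tried above, one pair per row.
candidates = np.array([[100, 100], [200, 100], [200, 0.1], [214, 66]])
w = candidates[:, 0:1]   # shape (4, 1), broadcasts against x_train of shape (7,)
b = candidates[:, 1:2]   # shape (4, 1)

predictions = w * x_train + b    # shape (4, 7): one row of predictions per candidate
costs = ((predictions - y_train) ** 2).sum(axis=1) / (2 * x_train.shape[0])

best = int(np.argmin(costs))     # index of the candidate with the lowest cost
for (w_i, b_i), cost_i in zip(candidates, costs):
    print(f"w={w_i:6.1f}  b={b_i:6.1f}  cost={cost_i:12.4f}")
print("lowest cost among these candidates:", candidates[best])
```

Among these four pairs, `w = 214`, `b = 66` gives the lowest cost. That only ranks the values tried here; it does not show that this pair minimizes $J(w,b)$ overall.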