# Problem Statement

This example uses two data points: a house of 1000 square feet that sold for \$300,000 and a house of 2000 square feet that sold for \$500,000.

| Size (1000 sqft) | Price (1000s of dollars) |
| ---------------- | ------------------------ |
| 1                | 300                      |
| 2                | 500                      |

In [1]:
import numpy as np
import math

In [2]:
# Training data: house sizes in units of 1000 sqft, prices in thousands of dollars
x_train = np.array([1, 2])
y_train = np.array([300, 500])

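The cost function for linear regression with one variable is:
$$J(w,b) = \frac{1}{2m} \sum\limits_{i = 0}^{m-1} (f_{w,b}(x^{(i)}) - y^{(i)})^2 \tag{1}$$
where $f_{w,b}(x^{(i)}) = wx^{(i)} + b$ is the model's prediction for example $i$ and $m$ is the number of training examples.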

In [3]:
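def compute_cost(x, y, w, b):
    m = x.shape[0]  # number of training examples
    cost = 0
    for i in range(m):
        fwb_x = w * x[i] + b  # model prediction for example i
        cost = cost + (fwb_x - y[i]) ** 2  # accumulate the squared error
    total_cost = 1 / (2 * m) * cost
    return total_cost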
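
The loop above can also be written without explicit iteration. The following is a minimal sketch, not part of the original notebook: the name `compute_cost_vectorized` is hypothetical, and it assumes `x` and `y` are NumPy arrays of equal length.

```python
import numpy as np

# Hypothetical vectorized counterpart of compute_cost (name is an assumption).
def compute_cost_vectorized(x, y, w, b):
    errors = w * x + b - y  # f_wb(x^(i)) - y^(i) for every example at once
    return np.sum(errors ** 2) / (2 * x.shape[0])

x_train = np.array([1, 2])
y_train = np.array([300, 500])

print(compute_cost_vectorized(x_train, y_train, 0, 0))      # prints 85000.0
print(compute_cost_vectorized(x_train, y_train, 200, 100))  # prints 0.0
```

The second call returns zero because the line $f(x) = 200x + 100$ passes exactly through both training points.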

The gradient is defined as:
$$
\begin{align}
\frac{\partial J(w,b)}{\partial w}  &= \frac{1}{m} \sum\limits_{i = 0}^{m-1} (f_{w,b}(x^{(i)}) - y^{(i)})x^{(i)} \\
\frac{\partial J(w,b)}{\partial b}  &= \frac{1}{m} \sum\limits_{i = 0}^{m-1} (f_{w,b}(x^{(i)}) - y^{(i)})
\end{align}
$$

In [4]:
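def compute_gradient(x, y, w, b):
    m = x.shape[0]  # number of training examples
    dj_dw = 0
    dj_db = 0
    for i in range(m):
        fwb_x = w * x[i] + b  # model prediction for example i
        dj_dw_i = (fwb_x - y[i]) * x[i]  # contribution of example i to dJ/dw
        dj_db_i = (fwb_x - y[i])  # contribution of example i to dJ/db
        dj_dw += dj_dw_i
        dj_db += dj_db_i
    dj_dw = dj_dw / m
    dj_db = dj_db / m
    return dj_dw, dj_db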
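
One way to gain confidence in the gradient formulas is to compare them with a central finite-difference estimate of the cost. The sketch below is an illustration rather than part of the original notebook: `cost`, `analytic_gradient`, `numeric_gradient`, and the step size `eps` are all assumed names and values, and the functions are written in vectorized form so the block runs on its own.

```python
import numpy as np

def cost(x, y, w, b):
    # Same cost J(w, b) as equation (1), in vectorized form
    return np.sum((w * x + b - y) ** 2) / (2 * x.shape[0])

def analytic_gradient(x, y, w, b):
    # The two partial derivatives from the formulas above
    err = w * x + b - y
    return np.mean(err * x), np.mean(err)

def numeric_gradient(x, y, w, b, eps=1e-4):
    # Central differences: (J(p + eps) - J(p - eps)) / (2 * eps) for each parameter
    dj_dw = (cost(x, y, w + eps, b) - cost(x, y, w - eps, b)) / (2 * eps)
    dj_db = (cost(x, y, w, b + eps) - cost(x, y, w, b - eps)) / (2 * eps)
    return dj_dw, dj_db

x_train = np.array([1.0, 2.0])
y_train = np.array([300.0, 500.0])

print(analytic_gradient(x_train, y_train, 0.0, 0.0))
print(numeric_gradient(x_train, y_train, 0.0, 0.0))
```

At $w = 0$, $b = 0$ the analytic values are $-650$ and $-400$, and the numeric estimate should agree to within floating-point error because the cost is quadratic in both parameters.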

$$\begin{align*} \text{gradient}&\text{ descent:} \; \lbrace \newline
\;  w &= w -  \alpha \frac{\partial J(w,b)}{\partial w} \tag{3}  \; \newline 
 b &= b -  \alpha \frac{\partial J(w,b)}{\partial b}  \newline \rbrace
\end{align*}$$
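where the parameters $w$ and $b$ are updated simultaneously: both partial derivatives are computed from the current $w$ and $b$ before either parameter is changed.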
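
As a worked first step, take the two training points from the table and assume the starting values $w = 0$, $b = 0$ and learning rate $\alpha = 0.01$ (the same settings used in the run further down). Both gradients are evaluated at the starting point, then both parameters are updated:

$$\begin{align*}
\frac{\partial J(w,b)}{\partial w} &= \frac{1}{2}\big[(0 - 300)(1) + (0 - 500)(2)\big] = -650 \\
\frac{\partial J(w,b)}{\partial b} &= \frac{1}{2}\big[(0 - 300) + (0 - 500)\big] = -400 \\
w &= 0 - 0.01\,(-650) = 6.5 \\
b &= 0 - 0.01\,(-400) = 4
\end{align*}$$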

In [5]:
def gradient_descent(x, y, w_in, b_in, alpha, num_iters, cost_function, gradient_function):
    J_history = []  # cost after each update
    p_history = []  # [w, b] after each update
    b = b_in
    w = w_in
    for i in range(num_iters):
        # Compute both gradients at the current (w, b) before changing either parameter
        dj_dw, dj_db = gradient_function(x, y, w, b)
        b = b - alpha * dj_db
        w = w - alpha * dj_dw
        J_history.append(cost_function(x, y, w, b))
        p_history.append([w, b])
        # Report progress roughly ten times over the run
        if i % math.ceil(num_iters / 10) == 0:
            print(f"Iteration {i}: Cost {J_history[-1]:0.2e}",
                  f"dj_dw: {dj_dw:0.3e},dj_db:{dj_db:0.3e}",
                  f"w:{w:0.3e},b:{b:0.5e}")
    return w, b, J_history, p_history

In [6]:
w_init = 0
b_init = 0
iterations = 10000
tmp_alpha = 1.0e-2  # learning rate
w_final, b_final, J_hist, p_hist = gradient_descent(
    x_train, y_train, w_init, b_init, tmp_alpha, iterations,
    compute_cost, compute_gradient)
print(f"(w,b) found by gradient descent: ({w_final:8.4f},{b_final:8.4f})")

Iteration 0: Cost 7.93e+04 dj_dw: -6.500e+02,dj_db:-4.000e+02 w:6.500e+00,b:4.00000e+00
Iteration 1000: Cost 3.41e+00 dj_dw: -3.712e-01,dj_db:6.007e-01 w:1.949e+02,b:1.08228e+02
Iteration 2000: Cost 7.93e-01 dj_dw: -1.789e-01,dj_db:2.895e-01 w:1.975e+02,b:1.03966e+02
Iteration 3000: Cost 1.84e-01 dj_dw: -8.625e-02,dj_db:1.396e-01 w:1.988e+02,b:1.01912e+02
Iteration 4000: Cost 4.28e-02 dj_dw: -4.158e-02,dj_db:6.727e-02 w:1.994e+02,b:1.00922e+02
Iteration 5000: Cost 9.95e-03 dj_dw: -2.004e-02,dj_db:3.243e-02 w:1.997e+02,b:1.00444e+02
Iteration 6000: Cost 2.31e-03 dj_dw: -9.660e-03,dj_db:1.563e-02 w:1.999e+02,b:1.00214e+02
Iteration 7000: Cost 5.37e-04 dj_dw: -4.657e-03,dj_db:7.535e-03 w:1.999e+02,b:1.00103e+02
Iteration 8000: Cost 1.25e-04 dj_dw: -2.245e-03,dj_db:3.632e-03 w:2.000e+02,b:1.00050e+02
Iteration 9000: Cost 2.90e-05 dj_dw: -1.082e-03,dj_db:1.751e-03 w:2.000e+02,b:1.00024e+02
(w,b) found by gradient descent: (199.9929,100.0116)
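
The values found by gradient descent can be cross-checked against a direct least-squares fit. This is a sketch added for comparison, not a step in the original notebook; it uses NumPy's `np.polyfit` with degree 1, and the names `w_exact` and `b_exact` are assumptions.

```python
import numpy as np

x_train = np.array([1, 2])
y_train = np.array([300, 500])

# Degree-1 least-squares fit; polyfit returns coefficients highest power first,
# so the slope (w) comes before the intercept (b).
w_exact, b_exact = np.polyfit(x_train, y_train, 1)
print(f"(w,b) from least squares: ({w_exact:8.4f},{b_exact:8.4f})")
```

With only two training points the fitted line passes through both, so the result should be very close to $w = 200$, $b = 100$, which is the point the gradient descent iterates above are approaching.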


In [7]:
print(f"1000 sqft house prediction {w_final*1.0 + b_final:0.1f} thousand dollars")
print(f"2000 sqft house prediction {w_final*2.0 + b_final:0.1f} thousand dollars")
print(f"3000 sqft house prediction {w_final*3.0 + b_final:0.1f} thousand dollars")
print(f"4000 sqft house prediction {w_final*4.0 + b_final:0.1f} thousand dollars")

1000 sqft house prediction 300.0 thousand dollars
2000 sqft house prediction 500.0 thousand dollars
3000 sqft house prediction 700.0 thousand dollars
4000 sqft house prediction 900.0 thousand dollars
