Linear Regression with a single variable (feature):
1. MODEL: 
$$ f_{w,b}(x) = wx + b \tag{1}$$

2. COST-FUNCTION: 
$$ J(w,b) = \frac{1}{2m} \sum\limits_{i = 0}^{m-1} (f_{w,b}(x^{(i)}) - y^{(i)})^2 \tag{2}$$ 
where $m$ is the number of training examples and:
$$ f_{w,b}(x^{(i)}) = wx^{(i)} + b  \tag{3} $$

3. GRADIENT-DESCENT:
$$\begin{align*} \text{repeat}&\text{ until convergence:} \; \lbrace \newline
\;  w &= w -  \alpha \frac{\partial J(w,b)}{\partial w} \tag{4}  \; \newline 
 b &= b -  \alpha \frac{\partial J(w,b)}{\partial b}  \newline \rbrace
\end{align*}$$
where the parameters $w$ and $b$ are updated simultaneously and $\alpha$ is the learning rate.  

4. THE GRADIENT IS DEFINED AS:
$$
\begin{align}
\frac{\partial J(w,b)}{\partial w}  &= \frac{1}{m} \sum\limits_{i = 0}^{m-1} (f_{w,b}(x^{(i)}) - y^{(i)})x^{(i)} \tag{5}\\
  \frac{\partial J(w,b)}{\partial b}  &= \frac{1}{m} \sum\limits_{i = 0}^{m-1} (f_{w,b}(x^{(i)}) - y^{(i)}) \tag{6}\\
\end{align}
$$
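
Equations (5) and (6) follow from applying the chain rule to the cost in equation (2), with the model of equation (3) substituted in. A short derivation sketch for $w$ (the factor of 2 from the square cancels the $\frac{1}{2}$ in the cost):
$$
\begin{align*}
\frac{\partial J(w,b)}{\partial w} &= \frac{1}{2m} \sum\limits_{i = 0}^{m-1} \frac{\partial}{\partial w} \left(wx^{(i)} + b - y^{(i)}\right)^2 \\
&= \frac{1}{2m} \sum\limits_{i = 0}^{m-1} 2\left(wx^{(i)} + b - y^{(i)}\right) x^{(i)} \\
&= \frac{1}{m} \sum\limits_{i = 0}^{m-1} (f_{w,b}(x^{(i)}) - y^{(i)})x^{(i)}
\end{align*}
$$
The derivation for $b$ is identical except that the inner derivative $\frac{\partial}{\partial b}\left(wx^{(i)} + b - y^{(i)}\right)$ equals $1$ instead of $x^{(i)}$, which gives equation (6).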

Here, *simultaneously* means that the partial derivatives for all parameters are computed before any parameter is updated.
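
To make the distinction concrete, the following minimal sketch compares one simultaneous update step with one sequential step on a two-point dataset. The helper name `gradient`, the learning rate of 0.1, and the starting point are illustrative assumptions, not part of the notebook's own code.

```python
# Minimal sketch: simultaneous vs. sequential parameter updates.
# The helper `gradient` and the values below are illustrative assumptions.
import numpy as np

x = np.array([1.0, 2.0])
y = np.array([300.0, 500.0])

def gradient(x, y, w, b):
    """Return (dJ/dw, dJ/db) following equations (5) and (6)."""
    err = (w * x + b) - y
    return np.mean(err * x), np.mean(err)

w, b, alpha = 0.0, 0.0, 0.1

# Simultaneous update: both derivatives are evaluated at the old (w, b).
dj_dw, dj_db = gradient(x, y, w, b)
w_sim = w - alpha * dj_dw
b_sim = b - alpha * dj_db

# Sequential update (not what equation (4) prescribes):
# the derivative for b is evaluated after w has already changed.
w_seq = w - alpha * gradient(x, y, w, b)[0]
b_seq = b - alpha * gradient(x, y, w_seq, b)[1]

print(f"simultaneous: w = {w_sim:.2f}, b = {b_sim:.2f}")
print(f"sequential:   w = {w_seq:.2f}, b = {b_seq:.2f}")
```

In this sketch both variants produce the same `w`, but `b` differs (40.00 versus 30.25), because the sequential version computes the derivative for `b` at the already-updated `w`.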

In [None]:
# Import libraries
import numpy as np
import matplotlib.pyplot as plt

In [None]:
# Training data (units follow the prediction cell: size in 1000 sqft, price in thousands of dollars)
x_train = np.array([1.0, 2.0])        # feature: house size
y_train = np.array([300.0, 500.0])    # target: house price

In [None]:
m = len(x_train)
print(f"Size of dataset: {m}")

Size of dataset: 2


In [None]:
# For a single example, the model is given by equation (1).
# This helper sums the model's predictions over all examples in x.
def compute_model(x, w, b):
  m = len(x)  # number of training examples
  f_total = 0
  for i in range(m):
    f_total += (w * x[i]) + b
  return f_total

# Testing the function with sample values of 'w' and 'b'
print(f"f_total: {compute_model(x_train, 10.6, 20)}") 

f_total: 71.80000000000001


In [None]:
# Computing the cost-function using equation (2)
def compute_cost(x, y, w, b):
  m = len(x)  # number of training examples
  cost = 0
  for i in range(m):
    f_wb = (w * x[i]) + b
    cost += (f_wb - y[i]) ** 2
  total_cost = cost / (2 * m)
  return total_cost

# Testing the function with sample values
print(f"total_cost: {compute_cost(x_train, y_train, 10.6, 20)}")


total_cost: 70768.45
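
The loop in equation (2) can also be written with NumPy array operations. The sketch below is one possible vectorized form; the function name `compute_cost_vectorized` is an assumption introduced here for illustration, and the data is repeated so the block runs on its own.

```python
# Minimal sketch of a vectorized cost, assuming 1-D NumPy inputs.
# `compute_cost_vectorized` is a hypothetical name, not part of the notebook.
import numpy as np

def compute_cost_vectorized(x, y, w, b):
    """Cost from equation (2), computed without an explicit Python loop."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    errors = (w * x + b) - y              # f_wb(x_i) - y_i for every example
    return np.sum(errors ** 2) / (2 * x.shape[0])

x_train = np.array([1.0, 2.0])
y_train = np.array([300.0, 500.0])

cost = compute_cost_vectorized(x_train, y_train, 10.6, 20)
print(f"total_cost: {cost:.2f}")
```

For the same sample values of `w` and `b`, this should agree with the loop-based cost up to floating-point rounding.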


In [None]:
# Computing the gradient using equations (5) and (6)
def compute_gradient(x, y, w, b):
  m = len(x)  # number of training examples
  dj_w = 0
  dj_b = 0
  for i in range(m):
    f_wb = w * x[i] + b
    # Accumulate each example's contribution to the two partial derivatives
    dj_w += (f_wb - y[i]) * x[i]
    dj_b += (f_wb - y[i])
  dj_w = dj_w / m
  dj_b = dj_b / m
  return dj_w, dj_b

# Testing the function with sample values
dj_w, dj_b = compute_gradient(x_train, y_train, 100, 200)
print(f"dj_w = {dj_w} and dj_b = {dj_b}")

dj_w = -100.0 and dj_b = -50.0
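
One common way to sanity-check an analytic gradient is to compare it with a central finite-difference estimate of the cost. The sketch below does this for equations (5) and (6); the helper names and the step size `eps` are assumptions chosen for illustration.

```python
# Minimal sketch of a finite-difference gradient check.
# Helper names and eps are illustrative assumptions.
import numpy as np

def cost(x, y, w, b):
    """Cost from equation (2)."""
    return np.sum((w * x + b - y) ** 2) / (2 * x.shape[0])

def analytic_gradient(x, y, w, b):
    """Gradient from equations (5) and (6)."""
    err = w * x + b - y
    return np.mean(err * x), np.mean(err)

def numerical_gradient(x, y, w, b, eps=1e-4):
    """Central-difference approximation of (dJ/dw, dJ/db)."""
    dj_dw = (cost(x, y, w + eps, b) - cost(x, y, w - eps, b)) / (2 * eps)
    dj_db = (cost(x, y, w, b + eps) - cost(x, y, w, b - eps)) / (2 * eps)
    return dj_dw, dj_db

x_train = np.array([1.0, 2.0])
y_train = np.array([300.0, 500.0])

analytic = analytic_gradient(x_train, y_train, 100, 200)
numeric = numerical_gradient(x_train, y_train, 100, 200)
print(f"analytic: dj_w = {analytic[0]:.4f}, dj_b = {analytic[1]:.4f}")
print(f"numeric:  dj_w = {numeric[0]:.4f}, dj_b = {numeric[1]:.4f}")
```

Because the cost is quadratic in `w` and `b`, the central difference matches the analytic gradient up to rounding error, so a large mismatch would point to a bug in the gradient code.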


In [None]:
# Run gradient descent to estimate w and b
def gradient_descent(x, y, w, b, alpha, no_of_iterations=10000):
  # A fixed number of iterations stands in for "repeat until convergence"
  for i in range(no_of_iterations):
    # Compute both partial derivatives before updating either parameter,
    # so that w and b are updated simultaneously (equation (4))
    dj_w, dj_b = compute_gradient(x, y, w, b)
    w = w - (alpha * dj_w)
    b = b - (alpha * dj_b)
  return w, b
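
A fixed iteration count does not by itself show that the descent has converged. The sketch below is one possible variant that records the cost after every step and stops early once the cost change falls below a tolerance. The names `gradient_descent_with_history`, `max_iters`, and `tol`, and the tolerance value, are assumptions introduced for illustration.

```python
# Minimal sketch: gradient descent with a cost history and a stopping tolerance.
# Function name, max_iters, and tol are illustrative assumptions.
import numpy as np

def cost(x, y, w, b):
    """Cost from equation (2)."""
    return np.sum((w * x + b - y) ** 2) / (2 * x.shape[0])

def gradient(x, y, w, b):
    """Gradient from equations (5) and (6)."""
    err = w * x + b - y
    return np.mean(err * x), np.mean(err)

def gradient_descent_with_history(x, y, w, b, alpha, max_iters=10000, tol=1e-9):
    history = [cost(x, y, w, b)]
    for _ in range(max_iters):
        dj_dw, dj_db = gradient(x, y, w, b)
        # Simultaneous update: both derivatives were computed at the old (w, b)
        w, b = w - alpha * dj_dw, b - alpha * dj_db
        history.append(cost(x, y, w, b))
        # Stop once the cost has effectively stopped changing
        if abs(history[-2] - history[-1]) < tol:
            break
    return w, b, history

x_train = np.array([1.0, 2.0])
y_train = np.array([300.0, 500.0])

w, b, history = gradient_descent_with_history(x_train, y_train, 0.0, 0.0, 1.0e-2)
print(f"w = {w:.4f}, b = {b:.4f} after {len(history) - 1} iterations")
print(f"initial cost = {history[0]:.2f}, final cost = {history[-1]:.2e}")
```

If the recorded cost ever increases from one step to the next, that usually indicates a learning rate that is too large for the data.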

In [None]:
# Starting by supplying initial values for w, b and alpha
w_init = 0
b_init = 0
alpha = 1.0e-2
w_final, b_final = gradient_descent(x_train, y_train, w_init, b_init, alpha)
print(f"w = {w_final:0.4f} and b = {b_final:0.4f}")

w = 199.9929 and b = 100.0116


In [None]:
# Making predictions (input size is given in units of 1000 sqft)
print(f"1000 sqft house prediction {w_final*1.0 + b_final:0.1f} Thousand dollars")
print(f"1200 sqft house prediction {w_final*1.2 + b_final:0.1f} Thousand dollars")
print(f"2000 sqft house prediction {w_final*2.0 + b_final:0.1f} Thousand dollars")

1000 sqft house prediction 300.0 Thousand dollars
1200 sqft house prediction 340.0 Thousand dollars
2000 sqft house prediction 500.0 Thousand dollars
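
Because the cost in equation (2) is quadratic in $w$ and $b$, its minimum can also be computed in closed form, which gives a reference value to compare against whatever `gradient_descent` returns. The sketch below uses the standard least-squares formulas for a single feature; the variable names `w_exact` and `b_exact` are assumptions introduced for illustration.

```python
# Minimal sketch: closed-form least-squares fit for a single feature,
# used only as a reference for the gradient-descent result.
import numpy as np

x_train = np.array([1.0, 2.0])
y_train = np.array([300.0, 500.0])

x_mean = x_train.mean()
y_mean = y_train.mean()

# Slope: covariance of x and y divided by the variance of x
w_exact = np.sum((x_train - x_mean) * (y_train - y_mean)) / np.sum((x_train - x_mean) ** 2)
# Intercept: the fitted line passes through the point (x_mean, y_mean)
b_exact = y_mean - w_exact * x_mean

print(f"w = {w_exact:.1f} and b = {b_exact:.1f}")
```

For this two-point dataset the line passes through both points, so the closed-form values are `w = 200.0` and `b = 100.0`. Gradient descent with a small learning rate should approach these values as the number of iterations grows.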
