# Steps to train a Linear Regression Model

1. Choose a dataset and fill <span style="color:blue">x_train</span> and <span style="color:blue">y_train</span> accordingly.
2. Adjust <span style="color:blue">iterations</span> and <span style="color:blue">tmp_alpha</span> (learning rate) as needed.
3. Run the gradient descent function to get the optimal parameters.
4. Use the final w and b to make predictions:

In [47]:
def predict(x, w, b):
    return w * x + b

5. Visualize results using matplotlib as shown in the notebook.
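The plotting cell is not reproduced in this section, so the following is only a minimal sketch of what step 5 might look like. The values of w and b are assumed placeholders that stand in for the parameters returned by gradient descent, and the axis labels and colors are illustrative choices rather than the notebook's own.

```python
import numpy as np
import matplotlib.pyplot as plt

# Sample data from this notebook (sizes in 1000 sq ft, prices in $1000s)
x_train = np.array([1.0, 1.5, 2.0, 2.5, 3.0])
y_train = np.array([200.0, 300.0, 400.0, 500.0, 600.0])

# Assumed placeholder values; substitute the w and b returned by gradient descent
w, b = 200.0, 0.0

fig, ax = plt.subplots()
# Training examples as red crosses
ax.scatter(x_train, y_train, marker="x", color="red", label="Training data")
# Fitted line evaluated at the training inputs
ax.plot(x_train, w * x_train + b, color="blue", label="Model prediction")
ax.set_xlabel("Size (1000 sq ft)")
ax.set_ylabel("Price (1000s of dollars)")
ax.set_title("Linear regression fit")
ax.legend()
plt.show()
```

If the line passes through or close to the crosses, the trained parameters describe the data well.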

# 1. Data Loading and Setup

In [50]:
import numpy as np
import math

# House sizes in thousands of square feet (1000-3000 sq ft)
x_train = np.array([1.0, 1.5, 2.0, 2.5, 3.0])

# House prices in thousands of dollars
# Each 500 sqft increase adds $100k to the price, starting at $200k for 1000 sqft
y_train = np.array([200.0, 300.0, 400.0, 500.0, 600.0])

This code imports the required libraries and sets up a small dataset of five houses, ranging from 1,000 to 3,000 square feet and priced from $200,000 to $600,000. Both arrays are expressed in thousands, so x_train holds 1.0 to 3.0 and y_train holds 200.0 to 600.0.

# 2. Cost Function

In [53]:
# Compute the squared-error cost J(w, b) over the training set
def compute_cost(x, y, w, b):
    m = x.shape[0]  # number of training examples
    cost = 0
    
    for i in range(m):
        f_wb = w * x[i] + b
        cost = cost + (f_wb - y[i])**2
    total_cost = 1 / (2 * m) * cost

    return total_cost

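This function calculates the cost (error) of the current model parameters (w and b) on the training data: the sum of squared differences between each prediction w * x + b and its target y, divided by 2m, where m is the number of training examples.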
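The same cost can be computed without an explicit loop. The sketch below is a vectorized equivalent, assuming x and y are NumPy arrays of equal length; the name compute_cost_vectorized is hypothetical and does not appear in the notebook.

```python
import numpy as np

def compute_cost_vectorized(x, y, w, b):
    """Squared-error cost J(w, b) = 1/(2m) * sum((w*x + b - y)**2)."""
    errors = w * x + b - y              # prediction error for every example at once
    return np.sum(errors ** 2) / (2 * x.shape[0])

x_train = np.array([1.0, 1.5, 2.0, 2.5, 3.0])
y_train = np.array([200.0, 300.0, 400.0, 500.0, 600.0])

# Cost at the starting point w = b = 0 and at the line y = 200 * x
cost_at_zero = compute_cost_vectorized(x_train, y_train, 0.0, 0.0)
cost_at_fit = compute_cost_vectorized(x_train, y_train, 200.0, 0.0)
print(cost_at_zero, cost_at_fit)
```

Because the sample data lie exactly on the line y = 200 * x, the cost there is zero, while the all-zero starting point has a large cost.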

# 3. Gradient Computation

In [56]:
def compute_gradient(x, y, w, b): 
    """
    Computes the gradient for linear regression 
    Args:
      x (ndarray (m,)): Data, m examples 
      y (ndarray (m,)): target values
      w,b (scalar)    : model parameters  
    Returns:
      dj_dw (scalar): The gradient of the cost w.r.t. the parameter w
      dj_db (scalar): The gradient of the cost w.r.t. the parameter b
    """
    
    # Number of training examples
    m = x.shape[0]    
    dj_dw = 0
    dj_db = 0
    
    for i in range(m):  
        f_wb = w * x[i] + b 
        dj_dw_i = (f_wb - y[i]) * x[i] 
        dj_db_i = f_wb - y[i] 
        dj_db += dj_db_i
        dj_dw += dj_dw_i 
    dj_dw = dj_dw / m 
    dj_db = dj_db / m 
        
    return dj_dw, dj_db

This function computes the partial derivatives of the cost function with respect to w and b, averaged over all m training examples.
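One way to gain confidence in an analytic gradient is to compare it against a central finite-difference approximation of the cost. The sketch below does this with compact vectorized versions of the cost and gradient; the helper names cost, gradient, and numerical_gradient, and the step size eps, are assumptions made for illustration.

```python
import numpy as np

def cost(x, y, w, b):
    # Same squared-error cost as compute_cost, written without a loop
    return np.sum((w * x + b - y) ** 2) / (2 * x.shape[0])

def gradient(x, y, w, b):
    # Analytic partial derivatives of the cost w.r.t. w and b
    errors = w * x + b - y
    return np.mean(errors * x), np.mean(errors)

def numerical_gradient(x, y, w, b, eps=1e-4):
    # Central differences: (J(p + eps) - J(p - eps)) / (2 * eps) for each parameter
    dj_dw = (cost(x, y, w + eps, b) - cost(x, y, w - eps, b)) / (2 * eps)
    dj_db = (cost(x, y, w, b + eps) - cost(x, y, w, b - eps)) / (2 * eps)
    return dj_dw, dj_db

x_train = np.array([1.0, 1.5, 2.0, 2.5, 3.0])
y_train = np.array([200.0, 300.0, 400.0, 500.0, 600.0])

analytic = gradient(x_train, y_train, 0.0, 0.0)
numeric = numerical_gradient(x_train, y_train, 0.0, 0.0)
print("analytic:", analytic)
print("numeric: ", numeric)
```

The two results should agree to several significant digits; a large mismatch usually points to a bug in the analytic gradient.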

# 4. Gradient Descent

In [59]:
def gradient_descent(x, y, w_in, b_in, alpha, num_iters): 
    # Lists to store the cost J and the parameters [w, b] at each iteration, primarily for graphing later
    J_history = []
    p_history = []
    b = b_in
    w = w_in
    
    for i in range(num_iters):
        # Calculate the gradients at the current w and b
        dj_dw, dj_db = compute_gradient(x, y, w, b)     
        
        # Update Parameters
        b = b - alpha * dj_db                            
        w = w - alpha * dj_dw                            
        
        # Save cost J at each iteration
        if i < 100000:      # prevent resource exhaustion 
            J_history.append(compute_cost(x, y, w, b))
            p_history.append([w,b])
            
        # Print progress about 10 times over the run (every iteration if num_iters < 10)
        if i % math.ceil(num_iters/10) == 0:
            print(f"Iteration {i:4}: Cost {J_history[-1]:0.2e} ",
                  f"dj_dw: {dj_dw: 0.3e}, dj_db: {dj_db: 0.3e}  ",
                  f"w: {w: 0.3e}, b:{b: 0.5e}")
 
    return w, b, J_history, p_history

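This function performs gradient descent to optimize w and b: in each of the num_iters iterations it computes the gradients and moves both parameters a step of size alpha in the opposite direction, recording the cost and the parameters along the way.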
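The learning rate alpha strongly affects how quickly the loop converges, and whether it converges at all. The sketch below compares three step sizes on the sample data from this notebook; run_gd is a hypothetical helper that mirrors the update rule above in vectorized form, and the specific alpha values are chosen only for illustration.

```python
import numpy as np

x_train = np.array([1.0, 1.5, 2.0, 2.5, 3.0])
y_train = np.array([200.0, 300.0, 400.0, 500.0, 600.0])

def cost(x, y, w, b):
    return np.sum((w * x + b - y) ** 2) / (2 * x.shape[0])

def run_gd(x, y, alpha, num_iters):
    """Hypothetical helper: gradient descent from w = b = 0, returns the final cost."""
    w, b = 0.0, 0.0
    for _ in range(num_iters):
        errors = w * x + b - y
        dj_dw, dj_db = np.mean(errors * x), np.mean(errors)
        # Both parameters are updated from gradients computed at the same (w, b)
        w, b = w - alpha * dj_dw, b - alpha * dj_db
    return cost(x, y, w, b)

cost_slow = run_gd(x_train, y_train, alpha=0.01, num_iters=1000)
cost_fast = run_gd(x_train, y_train, alpha=0.3, num_iters=1000)
cost_diverged = run_gd(x_train, y_train, alpha=0.5, num_iters=50)

print(f"alpha=0.01, 1000 iterations: cost {cost_slow:0.2e}")
print(f"alpha=0.3,  1000 iterations: cost {cost_fast:0.2e}")
print(f"alpha=0.5,    50 iterations: cost {cost_diverged:0.2e}")
```

On this dataset the larger step of 0.3 reaches a far lower cost than 0.01 in the same number of iterations, while 0.5 overshoots and the cost grows instead of shrinking. Suitable values depend on the scale of the data, so they need to be re-tuned for a different dataset.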

# 5. Training the Model

In [62]:
w_init = 0
b_init = 0
iterations = 10000
tmp_alpha = 0.01

w_final, b_final, J_hist, p_hist = gradient_descent(x_train, y_train, w_init, b_init, tmp_alpha, iterations)
print(f"\n(w,b) found by gradient descent: ({w_final:8.4f},{b_final:8.4f})")

Iteration    0: Cost 8.06e+04  dj_dw: -9.000e+02, dj_db: -4.000e+02   w:  9.000e+00, b: 4.00000e+00
Iteration 1000: Cost 4.96e+01  dj_dw: -1.252e+00, dj_db:  2.759e+00   w:  1.865e+02, b: 2.98126e+01
Iteration 2000: Cost 7.79e+00  dj_dw: -4.964e-01, dj_db:  1.094e+00   w:  1.946e+02, b: 1.18209e+01
Iteration 3000: Cost 1.22e+00  dj_dw: -1.968e-01, dj_db:  4.338e-01   w:  1.979e+02, b: 4.68705e+00
Iteration 4000: Cost 1.93e-01  dj_dw: -7.805e-02, dj_db:  1.720e-01   w:  1.992e+02, b: 1.85844e+00
Iteration 5000: Cost 3.03e-02  dj_dw: -3.095e-02, dj_db:  6.820e-02   w:  1.997e+02, b: 7.36885e-01
Iteration 6000: Cost 4.76e-03  dj_dw: -1.227e-02, dj_db:  2.704e-02   w:  1.999e+02, b: 2.92180e-01
Iteration 7000: Cost 7.48e-04  dj_dw: -4.865e-03, dj_db:  1.072e-02   w:  1.999e+02, b: 1.15851e-01
Iteration 8000: Cost 1.18e-04  dj_dw: -1.929e-03, dj_db:  4.251e-03   w:  2.000e+02, b: 4.59357e-02
Iteration 9000: Cost 1.85e-05  dj_dw: -7.649e-04, dj_db:  1.686e-03   w:  2.000e+02, b: 1.82138e-02
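For a one-variable linear model, the parameters that minimize the squared-error cost can also be obtained directly by least squares, which gives a reference value for the gradient descent result. The sketch below uses np.polyfit for this; it is a cross-check added here, not part of the original notebook, and the names w_exact and b_exact are illustrative.

```python
import numpy as np

x_train = np.array([1.0, 1.5, 2.0, 2.5, 3.0])
y_train = np.array([200.0, 300.0, 400.0, 500.0, 600.0])

# For degree 1, np.polyfit returns the coefficients as [slope, intercept]
w_exact, b_exact = np.polyfit(x_train, y_train, 1)
print(f"(w,b) from least squares: ({w_exact:8.4f},{b_exact:8.4f})")
```

The sample data lie exactly on the line y = 200 * x, so the least-squares solution is w = 200 and b = 0 up to floating-point rounding. The training log above approaches the same values, with b still shrinking toward zero at iteration 9000.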


With w and b found, we can now make predictions with the trained model using the following function:

In [64]:
def predict(x, w, b):
    return w * x + b

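The function predict returns the predicted value y for a given input x. With this dataset, x is a house size in thousands of square feet and the result is a price in thousands of dollars.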

In [83]:
value_to_predict = 1.8

In [85]:
predict(value_to_predict,w_final,b_final)

360.00132439802763
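Because predict uses only arithmetic operators, it also accepts a NumPy array and returns one prediction per element. The sketch below assumes rounded parameters w = 200 and b = 0 in place of w_final and b_final, and converts the units back to square feet and dollars for display; the sizes array is an illustrative input.

```python
import numpy as np

def predict(x, w, b):
    return w * x + b

# Assumed rounded parameters; substitute w_final and b_final from training
w, b = 200.0, 0.0

sizes = np.array([1.2, 1.8, 2.4])    # house sizes in thousands of square feet
prices = predict(sizes, w, b)         # predicted prices in thousands of dollars

for size, price in zip(sizes, prices):
    print(f"{size * 1000:.0f} sq ft -> ${price * 1000:,.0f}")
```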