In [None]:
import numpy as np
import matplotlib.pyplot as plt
from utils import *
import copy
import math
%matplotlib inline

Problem Statement
Suppose you are the CEO of a restaurant franchise and are considering different cities for opening a new outlet.

You would like to expand your business to cities that may give your restaurant higher profits.
The chain already has restaurants in various cities and you have data for profits and populations from the cities.
You also have data on cities that are candidates for a new restaurant.
For these cities, you have the city population.
Can you use the data to help you identify which cities may potentially give your business higher profits?

Dataset
You will start by loading the dataset for this task.

The load_data() function shown below loads the data into the variables x_train and y_train.
x_train is the population of a city.
y_train is the profit of a restaurant in that city. A negative value for profit indicates a loss.
Both x_train and y_train are NumPy arrays.

In [None]:
# load the dataset
x_train, y_train = load_data()
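
Before fitting a model, it can help to confirm that the two arrays line up and to see how many cities report a loss. The sketch below assumes 1-D arrays like the ones described above. The helper name describe_training_data and the demo values are hypothetical and are not taken from the real dataset.

```python
import numpy as np

def describe_training_data(x, y):
    """Hypothetical helper: return basic facts about a 1-D feature/target pair."""
    x = np.asarray(x)
    y = np.asarray(y)
    assert x.shape == y.shape, "x and y must have one entry per city"
    return {
        "num_examples": x.shape[0],
        "x_range": (float(x.min()), float(x.max())),
        "num_losses": int((y < 0).sum()),  # negative profit means a loss
    }

# Placeholder values for illustration only; not taken from the real dataset.
x_demo = np.array([5.0, 6.5, 8.0, 10.0])
y_demo = np.array([-1.2, 3.0, 4.1, 7.5])
summary = describe_training_data(x_demo, y_demo)
print(summary)
```

In the notebook itself, the same helper could be called as describe_training_data(x_train, y_train).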

Complete the compute_cost function below to:

Iterate over the training examples, and for each example, compute:

The prediction of the model for that example
$$f_{w,b}(x^{(i)}) = w x^{(i)} + b$$

The cost for that example
$$cost^{(i)} = (f_{w,b}(x^{(i)}) - y^{(i)})^2$$

Return the total cost over all examples
$$J(w,b) = \frac{1}{2m} \sum_{i=0}^{m-1} cost^{(i)}$$

Here, $m$ is the number of training examples and $\sum$ is the summation operator.
If you get stuck, you can check out the hints presented after the cell below to help you with the implementation.

In [None]:
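## Function for computing the cost
def compute_cost(x, y, w, b):
    """Return the squared-error cost J(w, b) of the line f(x) = w*x + b over (x, y)."""
    m = x.shape[0]  # number of training examples

    total_cost = 0.0
    for i in range(m):
        f_wb = w * x[i] + b               # prediction for example i
        total_cost += (f_wb - y[i]) ** 2  # squared error for example i

    # Apply the 1/(2m) factor once, after summing
    return total_cost / (2 * m)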
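
The per-example loop above can also be written with NumPy array operations. The following is a minimal sketch, assuming x and y are 1-D arrays of equal length. The names compute_cost_loop and compute_cost_vectorized and the demo values are illustrative and not part of the original lab.

```python
import numpy as np

def compute_cost_loop(x, y, w, b):
    """Loop form of the cost: sum of squared errors divided by 2m."""
    m = x.shape[0]
    total = 0.0
    for i in range(m):
        total += (w * x[i] + b - y[i]) ** 2
    return total / (2 * m)

def compute_cost_vectorized(x, y, w, b):
    """Same cost computed on whole arrays at once."""
    errors = w * x + b - y  # prediction error for every example
    return float(np.sum(errors ** 2) / (2 * x.shape[0]))

# Made-up values for illustration only.
x_demo = np.array([1.0, 2.0, 3.0])
y_demo = np.array([2.0, 4.0, 6.0])
cost_loop = compute_cost_loop(x_demo, y_demo, 1.0, 0.0)
cost_vec = compute_cost_vectorized(x_demo, y_demo, 1.0, 0.0)
print(cost_loop, cost_vec)
```

Both forms should agree up to floating-point rounding. The loop form mirrors the formula step by step, while the array form is usually faster on larger datasets.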

Please complete the compute_gradient function to:

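Iterate over the training examples, and for each example, compute:

The prediction of the model for that example
$$f_{w,b}(x^{(i)}) = w x^{(i)} + b$$

The gradient for the parameters $w, b$ from that example
$$\frac{\partial J(w,b)}{\partial b}^{(i)} = (f_{w,b}(x^{(i)}) - y^{(i)})$$

$$\frac{\partial J(w,b)}{\partial w}^{(i)} = (f_{w,b}(x^{(i)}) - y^{(i)}) x^{(i)}$$

Return the total gradient update from all the examples
$$\frac{\partial J(w,b)}{\partial b} = \frac{1}{m} \sum_{i=0}^{m-1} \frac{\partial J(w,b)}{\partial b}^{(i)}$$

$$\frac{\partial J(w,b)}{\partial w} = \frac{1}{m} \sum_{i=0}^{m-1} \frac{\partial J(w,b)}{\partial w}^{(i)}$$

Here, $m$ is the number of training examples and $\sum$ is the summation operator.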
If you get stuck, you can check out the hints presented after the cell below to help you with the implementation.

In [None]:
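## Code for calculating the gradients
def compute_gradient(x, y, w, b):
    """Return (dj_dw, dj_db), the gradients of the cost J(w, b) with respect to w and b."""
    m = x.shape[0]  # number of training examples
    dj_dw = 0.0
    dj_db = 0.0

    for i in range(m):
        err = w * x[i] + b - y[i]  # prediction error for example i
        dj_dw += err * x[i]
        dj_db += err

    # Average the per-example gradients
    return dj_dw / m, dj_db / m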
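
One way to gain confidence in a gradient implementation is to compare it with a central finite-difference estimate of the cost. The sketch below assumes the squared-error cost defined earlier. The function names, the step size eps, and the demo values are illustrative choices, not part of the original lab.

```python
import numpy as np

def cost(x, y, w, b):
    """Squared-error cost J(w, b) = sum((w*x + b - y)^2) / (2m)."""
    return float(np.sum((w * x + b - y) ** 2) / (2 * x.shape[0]))

def analytic_gradient(x, y, w, b):
    """Gradients from the formulas: mean(err * x) for w and mean(err) for b."""
    err = w * x + b - y
    return float(np.mean(err * x)), float(np.mean(err))

def numerical_gradient(x, y, w, b, eps=1e-6):
    """Central-difference estimate of the same two partial derivatives."""
    dj_dw = (cost(x, y, w + eps, b) - cost(x, y, w - eps, b)) / (2 * eps)
    dj_db = (cost(x, y, w, b + eps) - cost(x, y, w, b - eps)) / (2 * eps)
    return dj_dw, dj_db

# Made-up values for illustration only.
x_demo = np.array([1.0, 2.0, 3.0])
y_demo = np.array([2.0, 4.0, 6.0])
grad_analytic = analytic_gradient(x_demo, y_demo, 0.5, 0.1)
grad_numeric = numerical_gradient(x_demo, y_demo, 0.5, 0.1)
print(grad_analytic, grad_numeric)
```

If the two results differ by more than a small tolerance, the usual suspects are a missing factor of x in the w gradient or a missing division by m.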

In [None]:
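## Function for optimizing the regression parameters w and b
def gradient_descent(x, y, w_in, b_in, cost_function, gradient_function, alpha, num_iters):
    """Run num_iters steps of batch gradient descent with learning rate alpha.

    Returns the final w and b, the cost history, and a sampled history of w.
    """
    J_history = []  # cost after each update
    w_history = []  # w sampled at roughly 10 points during training
    w = copy.deepcopy(w_in)  # avoid modifying the caller's w_in
    b = b_in

    for i in range(num_iters):
        # Gradients at the current parameters
        dj_dw, dj_db = gradient_function(x, y, w, b)

        # Update both parameters using gradients from the same point
        w = w - alpha * dj_dw
        b = b - alpha * dj_db

        # Record the cost; cap the history length to limit memory use
        if i < 100000:
            cost = cost_function(x, y, w, b)
            J_history.append(cost)

        # Print progress about 10 times over the run
        if i % math.ceil(num_iters / 10) == 0:
            w_history.append(w)
            print(f"Iteration {i:4}: Cost {float(J_history[-1]):8.2f}   ")

    return w, b, J_history, w_history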
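
To see the update rule at work without the real dataset, the sketch below runs a condensed version of the same loop on noise-free synthetic data generated from y = 2x + 1. The function names, the learning rate, the iteration count, and the data are all assumptions chosen for illustration. This is a stand-in for the functions above, not a replacement.

```python
import numpy as np

def cost_fn(x, y, w, b):
    """Squared-error cost J(w, b)."""
    return float(np.sum((w * x + b - y) ** 2) / (2 * x.shape[0]))

def grad_fn(x, y, w, b):
    """Gradients of J with respect to w and b."""
    err = w * x + b - y
    return float(np.mean(err * x)), float(np.mean(err))

def run_gradient_descent(x, y, w, b, alpha, num_iters):
    """Condensed batch gradient descent that records the cost after each step."""
    history = []
    for _ in range(num_iters):
        dj_dw, dj_db = grad_fn(x, y, w, b)
        # Both parameters are updated from gradients taken at the same point
        w, b = w - alpha * dj_dw, b - alpha * dj_db
        history.append(cost_fn(x, y, w, b))
    return w, b, history

# Synthetic, noise-free data for illustration only: y = 2x + 1.
x_demo = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_demo = 2.0 * x_demo + 1.0
w_fit, b_fit, cost_history = run_gradient_descent(
    x_demo, y_demo, 0.0, 0.0, alpha=0.1, num_iters=2000
)
print(f"w = {w_fit:.4f}, b = {b_fit:.4f}, final cost = {cost_history[-1]:.2e}")
```

On this data the fitted values should land close to w = 2 and b = 1. With a learning rate that is too large for the data, the cost history would grow instead of shrink, which is a useful signal to lower alpha.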