## Polynomial regression implemnentation 

In [17]:
import numpy as np 
import matplotlib.pyplot as plt 
import math  , copy


<a name='FeatureEng'></a>
# Feature Engineering and Polynomial Regression Overview

Out of the box, linear regression provides a means of building models of the form:
$$f_{\mathbf{w},b} = w_0x_0 + w_1x_1+ ... + w_{n-1}x_{n-1} + b \tag{1}$$ 
What if your features/data are non-linear or are combinations of features? For example,  Housing prices do not tend to be linear with living area but penalize very small or very large houses resulting in the curves shown in the graphic above. How can we use the machinery of linear regression to fit this curve? Recall, the 'machinery' we have is the ability to modify the parameters $\mathbf{w}$, $\mathbf{b}$ in (1) to 'fit' the equation to the training data. However, no amount of adjusting of $\mathbf{w}$,$\mathbf{b}$ in (1) will achieve a fit to a non-linear curve.

In [21]:
# let's create run run_gradient_descent_feng


def compute_cost(x , y , w, b):
    m = x.shape[0]
    cost = 0.0

    for i in range(m): 

        f_wb_i = np.dot(x[i], w )

        f_wb = (f_wb_i - y[i]) **2

        cost+= f_wb
    cost/=(2*m)

    return cost 




def compute_gradient(x , y , w , b ):
    m, n = x.shape
    dj_dw = np.zeros(n)
    dj_db = 0.0

    for i in range(m):
        f_wb_i = np.dot(x[i], w) +b 
        f_wb = f_wb_i - y[i]

        for j in range(n) :
            dj_dw[j] += x[i, j] * f_wb
        dj_db += f_wb 

    dj_dw /= m 
    dj_dw /= m 

    return  dj_dw , dj_db


# now we have cost compute function => cost value 

# we also have the gradient compute function that return 
# the value of dj_dw , dj-db 



def gradient_descent(x , y ,w_in ,b_in , compute_cost, compute_gradient , alpha , number_iterations): 


    J_history = []
    w = copy.deepcopy(w_in)
    b =b_in

    for i in range(number_iterations):

        dj_dw , dj_db = compute_gradient(x , y , w , b)

        # mise a jours des gradients descents 

        w = w - alpha*dj_dw
        b = b- alpha*dj_db

        if i % 10 ==0:

            J_history.append(compute_cost(x , y ,w , b))

            
        

    return w ,b ,J_history 





    

In [22]:
b_init = 785.1811367994083
w_init = np.array([ 0.39133535, 18.75376741, -53.36032453, -26.42131618]) 
X_train = np.array([[2104, 5, 1, 45], [1416, 3, 2, 40], [852, 2, 1, 35]])
y_train = np.array([460, 232, 178])

initial_w = np.zeros_like(w_init)
initial_b = 0.
alpha = 5.0e-7
iterations = 1000

w_final , b_final , J_his = gradient_descent(X_train, y_train,initial_w , initial_b,compute_cost, compute_gradient, alpha ,iterations)

m,_ = X_train.shape
print(f"b,w found by gradient descent: {b_final:0.2f},{w_final} ")

for i in range(m):
    print(f"prediction: {np.dot(X_train[i], w_final) + b_final:0.2f}, target value: {y_train[i]}")

b,w found by gradient descent: -0.01,[ 0.2027783   0.00156243 -0.00365863 -0.01889488] 
prediction: 425.79, target value: 460
prediction: 286.37, target value: 232
prediction: 172.10, target value: 178


In [None]:
# plot numbers 