# Linear Regression (Multiple Variables)

## Function
With multiple features, $w$ becomes a vector, written $\overrightarrow{w}$, and $x$ is also a vector, written $\overrightarrow{x}$. The bias $b$ is still a single number, so the function is<br>
$f_{\overrightarrow{w},b}(\overrightarrow{x}) = \overrightarrow{w} \cdot \overrightarrow{x} + b$<br>
This is the vectorized form; written out without vectorization it is<br>
$f_{\overrightarrow{w},b}(\overrightarrow{x}) = w_1x_1 + w_2x_2 + \dots + w_nx_n + b$<br>
The two forms agree because the dot product expands to the same sum:<br>
$\overrightarrow{w} \cdot \overrightarrow{x} = w_1x_1 + w_2x_2 + \dots + w_nx_n$
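
As a quick check of this equivalence, the sketch below computes the same sum once with an explicit loop and once with `np.dot`. The vectors `w_demo` and `x_demo` are made-up illustration values, not part of the housing dataset.

```python
import numpy as np

# Hypothetical example vectors (not from the dataset)
w_demo = np.array([0.5, -1.0, 2.0])
x_demo = np.array([2.0, 3.0, 4.0])

# Non-vectorized form: w_1*x_1 + w_2*x_2 + ... + w_n*x_n
loop_sum = 0.0
for j in range(len(w_demo)):
    loop_sum += w_demo[j] * x_demo[j]

# Vectorized form: a single dot product
vectorized = np.dot(w_demo, x_demo)

print(loop_sum, vectorized)
```

Both values should match; the vectorized form is shorter and usually faster for large $n$.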

Let's reuse the house-price example, this time with multiple features: in addition to the size of the house, we'll use the number of bedrooms, the number of bathrooms, and the age of the house.

In [1]:
import numpy as np
import pandas as pd
import copy
import matplotlib.pyplot as plt

In [2]:
# Our dataset
x_train = np.array([[1.275, 4, 1, 12], [1.674, 5, 2, 6], [2, 6, 3, 1], [0.987, 2, 1, 34], [1.275, 4, 1, 4]], dtype='float64')
y_train = np.array([452.983, 673.983, 983.992, 122.111, 555.211])
m, n = x_train.shape # m is the number of training examples and n is the number of features
w = np.random.random(n) # initial value for w
b = 100 # initial value for b

In [3]:
df = pd.DataFrame(x_train, columns=['Size (1k feet squared)', 'Number of bedrooms', 'Number of bathrooms', 'Age'])
df['Price(1k $)'] = pd.Series(y_train)
df

| | Size (1k feet squared) | Number of bedrooms | Number of bathrooms | Age | Price(1k \$) |
|---|---|---|---|---|---|
| 0 | 1.275 | 4.0 | 1.0 | 12.0 | 452.983 |
| 1 | 1.674 | 5.0 | 2.0 | 6.0 | 673.983 |
| 2 | 2.0 | 6.0 | 3.0 | 1.0 | 983.992 |
| 3 | 0.987 | 2.0 | 1.0 | 34.0 | 122.111 |
| 4 | 1.275 | 4.0 | 1.0 | 4.0 | 555.211 |


In [4]:
def predict(x, w, b):
    return np.dot(w, x) + b

Now let's try to predict the price of the first house in the dataset:

In [5]:
print(f'Value for a house with size=1.275 and numberOfBedrooms=4 and numberOfBathrooms=1 and age=12 is {predict(x_train[0], w, b):.3f}k$')
print(f'The actual price is {y_train[0]}k$')

Value for a house with size=1.275 and numberOfBedrooms=4 and numberOfBathrooms=1 and age=12 is 111.024k$
The actual price is 452.983k$


## Cost function
With vectorization, the cost function is defined as <br>
$J(\overrightarrow{w}, b) = \frac{1}{2m}\Sigma_{i=1}^{m}(f_{\overrightarrow{w},b}(\overrightarrow{x}^{(i)}) - y^{(i)})^2$

In [6]:
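def cost(x, y, w, b):
    # Squared-error cost J(w, b): sum of squared errors divided by 2m
    err_sum = 0
    m = x.shape[0]
    for i in range(m):
        f_wb = np.dot(w, x[i]) + b  # prediction for example i
        err_sum += (f_wb - y[i]) ** 2  # accumulate the squared error
    err_sum = err_sum / (2 * m)
    return err_sum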
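
The loop above can also be written without iterating over the examples. The following is a minimal sketch, assuming `x` has shape `(m, n)`, `y` has shape `(m,)` and `w` has shape `(n,)`; the name `cost_vectorized` and the small test arrays are hypothetical and not part of the notebook.

```python
import numpy as np

def cost_vectorized(x, y, w, b):
    # Same cost J(w, b) as the loop version, computed for all examples at once
    m = x.shape[0]
    errors = x @ w + b - y          # shape (m,): prediction minus target
    return np.sum(errors ** 2) / (2 * m)

# Tiny made-up example: predictions are [3, 7], targets are [1, 7]
x_small = np.array([[1.0, 2.0], [3.0, 4.0]])
y_small = np.array([1.0, 7.0])
w_small = np.array([1.0, 1.0])
print(cost_vectorized(x_small, y_small, w_small, 0.0))
```

Here the errors are $2$ and $0$, so the cost is $(2^2 + 0^2) / (2 \cdot 2) = 1$.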

Since the untrained model's prediction for the first house was far from the actual price, we expect the cost to be large at this point.


In [7]:
print(cost(x_train, y_train, w, b))

140502.6662169472


## Gradient Descent
Since we now have multiple features, we need the partial derivative of the cost with respect to each parameter, so that we can find a good value for every $w_j$ from $w_1$ to $w_n$ as well as for $b$. The update rules, applied to all parameters simultaneously, are<br>
$w_j = w_j - \alpha\frac{\partial}{\partial w_j}J(\overrightarrow{w}, b)$ and $b = b - \alpha\frac{\partial}{\partial b}J(\overrightarrow{w}, b)$<br>
Substituting the derivatives, for each $j = 1, \dots, n$ we have<br>
$w_j = w_j - \alpha\frac{1}{m}\Sigma_{i=1}^{m}(f_{\overrightarrow{w},b}(\overrightarrow{x}^{(i)}) - y^{(i)})x_{j}^{(i)}$<br>
$b = b - \alpha\frac{1}{m}\Sigma_{i=1}^{m}(f_{\overrightarrow{w},b}(\overrightarrow{x}^{(i)})-y^{(i)})$
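
For reference, the summation form follows from the chain rule applied to the cost function. Because $f_{\overrightarrow{w},b}(\overrightarrow{x}^{(i)}) = w_1x_1^{(i)} + \dots + w_nx_n^{(i)} + b$, its derivative with respect to $w_j$ is $x_j^{(i)}$ and its derivative with respect to $b$ is $1$, so<br>
$\frac{\partial}{\partial w_j}J(\overrightarrow{w}, b) = \frac{1}{2m}\Sigma_{i=1}^{m}2(f_{\overrightarrow{w},b}(\overrightarrow{x}^{(i)}) - y^{(i)})x_{j}^{(i)} = \frac{1}{m}\Sigma_{i=1}^{m}(f_{\overrightarrow{w},b}(\overrightarrow{x}^{(i)}) - y^{(i)})x_{j}^{(i)}$<br>
$\frac{\partial}{\partial b}J(\overrightarrow{w}, b) = \frac{1}{2m}\Sigma_{i=1}^{m}2(f_{\overrightarrow{w},b}(\overrightarrow{x}^{(i)}) - y^{(i)}) = \frac{1}{m}\Sigma_{i=1}^{m}(f_{\overrightarrow{w},b}(\overrightarrow{x}^{(i)}) - y^{(i)})$<br>
The factor $2$ from the square cancels the $\frac{1}{2}$ in the cost, which is why the cost is defined with $\frac{1}{2m}$.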

In [8]:
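def gradient(x, y, w, b):
    # Returns the partial derivatives of the cost with respect to w and b
    m, n = x.shape
    w_t = np.zeros((n,))  # accumulates dJ/dw_j for each feature j
    b_t = 0               # accumulates dJ/db
    for i in range(m):
        err = np.dot(w, x[i]) + b - y[i]  # prediction error for example i
        for j in range(n):
            w_t[j] += err * x[i, j]
        b_t += err
    w_t = w_t / m
    b_t = b_t / m
    return w_t, b_t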
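
The double loop can also be replaced by matrix operations. The following is a minimal sketch, assuming `x` has shape `(m, n)`; the name `gradient_vectorized` and the small test arrays are hypothetical and not part of the notebook.

```python
import numpy as np

def gradient_vectorized(x, y, w, b):
    # Same gradients as the loop version, computed for all examples at once
    m = x.shape[0]
    errors = x @ w + b - y       # shape (m,): prediction minus target
    dj_dw = x.T @ errors / m     # shape (n,): one partial derivative per feature
    dj_db = np.sum(errors) / m   # scalar: partial derivative for the bias
    return dj_dw, dj_db

# Tiny made-up example: predictions are [3, 7], targets are [1, 7]
x_small = np.array([[1.0, 2.0], [3.0, 4.0]])
y_small = np.array([1.0, 7.0])
w_small = np.array([1.0, 1.0])
dj_dw, dj_db = gradient_vectorized(x_small, y_small, w_small, 0.0)
print(dj_dw, dj_db)
```

With errors of $2$ and $0$, only the first example contributes, giving gradients of $(1, 2)$ for $\overrightarrow{w}$ and $1$ for $b$.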

In [9]:
def gradient_descent(x, y, init_w, init_b, alpha, cost_function, gradient_function, iterations=1000):
    w = copy.deepcopy(init_w)  # copy so the caller's initial w is not modified
    b = init_b
    for i in range(1, iterations + 1):
        # Compute both gradients first, then update w and b together
        w_t, b_t = gradient_function(x, y, w, b)
        w = w - (alpha * w_t)
        b = b - (alpha * b_t)
        if i % 10 == 0:  # report progress every 10 iterations
            print(f'w={w}, b={b}, cost={cost_function(x, y, w, b)}')
    return w, b

In [10]:
w, b = gradient_descent(x_train, y_train, w, b, 0.007115, cost, gradient)

w=[ 25.08072794  78.19449514  32.86088915 -11.08611512], b=113.67747807350128, cost=13590.326253446872
w=[ 30.87923146  95.81361785  41.13841122 -12.62416192], b=116.66856456739684, cost=4537.680956026259
w=[ 32.28256838  99.66516768  43.92001485 -12.13007739], b=117.19680856594368, cost=2751.2991536387462
w=[ 32.66515437 100.33103416  45.41641241 -11.38687277], b=117.15350682426963, cost=1928.3765898101894
w=[ 32.80463277 100.24814449  46.59742318 -10.74783208], b=116.97529101384534, cost=1476.0587899621935
w=[ 32.88170763  99.98150194  47.68804199 -10.25245677], b=116.76418443455188, cost=1219.5887861557662
w=[32.93947528 99.66490088 48.74170066 -9.87894146], b=116.54491756999211, cost=1070.7345894811958
w=[32.98902527 99.33206871 49.7720259  -9.59937015], b=116.32412628307864, cost=981.3311199704285
w=[33.0337917  98.99296753 50.78319614 -9.39033396], b=116.10403325324103, cost=924.8748339213407
w=[33.07522329 98.65151637 51.77699034 -9.23384408], b=115.88565602689556, cost=886.7680

Let's now predict the price of the same house as before:

In [11]:
print(f'Value for a house with size=1.275 and numberOfBedrooms=4 and numberOfBathrooms=1 and age=12 is {predict(x_train[0], w, b):.3f}k$')
print(f'The actual price is {y_train[0]}k$')

Value for a house with size=1.275 and numberOfBedrooms=4 and numberOfBathrooms=1 and age=12 is 470.329k$
The actual price is 452.983k$


In [12]:
predicted_values = []
for i in range(m):
    predicted_values.append(predict(x_train[i], w, b))
df['Prediction (1k $)'] = pd.Series(predicted_values)
df

| | Size (1k feet squared) | Number of bedrooms | Number of bathrooms | Age | Price(1k \$) | Prediction (1k \$) |
|---|---|---|---|---|---|---|
| 0 | 1.275 | 4.0 | 1.0 | 12.0 | 452.983 | 470.329057 |
| 1 | 1.674 | 5.0 | 2.0 | 6.0 | 673.983 | 716.91238 |
| 2 | 2.0 | 6.0 | 3.0 | 1.0 | 983.992 | 952.378824 |
| 3 | 0.987 | 2.0 | 1.0 | 34.0 | 122.111 | 111.259518 |
| 4 | 1.275 | 4.0 | 1.0 | 4.0 | 555.211 | 538.907668 |


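The predictions are still not very close to the actual prices, but the result is promising: for the first house, the error dropped from about 342 thousand dollars before training to about 17 thousand dollars after it.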
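
To put a number on how close these predictions are, one option is the mean absolute error over the five houses. The sketch below copies the actual and predicted prices from the table above; choosing mean absolute error as the summary metric is an assumption made here for illustration, not something the notebook itself uses.

```python
import numpy as np

# Actual and predicted prices (1k $), copied from the table above
actual = np.array([452.983, 673.983, 983.992, 122.111, 555.211])
predicted = np.array([470.329057, 716.91238, 952.378824, 111.259518, 538.907668])

abs_errors = np.abs(predicted - actual)   # error per house, in 1k $
mae = abs_errors.mean()                   # mean absolute error
pct_errors = 100 * abs_errors / actual    # error relative to the actual price

print(f'Mean absolute error: {mae:.3f}k$')
print(f'Absolute percentage errors: {np.round(pct_errors, 1)}')
```

Unlike the cost $J$, this metric is in the same unit as the prices, which makes it easier to interpret.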