# Linear Regression (Multiple Variables)

## Function
With multiple features, $w$ becomes a vector, written $\overrightarrow{w}$, and $x$ is also a vector, written $\overrightarrow{x}$. The bias $b$ is still a single number, and the function is written as<br>
$f_{\overrightarrow{w},b}(\overrightarrow{x}) = \overrightarrow{w} \cdot \overrightarrow{x} + b$<br>
This is the vectorized form; without vectorization it reads<br>
$f_{\overrightarrow{w},b}(\overrightarrow{x}) = w_1x_1 + w_2x_2 + \dots + w_nx_n + b$<br>
The two forms agree because the dot product expands to the same sum:<br>
$\overrightarrow{w} \cdot \overrightarrow{x} = w_1x_1 + w_2x_2 + \dots + w_nx_n$
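
As a quick sanity check of that identity, the sketch below compares an explicit element-by-element sum with `np.dot`. The vectors `w_demo` and `x_demo` are hypothetical values chosen only for illustration; they are not part of the housing dataset.

```python
import numpy as np

# Hypothetical values, used only to illustrate the identity
w_demo = np.array([0.5, -1.0, 2.0])
x_demo = np.array([4.0, 3.0, 1.0])

# Non-vectorized form: w_1*x_1 + w_2*x_2 + ... + w_n*x_n
loop_sum = 0.0
for j in range(len(w_demo)):
    loop_sum += w_demo[j] * x_demo[j]

# Vectorized form: a single dot product
dot = np.dot(w_demo, x_demo)

print(loop_sum, dot)
```

Both numbers should match; the vectorized form is usually preferred because NumPy performs the loop in compiled code.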

Let's reuse the house-price example, this time with multiple features: in addition to the size, we use the number of bedrooms, the number of bathrooms, and the age of the house.

In [1]:
import numpy as np
import pandas as pd
import copy
import matplotlib.pyplot as plt

In [2]:
# Our dataset: each row is [size (1k sq ft), bedrooms, bathrooms, age]
x_train = np.array([[1.275, 4, 1, 12], [1.674, 5, 2, 6], [2.000, 6, 3, 1], [0.987, 2, 1, 34], [1.275, 4, 1, 4]], dtype='float64')
# Prices in 1k $
y_train = np.array([452.983, 673.983, 983.992, 122.111, 555.211], dtype='float64')
m, n = x_train.shape # m is the number of training examples and n is the number of features
w = np.random.random(n) # random initial value for w (differs on every run)
b = 100 # initial value for b

In [3]:
df = pd.DataFrame(x_train, columns=['Size (1k feet squared)', 'Number of bedrooms', 'Number of bathrooms', 'Age'])
df['Price(1k $)'] = pd.Series(y_train)
df

Unnamed: 0,Size (1k feet squared),Number of bedrooms,Number of bathrooms,Age,Price(1k $)
0,1.275,4.0,1.0,12.0,452.983
1,1.674,5.0,2.0,6.0,673.983
2,2.0,6.0,3.0,1.0,983.992
3,0.987,2.0,1.0,34.0,122.111
4,1.275,4.0,1.0,4.0,555.211


In [4]:
def predict(x, w, b):
    return np.dot(w, x) + b

Now let's predict the price of the first house in the dataset with the initial (untrained) parameters:

In [5]:
print(f'Value for a house with size=1.275 and numberOfBedrooms=4 and numberOfBathrooms=1 and age=12 is {predict(x_train[0], w, b):.3f}k$')
print(f'The actual price is {y_train[0]}k$')

Value for a house with size=1.275 and numberOfBedrooms=4 and numberOfBathrooms=1 and age=12 is 111.141k$
The actual price is 452.983k$


## Cost function
With vectorization, the cost function is defined as <br>
$J(\overrightarrow{w}, b) = \frac{1}{2m}\sum_{i=1}^{m}(f_{\overrightarrow{w},b}(\overrightarrow{x}^{(i)}) - y^{(i)})^2$

In [6]:
def cost(x, y, w, b):
    err_sum = 0
    m = x.shape[0]
    for i in range(m):
        f_wb = np.dot(w, x[i]) + b
        err_sum += (f_wb - y[i]) ** 2
    err_sum = err_sum / (2 * m)
    return err_sum
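
The loop in `cost` can also be written without an explicit Python loop. The following is a minimal sketch, assuming `x` has shape `(m, n)`, `w` has shape `(n,)`, and `y` has shape `(m,)`; the name `cost_vectorized` and the small demo arrays are hypothetical and only serve to check the arithmetic.

```python
import numpy as np

def cost_vectorized(x, y, w, b):
    """Squared-error cost J(w, b), computed for all examples at once."""
    m = x.shape[0]
    errors = x @ w + b - y  # prediction minus target for every example, shape (m,)
    return np.sum(errors ** 2) / (2 * m)

# Hypothetical toy data, small enough to verify by hand
x_demo = np.array([[1.0, 2.0], [3.0, 4.0]])
y_demo = np.array([5.0, 6.0])
w_demo = np.array([1.0, 1.0])
b_demo = 0.5

# Predictions are 3.5 and 7.5, so the errors are -1.5 and 1.5
print(cost_vectorized(x_demo, y_demo, w_demo, b_demo))
```

For the same inputs it should return the same value as the loop-based `cost`, up to floating-point rounding.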

Since the prediction for the first house was far from its actual price, we expect the cost to be large if we evaluate it now.


In [7]:
print(cost(x_train, y_train, w, b))

140870.5556297853


## Gradient Descent
With multiple features, we need the partial derivative of the cost with respect to each parameter, so that we can find a good value for every $w_j$ from $w_1$ to $w_n$, as well as for $b$. The update rules are<br>
$w_j = w_j - \alpha\frac{\partial}{\partial w_j}J(\overrightarrow{w}, b)$ and $b = b - \alpha\frac{\partial}{\partial b}J(\overrightarrow{w}, b)$<br>
Expanding the derivatives, for each $w_j$ we have<br>
$w_j = w_j - \alpha\frac{1}{m}\sum_{i=1}^{m}(f_{\overrightarrow{w},b}(\overrightarrow{x}^{(i)}) - y^{(i)})x_{j}^{(i)}$<br>
$b = b - \alpha\frac{1}{m}\sum_{i=1}^{m}(f_{\overrightarrow{w},b}(\overrightarrow{x}^{(i)})-y^{(i)})$

In [8]:
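def gradient(x, y, w, b):
    m, n = x.shape
    w_t = np.zeros((n,))  # accumulates dJ/dw_j for each feature j
    b_t = 0  # accumulates dJ/db
    for i in range(m):
        # prediction error for example i
        err = np.dot(w, x[i]) + b - y[i]
        for j in range(n):
            w_t[j] += err * x[i, j]
        b_t += err
    # average over the m training examples
    w_t = w_t / m
    b_t = b_t / m
    return w_t, b_t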
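
The two nested loops in `gradient` can be collapsed into matrix operations. This is a sketch under the same shape assumptions as before (`x` is `(m, n)`, `w` is `(n,)`, `y` is `(m,)`); `gradient_vectorized` and the demo arrays are hypothetical names introduced here for illustration.

```python
import numpy as np

def gradient_vectorized(x, y, w, b):
    """Partial derivatives of the squared-error cost with respect to w and b."""
    m = x.shape[0]
    errors = x @ w + b - y      # shape (m,)
    dj_dw = x.T @ errors / m    # shape (n,), one entry per feature
    dj_db = np.sum(errors) / m  # scalar
    return dj_dw, dj_db

# Hypothetical toy data, small enough to verify by hand
x_demo = np.array([[1.0, 2.0], [3.0, 4.0]])
y_demo = np.array([5.0, 6.0])
w_demo = np.array([1.0, 1.0])
b_demo = 0.5

dj_dw, dj_db = gradient_vectorized(x_demo, y_demo, w_demo, b_demo)
print(dj_dw, dj_db)
```

Because it takes the same arguments and returns the same pair as `gradient`, it could be passed to `gradient_descent` as the `gradient_function` argument.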

In [9]:
def gradient_descent(x, y, init_w, init_b, alpha, cost_function, gradient_function, iterations=1000):
    w = copy.deepcopy(init_w)
    b = init_b
    for i in range(1, iterations + 1):
        w_t, b_t = gradient_function(x, y, w, b)
        w = w - (alpha * w_t)
        b = b - (alpha * b_t)
        if i % 10 == 0:
            print(f'w={w}, b={b}, cost={cost_function(x, y, w, b)}')
    return w, b

In [10]:
w, b = gradient_descent(x_train, y_train, w, b, 0.00714, cost, gradient, 1000)

w=[ 25.01176683  78.30780322  32.70608604 -11.53864819], b=113.70269635023277, cost=14360.076743202566
w=[ 30.78498049  95.85211343  40.96149938 -13.32949009], b=116.67403627377844, cost=5538.18171359211
w=[ 32.17246894  99.6544119   43.73079928 -12.94668817], b=117.18998258953428, cost=3658.0392223530444
w=[ 32.54994317 100.30215862  45.22698969 -12.22769576], b=117.14178220949354, cost=2652.1784433688335
w=[ 32.68981515 100.21689811  46.41382129 -11.5594951 ], b=116.96260807287705, cost=2017.3209598038259
w=[ 32.76978771  99.9549093   47.51285358 -11.0045217 ], b=116.75244693985616, cost=1608.535575832861
w=[ 32.83151043  99.64590258  48.57591823 -10.55638214], b=116.53498432555247, cost=1342.9243517513455
w=[ 32.88535215  99.32163577  49.61587529 -10.19716637], b=116.31630157004935, cost=1168.3401096874861
w=[32.93436617 98.99105559 50.63651088 -9.90967934], b=116.0983145859541, cost=1051.7041244343393
w=[32.97979707 98.65754452 51.63939848 -9.67955364], b=115.88187194624996, cost=9

Now let's predict the price of the first house again, this time with the trained parameters:

In [11]:
print(f'Value for a house with size=1.275 and numberOfBedrooms=4 and numberOfBathrooms=1 and age=12 is {predict(x_train[0], w, b):.3f}k$')
print(f'The actual price is {y_train[0]}k$')

Value for a house with size=1.275 and numberOfBedrooms=4 and numberOfBathrooms=1 and age=12 is 470.303k$
The actual price is 452.983k$


In [12]:
# Predict the price of every house in the training set
predicted_values = [predict(x_train[i], w, b) for i in range(m)]
df['Prediction (1k $)'] = pd.Series(predicted_values)
df

Unnamed: 0,Size (1k feet squared),Number of bedrooms,Number of bathrooms,Age,Price(1k $),Prediction (1k $)
0,1.275,4.0,1.0,12.0,452.983,470.302643
1,1.674,5.0,2.0,6.0,673.983,716.912757
2,2.0,6.0,3.0,1.0,983.992,952.411766
3,0.987,2.0,1.0,34.0,122.111,111.271834
4,1.275,4.0,1.0,4.0,555.211,538.873502


The predictions are not exact, but they are promising: on these five training examples they are off by roughly 11k$ to 43k$, which is about 3% to 9% of the actual price.
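
To put a number on how close the predictions are, the sketch below computes the absolute and relative error for each house. The `actual` and `predicted` arrays are copied from the table above; the variable names are introduced here for illustration.

```python
import numpy as np

# Values copied from the "Price(1k $)" and "Prediction (1k $)" columns above
actual = np.array([452.983, 673.983, 983.992, 122.111, 555.211])
predicted = np.array([470.302643, 716.912757, 952.411766, 111.271834, 538.873502])

abs_err = np.abs(predicted - actual)  # error in 1k $
rel_err = abs_err / actual * 100      # error as a percentage of the actual price

for i in range(len(actual)):
    print(f'house {i}: off by {abs_err[i]:.1f}k$ ({rel_err[i]:.1f}%)')
print(f'mean absolute error: {abs_err.mean():.1f}k$')
```

These are errors on the training data itself, so they say nothing yet about how the model would do on houses it has not seen.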