# HOMEWORK 2 - BEGINNERS

##### TEAM F
Chouliarias Andreas 2143

Matzoros Christos-Konstantinos 2169

Pappas Apostolos 2109


## Exercise 1: Simple Linear Regression

The predict_sales and cost_function functions were given and they are implemented as:

In [21]:
# imports
import pandas as pd

def predict_sales(radio, weight, bias):
    return weight*radio + bias

def cost_function(radio, sales, weight, bias):
    companies = len(radio)
    total_error = 0.0
    
    for i in range(companies):
        total_error += (sales[i] - (weight*radio[i] + bias))**2
    return total_error / companies
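
As a quick sanity check, the loop in cost_function can be compared against a NumPy one-liner. This is only a sketch: `cost_function_vectorized` and the toy arrays are our own illustrative names and values, not part of the given code or of the dataset.

```python
import numpy as np

def cost_function_loop(radio, sales, weight, bias):
    # Same loop as the given cost_function: mean of squared residuals
    companies = len(radio)
    total_error = 0.0
    for i in range(companies):
        total_error += (sales[i] - (weight*radio[i] + bias))**2
    return total_error / companies

def cost_function_vectorized(radio, sales, weight, bias):
    # Hypothetical helper: the same mean squared error without an explicit loop
    residuals = sales - (weight * radio + bias)
    return np.mean(residuals ** 2)

# Toy data (illustrative only, not taken from Advertising.csv)
toy_radio = np.array([1.0, 2.0, 3.0])
toy_sales = np.array([2.0, 4.0, 7.0])

loop_cost = cost_function_loop(toy_radio, toy_sales, 2.0, 0.0)
vec_cost = cost_function_vectorized(toy_radio, toy_sales, 2.0, 0.0)
print(loop_cost, vec_cost)
```

With weight 2 and bias 0 the residuals are 0, 0 and 1, so both versions should return 1/3.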

The function update_weights was given and is based on the simple gradient descent algorithm.

In [22]:
def update_weights(radio, sales, weight, bias, learning_rate):
    weight_deriv = 0
    bias_deriv = 0
    companies = len(radio)

    for i in range(companies):
        # Calculate partial derivatives
        # -2x(y - (mx + b))
        weight_deriv += -2*radio[i] * (sales[i] - (weight*radio[i] + bias))

        # -2(y - (mx + b))
        bias_deriv += -2*(sales[i] - (weight*radio[i] + bias))

    # We subtract because the derivatives point in direction of steepest ascent
    weight -= (weight_deriv / companies) * learning_rate
    bias -= (bias_deriv / companies) * learning_rate

    return weight, bias
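
As a brief sketch of where the two derivative expressions in the code comments come from, we can differentiate the cost function directly. Here $m$ stands for weight, $b$ for bias, $x_i$ for radio[i], $y_i$ for sales[i] and $N$ for the number of companies; $\alpha$ is our own shorthand for learning_rate.

$$\frac{\partial MSE}{\partial m} = \frac{1}{N} \sum_{i=1}^{N} -2x_i (y_i - (m x_i + b))$$

$$\frac{\partial MSE}{\partial b} = \frac{1}{N} \sum_{i=1}^{N} -2 (y_i - (m x_i + b))$$

Each call to update_weights then applies one step against the gradient:

$$m \leftarrow m - \alpha \frac{\partial MSE}{\partial m}, \qquad b \leftarrow b - \alpha \frac{\partial MSE}{\partial b}$$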

The train function was modified to produce the requested output format.

In [23]:
def train(radio, sales, weight, bias, learning_rate, iters):
    cost_history = []

    for i in range(iters):
        weight,bias = update_weights(radio, sales, weight, bias, learning_rate)

        #Calculate cost for auditing purposes
        cost = cost_function(radio, sales, weight, bias)
        cost_history.append(cost)
        
        # Log Progress
        if (i % 10 == 0 or i==1):
            print ("iter: %-2i" %i + " weight: %.2f" %weight + " bias: %.4f" %bias + " cost: %.2f" %cost)
    return weight, bias, cost_history

This is the part where we construct our model.<br>
We import the dataset,<br>
We split the data we need into separate variables,<br>
We initialize the weights and bias,<br>
We set the learning rate and the number of iterations for the train function.<br>

In [24]:
df = pd.read_csv('Advertising.csv')
#print(df.shape)

companies = df['company'].values
radio = df['radio'].values
sales = df['sales'].values
weight = 0.0
bias   = 0.0001

learning_rate=0.000048
iters = 41

train(radio, sales, weight, bias, learning_rate, iters);


iter: 0  weight: 0.04 bias: 0.0014 cost: 198.27
iter: 1  weight: 0.07 bias: 0.0027 cost: 176.40
iter: 10 weight: 0.28 bias: 0.0114 cost: 77.03
iter: 20 weight: 0.39 bias: 0.0174 cost: 50.45
iter: 30 weight: 0.44 bias: 0.0216 cost: 44.61
iter: 40 weight: 0.46 bias: 0.0249 cost: 43.32


Based on the output above, after 41 iterations the fitted linear relationship between sales and radio is:

$$Sales = 0.46*Radio + 0.0249$$
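
To illustrate how the fitted line can be used, the sketch below plugs the reported weight and bias into predict_sales. The radio value of 20.0 is a hypothetical input chosen for illustration, not a row from the dataset.

```python
def predict_sales(radio, weight, bias):
    # Same function as given in the exercise
    return weight*radio + bias

# Parameters reported at iteration 40 of the training output
weight, bias = 0.46, 0.0249

# Hypothetical radio budget, chosen only for illustration
example_radio = 20.0
predicted = predict_sales(example_radio, weight, bias)
print("predicted sales: %.4f" % predicted)
```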

## Exercise 2: Multivariable regression

The following code defines the normalization function. We normalize each feature using the formula:

$$ \frac{x - \mu}{x_{max} - x_{min}}$$ where $\mu$ is the mean of the feature and $x_{max} - x_{min}$ is its range.

In [25]:
# imports
import pandas as pd
import numpy as np

def normalize(features):
    '''
    features     -   (200, 3)
    features.T   -   (3, 200)

    We transpose the input matrix, swapping
    cols and rows to make vector math easier
    '''

    for feature in features.T:
        fmean = np.mean(feature)
        frange = np.amax(feature) - np.amin(feature)

        #Vector Subtraction
        feature -= fmean

        #Vector Division
        feature /= frange

    return features
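
A small example may help to see what normalize does. The sketch below repeats the same logic on a toy matrix of our own (not the Advertising data) and assumes a float array, since the subtraction and division happen in place on the input.

```python
import numpy as np

def normalize(features):
    # Same logic as above: centre each column on its mean, divide by its range
    for feature in features.T:
        fmean = np.mean(feature)
        frange = np.amax(feature) - np.amin(feature)
        feature -= fmean
        feature /= frange
    return features

# Toy matrix with 3 rows and 2 feature columns (illustrative values only)
toy = np.array([[0.0, 10.0],
                [5.0, 20.0],
                [10.0, 60.0]])

normalized = normalize(toy)
print(normalized)
print(normalized is toy)
```

After normalization every column has mean 0 and range 1. Because each row of features.T is a view, the input array itself is modified, which is why the function returns the same object it received.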

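Next, we present the predict and cost functions. Our cost function is the Mean Squared Error, halved so that the factor of 2 cancels when we differentiate (this matches the 1/(2N) factor in the code):

$$Cost = \frac{1}{2N} \sum_{i=1}^{N} (y_i - (W_1 TV_i + W_2 Radio_i + W_3 Newspaper_i + B))^2$$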


In [26]:
def predict(features, weights):
  '''
  features - (200, 4)
  weights - (4, 1)
  predictions - (200,1)
  '''
  return np.dot(features,weights)

def cost_function(features, targets, weights):
    
    #Features:(200,4)
    #Targets: (200,1)
    #Weights:(4,1)
    #Returns the cost as a single scalar
    
    N = len(targets)

    predictions = predict(features, weights)

    # Matrix math lets us do this without looping
    sq_error = (predictions - targets)**2

    # Return half the average squared error among predictions
    return 1.0/(2*N) * sq_error.sum()

The next step is to update our weights using the vectorized weight update function:

In [27]:
def update_weights_vectorized(X, targets, weights, lr):
    '''
    gradient = X.T * (predictions - targets) / N
    X: (200, 4)
    Targets: (200, 1)
    Weights: (4, 1)
    '''
    companies = len(X)

    #1 - Get Predictions
    predictions = predict(X, weights)
    #2 - Calculate error/loss
    error = targets - predictions
    #3 Transpose features from (200, 4) to (4, 200)
    # So we can multiply w the (200,1)  error matrix.
    # Returns a (4,1) matrix holding 4 partial derivatives --
    # one for each weight (bias included) -- representing the
    # aggregate slope of the cost function across all observations
    gradient = np.dot(-X.T,  error)

    #4 Take the average error derivative for each feature
    gradient /= companies

    #5 - Multiply the gradient by our learning rate
    gradient *= lr
    
    #6 - Subtract from our weights to minimize cost
    weights -= gradient

    return weights
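
One way to gain confidence in the vectorized gradient is to compare it with a finite-difference estimate of the cost function. The sketch below does this on random toy data; `analytic_gradient` and `numerical_gradient` are hypothetical helper names of our own, and the data is generated only for the check.

```python
import numpy as np

def cost_function(features, targets, weights):
    # Same cost as above: half the mean squared error
    N = len(targets)
    predictions = np.dot(features, weights)
    return 1.0/(2*N) * ((predictions - targets)**2).sum()

def analytic_gradient(X, targets, weights):
    # The gradient computed in update_weights_vectorized, before scaling by lr
    error = targets - np.dot(X, weights)
    return np.dot(-X.T, error) / len(X)

def numerical_gradient(X, targets, weights, eps=1e-6):
    # Central finite differences, one weight at a time
    grad = np.zeros_like(weights)
    for j in range(weights.shape[0]):
        step = np.zeros_like(weights)
        step[j, 0] = eps
        grad[j, 0] = (cost_function(X, targets, weights + step)
                      - cost_function(X, targets, weights - step)) / (2*eps)
    return grad

# Random toy problem: a bias column of ones plus 3 features
rng = np.random.default_rng(0)
X = np.append(np.ones((20, 1)), rng.normal(size=(20, 3)), axis=1)
targets = rng.normal(size=(20, 1))
weights = rng.normal(size=(4, 1))

max_diff = np.abs(analytic_gradient(X, targets, weights)
                  - numerical_gradient(X, targets, weights)).max()
print(max_diff)
```

The two gradients should agree up to rounding error. This also shows why the cost carries a factor of 1/(2N): the 2 from differentiating the square cancels, leaving the gradient as X.T times (predictions - targets) divided by N.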

We implemented the training function based on the previous training function, modifying it to match our problem.<br>
We present the updated parameters every 1000 iterations:

In [28]:
def train(features, targets, weights, lr, iters):
    cost_history = []

    for i in range(iters+1):
        weights = update_weights_vectorized(features, targets, weights, lr)

        #Calculate cost for auditing purposes
        cost = cost_function(features, targets, weights)
        cost_history.append(cost)
        
        # Log Progress
        if (i % 1000 == 0 or i==1):
            # Index with [row, 0] so each weight is a scalar when formatted
            print ("iter: %-5i"         %i +
                   " weights: TV: %.2f" %weights[1, 0] +
                   " radio: %.2f"       %weights[2, 0] +
                   " newspaper: %.2f"   %weights[3, 0] +
                   " bias: %.4f"        %weights[0, 0] +
                   " cost: %.2f"        %cost)
            
    # The bias is stored in weights[0], so there is no separate bias to return
    return weights, cost_history


This is the part where we construct our model.<br>
We import the dataset,<br>
We split the data we need into separate variables,<br>
We initialize the weights and bias,<br>
We set the learning rate and the number of iterations for the train function.<br>

In [29]:
df = pd.read_csv('Advertising.csv')
#print(df.shape)

features = df[['TV', 'radio' , 'newspaper']]
features = normalize(features.values)
targets = df['sales'].values.reshape(-1, 1) 

B  = 0.0
W1 = 0.0
W2 = 0.0
W3 = 0.0
weights = np.array([
    [B ],   
    [W1],
    [W2],
    [W3]
])

bias = np.ones(shape=(len(features),1))
features = np.append(bias, features, axis=1)

lr = 0.0005
iters = 10000

train(features, targets, weights, lr, iters);

iter: 0     weights: TV: 0.00 radio: 0.00 newspaper: 0.00 bias: 0.0070 cost: 111.76
iter: 1     weights: TV: 0.00 radio: 0.00 newspaper: 0.00 bias: 0.0140 cost: 111.66
iter: 1000  weights: TV: 0.58 radio: 0.44 newspaper: 0.11 bias: 5.5227 cost: 48.59
iter: 2000  weights: TV: 1.13 radio: 0.85 newspaper: 0.21 bias: 8.8678 cost: 24.78
iter: 3000  weights: TV: 1.66 radio: 1.25 newspaper: 0.31 bias: 10.8964 cost: 15.49
iter: 4000  weights: TV: 2.16 radio: 1.62 newspaper: 0.40 bias: 12.1267 cost: 11.60
iter: 5000  weights: TV: 2.65 radio: 1.98 newspaper: 0.48 bias: 12.8728 cost: 9.72
iter: 6000  weights: TV: 3.11 radio: 2.32 newspaper: 0.56 bias: 13.3252 cost: 8.63
iter: 7000  weights: TV: 3.55 radio: 2.64 newspaper: 0.63 bias: 13.5996 cost: 7.87
iter: 8000  weights: TV: 3.98 radio: 2.95 newspaper: 0.70 bias: 13.7661 cost: 7.25
iter: 9000  weights: TV: 4.38 radio: 3.24 newspaper: 0.76 bias: 13.8670 cost: 6.72
iter: 10000 weights: TV: 4.77 radio: 3.52 newspaper: 0.82 bias: 13.9282 cost: 6.25


After training our model for 10000 iterations with a learning rate of
0.0005, we arrive at a set of weights we can use to make
predictions on the normalized features:

$$Sales = 4.77TV + 3.52Radio + 0.82Newspaper + 13.9282$$

Our cost (half the MSE, as computed by cost_function) dropped from 111.76 to 6.25.
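
Because the model was trained on normalized features, a new input has to be normalized with the training mean and range of each feature before these weights are applied. The sketch below illustrates the idea. `predict_from_raw` is a hypothetical helper of our own, and `train_means` and `train_ranges` hold placeholder numbers, not the actual statistics of Advertising.csv.

```python
import numpy as np

# Weights reported after training, ordered as [bias, TV, radio, newspaper]
weights = np.array([[13.9282], [4.77], [3.52], [0.82]])

# HYPOTHETICAL placeholder statistics, NOT the real Advertising.csv values.
# In practice they would be computed from the training features.
train_means = np.array([100.0, 20.0, 30.0])
train_ranges = np.array([200.0, 40.0, 60.0])

def predict_from_raw(raw_budgets, train_means, train_ranges, weights):
    # Normalize exactly as during training, then prepend the constant bias input
    normalized = (np.asarray(raw_budgets, dtype=float) - train_means) / train_ranges
    row = np.append(1.0, normalized).reshape(1, -1)
    return float(np.dot(row, weights)[0, 0])

# A company spending exactly the mean budget in every channel
at_mean = predict_from_raw(train_means, train_means, train_ranges, weights)
print("prediction at the mean budgets: %.4f" % at_mean)
```

A company that spends exactly the mean budget in every channel has all normalized features equal to 0, so its predicted sales equal the bias term.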
