# Machine Learning Lab 2 Exercise 2a

## Cost (Loss) Function

You have learnt some cost functions commonly used in machine learning. In short, a cost function measures how wrong a model is at estimating the relationship between X (input) and y (output).

In the previous class, we discussed:
+ Mean Square Error
+ Cross Entropy

We are going to implement them using Python and NumPy.

First, begin the code by importing numpy.

In [None]:
# Execute this cell to import numpy
import numpy as np

## 1. Mean Square Error

Recall that the mean square error is

\begin{equation}
MSE = \frac{1}{N} \sum_{i=1}^{N} (\hat{Y}_{i} - Y_{i})^{2}
\end{equation}
where $\hat{Y}_{i}$ is the i-th prediction of the model, $Y_{i}$ is the i-th ground truth (sample output), and $N$ is the number of samples.

Sometimes, it is useful to further divide the result by 2, as in the following equation:

\begin{equation}
J(\theta) = \frac{1}{2m}\sum_{i=1}^m(h_\theta(x^{(i)})-y^{(i)})^2
\end{equation}
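where $\theta$ denotes the model parameters, $m$ is the number of samples, and $h_\theta(x^{(i)})$ is the model's prediction for the i-th sample.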
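
A common motivation for the factor of $\frac{1}{2}$ is that it cancels the 2 produced by differentiating the square, which keeps the gradient expression tidy. As a short worked step, assuming $h_\theta$ is differentiable with respect to each parameter $\theta_j$:

\begin{equation}
\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{2m}\sum_{i=1}^m 2\,(h_\theta(x^{(i)})-y^{(i)})\frac{\partial h_\theta(x^{(i)})}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^m (h_\theta(x^{(i)})-y^{(i)})\frac{\partial h_\theta(x^{(i)})}{\partial \theta_j}
\end{equation}

Scaling the cost by a positive constant does not change which $\theta$ minimizes it, so the extra division is harmless.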

Now complete the `mean_square_loss` function in the exercise cell below. Use a vectorized implementation, that is, NumPy array operations instead of an explicit Python loop.
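
As a minimal sketch of what "vectorized" means, the example below computes a different quantity, the mean absolute error, first with an explicit loop and then with NumPy array operations. The function names are hypothetical and used only for illustration; the same pattern carries over to the exercise.

```python
import numpy as np

def mean_absolute_error_loop(Y, Y_hat):
    # Hypothetical helper: visits one sample at a time in Python
    total = 0.0
    for i in range(Y.shape[0]):
        total += abs(Y_hat[i] - Y[i])
    return total / Y.shape[0]

def mean_absolute_error_vectorized(Y, Y_hat):
    # Hypothetical helper: subtraction, abs and mean each act on the whole array at once
    return np.mean(np.abs(Y_hat - Y))

Y = np.array([1.0, 2.0, 3.0])
Y_hat = np.array([1.5, 2.0, 2.0])
print(mean_absolute_error_loop(Y, Y_hat))        # 0.5
print(mean_absolute_error_vectorized(Y, Y_hat))  # 0.5
```

Both versions return the same value; the vectorized one simply expresses the whole computation as operations on arrays.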

In [None]:
# Exercise 2a-1: Complete the following mean square error function in Python
# You should have imported numpy. Make sure you have run the previous cell of import.

def mean_square_loss(Y, Y_hat):
    '''
    Inputs - Y: vector / matrix of ground truths (outcomes)
             Y_hat: vector / matrix of predicted values from a model
    Return - Mean square error
    '''
    m = Y.shape[0]
    ### Your code here (2 - 3 lines) ###
    ### Hint: Remember to divide the result by 2, as in the equation of J(theta) above
    sq = None  # replace None with your code
    J = None   # replace None with your code
    return J
    

To test your implementation, we generate two 1-D arrays of length 5, y and y_hat, filled with random values and pass them to the mean_square_loss function above. If your mean_square_loss function is implemented correctly, you should see this output:

**MSE loss =  0.0347007606832**

Now run the following cell and compare the result. You do not have to change any code below.

In [None]:
# make sure our output is the same
np.random.seed(0)

# generate some random values of y and y^
y = np.random.rand(5)
y_hat = np.random.rand(5)

# display y and y_hat data
print("y     = ",y)
print("y_hat = ",y_hat)

# display the data type of y and y_hat
print("Type of y     :", type(y))
print("Type of y_hat :", type(y_hat))
print("="*50)
# run the mean_square_loss function you have implemented above
print("MSE loss = ",mean_square_loss(y, y_hat))


## 2. Cross Entropy Loss

The equation of the cross entropy loss, averaged over the samples, is
\begin{equation}
J = -\frac{1}{m}\sum_{i=1}^{m} \left[ y_i\log (h_\theta(x_i)) + (1 - y_i)\log(1 - h_\theta(x_i)) \right]
\end{equation}

where $m$ is the number of samples and $h_\theta(x_i)$ is the predicted output $(\hat{Y}_{i})$ for the input sample $x_i$, obtained by running it through the model with parameters $\theta$.
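
To get a feel for the equation, the sketch below evaluates the term for a single sample, $-[y\log(h) + (1-y)\log(1-h)]$, at a few hand-picked values. The helper name `sample_cross_entropy` is hypothetical, and it deliberately works on single numbers instead of arrays, so it is an illustration and not a solution to the exercise.

```python
import math

def sample_cross_entropy(y, h):
    # Hypothetical helper: cross entropy term for one sample,
    # with label y in [0, 1] and prediction h strictly between 0 and 1
    return -(y * math.log(h) + (1 - y) * math.log(1 - h))

confident_right = sample_cross_entropy(1, 0.9)   # label 1, prediction close to 1
unsure = sample_cross_entropy(1, 0.5)            # label 1, prediction undecided
confident_wrong = sample_cross_entropy(1, 0.1)   # label 1, prediction close to 0

print(round(confident_right, 4))  # 0.1054
print(round(unsure, 4))           # 0.6931
print(round(confident_wrong, 4))  # 2.3026
```

The loss grows sharply as a prediction becomes confidently wrong. Note that `math.log(0)` raises an error, so predictions of exactly 0 or 1 need care.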

You are encouraged to look into the terms "Entropy" and "Cross Entropy" for a better understanding.
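
As a starting point for that reading, here is a minimal sketch of both quantities for discrete probability distributions, using natural logarithms. The function names `entropy` and `cross_entropy` and the example distributions are assumptions made for illustration.

```python
import numpy as np

def entropy(p):
    # H(p) = -sum_k p_k * log(p_k); assumes every p_k > 0
    p = np.asarray(p, dtype=float)
    return -np.sum(p * np.log(p))

def cross_entropy(p, q):
    # H(p, q) = -sum_k p_k * log(q_k); assumes every q_k > 0
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return -np.sum(p * np.log(q))

p = [0.5, 0.5]   # "true" distribution
q = [0.9, 0.1]   # a model's estimate of it

print(entropy(p))           # about 0.693, i.e. log(2)
print(cross_entropy(p, q))  # about 1.204, larger than entropy(p)
```

The cross entropy equals the entropy when the estimate q matches p and is larger otherwise, which is why it is a natural measure of how far a prediction is from the truth.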

Now complete the `cross_entropy_loss` function in the exercise cell below. Again, use a vectorized implementation.

In [None]:
# Exercise 2a-2: Complete the following cross entropy loss in Python
# You should have imported numpy. Make sure you have run the previous cell of import.

def cross_entropy_loss(Y, Y_hat):
    '''
    Inputs - Y: vector / matrix of ground truths (outcomes)
             Y_hat: vector / matrix of predicted values from a model
    Return - Cross Entropy Loss
    '''
    ### Your code here (2 - 3 lines) ###
    m = None  # replace None with your code
    J = None  # replace None with your code
    return J

To test your implementation, we generate two 1-D arrays of length 5, y and y_hat, filled with random values and pass them to the cross_entropy_loss function above. If your cross_entropy_loss function is implemented correctly, you should see this output:

**Cross Entropy loss =  0.925855227907**

Now run the following cell and compare the result. You do not have to change any code below.


In [None]:
# make sure our output is the same
np.random.seed(0)

# generate some random values of y and y^
y = np.random.rand(5)
y_hat = np.random.rand(5)

# display y and y_hat data
print("y     = ",y)
print("y_hat = ",y_hat)

# display the data type of y and y_hat
print("Type of y     :", type(y))
print("Type of y_hat :", type(y_hat))
print("="*50)
# run the cross_entropy_loss function you have implemented above
print("Cross Entropy loss = ",cross_entropy_loss(y, y_hat))

## 3. Why vectorized?

For small inputs such as the length-5 arrays above, a vectorized cost function is not noticeably faster than a traditional for-loop. However, when the data size is huge, iterating through each sample with a for-loop becomes computationally costly. We will compare the time taken by both approaches as follows.

In [None]:
import time
import math

# make sure our random values are the same
np.random.seed(1)
size = 1000000
# generate some random values of y and y^
y = np.random.rand(size)
y_hat = np.random.rand(size)

# we've implemented the cross entropy loss using for-loop as below
def cross_entropy_for_loop(Y, Y_hat):
    m = Y.shape[0]
    acc = 0.0
    for i in range(m):
        tmp = Y[i] * math.log(Y_hat[i]) + (1 - Y[i]) * math.log(1 - Y_hat[i])
        acc += tmp
    return -acc / m

# Run the cross entropy loss using both approaches and compare the running time required.
# time.perf_counter() is a high-resolution clock intended for measuring short durations.
start_time = time.perf_counter()
J = cross_entropy_loss(y, y_hat)
total_time = time.perf_counter() - start_time
print("Cross Entropy loss (vectorized) = {:2.10}, total time taken = {:2.4}s".format(J,total_time))

start_time = time.perf_counter()
J = cross_entropy_for_loop(y, y_hat)
total_time = time.perf_counter() - start_time
print("Cross Entropy loss (for-loop) = {:2.10}, total time taken = {:2.4}s".format(J,total_time))


For an input vector of size 1000000 as above, both approaches produce the same cross entropy loss (up to floating-point rounding), yet the vectorized approach is over 100 times faster than the traditional for-loop. NumPy plays the key role here: it performs the array operations behind the scenes in optimized compiled code instead of the Python interpreter.