# The standard loss function for neural networks: cross-entropy

Dec 12 2019

This time, your job is to do a few more things with the same network:

• You should calculate not just the output, but also the cross-entropy loss and the mean squared loss

• NOTE: This time, the initial values differ from the ones we used in class: x1 = 0.5 and x2 = 0.4.

1) Softmax Layer


Recall the behavior of the softmax layer: given a vector x, the ith element of the softmax result is

$$\mathrm{softmax}(x)_i = \frac{e^{x_i}}{\sum_j e^{x_j}}$$

def softmax(x):
    return np.exp(x) / np.sum(np.exp(x))

That is, the exponential of the ith element, divided by the sum of the exponentials of all elements. This guarantees that the elements of the resulting softmax sum to 1.
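The sum-to-1 property can be checked directly. The sketch below uses only the `math` module and subtracts the largest input before exponentiating, a common trick to avoid overflow for large inputs that does not change the result. The name `stable_softmax` is hypothetical and not part of the exercise.

```python
from math import exp

def stable_softmax(xs):
    """Softmax over a list of floats; shifting by max(xs) avoids overflow."""
    shift = max(xs)
    exps = [exp(x - shift) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

probs = stable_softmax([0.29377055, 0.38073448])
print(probs)       # two probabilities
print(sum(probs))  # should be 1 up to floating-point rounding

# Inputs this large would overflow exp() without the shift.
big = stable_softmax([1000.0, 1000.0])
print(big)
```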

2) No bias

Note again that we ignore the bias feature and weight for the sake of simplicity in this exercise.

3) Nonlinearity

Recall that the outputs of the first hidden layer need to be passed through the tanh-function, while the outputs of the second layer do not. This is reflected in the network diagram.

In [1]:
from math import log, exp, tanh
import numpy as np

initial = np.array([0.5, 0.4])
W1 = np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])

# Hidden layer + activation function (tanh)
output1 = np.tanh(initial.dot(W1))
print('Output1: ', output1)

Output1:  [0.2069665  0.29131261 0.37136023]


In [2]:
W2 = np.array([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])

# Second hidden layer, NO activation function

output2 = output1.dot(W2)
print('Output2: ', output2)

Output2:  [0.29377055 0.38073448]


In [3]:
# Softmax
def softmax(x):
    return np.exp(x) / np.sum(np.exp(x))

result = softmax(output2)
print('Softmax: ', result)

Softmax:  [0.47827271 0.52172729]
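For checking hand calculations, the three steps (tanh hidden layer, linear second layer, softmax) can also be combined into one function without numpy. This is only a sketch: `matvec` and `forward` are hypothetical helper names, and the weights are the ones from the cells above.

```python
from math import exp, tanh

def matvec(x, W):
    """Row vector x times matrix W (given as a list of rows), like x.dot(W)."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(len(W[0]))]

def forward(x, W1, W2):
    hidden = [tanh(v) for v in matvec(x, W1)]  # first layer, tanh activation
    logits = matvec(hidden, W2)                # second layer, no activation
    exps = [exp(v) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]           # softmax over the two outputs

W1 = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
W2 = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
y_hat = forward([0.5, 0.4], W1, W2)
print(y_hat)
```

The printed values should agree with the softmax output of the numpy cells to the displayed precision.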


# Cross-entropy

Remember the formula for cross-entropy: given two vectors y and ŷ (where ŷ is the network’s output and y the true, desired value):

$$H(y, \hat{y}) = -\sum_i y_i \log_2 \hat{y}_i$$

In this example, the “desired” output vector y should be [0, 1]; that is, the uppermost output (say, class 0) should be 0 and the lowermost one (say, class 1) should be 1. Your job is then to calculate the output of the entire neural network, which should produce a vector of two values [?, ?] (ŷ), after which you should calculate the cross-entropy.

You can solve this completely manually, but it may be helpful to write small helper functions in Python to make sure you get your calculations right. Be sure to turn in all your work, not just final numbers. To use the exp, log, and tanh functions in Python, import the following:

from math import log, exp, tanh

Of course, you can use the numpy equivalents as well.

Also, note that the log-function should be base 2. That means that you need to specify the base as the second argument if you call log. For example, to calculate log(4) to base two, you would do:

log(4, 2)
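As a quick check of the base-2 logarithm, the sketch below compares three equivalent ways of computing it with the standard `math` module. The variable names are only illustrative.

```python
from math import log, log2

two_arg = log(4, 2)           # base given as the second argument
direct = log2(4)              # dedicated base-2 function
converted = log(4) / log(2)   # natural log converted to base 2

print(two_arg, direct, converted)
```

The third form also shows how a value computed with the natural logarithm can be converted to base 2: divide by `log(2)`.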

In [4]:
gold = np.array([0,1])

def calculateLoss(y, x):
    sum1 = 0
    for i in range(len(x)):
        sum1 += y[i] * log(1/x[i], 2) 
    return sum1

loss1 = calculateLoss(gold, result)
print("Cross-entropy: ", loss1)

Cross-entropy:  0.9386321906685277
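Because the gold vector is one-hot, only the term for the true class contributes, so the loss reduces to -log2 of the probability assigned to class 1. A minimal sketch of that observation, using a hypothetical helper `cross_entropy_bits` that skips zero-target terms (which also avoids evaluating the logarithm of a predicted probability of 0 for a class whose target is 0):

```python
from math import log2

def cross_entropy_bits(y_true, y_pred):
    """Cross-entropy in bits; terms with a zero target contribute nothing."""
    return -sum(t * log2(p) for t, p in zip(y_true, y_pred) if t != 0)

# Values copied from the softmax output above (rounded to 8 digits).
loss = cross_entropy_bits([0, 1], [0.47827271, 0.52172729])
print(loss)

# A perfect prediction gives zero loss, even though one probability is 0.
perfect = cross_entropy_bits([0, 1], [0.0, 1.0])
print(perfect)
```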


# Mean Squared Error (MSE)
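The mean of the squared errors between the predictions and the gold values:

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2$$

* gold = 1, prediction = 0.8
* For this single value (n = 1): MSE = (0.8 - 1)^2 = 0.04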

In [5]:
# gold = np.array([0,1])
# result = [0.47827271 0.52172729]

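def calcMSE(gold, result):
    # Initialize the accumulator once, before the loop, so that every
    # squared error is summed (not just the last one).
    total = 0
    for i in range(len(gold)):
        number = result[i] - gold[i]
        total += number * number
    # Divide by the number of outputs to get the mean.
    return total / len(gold)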
    
MSE = calcMSE(gold, result)
print('MSE: ', MSE)

MSE:  0.22874478312422142
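With a two-class softmax and a one-hot gold vector, the two errors are equal in size and opposite in sign, so both squared errors coincide and their mean equals either one of them. The sketch below illustrates this with plain Python; `mse` is a hypothetical helper name, and the predictions are the rounded softmax values from above.

```python
def mse(y_true, y_pred):
    """Mean of the squared differences between predictions and targets."""
    squared = [(p - t) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared) / len(squared)

gold = [0, 1]
pred = [0.47827271, 0.52172729]

per_output = [(p - t) ** 2 for t, p in zip(gold, pred)]
print(per_output)       # the two squared errors are (nearly) identical
print(mse(gold, pred))  # so the mean matches each of them
```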
