## Loss Function
A loss function measures how far a neural network's predicted output is from the expected output. Training adjusts the network's parameters to reduce this value, and the loss can also be tracked to monitor the network's performance during training. Popular loss functions in deep learning include Mean Squared Error, Cross Entropy Loss, and Hinge Loss.
* Common loss names that can be passed as the loss parameter (for example, as string identifiers in Keras) are:
    1. sparse_categorical_crossentropy
    2. binary_crossentropy
    3. categorical_crossentropy
    4. mean_absolute_error
    5. mean_squared_error
    
    
* For logistic regression we use log loss (binary cross-entropy) rather than mean squared error (MSE) or mean absolute error (MAE). It is worth checking why MSE is not used for logistic regression.
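One reason often given for preferring log loss in logistic regression is how it treats confident but wrong predictions. The sketch below is only an illustration under that assumption: the helper names `squared_error` and `log_loss_single` are hypothetical, and the probabilities are made-up values for a single sample whose true label is 1.

```python
import numpy as np

def squared_error(y_true, p):
    # Squared error for one sample; never exceeds 1 when p is in [0, 1]
    return (y_true - p) ** 2

def log_loss_single(y_true, p):
    # Binary cross-entropy for one sample; grows without bound as p moves
    # towards the wrong extreme
    return -(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

y_true = 1  # hypothetical sample whose true class is 1
for p in (0.9, 0.5, 0.1, 0.01):
    print(f"p={p}: squared error={squared_error(y_true, p):.4f}, "
          f"log loss={log_loss_single(y_true, p):.4f}")
```

As the predicted probability for the true class shrinks, the squared error levels off below 1 while the log loss keeps increasing, so a badly wrong prediction is penalized much more heavily.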

<img src = "losses.png" width = "800px" height = "600px"></img>

* **One full pass through all the training samples, computing the loss (here the Mean Squared Error, MSE) over them, is called one epoch. See the image below:**

<img src = "epoch1.png" width = "800px" height = "600px"></img>
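To make the idea of an epoch concrete, here is a minimal sketch that fits a single weight with gradient descent. Everything in it is an assumption for illustration: the toy data (`y = 2x`), the learning rate, and the number of epochs are not from the original notebook.

```python
import numpy as np

# Hypothetical toy data: the true relationship is y = 2 * x
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x

w = 0.0               # single weight to learn, started at zero
learning_rate = 0.01  # assumed value, chosen only so the loss decreases
history = []          # MSE recorded once per epoch

for epoch in range(5):
    y_pred = w * x                         # predictions for every sample
    loss = np.mean((y - y_pred) ** 2)      # MSE over all samples = one epoch
    history.append(loss)
    grad = -2 * np.mean(x * (y - y_pred))  # derivative of the MSE w.r.t. w
    w -= learning_rate * grad              # one parameter update per epoch
    print(f"epoch {epoch + 1}: MSE={loss:.4f}, w={w:.4f}")
```

Each loop iteration touches all four samples exactly once, which is what makes it one epoch; the recorded MSE should fall from one epoch to the next.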

In [4]:
# Now let's see how three loss functions (MAE, log loss and MSE) work.
import numpy as np

# Here we have two numpy arrays with five samples each: the first holds the predicted values and the second the true values.
y_predicted = np.array([1,1,0,0,1])
y_true = np.array([0.30,0.7,1,0,0.5])

In [5]:
# We implement MAE first. The function takes two parameters: the true values and the predicted values.
# We walk through both arrays together in a loop using the zip function.
def mae (y_true, y_predicted):
    total_error = 0
    for yt, yp in zip(y_true, y_predicted):
        print(yt, yp)    # To print the values

In [6]:
# Call the function to print the values as a check:
mae(y_true, y_predicted)

0.3 1
0.7 1
1.0 0
0.0 0
0.5 1


In [10]:
# Now we compute the absolute error with Python's 'abs' and add it to total_error in each iteration:
def mae (y_true, y_predicted):
    total_error = 0
    for yt, yp in zip(y_true, y_predicted):
        total_error += abs(yt - yp)
    print("Total error: ", total_error)
    mae = total_error / len(y_predicted)   # Mean of the absolute errors
    print("MAE: ", mae)

In [11]:
# Let's see what happens:
mae(y_true, y_predicted)

Total error:  2.5
MAE:  0.5


In [12]:
# Above we used a for loop, but numpy supports vectorized operations, so we can compute the same thing without a loop.
# With numpy arrays you can simply subtract:
y_predicted - y_true  # Gives the signed difference for each sample.

array([ 0.7,  0.3, -1. ,  0. ,  0.5])

In [13]:
# But if you use 'abs', it will give you the absolute error:
np.abs(y_predicted - y_true)

array([0.7, 0.3, 1. , 0. , 0.5])

In [14]:
# Mean Absolute Error is the mean of these absolute differences, so just take the mean:
np.mean(np.abs(y_predicted - y_true))

0.5

In [15]:
# If you want the total error, take the sum instead:
np.sum(np.abs(y_predicted - y_true))

2.5

* So we see, we did the whole loop in just one line. This is the power of numpy.
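As a quick check that the loop and the one-liner really agree, the sketch below wraps both in functions that return the value instead of printing it. The names `mae_loop` and `mae_np` are hypothetical and chosen only for this comparison.

```python
import numpy as np

def mae_loop(y_true, y_predicted):
    # Plain Python version: accumulate absolute errors one sample at a time
    total_error = 0
    for yt, yp in zip(y_true, y_predicted):
        total_error += abs(yt - yp)
    return total_error / len(y_predicted)

def mae_np(y_true, y_predicted):
    # Vectorized version: the whole loop collapses into one expression
    return np.mean(np.abs(np.asarray(y_predicted) - np.asarray(y_true)))

y_predicted = np.array([1, 1, 0, 0, 1])
y_true = np.array([0.30, 0.7, 1, 0, 0.5])

print(mae_loop(y_true, y_predicted), mae_np(y_true, y_predicted))
```

Returning the value rather than printing it makes the functions reusable, for example to compare them with `np.isclose`.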

In [16]:
# To implement log loss (binary cross-entropy) we need the log() function. The log of zero is not defined: 
# numpy emits a warning and returns -inf instead of a usable number. So we need to replace zero with a number 
# which is close to zero.
np.log([0])

array([-inf])

In [17]:
# If we do:
np.log([0.00000000000000001])  # this will be fine.

array([-39.14394658])

In [20]:
# So in the log loss function we replace zero with a number which is very close to zero. The formula also contains 
# log(1 - y_predicted), so a 1 in y_predicted creates the same problem again. To handle both cases we first 
# define epsilon, a value very close to 0.
epsilon = 1e-16

In [21]:
# Here we replace every 1 in y_predicted with a value which is close to 1 but not 1. 
# Similarly we replace every 0 with a number which is close to zero but not zero.
# We go over all the values of y_predicted with a list comprehension.
y_predicted_new = [max(i, epsilon) for i in y_predicted] # Replaces every 0 with a value close to 0.
y_predicted_new

[1, 1, 1e-16, 1e-16, 1]

In [22]:
# Let's do the same to change every 1 into a value close to 1:
y_predicted_new = [min(i, 1 - epsilon) for i in y_predicted_new] # Replaces every 1 with a value close to 1.
y_predicted_new

[0.9999999999999999, 0.9999999999999999, 1e-16, 1e-16, 0.9999999999999999]
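The two list comprehensions above can also be written as a single call to `np.clip`, which bounds every element between a lower and an upper limit. This is only an alternative sketch, not the approach the notebook itself takes; the name `y_clipped` is hypothetical.

```python
import numpy as np

epsilon = 1e-16
y_predicted = np.array([1, 1, 0, 0, 1])

# Convert to float first so the clipped values are not truncated to integers,
# then force every value into the range [epsilon, 1 - epsilon]
y_clipped = np.clip(y_predicted.astype(float), epsilon, 1 - epsilon)
print(y_clipped)

# Both logs used by the log loss formula are now finite
print(np.log(y_clipped))
print(np.log(1 - y_clipped))
```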

In [23]:
# Now if we take the log of y_predicted_new, every value is finite (no -inf):
y_predicted_new = np.array(y_predicted_new)
np.log(y_predicted_new)

array([-1.11022302e-16, -1.11022302e-16, -3.68413615e+01, -3.68413615e+01,
       -1.11022302e-16])
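The choice of epsilon matters because of floating-point precision: if epsilon is too small, `1 - epsilon` rounds back to exactly 1 and the upper clip silently does nothing. The sketch below checks a few candidate values; the helper name `upper_clip_is_effective` is hypothetical, and the behavior shown assumes standard 64-bit floats.

```python
import numpy as np

def upper_clip_is_effective(eps):
    # True only if 1 - eps is representable as a float strictly below 1.0
    return (1.0 - eps) < 1.0

for eps in (1e-15, 1e-16, 1e-17):
    print(f"epsilon={eps}: 1 - epsilon < 1 is {upper_clip_is_effective(eps)}")

# For comparison, the gap between 1.0 and the next larger float
print("machine epsilon:", np.finfo(float).eps)
```

With 64-bit floats, `1e-16` is about as small as epsilon can go while still keeping `1 - epsilon` below 1; a slightly larger value such as `1e-15` leaves more margin.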

In [24]:
# If we take the log of the original y_predicted, some values are infinite. That is why we built the clipped 
# array y_predicted_new first.
np.log(y_predicted)

array([  0.,   0., -inf, -inf,   0.])

In [26]:
# To implement log loss, we apply the log loss formula to the clipped predictions:
-np.mean(y_true*np.log(y_predicted_new)+(1-y_true)*np.log(1-y_predicted_new))

18.389312468484075
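For reference, the expression evaluated above corresponds to the usual binary cross-entropy (log loss) formula. The symbols are notation introduced here rather than taken from the notebook: $n$ is the number of samples, $y_i$ is the true value (`y_true`), and $\hat{y}_i$ is the clipped prediction (`y_predicted_new`).

$$
\text{LogLoss} = -\frac{1}{n}\sum_{i=1}^{n}\Big[\,y_i \log(\hat{y}_i) + (1 - y_i)\log(1 - \hat{y}_i)\Big]
$$

The large result comes from predictions that sit at the clipped extremes while the true value disagrees with them: each such term contributes roughly $-\log(10^{-16}) \approx 36.8$ times its weight.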

In [27]:
# Here the steps above are wrapped in a single function called 'log_loss':
def log_loss(y_true, y_predicted):
    epsilon=1e-16
    y_predicted_new = [max(i,epsilon) for i in y_predicted]
    y_predicted_new = [min(i,1-epsilon) for i in y_predicted_new]
    y_predicted_new = np.array(y_predicted_new)
    return -np.mean(y_true*np.log(y_predicted_new)+(1-y_true)*np.log(1-y_predicted_new))

In [28]:
# Now when we call the function, it will give us the log loss:
log_loss(y_true, y_predicted)

18.389312468484075

### Exercise
Implement mean squared error (MSE) in two ways:

    1. Without using numpy (i.e. using plain python)
    2. With the use of numpy

In [33]:
# To implement MSE in plain Python, we define a function which takes both arrays. As with MAE, we loop over 
# both arrays together using the zip function and add each squared difference to total_error. Finally we print 
# the total squared error and the MSE:

def mse(y_true, y_predicted):
    total_error = 0
    for yt, yp in zip(y_true, y_predicted):
        total_error += (yt-yp)**2
    print("Total Generated Squared Error is: ", total_error)
    mse = total_error/len(y_predicted)
    print("Mean Squared Error is: ", mse)
    
mse(y_true, y_predicted)

Total Generated Squared Error is:  1.83
Mean Squared Error is:  0.366


In [34]:
# Now with numpy arrays, using numpy's 'square' function:
np.mean(np.square(y_true - y_predicted))

0.366
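To see what squaring changes compared with MAE, the sketch below reuses the same five samples and looks at how much each sample contributes to the total error under each measure. The variable names `abs_share` and `sq_share` are hypothetical.

```python
import numpy as np

y_predicted = np.array([1, 1, 0, 0, 1])
y_true = np.array([0.30, 0.7, 1, 0, 0.5])

abs_err = np.abs(y_true - y_predicted)   # per-sample absolute error
sq_err = np.square(y_true - y_predicted) # per-sample squared error

# Fraction of the total error that each sample is responsible for
abs_share = abs_err / abs_err.sum()
sq_share = sq_err / sq_err.sum()

print("share of total absolute error:", np.round(abs_share, 3))
print("share of total squared error: ", np.round(sq_share, 3))
```

The sample with the largest error (the third one) takes up a bigger share of the squared total than of the absolute total, which is why MSE is more sensitive to large individual errors than MAE.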

* **That's all about loss functions.**