### `Cost` or `Loss` Function

- The **`loss`** function is used while training a neural network. TensorFlow provides several loss functions, including:
  - **sparse_categorical_crossentropy**
  - **binary_crossentropy**
  - **categorical_crossentropy**
  - **mean_absolute_error**
  - **mean_squared_error**
  
- The **`loss`** function measures the error between the actual and the predicted values.  
- Suppose we have 13 training samples. We can then calculate the error for each of the 13 samples and sum them in the following way:

<img src="images/loss.png" width=700>

- Here the error of an individual sample is called the **loss**, and the mean of the errors over all samples is called the **cost** function.
- One full pass over all the training samples, calculating the error for each of them, is called one **epoch**.
- Different **cost** functions are available in **TensorFlow**, as shown below:

<img src="images/cost_tf.png" width=900>

- For **logistic regression** we use **binary_crossentropy**. For the reason, see https://towardsdatascience.com/why-not-mse-as-a-loss-function-for-logistic-regression-589816b5e03c
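The difference between **loss** and **cost** described above can be sketched in a few lines of NumPy. This is only an illustration: the values in `actual` and `predicted` are made up and are not taken from the images, and absolute error is used as the per-sample loss purely as an example.

```python
import numpy as np

# Hypothetical values, chosen only for illustration
actual = np.array([1.0, 0.0, 1.0, 1.0])
predicted = np.array([0.9, 0.2, 0.6, 0.4])

# Loss: one error value per training sample (absolute error here)
per_sample_loss = np.abs(actual - predicted)

# Cost: the mean of the per-sample losses over the whole training set
cost = per_sample_loss.mean()

print("Per-sample loss:", per_sample_loss)
print("Cost:", cost)
```

The per-sample array has one entry for each training sample, while the cost is a single number summarizing the whole set.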

In [1]:
import numpy as np

In [3]:
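# Creating predicted and true values for 5 samples
# Note: these values are only for illustration. In a real binary classification
# problem, y_true usually holds the 0/1 labels and y_predicted holds probabilities.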

y_predicted = np.array([1,1,0,0,1])
y_true = np.array([0.30,0.7,1,0,0.5])

#### Implementing `mean_absolute_error` manually

In [4]:
# zip() lets us loop over two arrays in parallel

def mae(y_predicted, y_true):
    total_error = 0
    for yp, yt in zip(y_predicted, y_true):
        total_error += abs(yp - yt)
    print("Total error is:",total_error)
    mae = total_error/len(y_predicted)
    print("Mean absolute error is:",mae)
    return mae

In [5]:
mae(y_predicted, y_true)

Total error is: 2.5
Mean absolute error is: 0.5


0.5

#### Implementing the same thing more easily with NumPy

In [6]:
# Getting the absolute errors

np.abs(y_predicted - y_true)

array([0.7, 0.3, 1. , 0. , 0.5])

In [7]:
# Getting the mean absolute error

np.mean(np.abs(y_predicted-y_true))

0.5

In [8]:
# implementing mae using numpy

def mae_np(y_predicted, y_true):
    return np.mean(np.abs(y_predicted-y_true))

In [9]:
# Use a separate name so the mae() function defined earlier is not overwritten
mae_value = mae_np(y_predicted, y_true)
print("Mean absolute error is:",mae_value)

Mean absolute error is: 0.5
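**mean_squared_error** from the list of loss functions can be written in the same style. The sketch below is a minimal NumPy version; the name `mse_np` is an assumption chosen to mirror `mae_np`, and the sample values are the same ones used above.

```python
import numpy as np

def mse_np(y_predicted, y_true):
    """Mean squared error: the average of the squared differences."""
    y_predicted = np.asarray(y_predicted, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    return np.mean((y_predicted - y_true) ** 2)

# Same sample values as used for the mean absolute error
y_predicted = np.array([1, 1, 0, 0, 1])
y_true = np.array([0.30, 0.7, 1, 0, 0.5])

print("Mean squared error is:", mse_np(y_predicted, y_true))
```

Because the differences are squared, large errors contribute more to the result than they do in the mean absolute error.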


#### Implementing `Log Loss` or `binary_crossentropy`

In [11]:
# log(0) is undefined; NumPy returns -inf and emits a RuntimeWarning

np.log([0])

array([-inf])

In [12]:
# So whenever we would need log(0), we use a value close to 0 but not 0 instead

np.log([0.000000000000001])

array([-34.53877639])

In [13]:
# We need epsilon so that neither log(yhat) nor log(1-yhat) in the formula is ever evaluated at 0

epsilon = 1e-15

In [14]:
# Checking the value of the epsilon

np.log([1e-15])

array([-34.53877639])

In [15]:
y_predicted

array([1, 1, 0, 0, 1])

In [16]:
# Similarly we want to replace the value 0 with a value close to 0 but not 0
# Here we take the maximum value between i and epsilon

y_predicted_new = [max(i,epsilon) for i in y_predicted]
y_predicted_new

[1, 1, 1e-15, 1e-15, 1]

In [18]:
1-epsilon

0.999999999999999

In [17]:
# Here we want to replace the value 1 with a value close to 1 but not 1
# Here we take the minimum value between i and 1-epsilon

y_predicted_new = [min(i,1-epsilon) for i in y_predicted_new]
y_predicted_new

[0.999999999999999, 0.999999999999999, 1e-15, 1e-15, 0.999999999999999]
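The two clamping steps above (the `max` with `epsilon`, then the `min` with `1-epsilon`) can also be done in one vectorized call. The sketch below uses `np.clip` as an alternative to the list comprehensions; the variable name `y_predicted_clipped` is an assumption used only for this example.

```python
import numpy as np

epsilon = 1e-15
y_predicted = np.array([1, 1, 0, 0, 1])

# Clamp every value into [epsilon, 1 - epsilon] in a single call.
# The cast to float ensures the clipped values are not stored as integers.
y_predicted_clipped = np.clip(y_predicted.astype(float), epsilon, 1 - epsilon)

print(y_predicted_clipped)
```

After clipping, both `np.log(y_predicted_clipped)` and `np.log(1 - y_predicted_clipped)` are finite for every element.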

In [19]:
# Now if we do the log

y_predicted_new = np.array(y_predicted_new)
np.log(y_predicted_new)

array([-9.99200722e-16, -9.99200722e-16, -3.45387764e+01, -3.45387764e+01,
       -9.99200722e-16])

In [20]:
# With the original predicted values, the zeros give -inf

np.log(y_predicted)

array([  0.,   0., -inf, -inf,   0.])

In [21]:
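# Now using the log loss formula: -mean(y*log(yhat) + (1-y)*log(1-yhat))
# Here y is y_true and yhat is the clamped prediction y_predicted_new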

-np.mean(y_true*np.log(y_predicted_new)+(1-y_true)*np.log(1-y_predicted_new))

17.2696280766844

In [22]:
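# Defining the log loss function
# Note: it relies on the `epsilon` value defined earlier

def log_loss(y_true, y_predicted):
    # Clamp predictions into [epsilon, 1-epsilon] so both logs stay finite
    y_predicted_new = [max(i,epsilon) for i in y_predicted]
    y_predicted_new = [min(i,1-epsilon) for i in y_predicted_new]
    y_predicted_new = np.array(y_predicted_new)
    # Mean of the per-sample binary cross-entropy terms
    return -np.mean(y_true*np.log(y_predicted_new)+(1-y_true)*np.log(1-y_predicted_new))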

In [23]:
log_loss(y_true, y_predicted)

17.2696280766844
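In typical binary classification the true values are 0/1 labels and the predictions are probabilities. The sketch below applies the same log loss formula to inputs of that shape and also returns the per-sample losses. The function name `binary_cross_entropy` and the values in `labels` and `probs` are assumptions made for this example, not part of the original notebook.

```python
import numpy as np

def binary_cross_entropy(y_true, y_prob, epsilon=1e-15):
    """Return (per-sample losses, mean loss) for binary labels and probabilities."""
    y_true = np.asarray(y_true, dtype=float)
    # Clamp probabilities so that log() is never evaluated at 0
    y_prob = np.clip(np.asarray(y_prob, dtype=float), epsilon, 1 - epsilon)
    per_sample = -(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))
    return per_sample, per_sample.mean()

# Hypothetical labels and predicted probabilities
labels = np.array([1, 1, 0, 0, 1])
probs = np.array([0.9, 0.6, 0.2, 0.4, 0.1])

per_sample, cost = binary_cross_entropy(labels, probs)
print("Per-sample loss:", per_sample)
print("Log loss:", cost)
```

The last sample has label 1 but a predicted probability of only 0.1, so it receives the largest per-sample loss: log loss penalizes confident wrong predictions heavily.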