# Cost Functions

## Introduction

In this lab, we will explore the concept of cost functions in the context of neural networks. Cost functions play a crucial role in training neural networks as they provide feedback on the model's performance and guide it towards finding the optimal set of parameters. We will begin by understanding the role of cost functions using an analogy of a GPS navigation system. Then, we will discuss various types of cost functions such as Mean Squared Error (MSE), Cross-Entropy, and Log Loss, and their applications in different problem domains. Let's dive in!

## Part 1: Understanding Cost Functions

Cost functions serve as a measure of how 'far off' the model's predictions are from the actual outcomes. This distance or error is quantified by the cost function, which provides feedback to the model. To illustrate this concept, let's consider a GPS navigation system analogy. Imagine you are driving in a vast terrain, and your goal is to reach a specific destination. The GPS navigation system helps you by providing directions and constantly evaluating your current position. It compares your position with the desired destination and calculates the 'distance' or error between them. Similarly, a cost function guides a neural network towards the correct solution by measuring the error between the model's predictions and the actual outcomes.
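To make the analogy concrete, here is a minimal sketch of a cost function steering a model. It assumes a one-parameter model `y = w * x` trained with plain gradient descent; the model, the data, and the names `w` and `learning_rate` are illustrative choices, not something this lab prescribes. The cost plays the role of the GPS distance readout, and its gradient indicates which way to steer.

```python
import numpy as np

# Illustrative data generated by y = 2x, so the "destination" is w = 2.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x

w = 0.0               # hypothetical starting position
learning_rate = 0.05  # step size picked by hand for this toy problem

costs = []
for step in range(50):
    y_pred = w * x
    cost = np.mean((y - y_pred) ** 2)         # how far off the predictions are
    grad = -2.0 * np.mean((y - y_pred) * x)   # derivative of the cost w.r.t. w
    w -= learning_rate * grad                 # move in the direction that lowers the cost
    costs.append(cost)

print(f"final w = {w:.4f}")
print(f"first cost = {costs[0]:.4f}, last cost = {costs[-1]:.2e}")
```

The cost shrinks as `w` approaches 2, which is the feedback loop the analogy describes: measure the error, adjust, and measure again.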

## Part 2: Types of Cost Functions

There are various types of cost functions used in neural networks. Let's explore some of the commonly used ones:

### 1. Mean Squared Error (MSE)

MSE is a popular cost function used in regression problems. It calculates the average squared difference between the predicted values and the true values. The equation for MSE is:

$$\text{MSE} = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y_i})^2$$

where $n$ is the number of samples, $y_i$ represents the true value, and $\hat{y_i}$ represents the predicted value for the $i$-th sample.
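As a small worked example (the numbers are made up for illustration), take $n = 2$ samples with true values $y = (3, 5)$ and predictions $\hat{y} = (2, 7)$:

$$\text{MSE} = \frac{1}{2}\left[(3 - 2)^2 + (5 - 7)^2\right] = \frac{1 + 4}{2} = 2.5$$

The second error is twice as large as the first but contributes four times as much, which is how squaring makes MSE weight large errors more heavily.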

### 2. Cross-Entropy

Cross-Entropy is commonly used in binary classification and multi-class classification problems. It quantifies the difference between the predicted class probabilities and the true class probabilities. For a single sample, the equation for Cross-Entropy is:

$$\text{Cross-Entropy} = -\sum_{i=1}^{n}(y_i\log(\hat{y_i}))$$

where $n$ is the number of classes, $y_i$ represents the true probability of class $i$, and $\hat{y_i}$ represents the predicted probability of class $i$.
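As a worked example with made-up numbers, suppose there are three classes and the true class is the first one, so $y = (1, 0, 0)$, and the model predicts $\hat{y} = (0.6, 0.3, 0.1)$. Assuming the natural logarithm:

$$\text{Cross-Entropy} = -\left[1 \cdot \log(0.6) + 0 \cdot \log(0.3) + 0 \cdot \log(0.1)\right] = -\log(0.6) \approx 0.511$$

With a one-hot target, only the term for the true class survives, so the loss is the negative log of the probability assigned to the correct class.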

### 3. Log Loss

Log Loss is the name commonly given to Cross-Entropy in the binary case, averaged over the samples. It penalizes a prediction more heavily the more confidently wrong it is. The equation for Log Loss is:

$$\text{Log Loss} = -\frac{1}{n}\sum_{i=1}^{n}\left[y_i\log(\hat{y_i}) + (1 - y_i)\log(1 - \hat{y_i})\right]$$

where $n$ is the number of samples, $y_i$ represents the true class label (0 or 1), and $\hat{y_i}$ represents the predicted probability of class 1 for the $i$-th sample.
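The relationship between this formula and Cross-Entropy can be checked numerically. The sketch below treats each binary label as a two-class one-hot target and compares the two computations; the function names and sample data are hypothetical, chosen only for this illustration.

```python
import numpy as np

def binary_log_loss(y_true, p):
    """Average of -[y*log(p) + (1-y)*log(1-p)] over the samples."""
    y_true = np.asarray(y_true, dtype=float)
    p = np.asarray(p, dtype=float)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))

def two_class_cross_entropy(y_true, p):
    """Cross-entropy with each label written as a one-hot row [class 0, class 1]."""
    y_true = np.asarray(y_true, dtype=float)
    p = np.asarray(p, dtype=float)
    targets = np.column_stack([1 - y_true, y_true])
    probs = np.column_stack([1 - p, p])
    per_sample = -np.sum(targets * np.log(probs), axis=1)
    return float(np.mean(per_sample))

# Made-up labels and predicted probabilities of class 1
labels = np.array([1, 0, 1, 1])
probs_class1 = np.array([0.8, 0.3, 0.6, 0.9])

loss_binary = binary_log_loss(labels, probs_class1)
loss_two_class = two_class_cross_entropy(labels, probs_class1)
print(np.isclose(loss_binary, loss_two_class))
```

The two values agree, which suggests treating Log Loss as the two-class special case of Cross-Entropy rather than as a separate cost function.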

## Part 3: Choosing the Right Cost Function

The choice of a cost function depends on the problem at hand. Different problems require different cost functions to effectively measure and optimize the model's performance. Here are some examples:

1. **Regression**: In regression problems, where the goal is to predict continuous values, Mean Squared Error (MSE) is commonly used because squaring penalizes large errors more heavily than small ones. An example is predicting housing prices based on features such as area, location, and number of rooms.

2. **Binary Classification**: In binary classification problems, where the goal is to classify instances into two classes, Log Loss is commonly used. It is the two-class form of Cross-Entropy, so the two names refer to the same quantity in this setting. It penalizes confident predictions that turn out to be wrong especially heavily. An example scenario is email spam detection.

3. **Multi-class Classification**: In multi-class classification problems, where the goal is to classify instances into more than two classes, Cross-Entropy is commonly used. It compares the predicted class probabilities with the true class probabilities for each instance. An example is classifying images into categories such as cats, dogs, and birds.
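For the multi-class case, the per-instance comparison can be sketched as follows. The three images, their labels, and the predicted probabilities are invented for illustration, and averaging the per-image losses is one common convention rather than the only option.

```python
import numpy as np

classes = ["cat", "dog", "bird"]  # illustrative label order

# One row per image; each row is a one-hot true distribution over the classes.
y_true = np.array([
    [1, 0, 0],   # image 1 is a cat
    [0, 1, 0],   # image 2 is a dog
    [0, 0, 1],   # image 3 is a bird
], dtype=float)

# Hypothetical model outputs; each row sums to 1.
y_pred = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
])

# Cross-entropy per image (sum over classes), then the mean over images.
per_image = -np.sum(y_true * np.log(y_pred), axis=1)
mean_loss = per_image.mean()

for name, loss in zip(classes, per_image):
    print(f"true class {name}: loss = {loss:.4f}")
print(f"mean loss = {mean_loss:.4f}")
```

The bird image has the largest loss because the model assigned only 0.4 to the correct class, even though that was still its top prediction.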

## Conclusion

In this lab, we explored the concept of cost functions in neural networks. We learned that cost functions play a vital role in training models by providing feedback on the model's performance. We discussed different types of cost functions, including Mean Squared Error (MSE), Cross-Entropy, and Log Loss, along with their applications in various problem domains. Understanding cost functions and choosing the appropriate one based on the problem at hand is crucial for building effective neural network models.


## Python Implementation

Now, let's implement some code examples to demonstrate the calculation of cost functions in Python. We will start with MSE for a regression problem and then move on to Cross-Entropy and Log Loss for classification problems.

In [ ]:
# Importing necessary libraries
import numpy as np

# Function to calculate Mean Squared Error (MSE)
def mean_squared_error(y_true, y_pred):
    n = len(y_true)
    mse = np.sum((y_true - y_pred) ** 2) / n
    return mse

# Example usage
true_values = np.array([1, 2, 3, 4, 5])
predicted_values = np.array([1.2, 1.8, 3.2, 3.9, 5.1])
mse = mean_squared_error(true_values, predicted_values)
print(f"{mse:.4f}")

0.0280

In the above code cell, we defined a function `mean_squared_error` that takes in the true values and predicted values as input and calculates the mean squared error using the provided equation. We then demonstrated the usage of the function with a sample set of true values and predicted values. The squared errors are 0.04, 0.04, 0.04, 0.01, and 0.01, so the Mean Squared Error is 0.14 / 5 = 0.028.

In [ ]:
# Function to calculate Cross-Entropy for a single sample
def cross_entropy(y_true, y_pred):
    epsilon = 1e-15  # Small value to avoid taking log(0)
    # Sum over the classes, matching the equation in Part 2 (no division by n)
    ce = -np.sum(y_true * np.log(y_pred + epsilon))
    return ce

# Example usage: one sample, three classes, the true class is the second one
true_probabilities = np.array([0, 1, 0])
predicted_probabilities = np.array([0.2, 0.7, 0.1])
ce = cross_entropy(true_probabilities, predicted_probabilities)
print(f"{ce:.4f}")

0.3567

In the code above, we defined a function `cross_entropy` that takes the true and predicted class probabilities for a single sample and sums $-y_i\log(\hat{y_i})$ over the classes. Both inputs are probability distributions, so each must sum to 1. Because the true distribution here is one-hot, only the true class contributes, and the result is $-\log(0.7) \approx 0.3567$.

In [ ]:
# Function to calculate Log Loss
def log_loss(y_true, y_pred):
    n = len(y_true)
    epsilon = 1e-15  # Small value to avoid taking log(0)
    ll = -np.sum(y_true * np.log(y_pred + epsilon) + (1 - y_true) * np.log(1 - y_pred + epsilon)) / n
    return ll

# Example usage
true_labels = np.array([0, 1, 1, 0, 1])
predicted_probabilities = np.array([0.1, 0.9, 0.8, 0.3, 0.7])
ll = log_loss(true_labels, predicted_probabilities)
print(f"{ll:.4f}")

0.2294

In the code above, we defined a function `log_loss` that takes in the true labels and predicted probabilities as input and averages the per-sample loss over the $n$ samples. For the sample data, the per-sample losses are $-\log(0.9)$, $-\log(0.9)$, $-\log(0.8)$, $-\log(0.7)$, and $-\log(0.7)$, which average to about 0.2294.
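The `epsilon` term in the functions above guards against `log(0)`. The sketch below shows what can happen without a guard, using `np.clip` as the guard instead (a common alternative to adding `epsilon` inside the logarithm). The function name `log_loss_clipped` and the data are hypothetical.

```python
import numpy as np

def log_loss_clipped(y_true, y_pred, epsilon=1e-15):
    """Log loss with predictions clipped into [epsilon, 1 - epsilon]."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.clip(np.asarray(y_pred, dtype=float), epsilon, 1 - epsilon)
    return float(-np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred)))

labels = np.array([1.0, 0.0])
perfect = np.array([1.0, 0.0])  # predictions of exactly 1 and 0

# Without a guard, even perfect predictions break: (1 - y) * log(1 - p)
# becomes 0 * log(0) = 0 * (-inf), which is nan.
with np.errstate(divide="ignore", invalid="ignore"):
    unguarded = -np.mean(labels * np.log(perfect) + (1 - labels) * np.log(1 - perfect))

guarded = log_loss_clipped(labels, perfect)

print("unguarded is nan:", bool(np.isnan(unguarded)))
print("guarded is finite:", bool(np.isfinite(guarded)))
```

With clipping, the loss for perfect predictions is a tiny positive number rather than `nan`, so it can still be averaged and compared across batches.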