### Cost Function: Definition, Intuition & Examples

| **Aspect**             | **Details**                                                                                                                                                                                                                                  |
| ---------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Definition**         | A **Cost Function** (often used interchangeably with **Loss Function**) is a mathematical function that scores a machine learning model by quantifying the error between predicted and actual outputs. Strictly speaking, the loss is usually defined per sample and the cost is its average over the dataset. |
| **Objective**          | Minimize the cost during training so that predictions move closer to the actual outputs. A lower cost means a better fit to the data it is measured on. |
| **Mathematical Form**  | Let $\hat{y}_i$ be the predicted output and $y_i$ be the actual output. For $n$ samples:<br>**MSE (Mean Squared Error):**<br>$J(\theta) = \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2$                                                    |
| **Types (Examples)**   | • **MSE (Mean Squared Error)** – Regression<br>• **MAE (Mean Absolute Error)** – Regression<br>• **Huber Loss** – Regression (less sensitive to outliers)<br>• **Binary Cross-Entropy** – Binary Classification<br>• **Categorical Cross-Entropy** – Multi-class Classification<br>• **Hinge Loss** – SVMs |
| **Use Case Examples**  | • Predicting house prices (MSE)<br>• Image classification (Cross-Entropy)<br>• Sentiment analysis (Binary Cross-Entropy)                                                                                                                     |
| **Intuition**          | The cost function acts like a **compass**, guiding the optimization algorithm (e.g., Gradient Descent) on how to update model parameters to reduce prediction error.                                                                         |
| **Role in Training**   | At each training step (one or more per epoch), the optimizer computes the gradient of the cost function with respect to the model parameters and updates the weights in the direction that reduces the cost. |
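| **Visualization**      | For simple models such as linear regression with MSE, the cost plotted against the parameters is a **convex**, bowl-shaped curve whose global minimum corresponds to the optimal parameter values. Neural networks generally have non-convex cost surfaces. |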
| **Code Example (MSE)** | `import numpy as np`<br>`def mse(y_true, y_pred):`<br>&nbsp;&nbsp;&nbsp;&nbsp;`return np.mean((y_true - y_pred) ** 2)` |

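The **Intuition** and **Role in Training** rows can be made concrete with a small sketch: batch gradient descent fitting a line by repeatedly stepping against the gradient of the MSE. The function name `fit_line_gd`, the learning rate, the step count, and the synthetic data are all assumptions chosen for illustration, not part of the original notebook.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error, as defined in the table above."""
    return np.mean((y_true - y_pred) ** 2)

def fit_line_gd(x, y, lr=0.05, steps=500):
    """Fit y ≈ w * x + b by batch gradient descent on the MSE (illustrative sketch)."""
    w, b = 0.0, 0.0
    history = []
    for _ in range(steps):
        y_pred = w * x + b
        history.append(mse(y, y_pred))
        residual = y_pred - y
        # Gradients of J = mean((w*x + b - y)^2) with respect to w and b
        grad_w = 2.0 * np.mean(residual * x)
        grad_b = 2.0 * np.mean(residual)
        # Step against the gradient to reduce the cost
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, history

# Synthetic, noise-free data generated from y = 2x + 1 (an assumption for the demo)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

w, b, history = fit_line_gd(x, y)
print(f"w = {w:.3f}, b = {b:.3f}")
print(f"cost at first step = {history[0]:.3f}, at last step = {history[-1]:.2e}")
```

With these settings the cost should fall steadily and the parameters should approach the slope 2 and intercept 1 used to generate the data. A learning rate that is too large for the data would make the cost grow instead, which is why `lr` usually needs tuning.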

In [1]:
# Cost function example: Mean Squared Error (MSE), regression
# Defines a simple cost function that computes the mean squared error
# between predicted and actual values.
import numpy as np


def cost_function(predicted, actual):
    """
    Calculate the mean squared error between predicted and actual values.

    Parameters:
    predicted (np.ndarray): Predicted values.
    actual (np.ndarray): Actual values.

    Returns:
    float: Mean squared error.
    """
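    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    if predicted.shape != actual.shape:
        raise ValueError("Predicted and actual arrays must have the same shape.")

    # Average of the squared residuals
    return float(np.mean((predicted - actual) ** 2))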

# Example usage
if __name__ == "__main__":
    # Sample data
    predicted_values = np.array([3.0, -0.5, 2.0, 7.0])
    actual_values = np.array([2.5, 0.0, 2.0, 8.0])
    
    # Calculate cost
    cost = cost_function(predicted_values, actual_values)
    print(f"Mean Squared Error: {cost}")

Mean Squared Error: 0.375


In [2]:
# Cost function example: Mean Absolute Error (MAE), regression
import numpy as np


def cost_function_mae(predicted, actual):
    """
    Calculate the mean absolute error between predicted and actual values.

    Parameters:
    predicted (np.ndarray): Predicted values.
    actual (np.ndarray): Actual values.

    Returns:
    float: Mean absolute error.
    """
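    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    if predicted.shape != actual.shape:
        raise ValueError("Predicted and actual arrays must have the same shape.")

    # Average of the absolute residuals
    return float(np.mean(np.abs(predicted - actual)))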

# Example usage for MAE
if __name__ == "__main__":
    # Sample data
    predicted_values = np.array([3.0, -0.5, 2.0, 7.0])
    actual_values = np.array([2.5, 0.0, 2.0, 8.0])
    
    # Calculate cost
    cost_mae = cost_function_mae(predicted_values, actual_values)
    print(f"Mean Absolute Error: {cost_mae}")

Mean Absolute Error: 0.5
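
MSE and MAE gave similar values on the sample data above, but they react very differently to a single large error. The sketch below reuses the same four data points and then moves one prediction far from its target; the short helpers `mse` and `mae` and the perturbed value are assumptions made for the comparison.

```python
import numpy as np

def mse(predicted, actual):
    """Mean squared error."""
    return float(np.mean((predicted - actual) ** 2))

def mae(predicted, actual):
    """Mean absolute error."""
    return float(np.mean(np.abs(predicted - actual)))

actual = np.array([2.5, 0.0, 2.0, 8.0])
predicted = np.array([3.0, -0.5, 2.0, 7.0])
# Hypothetical outlier: the last prediction is now off by 10 instead of 1
predicted_outlier = np.array([3.0, -0.5, 2.0, 18.0])

for label, pred in [("original", predicted), ("with outlier", predicted_outlier)]:
    print(f"{label:>12}: MSE = {mse(pred, actual):.3f}, MAE = {mae(pred, actual):.3f}")
```

Because the error is squared, the one outlier raises the MSE from 0.375 to 25.125 (a factor of 67), while the MAE rises from 0.5 to 2.75 (a factor of 5.5). This is the usual reason to prefer MAE, or a hybrid loss, when the data contains outliers.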


In [3]:
# Cost function example: Huber Loss, regression
import numpy as np


def cost_function_huber(predicted, actual, delta=1.0):
    """
    Calculate the Huber loss between predicted and actual values.

    Parameters:
    predicted (np.ndarray): Predicted values.
    actual (np.ndarray): Actual values.
    delta (float): Error magnitude at which the loss switches from quadratic to linear.

    Returns:
    float: Huber loss.
    """
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    if predicted.shape != actual.shape:
        raise ValueError("Predicted and actual arrays must have the same shape.")
    
    error = predicted - actual
    is_small_error = np.abs(error) <= delta
    squared_loss = 0.5 * error[is_small_error] ** 2
    linear_loss = delta * (np.abs(error[~is_small_error]) - 0.5 * delta)
    
    huber_loss = np.mean(np.concatenate((squared_loss, linear_loss)))
    return huber_loss

# Example usage for Huber Loss
if __name__ == "__main__":
    # Sample data
    predicted_values = np.array([3.0, -0.5, 2.0, 7.0])
    actual_values = np.array([2.5, 0.0, 2.0, 8.0])
    
    # Calculate cost
    cost_huber = cost_function_huber(predicted_values, actual_values)
    print(f"Huber Loss: {cost_huber}")

Huber Loss: 0.1875
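
In the sample data above every error has magnitude at most 1, so with `delta=1.0` only the quadratic branch of the Huber loss is exercised. The sketch below evaluates the loss per sample on errors on both sides of the threshold; `huber_per_sample` and the chosen error values are assumptions for illustration.

```python
import numpy as np

def huber_per_sample(error, delta=1.0):
    """Per-sample Huber loss: quadratic for |error| <= delta, linear beyond it."""
    error = np.asarray(error, dtype=float)
    abs_error = np.abs(error)
    quadratic = 0.5 * error ** 2
    linear = delta * (abs_error - 0.5 * delta)
    return np.where(abs_error <= delta, quadratic, linear)

errors = np.array([0.5, 1.0, 3.0])
print("Huber:        ", huber_per_sample(errors).tolist())  # [0.125, 0.5, 2.5]
print("Half squared: ", (0.5 * errors ** 2).tolist())       # [0.125, 0.5, 4.5]
```

The two losses agree up to the threshold, and the branches meet at `|error| == delta`, so the loss stays continuous. Beyond the threshold the Huber loss grows linearly (2.5 instead of 4.5 for an error of 3), which limits the influence of large residuals.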


In [4]:
# Cost function example: Binary Cross-Entropy, binary classification
import numpy as np


def cost_function_binary_cross_entropy(predicted, actual):
    """
    Calculate the binary cross-entropy loss between predicted and actual values.

    Parameters:
    predicted (np.ndarray): Predicted probabilities (between 0 and 1).
    actual (np.ndarray): Actual binary labels (0 or 1).

    Returns:
    float: Binary cross-entropy loss.
    """
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    if predicted.shape != actual.shape:
        raise ValueError("Predicted and actual arrays must have the same shape.")
    
    # Clip predicted values to avoid log(0)
    predicted = np.clip(predicted, 1e-15, 1 - 1e-15)
    
    bce = -np.mean(actual * np.log(predicted) + (1 - actual) * np.log(1 - predicted))
    return bce

# Example usage for Binary Cross-Entropy
if __name__ == "__main__":
    # Sample data
    predicted_probabilities = np.array([0.9, 0.1, 0.8, 0.6])
    actual_labels = np.array([1, 0, 1, 0])
    
    # Calculate cost
    cost_bce = cost_function_binary_cross_entropy(predicted_probabilities, actual_labels)
    print(f"Binary Cross-Entropy Loss: {cost_bce}")

Binary Cross-Entropy Loss: 0.3375388286260043
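
The clipping step above guards against `log(0)` when the model outputs probabilities. When raw scores (logits) are available instead, a common alternative is to compute the same loss directly from them in a rearranged form that avoids overflow. The following is a minimal sketch of that idea, not the notebook's method; the name `bce_from_logits` is hypothetical, and the logits are derived from the probabilities used above purely so the results can be compared.

```python
import numpy as np

def bce_from_logits(logits, actual):
    """Binary cross-entropy computed from raw scores (logits) instead of probabilities."""
    logits = np.asarray(logits, dtype=float)
    actual = np.asarray(actual, dtype=float)
    # Algebraically equal to -[y*log(sigmoid(z)) + (1-y)*log(1-sigmoid(z))],
    # rearranged so that exp() never receives a large positive argument.
    per_sample = (np.maximum(logits, 0.0)
                  - logits * actual
                  + np.log1p(np.exp(-np.abs(logits))))
    return float(np.mean(per_sample))

probabilities = np.array([0.9, 0.1, 0.8, 0.6])
labels = np.array([1, 0, 1, 0])
# Inverse of the sigmoid: recovers the logits that produce these probabilities
logits = np.log(probabilities / (1.0 - probabilities))

print(f"BCE from logits: {bce_from_logits(logits, labels)}")
print(f"Very confident, wrong: {bce_from_logits(np.array([1000.0]), np.array([0]))}")
```

On the sample data this should agree with the probability-based result above to floating-point precision. For the extreme logit the loss stays finite and keeps growing with the size of the mistake, whereas clipping at `1e-15` caps the per-sample loss at roughly 34.5.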


In [5]:
# Cost function example: Categorical Cross-Entropy, multi-class classification
import numpy as np


def cost_function_categorical_cross_entropy(predicted, actual):
    """
    Calculate the categorical cross-entropy loss between predicted and actual values.

    Parameters:
    predicted (np.ndarray): Predicted class probabilities, shape (n_samples, n_classes).
    actual (np.ndarray): One-hot encoded class labels, shape (n_samples, n_classes).

    Returns:
    float: Categorical cross-entropy loss.
    """
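    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    # Compare full shapes: len() would only check the number of samples, not classes
    if predicted.shape != actual.shape:
        raise ValueError("Predicted and actual arrays must have the same shape.")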
    
    # Clip predicted values to avoid log(0)
    predicted = np.clip(predicted, 1e-15, 1 - 1e-15)
    
    cce = -np.mean(np.sum(actual * np.log(predicted), axis=1))
    return cce

# Example usage for Categorical Cross-Entropy
if __name__ == "__main__":
    # Sample data
    predicted_probabilities = np.array([[0.7, 0.2, 0.1],
                                         [0.1, 0.6, 0.3],
                                         [0.2, 0.3, 0.5]])
    actual_labels = np.array([[1, 0, 0],
                              [0, 1, 0],
                              [0, 0, 1]])
    
    # Calculate cost
    cost_cce = cost_function_categorical_cross_entropy(predicted_probabilities, actual_labels)
    print(f"Categorical Cross-Entropy Loss: {cost_cce}")

Categorical Cross-Entropy Loss: 0.5202159160882228
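
With one-hot labels, the inner sum in the categorical cross-entropy above keeps only the log-probability of the true class for each sample. When labels are stored as integer class indices instead, the same value can be obtained by indexing directly. The sketch below assumes that label format; `cce_from_class_indices` is a hypothetical helper, not part of the original notebook.

```python
import numpy as np

def cce_from_class_indices(predicted, labels, eps=1e-15):
    """Categorical cross-entropy where labels are integer class indices."""
    predicted = np.clip(np.asarray(predicted, dtype=float), eps, 1 - eps)
    labels = np.asarray(labels, dtype=int)
    rows = np.arange(predicted.shape[0])
    # Pick the predicted probability of the true class in every row
    true_class_prob = predicted[rows, labels]
    return float(-np.mean(np.log(true_class_prob)))

predicted_probabilities = np.array([[0.7, 0.2, 0.1],
                                    [0.1, 0.6, 0.3],
                                    [0.2, 0.3, 0.5]])
class_indices = np.array([0, 1, 2])  # same labels as the one-hot matrix above

print(f"Categorical Cross-Entropy Loss: {cce_from_class_indices(predicted_probabilities, class_indices)}")
```

The result should match the one-hot version to floating-point precision, since both average `-log(0.7)`, `-log(0.6)`, and `-log(0.5)`.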

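The summary table lists Hinge Loss for SVMs, but none of the cells above implements it. A minimal sketch in the same style follows. It assumes labels encoded as -1 or +1 and raw decision scores rather than probabilities; the function name `cost_function_hinge` and the sample values are illustrative.

```python
import numpy as np

def cost_function_hinge(scores, labels):
    """
    Calculate the mean hinge loss.

    Parameters:
    scores (np.ndarray): Raw decision scores (not probabilities).
    labels (np.ndarray): Actual labels encoded as -1 or +1.

    Returns:
    float: Mean hinge loss.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)
    if scores.shape != labels.shape:
        raise ValueError("Scores and labels must have the same shape.")

    # Zero loss once the margin labels * scores reaches 1; linear penalty below that
    return float(np.mean(np.maximum(0.0, 1.0 - labels * scores)))

# Example usage for Hinge Loss
if __name__ == "__main__":
    scores = np.array([2.0, -0.5, 0.3, -1.5])
    labels = np.array([1, -1, -1, 1])

    cost_hinge = cost_function_hinge(scores, labels)
    print(f"Hinge Loss: {cost_hinge:.3f}")
```

Here the first sample is classified correctly with a margin above 1 and contributes nothing, the second is correct but inside the margin (loss 0.5), and the last two are on the wrong side (losses 1.3 and 2.5), giving a mean of 1.075.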