# LOSS FUNCTIONS

Loss functions are critical for training models because they quantify how well the model's predictions match the true labels (the ground truth). By minimizing the loss, the model learns to make more accurate predictions.

---

A loss function (also called a cost function, and often used interchangeably with objective function) is a mathematical function that measures the difference between the model's predicted output and the actual target (the ground truth). The goal of training a machine learning model is to minimize the loss, so that the model's predictions move closer to the true values.

The loss function can be thought of as a measure of error or discrepancy, guiding the learning process.
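To make "minimizing the loss" concrete, here is a minimal sketch of gradient descent on a one-parameter linear model with a squared-error loss. The toy data, the learning rate, and the step count are illustrative assumptions, not values taken from this notebook.

```python
import numpy as np

# Hypothetical toy data: the targets follow y = 3x exactly.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 3.0 * x

def loss(w):
    """Mean squared error of the model y_hat = w * x."""
    return np.mean((w * x - y) ** 2)

def grad(w):
    """Derivative of the loss with respect to w."""
    return np.mean(2.0 * (w * x - y) * x)

w = 0.0                # initial guess for the parameter
learning_rate = 0.05   # assumed step size
history = [loss(w)]
for _ in range(100):
    w -= learning_rate * grad(w)   # step against the gradient
    history.append(loss(w))

print("learned w:", w)
print("initial loss:", history[0], "final loss:", history[-1])
```

Each step moves `w` in the direction that lowers the loss, so the recorded loss values shrink as `w` approaches 3.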

## Loss Functions for Regression

### Mean Squared Error Loss 

![image.png](attachment:image.png)
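
In symbols, consistent with the implementation in this section, the mean squared error over $m$ examples with targets $y_i$ and predictions $\hat{y}_i$ can be written as follows (the symbol names are chosen here for illustration):

$$
\mathrm{MSE} = \frac{1}{m}\sum_{i=1}^{m}\left(\hat{y}_i - y_i\right)^2
$$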

In [5]:
import numpy as np

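def meansquarederror(ytrue, ypred):
    """Return the mean squared error between targets and predictions."""
    m = ytrue.shape[0]  # number of examples
    print("total no of training examples is: " + str(m))
    errorvector = ypred - ytrue  # per-example residuals
    print("shape of error vector: " + str(errorvector.shape))
    # Average of the squared residuals
    error = (1 / m) * np.sum(errorvector ** 2)
    return error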

In [2]:
genearatetrainingsets = np.random.randn(100,1)

In [3]:
genearatetrainingsetspred = np.random.randn(100,1)

In [6]:
meansquarederror(genearatetrainingsets,genearatetrainingsetspred)

total no of training examples is: 100
shape of error vector: (100, 1)


2.1653730427517477

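#### Advantages

* **MSE is differentiable everywhere**
* **It is convex in the predictions, so for a linear model it has a single global minimum and no other local minima**
* **It typically converges quickly with gradient-based optimizers, because the gradient shrinks as the error shrinks**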

#### Disadvantages

* **Not robust to outliers: squaring the error lets a few large residuals dominate the loss**
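
The outlier sensitivity can be checked with a small experiment. The sketch below uses made-up data and a helper named `mse`, both of which are assumptions for illustration, and compares the loss before and after corrupting a single target.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error (illustrative helper)."""
    return np.mean((y_pred - y_true) ** 2)

y_true = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_pred = y_true + 0.5            # every prediction is off by 0.5

clean = mse(y_true, y_pred)

# Corrupt one target so that its residual becomes -19.5.
y_outlier = y_true.copy()
y_outlier[-1] = 25.0
with_outlier = mse(y_outlier, y_pred)

print("MSE without outlier:", clean)
print("MSE with one outlier:", with_outlier)
```

Here a single corrupted target raises the MSE from 0.25 to 76.25, even though the other four predictions are unchanged.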

### Mean Absolute Error Loss

![image.png](attachment:image.png)
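
In symbols, consistent with the implementation in this section, the mean absolute error over $m$ examples with targets $y_i$ and predictions $\hat{y}_i$ can be written as follows (the symbol names are chosen here for illustration):

$$
\mathrm{MAE} = \frac{1}{m}\sum_{i=1}^{m}\left|\hat{y}_i - y_i\right|
$$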

In [7]:
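def meanabsoluteerror(ytrue, ypred):
    """Return the mean absolute error between targets and predictions."""
    m = ytrue.shape[0]  # number of examples
    print("total no of training examples is: " + str(m))
    errorvector = ypred - ytrue  # per-example residuals
    print("shape of error vector: " + str(errorvector.shape))
    # Average of the absolute residuals
    error = (1 / m) * np.sum(np.abs(errorvector))
    return error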

In [8]:
meanabsoluteerror(genearatetrainingsets,genearatetrainingsetspred)

total no of training examples is: 100
shape of error vector: (100, 1)


1.186828246009862

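#### Advantages

* **Robust to outliers: each residual contributes linearly, not quadratically**

#### Disadvantages

* **Convergence usually takes longer than with MSE: the loss is not differentiable at zero error, so optimization relies on subgradients**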
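
The difference in convergence behavior shows up in the gradients. The following sketch, with hypothetical helper names `mse_grad` and `mae_subgrad` and made-up predictions, compares the MSE gradient with one valid MAE subgradient, both taken with respect to the predictions.

```python
import numpy as np

def mse_grad(y_true, y_pred):
    """Gradient of MSE with respect to the predictions."""
    return 2.0 * (y_pred - y_true) / y_true.shape[0]

def mae_subgrad(y_true, y_pred):
    """One subgradient of MAE with respect to the predictions.

    np.sign returns 0 where the error is exactly zero, which is a
    valid choice because any value in [-1, 1] is a subgradient of
    the absolute value at that point.
    """
    return np.sign(y_pred - y_true) / y_true.shape[0]

y_true = np.array([0.0, 0.0, 0.0, 0.0])
y_pred = np.array([0.01, 1.0, -5.0, 0.0])

g_mse = mse_grad(y_true, y_pred)
g_mae = mae_subgrad(y_true, y_pred)

print("MSE gradient:   ", g_mse)
print("MAE subgradient:", g_mae)
```

The MSE gradient scales with the size of the error, so steps shrink automatically near the minimum. The MAE subgradient has the same magnitude for an error of 0.01 as for an error of 5, which is why plain gradient descent on MAE tends to need a decaying step size to settle.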



### Huber Loss

![image.png](attachment:image.png)
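
Huber loss is quadratic for small errors and linear for large ones, with the threshold $\delta$ marking the switch. Written out to match the `huber_loss` implementation in this section (the symbol names are chosen here for illustration), the per-example loss is:

$$
L_\delta(y_i, \hat{y}_i) =
\begin{cases}
\frac{1}{2}\left(y_i - \hat{y}_i\right)^2 & \text{if } \left|y_i - \hat{y}_i\right| \le \delta \\
\delta\left(\left|y_i - \hat{y}_i\right| - \frac{1}{2}\delta\right) & \text{otherwise}
\end{cases}
$$

The reported value is the mean of $L_\delta$ over all $m$ examples.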

In [9]:
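def huber_loss(y_true, y_pred, delta=1.0):
    """Return the mean Huber loss; delta is the quadratic-to-linear threshold."""
    y_true = np.array(y_true)
    y_pred = np.array(y_pred)

    # Absolute residual for each example
    error = np.abs(y_true - y_pred)

    # Quadratic penalty for small errors, linear penalty for large ones
    loss = np.where(error <= delta, 0.5 * error**2, delta * (error - 0.5 * delta))

    return np.mean(loss)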

In [10]:
huber_loss(genearatetrainingsets,genearatetrainingsetspred)

0.7815482239900432
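
To see how Huber loss sits between the two previous losses, the sketch below evaluates the per-example squared, absolute, and Huber penalties on three hand-picked residuals. The helper `huber` and the residual values are assumptions for illustration.

```python
import numpy as np

def huber(residual, delta=1.0):
    """Per-example Huber penalty (illustrative helper)."""
    a = np.abs(residual)
    return np.where(a <= delta, 0.5 * a**2, delta * (a - 0.5 * delta))

residuals = np.array([0.5, 1.0, 10.0])

squared = residuals ** 2           # per-example squared error
absolute = np.abs(residuals)       # per-example absolute error
hub = huber(residuals, delta=1.0)  # per-example Huber penalty

for r, s, a, h in zip(residuals, squared, absolute, hub):
    print(f"residual={r}: squared={s}, absolute={a}, huber={h}")
```

With `delta=1.0`, the residual of 10 costs 100 under squared error, 10 under absolute error, and 9.5 under Huber, while the residual of 0.5 is still penalized quadratically (0.125). Note that the quadratic branch of Huber carries a factor of 0.5 that the MSE in this notebook does not.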

## Loss Functions for Classification

### Binary Cross-Entropy (Log Loss)

Binary cross-entropy is used for binary classification (e.g., predicting if an email is spam or not). It penalizes wrong predictions based on their confidence level. If the model is confident but wrong, the loss is high.

![image.png](attachment:image.png)
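
A minimal NumPy sketch of binary cross-entropy, assuming the usual form $-\frac{1}{m}\sum_i \left[y_i \log p_i + (1 - y_i)\log(1 - p_i)\right]$ with labels $y_i \in \{0, 1\}$ and predicted probabilities $p_i$. The function name, the clipping constant `eps`, and the sample values are illustrative choices, not part of the original notebook.

```python
import numpy as np

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy for labels in {0, 1}."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip probabilities away from 0 and 1 so the logarithm stays finite.
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    return -np.mean(
        y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob)
    )

y_true = np.array([1, 0, 1, 0])
confident_right = np.array([0.9, 0.1, 0.9, 0.1])
confident_wrong = np.array([0.1, 0.9, 0.1, 0.9])

loss_right = binary_cross_entropy(y_true, confident_right)
loss_wrong = binary_cross_entropy(y_true, confident_wrong)

print("confident and right:", loss_right)
print("confident and wrong:", loss_wrong)
```

The confident-but-wrong predictions give a loss of $-\ln(0.1)$ per example, far higher than the $-\ln(0.9)$ of the confident-and-right predictions.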

### Categorical Cross-Entropy

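Categorical cross-entropy is used for multi-class classification problems. For each example it sums, over all classes, the true class probability times the negative log of the predicted probability; with one-hot targets only the term for the true class is nonzero. Minimizing it pushes the predicted class probabilities toward the true ones.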

![image.png](attachment:image.png)
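
A minimal NumPy sketch of categorical cross-entropy for one-hot targets. It assumes each row of `y_prob` already holds class probabilities (for example, softmax outputs); the function name, `eps`, and the sample values are illustrative.

```python
import numpy as np

def categorical_cross_entropy(y_true_onehot, y_prob, eps=1e-12):
    """Mean categorical cross-entropy for one-hot targets."""
    y_true_onehot = np.asarray(y_true_onehot, dtype=float)
    # Clip so that log(0) cannot occur.
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0)
    # Sum over classes for each example, then average over examples.
    per_example = -np.sum(y_true_onehot * np.log(y_prob), axis=1)
    return np.mean(per_example)

y_true = np.array([[1, 0, 0],
                   [0, 1, 0]])
y_prob = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1]])

cce = categorical_cross_entropy(y_true, y_prob)
print("categorical cross-entropy:", cce)
```

Because the targets are one-hot, the result reduces to the average of $-\ln(0.7)$ and $-\ln(0.8)$, the negative log-probabilities assigned to the correct classes.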

### Sparse Categorical Cross-Entropy

Sparse categorical cross-entropy is a loss function commonly used for multi-class classification. It is a variation of categorical cross-entropy in which the target labels are provided as integers (i.e., class indices) rather than one-hot encoded vectors. The computed loss is the same; only the label format differs.

![image.png](attachment:image.png)
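
A minimal NumPy sketch of the sparse variant, assuming integer class indices and rows of class probabilities. The function name, `eps`, and the sample values are illustrative. It also checks that the result matches the one-hot computation on the same data.

```python
import numpy as np

def sparse_categorical_cross_entropy(labels, y_prob, eps=1e-12):
    """Mean cross-entropy when labels are integer class indices."""
    labels = np.asarray(labels, dtype=int)
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0)
    # Pick the predicted probability of the true class for each example.
    picked = y_prob[np.arange(labels.shape[0]), labels]
    return -np.mean(np.log(picked))

labels = np.array([0, 1])
y_prob = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1]])

sparse = sparse_categorical_cross_entropy(labels, y_prob)

# Same computation with explicit one-hot targets, for comparison.
onehot = np.eye(3)[labels]
dense = -np.mean(np.sum(onehot * np.log(y_prob), axis=1))

print("sparse:", sparse, "one-hot:", dense)
```

Indexing the true-class probability directly avoids building the one-hot matrix, which saves memory when the number of classes is large.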