## Huber Loss

**Huber Loss**, also known as the *Huber function* or *Huber penalty*, is a loss function that combines the properties of Mean Squared Error (MSE) and Mean Absolute Error (MAE). It is <u>often used in regression problems where outliers may be present in the data</u>. Huber Loss provides a compromise between the robustness of **MAE** and the differentiability of **MSE**.

***
Huber Loss was introduced by **P.J. Huber** in 1964 as a way to overcome the drawbacks of using either **MSE** or **MAE** alone.
***

The formula for Huber Loss is as follows:

$$L_{\delta}(y, \hat y) = \begin{cases} 
\frac{1}{2}(y - \hat y)^2 & \text{if } |y - \hat y| \leq \delta \\
\delta \cdot |y - \hat y| - \frac{1}{2}\delta^2 & \text{otherwise}
\end{cases}$$

In this formula, $y$ represents the true target value, $\hat y$ represents the predicted value, and <u>$\delta$ is a parameter that determines the threshold for switching from quadratic to linear loss</u>.

- When the absolute difference between the true and predicted values, $|y - \hat y|$, is less than or equal to $\delta$, Huber Loss uses the squared difference $\frac{1}{2}(y - \hat y)^2$ similar to MSE. This ensures that <u>small errors are penalized quadratically</u>, providing smooth and differentiable behavior.

- When the absolute difference is greater than $\delta$, Huber Loss uses the linear difference $\delta|y - \hat y| - \frac{1}{2}\delta^2$ similar to MAE. This linear behavior <u>reduces the influence of outliers</u> by penalizing them linearly instead of quadratically.

- The choice of the parameter $\delta$ determines the transition point between quadratic and linear behavior. Smaller values of $\delta$ make the loss more robust to outliers, but they also shrink the quadratic region, so moderately large errors that are not outliers are penalized only linearly.
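
To make the two regimes concrete, here is a minimal sketch that evaluates the per-sample loss for a few residuals. The helper name `huber` and the residual values are illustrative assumptions, not part of the original notebook.

```python
import numpy as np

def huber(residual, delta):
    """Per-sample Huber loss for a residual r = y - y_hat (hypothetical helper)."""
    abs_r = np.abs(residual)
    # Quadratic branch inside the band |r| <= delta, linear branch outside it
    return np.where(abs_r <= delta, 0.5 * residual**2, delta * abs_r - 0.5 * delta**2)

# Illustrative residuals; the last one plays the role of an outlier
residuals = np.array([0.5, 1.0, 3.0, 10.0])

print(huber(residuals, delta=1.0).tolist())   # [0.125, 0.5, 2.5, 9.5]
print((0.5 * residuals**2).tolist())          # [0.125, 0.5, 4.5, 50.0]  (pure quadratic penalty)

# A smaller delta shrinks the penalty on the outlier even further
print(huber(np.array([10.0]), delta=0.5).tolist())  # [4.875]
```

With $\delta = 1$, the residual of 10 contributes 9.5 instead of the 50 it would contribute under a purely quadratic penalty, while the small residuals are penalized identically in both cases.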

### Additional insights on Huber Loss

1. **Applications**: Huber Loss is <span style="font-size: 11pt; color: green; font-weight: normal">commonly used in regression problems</span>, especially when the data may contain outliers or noise that can significantly affect the model's performance.
    * While Huber Loss reduces the impact of outliers, it <span style="font-size: 11pt; color: orange; font-weight: normal">does not completely eliminate their influence</span>, which may still affect the model's performance.  

2. **Computer vision**: In computer vision tasks such as object detection or image segmentation, Huber Loss can be used to handle noisy or outlier annotations.

3. **Optimization**: Huber Loss is <span style="font-size: 11pt; color: green; font-weight: normal">convex and continuous</span>, making it well-suited for optimization.

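4. **Choice of Delta**: When $\delta$ is set to a very large value (approaching infinity), every residual falls in the quadratic branch and Huber Loss reduces to $\frac{1}{2}(y - \hat y)^2$, which is **MSE** up to a factor of $\frac{1}{2}$. Conversely, as $\delta$ approaches zero, the loss approaches $\delta|y - \hat y|$, which is **MAE** scaled by $\delta$; at exactly $\delta = 0$ the loss is identically zero.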
5. **Differentiability**: Huber Loss <span style="font-size: 11pt; color: green; font-weight: normal">is differentiable everywhere</span>, including at the transition point $|y - \hat y| = \delta$, where the derivative changes continuously from the linear slope of **MSE** to the constant slope of **MAE** (of magnitude $\delta$).

6. **Domain expertise**: The choice of the threshold parameter $\delta$ can be challenging. <span style="font-size: 11pt; color: orange; font-weight: normal">Selecting an appropriate value requires domain knowledge or experimentation</span>.
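
The points about the choice of $\delta$ and differentiability can be checked numerically. The sketch below is an illustration under stated assumptions: the helpers `huber` and `huber_grad` and the sample residuals are hypothetical, and the derivative is taken with respect to the residual $y - \hat y$.

```python
import numpy as np

def huber(residual, delta):
    """Per-sample Huber loss for a residual r = y - y_hat (hypothetical helper)."""
    abs_r = np.abs(residual)
    return np.where(abs_r <= delta, 0.5 * residual**2, delta * abs_r - 0.5 * delta**2)

def huber_grad(residual, delta):
    """Derivative with respect to the residual: r inside the band, +/- delta outside."""
    return np.clip(residual, -delta, delta)

r = np.array([-30.0, -2.0, 0.5, 2.0, 30.0])  # illustrative residuals

# The derivative is continuous at |r| = delta: both branches give delta there
print(huber_grad(np.array([1.999, 2.0, 2.001]), delta=2.0).tolist())  # [1.999, 2.0, 2.0]

# Very large delta: the loss equals half the squared error
print(np.allclose(huber(r, delta=1e6), 0.5 * r**2))  # True

# Very small delta: the loss divided by delta approaches the absolute error
small = 1e-6
print(np.allclose(huber(r, small) / small, np.abs(r)))  # True
```

Because the derivative is capped at $\pm\delta$, a single extreme residual cannot dominate a gradient step, which is one way to read the reduced influence of outliers.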


**Below, we compute Huber Loss using the NumPy library.**

### Importing libraries and preparing data

In [1]:
import numpy as np

In [2]:
y_true = [100, 100, 100, 100, 100, 100, 100, 100, 100, 100]
y_pred = [80, 100, 90, 95, 105, 101, 110, 99, 87, 100]

### Compute Huber Loss with Numpy

In [3]:
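# Create a custom function to compute Huber Loss
def huber_loss(y_true, y_pred, delta):
    # Use a float array so fractional losses such as 12.5 are not truncated to integers
    loss = np.zeros(len(y_true), dtype=float)

    for idx in range(len(loss)):
        residual = y_true[idx] - y_pred[idx]
        if np.abs(residual) <= delta:
            # Quadratic branch for small residuals
            loss[idx] = 0.5 * (residual ** 2)
        else:
            # Linear branch for large residuals: delta * |r| - 0.5 * delta^2
            loss[idx] = delta * (np.abs(residual) - 0.5 * delta)

    # Average the per-sample losses
    return np.mean(loss)

# Set delta parameter
delta = 20

# Store the result under a new name so the function is not overwritten
loss_value = huber_loss(y_true, y_pred, delta)

print('Huber Loss:', loss_value)

Huber Loss: 41.05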
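
The same computation can be written without an explicit loop. The following is a minimal vectorized sketch, assuming the same data and the same formula; the function name `huber_loss_vectorized` is hypothetical. It casts the inputs to floats so that fractional per-sample losses such as 12.5 are kept exactly.

```python
import numpy as np

def huber_loss_vectorized(y_true, y_pred, delta):
    """Mean Huber loss computed with array operations (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual = y_true - y_pred
    abs_residual = np.abs(residual)
    # Quadratic where |r| <= delta, linear elsewhere
    per_sample = np.where(
        abs_residual <= delta,
        0.5 * residual**2,
        delta * abs_residual - 0.5 * delta**2,
    )
    return per_sample.mean()

y_true = [100, 100, 100, 100, 100, 100, 100, 100, 100, 100]
y_pred = [80, 100, 90, 95, 105, 101, 110, 99, 87, 100]

print(huber_loss_vectorized(y_true, y_pred, delta=20))  # 41.05
print(huber_loss_vectorized(y_true, y_pred, delta=5))   # 24.1
```

With `delta=20` every residual lies in the quadratic branch, so the result is simply half the mean squared error. Lowering `delta` to 5 moves the four largest residuals (20, 10, 10 and 13) into the linear branch, which reduces the mean loss.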
