# Metrics and Loss Functions from First Principles

This notebook reimplements common metrics and loss functions from scratch, as practice.

<br>
<br>

In [1]:
import numpy as np

<br>

# Mean Squared Error
AKA **L2** Loss

Notes:
1. Squaring the errors exaggerates outliers (nice for interpretation as a metric, but...)
2. On its own, it can be a poor loss function for noisy data, because the model gets adjusted to fit the exaggerated noise.
3. Interpret it by its shape: when the errors are large, MSE reports them as *very* large.
4. Take the square root (RMSE) to return to the units of the target, which aids interpretability and tempers the exaggeration of noise/outliers.

<br>

$$
\large\begin{equation}
MSE = \frac{1}{n}\sum_{i=1}^{n}(y_i - \widehat{y_i})^2  \\\\
\end{equation}
$$

<br>

$$

\small\begin{aligned}
Where, \\
\widehat{y_i} &- i^{th}\, predicted\, output \\
y_i &- i^{th}\, actual\, output \\
n &- number\, of\, observations\, in\, sample\, or\, batch \\
\end{aligned}
$$

In [2]:
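def mse(list1, list2):
    """Mean squared error between actual (list1) and predicted (list2) values."""
    # Inputs must be non-empty and the same length
    assert len(list1) == len(list2)
    assert len(list1) != 0

    n = len(list1)
    # Squared error for each (actual, predicted) pair
    inner = [(yi - yhati)**2 for yi, yhati in zip(list1, list2)]
    return sum(inner) / n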
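
The notes on outliers and on taking the square root can be checked numerically. The block below is a minimal sketch, not part of the original notebook: `mse_sketch` restates the function so the block runs on its own, `rmse` is a hypothetical helper, and the data is made up.

```python
import math

def mse_sketch(y_true, y_pred):
    """Mean squared error, restated here so the block runs on its own."""
    return sum((yi - yhati) ** 2 for yi, yhati in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Hypothetical helper: square root of the MSE, in the units of y."""
    return math.sqrt(mse_sketch(y_true, y_pred))

# Made-up data: four predictions are off by 1, one is off by 10.
y_true = [0.0, 0.0, 0.0, 0.0, 0.0]
y_pred = [1.0, 1.0, 1.0, 1.0, 10.0]

print(mse_sketch(y_true, y_pred))  # (1 + 1 + 1 + 1 + 100) / 5 = 20.8
print(rmse(y_true, y_pred))        # sqrt(20.8), roughly 4.56
```

The single large error contributes 100 of the 104 total squared error, which is the exaggeration described in the notes. The square root brings the value back to the scale of the data, although the outlier still pulls it well above the typical error of 1.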
    

<br>
<br>

# Mean Absolute Error
AKA **L1** Loss

$$
\large\begin{equation}
MAE = \frac{1}{n}\sum_{i=1}^{n}\mid y_i - \widehat{y_i}\mid  \\\\
\end{equation}
$$

<br>

$$

\small\begin{aligned}
Where, \\
\widehat{y_i} &- i^{th}\, predicted\, output \\
y_i &- i^{th}\, actual\, output \\
n &- number\, of\, observations\, in\, sample\, or\, batch \\
\end{aligned}
$$

In [3]:
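def mae(list1, list2):
    """Mean absolute error between actual (list1) and predicted (list2) values."""
    # Inputs must be non-empty and the same length
    assert len(list1) == len(list2)
    assert len(list1) != 0

    n = len(list1)
    # Absolute error for each (actual, predicted) pair
    inner = [np.abs(yi - yhati) for yi, yhati in zip(list1, list2)]
    return sum(inner) / n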
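
The same formula can also be written without a Python loop. This is a sketch of one possible vectorized form, assuming the inputs convert cleanly to NumPy arrays; the name `mae_vectorized` and the `ValueError` check are illustrative choices, not part of the original notebook.

```python
import numpy as np

def mae_vectorized(y_true, y_pred):
    """Hypothetical vectorized MAE: mean of element-wise absolute errors."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_true.shape != y_pred.shape or y_true.size == 0:
        raise ValueError("inputs must be non-empty and have the same shape")
    return float(np.mean(np.abs(y_true - y_pred)))

# Made-up data: four errors of 1 and one error of 10.
print(mae_vectorized([0, 0, 0, 0, 0], [1, 1, 1, 1, 10]))  # (1 + 1 + 1 + 1 + 10) / 5 = 2.8
```

Because the errors are not squared, the error of 10 counts only ten times as much as an error of 1, so the result stays much closer to the typical error.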

<br>
<br>

# Huber Loss / Metric
AKA the **Smooth L1** Loss

Huber loss combines the L1 and L2 loss functions. When the absolute error is at most $\delta$, it is quadratic like the MSE; for larger errors it switches to a linear penalty like the MAE. This keeps the smooth behavior of L2 near the minimum and the outlier robustness of L1 far from it, giving something close to the best of both worlds.


$$
\large\begin{equation}
Huber = L_\delta(y, \widehat{y}) = \begin{cases}
\:\frac{1}{2}(y - \widehat{y})^2 \quad\quad\quad\quad &for \;\lvert y-\widehat{y}\rvert \,\le\, \delta \\
\:\delta\lvert y - \widehat{y}\rvert - \frac{1}{2}\delta^2 \quad\quad &for \;\lvert y-\widehat{y}\rvert \,\gt\, \delta
\end{cases}
\end{equation}
$$

<br>

$$

\small\begin{aligned}
Where, \\
\widehat{y} &- predicted\, output \\
y &- actual\, output \\
\delta &- threshold\, between\, the\, quadratic\, and\, linear\, regions \\
n &- number\, of\, observations\, in\, sample\, or\, batch \\
\end{aligned}
$$
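
The notebook gives no implementation for Huber, so here is a minimal sketch of the piecewise formula above. Averaging the per-observation losses over the batch and the default `delta=1.0` are both assumptions, chosen to mirror the `mse` and `mae` functions; the name `huber` is likewise illustrative.

```python
import numpy as np

def huber(y_true, y_pred, delta=1.0):
    """Mean Huber loss; `delta=1.0` is an assumed default, not from the text."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    assert y_true.shape == y_pred.shape
    assert y_true.size != 0

    abs_err = np.abs(y_true - y_pred)
    quadratic = 0.5 * abs_err ** 2               # used where |error| <= delta
    linear = delta * abs_err - 0.5 * delta ** 2  # used where |error| > delta
    # Pick the branch per observation, then average over the batch (assumption)
    return float(np.mean(np.where(abs_err <= delta, quadratic, linear)))

# Made-up data: one small error (0.5) and one large error (3.0).
print(huber([0.0, 0.0], [0.5, 3.0], delta=1.0))  # (0.125 + 2.5) / 2 = 1.3125
```

The two branches agree at an error of exactly `delta` (both give `0.5 * delta**2`), so the loss is continuous at the switch point.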

<br>
<br>
<br>

# Unit Tests

In [4]:

from numpy.random import default_rng
from sklearn.metrics import mean_squared_error, mean_absolute_error
import unittest

class TestEntireNotebook(unittest.TestCase):
    
    def test_mse(self):
        rng = default_rng()

        # Compare against scikit-learn on 100 freshly drawn random samples
        for i in range(100):
            list1 = rng.standard_normal(1000)
            list2 = rng.standard_normal(1000)
            self.assertAlmostEqual(
                mse(list1, list2),
                mean_squared_error(list1, list2),
                places=5
            )


    def test_mae(self):
        rng = default_rng()

        # Compare against scikit-learn on 100 freshly drawn random samples
        for i in range(100):
            list1 = rng.standard_normal(1000)
            list2 = rng.standard_normal(1000)
            self.assertAlmostEqual(
                mae(list1, list2),
                mean_absolute_error(list1, list2),
                places=5
            )
    
# Run the unit tests
unittest.main(argv=[''], verbosity=2, exit=False)

test_mae (__main__.TestEntireNotebook) ... ok
test_mse (__main__.TestEntireNotebook) ... ok

----------------------------------------------------------------------
Ran 2 tests in 0.158s

OK

