In [2]:
run load_libs.py

Poisson loss is commonly used in [Poisson
regression](https://en.wikipedia.org/wiki/Poisson_regression#:~:text=Poisson%20regression%20assumes%20the%20response,used%20to%20model%20contingency%20tables.).

Definition:


\begin{align}
\mathbb{E}[L] = \iint \left( y(\mathbf{x}) - t \ln y(\mathbf{x}) \right) p(\mathbf{x}, t) d\mathbf{x} dt
\end{align}
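
One way to see where this form comes from, as a sketch under the assumption that $t$ is a count drawn from a Poisson distribution with rate $y(\mathbf{x})$ (the definition above does not itself require integer $t$): the negative log-likelihood of a single observation is

\begin{align}
-\ln p(t | \mathbf{x})
= -\ln \frac{y(\mathbf{x})^{t} e^{-y(\mathbf{x})}}{t!}
= y(\mathbf{x}) - t \ln y(\mathbf{x}) + \ln t!
\end{align}

The last term does not depend on $y(\mathbf{x})$, so minimizing the negative log-likelihood over $y(\mathbf{x})$ is equivalent to minimizing the loss above.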

To minimize the expected loss, set the derivative with respect to $y(\mathbf{x})$ to zero:


\begin{align}
\frac{\partial \mathbb{E}[L]}{\partial y(\mathbf{x})}
= \int \left( 1 - \frac{t}{y(\mathbf{x})} \right) p(\mathbf{x}, t) dt
&= 0 \\
\int \left( y(\mathbf{x}) - t \right) p(\mathbf{x}, t) dt
&= 0 \\
y(\mathbf{x}) p(\mathbf{x})
&= \int t \, p(\mathbf{x}, t) dt \\
y(\mathbf{x})
&= \int t \, p(t|\mathbf{x}) dt = \mathbb{E}[t|\mathbf{x}]
\end{align}

So when minimizing the Poisson loss, the model is also trying to predict the
conditional mean of $t$ given $\mathbf{x}$. But unlike MSE or log loss, the
prediction must be positive, $y(\mathbf{x}) \in (0, \infty)$, and the target is
nonnegative, $t \in [0, \infty)$.
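
The conditional-mean result can be checked numerically. The snippet below is a minimal sketch, not part of the original notebook: the Poisson rate of 4.0, the sample size, and the candidate grid are arbitrary illustrative choices. It draws targets for a single fixed $\mathbf{x}$ and searches for the constant prediction with the lowest average Poisson loss.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical targets for one fixed x: counts with true mean 4.0 (assumed value).
t = rng.poisson(lam=4.0, size=10_000)

# Candidate constant predictions on a grid with step 0.005.
candidates = np.linspace(0.5, 10.0, 1901)

# Empirical average of y - t * ln(y) for each candidate y.
mean_loss = np.array([np.mean(y - t * np.log(y)) for y in candidates])

best_y = candidates[np.argmin(mean_loss)]
print("argmin of mean Poisson loss:", best_y)
print("sample mean of t:           ", t.mean())
```

Up to the grid resolution, the two printed values should agree, which is the empirical counterpart of $y(\mathbf{x}) = \mathbb{E}[t|\mathbf{x}]$.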

# Visualization

In [3]:
def poisson_loss(y, t):
    """Pointwise Poisson loss for a prediction y > 0 and a target t >= 0."""
    return y - t * np.log(y)

In [62]:
ys = np.concatenate(
    [
        # sample more points close to 0
        np.arange(1e-6, 1, 0.01),
        np.arange(1, 100, 1),
    ]
)


dfs = []
for target in [0.1, 0.5, 1, 2, 5, 10, 20, 30]:
    _df = pd.DataFrame(
        {
            "ys": ys,
            "poisson_loss": poisson_loss(ys, target),
        }
    ).assign(target=target, delta=ys - target)
    dfs.append(_df)

df_plot = pd.concat(dfs)

In [63]:
alt.Chart(df_plot, height=120, width=150).mark_line().encode(
    x=alt.X("delta:Q", title="Δ = y(x) - t"),
    y="poisson_loss",
).facet(facet="target", columns=4)

Note that for any target $t > 0$, the loss goes to $\infty$ as $y(\mathbf{x}) \rightarrow 0$, because of the $-t \ln y(\mathbf{x})$ term.
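
The behaviour near zero can also be read off a few numbers. This is a small self-contained sketch that redefines `poisson_loss` so it runs on its own; the targets and the probe points `small_ys` are illustrative choices, not values from the plots above.

```python
import numpy as np


def poisson_loss(y, t):
    return y - t * np.log(y)


# Predictions approaching zero from above (illustrative probe points).
small_ys = np.array([1e-6, 1e-3, 1e-1])

for t in (0.5, 2.0, 10.0):
    near_zero = poisson_loss(small_ys, t)  # grows as y shrinks
    at_target = poisson_loss(t, t)         # loss at y = t, the minimizer
    print(f"t={t}: near zero {near_zero}, at y=t {at_target}")

# With t = 0 the log term vanishes, so the loss is just y and stays small.
print("t=0:", poisson_loss(small_ys, 0.0))
```

For each positive target the loss keeps growing as the prediction shrinks, while for $t = 0$ the loss reduces to $y(\mathbf{x})$ and stays bounded near zero.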