In [1]:
%matplotlib inline


# Generalized Linear Model


## Poisson Regression
Poisson regression models a response variable that takes the form of counts,
for example the number of car accidents or the number of customers in line at a reception desk.
The response variable is assumed to follow a Poisson distribution.

The general mathematical equation for Poisson regression is

\begin{align}\log(E(y)) = \beta_0 + \beta_1 X_1+\beta_2 X_2+\dots+\beta_p X_p.\end{align}
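
To make the log link concrete, here is a minimal sketch that evaluates the expected count for a single observation. The coefficient and feature values (`beta0`, `beta`, `x`) are hypothetical and chosen only for illustration; they are not taken from the example further down.

```python
import numpy as np

# Hypothetical intercept, coefficients and one observation (illustration only).
beta0 = 0.5
beta = np.array([0.2, -0.1])
x = np.array([1.0, 3.0])

# Linear predictor: beta_0 + beta_1 * x_1 + beta_2 * x_2
eta = beta0 + x @ beta

# The log link states log(E(y)) = eta, so the expected count is exp(eta),
# which is positive for any value of the linear predictor.
expected_count = np.exp(eta)
print("linear predictor:", eta)
print("expected count:  ", expected_count)
```

Because the mean is obtained by exponentiating the linear predictor, a unit increase in $X_j$ multiplies the expected count by $\exp(\beta_j)$.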

Given $n$ independent observations $(x_i, y_i)$ of the explanatory variables $x$ and the response variable $y$, we can estimate $\beta$ by minimizing the negative log-likelihood function under a sparsity constraint:
$$
\arg \min _{\beta \in \mathbb{R}^p} L(\beta):=-\frac{1}{n} \sum_{i=1}^n\left\{y_i x_i^T \beta-\exp \left(x_i^T \beta\right)-\log \left(y_i!\right)\right\}, \text { s.t. }\|\beta\|_0 \leq s .
$$
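
The factorial term in the objective does not depend on $\beta$, so dropping it shifts the loss by a constant without changing the minimizer. The sketch below checks this numerically on simulated data. The helper names (`poisson_nll`, `poisson_nll_no_const`) and the demo arrays are assumptions introduced here for illustration, not part of any library.

```python
import math
import numpy as np

def poisson_nll(beta, X, y):
    """Average negative Poisson log-likelihood, including the log(y_i!) term."""
    eta = X @ beta
    # log(y_i!) computed as lgamma(y_i + 1)
    log_factorial = np.array([math.lgamma(v + 1.0) for v in y])
    return -np.mean(y * eta - np.exp(eta) - log_factorial)

def poisson_nll_no_const(beta, X, y):
    """Same loss with the beta-independent log(y_i!) term dropped."""
    eta = X @ beta
    return np.mean(np.exp(eta) - y * eta)

# Simulated demo data (hypothetical values, illustration only).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 4))
beta_a = np.array([0.3, 0.0, -0.2, 0.0])
beta_b = np.array([0.0, 0.1, 0.0, 0.4])
y_demo = rng.poisson(np.exp(X_demo @ beta_a))

# The gap between the two losses is the same for any beta.
gap_a = poisson_nll(beta_a, X_demo, y_demo) - poisson_nll_no_const(beta_a, X_demo, y_demo)
gap_b = poisson_nll(beta_b, X_demo, y_demo) - poisson_nll_no_const(beta_b, X_demo, y_demo)
print("gap at beta_a:", gap_a)
print("gap at beta_b:", gap_b)
```

Both gaps equal the average of $\log(y_i!)$ over the sample, which is why an implementation can safely omit that term when only the minimizer is of interest.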

The following Python code solves the sparse Poisson regression problem:

In [3]:
import numpy as np
from abess.datasets import make_glm_data
import jax.numpy as jnp
from scope import ScopeSolver
np.random.seed(1)

n = 100
p = 10
s = 3
data = make_glm_data(n=n, p=p, k=s, family="poisson")
X = data.x
y = data.y
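# Negative log-likelihood of Poisson regression, averaged over the samples
def poisson_loss(params):
    # Clip the linear predictor so that exp() does not overflow
    xbeta = jnp.clip(X @ params, -30, 30)
    # The log(y_i!) term is omitted because it does not depend on params
    return jnp.mean(jnp.exp(xbeta) - y * xbeta)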


solver = ScopeSolver(p, s)
solver.solve(poisson_loss, jit=True)

print("True support set: ", np.nonzero(data.coef_)[0])
print("True parameters: ", data.coef_)
print("True loss value: ", poisson_loss(data.coef_))
print("Estimated support set: ", np.sort(solver.support_set))
print("Estimated parameters: ", solver.params)
print("Estimated loss value: ", poisson_loss(solver.params))

True support set:  [0 5 9]
True parameters:  [4.70030694 0.         0.         0.         0.         8.30570366
 0.         0.         0.         3.78436768]
True loss value:  0.5956122
Estimated support set:  [3 5 9]
Estimated parameters:  [ 0.          0.          0.         -3.99304304  0.          8.65190727
  0.          0.          0.          4.95582619]
Estimated loss value:  0.5782885
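
In the output above, the estimated support set differs from the true one in a single index, while the estimated loss value is slightly lower than the loss at the true parameters. A minimal sketch for quantifying such a comparison is shown below; the two arrays are copied from the printed output, and the variable names are chosen here for illustration.

```python
import numpy as np

# Support sets copied from the printed output above.
true_support = np.array([0, 5, 9])
estimated_support = np.array([3, 5, 9])

# Indices found by the solver that are truly nonzero.
recovered = np.intersect1d(true_support, estimated_support)
# True indices that the solver missed.
missed = np.setdiff1d(true_support, estimated_support)
# Indices selected by the solver that are truly zero.
spurious = np.setdiff1d(estimated_support, true_support)

print("Recovered indices:", recovered)
print("Missed indices:   ", missed)
print("Spurious indices: ", spurious)
```

With `solver` and `data` still in scope, the same comparison could be run on `solver.support_set` and `np.nonzero(data.coef_)[0]` instead of the hard-coded arrays.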
