---
title: Estimating Coefficients for Poisson GLMs from Scratch
date: 2024-09-16
description: Estimating coefficients for count response models from scratch
categories: [Statistical Modeling, Python]
highlight-style: dracula
---


Generalized Linear Models (GLMs) are a broad class of models that extend linear regression to response variables that are not normally distributed, such as binary outcomes, counts, and strictly positive measurements. GLMs assume that the response $Y$ follows a distribution from the exponential family, which includes:

- Normal distribution (for continuous data)
- Binomial distribution (for binary or proportion data)
- Poisson distribution (for count data)
- Gamma distribution (for continuous positive data)

The model defines a linear predictor $\eta$, which is a linear combination of the explanatory variables:

$$
\eta = \beta_{0} + \beta_{1}X_{1} + \beta_{2}X_{2} + \dots + \beta_{p}X_{p},
$$


where the $\beta_{i}$ are the parameters to be estimated and the $X_{i}$ are the predictors.
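As a minimal sketch of how the linear predictor is computed in practice, assume a small made-up design matrix `X` whose first column is all ones (so that the first coefficient acts as the intercept $\beta_{0}$) and an arbitrary coefficient vector `beta`. Both are illustrative values, not data used elsewhere in this post.

```python
import numpy as np

# Hypothetical data: 4 observations, 2 predictors (illustrative values only).
X_raw = np.array([
    [1.0, 2.0],
    [0.5, 1.5],
    [2.0, 0.0],
    [1.5, 1.0],
])

# Prepend a column of ones so the first coefficient plays the role of beta_0.
X = np.column_stack([np.ones(X_raw.shape[0]), X_raw])

# Hypothetical coefficients (beta_0, beta_1, beta_2).
beta = np.array([0.2, 0.5, -0.3])

# Linear predictor: eta_i = beta_0 + beta_1 * x_i1 + beta_2 * x_i2.
eta = X @ beta
print(eta)
```

Writing the intercept as a column of ones keeps the whole computation a single matrix-vector product, which is the form most fitting routines work with.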

GLMs use a link function to connect the linear predictor $\eta$ to the mean of the response variable $\mu$. 
A suitably chosen link function keeps the predicted mean within valid bounds for the target distribution; for example, the log link guarantees a positive mean for count data. For any GLM, we have:

- A random component that specifies the distribution of the data.
- A systematic component that relates explanatory variables to the response.
- A link function that connects the linear predictor to the mean of the response variable.
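To make the role of the link function concrete, the sketch below applies the inverse of two common links to a few arbitrary values of $\eta$: the log link (commonly paired with the Poisson distribution) and the logit link (commonly paired with the binomial). The function names are illustrative and do not come from any library.

```python
import numpy as np

def inverse_log_link(eta):
    """Map the linear predictor to a positive mean: mu = exp(eta)."""
    return np.exp(eta)

def inverse_logit_link(eta):
    """Map the linear predictor to a mean in (0, 1): mu = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + np.exp(-eta))

# Arbitrary linear predictor values, including negative ones.
eta = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])

mu_count = inverse_log_link(eta)   # strictly positive, valid as a Poisson mean
mu_prob = inverse_logit_link(eta)  # strictly between 0 and 1, valid as a probability

print(mu_count)
print(mu_prob)
```

Even though $\eta$ ranges over negative and positive values, both transformed means stay inside the range their distribution requires.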


GLM coefficients are typically estimated using Iteratively Reweighted Least Squares (IRLS). IRLS handles models in which the response does not follow a normal distribution and the mean is a non-linear function of the linear predictor: it repeatedly solves a weighted least squares problem, updating the weights from the current fit until the coefficients converge. In what follows, we demonstrate IRLS for count response data and compare our results against statsmodels to verify them.
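For reference, one common way to write a single IRLS step is given below. The notation is an assumption introduced here rather than taken from the text above: $\mathbf{X}$ is the design matrix (with a leading column of ones), $\mathbf{y}$ the response vector, $g$ the link function with derivative $g'$, $V(\mu)$ the variance function of the chosen distribution, and $t$ the iteration index. Given the current estimate $\boldsymbol{\beta}^{(t)}$, with $\eta_{i}^{(t)}$ the corresponding linear predictor and $\mu_{i}^{(t)} = g^{-1}(\eta_{i}^{(t)})$ the fitted mean:

$$
\begin{aligned}
z_{i}^{(t)} &= \eta_{i}^{(t)} + \left(y_{i} - \mu_{i}^{(t)}\right) g'\!\left(\mu_{i}^{(t)}\right), \\
w_{i}^{(t)} &= \frac{1}{V\!\left(\mu_{i}^{(t)}\right) \left[g'\!\left(\mu_{i}^{(t)}\right)\right]^{2}}, \\
\boldsymbol{\beta}^{(t+1)} &= \left(\mathbf{X}^{\top} \mathbf{W}^{(t)} \mathbf{X}\right)^{-1} \mathbf{X}^{\top} \mathbf{W}^{(t)} \mathbf{z}^{(t)},
\end{aligned}
$$

where $\mathbf{W}^{(t)}$ is the diagonal matrix of the weights $w_{i}^{(t)}$ and $\mathbf{z}^{(t)}$ is the vector of working responses. The step is repeated until the change in $\boldsymbol{\beta}$ falls below a chosen tolerance. With the log link and the Poisson variance function $V(\mu) = \mu$, for instance, $g'(\mu) = 1/\mu$, so the weights reduce to $w_{i} = \mu_{i}$.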


### Count Response Models






