# Generalized Linear Models

In a generalized linear model (GLM), each outcome $Y$ of the dependent variable is assumed to be generated from a particular distribution from the _exponential family_, a large class of probability distributions that includes the normal, binomial, Poisson and gamma distributions, among others. The mean, $\mu$, of the distribution depends on the independent variables, $X = (X_0, X_1, X_2, \ldots, X_p)$, through:
$$
    \mathrm{E} ( Y \mid X ) = \mu = g^{-1}(X \beta), 
$$
where $\mathrm{E} ( Y \mid X )$ is the expected value of $Y$ conditional on $X$, $X \beta$ is a linear combination of the unknown parameters $\beta$, and $g$ is the _link_ function.
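
As a small illustration of the mean relation above, the following sketch computes $\mu = g^{-1}(X \beta)$ for the identity, log and logit links. The design matrix and coefficient values are made up for the example, and the first column of `X` is assumed to be a constant $X_0 = 1$ acting as the intercept.

```python
import numpy as np

# Hypothetical design matrix; the first column (X_0 = 1) is an assumed intercept.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
beta = np.array([0.5, -1.0])  # illustrative coefficients, not estimates

eta = X @ beta  # linear combination X beta


def inv_log(eta):
    """Inverse of the log link g(mu) = log(mu)."""
    return np.exp(eta)


def inv_logit(eta):
    """Inverse of the logit link g(mu) = log(mu / (1 - mu))."""
    return 1.0 / (1.0 + np.exp(-eta))


mu_identity = eta          # identity link: mu = X beta
mu_log = inv_log(eta)      # always positive, as needed for a Poisson or gamma mean
mu_logit = inv_logit(eta)  # always in (0, 1), as needed for a Bernoulli mean
```

The inverse link maps the unrestricted value $X \beta$ into the range of means that is valid for the chosen distribution.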

In this framework, the variance is typically a function, $V$, of the mean:
$$
    \mathrm{Var} ( Y \mid X ) = V ( \mu ) = V ( g^{-1}(X \beta)) .
$$
It is convenient if $V$ follows from an exponential family of distributions, but it may simply be that the variance is specified as a function of the predicted value.
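
For concreteness, here is a sketch of the variance functions that follow from four common exponential family distributions. It assumes any dispersion (scale) parameter equals 1, and the dictionary and its keys are illustrative names rather than part of any library.

```python
import numpy as np

# Variance functions V(mu) for some common distributions.
# Assumption: the dispersion (scale) parameter is fixed at 1.
variance_functions = {
    "normal": lambda mu: np.ones_like(mu),   # constant variance
    "poisson": lambda mu: mu,                # variance equals the mean
    "binomial": lambda mu: mu * (1.0 - mu),  # Bernoulli / proportion scale
    "gamma": lambda mu: mu ** 2,             # variance grows with mean squared
}

mu = np.array([0.1, 0.5, 0.9])
variances = {name: V(mu) for name, V in variance_functions.items()}
```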

The GLM consists of three elements:

1. A probability distribution from the exponential family.
2. A linear predictor $\eta = X \beta$.
3. A link function $g$ such that $\mathrm{E} ( Y \mid X ) = \mu = g^{-1}( \eta )$.

[Adapted from Wikipedia]
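
To show how the three elements fit together, the sketch below simulates data from a Poisson GLM with a log link and estimates $\beta$ by iteratively reweighted least squares (IRLS). IRLS is a standard fitting method for GLMs, but it is swapped in here for illustration and is not described in the text above. The function name `fit_poisson_glm`, the simulated data and the "true" coefficients are all assumptions of this example.

```python
import numpy as np


def fit_poisson_glm(X, y, n_iter=50, tol=1e-10):
    """Fit a Poisson GLM with log link by iteratively reweighted least squares."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta    # element 2: linear predictor
        mu = np.exp(eta)  # element 3: inverse of the log link
        # For Poisson with log link the working weights are mu and the
        # working response is z = eta + (y - mu) / mu.
        z = eta + (y - mu) / mu
        XtW = X.T * mu    # equivalent to X.T @ diag(mu)
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta


# Simulated data (hypothetical): intercept plus one standard normal covariate.
rng = np.random.default_rng(0)
n = 5000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
true_beta = np.array([0.3, 0.7])
y = rng.poisson(np.exp(X @ true_beta))  # element 1: Poisson distribution

beta_hat = fit_poisson_glm(X, y)
```

With a sample of this size, `beta_hat` should land close to `true_beta`. Changing the distribution or the link changes only the weights and the working response inside the loop.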

# The Exponential Family

The exponential family consists of several well-known distributions, both discrete and continuous. Every member of this family is a maximum entropy distribution for some set of constraints. A pdf or pmf $p( y \mid \theta)$, where $y \in \mathcal{Y} \subseteq \mathbb{R}^{m}$ and $\theta \in \mathbb{R}^{d}$, is in the exponential family if it is of the form:
$$
    p (y \mid \theta) 
        = \frac{1}{Z(\theta)} h(y) \exp ( \theta^T \phi(y)),
$$
where 
$$
    Z(\theta) = \int_{\mathcal{Y}} h(y) \exp ( \theta^T \phi(y)) \, dy.
$$
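
As a quick numerical check of this form, the sketch below writes the Bernoulli distribution with success probability $p$ as an exponential family member, taking $h(y) = 1$, $\phi(y) = y$ and $\theta = \log \frac{p}{1-p}$. Because the Bernoulli is a pmf, the integral defining $Z(\theta)$ becomes a sum over $\mathcal{Y} = \{0, 1\}$. The helper `exp_family_pmf` is a hypothetical name and handles only scalar $\theta$.

```python
import numpy as np


def exp_family_pmf(y, theta, h, phi, support):
    """Evaluate p(y | theta) = h(y) exp(theta * phi(y)) / Z(theta), scalar theta.

    For a pmf, Z(theta) is a sum over the (finite) support instead of an integral.
    """
    Z = sum(h(s) * np.exp(theta * phi(s)) for s in support)
    return h(y) * np.exp(theta * phi(y)) / Z


p = 0.3
theta = np.log(p / (1.0 - p))  # natural parameter: the log-odds

for y in (0, 1):
    via_family = exp_family_pmf(y, theta, h=lambda s: 1.0,
                                phi=lambda s: s, support=(0, 1))
    direct = p ** y * (1.0 - p) ** (1 - y)  # usual Bernoulli pmf
    assert np.isclose(via_family, direct)
```

Here $Z(\theta) = 1 + e^{\theta} = 1 / (1 - p)$, so the two expressions agree for both outcomes.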