# General Linear Models

## Table of Contents:

1. [General Linear Models](#General-Linear-Models)
1. [Random Component](#Random-Component)

### General Linear Models

In the previous section, we learned how to create a logistic regression by using a logistic link function to bound the output of a multiple linear regression to be between $0$ and $1$.

There are many different types of link functions that change the properties and outputs of multiple linear regression. The two types that we've learned about so far are:
* Gaussian link: $f\left(\widehat{y}_1, \dots, \widehat{y}_n\right) = \left(\frac{1}{\sqrt{2 \pi \sigma^2}}\right)^n \exp\left({- \frac{\sum_{i = 1}^{n} \left(y_i - \widehat{y}_i \right)^2}{2 \sigma^2}}\right)$
* logistic link: $p \left(\widehat{y}_i \right) = \frac{1}{1 + e^{-\widehat{y}_i}}$

where in both cases, $\widehat{y}_i = \widehat{b}_0 + \widehat{b}_1 x_{1, i} + \widehat{b}_2 x_{2, i} + \dots + \widehat{b}_k x_{k, i}$. Both of these link functions correspond to distributions in the **exponential family** (the normal and the Bernoulli, respectively). Models built on exponential-family distributions make up a broad class known as the **general linear models** (GLM), more commonly called generalized linear models. Many common distributions belong to the exponential family and can thus be modelled with a GLM:
* normal
* exponential
* Bernoulli
* Poisson
* gamma
* beta
* $\chi^2$

A general linear model consists of three parts:
* random component
* linear predictor
* link function
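
As a minimal sketch of how these three parts fit together, the snippet below builds a linear predictor, passes it through the logistic function from earlier, and draws Bernoulli observations. The design matrix `X`, the intercept `b0`, and the slopes `b` are made-up values chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical design matrix: n = 5 observations, k = 2 predictors.
X = rng.normal(size=(5, 2))
b0 = 0.5                    # illustrative intercept (assumption)
b = np.array([1.0, -2.0])   # illustrative slopes (assumption)

# Linear predictor: y_hat_i = b0 + b1 * x_1i + b2 * x_2i
eta = b0 + X @ b

# Logistic function maps the linear predictor into (0, 1).
p = 1.0 / (1.0 + np.exp(-eta))

# Random component: each observation is a Bernoulli draw with mean p_i.
y = rng.binomial(n=1, p=p)

print(eta)
print(p)
print(y)
```

Swapping the logistic function and the Bernoulli draw for another pairing (for example, the exponential function with Poisson draws) would give a different member of the same class of models.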

### Random Component

The **random component** consists of observations $y_1, \dots, y_n$ drawn from a distribution in the exponential family, with a density of the form:

\begin{equation}
f \left(y_i, \theta_i, \phi \right) = \exp \left(\frac{y_i \theta_i - b(\theta_i)}{a(\phi)} + c(y_i, \phi) \right)
\end{equation}

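where $\theta_i$ is known as the **natural parameter**, $\phi$ is a dispersion parameter, and $a$, $b$, and $c$ are known functions that identify the particular distribution. Because the density is an exponential, the log-likelihood is easy to find: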

\begin{align}
\ln \left( f \left(y_i, \theta_i, \phi \right) \right) &= \ln \left( \exp \left(\frac{y_i \theta_i - b(\theta_i)}{a(\phi)} + c(y_i, \phi) \right) \right) \\
&= \frac{y_i \theta_i - b(\theta_i)}{a(\phi)} + c(y_i, \phi)
\end{align}
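
As an illustration that is not part of the derivation above, the Bernoulli distribution with success probability $p_i$ can be rewritten in this form by moving the probability mass function inside an exponential:

\begin{align}
f \left(y_i, p_i \right) &= p_i^{y_i} \left(1 - p_i \right)^{1 - y_i} \\
&= \exp \left( y_i \ln \left( \frac{p_i}{1 - p_i} \right) + \ln \left(1 - p_i \right) \right)
\end{align}

Matching terms suggests $\theta_i = \ln \left( \frac{p_i}{1 - p_i} \right)$, $b(\theta_i) = \ln \left(1 + e^{\theta_i} \right)$, $a(\phi) = 1$, and $c(y_i, \phi) = 0$. Solving the first identity for $p_i$ gives $p_i = \frac{1}{1 + e^{-\theta_i}}$, which is the logistic function used in logistic regression.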

The first two derivatives of the log-likelihood are:

\begin{equation}
\frac{\partial \left( \ln \left( f \left(y_i, \theta_i, \phi \right) \right) \right)}{\partial \theta_i} = \frac{y_i - \frac{\partial \left(b(\theta_i)\right)}{\partial \theta_i}}{a(\phi)}
\end{equation}

and:

\begin{equation}
\frac{\partial^2 \left( \ln \left( f \left(y_i, \theta_i, \phi \right) \right) \right)}{\partial \theta_i^2} = -\frac{1}{a(\phi)} \frac{\partial^2 \left(b(\theta_i)\right)}{\partial \theta_i^2}
\end{equation}

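The expected value of the first derivative (the score) is $0$, which is a standard property of log-likelihoods. Applying this property gives us the mean $(\mu_i)$; the last steps use the fact that $\frac{\partial \left(b(\theta_i)\right)}{\partial \theta_i}$ does not depend on $y_i$, so it equals its own expectation: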

\begin{align}
E \left\{ \frac{\partial \left( \ln \left( f \left(y_i, \theta_i, \phi \right) \right) \right)}{\partial \theta_i} \right\} &= E \left\{ \frac{y_i - \frac{\partial \left(b(\theta_i)\right)}{\partial \theta_i}}{a(\phi)} \right\} = 0 \\
0 &= \frac{E \left\{ y_i - \frac{\partial \left(b(\theta_i)\right)}{\partial \theta_i} \right\} }{a(\phi)} \\
0 &= \frac{ E \left\{y_i \right\} - E \left\{ \frac{\partial \left(b(\theta_i)\right)}{\partial \theta_i} \right\}}{a(\phi)} \\
\frac{E \left\{y_i \right\}}{a(\phi)} &= \frac{E \left\{ \frac{\partial \left(b(\theta_i)\right)}{\partial \theta_i} \right\}}{a(\phi)} \\
E \left\{y_i \right\} &= E \left\{ \frac{\partial \left(b(\theta_i)\right)}{\partial \theta_i} \right\} = \frac{\partial \left(b(\theta_i)\right)}{\partial \theta_i} \\
\mu_i &= \frac{\partial \left(b(\theta_i)\right)}{\partial \theta_i}
\end{align}
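
A short worked case may help here. Assuming a Bernoulli random component, for which $b(\theta_i) = \ln \left(1 + e^{\theta_i} \right)$ (an identification introduced for illustration, not taken from the derivation above), the mean formula gives:

\begin{align}
\mu_i &= \frac{\partial}{\partial \theta_i} \ln \left(1 + e^{\theta_i} \right) \\
&= \frac{e^{\theta_i}}{1 + e^{\theta_i}} \\
&= \frac{1}{1 + e^{-\theta_i}}
\end{align}

This is the logistic function, so under that assumption the mean of each observation is the success probability that logistic regression predicts.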

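The variance takes a few more steps, so the derivation is skipped here. It follows from a second standard property of log-likelihoods, $E \left\{ \frac{\partial^2 \ln f}{\partial \theta_i^2} \right\} = -E \left\{ \left( \frac{\partial \ln f}{\partial \theta_i} \right)^2 \right\}$, and gives: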

\begin{equation}
\operatorname{var}\{y_i\} = a(\phi)\frac{\partial^2 \left(b(\theta_i)\right)}{\partial \theta_i^2}
\end{equation}
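
The mean and variance formulas can be checked numerically. The sketch below is an illustration under stated assumptions: it takes the Poisson case ($b(\theta) = e^{\theta}$, $a(\phi) = 1$) and the Bernoulli case ($b(\theta) = \ln(1 + e^{\theta})$, $a(\phi) = 1$), approximates the derivatives of $b$ with central finite differences, and prints the results so they can be compared with the familiar closed-form moments. The helper name `derivatives` and the parameter values are hypothetical.

```python
import numpy as np

def derivatives(b, theta, h=1e-4):
    """Central finite-difference estimates of b'(theta) and b''(theta)."""
    first = (b(theta + h) - b(theta - h)) / (2 * h)
    second = (b(theta + h) - 2 * b(theta) + b(theta - h)) / h**2
    return first, second

# Poisson with rate lam: theta = ln(lam), b(theta) = exp(theta), a(phi) = 1.
lam = 3.0
mean_pois, var_pois = derivatives(np.exp, np.log(lam))

# Bernoulli with probability p: theta = ln(p / (1 - p)),
# b(theta) = ln(1 + exp(theta)), a(phi) = 1.
p = 0.3
mean_bern, var_bern = derivatives(lambda t: np.log1p(np.exp(t)),
                                  np.log(p / (1 - p)))

print(mean_pois, var_pois)   # both should be close to lam
print(mean_bern, var_bern)   # should be close to p and p * (1 - p)
```

Because $a(\phi) = 1$ in both cases, the second derivative of $b$ is the variance directly; for a distribution with a nontrivial dispersion term, the second value would need to be multiplied by $a(\phi)$.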