###  Islands: Generalized Linear Models - Chapter 1

[Back to Main Page](0_main_page.ipynb)

<br>

<h1> <center> What are Generalized Linear Models? </center> </h1>

## Generalized Linear Models

Generalized linear models are extensions of the linear regression model that relax some of the assumptions of linear regression, and as a result can be applied to a wider variety of datasets. Linear regression involves modelling the linear relation between scores on a quantitative outcome variable ($y$) and scores on a set of predictor variables ($x_1, x_2, \dots, x_k$). The predictor variables can be of any type (quantitative-continuous, quantitative-discrete, nominal-categorical, ordinal-categorical). Linear regression uses a linear prediction equation of the form:

$\large \hat{y}_i = b_0 + b_1x_{1i} + \dots + b_kx_{ki} $

where:

$\hat{y}_i$ : is the predicted value of the outcome variable for a given set of predictor scores, for the $i$th observation

$b_0$ : is the intercept term, the predicted value of the outcome variable when all predictors equal 0

$b_1$ : is the slope of the 1st predictor variable

$x_{1i}$ : is the score on the first predictor variable, for the $i$th observation

$b_k$ : is the slope of the $k$th predictor variable

$x_{ki}$ : is the score on the $k$th predictor variable, for the $i$th observation
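
As a minimal sketch of the prediction equation above, the snippet below computes $\hat{y}_i$ for three observations and two predictors using NumPy. The coefficient values and predictor scores are made up for illustration and do not come from any dataset used in this book.

```python
import numpy as np

# Hypothetical coefficients: intercept b_0 and slopes b_1, b_2
b0 = 1.0
b = np.array([2.0, -0.5])

# Hypothetical predictor scores: one row per observation, one column per predictor
X = np.array([[0.0, 0.0],
              [1.0, 2.0],
              [3.0, 4.0]])

# y_hat_i = b_0 + b_1 * x_1i + b_2 * x_2i, computed for every row at once
y_hat = b0 + X @ b
print(y_hat)
```

The first observation has a score of 0 on both predictors, so its prediction is just the intercept.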

Linear regression assumes a normal distribution of the residuals, which are the differences between the actual datapoints and the model predictions $(y_i - \hat{y}_i)$. Generalized linear models relax this assumption (and others), and so extend the machinery of linear regression to allow us to use a linear prediction equation to predict a response distribution that is non-normal.

## The 'Conditional Distribution' view of regression models

Linear regression is often introduced via the sum of squares perspective, whereby the parameter estimates (the intercept and slopes) are obtained by minimizing:

$ \large \sum\limits_{i = 1}^{n} (y_i - \hat{y}_i)^2$

or equivalently:

$ \large \sum\limits_{i = 1}^{n} (y_i - (b_0 + b_1x_{1i} + \dots + b_kx_{ki}))^2 $

This works for linear regression, but a different perspective is required to understand parameter estimation in generalized linear models. I will refer to this as the 'conditional distribution' perspective. This is easiest to visualize for a linear model containing only one predictor. 

The conditional distribution perspective is that linear regression fits a normal distribution for each level of the predictor, and that the linear regression line runs through the mean of each normal distribution:


![](images/GLM_normal_identity.png)
(Image from: https://blogs.sas.com/content/iml/2015/09/10/plot-distrib-reg-model.html)

If the predictor is useful, then the conditional means will change a lot as the value of the predictor changes (i.e. the predictor gives a lot of information about the value of the outcome variable):

![](images/linear_regression_ML.png)

If the predictor is not very useful, then the conditional means will not change a lot as the value of the predictor changes (i.e. the predictor gives very little information about the value of the outcome variable):

![](images/linear_regression_ML_null.png)

We'll come on to how these distributions are fit in later pages. For linear regression, the parameter estimates produced via the conditional distribution approach and via the sum of squares approach are equivalent.
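
The equivalence can be illustrated with a small numerical sketch. The code below simulates hypothetical data, obtains the least-squares estimates, and then evaluates both the sum of squares and the normal negative log-likelihood over a grid of candidate slopes. The data, the fixed value of `sigma`, and the function names are assumptions made for this illustration only; the fitting method itself is covered in later pages.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (hypothetical): y is normally distributed around a line in x
n = 200
x = rng.uniform(0, 10, size=n)
y = 2.0 + 0.5 * x + rng.normal(0, 1.0, size=n)

def sum_of_squares(b0, b1):
    return np.sum((y - (b0 + b1 * x)) ** 2)

def neg_log_likelihood(b0, b1, sigma=1.0):
    # Negative log-likelihood of y under Normal(b0 + b1 * x, sigma^2),
    # with sigma held fixed (an assumption of this sketch)
    resid = y - (b0 + b1 * x)
    return 0.5 * n * np.log(2 * np.pi * sigma ** 2) + np.sum(resid ** 2) / (2 * sigma ** 2)

# Least-squares estimates from the design matrix [1, x]
X = np.column_stack([np.ones(n), x])
b0_ls, b1_ls = np.linalg.lstsq(X, y, rcond=None)[0]

# Evaluate both criteria over candidate slopes, holding the intercept at b0_ls
slopes = np.linspace(b1_ls - 0.2, b1_ls + 0.2, 41)
ss = np.array([sum_of_squares(b0_ls, s) for s in slopes])
nll = np.array([neg_log_likelihood(b0_ls, s) for s in slopes])

# Both criteria pick out the same slope, which is the least-squares slope
print(slopes[np.argmin(ss)], slopes[np.argmin(nll)], b1_ls)
```

For a fixed `sigma`, the negative log-likelihood is a constant plus the sum of squares divided by $2\sigma^2$, so whichever coefficients minimize one also minimize the other.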

Generalized linear models work by allowing other distributional forms (other than normal distributions) to be fit to the outcome variable.

Some other ways of writing the linear regression are useful here. Because the linear regression model predicts the conditional mean as a function of the predictor variables, we can write the model as:

$\large \mu_i = b_0 + b_1x_{1i} + \dots + b_kx_{ki} $

Generalized linear models, of which the linear regression model is a special case, predict *some function of the conditional mean*. In the case of linear regression, this function is just $f(\mu) = \mu$, and so is referred to as the *identity link function*. Thus, the general form of generalized linear models is:

$\large f(\mu_i) = b_0 + b_1x_{1i} + \dots + b_kx_{ki} $

For linear regression, the predictions the model can produce range across the entire real line $(-\infty, \infty)$. This would produce nonsensical predictions for certain types of outcome variable. For instance, negative predictions (or predictions above 1) do not make sense for binary outcomes that fall into one of two categories and are dummy-coded as 0 or 1. Nor do negative predictions make sense for count outcomes, which take only non-negative whole-number values.
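
A small sketch of the problem, using a made-up binary outcome rather than data from this book: fitting an ordinary least-squares line to 0/1 scores gives predictions that fall below 0 and above 1 at the extremes of the predictor.

```python
import numpy as np

# Hypothetical binary outcome: 0 when x <= 0, 1 when x > 0
x = np.linspace(-3, 3, 13)
y = (x > 0).astype(float)

# Ordinary least-squares fit of y on x (i.e. using the identity link)
X = np.column_stack([np.ones_like(x), x])
b0, b1 = np.linalg.lstsq(X, y, rcond=None)[0]

# Predictions at the observed x values
y_hat = b0 + b1 * x
print(y_hat.min(), y_hat.max())
```

The smallest prediction is negative and the largest exceeds 1, so neither can be read as the mean of a 0/1 outcome.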

The link function in generalized linear models transforms the conditional mean onto a scale that ranges over $(-\infty, \infty)$, so the linear prediction equation can take any real value while the implied mean stays within its valid range. For instance, when dealing with binary outcome variables, we can use the logit link function $\ln\left(\frac{\mu}{1 - \mu}\right)$. With this link, the linear prediction equation predicts the log of the odds $\frac{\mu}{1 - \mu}$ (the log odds), which ranges over $(-\infty, \infty)$.
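
As an illustrative sketch, the logit link and its inverse can be written in a few lines of NumPy. The function names `logit` and `inv_logit` and the example values of the linear predictor are chosen here for illustration.

```python
import numpy as np

def logit(mu):
    # Link: maps a mean in (0, 1) to the log odds in (-inf, inf)
    return np.log(mu / (1 - mu))

def inv_logit(eta):
    # Inverse link: maps any real value of the linear predictor back to (0, 1)
    return 1 / (1 + np.exp(-eta))

# Hypothetical values of the linear prediction equation
eta = np.array([-5.0, 0.0, 5.0])

mu = inv_logit(eta)
print(mu)          # every value lies strictly between 0 and 1
print(logit(mu))   # applying the link recovers the linear predictor values
```

However large or small the linear predictor becomes, the implied mean never leaves the interval between 0 and 1.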

The table below shows some of the generalized linear models that are covered by this textbook:

| Model type          | Response distribution | Range of outcome values | Link name | Link function                           |
|---------------------|-----------------------|-------------------------|-----------|-----------------------------------------|
| Linear Regression   | Normal                | Real numbers: (-∞, ∞)   | Identity  | $\mu$                                   |
| Logistic Regression | Bernoulli             | Integers: {0, 1}        | Logit     | $\ln\left(\frac{\mu}{1 - \mu}\right)$   |
| Poisson Regression  | Poisson               | Integers: 0, 1, 2, …    | Log       | $\ln(\mu)$                              |

## The structure of this book

This book will use simulated data, which cleanly illustrates the principles of fitting generalized linear models. Each page will focus on a different type of generalized linear model, and will involve data from a different hypothetical island. Each model will be explained first in the one-predictor case, using a continuous predictor, then using several predictors, one of which will be categorical.

Let's visit the islands; the links are below.

## Other Chapters

1. [What are Generalized Linear Models?](1_generalized_linear_models.ipynb)
2. [Linear Regression](2_linear_regression.ipynb)
3. [Poisson Regression](3_poisson_regression.ipynb)
4. [Binary Logistic Regression](4_binary_logistic_regression.ipynb)
5. [Multinomial Logistic Regression](5_multinomial_logistic_regression.ipynb)

***
By [pxr687](99_about_the_author.ipynb) 