# Generalized Linear Models

In [4]:
import matplotlib.pyplot as plt
import numpy as np
import statsmodels.api as sm

Linear regressions have limitations.

As it stands, the algorithm could generate a prediction *anywhere on the real number line*. This *may* be realistic, like if I'm predicting national surpluses/debts.

But what if I'm predicting values of a variable that doesn't take, say, negative values, like temperature in Kelvin? (Or, gulp, housing prices?)

What if I'm predicting values of a variable that takes only integer values, like the number of mouseclicks on my killer ds blog per minute?

What if I'm predicting probabilities? Or something Boolean / Bernoullian?

What if the shape of my errors changes as a function of the value of the dependent variable?

Am I stuck using linear regression? There's got to be a better way!

The strategy now is to *generalize* the notion of linear regression; ordinary linear regression will become a special case (the one where the function below is just the identity). In particular, we'll keep the idea of the regression best-fit line, but now **we'll allow the dependent variable to be related to the linear predictor through some (non-trivial) function**: instead of setting $y$ itself equal to the linear predictor, we set some function of $y$ equal to it. This function is standardly called the **link function**.

Let's say we've constructed our best-fit line, i.e. our linear predictor, $\hat{L} = \beta_0 + \beta_1x_1 + ... + \beta_nx_n$.
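To make the linear predictor concrete, here is a minimal NumPy sketch. The coefficients and data are made up purely for illustration (they don't come from any fitted model); the point is only that $\hat{L}$ is one unrestricted real number per observation, and it can easily come out negative.

```python
import numpy as np

# Hypothetical coefficients and data, chosen only for illustration.
beta = np.array([1.0, 2.0, -0.5])   # beta_0 (intercept), beta_1, beta_2
X = np.array([[0.0, 0.0],
              [1.0, 2.0],
              [-3.0, 4.0]])         # three observations, two features

# Prepend a column of ones so that beta_0 acts as the intercept.
X_design = np.column_stack([np.ones(len(X)), X])

# The linear predictor: one real number per observation.
L_hat = X_design @ beta
print(L_hat)
```

The third observation gets a negative value, which is exactly the kind of prediction we'd want to rule out for something like a count or a probability.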

## Logistic Regression

Consider the following transformation: <br/>
$\large\hat{y} = \Large\frac{1}{1 + e^{-\hat{L}}} \large= \Large\frac{1}{1 + e^{-(\beta_0 + ... + \beta_nx_n)}}$. This is called the **sigmoid function**.

We're imagining that $\hat{L}$ can take any values between $-\infty$ and $\infty$.

$\large\rightarrow$ But what values can $\hat{y}$ take? What does this function even look like?

In [11]:
# Let's plot this function here:
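One way to fill in this cell, as a minimal sketch: define the sigmoid with NumPy and plot it over a range of values for the linear predictor. The helper name `sigmoid` and the plotting range of $-10$ to $10$ are my own choices here, not anything fixed by the method.

```python
import matplotlib.pyplot as plt
import numpy as np


def sigmoid(L):
    """Map a linear predictor L (any real number) into the interval (0, 1)."""
    return 1 / (1 + np.exp(-L))


# Evaluate the sigmoid on an evenly spaced grid of linear-predictor values.
L_values = np.linspace(-10, 10, 200)
y_values = sigmoid(L_values)

fig, ax = plt.subplots()
ax.plot(L_values, y_values)
ax.axhline(0.5, linestyle="--", color="gray")  # sigmoid(0) = 0.5
ax.set_xlabel("linear predictor")
ax.set_ylabel("sigmoid output")
ax.set_title("The sigmoid function")
plt.show()
```

The curve is S-shaped, increasing, and squeezed strictly between 0 and 1, which is what makes it a natural fit for predicting probabilities.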



How do we fit a line to our dependent variable if its values are already stored as probabilities? We can apply the inverse of the sigmoid function to the dependent variable and set our linear predictor equal to the result. The inverse of the sigmoid function is called the **logit function**, and it looks like this:

$\large f(y) = \ln\left(\frac{y}{1 - y}\right)$. Notice that the domain of this function is $(0, 1)$.

$\hspace{110mm}$(Quick proof that logit and sigmoid are inverse functions:

$\hspace{170mm}x = \frac{1}{1 + e^{-y}}$; <br/>
$\hspace{170mm}$so $1 + e^{-y} = \frac{1}{x}$; <br/>
$\hspace{170mm}$so $e^{-y} = \frac{1 - x}{x}$; <br/>
$\hspace{170mm}$so $-y = \ln\left(\frac{1 - x}{x}\right)$; <br/>
$\hspace{170mm}$so $y = \ln\left(\frac{x}{1 - x}\right)$.)
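The algebra above can also be sanity-checked numerically. This is just a sketch with hand-picked test values; the function names `sigmoid` and `logit` are my own.

```python
import numpy as np


def sigmoid(L):
    """Map any real number into (0, 1)."""
    return 1 / (1 + np.exp(-L))


def logit(y):
    """Map a probability in (0, 1) back onto the real number line."""
    return np.log(y / (1 - y))


# Round trip starting from the real line: logit(sigmoid(L)) should give back L.
L_values = np.linspace(-5, 5, 11)
print(np.allclose(logit(sigmoid(L_values)), L_values))

# Round trip starting from probabilities: sigmoid(logit(p)) should give back p.
p_values = np.array([0.1, 0.25, 0.5, 0.9])
print(np.allclose(sigmoid(logit(p_values)), p_values))
```

Both checks only hold up to floating-point error, which is why the comparison uses `np.allclose` rather than exact equality.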

Our regression equation will now look like this:

$\large\ln\left(\frac{y}{1 - y}\right) = \beta_0 + \beta_1x_1 + ... + \beta_nx_n$.

This equation is used for a **logistic regression**: Its characteristic link function is this logit function.

## Poisson Regression

Here's a different sort of regression equation:

$\large\ln(y) = \beta_0 + \beta_1x_1 + ... + \beta_nx_n$. The link function is simply $\ln(y)$ and so we have:

$\large\hat{y} = e^{\hat{L}} = e^{\beta_0 + ... + \beta_nx_n}$.
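A quick numerical illustration of why this helps, using a few made-up values for the linear predictor: however negative $\hat{L}$ gets, exponentiating it always gives a positive prediction, and taking the log recovers the linear predictor.

```python
import numpy as np

# Hypothetical linear-predictor values, including negative ones.
L_hat = np.array([-7.0, -1.0, 0.0, 2.0])

# Undo the log link by exponentiating.
y_hat = np.exp(L_hat)
print(y_hat)

# Every prediction is strictly positive.
print(np.all(y_hat > 0))

# Applying the link function (the natural log) takes us back to L_hat.
print(np.allclose(np.log(y_hat), L_hat))
```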

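This is the equation for a **Poisson regression**, which treats the dependent variable as Poisson-distributed and is a natural choice for count data. The domain, or "support", for a Poisson distribution is {0, 1, 2, ... }. Can you see why?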

In [None]:
sm.GLM

Wikipedia has a very helpful page about generalized linear models! <br/> Access it here: https://en.wikipedia.org/wiki/Generalized_linear_model