## Coefficients in logistic regression

We saw that logistic regression is one way of modeling the probability that a data point belongs to one of two classes.
The model used a transformation called the logistic function.
The logistic function maps an unbounded linear function $(\beta_{0}+\beta_{1}X + \cdots)$ onto the interval $(0,1)$.

The model looked like the following

$$
p(y_{i}=1|X_{i},\beta) = \frac{ e^{ \beta_{0} + \beta_{1}x_{i1}+ \beta_{2}x_{i2} + \cdots + \beta_{n}x_{in}}}{1+e^{ \beta_{0} + \beta_{1}x_{i1}+ \beta_{2}x_{i2} + \cdots + \beta_{n}x_{in}}}
$$

By rearranging the above function we found an equivalent expression whose right-hand side is linear in the coefficients

$$
    \log \left(\frac{p}{1-p}\right) =  \beta_{0} + \beta_{1}x_{i1}+ \beta_{2}x_{i2} + \cdots + \beta_{n}x_{in}
$$

where $p = p(y_{i}=1|X_{i},\beta)$. The quantity $\frac{p}{1-p}$ is the odds and its logarithm is the log odds.
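As a quick numerical check, the sketch below implements the logistic function and the log odds in plain Python and confirms that one undoes the other. The function names `logistic` and `logit`, the coefficient values, and the data point are assumptions chosen only for illustration. The code uses the equivalent form $1/(1+e^{-z})$ of the logistic function.

```python
import math

def logistic(z):
    """Map an unbounded linear predictor z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Log odds of a probability p; the inverse of the logistic function."""
    return math.log(p / (1.0 - p))

# Hypothetical coefficients and one data point (illustrative values only).
beta0, beta1, beta2 = -1.0, 0.5, 2.0
x1, x2 = 1.0, 0.75

z = beta0 + beta1 * x1 + beta2 * x2  # linear predictor
p = logistic(z)                       # modeled probability that y = 1

# The log odds of p should recover the linear predictor (up to rounding).
print(p, logit(p), z)
```

Applying `logit` to the modeled probability returns the linear predictor, which is the sense in which the log odds is linear in the coefficients.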

## Goal

Our next goal is to understand and interpret the coefficients $\beta_{1}, \beta_{2}, \dots, \beta_{n}$.

### Taking a similar approach as linear regression

In multiple linear regression we can interpret the $\beta$s in the following way.
A change in a variable $X$ by one unit, holding the other variables fixed, results in a corresponding change in $Y$ of $\beta$ units.

We can derive the reason for this interpretation and then apply the same procedure to our logistic regression.

Consider the following multiple linear regression

\begin{align}
    y &= \beta_{0} + \beta_{1}x_{1} + \beta_{2}x_{2} + \epsilon\\
    \epsilon &\sim N(0,\sigma^{2})
\end{align}

To understand where the above "1-unit change" interpretation comes from, we can take the difference between two regression models: The above regression model for x values $x_{1}^{*}$ and $x_{2}^{*}$ and a second regression model for x values $x_{1}^{*}$ and $x_{2}^{*}+1$.

The first regression model is

\begin{align}
    y^{1} &= \beta_{0} + \beta_{1}x_{1}^{*} + \beta_{2}x_{2}^{*} + \epsilon\\
    \epsilon &\sim N(0,\sigma^{2})
\end{align}

and the second model is

\begin{align}
    y^{2} &= \beta_{0} + \beta_{1}x_{1}^{*} + \beta_{2}(x_{2}^{*}+1) + \epsilon\\
    \epsilon &\sim N(0,\sigma^{2})
\end{align}


The difference between these two models is

$$
y^{2} - y^{1} = \beta_{0} + \beta_{1}x_{1}^{*} + \beta_{2}(x_{2}^{*}+1) + \epsilon - (\beta_{0} + \beta_{1}x_{1}^{*} + \beta_{2}x_{2}^{*} + \epsilon) = \beta_{2}
$$

When we change the $x_{2}$ value by one unit while holding $x_{1}$ (and the error term $\epsilon$) fixed, the difference in y values equals $\beta_{2}$.
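The same subtraction can be checked numerically. This is a minimal sketch that assumes hypothetical coefficient values and omits the error term, so it compares the expected values of $y$ at the two sets of x values. The name `linear_mean` is an illustrative helper, not part of the model above.

```python
def linear_mean(x1, x2, beta0=1.0, beta1=0.5, beta2=2.0):
    """Expected value of y (error term omitted) for hypothetical coefficients."""
    return beta0 + beta1 * x1 + beta2 * x2

# Arbitrary starting values x1* and x2* (illustrative only).
x1_star, x2_star = 3.0, 4.0

# Increase x2 by one unit while holding x1 fixed.
diff = linear_mean(x1_star, x2_star + 1) - linear_mean(x1_star, x2_star)
print(diff)  # equals beta2
```

The difference does not depend on the starting values $x_{1}^{*}$ and $x_{2}^{*}$; only $\beta_{2}$ survives the subtraction.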

Let's apply the same reasoning to our logistic regression.
Our first logistic regression will take x values $x_{1}^{*}$ and $x_{2}^{*}$ and our second model will take x values $x_{1}^{*}$ and $x_{2}^{*}+1$.

The first model is

$$
\log \left(\frac{p^{1}}{1-p^{1}}\right) =  \beta_{0} + \beta_{1}x_{1}^{*}+ \beta_{2}x_{2}^{*}
$$

and the second model is

$$
\log \left(\frac{p^{2}}{1-p^{2}}\right) =  \beta_{0} + \beta_{1}x_{1}^{*}+ \beta_{2}(x_{2}^{*}+1)
$$

The difference between model $2$ and model $1$ equals

$$
\log \left(\frac{p^{2}}{1-p^{2}}\right) - \log \left(\frac{p^{1}}{1-p^{1}}\right) = \beta_{2}
$$

We can simplify the expression on the left-hand side using the logarithm property $\log a - \log b = \log (a/b)$.

$$
\log \left(\frac{ \frac{p^{2}}{1-p^{2}}}{ \frac{p^{1}}{1-p^{1}}}\right) = \beta_{2}
$$

The expression on the left is called the **log odds ratio**.
Often both sides are exponentiated.

$$
\frac{ \frac{p^{2}}{1-p^{2}}}{ \frac{p^{1}}{1-p^{1}}} = e^{\beta_{2}}
$$

The expression on the left is called the **odds ratio**.
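The identity above can be verified numerically. The sketch below assumes hypothetical coefficients and x values, computes $p^{1}$ and $p^{2}$ from the logistic function, and compares the resulting odds ratio with $e^{\beta_{2}}$. The helper names `logistic` and `odds` are illustrative.

```python
import math

def logistic(z):
    """Map a linear predictor z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1.0 - p)

# Hypothetical coefficients and starting x values (illustrative only).
beta0, beta1, beta2 = -1.0, 0.5, 0.8
x1_star, x2_star = 2.0, 1.0

p1 = logistic(beta0 + beta1 * x1_star + beta2 * x2_star)        # first model
p2 = logistic(beta0 + beta1 * x1_star + beta2 * (x2_star + 1))  # x2 raised by one unit

odds_ratio = odds(p2) / odds(p1)
print(odds_ratio, math.exp(beta2))  # the two values agree up to rounding
```

Changing `x1_star` or `x2_star` changes `p1` and `p2` but leaves the odds ratio at $e^{\beta_{2}}$.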

## $\beta$

When $\beta$ equals $0$, the odds ratio equals one

$$
\frac{ \frac{p^{2}}{1-p^{2}}}{ \frac{p^{1}}{1-p^{1}}} = e^{0} = 1
$$

The ratio on the left can equal one only if $p^2$ and $p^1$ are equal.
When $\beta$ equals $0$ the corresponding explanatory variable has no effect on the probability that a data point belongs to group $1$ versus group $0$.

When $\beta$ is a positive number, for example $\beta=2$, then a one unit increase in the x value multiplies the odds that this data point belongs to group $1$ by $e^{2}$, or roughly 7.4.
This is a bit unsatisfying because odds are more difficult to interpret than probabilities.
Typically, I like to use the following worked example when interpreting coefficients in logistic regression.

Suppose the original probability of belonging to group $1$ versus group $0$ was 50\% $(p^1 = 0.50)$, so the original odds equal one.
Then

\begin{align}
\frac{ \frac{p^{2}}{1-p^{2}}}{ \frac{0.50}{1-0.50}} &= e^{2}\\ 
  \frac{p^{2}}{1-p^{2}} &\approx 7.4\\
  p^{2} &\approx 7.4\,(1-p^{2})\\
  p^{2}(1+7.4) &\approx 7.4\\
  p^2 &\approx 7.4/(1+7.4) \approx 88\%
\end{align}

The probability increases from 50\% to 88\%.
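The calculation above can be wrapped in a small function and repeated for other starting probabilities. This is a minimal sketch; the name `shifted_probability` and the baseline values other than 0.50 are assumptions made for illustration.

```python
import math

def shifted_probability(p1, beta):
    """Probability after a one-unit increase in x, given baseline probability p1."""
    new_odds = (p1 / (1.0 - p1)) * math.exp(beta)  # odds are multiplied by e^beta
    return new_odds / (1.0 + new_odds)             # convert odds back to a probability

# Baseline probabilities to compare; 0.50 matches the worked example above.
for baseline in (0.10, 0.50, 0.90):
    print(baseline, round(shifted_probability(baseline, 2.0), 3))
```

For a baseline of 0.50 this reproduces the value of roughly 0.88. The other baselines show that the same $\beta$ moves the probability by a different amount depending on where it starts, even though the odds ratio is always $e^{\beta}$.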
