# **Categorical, Continuous, and Multiple Predictors**

---
1. Categorical Predictor
2. Continuous Predictor
3. Multiple Predictors

## **1. Categorical Predictor**
---

### **1.1. Binary Predictor**
---

- Predictor $(X)$ : independent variable, explanatory variable, feature.
- Binary predictor : categorical predictor with only 2 categories.
- Reference cell coding ("zero-one" coding)
  - Assigns the zero value to the lower code for $x$ as a reference, and one to the higher code.
  - $x=0$ as the reference.
- **Interpretation**:

> The estimate of the odds ratio between category 1 and category 0 is $\text{OR}=\exp(\beta_{1})$.
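
As a minimal sketch of this interpretation, the snippet below checks that the ratio of the odds at $x=1$ and $x=0$ equals $\exp(\beta_{1})$. The values of `beta_0` and `beta_1` are made-up numbers for illustration only, not estimates from any data set in these notes.

```python
import math

# Hypothetical fitted coefficients (illustrative values only)
beta_0 = -1.0   # log odds when x = 0 (reference category)
beta_1 = 0.7    # change in log odds when moving from x = 0 to x = 1

def odds(x):
    """Odds of success implied by the logit model at x (0 or 1)."""
    return math.exp(beta_0 + beta_1 * x)

# Odds ratio between category 1 and category 0
odds_ratio = odds(1) / odds(0)

print(f"odds at x=0: {odds(0):.4f}")
print(f"odds at x=1: {odds(1):.4f}")
print(f"odds ratio : {odds_ratio:.4f}")  # same value as exp(beta_1)
```

The intercept cancels in the ratio, which is why the odds ratio depends on $\beta_{1}$ alone.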

### **1.2. Multicategory Predictor**
---

- Multicategory predictor : categorical predictor with more than 2 categories.
- Reference cell coding:
  - Assigns zero on every indicator to the reference category of $x$ and uses indicator/dummy variables for the other categories.
  - Thus, a predictor with $c$ categories will have one reference category and $c-1$ indicator/dummy variables.
- For example, the original labels/codes for the predictor Color $(x)$, which has 4 categories:

<center>

|Color $(x)$|Code|
|:--:|:--:|
|Medium Light|1|
|Medium|2|
|Medium Dark|3|
|Dark|4|

</center>

- With reference cell coding, the predictor Color becomes 3 indicator/dummy variables, with **Dark as the reference**:

<center>

|Color|$c_{1}$|$c_{2}$|$c_{3}$|
|:--:|:--:|:--:|:--:|
|Medium Light|1|0|0|
|Medium|0|1|0|
|Medium Dark|0|0|1|
|Dark|0|0|0|

</center>

- Thus, the logit model becomes:

$$
\text{log(odds)} = \beta_{0}+\beta_{1}c_{1}+\beta_{2}c_{2}+\beta_{3}c_{3}
$$

- With reference cell coding,
  - $c_{1}=1$ for color = medium light, $0$ otherwise
  - $c_{2}=1$ for color = medium, $0$ otherwise
  - $c_{3}=1$ for color = medium dark, $0$ otherwise
  - Color is dark when $c_{1}=c_{2}=c_{3}=0$
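
The coding above can be sketched in plain Python as follows. The helper name `dummy_code` is hypothetical and chosen only for this illustration; the category labels and the choice of Dark as the reference come from the tables above.

```python
# Categories in their original order; "Dark" is the reference category
CATEGORIES = ["Medium Light", "Medium", "Medium Dark", "Dark"]
REFERENCE = "Dark"

def dummy_code(color):
    """Return (c1, c2, c3) for one color value under reference cell coding."""
    if color not in CATEGORIES:
        raise ValueError(f"unknown color: {color!r}")
    # One indicator per non-reference category, in the original order
    non_reference = [c for c in CATEGORIES if c != REFERENCE]
    return tuple(int(color == c) for c in non_reference)

for color in CATEGORIES:
    print(f"{color:<12} -> {dummy_code(color)}")
```

Each row printed by the loop should match the corresponding row of the indicator table, with the reference category mapped to all zeros.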

- **Interpretation**:
> The estimate of the odds ratio between category $k$ and the reference category is $\text{OR}(k,\text{reference})=\exp(\beta_{k})$, with $k=1,2,\dots,c-1$.

- In general, to interpret the odds ratio between two categories:
  1. Calculate the logit difference.
  2. Interpret in terms of the odds ratio.
- Thus, the odds ratio between two categories, say categories $a$ and $b$, is $\text{OR}(a,b)=\exp(\beta_{a}-\beta_{b})$, where the coefficient of the reference category is taken as $0$.

- **Interpretation**:
> The odds of success at $x$ = category $a$  equals $\exp(\beta_{a}-\beta_{b})$ times the odds of success at $x$ = category $b$.
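
A short sketch of the two steps (logit difference, then exponentiate) is given below. The coefficient values in `beta` are hypothetical and serve only to illustrate the arithmetic; the reference category Dark is assigned a coefficient of 0.

```python
import math

# Hypothetical coefficients for the dummy variables c1, c2, c3.
# Dark is the reference category, so its coefficient is fixed at 0.
beta = {
    "Medium Light": 1.1,
    "Medium": 0.9,
    "Medium Dark": 0.3,
    "Dark": 0.0,
}

def odds_ratio(a, b):
    """Odds ratio of category a versus category b: exp(beta_a - beta_b)."""
    logit_difference = beta[a] - beta[b]   # step 1: logit difference
    return math.exp(logit_difference)      # step 2: convert to an odds ratio

print(f"Medium Light vs Dark  : {odds_ratio('Medium Light', 'Dark'):.4f}")
print(f"Medium vs Medium Dark : {odds_ratio('Medium', 'Medium Dark'):.4f}")
```

Comparing against the reference category reduces to $\exp(\beta_{k})$, and swapping $a$ and $b$ gives the reciprocal odds ratio.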


### **1.3. Ordinal Predictor**
---

- Ordinal predictor : predictor with ordered scale categories, e.g: low, medium, high.
- Ordinal predictors are treated in a quantitative manner (on a continuous scale).
- Thus, the coding is:
  - $x=1$ --> low
  - $x=2$ --> medium
  - $x=3$ --> high
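
As a small illustration of this coding, the snippet below maps ordered labels to the numeric scores listed above. The list `levels` holds hypothetical observations invented for the example.

```python
# Ordered categories mapped to the numeric scores listed above
LEVEL_SCORES = {"low": 1, "medium": 2, "high": 3}

# Hypothetical observations of an ordinal predictor
levels = ["medium", "low", "high", "high", "low"]

# Numeric column that would enter the model as one quantitative predictor
x = [LEVEL_SCORES[level] for level in levels]
print(x)
```

With this coding the predictor contributes a single slope to the model, instead of one coefficient per non-reference category.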

## **2. Continuous Predictor**
---

### **2.1. The Intercept**
---

- The intercept is the value of the logit when $x=0$.
$$
\begin{align*}
\text{logit}(\pi(x)) &= \beta_{0}+\beta_{1}(x) \\
\text{logit}(\pi(0)) &= \beta_{0}+\beta_{1}(0) \\
\text{logit}(\pi(0)) &= \beta_{0} \\
\end{align*}
$$
<br>
- For example, the logit model from the horseshoe crab data:
$$
\begin{align*}
\text{logit(Width=0)} &= -12.3508 + 0.4972 (0) \\
\text{logit(Width=0)} &= -12.3508
\end{align*}
$$
  - Interpretation: the estimated odds of a crab having any satellite are $\exp(-12.3508)$ when its width is 0 cm.
  - A crab with zero width is not realistic.
  - The intercept is therefore not meaningful and is difficult to interpret.
- To make it interpretable, we can transform the predictor $x$ by centering it.
  - `c_width = width - np.mean(width)`
  - Thus, `c_width = 0` represents the mean value of width.
  - In general, a value of zero for the centered predictor represents the average value of the original predictor.
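
The effect of centering can be sketched as follows. The coefficients are the ones quoted above for the horseshoe crab model, but the `width` values are hypothetical numbers invented for the example, not the real data. The sketch uses `statistics.mean` in place of `np.mean` so that it needs only the standard library, and it relies on the fact that centering only shifts the predictor: the slope stays the same and the intercept becomes $\beta_{0}+\beta_{1}\bar{x}$.

```python
import math
from statistics import mean

# Coefficients quoted above for the horseshoe crab model
beta_0 = -12.3508
beta_1 = 0.4972

# Odds at width = 0 cm: a tiny number for a crab that cannot exist
print(f"odds at width 0   : {math.exp(beta_0):.2e}")

# Hypothetical widths in cm (illustrative values, not the real data)
width = [22.5, 24.0, 26.0, 27.5, 30.0]
mean_width = mean(width)
c_width = [w - mean_width for w in width]   # centered predictor

# Intercept of the same model expressed in terms of c_width:
# it is the logit at the mean width
beta_0_centered = beta_0 + beta_1 * mean_width

print(f"mean width        : {mean_width:.1f} cm")
print(f"centered intercept: {beta_0_centered:.4f}")
print(f"odds at mean width: {math.exp(beta_0_centered):.4f}")
```

The centered intercept now describes a crab of average width, which is a quantity that can actually be interpreted.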

### **2.2. The Effect of Continuous Predictor**
---

- Under the assumption that the logit is linear in the continuous predictor $x$, the equation for the logit is:

$$
\text{logit}(\pi(x)) = \beta_{0}+\beta_{1}(x)
$$
<br>
- The slope coefficient, $\beta_{1}$, gives the change in the log odds for an increase of 1 unit in $x$.
- Thus, the odds of success multiply by $\exp(\beta_{1})$ for every 1 unit increase in $x$.
  - But if $x$ lies in the range $[0,1]$, a change of 1 unit is too large. A change of 0.01 may be more realistic.
  - Conversely, a 1 ml increase in coffee consumption may be too small to be considered important. A change of 50 or 100 ml may be more realistic.
  - **Solution: use the term “$c$” for the change in $x$.**
- The interpretation for a $c$-unit change in $x$:
> The odds of success multiply by $\exp(c\times\beta_{1})$ for every $c$-unit increase in $x$.
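
As a brief sketch of this rule, the snippet below evaluates $\exp(c\times\beta_{1})$ for a few step sizes, using the width slope quoted earlier for the horseshoe crab model. The function name `odds_multiplier` and the chosen values of $c$ are illustrative assumptions.

```python
import math

beta_1 = 0.4972  # slope for width quoted earlier (per 1 cm)

def odds_multiplier(beta, c=1.0):
    """Factor by which the odds change for a c-unit increase in x."""
    return math.exp(c * beta)

# Illustrative step sizes in cm
for c in (0.1, 1, 5):
    print(f"c = {c:>3}: odds multiply by {odds_multiplier(beta_1, c):.4f}")
```

Because the effect is multiplicative, a $c$-unit change corresponds to $\exp(\beta_{1})^{c}$, not to $c\times\exp(\beta_{1})$.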