**SM339 ▪ Applied Statistics ▪ Spring 2024 ▪ Uhan**

# Lesson 27. Multiple Logistic Regression – Part 2

## Overview

In this lesson, we will continue to use the `MedGPA` data, which contains information for 55 medical school applicants from a liberal arts college in the Midwest.

In [None]:
library(Stat2Data)
data(MedGPA)
head(MedGPA)

We will focus on the following variables:

| Variable | Description |
| :- | :- | 
| `GPA` | Applicant's college grade point average |
| `MCAT` | Applicant's MCAT (Medical College Admission Test) score |
| `Sex` | F for female, M for male |
| `Acceptance` | 1 if accepted, 0 if not accepted |

Unless otherwise stated, use a significance level of $\alpha = 0.05$.
Throughout this lesson, let $\pi = P(\mathit{Acceptance} = 1)$.

## Example 1

__Does probability of acceptance differ by sex?__

### a.

Fit a model that uses sex to predict acceptance.

* Note that `Sex` is a categorical variable with two "levels" (categories) 

* We can confirm this by using the `levels()` function on the `Sex` variable, like this:

In [None]:
levels(MedGPA$Sex)

* For categorical variables, R uses the first category as the reference category, and automatically creates a binary variable for every other level

* In this case, R uses 'F' as the reference category, and defines the variable

    $$ \mathit{SexM} = \begin{cases}
    1 & \text{if $\mathit{Sex}$ = M}\\
    0 & \text{otherwise}
    \end{cases}$$

* You should find that the fitted model is

    $$ \text{logit}(\hat{\pi}) = 0.5878 - 0.8109 \mathit{SexM} $$
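
The fit above can be reproduced with `glm()`. This is a minimal sketch: the object name `fit.sex` is chosen here for illustration and does not come from the lesson.

```r
library(Stat2Data)
data(MedGPA)

# Logistic regression of acceptance on sex.
# 'F' is the reference level, so R reports a coefficient named SexM.
fit.sex <- glm(Acceptance ~ Sex, data = MedGPA, family = binomial)
summary(fit.sex)
```

The `Estimate` column of the coefficient table holds the intercept and the slope of $\mathit{SexM}$.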

### b.

Interpret the estimated slope of $\mathit{SexM}$, in terms of an odds ratio.

*Write your answer here. Double-click to edit.*

### c.

Provide a 95\% confidence interval for the true odds ratio of acceptance between male and female students.
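
One way to get this interval is to build a Wald interval for the coefficient of $\mathit{SexM}$ and exponentiate its endpoints. The sketch below assumes that approach and refits the model from part a under the illustrative name `fit.sex`. Using `confint()` instead would give a profile-likelihood interval, which can differ slightly.

```r
library(Stat2Data)
data(MedGPA)

# Refit the model from part a (fit.sex is an illustrative name)
fit.sex <- glm(Acceptance ~ Sex, data = MedGPA, family = binomial)

# Wald interval for the log odds ratio (the SexM coefficient)
ci.log.or <- confint.default(fit.sex, parm = "SexM", level = 0.95)

# Exponentiate the endpoints to get an interval for the odds ratio
ci.or <- exp(ci.log.or)
ci.or
```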

### d.

Is there significant evidence that the probability of acceptance differs by sex? Justify your answer.

*Write your answer here. Double-click to edit.*

## Example 2

__After accounting for GPA, does probability of acceptance differ by sex?__

### a.

Fit a model that uses GPA and sex to predict acceptance. 

* You should find the fitted model is

    $$ \text{logit}(\hat{\pi}) = -21.0680 + 6.1324 \mathit{GPA} - 1.1697 \mathit{SexM} $$

* Therefore, for male students, the model is
    \begin{align*}
    \text{logit}(\hat{\pi}) & = -21.0680 + 6.1324 \mathit{GPA} - 1.1697 (1)\\
    & = -22.2377 + 6.1324 \mathit{GPA}
    \end{align*}
    and for female students,
    \begin{align*}
    \text{logit}(\hat{\pi}) & = -21.0680 + 6.1324 \mathit{GPA} - 1.1697 (0)\\
    & = -21.0680 + 6.1324 \mathit{GPA}
    \end{align*}
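
A minimal sketch of this fit is shown below. It stores the model as `fit`, the name that the plotting code in this example expects.

```r
library(Stat2Data)
data(MedGPA)

# Logistic regression of acceptance on GPA and sex ('F' is the reference level)
fit <- glm(Acceptance ~ GPA + Sex, data = MedGPA, family = binomial)
summary(fit)
```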

### b.

Compare the fitted model (in probability form) for male students with the fitted model for female students.

In particular, for each fitted model, compute where the midpoint occurs and the slope of the curve at that midpoint. 
Based on your answers, describe how the plots of the two fitted models compare to each other.

*Hint.* See Lesson 24.
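
For a logistic curve with $\text{logit}(\pi) = \beta_0 + \beta_1 x$, the probability equals 0.5 at $x = -\beta_0 / \beta_1$, and the slope of the curve at that point is $\beta_1 / 4$. The sketch below applies these standard facts to the two fitted equations from part a. Whether Lesson 24 presents them in exactly this form is an assumption, and the variable names are illustrative.

```r
# Coefficients copied from the fitted equations in part a
b0.female <- -21.0680
b0.male <- -22.2377
b.gpa <- 6.1324

# GPA at which the fitted probability equals 0.5
midpoint.female <- -b0.female / b.gpa
midpoint.male <- -b0.male / b.gpa

# Slope of the probability curve at its midpoint (same for both curves,
# because both equations share the same GPA coefficient)
slope.at.midpoint <- b.gpa / 4

round(c(female = midpoint.female, male = midpoint.male), 3)
round(slope.at.midpoint, 3)
```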

*Write your answer here. Double-click to edit.*

* Use the code below to plot the fitted models (in probability form) for male students and female students and confirm your answer to part b

In [None]:
# Assumes the fitted model from part a is stored in `fit`
# Create x values for plots of fitted models
xx <- seq(from = 0.0, to = 5.0, by = 0.01)

# Create y values for fitted model for male students
male.yy <- predict(fit, newdata = data.frame(GPA = xx, Sex = 'M'), type = 'response')

# Create y values for fitted model for female students
female.yy <- predict(fit, newdata = data.frame(GPA = xx, Sex = 'F'), type = 'response')

# Plot the fitted models, starting from an empty plot
plot(NULL, xlim = c(2, 5), ylim = c(0, 1), xlab = 'GPA', ylab = 'P(Acceptance = 1)')
lines(xx, male.yy, col = 'red')
lines(xx, female.yy, col = 'black')

### c.

Is the model useful overall? Conduct an appropriate test to decide.
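
One appropriate choice is the likelihood ratio (drop-in-deviance) test, which compares the fitted model to the model with no predictors. The sketch below assumes that test and refits the model from part a as `fit`. The names `G`, `df`, and `p.value` are illustrative.

```r
library(Stat2Data)
data(MedGPA)

# Refit the model from part a
fit <- glm(Acceptance ~ GPA + Sex, data = MedGPA, family = binomial)

# Test statistic: null deviance minus residual deviance
G <- fit$null.deviance - fit$deviance

# Degrees of freedom: number of predictors in the model
df <- fit$df.null - fit$df.residual

# Upper-tail chi-square probability
p.value <- pchisq(G, df = df, lower.tail = FALSE)
c(G = G, df = df, p.value = p.value)
```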

*Write your answer here. Double-click to edit.*

### d.

Answer the motivating question for this example. Justify your answer.

*Write your answer here. Double-click to edit.*

### e.

Estimate the odds ratio of acceptance for male students with a 3.5 GPA versus female students with a 3.5 GPA.
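
Because GPA is the same in both groups, the GPA terms cancel when the two log odds are subtracted, leaving only the coefficient of $\mathit{SexM}$. A minimal sketch using the coefficient from part a, with illustrative variable names:

```r
# Coefficient of SexM copied from the fitted model in part a
b.sexm <- -1.1697

# Log odds (male, GPA 3.5) minus log odds (female, GPA 3.5) equals b.sexm,
# so the odds ratio is its exponential
or.male.vs.female <- exp(b.sexm)
round(or.male.vs.female, 4)
```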

### f.

Interpret the coefficient of $\mathit{SexM}$ in terms of an odds ratio. 

*Write your answer here. Double-click to edit.*

### h.

Estimate the odds ratio of acceptance for male students with a 3.8 GPA versus female students with a 3.5 GPA.

Note that in this part, you are being asked to compare odds under conditions in which __two variables differ__: $\mathit{GPA}$ and $\mathit{SexM}$. So in this case, it isn't as simple as looking at $e^{\hat{\beta}_1}$ or $e^{\hat{\beta}_2}$.
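
When both variables differ, the difference in log odds is $\hat{\beta}_1 (3.8 - 3.5) + \hat{\beta}_2 (1 - 0)$, and the odds ratio is the exponential of that difference. A minimal sketch using the coefficients from part a, with illustrative variable names:

```r
# Coefficients copied from the fitted model in part a
b.gpa <- 6.1324
b.sexm <- -1.1697

# Difference in log odds: male with 3.8 GPA minus female with 3.5 GPA
log.or <- b.gpa * (3.8 - 3.5) + b.sexm * (1 - 0)

# Odds ratio
odds.ratio <- exp(log.or)
round(odds.ratio, 3)
```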

### i.

Interpret the coefficient of $\mathit{GPA}$ in terms of an odds ratio.

*Write your answer here. Double-click to edit.*

## Example 3

__Does the slope of GPA differ by sex?__

In other words, is the odds ratio of acceptance for a one-unit increase in GPA different for male students versus female students?

### a.

To allow the slope of GPA to differ by sex, we introduce an interaction term:

$$ \text{logit}(\pi) = \beta_0 + \beta_1 \mathit{GPA} + \beta_{2} \mathit{SexM} + \beta_3 (\mathit{GPA} \times \mathit{SexM}) $$

Fit this model.

* You should find the fitted model is

    $$ \text{logit}(\hat{\pi}) = -24.385 + 7.083 \mathit{GPA} + 4.901 \mathit{SexM} - 1.709 (\mathit{GPA} \times \mathit{SexM}) $$

* Therefore, for male students, the model is
    \begin{align*}
    \text{logit}(\hat{\pi}) & = -24.385 + 7.083 \mathit{GPA} + 4.901 (1) - 1.709 (\mathit{GPA} \times (1))\\
    & = -19.484 + 5.374 \mathit{GPA}
    \end{align*}
    and for female students,
    \begin{align*}
    \text{logit}(\hat{\pi}) & = -24.385 + 7.083 \mathit{GPA} + 4.901 (0) - 1.709 (\mathit{GPA} \times (0))\\
    & = -24.385 + 7.083 \mathit{GPA}
    \end{align*}
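
A minimal sketch of this fit is shown below. In an R formula, `GPA * Sex` expands to `GPA + Sex + GPA:Sex`, so the interaction term is included automatically. The object name `fit.int` is chosen here for illustration.

```r
library(Stat2Data)
data(MedGPA)

# Logistic regression with main effects and the GPA-by-sex interaction
fit.int <- glm(Acceptance ~ GPA * Sex, data = MedGPA, family = binomial)
summary(fit.int)
```

The coefficient labeled `GPA:SexM` in the output is the estimate of $\beta_3$.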

### b.

Plot the fitted model (in probability form) for male students and the fitted model for female students.
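
The plotting code from Example 2 can be adapted to the interaction model. The sketch below assumes the model is stored under the illustrative name `fit.int` and adds a legend.

```r
library(Stat2Data)
data(MedGPA)

# Refit the interaction model from part a (fit.int is an illustrative name)
fit.int <- glm(Acceptance ~ GPA * Sex, data = MedGPA, family = binomial)

# x values for the plots
xx <- seq(from = 2, to = 5, by = 0.01)

# Fitted probabilities for male and female students
male.yy <- predict(fit.int, newdata = data.frame(GPA = xx, Sex = 'M'), type = 'response')
female.yy <- predict(fit.int, newdata = data.frame(GPA = xx, Sex = 'F'), type = 'response')

# Draw both curves on an initially empty plot
plot(NULL, xlim = c(2, 5), ylim = c(0, 1), xlab = 'GPA', ylab = 'P(Acceptance = 1)')
lines(xx, male.yy, col = 'red')
lines(xx, female.yy, col = 'black')
legend('topleft', legend = c('Male', 'Female'), col = c('red', 'black'), lty = 1)
```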

### c.

Estimate the odds ratio of acceptance for a unit increase in GPA for females.

### d.

Estimate the odds ratio of acceptance for a unit increase in GPA for males.
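
With the interaction term, the GPA slope depends on sex: it is $\hat{\beta}_1$ for female students and $\hat{\beta}_1 + \hat{\beta}_3$ for male students. A minimal sketch of both odds ratios, using the coefficients from part a and illustrative variable names:

```r
# Coefficients copied from the fitted interaction model in part a
b.gpa <- 7.083
b.int <- -1.709

# Odds ratio for a one-unit GPA increase, female students (SexM = 0)
or.gpa.female <- exp(b.gpa)

# Odds ratio for a one-unit GPA increase, male students (SexM = 1)
or.gpa.male <- exp(b.gpa + b.int)

c(female = or.gpa.female, male = or.gpa.male)
```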

### e.

Answer the motivating question for this example. Justify your answer.

*Write your answer here. Double-click to edit.*

### f.

Use the fitted model to estimate the probability of acceptance for a male with a 3.5 GPA.
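
One way to do this by hand is to plug the GPA into the fitted equation for male students from part a and convert the log odds to a probability. The sketch below uses `plogis()`, the inverse of the logit function in base R. The variable names are illustrative.

```r
# Fitted equation for male students from part a
b0.male <- -19.484
b.gpa.male <- 5.374

# Log odds of acceptance for a male student with a 3.5 GPA
log.odds <- b0.male + b.gpa.male * 3.5

# Convert log odds to probability: plogis(x) = exp(x) / (1 + exp(x))
p.hat <- plogis(log.odds)
round(p.hat, 3)
```

Calling `predict()` on the fitted model with `type = 'response'` should give the same value up to rounding in the coefficients.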

### g.

Should we drop both of the predictors that include sex?
Use a likelihood ratio test to compare the current model to the model with only GPA as a predictor.
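
A minimal sketch of this nested-model comparison is shown below, using `anova()` with `test = "Chisq"`. The object names `fit.reduced`, `fit.full`, and `lrt` are illustrative.

```r
library(Stat2Data)
data(MedGPA)

# Reduced model: GPA only
fit.reduced <- glm(Acceptance ~ GPA, data = MedGPA, family = binomial)

# Full model: GPA, sex, and their interaction
fit.full <- glm(Acceptance ~ GPA * Sex, data = MedGPA, family = binomial)

# Likelihood ratio test: the drop in deviance is compared to a chi-square
# distribution with df equal to the number of extra terms in the full model
lrt <- anova(fit.reduced, fit.full, test = "Chisq")
lrt
```

The second row of the output reports the drop in deviance, its degrees of freedom, and the p-value.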

*Write your answer here. Double-click to edit.*