# Background

# Set environment

In [49]:
library("tidyverse")
library("nnet")
library(IRdisplay)

# Data

In [52]:
source("./data/cars.R")
display(cars)

set.seed(0)
display(cars_n1 %>% sample_n(10))

Unnamed: 0,women,men
1:no/little,40,65
2:important,47,47
3:very-important,63,38


Unnamed: 0,response,gender
270,3:very-important,men
80,2:important,women
111,3:very-important,women
171,1:no/little,men
269,3:very-important,men
60,2:important,women
265,3:very-important,men
277,3:very-important,men
193,1:no/little,men
184,1:no/little,men
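The file `./data/cars.R` is not shown, so how `cars_n1` is built is unknown. As a minimal sketch, assuming it is simply the count table expanded to one row per respondent, the long format could be rebuilt from the six counts as follows (`counts_df` and `long` are illustrative names, not objects from the original script):

```r
# Aggregated counts, women first, then men (same order as the count table)
counts_df <- data.frame(
    count    = c(40, 47, 63, 65, 47, 38),
    gender   = rep(c("women", "men"), each = 3),
    response = rep(c("1:no/little", "2:important", "3:very-important"), times = 2))

# Repeat each row `count` times to get one row per respondent
long <- counts_df[rep(seq_len(nrow(counts_df)), times = counts_df$count),
                  c("response", "gender")]
rownames(long) <- NULL

# Cross-tabulating the long data recovers the original counts
tab <- table(long$response, long$gender)
tab
```

Under this assumed layout, rows 151 to 215 are men answering 1:no/little and rows 263 to 300 are men answering 3:very-important, which is consistent with the sampled rows printed above.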


**Notation:**

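There are $J = 3$ response categories, with category 1 (no/little) as the reference.

$\pi_j$ is the probability that a response falls in category $j$:

- $\pi_1 = P(\text{1:no/little})$

- $\pi_2 = P(\text{2:important})$

- $\pi_3 = P(\text{3:very-important})$

$x$ is the gender indicator: $x = 0$ for women (the reference) and $x = 1$ for men.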

**model:**

$$\log \big( \frac{\pi_2}{\pi_1} \big) = \beta_{02} + \beta_{12} x$$

$$\log \big( \frac{\pi_3}{\pi_1} \big) = \beta_{03} + \beta_{13} x$$

$$OR_2 = \frac{
    \text{generalized odds(j = 2 vs j = 1) for x = 1}
}{
    \text{generalized odds(j = 2 vs j = 1) for x = 0}
} = \frac{
    \Big( \frac{\pi_2}{\pi_1} \Big)_{x = 1}
}{
    \Big( \frac{\pi_2}{\pi_1} \Big)_{x = 0}
} = e^{\beta_{12}}$$

$$OR_3 = \frac{
    \text{generalized odds(j = 3 vs j = 1) for x = 1}
}{
    \text{generalized odds(j = 3 vs j = 1) for x = 0}
} = \frac{
    \Big( \frac{\pi_3}{\pi_1} \Big)_{x = 1}
}{
    \Big( \frac{\pi_3}{\pi_1} \Big)_{x = 0}
} = e^{\beta_{13}}$$
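Why each odds ratio reduces to a single exponentiated slope can be seen by subtracting the log odds at $x = 0$ from the log odds at $x = 1$. This is a short derivation using only the model equations above, for $j = 2, 3$:

$$\log \Big( \frac{\pi_j}{\pi_1} \Big)_{x = 1} - \log \Big( \frac{\pi_j}{\pi_1} \Big)_{x = 0} = (\beta_{0j} + \beta_{1j}) - \beta_{0j} = \beta_{1j}$$

$$\Rightarrow OR_j = \exp(\beta_{1j})$$

The intercept $\beta_{0j}$ cancels, so it carries the baseline log odds at $x = 0$ while $\beta_{1j}$ carries the whole difference between the two groups.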


**The data**

| Category         | women | men |   
|------------------|-------|-----|   
| 1:no/little      |  40   |  65 |   
| 2:important      |  47   |  47 |   
| 3:very-important |  63   |  38 |   

**Calculation**  

Total = 40 + 47 + 63 + 65 + 47 + 38 = 300

P(women) = (40 + 47 + 63) / 300 = 1/2  


- P(1:no/little | women) = 40 / 150 = 0.2667
- P(2:important | women) = 47 / 150 = 0.3133
- P(3:very-important | women) = 63 / 150 = 0.4200

P(men) = (65 + 47 + 38) / 300 = 1/2  

- P(1:no/little | men) = 65 / 150 = 0.4333
- P(2:important | men) = 47 / 150 = 0.3133
- P(3:very-important | men) = 38 / 150 = 0.2533
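The conditional probabilities above can be reproduced in a few lines. This is a minimal sketch that re-enters the counts by hand; `counts` and `cond_prob` are illustrative names, not objects from the notebook:

```r
# Counts from the table above: rows are response categories, columns are gender
counts <- matrix(
    c(40, 47, 63,    # women
      65, 47, 38),   # men
    nrow = 3,
    dimnames = list(
        response = c("1:no/little", "2:important", "3:very-important"),
        gender   = c("women", "men")))

# Dividing each column by its total gives P(response | gender)
cond_prob <- prop.table(counts, margin = 2)
round(cond_prob, 4)
```

Each column of `cond_prob` sums to 1 because the proportions are taken within gender.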


**generalized odds**

| Category         | women (x=0)       | men (x=1)        |   
|------------------|-------------------|------------------|   
| 1:no/little      |  1                |  1               |   
| 2:important      |  0.3133 / 0.2667  |  0.3133 / 0.4333 |   
| 3:very-important |  0.4200 / 0.2667  |  0.2533 / 0.4333 |   


| Category         | women (x=0) | men (x=1) |   
|------------------|-------------|-----------|   
| 1:no/little      |  1          |  1        |   
| 2:important      |  1.1747     |  0.7231   |   
| 3:very-important |  1.5748     |  0.5846   |   
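The generalized odds can be computed the same way, by dividing every category's probability by the reference category's probability within each gender. A minimal sketch, with `counts`, `cond_prob` and `gen_odds` as illustrative names:

```r
counts <- matrix(
    c(40, 47, 63,    # women
      65, 47, 38),   # men
    nrow = 3,
    dimnames = list(
        response = c("1:no/little", "2:important", "3:very-important"),
        gender   = c("women", "men")))

cond_prob <- prop.table(counts, margin = 2)

# Divide each column by its reference-category entry ("1:no/little")
gen_odds <- sweep(cond_prob, 2, cond_prob["1:no/little", ], "/")
round(gen_odds, 4)
```

Working from unrounded probabilities gives 1.1750 and 1.5750 for women, which differ from the table in the fourth decimal only because the table divides probabilities already rounded to four places.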

**generalized odds ratio**

The generalized odds ratio for category $j$ is the $j$-th odds for men ($x = 1$) divided by the $j$-th odds for women ($x = 0$).


$OR_{j = 2} = \frac{
    \text{odds of 2:important | x = 1}
}{
    \text{odds of 2:important | x = 0}
} = \frac{0.7231}{1.1747} = 0.6156$

$OR_{j = 3} = \frac{
    \text{odds of 3:very important | x = 1}
}{
    \text{odds of 3:very important | x = 0}
} = \frac{0.5846}{1.5748} = 0.3712$

**From the model**

$OR_{j = 2} = e^{\beta_{12}} = 0.6156 \Rightarrow \beta_{12} = -0.485$

$OR_{j = 3} = e^{\beta_{13}} = 0.3712 \Rightarrow \beta_{13} = -0.991$

Setting $x = 0$ (women) leaves only the intercept, so each intercept is the log of the women's generalized odds:

$\log \big( \frac{\pi_2}{\pi_1} \big)_{x = 0} = \beta_{02} \Rightarrow \beta_{02} = \log(1.1747) = 0.161$

$\log \big( \frac{\pi_3}{\pi_1} \big)_{x = 0} = \beta_{03} \Rightarrow \beta_{03} = \log(1.5748) = 0.454$
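The four coefficients can be obtained directly from the counts, which gives a reference to compare against a fitted model. A minimal sketch, with `counts`, `odds_women`, `odds_men`, `beta0` and `beta1` as illustrative names:

```r
counts <- matrix(
    c(40, 47, 63,    # women
      65, 47, 38),   # men
    nrow = 3,
    dimnames = list(
        response = c("1:no/little", "2:important", "3:very-important"),
        gender   = c("women", "men")))

# Generalized odds of categories 2 and 3 against the reference category 1
odds_women <- counts[-1, "women"] / counts[1, "women"]
odds_men   <- counts[-1, "men"]   / counts[1, "men"]

beta0 <- log(odds_women)             # intercepts: log odds at x = 0
beta1 <- log(odds_men / odds_women)  # slopes: log odds ratios, men vs women

round(cbind(intercept = beta0, gendermen = beta1), 4)
```

The group totals cancel in each ratio, so raw counts can be used in place of the conditional probabilities.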


# Perform nominal logistic regression

In [69]:
cars = data.frame(
    count    = c(40, 47, 63, 65, 47, 38),
    gender   = c("women", "women", "women", "men", "men", "men"),
    response = c("1:no/little", "2:important", "3:very-important", "1:no/little", "2:important", "3:very-important"),
    stringsAsFactors = TRUE)  # relevel() needs factors; R >= 4.0 no longer converts strings by default

# Use "1:no/little" and "women" as the reference levels
cars$response <- relevel(cars$response, ref = "1:no/little")
cars$gender   <- relevel(cars$gender,   ref = "women")

display(cars)

count,gender,response
40,women,1:no/little
47,women,2:important
63,women,3:very-important
65,men,1:no/little
47,men,2:important
38,men,3:very-important


In [80]:
cars$response <- relevel(cars$response, ref = "1:no/little")
cars$gender   <- relevel(cars$gender,   ref = "women")
test          <- multinom(response ~ gender, weights = count, data = cars)

# weights:  9 (4 variable)
initial  value 329.583687 
final  value 323.140601 
converged


In [81]:
test_sum = summary(test)
test_sum

Call:
multinom(formula = response ~ gender, data = cars, weights = count)

Coefficients:
                 (Intercept)  gendermen
2:important        0.1612636 -0.4855074
3:very-important   0.4542540 -0.9910557

Std. Errors:
                 (Intercept) gendermen
2:important        0.2151200 0.2879893
3:very-important   0.2021706 0.2873557

Residual Deviance: 646.2812 
AIC: 654.2812 
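`summary()` for a `multinom` fit prints coefficients and standard errors but no test statistics. A common follow-up is a Wald z-test, sketched here from the printed values rather than from the fitted object so that it runs on its own. `coefs`, `ses`, `z`, `p_value` and `or_ci` are illustrative names, and the normal approximation is an assumption of the Wald test:

```r
rows <- c("2:important", "3:very-important")
cols <- c("(Intercept)", "gendermen")

# Values copied from the summary output above
coefs <- matrix(c(0.1612636, -0.4855074,
                  0.4542540, -0.9910557),
                nrow = 2, byrow = TRUE, dimnames = list(rows, cols))
ses   <- matrix(c(0.2151200, 0.2879893,
                  0.2021706, 0.2873557),
                nrow = 2, byrow = TRUE, dimnames = list(rows, cols))

# Wald statistics and two-sided p-values
z       <- coefs / ses
p_value <- 2 * pnorm(abs(z), lower.tail = FALSE)

# Odds ratios (men vs women) with approximate 95% confidence intervals
b     <- coefs[, "gendermen"]
se    <- ses[, "gendermen"]
or_ci <- exp(cbind(OR    = b,
                   lower = b - qnorm(0.975) * se,
                   upper = b + qnorm(0.975) * se))

round(z, 3)
round(p_value, 4)
round(or_ci, 4)
```

The `OR` column reproduces the hand-computed odds ratios of about 0.615 and 0.371.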

In [72]:
test$deviance
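The reported deviance and AIC can be cross-checked by hand. A minimal sketch, assuming the fitted probabilities equal the observed within-gender proportions (the model has 4 parameters for 4 free probabilities, so it reproduces the table exactly); `fitted_prob`, `neg_loglik`, `resid_dev` and `aic` are illustrative names:

```r
count        <- c(40, 47, 63, 65, 47, 38)   # women (3 categories), then men
group_totals <- rep(c(150, 150), each = 3)

# Fitted probabilities under the assumption stated above
fitted_prob <- count / group_totals

# Negative multinomial log-likelihood, weighted by the counts
neg_loglik <- -sum(count * log(fitted_prob))

resid_dev <- 2 * neg_loglik      # compare with "Residual Deviance"
aic       <- resid_dev + 2 * 4   # 4 estimated coefficients

c(neg_loglik = neg_loglik, deviance = resid_dev, AIC = aic)
```

`neg_loglik` corresponds to the `final value` printed during fitting, and the deviance is twice that value.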

In [55]:
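# `cars` here is the 3 x 2 table of counts loaded from ./data/cars.R
# (this cell ran before `cars` was rebuilt as a data frame above)
chisq.test(cars)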


	Pearson's Chi-squared test

data:  cars
X-squared = 12.14, df = 2, p-value = 0.002311
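The Pearson statistic can be reproduced from the counts by comparing each cell with its expected value under independence of response and gender. A minimal sketch, with `counts`, `expected`, `x2`, `df` and `p_value` as illustrative names:

```r
counts <- matrix(
    c(40, 47, 63,    # women
      65, 47, 38),   # men
    nrow = 3,
    dimnames = list(
        response = c("1:no/little", "2:important", "3:very-important"),
        gender   = c("women", "men")))

# Expected counts under independence: row total * column total / grand total
expected <- outer(rowSums(counts), colSums(counts)) / sum(counts)

x2      <- sum((counts - expected)^2 / expected)
df      <- (nrow(counts) - 1) * (ncol(counts) - 1)
p_value <- pchisq(x2, df, lower.tail = FALSE)

c(X_squared = x2, df = df, p_value = p_value)
```

The 2:important row contributes nothing to the statistic because its observed and expected counts are both 47 in each column.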
