# Contingency tables vs logistic regression

In this notebook we will look at a simple example of logistic regression and compare the results with what we get from an analysis of a contingency table. The example data comes from the [SOM Survey](https://www.gu.se/en/som-institute/the-som-surveys) performed annually by Göteborgs universitet. We will look at two variables from the 2015 survey and will treat the 1499 respondents as a random sample of the population. The dataset we will work with has been slightly edited.

Start by loading the dataset.

In [None]:
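## set the default plot size for the notebook
options(repr.plot.width=14, repr.plot.height=8)
## load packages quietly; library() stops with an error if a package is missing
suppressMessages(library(dplyr))
suppressMessages(library(ggplot2))
## read the edited SOM 2015 extract and list its variables
data <- readRDS("data_from_som2015.rds")
names(data)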

Cross-tabulate the two variables, `sex` and `faith`, and compute the row proportions.

In [None]:
## table with margins
tab <- table(data$sex,data$faith)
xtab <- addmargins(tab)
print("observed")
print(xtab)
print("proportions")
ptab <- prop.table(tab,1)
print(ptab)

In [None]:
## compute the OR based on the contingency table
## (assumes row 1 of ptab is men, row 2 is women, and column 2 is the outcome of interest)
odds_men <- ptab[1,2]/(1-ptab[1,2])
print(paste("odds for men:", round(odds_men,2)))
odds_women <- ptab[2,2]/(1-ptab[2,2])
print(paste("odds for women:", round(odds_women,2)))
## odds ratio of women relative to men
odds_ratio <- odds_women/odds_men
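
For reference, the quantities computed in the cell above can be written out as follows. This is a sketch under the same assumption the code makes, namely that row 1 of `ptab` is men, row 2 is women, and column 2 is the outcome of interest. The symbols $p_m$ and $p_w$ are introduced here for illustration and stand for `ptab[1,2]` and `ptab[2,2]`.

$$
\text{odds}_m = \frac{p_m}{1-p_m}, \qquad
\text{odds}_w = \frac{p_w}{1-p_w}, \qquad
\text{OR} = \frac{\text{odds}_w}{\text{odds}_m} = \frac{p_w\,(1-p_m)}{p_m\,(1-p_w)}
$$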

In [None]:
## next we fit a logistic regression model
mod <- glm(as.factor(faith)~1+sex,data=data,family=binomial)
summary(mod)

In [None]:
## exponentiate the coefficients to move from the log-odds scale to the odds scale
print(round(exp(mod$coefficients),3))
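
It may help to spell out why the coefficients are exponentiated. The following is a sketch that assumes `glm()`'s default treatment coding, with the first level of `sex` as the reference group ($x = 0$) and the other level coded $x = 1$. The symbols $p$, $x$, $\beta_0$ and $\beta_1$ are introduced here for illustration. With a factor response, `glm()` treats the first level as failure, so $p$ is the probability of the second level of `faith`.

$$
\log\frac{p}{1-p} = \beta_0 + \beta_1 x
$$

Setting $x = 0$ and $x = 1$ gives

$$
e^{\beta_0} = \text{odds in the reference group}, \qquad
e^{\beta_0 + \beta_1} = \text{odds in the other group}, \qquad
e^{\beta_1} = \frac{e^{\beta_0+\beta_1}}{e^{\beta_0}} = \text{OR}
$$

so the exponentiated intercept is an odds and the exponentiated coefficient for `sex` is an odds ratio.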

Compare the exponentiated coefficient for sex with the odds ratio calculated from the contingency table:


In [None]:
print(round(odds_ratio,3))
## intercept + sex coefficient: the log-odds for the non-reference level of sex
sum(mod$coefficients)
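
The agreement between the two approaches can also be checked without the survey file. The sketch below uses a small 2x2 table of counts that are invented purely for illustration (they are not the SOM data), and the names `counts`, `toy`, `toy_mod`, `or_table` and `or_model` are hypothetical. It computes the odds ratio as a cross-product ratio and then refits the same kind of model on the expanded data.

```r
## Hypothetical 2x2 counts, invented for illustration (NOT the SOM data)
counts <- matrix(c(300, 200,
                   250, 250),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(sex = c("man", "woman"),
                                 faith = c("no", "yes")))

## Odds ratio from the table, written as a cross-product ratio
or_table <- (counts["woman", "yes"] * counts["man", "no"]) /
            (counts["woman", "no"]  * counts["man", "yes"])

## Expand the counts to one row per respondent, as in the survey data
toy <- as.data.frame(as.table(counts))
toy <- toy[rep(seq_len(nrow(toy)), toy$Freq), c("sex", "faith")]

## Fit the logistic regression; "man" is the reference level of sex
toy_mod <- glm(faith ~ sex, data = toy, family = binomial)

## The exponentiated slope should reproduce the table odds ratio
or_model <- unname(exp(coef(toy_mod)["sexwoman"]))
print(c(table = or_table, model = or_model))
```

With a single binary predictor the model is saturated, so it reproduces the observed proportions in each group; that is why the two numbers coincide up to numerical tolerance.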