# Number of Cigarettes Smoked and Lung Cancer

**Date:** 2021-12-01

**Reference:** M249, Book 1, Part 2

In [1]:
suppressPackageStartupMessages(library(tidyverse))
library(R249)

## Summary

## Get the data

In [2]:
(dat <- as_tibble(read.csv(file = "../../data/smoking2.csv")))

count,dose,outcome
<int>,<chr>,<chr>
32,50+,case
13,50+,control
136,25-49,case
71,25-49,control
196,15-24,case
190,15-24,control
250,5-14,case
293,5-14,control
35,0,case
82,0,control


## Prepare the data

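Cast the `dose` and `outcome` columns to factors, ordering the levels so that dose `0` and `control` come first, then sort the rows by dose and outcome.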

In [3]:
labdose <- c("0", "5-14", "15-24", "25-49", "50+")
labout <- c("control", "case")
(sorteddat <- dat %>%
    mutate(dose = factor(dose, levels = labdose)) %>%
    mutate(outcome = factor(outcome, levels = labout)) %>%
    arrange(dose, outcome))

count,dose,outcome
<int>,<fct>,<fct>
82,0,control
35,0,case
293,5-14,control
250,5-14,case
190,15-24,control
196,15-24,case
71,25-49,control
136,25-49,case
13,50+,control
32,50+,case


Pull the `count` column as a vector and reshape it into a 5 x 2 matrix, filled by row, with doses as rows and outcomes as columns.

In [4]:
datmat <- sorteddat$count %>%
    matrix(nrow = 5, ncol = 2, byrow = TRUE, dimnames = list(labdose, labout))
datmat

dose,control,case
0,82,35
5-14,293,250
15-24,190,196
25-49,71,136
50+,13,32
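
Filling a matrix by row only gives the right table if the rows are sorted exactly as expected. As a cross-check, the same table could be built with base R's `xtabs()`, which places each count by its factor levels rather than by row position. This is a minimal sketch: the data frame is retyped from the listing above instead of being read from `smoking2.csv`, and the name `tab` is chosen here for illustration.

```r
# Counts retyped from the listing above, in the original (unsorted) order.
dat <- data.frame(
  count = c(32, 13, 136, 71, 196, 190, 250, 293, 35, 82),
  dose = rep(c("50+", "25-49", "15-24", "5-14", "0"), each = 2),
  outcome = rep(c("case", "control"), times = 5)
)

labdose <- c("0", "5-14", "15-24", "25-49", "50+")
labout <- c("control", "case")
dat$dose <- factor(dat$dose, levels = labdose)
dat$outcome <- factor(dat$outcome, levels = labout)

# Cross-tabulate: rows follow the dose levels, columns the outcome levels,
# regardless of the row order of `dat`.
tab <- xtabs(count ~ dose + outcome, data = dat)
tab
```

Because the cells are matched on factor levels, the result does not depend on the `arrange()` step.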


## Dose-specific odds

Calculate the odds of being a case, and the log(odds), at each dose.

In [5]:
odds(datmat)

dose,odds,log(odds)
0,,
5-14,0.8532423,-0.15871169
15-24,1.0315789,0.03109059
25-49,1.915493,0.64997501
50+,2.4615385,0.90078655
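
The odds at each dose are the number of cases divided by the number of controls at that dose. The sketch below recomputes them in base R, without `R249`, from counts retyped from the matrix shown earlier; `odds_by_hand` and `res` are names chosen here for illustration.

```r
# Counts retyped from the matrix above (rows = dose, columns = outcome).
labdose <- c("0", "5-14", "15-24", "25-49", "50+")
labout <- c("control", "case")
datmat <- matrix(c(82, 35, 293, 250, 190, 196, 71, 136, 13, 32),
                 nrow = 5, ncol = 2, byrow = TRUE,
                 dimnames = list(labdose, labout))

# Odds of being a case at each dose: cases / controls.
odds_by_hand <- datmat[, "case"] / datmat[, "control"]

# Collect the odds and their natural logarithms side by side.
res <- cbind(odds = odds_by_hand, `log(odds)` = log(odds_by_hand))
res
```

Unlike the `odds()` output above, this sketch also returns a value for the reference dose `0`.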


## Dose-specific odds ratios

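Calculate the odds ratio for each dose relative to the reference dose (`0`), together with the standard error of the log odds ratio and 95% confidence bounds (`lcb`, `ucb`).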

In [6]:
oddsratio(datmat)

dose,oddsratio,stderr,lcb,ucb
0,,,,
5-14,1.999025,0.2194983,1.300112,3.073658
15-24,2.416842,0.2261233,1.551571,3.764652
25-49,4.487726,0.2494073,2.75252,7.316818
50+,5.767033,0.385927,2.706767,12.287232
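
The figures above are consistent with the usual large-sample calculation: each odds ratio compares a dose group with the dose `0` group, the standard error on the log scale is the square root of the sum of the reciprocals of the four counts involved, and the bounds are a 95% normal interval on the log scale, exponentiated. The sketch below applies that calculation in base R. It is an illustration under those assumptions and is not the source of `oddsratio()`; `or_by_hand` and the intermediate names are chosen here.

```r
# Counts retyped from the matrix above (rows = dose, columns = outcome).
labdose <- c("0", "5-14", "15-24", "25-49", "50+")
labout <- c("control", "case")
datmat <- matrix(c(82, 35, 293, 250, 190, 196, 71, 136, 13, 32),
                 nrow = 5, ncol = 2, byrow = TRUE,
                 dimnames = list(labdose, labout))

ref <- datmat["0", ]                   # reference (unexposed) group
exposed <- datmat[-1, , drop = FALSE]  # the four exposed dose groups

# Odds ratio: odds of being a case at each dose / odds at the reference dose.
or <- (exposed[, "case"] / exposed[, "control"]) /
  (ref[["case"]] / ref[["control"]])

# Standard error of log(odds ratio): sqrt of the sum of reciprocal counts.
se <- sqrt(1 / exposed[, "case"] + 1 / exposed[, "control"] +
           1 / ref[["case"]] + 1 / ref[["control"]])

# 95% confidence bounds, computed on the log scale and transformed back.
z <- qnorm(0.975)
or_by_hand <- cbind(oddsratio = or,
                    stderr = se,
                    lcb = exp(log(or) - z * se),
                    ucb = exp(log(or) + z * se))
or_by_hand
```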


## Chi-squared test for no linear trend

In [7]:
chisq_lineartrend(datmat)

,chisq,pval
result,43.83024,3.581287e-11
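
A chi-squared test for linear trend assigns a numeric score to each dose group and compares the observed score total among the cases with its expected value under no association; the statistic is referred to a chi-squared distribution with 1 degree of freedom. The sketch below shows one common form of this calculation in base R. The scores that `chisq_lineartrend()` assigns to the dose groups are not shown in this notebook, so the equally spaced scores used here are an assumption, and the statistic from this sketch need not equal the value reported above.

```r
# Counts retyped from the matrix above (rows = dose, columns = outcome).
labdose <- c("0", "5-14", "15-24", "25-49", "50+")
labout <- c("control", "case")
datmat <- matrix(c(82, 35, 293, 250, 190, 196, 71, 136, 13, 32),
                 nrow = 5, ncol = 2, byrow = TRUE,
                 dimnames = list(labdose, labout))

# ASSUMPTION: equally spaced scores 0, 1, 2, 3, 4 for the five dose groups.
score <- seq_len(nrow(datmat)) - 1

n_i <- rowSums(datmat)             # group sizes
n_case <- sum(datmat[, "case"])    # total cases
n_ctrl <- sum(datmat[, "control"]) # total controls
N <- sum(n_i)                      # grand total

# Observed and expected score totals among the cases.
observed <- sum(score * datmat[, "case"])
expected <- n_case / N * sum(score * n_i)

# Variance of the observed total under the null hypothesis.
variance <- n_case * n_ctrl *
  (N * sum(score^2 * n_i) - sum(score * n_i)^2) / (N^2 * (N - 1))

chisq <- (observed - expected)^2 / variance
pval <- pchisq(chisq, df = 1, lower.tail = FALSE)
c(chisq = chisq, pval = pval)
```

The statistic is unchanged by shifting or rescaling the scores, so only their relative spacing matters.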
