# Stratified Analysis: Marital Status and Drink-Driving

## Summary

## Introduction

## Analysis

In [1]:
from opyn.generic.pandasloader import PandasLoader
from opyn.stats import epidemeology as epyn
from pandas.api.types import CategoricalDtype

### Load the data

In [2]:
f = "drinkdriving"
pdloader = PandasLoader()
# pdloader.get_description(f)

In [3]:
dat = pdloader.get(f)
dat

| | count | exposure | outcome | level |
|---|---|---|---|---|
| 0 | 4 | over 100mg | case | married |
| 1 | 5 | over 100mg | control | married |
| 2 | 5 | under 100mg | case | married |
| 3 | 103 | under 100mg | control | married |
| 4 | 10 | over 100mg | case | not married |
| 5 | 3 | over 100mg | control | not married |
| 6 | 5 | under 100mg | case | not married |
| 7 | 43 | under 100mg | control | not married |


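Copy the dataframe so that the loaded data is left unmodified, then recode the `exposure`, `outcome`, and `level` columns as ordered categorical data. The category order (unexposed before exposed, control before case) determines how the rows sort in the next step.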

In [4]:
copieddat = dat.copy(deep=True)
# Category order sets the sort order: reference category first.
copieddat["exposure"] = copieddat["exposure"].astype(
    CategoricalDtype(["under 100mg", "over 100mg"], ordered=True)
)
copieddat["outcome"] = copieddat["outcome"].astype(
    CategoricalDtype(["control", "case"], ordered=True)
)
copieddat["level"] = copieddat["level"].astype(
    CategoricalDtype(["married", "not married"], ordered=True)
)

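Sort the dataframe by `level`, `exposure`, and `outcome` so that the counts appear in the order the reshape expects: stratum first, then exposure, then outcome.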

In [5]:
sorteddat = copieddat.sort_values(by=["level", "exposure", "outcome"])

Extract the `count` column as a `2x2x2` `ndarray` whose axes are stratum (`level`), `exposure`, and `outcome`.

In [6]:
resarr = sorteddat["count"].to_numpy().reshape((2, 2, 2))
resarr

array([[[103,   5],
        [  5,   4]],

       [[ 43,   5],
        [  3,  10]]], dtype=int64)

This reshaped `ndarray` is what we pass to the analysis functions that follow.
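
As a cross-check on the layout, the sort and reshape can be mimicked in plain Python. This is only a sketch: the category lists mirror the `CategoricalDtype` definitions, and `category_key`, `tables`, and `alphabetical` are hypothetical names, not part of `opyn` or pandas. It also shows why the ordered categories matter: a plain alphabetical sort puts the exposed rows and the cases first.

```python
# Rows are (count, exposure, outcome, level), copied from the data above.
rows = [
    (4, "over 100mg", "case", "married"),
    (5, "over 100mg", "control", "married"),
    (5, "under 100mg", "case", "married"),
    (103, "under 100mg", "control", "married"),
    (10, "over 100mg", "case", "not married"),
    (3, "over 100mg", "control", "not married"),
    (5, "under 100mg", "case", "not married"),
    (43, "under 100mg", "control", "not married"),
]

# Category orders, matching the CategoricalDtype definitions.
LEVELS = ["married", "not married"]
EXPOSURES = ["under 100mg", "over 100mg"]
OUTCOMES = ["control", "case"]


def category_key(row):
    """Sort key: position of each label within its category order."""
    _, exposure, outcome, level = row
    return (LEVELS.index(level), EXPOSURES.index(exposure), OUTCOMES.index(outcome))


counts = [row[0] for row in sorted(rows, key=category_key)]

# Equivalent of reshape((2, 2, 2)): tables[stratum][exposure][outcome].
tables = [
    [counts[4 * s + 2 * e : 4 * s + 2 * e + 2] for e in range(2)]
    for s in range(2)
]

# Sorting the labels as plain strings gives an exposed-first, case-first layout.
alphabetical = [row[0] for row in sorted(rows, key=lambda r: (r[3], r[1], r[2]))]

print(tables)
print(alphabetical)
```

With the categorical order, `tables[0][1][1]` is the number of married, exposed cases.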

### Stratum-specific odds ratio

In [7]:
epyn.oddsratio(resarr[0])  # married

| | oddsratio | stderr | lower | upper |
|---|---|---|---|---|
| Exposed1 (-) | 1.0 | 0.0 | | |
| Exposed2 (+) | 16.48 | 0.812225 | 3.354211 | 80.969975 |


In [8]:
epyn.oddsratio(resarr[1])  # not married

| | oddsratio | stderr | lower | upper |
|---|---|---|---|---|
| Exposed1 (-) | 1.0 | 0.0 | | |
| Exposed2 (+) | 28.666667 | 0.810302 | 5.856619 | 140.31607 |
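
The figures above are consistent with the usual cross-product odds ratio and a 95% Wald interval computed on the log scale. The sketch below assumes that method (the notebook does not show how `epyn.oddsratio` is implemented) and uses only the standard library; `odds_ratio_wald` is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def odds_ratio_wald(table, alpha=0.05):
    """Odds ratio, standard error of its log, and Wald interval for a 2x2 table.

    The table is indexed as table[exposure][outcome], with the unexposed row
    and the control column first.
    """
    # d: unexposed controls, c: unexposed cases,
    # b: exposed controls,   a: exposed cases
    (d, c), (b, a) = table
    estimate = (a * d) / (b * c)
    stderr = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    lower = math.exp(math.log(estimate) - z * stderr)
    upper = math.exp(math.log(estimate) + z * stderr)
    return estimate, stderr, lower, upper


married = [[103, 5], [5, 4]]
not_married = [[43, 5], [3, 10]]
print(odds_ratio_wald(married))
print(odds_ratio_wald(not_married))
```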


### Unadjusted odds ratio

In [9]:
epyn.crude_oddsratio(resarr)

| | oddsratio | stderr | lower | upper |
|---|---|---|---|---|
| Exposed1 (-) | 1.0 | 0.0 | | |
| Exposed2 (+) | 25.55 | 0.550707 | 8.682174 | 75.188827 |
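
The crude estimate ignores marital status: summing the two stratum tables cell by cell and taking the cross-product ratio of the pooled table gives the same value. A minimal sketch, assuming that is all the unadjusted estimate involves (`pooled` and `crude_or` are hypothetical names):

```python
strata = [
    [[103, 5], [5, 4]],   # married: rows unexposed/exposed, columns control/case
    [[43, 5], [3, 10]],   # not married
]

# Collapse over strata by summing cell by cell.
pooled = [
    [sum(table[e][o] for table in strata) for o in range(2)]
    for e in range(2)
]

# d: unexposed controls, c: unexposed cases, b: exposed controls, a: exposed cases
(d, c), (b, a) = pooled
crude_or = (a * d) / (b * c)
print(pooled, crude_or)
```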


### Tarone's test of homogeneity

In [10]:
epyn.test_equalodds(resarr)

| | chisq | pval |
|---|---|---|
| result | 0.235575 | 0.627421 |
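
For reference, the statistic can be approximated by hand as a Breslow-Day statistic with Tarone's correction, fitting each stratum under the Mantel-Haenszel common odds ratio. This is a sketch under that assumption, not a description of how `epyn.test_equalodds` works internally; all function names are hypothetical, and the p-value shortcut is valid only for two strata (one degree of freedom).

```python
import math

strata = [
    [[103, 5], [5, 4]],   # married: rows unexposed/exposed, columns control/case
    [[43, 5], [3, 10]],   # not married
]


def mantel_haenszel_or(tables):
    """Mantel-Haenszel estimate of the common odds ratio."""
    num = den = 0.0
    for (d, c), (b, a) in tables:
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    return num / den


def fitted_exposed_cases(table, psi):
    """Expected exposed-case count given the margins and odds ratio psi."""
    (d, c), (b, a) = table
    n1, m1, n = a + b, a + c, a + b + c + d  # exposed, case, stratum totals
    if psi == 1:
        return n1 * m1 / n
    # Solve A (n - n1 - m1 + A) = psi (n1 - A)(m1 - A) for A.
    qa = 1 - psi
    qb = n - n1 - m1 + psi * (n1 + m1)
    qc = -psi * n1 * m1
    root = math.sqrt(qb * qb - 4 * qa * qc)
    low, high = max(0, n1 + m1 - n), min(n1, m1)
    for candidate in ((-qb + root) / (2 * qa), (-qb - root) / (2 * qa)):
        if low <= candidate <= high:
            return candidate
    raise ValueError("no admissible root")


def tarone_test(tables):
    """Breslow-Day statistic with Tarone's correction, and its p-value (1 df)."""
    psi = mantel_haenszel_or(tables)
    breslow_day = resid_sum = var_sum = 0.0
    for table in tables:
        (d, c), (b, a) = table
        n1, m1, n = a + b, a + c, a + b + c + d
        fit = fitted_exposed_cases(table, psi)
        var = 1 / (
            1 / fit + 1 / (n1 - fit) + 1 / (m1 - fit) + 1 / (n - n1 - m1 + fit)
        )
        breslow_day += (a - fit) ** 2 / var
        resid_sum += a - fit
        var_sum += var
    chisq = breslow_day - resid_sum**2 / var_sum
    # Chi-square survival function for 1 degree of freedom (two strata only).
    pval = math.erfc(math.sqrt(chisq / 2))
    return chisq, pval


print(tarone_test(strata))
```

The large p-value gives no evidence that the odds ratio differs between the married and not married strata.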


### Adjusted odds ratio

In [11]:
epyn.adjusted_oddsratio(resarr)

| | oddsratio | stderr | lower | upper |
|---|---|---|---|---|
| result | 23.00061 | 0.57413 | 7.465154 | 70.866332 |
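
The adjusted estimate matches the Mantel-Haenszel common odds ratio, and the standard error is consistent with the Robins-Breslow-Greenland variance of its logarithm. The sketch below assumes those two formulas (the source does not state which estimator `epyn.adjusted_oddsratio` uses); `mantel_haenszel` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

strata = [
    [[103, 5], [5, 4]],   # married: rows unexposed/exposed, columns control/case
    [[43, 5], [3, 10]],   # not married
]


def mantel_haenszel(tables, alpha=0.05):
    """Mantel-Haenszel odds ratio with a Robins-Breslow-Greenland interval."""
    r_sum = s_sum = pr = ps_qr = qs = 0.0
    for (d, c), (b, a) in tables:
        n = a + b + c + d
        p, q = (a + d) / n, (b + c) / n
        r, s = a * d / n, b * c / n
        r_sum += r
        s_sum += s
        pr += p * r
        ps_qr += p * s + q * r
        qs += q * s
    estimate = r_sum / s_sum
    # Variance of log(estimate), Robins-Breslow-Greenland form.
    var_log = (
        pr / (2 * r_sum**2)
        + ps_qr / (2 * r_sum * s_sum)
        + qs / (2 * s_sum**2)
    )
    stderr = math.sqrt(var_log)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    lower = math.exp(math.log(estimate) - z * stderr)
    upper = math.exp(math.log(estimate) + z * stderr)
    return estimate, stderr, lower, upper


print(mantel_haenszel(strata))
```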


### Test of no association

In [12]:
epyn.test_nullodds(resarr)

| | chisq | pval |
|---|---|---|
| result | 40.511971 | 1.954151e-10 |
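
The statistic agrees with a Mantel-Haenszel chi-square test without continuity correction, which compares the observed number of exposed cases with its expectation under no association, summed over strata. A minimal sketch under that assumption (the helper name `mantel_haenszel_chisq` is hypothetical):

```python
import math

strata = [
    [[103, 5], [5, 4]],   # married: rows unexposed/exposed, columns control/case
    [[43, 5], [3, 10]],   # not married
]


def mantel_haenszel_chisq(tables):
    """Mantel-Haenszel chi-square (no continuity correction) and p-value."""
    observed = expected = variance = 0.0
    for (d, c), (b, a) in tables:
        n1, n0 = a + b, c + d  # exposed and unexposed totals
        m1, m0 = a + c, b + d  # case and control totals
        n = n1 + n0
        observed += a
        expected += n1 * m1 / n
        variance += n1 * n0 * m1 * m0 / (n**2 * (n - 1))
    chisq = (observed - expected) ** 2 / variance
    # Chi-square survival function for 1 degree of freedom.
    pval = math.erfc(math.sqrt(chisq / 2))
    return chisq, pval


print(mantel_haenszel_chisq(strata))
```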


## Discussion