# Dose-Response Analysis: Arsenic Exposure and Cancer

## Summary

## Introduction

## Analysis

In [1]:
from opyn.generic.pandasloader import PandasLoader
from opyn.stats import epidemeology as epyn
from pandas.api.types import CategoricalDtype

### Load the data

In [2]:
f = "arseniccancer"
pdloader = PandasLoader()
# pdloader.get_description(f)

In [3]:
dat = pdloader.get(f)
dat

Unnamed: 0,count,dose,outcome
0,12,>=100,case
1,9,>=100,control
2,36,15-100,case
3,66,15-100,control
4,45,0.25-15,case
5,104,0.25-15,control
6,14,<0.25,case
7,35,<0.25,control


Copy the dataframe so the loaded data is left unmodified, then recode the `dose` and `outcome` columns as ordered categorical data.

In [4]:
copieddat = dat.copy(deep=True)
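# Dose bands listed from lowest to highest exposure
doses = ["<0.25", "0.25-15", "15-100", ">=100"]
catdoses = CategoricalDtype(doses, ordered=True)
# "control" is ordered before "case"
catoutcomes = CategoricalDtype(["control", "case"], ordered=True)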
copieddat["dose"] = copieddat["dose"].astype(catdoses)
copieddat["outcome"] = copieddat["outcome"].astype(catoutcomes)

Sort the dataframe by `dose` and `outcome` so the rows follow the category order: lowest dose first, and controls before cases within each dose.

In [5]:
sorteddat = copieddat.sort_values(by=["dose", "outcome"])
sorteddat

Unnamed: 0,count,dose,outcome
7,35,<0.25,control
6,14,<0.25,case
5,104,0.25-15,control
4,45,0.25-15,case
3,66,15-100,control
2,36,15-100,case
1,9,>=100,control
0,12,>=100,case
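
The ordered categorical dtype is what makes this sort follow exposure level rather than alphabetical order. A minimal sketch of the difference, using a hypothetical `toy` frame that holds only the four dose labels (it is not the loaded dataset):

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

doses = ["<0.25", "0.25-15", "15-100", ">=100"]

# Hypothetical toy frame, for illustration only
toy = pd.DataFrame({"dose": [">=100", "15-100", "0.25-15", "<0.25"]})

# Plain strings sort lexicographically, so "<0.25" lands after "15-100"
string_order = toy["dose"].sort_values().tolist()

# An ordered categorical sorts by the declared category order instead
catdoses = CategoricalDtype(doses, ordered=True)
cat_order = toy["dose"].astype(catdoses).sort_values().tolist()

print(string_order)
print(cat_order)
```

Without the recoding step, the later `reshape` would pair counts with the wrong dose bands.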


Extract the `count` column as a `4x2` `ndarray`: one row per dose level, with controls in the first column and cases in the second.

In [6]:
resarr = sorteddat["count"].to_numpy().reshape((4, 2))
resarr

array([[ 35,  14],
       [104,  45],
       [ 66,  36],
       [  9,  12]], dtype=int64)

### Dose-specific odds ratio

In [7]:
epyn.oddsratio(resarr)

Unnamed: 0,oddsratio,stderr,lower,upper
Exposed1 (-),1.0,0.0,,
Exposed2 (+),1.081731,0.363094,0.530949,2.203869
Exposed3 (+),1.363636,0.37806,0.64997,2.860907
Exposed4 (+),3.333333,0.542627,1.150783,9.65526
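
For reference, these figures can be checked by hand with NumPy. The sketch below is not `opyn`'s implementation. It assumes the lowest dose band is the reference, the standard error is Woolf's formula for the log odds ratio, and the interval is a 95% normal-approximation interval (`z = 1.96`). Under those assumptions it agrees with the table above to about three decimal places.

```python
import numpy as np

# Rows: dose bands from lowest to highest; columns: controls, cases
counts = np.array([[35, 14], [104, 45], [66, 36], [9, 12]], dtype=float)
controls, cases = counts[:, 0], counts[:, 1]

# Odds of being a case in each band, then the ratio against the first band
odds = cases / controls
oddsratio = odds / odds[0]

# ASSUMPTION: Woolf's standard error of log(OR), i.e. the square root of
# the summed reciprocals of the four cell counts being compared
stderr = np.sqrt(1 / cases + 1 / controls + 1 / cases[0] + 1 / controls[0])
stderr[0] = 0.0  # the reference band is compared with itself

# ASSUMPTION: 95% interval computed on the log scale
z = 1.96
lower = np.exp(np.log(oddsratio) - z * stderr)
upper = np.exp(np.log(oddsratio) + z * stderr)

print(np.column_stack([oddsratio, stderr, lower, upper]))
```

The reference row comes out as an interval of exactly 1 to 1 here, whereas the table above reports it as missing.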


### Dose-specific odds and log-odds

In [8]:
epyn.doseexposure_odds(resarr)

Unnamed: 0,odds,log-odds
Exposed1,0.4,-0.916291
Exposed2,0.432692,-0.837728
Exposed3,0.545455,-0.606136
Exposed4,1.333333,0.287682
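
These values can be reproduced directly from the counts: the odds in each dose band are cases divided by controls, and the log-odds are their natural logarithm. A short NumPy sketch, independent of `opyn`:

```python
import numpy as np

# Rows: dose bands from lowest to highest; columns: controls, cases
counts = np.array([[35, 14], [104, 45], [66, 36], [9, 12]], dtype=float)

odds = counts[:, 1] / counts[:, 0]  # cases / controls
log_odds = np.log(odds)

print(np.column_stack([odds, log_odds]))
```

The log-odds increase at every step up in dose, which is the pattern a test for linear trend is designed to assess.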


### Chi-squared test of no linear trend

In [9]:
epyn.chisq_lineartrend(resarr)

TypeError: weighted_means() takes 2 positional arguments but 3 were given
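
The call above fails, so no test result is available from `opyn` in this run. As a stand-in, the sketch below computes a standard chi-squared test for linear trend (the Mantel extension test, on one degree of freedom) with NumPy. It is not `opyn`'s implementation and may not match what `chisq_lineartrend` is meant to return. In particular, the equally spaced scores 0 to 3 for the four dose bands are an assumption; other scores, such as band midpoints, would give a different statistic.

```python
import math

import numpy as np

# Rows: dose bands from lowest to highest; columns: controls, cases
counts = np.array([[35, 14], [104, 45], [66, 36], [9, 12]], dtype=float)
controls, cases = counts[:, 0], counts[:, 1]
totals = controls + cases

# ASSUMPTION: equally spaced scores for the ordered dose bands
scores = np.arange(len(counts), dtype=float)

n = totals.sum()
n_cases = cases.sum()
n_controls = controls.sum()

# Score-weighted case count, with its mean and variance under no trend
observed = (scores * cases).sum()
expected = n_cases * (scores * totals).sum() / n
variance = (
    n_cases
    * n_controls
    * (n * (scores**2 * totals).sum() - (scores * totals).sum() ** 2)
    / (n**2 * (n - 1))
)

chisq = (observed - expected) ** 2 / variance

# Upper-tail p-value of chi-squared on 1 df, via the normal distribution
pval = math.erfc(math.sqrt(chisq / 2))

print(chisq, pval)
```

A small p-value is evidence against the null hypothesis of no linear trend in the odds across dose bands.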

## Discussion