In [1]:
import pymc as pm
import numpy as np
import arviz as az

%load_ext lab_black
%load_ext watermark

# Lister

Adapted from [Unit 10: lister.odc](https://raw.githubusercontent.com/areding/6420-pymc/main/original_examples/Codes4Unit10/lister.odc).

Data can be found [here](https://raw.githubusercontent.com/areding/6420-pymc/main/data/r.txt).

Associated lecture video: Unit 10 lesson 1

## Problem statement

Sir Joseph Lister (1827--1912), Professor of Surgery at Glasgow University, influenced by Pasteur's ideas, found that a wound wrapped in bandages treated with carbolic acid (phenol) would often not become infected.

Here are Lister's data on treating open fractures and amputations:

| Period | Carbolic acid used | Patients treated | Died | Recovered |
|---|---|---|---|---|
| 1864--1866 | No | 34 | 15 | 19 |
| 1867--1870 | Yes | 40 | 6 | 34 |


Estimate and interpret the risk difference, risk ratio, and odds ratio.
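
As a reference point for the Bayesian estimates, the three measures can be computed directly from the observed counts. This is a minimal sketch of the plain sample estimates, not part of the original example; the variable names (`risk_no_acid`, `risk_acid`, and so on) are chosen here for illustration, and "risk" is taken to mean the proportion of patients who died.

```python
# Observed counts from Lister's table
n_no_acid, died_no_acid = 34, 15  # 1864--1866, no carbolic acid
n_acid, died_acid = 40, 6  # 1867--1870, carbolic acid used

# Sample risk of death in each period
risk_no_acid = died_no_acid / n_no_acid
risk_acid = died_acid / n_acid

# Risk difference: absolute change in the probability of death
risk_difference = risk_no_acid - risk_acid

# Risk ratio: how many times more likely death was without carbolic acid
risk_ratio = risk_no_acid / risk_acid

# Odds ratio: ratio of the odds of death, where odds = p / (1 - p)
odds_no_acid = risk_no_acid / (1 - risk_no_acid)
odds_acid = risk_acid / (1 - risk_acid)
odds_ratio = odds_no_acid / odds_acid

print(f"risk difference: {risk_difference:.3f}")
print(f"risk ratio:      {risk_ratio:.3f}")
print(f"odds ratio:      {odds_ratio:.3f}")
```

A risk difference above 0, or a risk ratio or odds ratio above 1, indicates a higher death rate in the period without carbolic acid.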


notes:
- again with the Wishart! explain here instead, since this one is actually first
- will need to talk about LKJ distribution and equivalence
- possibly useful paper:
    - https://arxiv.org/pdf/1809.04746.pdf


In [3]:
n1 = 34
y1 = 15
n2 = 40
y2 = 6
mu0 = np.array([-0.5, -1.5])
S = np.array([[0.1, 0], [0, 0.1]])

In [4]:
with pm.Model() as m:
    chol, corr, stds = pm.LKJCholeskyCov(
        "chol", n=2, eta=1, sd_dist=pm.HalfCauchy.dist(1, shape=2), compute_corr=True
    )

    # mu holds the log-odds (logits) of death in each period
    mu = pm.MvNormal("mu", mu0, chol=chol)

    # map the logits to the probability scale before computing risk measures
    p1 = pm.math.invlogit(mu[0])
    p2 = pm.math.invlogit(mu[1])

    pm.Binomial("y1", n=n1, logit_p=mu[0], observed=y1)
    pm.Binomial("y2", n=n2, logit_p=mu[1], observed=y2)

    pm.Deterministic("rd", p1 - p2)
    pm.Deterministic("rr", p1 / p2)
    pm.Deterministic("or", (p1 / (1 - p1)) / (p2 / (1 - p2)))
    trace = pm.sample(3000)

Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [chol, mu]


Sampling 4 chains for 1_000 tune and 3_000 draw iterations (4_000 + 12_000 draws total) took 83 seconds.
There were 491 divergences after tuning. Increase `target_accept` or reparameterize.
The acceptance probability does not match the target. It is 0.6736, but should be close to 0.8. Try to increase the number of tuning steps.
There were 658 divergences after tuning. Increase `target_accept` or reparameterize.
The acceptance probability does not match the target. It is 0.6244, but should be close to 0.8. Try to increase the number of tuning steps.
There were 381 divergences after tuning. Increase `target_accept` or reparameterize.
There were 509 divergences after tuning. Increase `target_accept` or reparameterize.
The acceptance probability does not match the target. It is 0.6889, but should be close to 0.8. Try to increase the number of tuning steps.


In [5]:
az.summary(trace)



| | mean | sd | hdi_3% | hdi_97% | mcse_mean | mcse_sd | ess_bulk | ess_tail | r_hat |
|---|---|---|---|---|---|---|---|---|---|
| mu[0] | -0.362 | 0.273 | -0.855 | 0.203 | 0.008 | 0.008 | 1189.0 | 1041.0 | 1.0 |
| mu[1] | -1.635 | 0.328 | -2.358 | -1.104 | 0.007 | 0.005 | 2485.0 | 2905.0 | 1.0 |
| chol[0] | 0.836 | 1.551 | 0.017 | 2.312 | 0.029 | 0.021 | 520.0 | 259.0 | 1.01 |
| chol[1] | 0.033 | 1.09 | -1.419 | 1.316 | 0.047 | 0.038 | 1888.0 | 1446.0 | 1.01 |
| chol[2] | 0.671 | 1.177 | 0.023 | 1.864 | 0.04 | 0.028 | 189.0 | 53.0 | 1.01 |
| chol_corr[0, 0] | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 12000.0 | 12000.0 | |
| chol_corr[0, 1] | -0.037 | 0.576 | -0.977 | 0.876 | 0.017 | 0.012 | 1146.0 | 662.0 | 1.01 |
| chol_corr[1, 0] | -0.037 | 0.576 | -0.977 | 0.876 | 0.017 | 0.012 | 1146.0 | 662.0 | 1.01 |
| chol_corr[1, 1] | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 631.0 | 458.0 | 1.0 |
| chol_stds[0] | 0.836 | 1.551 | 0.017 | 2.312 | 0.029 | 0.021 | 520.0 | 259.0 | 1.01 |
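
Because the likelihood is specified with `logit_p`, the `mu` rows in the summary are on the log-odds scale, not the probability scale. The sketch below, which is an illustration and not part of the original notebook, applies the inverse logit to the two posterior means reported above to get a rough sense of the implied death probabilities. The helper `inv_logit` and the variable names are introduced here. Note that transforming a posterior mean is only an approximation: the posterior mean of a probability is generally not the inverse logit of the posterior mean logit, so proper estimates should come from transforming each posterior draw.

```python
import numpy as np


def inv_logit(x):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + np.exp(-x))


# Posterior means of mu[0] and mu[1], copied from the summary table
mu_means = np.array([-0.362, -1.635])

# Approximate death probabilities implied by those logits
p_approx = inv_logit(mu_means)

print(f"approx. P(death), no carbolic acid: {p_approx[0]:.3f}")
print(f"approx. P(death), carbolic acid:    {p_approx[1]:.3f}")
```

Both values land close to the raw sample proportions 15/34 and 6/40, which is a quick sanity check that the logits were sampled on the expected scale.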


In [1]:
%watermark -n -u -v -iv -p aesara,aeppl

UsageError: Line magic function `%watermark` not found.
