title | output |
---|---|
Bayesian Categorical Data Analysis |
github_document |
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
options(digits = 2)
set.seed(0)
ggplot2::theme_set(ggplot2::theme_minimal(base_size = 12))
This is a set of tools for Bayesian analysis of categorical data, specifically 2×2 contingency tables.
Use:
beta_binom()
for analysis using the Beta-Binomial modelprint()
,tidy()
,glance()
, andplot()
to view the resultspresent_bbfit()
if you want a nicely formatted table of summaries to include in a presentation or reportupdate()
if you have additional data
est_multinom()
for estimating multinomial cell probabilities
You can install the development version from GitHub with:
# install.packages("remotes")
remotes::install_github("bearloga/BCDA")
beta_binom()
works with a table of the following format:
Success | Failure | |
---|---|---|
Group 1 | n11 | n12 |
Group 2 | n21 | n22 |
and uses the Beta-Binomial model to estimate:
Success | Failure | |
---|---|---|
Group 1 | p1 | 1 - p1 |
Group 2 | p2 | 1 - p2 |
which allows inference on the difference (p1 - p2
), the risk ratio (RR), and the odds ratio (OR), defined as follows:
RR = p1 / p2
-- how much more likely success in group 1 is, relative to success in group 2OR = (p1/(1-p1)) / (p2/(1-p2))
-- the ratio of the odds -- wherep1/(1-p1)
is the odds of success in group 1 andp2/(1-p2)
is the odds of success in group 2
All examples will use the following fake data:
fake_data <- matrix(c(200, 150, 250, 300), nrow = 2, byrow = TRUE)
colnames(fake_data) <- c('Safe' ,'Dangerous')
rownames(fake_data) <- c('Animals', 'Plants')
knitr::kable(fake_data)
Note that beta_binom()
uses the Jeffreys prior by default and that this README was knit with the seed set to 0 at the start for reproducibility.
library(BCDA)
(fit <- beta_binom(fake_data))
The credible intervals above are calculated using quantiles. If we have the coda package installed, we can also obtain the high posterior density intervals:
print(fit, interval_type = "HPD")
plot(fit)
The package includes a variety of functions for looking at the results from fitting a beta_binom()
model. To aid in functional programming, we implemented the tidy()
and glance()
verbs:
library(magrittr) # for %>%
fit %>%
tidy %>%
knitr::kable()
When knitting R Markdown to HTML, you can use the {gt} package for creating tables:
library(gt)
fit %>%
tidy %>%
gt(rowname_col = "term") %>%
tab_header(md("Results of `beta_binom()`"))
glance()
is used to
Construct a single row summary "glance" of a model, fit, or other object
fit %>%
glance
This is perfectly okay in an interactive data analysis scenario, but not when presenting the results in a report. glance()
is actually a special case of the present_bbfit()
function which generates all those nicely formatted credible intervals but outputs a Markdown/LaTeX-formatted table by default:
present_bbfit(fit)
The point estimates include credible intervals by default but these can be turned off:
present_bbfit(fit, conf_interval = FALSE, digits = 3)
Since the underlying code uses tidy()
to compute the summaries, we can specify a particular credible level and the type of interval we want (e.g. highest posterior density):
present_bbfit(fit, conf_level = 0.89, interval_type = "HPD")
It also supports multiple models, which can be provided as a named or an unnamed list. See the example below.
In Bayesian statistics, we can reuse a previously computed posterior as a prior if we have additional data, allowing us to update the parameter estimates as new data becomes available. Suppose we collect 40 observations from 2 groups (20 per group) on the first day of the A/B test, and 10 observations per day for the next 2 weeks. Here we see what happens when we update the posterior with additional data on a daily basis:
fit_2 <- update(fit, x = c(100, 200), n = c(400, 600))
present_bbfit(list("Day 1" = fit, "Day 2" = fit_2))
Other packages for Bayesian analysis of A/B tests include: LearnBayes (GPL), conting (GPL), bandit (GPL), testr (MIT).
Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.