---
title: "Bayesian Modeling Exercises"
---

## Bayesian Workflow

1. Develop a Data Generating Process (DGP)
1. Generate _fake_ data from the DGP
1. Fit a model to the fake data
1. Perform posterior checks
1. Perform a sensitivity analysis if required
1. Fit your real data!
1. Make inferences on your real data

## Specifying a Few Distributions

### The Binomial Trial

We start with the deterministic part of the DGP: a linear predictor on the logit scale.

```{r binom-setup}
n <- 1000
gender <- rep(x = 0:1, length.out = n)
income <- rnorm(n, 0, 1)
mu <- gender * 1.5 + income * 2  # linear predictor on the logit scale
```

Then we make it random!

```{r binom-dat}
y <- rbinom(n, 1, prob = plogis(mu))
dat <- data.frame(gender = gender, income = income, y = y)
```

Then we can create a model using `brms`.

```{r create-model}
library(brms)
(model <- bf(y ~ gender + income))
```

We can treat `model` just like any other R object. Additionally (and this is the handy part of doing it this way), we can see which priors are available with the `get_prior` function.

```{r get-priors}
get_prior(model, data = dat, family = bernoulli())
```

We can then set our priors.

```{r set-prior}
my_priors <- c(
  prior(normal(0, 0.5), class = "b", coef = "gender"),
  prior(normal(0, 0.5), class = "b", coef = "income"))
```

And fit our new model!

```{r binom-fit}
fit_binom <- brm(model,
                 data = dat,
                 prior = my_priors,
                 family = bernoulli(),
                 iter = 1000, # iterations per chain, including warmup
                 cores = 2, # These are important!
                 chains = 2,
                 seed = 1234,
                 refresh = 0) # Setting a seed is important for reproducibility
```

Other settings such as `adapt_delta` and `max_treedepth` (passed through the `control` argument) may need to be altered depending on convergence.
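Before reading too much into the posterior, it can help to confirm that the fake data actually carry the effects we built in. The following is a minimal sketch of such a check, not part of the original workflow: it re-creates the simulation with an assumed seed (the chunks above set none) and uses base R's `glm` as a maximum-likelihood stand-in for the `brms` fit. The names `check_fit` and `check_tab` are made up for illustration.

```r
set.seed(1234)  # assumed seed; the original simulation chunks set none

# Re-create the simulated data with the same settings as above
n <- 1000
gender <- rep(x = 0:1, length.out = n)
income <- rnorm(n, 0, 1)
mu <- gender * 1.5 + income * 2
y <- rbinom(n, 1, prob = plogis(mu))
dat <- data.frame(gender = gender, income = income, y = y)

# Maximum-likelihood logistic regression as a quick reference fit
check_fit <- glm(y ~ gender + income, data = dat, family = binomial())

# Estimates next to the values used in the simulation
# (intercept 0, gender 1.5, income 2)
check_tab <- cbind(truth = c(0, 1.5, 2), estimate = coef(check_fit))
round(check_tab, 2)
```

If the estimates land close to 0, 1.5, and 2, the DGP is doing what we intended. The `brms` estimates may sit somewhat closer to zero than these, since the `normal(0, 0.5)` priors are tight relative to the true coefficients.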
### Diagnostics

We check our trace plots.

```{r trace-plot}
library(tidybayes)
library(bayesplot)
mcmc_trace(as.matrix(fit_binom))
```

And how our parameters are centered and related to one another.

```{r pairs-plot}
pairs(fit_binom)
```

And, of course, our estimated values.

```{r summary-fit}
summary(fit_binom)
```

### Time Series in `brms`

Let's make a simple time series data set.

```{r ts-dat}
n <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- numeric(n)
y[1] <- 5*x1[1] - 2*x2[1]  # first observation has no lagged value
for (i in 2:n) {
  y[i] <- 5*x1[i] + y[i-1] - 2*x2[i]
}
ts_dat <- data.frame(x1, x2, y)
```

Now we have to specify an additional term: a first-order autoregressive, AR(1), structure.

```{r}
library(brms)
fit_ar1 <- brm(y ~ x1 + x2 + ar(p = 1), # AR(1) term; rows are taken to be in time order
               data = ts_dat,
               iter = 1000, # iterations per chain, including warmup
               cores = 2, # These are important!
               chains = 2,
               seed = 1234,
               refresh = 0) # Setting a seed is important for reproducibility
```

A posterior predictive check (PPC):

```{r}
library(bayesplot)
color_scheme_set("darkgray")
pp_check(fit_ar1)
```

```{r}
summary(fit_ar1)
```

## Clinical Trial

Suppose we have some people in an experiment and we give them some kind of treatment. Then we measure the impact over time. See [here](https://michaeldewittjr.com/dewitt_blog/posts/2018-09-24-the-power-of-fake-data-simulations/) for more details.

```{r}
J <- 50 # number of people in the experiment
N_per_person <- 10 # number of measurements per person
person_id <- rep(1:J, rep(N_per_person, J))
index <- rep(1:N_per_person, J)
time <- index - 1 # time of measurements, from 0 to 9
N <- length(person_id)
a <- rnorm(J, 0, 1) # person-level intercepts
b <- rnorm(J, 1, 1) # person-level slopes over time
theta <- 1 # treatment effect on the slope
sigma_y <- 1 # measurement noise
z <- sample(rep(c(0,1), J/2), J) # random assignment, half treated
y_pred <- a[person_id] + b[person_id]*time + theta*z[person_id]*time
y <- rnorm(N, y_pred, sigma_y)
z_full <- z[person_id]
exposure <- z_full*time
data_1 <- data.frame(time, person_id, exposure, y)
```

And then we can fit the model to the data.

```{r}
fit_1 <- brm(y ~ (1 + time | person_id) + time + exposure, data = data_1)
```

And see if we can recover our effect (`theta = 1`, the coefficient on `exposure`).

```{r}
summary(fit_1)
```
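As a rough cross-check on the recovery, the treatment effect can also be approximated without any sampling. The sketch below is a stand-in for illustration, not the multilevel model fitted above: it re-creates the simulation with an assumed seed, fits a separate least-squares line per person, and compares mean slopes between treated and control groups. The names `slopes` and `theta_hat` are hypothetical.

```r
set.seed(1234)  # assumed seed; the original simulation sets none

# Re-create the simulated trial with the same settings as above
J <- 50
N_per_person <- 10
person_id <- rep(1:J, each = N_per_person)
time <- rep(0:(N_per_person - 1), times = J)
a <- rnorm(J, 0, 1)
b <- rnorm(J, 1, 1)
theta <- 1
sigma_y <- 1
z <- sample(rep(c(0, 1), J / 2), J)
y <- rnorm(J * N_per_person,
           a[person_id] + b[person_id] * time + theta * z[person_id] * time,
           sigma_y)
data_1 <- data.frame(time, person_id, exposure = z[person_id] * time, y)

# Stage 1: least-squares slope of y on time, one fit per person
slopes <- sapply(split(data_1, data_1$person_id),
                 function(d) coef(lm(y ~ time, data = d))[["time"]])

# Stage 2: treated minus control difference in mean slope,
# a rough estimate of theta
theta_hat <- mean(slopes[z == 1]) - mean(slopes[z == 0])
theta_hat
```

With only 25 people per arm and person-level slopes that vary with standard deviation 1, this estimate is noisy, so it should agree with the `exposure` coefficient from `summary(fit_1)` only approximately.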
