Skip to content

Commit

Permalink
Browse files Browse the repository at this point in the history
…Stan-Tutorial

Need to merge some changes of my own.
  • Loading branch information
Kevin Van Horn committed Mar 25, 2019
2 parents bbee953 + 9bffc2b commit 5d42fc5
Show file tree
Hide file tree
Showing 143 changed files with 1,048 additions and 893 deletions.
Binary file added 1_lm/images/fuzzycaterpillar.jpg
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
2 changes: 1 addition & 1 deletion 1_lm/lm.R
Expand Up @@ -21,7 +21,7 @@ if (FALSE) {
install.packages("MASS")
install.packages("rstan")
install.packages("bayesplot")
install.pacakges("MCMCpack")
install.packages("MCMCpack")
}

# ========= Load and inspect computer conjoint study ==========
Expand Down
18 changes: 12 additions & 6 deletions 1_lm/lm.Rmd
@@ -1,6 +1,6 @@
---
title: "Stan Tutorial"
subtitle: "Module 2: Linear Regression Model"
subtitle: "Module 1: Linear Regression Model"
author: "Kevin Van Horn and Elea McDonnell Feit"
date: "June 25, 2017"
output: ioslides_presentation
Expand All @@ -17,7 +17,7 @@ require(rstan)
In this module, we will run a linear model (standard regression) using Stan.

This will allow us to:
- Understand how to do Bayesian inference with Stan without getting bogged down in details.
- Understand how to do Bayesian inference with Stan without getting bogged down in modeling details.

Running a linear model in Stan isn't something we would normally do in our own work, but it gives us a simple starting point that most people are familar with. We will add details and complexity in subsequent modules.

Expand Down Expand Up @@ -95,7 +95,7 @@ That's pretty straightforward!
## Stan `parameters`
- intercept `beta0`
- a vector of `K` parameters that multiplies the product attributes in `X` which is called `beta`
- a variance parameters `sigma`
- a variance parameter `sigma`
*Don't type this into R!*
```{stan, eval=FALSE, echo=TRUE, output.var="dummy"}
parameters {
Expand Down Expand Up @@ -175,7 +175,7 @@ knitr::include_graphics("images/Bayesian_Inference.png")

## Running the model
Now that we have a Stan model and data in Stan format, we can ask Stan to produce random draws from the posterior for us.
```{r, eval=TRUE, echo=TRUE}
```{r, eval=TRUE, echo=TRUE, cache=TRUE}
library(rstan) ## load interface to Stan
m.stan <- stan(model_code = lm.stan, data = cc.standata,
iter = 1000, chains=1)
Expand Down Expand Up @@ -238,11 +238,17 @@ plot(m.stan, pars=c("beta0", "beta"))
- The color has almost no effect on the rating (`beta[8]`).

## Assessing convergence with traceplots
There is some risk that the Stan sampler did not reach the proper posterior. If this happens, we can usually detect it by ploting the draws in order over time (called a *traceplot*.)
There is some risk that the Stan sampler did not reach the proper posterior. If this happens, we can usually detect it by ploting the draws in order over time (called a *traceplot*.)
```{r, eval=TRUE, echo=TRUE}
plot(m.stan, plotfun="trace", pars=c("beta0", "beta", "sigma"))
```

## Fuzzy caterpillar
Traceplots should look like fuzzy caterpillars.
```{r, fig.align="center"}
knitr::include_graphics("images/fuzzycaterpillar.jpg")
```

## Histograms of posterior draws
Histograms of the draws approximate posterior densities.
```{r, eval=TRUE, echo=TRUE}
Expand Down Expand Up @@ -292,7 +298,7 @@ get_stanmodel(m.stan)
- Testing the model using synthetic data
- Comparing inference from `stan()` to `lm()`
+ OLS regression in R using `lm()`
+ Bayesian regression using `MCMCregress()' from `MCMCPack`
+ Bayesian regression using `MCMCregress()` from `MCMCPack`

## A note on hierarchical models
- Lenk et al. (1996) used this data set to illustrate a *hierarchical* linear model and that is a more approrpriate model for this data.
Expand Down
438 changes: 229 additions & 209 deletions 1_lm/lm.html

Large diffs are not rendered by default.

4 changes: 4 additions & 0 deletions 1_lm/lm_cache/html/__packages
@@ -0,0 +1,4 @@
base
ggplot2
StanHeaders
rstan
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Empty file.
Binary file not shown.
Binary file not shown.
Empty file.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Empty file.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Empty file.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Empty file.
Binary file not shown.
Binary file not shown.
Empty file.
Binary file not shown.
Binary file not shown.
Empty file.
Binary file not shown.
Binary file not shown.
Empty file.
Binary file not shown.
Binary file not shown.
Empty file.
Binary file not shown.
Binary file not shown.
Empty file.
Binary file not shown.
Binary file not shown.
Empty file.
Binary file not shown.
Binary file not shown.
Empty file.
Binary file not shown.
Binary file not shown.
Empty file.
Binary file not shown.
Binary file not shown.
Empty file.
Binary file not shown.
Binary file not shown.
Empty file.
Binary file not shown.
Binary file not shown.
Empty file.
Binary file not shown.
Binary file not shown.
Empty file.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file added 1_lm/lm_files/figure-html/unnamed-chunk-21-1.png
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added 1_lm/lm_files/figure-html/unnamed-chunk-22-1.png
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added 1_lm/lm_files/figure-html/unnamed-chunk-23-1.png
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added 1_lm/lm_files/figure-html/unnamed-chunk-24-1.png
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added 1_lm/lm_files/figure-html/unnamed-chunk-25-1.png
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added 1_lm/lm_files/figure-html/unnamed-chunk-3-1.png
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
27 changes: 19 additions & 8 deletions 2_mnl/mnl.Rmd
Expand Up @@ -17,6 +17,8 @@ This will allow us to:
- Start to learn the Stan model syntax.
- Go through the Bayesian inference process more carefully and completely.

Think of Module 1 as the "Quick Start Guide," while Module 2 is a more comprehensive "How-To Guide".

# Stan model syntax basics

## Required Stan model components
Expand Down Expand Up @@ -61,7 +63,7 @@ Specific `vector` and `matrix` types must be used when you want to do matrix ope
**Vectors and Matrices**
```{stan, eval=FALSE, output.var="stanjunk",}
vector[25] myvector; // column vector of length 25
row_vector[10] myrowvector; // row vector of lenght 25
row_vector[10] myrowvector; // row vector of length 25
matrix[10, 25] mymatrix;
```
- You can multiply a `vector` by a `matrix` or a `matrix` by a `matrix` (assuming dimenions match).
Expand Down Expand Up @@ -109,7 +111,7 @@ matrix<lower=c1, upper=c2>[rows, cols] varname[dim1, dim2, ...];
- While these are the main data types we use regularly, there are many other data types that Stan provides.
- We'll leave it to you to discover those in the context of specific examples where they are useful.

## Specifing probability distributions
## Specifying probability distributions
The core of any Stan model is the `model` block, where you define probability distributions for the observables. For instance, in our linear model, we defined y as normally distributed.
```{stan, eval=FALSE, output.var="stanjunk"}
y ~ normal(beta0 + X*beta, sigma);
Expand Down Expand Up @@ -156,9 +158,18 @@ myarrayofmatrices[1, 2, 3]; // element in second row and third column
- You specify a prior for a parameter by defining a distribution for the parameter in the `model` block.
- Since the NUTS algorithm used by default in Stan is gradient-based, there is no benefit to sticking to conjugate priors (as in Gibbs sampling).

## Parameters
- All parameters in a Stan model must be continuous, i.e. you can't have an `int` type parameter.
- This means you can't use models with latent discrete variables.
+ For instance you can't estimate membership variables in a mixture model.
- This is due to the fact that the MCMC algorithm that Stan uses to generate draws from the posterior relies on gradients.
- Discrete observables (like we have in the mnl) are fine!

## Learning more about Stan syntax
The best way to learn about the Stan language is to read the [Stan Modeling Language:
User’s Guide and Reference Manual](file:///C:/Users/emf75/Downloads/stan-reference-2.15.0%20(2).pdf), which is fairly easy to read and introduces much of the syntex in the context of specific examples.
- The best way to learn about the Stan language is to read the [Stan Modeling Language:
User’s Guide and Reference Manual](https://github.com/stan-dev/stan/releases/download/v2.16.0/stan-reference-2.16.0.pdf), which is fairly easy to read and introduces much of the syntax in the context of specific examples.
- If you want to learn about how to use Stan from R, the [rstan vignette](https://cran.r-project.org/web/packages/rstan/vignettes/rstan.html) will walk you through a typical analysis with rstan, very similar to this tutorial.
- There is a very active [Stan forum](http://discourse.mc-stan.org/) board, but since Stan does change over time, make sure the post you are looking at is recent.

# Stan code for mnl

Expand Down Expand Up @@ -250,7 +261,7 @@ str(d1)

## Run Stan model on synthetic data
To run the model, we call the `stan()` function. Note that we pass filename `mnl.stan` to the stan function. Make sure that file is in the working directory (or provide a full path)!
```{r}
```{r, cache=TRUE}
test.stan <- stan(file="mnl.stan", data=d1, iter=1000, chains=4)
summary(test.stan)$summary
```
Expand Down Expand Up @@ -415,7 +426,7 @@ str(X)
```

## Create the list object for `stan()`
```{r}
```{r, cache=TRUE}
choc.standata <- list(N=length(Y), C=3, K=9, Y=Y, X=X)
rm(Y, X, n, R, S, r, s, choc.contrasts, scenario, choc.coded)
```
Expand Down Expand Up @@ -484,11 +495,11 @@ beta.mean
```

## Computing shares from point estimates (2)
Compute shares.
Compute shares from the point estimates. **This is bad practice as it does not account for our uncertainty in beta!**
```{r}
shares.mnl.point(beta.mean, choc.standata$X[[1]])
```
**This is bad practice as it does not account for our uncertainty in beta!**


## Posterior draws of `beta`
To understand the full posterior of the shares, we should use all of the posterior draws for beta, which represent out uncertainty in `beta`.
Expand Down

0 comments on commit 5d42fc5

Please sign in to comment.