<a href="https://colab.research.google.com/github/stan-dev/example-models/blob/case-study%2Fstan-cloud/knitr/cloud-compute-2020/CmdStanR_Example_Notebook.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## CmdStanR Jupyter Notebook

This notebook demonstrates how to install the [CmdStanR](https://mc-stan.org/cmdstanr/) toolchain on a Google Colab instance and how to verify the installation by running Stan's NUTS-HMC sampler on the example model and data that ship with CmdStan. Each code block updates the R environment, so you must step through the notebook cell by cell, in order.

Step 1: Install CmdStanR and only the packages that it directly depends on.

In [0]:
# Preliminary setup
install.packages('versions')
library(versions)
install.versions('rlang','0.4.5')
# Install package CmdStanR from GitHub
library(devtools)
if(!require(cmdstanr)){
  devtools::install_github("stan-dev/cmdstanr", dependencies=c("Depends", "Imports"))
  library(cmdstanr)
}

Step 2: Download and untar the CmdStan binary for Google Colab instances.

In [0]:
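# Install CmdStan binaries
# Download the tarball only if it is not already present
if (!file.exists("cmdstan-2-22-1.tgz")) {
  system("wget https://storage.googleapis.com/cmdstan-2-22-tgz/cmdstan-2-22-1.tgz", intern = TRUE)
}
# Unpack only if the install directory does not exist yet
if (!dir.exists("cmdstan-2.22.1")) {
  system("tar zxf cmdstan-2-22-1.tgz", intern = TRUE)
}
# List the install directory to confirm that the tarball was unpacked
list.files("cmdstan-2.22.1")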

Step 3: Register the CmdStan install location.

In [0]:
# Set cmdstan_path to CmdStan installation
set_cmdstan_path("cmdstan-2.22.1")
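
Before moving on, it can help to confirm that the registered directory really contains an unpacked CmdStan tree. The following is a minimal sketch in base R, assuming the example files live under `examples/bernoulli` inside the install directory. The helper name `check_cmdstan_dir` is hypothetical and not part of CmdStanR.

```r
# Hypothetical helper (not part of CmdStanR): check that a directory
# contains the example files that ship with CmdStan.
check_cmdstan_dir <- function(path) {
  expected <- file.path(path, "examples", "bernoulli",
                        c("bernoulli.stan", "bernoulli.data.json"))
  found <- file.exists(expected)
  names(found) <- basename(expected)
  found
}

# Usage: both entries should be TRUE once Step 2 has run successfully.
check_cmdstan_dir("cmdstan-2.22.1")
```

If either entry is `FALSE`, the download or the untar step most likely did not complete, and rerunning Step 2 is the first thing to try.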

In [0]:
# Helper function: print the first `nlines` lines of a file (all lines by default)
print_file <- function(file, nlines=-1L) {
  cat(paste(readLines(file, n=nlines), "\n", sep=""), sep="")
}

The CmdStan installation includes a simple example program `bernoulli.stan` and test data `bernoulli.data.json`.  These are in the CmdStan installation directory `examples/bernoulli`.

The program `bernoulli.stan` takes a vector `y` of length `N` containing binary outcomes and uses a Bernoulli distribution to estimate `theta`, the probability of success.

In [0]:
stan_file <- file.path(cmdstan_path(), "examples", "bernoulli", "bernoulli.stan")
print_file(stan_file)

The data file `bernoulli.data.json` contains 10 observations, split between 2 successes (1) and 8 failures (0).

In [0]:
data_file <- file.path(cmdstan_path(), "examples", "bernoulli", "bernoulli.data.json")
print_file(data_file)
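
Because the example is so small, the posterior for `theta` can be worked out by hand as a cross-check on the sampler. The sketch below assumes that the bundled `bernoulli.stan` places a uniform `beta(1, 1)` prior on `theta`; compare with the program printed above and adjust `prior_a` and `prior_b` if it differs. Under that assumption, conjugacy gives a Beta(3, 9) posterior for 2 successes and 8 failures.

```r
# Observed data, as described above: 2 successes and 8 failures
successes <- 2
failures <- 8

# ASSUMPTION: uniform beta(1, 1) prior on theta
prior_a <- 1
prior_b <- 1

# Conjugate update: Beta(prior_a + successes, prior_b + failures)
post_a <- prior_a + successes
post_b <- prior_b + failures

# Analytic posterior mean: 3 / (3 + 9) = 0.25
post_mean <- post_a / (post_a + post_b)

# Analytic 5%, 50%, and 95% posterior quantiles
post_q <- qbeta(c(0.05, 0.5, 0.95), post_a, post_b)

print(post_mean)
print(post_q)
```

The sampler's estimates for `theta` should land close to these values, up to Monte Carlo error.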

The following code tests that the CmdStanR toolchain is properly installed by compiling the example model, fitting it to the data, and summarizing the posterior estimates of all parameters and quantities of interest.

In [0]:
# Compile example model bernoulli.stan
mod <- cmdstan_model(stan_file)

# Condition on the example data bernoulli.data.json to obtain a sample from the posterior
fit <- mod$sample(data = data_file, seed = 123)

# Print a summary of the posterior sample
options(digits = 2)
fit$summary()
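
A summary table is easier to trust once its convergence diagnostics have been checked. The sketch below assumes that the summary has `variable` and `rhat` columns, which may not hold for every CmdStanR version. The helper name `flag_rhat` and the 1.01 threshold are illustrative choices, not part of CmdStanR.

```r
# Hypothetical helper (not part of CmdStanR): return the names of variables
# whose R-hat exceeds a threshold.
# ASSUMPTION: summary_df has columns named `variable` and `rhat`.
flag_rhat <- function(summary_df, threshold = 1.01) {
  bad <- !is.na(summary_df$rhat) & summary_df$rhat > threshold
  summary_df$variable[bad]
}

# Usage: runs only if `fit` from the cell above exists.
# An empty result means that no variable was flagged.
if (inherits(get0("fit"), "CmdStanMCMC")) {
  print(flag_rhat(fit$summary()))
}
```

For a model this simple, nothing should be flagged. A non-empty result would point to a problem with the run rather than with the installation.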