/
ex2revdbayes.Rmd
239 lines (195 loc) · 8.91 KB
/
ex2revdbayes.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
---
title: "EVA 2021 revdbayes exercises 2"
subtitle: "EV inference using revdbayes"
output: html_notebook
---
## Getting started
* Install [RStudio Desktop](https://www.rstudio.com/products/rstudio/download/#download) (if you do not already have it) and open it
* Save [ex2revdbayes.Rmd](https://raw.githubusercontent.com/paulnorthrop/revdbayes/master/vignettes/ex2revdbayes.Rmd) to your computer (there is also a link to it above)
* Open `ex2revdbayes.Rmd` in Rstudio
* The R code is arranged in *chunks* highlighted in grey
* You can run the code in a chunk by clicking on the **green triangle** on the top right of the chunk
* There is information about Exercises 2 (and Exercises 1) in the [EVA2021 revdbayes slides](https://paulnorthrop.github.io/revdbayes/articles/EVA2021revdbayes.html)
## Aims
* Compare ROU approaches for sampling from a GP posterior
* Perform posterior predictive checking and inference
* Consider how to sample efficiently from a PP posterior
* Compare `revdbayes` and `evdbayes` for sampling from a GEV posterior
There is further information in the [Introduction to revdbayes](https://paulnorthrop.github.io/revdbayes/articles/revdbayes-a-vignette.html) vignette.
```{r packages, message=FALSE}
# We need the evdbayes, revdbayes, ggplot2, bayesplot and coda packages
pkg <- c("evdbayes", "revdbayes", "ggplot2", "bayesplot", "coda")
pkg_list <- pkg[!pkg %in% installed.packages()[, "Package"]]
install.packages(pkg_list)
# Load all packages
invisible(lapply(pkg, library, character.only = TRUE))
```
```{r setup}
# Simulation sample size
n <- 10000
```
## GP model for threshold excesses
### GP priors
```{r setPriors}
# Information about the set_prior() function
# You can choose from a set of options, or create your own
?set_prior
# Set the threshold
u <- quantile(gom, probs = 0.65)
# Set an 'uninformative' prior for the GP parameters
fp <- set_prior(prior = "flat", model = "gp", min_xi = -1)
```
### GP posteriors
```{r posteriorSampling}
# Information about the rpost function
?rpost
# Information about the gom (significant wave heights) data
?gom
# Sample on (sigma_u, xi) scale
gp1 <- rpost(n = n, model = "gp", prior = fp, thresh = u, data = gom,
rotate = FALSE)
# Rotation
gp2 <- rpost(n = n, model = "gp", prior = fp, thresh = u, data = gom)
# Box-Cox transformation (after transformation to positivity)
gp3 <- rpost(n = n, model = "gp", prior = fp, thresh = u, data = gom,
rotate = FALSE, trans = "BC")
# Box-Cox transformation and then rotation
gp4 <- rpost(n = n, model = "gp", prior = fp, thresh = u, data = gom,
trans = "BC")
```
### Comparing approaches
```{r GPplots}
# Samples and posterior density contours
plot(gp1, main = paste("mode relocation only, pa = ", round(gp1$pa, 3)))
plot(gp2, ru_scale = TRUE, main = paste("rotation, pa = ", round(gp2$pa, 3)))
plot(gp3, ru_scale = TRUE, main = paste("Box-Cox, pa = ", round(gp3$pa, 3)))
plot(gp4, ru_scale = TRUE, main = paste("Box-Cox and rotation, pa = ",
round(gp4$pa, 3)))
```
``rpost_rcpp()`` is faster than ``rpost()``. See the [rusting faster](https://cran.r-project.org/web/packages/revdbayes/vignettes/revdbayes-b-using-rcpp-vignette.html) vignette.
```{r faster}
gp2 <- rpost_rcpp(n = n, model = "gp", prior = fp, thresh = u, data = gom)
```
Suggestion: increase the threshold and see how the appearance of the posterior changes.
## Posterior predictive checking
There is further information in the [Posterior predictive EV inference](https://paulnorthrop.github.io/revdbayes/articles/revdbayes-c-predictive-vignette.html) vignette.
```{r GPcheck}
# nrep = 50 asks for 50 fake replicates of the original data
# For each of 50 posterior samples of the parameters a dataset is simulated
gpg <- rpost(n = n, model = "gp", prior = fp, thresh = u, data = gom,
nrep = 50)
# Information about pp_check.evpost
?pp_check.evpost
# Compare real and fake datasets
pp_check(gpg, type = "multiple", subtype = "dens") + ggtitle("GP kernel density estimates")
# Compare real and fake summary statistics
pp_check(gpg, stat = "max")
```
* Would you be able to spot the real dataset or summary statistic if it was not highlighted?
* Option: you could play with the arguments
- `type` and/or `subtype` for the first plot
- `stat` for the second plot
## Posterior predictive EV inference
### Binomial-GP model for threshold exceedances
Modelling threshold excesses is only part of the story. We also need to model the proportion of observations that exceed the threshold.
```{r BinGP}
bp <- set_bin_prior(prior = "jeffreys")
# We need to provide the mean number of observations per year
# The data cover a period of 105 years
npy_gom <- length(gom)/105
bgpg <- rpost(n = 1000, model = "bingp", prior = fp, thresh = u, data = gom,
bin_prior = bp, npy = npy_gom, nrep = 50)
```
We can make predictive inferences about the largest value $M_N$ to be observed over a time horizon of $N$ years.
```{r predBinGP}
# Information about pp_check.evpost
?predict.evpost
# Predictive density of the largest value in `n_years' years
plot(predict(bgpg, type = "d", n_years = 200))
# Predictive intervals (equi-tailed and shortest possible)
i_bgpp <- predict(bgpg, n_years = 200, level = c(95, 99), hpd = TRUE)
plot(i_bgpp, which_int = "both")
i_bgpp$short
```
```{r}
# The predictive 100, 200 and 500 year return levels
predict(bgpg, type = "q", n_years = 1, x = c(0.99, 0.995, 0.998))$y
```
See [Northrop and Attalides (2017)](https://doi.org/10.1111/rssc.12159) for an analysis of these data using an informative prior.
## PP model for threshold exceedances
We compare three ways to sample from a PP posterior distribution
1. Parameterise in terms of annual maxima
2. [Wadsworth et al. (2010)](https://doi.org/10.1214/10-AOAS333): parameterise to make $\mu$ orthogonal to $(\sigma, \xi)$
3. Empirical rotation to reduce association
```{r PPsetup}
# Information about the rainfall data
?revdbayes::rainfall
# Set a threshold and use the prior from the evdbayes user guide
rthresh <- 40
prrain <- evdbayes::prior.quant(shape = c(38.9,7.1,47), scale = c(1.5,6.3,2.6))
```
```{r PPsampling}
# 1. Number of blocks = number of years of data (54)
r1 <- rpost(n = n, model = "pp", prior = prrain, data = rainfall,
thresh = rthresh, noy = 54, rotate = FALSE)
plot(r1)
# 2. Number of blocks = number of threshold excesses (use_noy = FALSE)
n_exc <- sum(rainfall > rthresh, na.rm = TRUE)
r2 <- rpost(n = n, model = "pp", prior = prrain, data = rainfall,
thresh = rthresh, noy = 54, use_noy = FALSE, rotate = FALSE)
plot(r2, ru_scale = TRUE)
# 3. # Rotation about maximum a posteriori estimate (MAP)
r3 <- rpost(n = n, model = "pp", prior = prrain, data = rainfall,
thresh = rthresh, noy = 54)
plot(r3, ru_scale = TRUE)
c(r1$pa, r2$pa, r3$pa)
```
* Based on the plots, which approach do you expect to have the largest probability of acceptance?
* Is this what you see in the values of `r1$pa`, `r2$pa` and `r3$pa`?
## GEV model for block maxima
```{r GEVprior}
# Information about the portpirie data
?revdbayes::portpirie
# Use the prior from the evdbayes user guide
mat <- diag(c(10000, 10000, 100))
pn <- set_prior(prior = "norm", model = "gev", mean = c(0, 0, 0), cov = mat)
```
### revdbayes
```{r revdbayes}
mat <- diag(c(10000, 10000, 100))
gevp <- rpost_rcpp(n = n, model = "gev", prior = pn, data = portpirie)
# Information about plot.evpost
?plot.evpost
# Can use the plots from the bayesplot package
plot(gevp, use_bayesplot = TRUE, fun_name = "dens")
plot(gevp, use_bayesplot = TRUE, pars = "xi", prob = 0.95)
```
### evdbayes
```{r evdbayes}
# evdbayes has a function ar.choice to help set the proposal SDs
# It does this by searching for values that achieve approximately
# target values for the acceptance rates (default 0.4 for all parameters)
?ar.choice
# Initialise evdbayes' Markov chain at the estimated posterior mean
init <- colMeans(gevp$sim_vals)
prop.sd.auto <- ar.choice(init = init, prior = pn, lh = "gev",
data = portpirie, psd = rep(0.01, 3),
tol = rep(0.02, 3))$psd
post <- posterior(n, init = init, prior = pn, lh = "gev",
data = portpirie, psd = prop.sd.auto)
for (i in 1:3) plot(post[, i], ylab = c("mu","sigma","xi")[i])
```
```{r coda}
post_for_coda <- coda::mcmc(post)
# Assuming no burn-in period
burnin <- 0
post_for_coda <- window(post_for_coda, start = burnin + 1)
# Trace plots and KDEs
plot(post_for_coda)
# The sampled values are autocorrelated
coda::acfplot(post_for_coda)
# Effective sample sizes (10,000 for revdbayes)
coda::effectiveSize(post_for_coda)
```
* In practice one would run multiple chains and use the Gelman-Rubin convergence diagnostic, e.g. `gelman.diag` in the `coda` package.
* The [Introduction to revdbayes](https://paulnorthrop.github.io/revdbayes/articles/revdbayes-a-vignette.html) shows that the posterior samples from `evdbayes` and `revdbayes` are in agreement.