-
Notifications
You must be signed in to change notification settings - Fork 301
Expand file tree
/
Copy pathpriors_demo.Rmd
More file actions
365 lines (284 loc) · 15 KB
/
priors_demo.Rmd
File metadata and controls
365 lines (284 loc) · 15 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
---
title: "Generating Priors based on Data and / or Expert knowledge"
output: html_vignette
vignette: >
%\VignetteIndexEntry{Generating Priors based on Data and / or Expert knowledge}
%\VignetteEngine{knitr::rmarkdown}
---
Fitting priors to data (point estimates, all grass species in GLOPNET).
-------------------------
### Example: Specific Leaf Area based on all grasses
First we are going to set up initial values.
```{r echo=FALSE, warning=FALSE, results='hide', message=FALSE}
library(PEcAn.priors)
# library(DEoptim)
library(dplyr)
set.seed(1)
iter <-10000 #jags chain and misc simulation length; 1k for test, 10k for real
inits <- list(list(.RNG.seed = 1,
.RNG.name = "base::Mersenne-Twister"),
list(.RNG.seed = 2,
.RNG.name = "base::Mersenne-Twister"),
list(.RNG.seed = 3,
.RNG.name = "base::Mersenne-Twister"),
list(.RNG.seed = 4,
.RNG.name = "base::Mersenne-Twister"))
```
### MLE Fit to glopnet Specific Leaf Area data
The [`PEcAn.priors::fit.dist`](https://github.com/PecanProject/pecan/blob/main/modules/priors/R/priors.R) helps to choose the best fit parameter distribution given some sample datasets.
```{r echo=FALSE, warning=FALSE, results='hide', message=FALSE, cache=TRUE, fig.width=8}
# devtools::install("pecanproject/pecan/modules/priors")
glopnet.data <- read.csv(system.file('extdata/glopnet.csv', package = 'PEcAn.priors')) # Wright et. al. 2004 supplementary file
library(ggplot2)
ggplot(glopnet.data, aes(color = GF)) +
geom_boxplot(aes(x = BIOME, y = 1000/10^log.LMA)) +
ylab("Specific Leaf Area") +
theme_bw()
ggplot(glopnet.data %>% subset(GF == 'G'),aes(x = BIOME, y = 1000/10^log.LMA)) +
geom_boxplot() +
geom_point(alpha = 0.25) +
ylab("Specific Leaf Area") +
theme_bw()
```
```{r}
glopnet.grass <- glopnet.data[which(glopnet.data$GF == 'G'), ] # GF = growth form; G=grass
## turnover time (tt)
glopnet.grass$tt <- 12/10^(glopnet.grass$log.LL)
ttdata <- data.frame(tt = glopnet.grass$tt[!is.na(glopnet.grass$tt)])
## specific leaf area
##glopnet.grass$sla <- 1000/ (0.48 * 10^glopnet.grass$log.LMA)
glopnet.grass$sla <- 1000/ (10^glopnet.grass$log.LMA)
sladata <- data.frame(sla = glopnet.grass$sla[!is.na(glopnet.grass$sla)])
```
The `fit.dist` function takes a vector of point estimates (in this case 125 observations of Specific Leaf Area from GLOPNET database are stored in `sladata`.
First, it prints out the fits of a subset of distributions (the 'f' distribution could not be fit). Second, it prints the
```{r}
dists <- c('gamma', 'lognormal','weibull', 'f')
fit.dist(sladata, dists )
```
```{r}
prior.dists <- rbind('SLA' = fit.dist(sladata, dists ),
'leaf_turnover_rate' = fit.dist(ttdata, dists))
slaprior <- with(prior.dists['SLA',], pr.dens(distribution, a, b))
ttprior <- with(prior.dists['leaf_turnover_rate',], pr.dens(distribution, a, b))
```
The `priorfig` function visualizes the chosen prior (line), with its mean and 95\%CI (dots) as well as the data used to generate the figure.
```{r}
prior.figures <- list()
prior.figures[['SLA']] <- priorfig(
priordata = sladata,
priordensity = slaprior,
trait = data.frame(id = "SLA", figid = "SLA", units="1"))
prior.figures[['leaf_turnover_rate']] <- priorfig(
priordata = ttdata,
priordensity = ttprior,
trait = data.frame(id = "leaf_turnover_rate", figid = "left_turnover_rate", units="yr-1"))
prior.figures
```
Fitting priors to data with uncertainty estimates (generating posterior predictive distribution of an unobserved C4 grass species based on values collected from many PFTs.
-------------------------
### Vcmax data from Wullschleger (1993)
### Classify Wullschleger Data into Functional Types ###########
#### 1. query functional types from BETY
This code is not run here - data are provided in .csv file below. This code requires connection to the BETYdb MySQL server.
```{r eval=FALSE}
wullschleger.species <- vecpaste(unique(wullschleger.data$AcceptedSymbol))
## load functional type data from BETY
con <- function() {query.bety.con(username = "bety", password = "bety",
host = "localhost", dbname = "bety")}
functional.data <- query.bety(paste("select distinct AcceptedSymbol, scientificname, GrowthHabit, Category from species where AcceptedSymbol in (", wullschleger.species, ") and GrowthHabit is not NULL and Category is not NULL;"), con())
write.csv(functional.data, 'inst/extdata/wullschleger_join_usdaplants.csv')
```
Picking up with the Wullschleger dataset joined to USDA Plants functional type classifications ...
```{r pft_wullschleger}
wullschleger.data <- read.csv(system.file('extdata/wullschleger1993updated.csv', package = 'PEcAn.priors'))
functional.data <- read.csv(system.file('extdata/wullschleger_join_usdaplants.csv', package = 'PEcAn.priors'))
subshrubs <- rownames(wullschleger.data) %in% c(grep('Shrub',wullschleger.data$GrowthHabit), grep('Subshrub', wullschleger.data$GrowthHabit))
########## 2. Merge functional type information into Wullschleger data
wullschleger.data <- merge(wullschleger.data, functional.data, by = 'AcceptedSymbol')
########## 3. Classify by functional type
## grass: any Graminoid Monocot
grass <- with(wullschleger.data, GrowthHabit == 'Graminoid' & Category == 'Monocot')
## forbs/herbs = forb
forb <- with(wullschleger.data, GrowthHabit == 'Forb/herb' | GrowthHabit == 'Vine, Forb/herb')
## woody = Shrubs, Subshrubs or Trees, add category to a few with missing information,
woody <- with(wullschleger.data, scientificname %in% c('Acacia ligulata', 'Acacia mangium', 'Arbutus unedo', 'Eucalyptus pauciflora', 'Malus', 'Salix') | rownames(wullschleger.data) %in% c(grep('Shrub', GrowthHabit), grep('Subshrub', GrowthHabit), grep('Tree', GrowthHabit)))
## gymnosperms
gymnosperm <- wullschleger.data$Category == 'Gymnosperm'
## ambiguous is both woody and herbaceous, will drop
ambiguous <- wullschleger.data$GrowthHabit %in% c("Subshrub, Shrub, Forb/herb, Tree", "Tree, Subshrub, Shrub", "Tree, Shrub, Subshrub, Forb/herb", "Subshrub, Forb/herb")
wullschleger.data$functional.type[grass] <- 1
wullschleger.data$functional.type[forb] <- 3
wullschleger.data$functional.type[woody & !gymnosperm] <- 4
wullschleger.data$functional.type[woody & gymnosperm] <- 5
wullschleger.data <- subset(wullschleger.data, !ambiguous)
############# Estimating SE and n ##################################
##verr, jerr: the "asymptotic" errors for Vcmax, Jmax using SAS nlim
##vucl,vlcl,jlcl,jucl: are upper and lower confidence limits of 95%CI
##Calculate SD as (1/2 the 95%CI)/1.96
wullschleger.data$vsd <- (wullschleger.data$vucl-wullschleger.data$vlcl)/(2*1.96)
##Calculate effective N as (SE/SD)^2 + 1
wullschleger.data$neff <- (wullschleger.data$vse/wullschleger.data$vsd)^2 + 1
wullschleger.data$se <- sqrt(wullschleger.data$vsd*sqrt(wullschleger.data$neff))
## recode species to numeric
species <- unique(wullschleger.data$scientificname)
sp <- rep(NA, nrow(wullschleger.data))
for(i in seq(species)){
sp[wullschleger.data$scientificname == species[i]] <- i
}
############# Scale values to 25C ##################################
wullschleger.data$corrected.vcmax <- PEcAn.utils::arrhenius.scaling(
wullschleger.data$vcmax,
old.temp = wullschleger.data$temp,
new.temp = 25)
############# Create data.frame for JAGS model ##################################
wullschleger.data <- data.frame(Y = wullschleger.data$corrected.vcmax,
obs.prec = 1 / (sqrt(wullschleger.data$neff) * wullschleger.data$se),
sp = sp,
ft = wullschleger.data$functional.type,
n = wullschleger.data$neff)
## Summarize data by species
wullschleger.vcmax <- wullschleger.data %>%
group_by(sp) %>%
summarise(Y = mean(Y),
obs.prec = 1/sqrt(sum((1/obs.prec)^2)),
ft = mean(ft), # identity
n = sum(n)) %>%
dplyr::select(-sp)
```
##### Add data from C4 species #########
Few measurements of Vcmax for C4 species were available.
###### Miscanthus: Wang D, Maughan MW, Sun J, Feng X, Miguez FE, Lee DK, Dietze MC, 2011. Impact of canopy nitrogen allocation on growth and photosynthesis of miscanthus (Miscanthus x giganteus). Oecologia, in press
```{r}
dwang.vcmax <- c(19.73, 40.35, 33.02, 21.28, 31.45, 27.83, 9.69, 15.99, 18.88, 11.45, 15.81, 27.61, 13, 21.25, 22.01, 10.37, 12.37, 22.8, 12.24, 15.85, 21.93, 23.48, 31.51, 23.16, 18.55, 17.06, 20.27, 30.41)
dwang.vcmax <- data.frame(Y = mean(dwang.vcmax),
obs.prec = 1/sd(dwang.vcmax),
ft = 2, #C4 Grass
n = length(dwang.vcmax))
```
###### Muhlenbergia glomerata
Kubien and Sage 2004. Low-temperature photosynthetic performance of a C4 grass and a co-occurring C3 grass native to high latitudes. Plant, Cell & Environment DOI: 10.1111/j.1365-3040.2004.01196.x
Data from figure 5 in Kubien and Sage (2004), average across plants grown at 14/10 degrees and 26/22 degrees
```{r}
kubien.vcmax <- data.frame(Y = mean(23.4, 24.8),
obs.prec = 1/sqrt(2.6^2 + 2.5^2),
n = 8,
ft = 2)
```
###### Zea Mays (Corn)
Massad, Tuzet, Bethenod 2007. The effect of temperature on C4-type leaf photosynthesis parameters. Plant, Cell & Environment 30(9) 1191-1204. DOI: 10.1111/j.1365-3040.2007.01691.x
data from fig 6a
```{r}
massad.vcmax.data <- data.frame(vcmax = c(17.1, 17.2, 17.6, 18, 18.3, 18.5, 20.4, 22.9, 22.9, 21.9, 21.8, 22.7, 22.3, 25.3, 24.4, 25.5, 25.5, 24.9, 31.2, 30.8, 31.6, 31.7, 32.5, 34.1, 34.2, 33.9, 35.4, 36, 36, 37.5, 38.2, 38.1, 37.4, 37.7, 25.2, 25.5),
temp = c(20.5, 24.6, 21, 19, 21.9, 15, 19.6, 14.3, 20.8, 23.4, 24.8, 25.9, 24.1, 22.8, 27.9, 31.7, 35.5, 39.3, 37.9, 42.4, 41.5, 48.7, 33.3, 33.3, 31.5, 39.1, 38.8, 43.3, 50, 38.4, 35.7, 34.4, 31.9, 33.9, 32.1, 33.7))
massad.vcmax <- with(
massad.vcmax.data,
PEcAn.utils::arrhenius.scaling(
old.temp = temp,
new.temp = 25,
observed.value = vcmax)
)
massad.vcmax <- data.frame(Y = mean(massad.vcmax),
obs.prec = 1/(sd(massad.vcmax)),
n = length(massad.vcmax),
ft = 2)
##### Combine all data sets
all.vcmax.data <- rbind(wullschleger.vcmax,
dwang.vcmax,
## xfeng.vcmax,
kubien.vcmax,
massad.vcmax)
```
##### take a look at the raw data by functional type:
```{r}
ggplot(data = all.vcmax.data, aes(factor(ft), Y)) +
geom_boxplot() +
geom_point() +
xlab('Plant Functional Type') +
ylab('Vcmax') +
theme_bw()
```
##### Add unobserved C4 species so JAGS calculates posterior predictive distribution
```{r}
vcmax.data <- rbind(all.vcmax.data,
data.frame(Y = NA, obs.prec = NA, ft = 2, n = 1))
```
##### Write and Run Meta-analysis
```{r}
writeLines(con = "vcmax.prior.bug",
text = "model{
for (k in 1:length(Y)){
Y[k] ~ dnorm(beta.ft[ft[k]], tau.y[k])T(0,)
tau.y[k] <- prec.y*n[k]
u1[k] <- n[k]/2
u2[k] <- n[k]/(2*prec.y)
obs.prec[k] ~ dgamma(u1[k], u2[k])
}
for (f in 1:5){
beta.ft[f] ~ dnorm(0, tau.ft)
}
tau.ft ~ dgamma(0.1, 0.1)
prec.y ~ dgamma(0.1, 0.1)
sd.y <- 1 / sqrt(prec.y)
## estimating posterior predictive for new C4 species
pi.pavi <- Y[length(Y)]
diff <- beta.ft[1] - beta.ft[2]
}")
library(rjags)
j.model <- jags.model(file = "vcmax.prior.bug",
data = vcmax.data,
n.adapt = 500,
n.chains = 4,
inits = inits)
mcmc.object <- coda.samples(model = j.model, variable.names = c('pi.pavi', 'beta.ft', 'diff'),
n.iter = iter)
mcmc.o <- window(mcmc.object, start = iter/2, thin = 10)
pi.pavi <- data.frame(vcmax = unlist(mcmc.o[,'pi.pavi']))
vcmax.dist <- fit.dist(pi.pavi, n = sum(!is.na(vcmax.data$Y)))
prior.dists <- rbind(prior.dists, 'Vcmax' = vcmax.dist)
vcmax.density <- with(vcmax.dist, pr.dens(distribution, a, b), xlim = c(0,50))
######### Vcmax Prior Plot
vcmax.c4 <- all.vcmax.data$ft == 2
vcmax.data <- all.vcmax.data[,c('ft', 'Y')]
vcmax.data$mean <- vcmax.data$Y
vcmax.data$se <- 1/(sqrt(all.vcmax.data$n) * all.vcmax.data$obs.prec)
vcmax.data <- vcmax.data[,c('ft', 'mean', 'se')]
#vcmax.mean <- all.vcmax.data$Y[!c4,]
#vcmax.se <- with(all.vcmax.data[!c4,], 1/(sqrt(n) * obs.prec))#[reorder]
#c4.mean <- all.vcmax.data$Y)[c4]
#c4.se <- with(all.vcmax.data, 1 / (sqrt(n) * obs.prec) )[c4]
prior.figures[['Vcmax']] <- priorfig(priordensity = vcmax.density,
trait = data.frame(id = 'Vcmax', figid = 'Vcmax', units = 'umol CO2 m-2 s-1'),
xlim = range(c(0, max(vcmax.data$mean)))) +
geom_point(data = subset(vcmax.data, ft != 2),
aes(x=mean, y = (sum(vcmax.c4) + 1:sum(!vcmax.c4))/3000), color = 'grey60') +
geom_segment(data = subset(vcmax.data, ft != 2),
aes(x = mean - se, xend = mean + se,
y = (sum(vcmax.c4) + 1:sum(!vcmax.c4))/3000,
yend = (sum(vcmax.c4) + 1:sum(!vcmax.c4))/3000),
color = 'grey60') +
## darker color for C4 grasses
geom_point(data = subset(vcmax.data, ft == 2),
aes(x=mean, y = 1:sum(vcmax.c4)/2000), size = 3) +
geom_segment(data = subset(vcmax.data, ft == 2),
aes(x = mean - se, xend = mean + se,
y = 1:sum(vcmax.c4)/2000, yend = 1:sum(vcmax.c4)/2000))
print(prior.figures[['Vcmax']])
```
Fitting priors to expert constraint.
## Other Examples / Approaches
### Examples
* Estimating priors for the DALEC model in [models/dalec/inst/DALEC_priors.R](https://github.com/PecanProject/pecan/blob/main/models/dalec/inst/DALEC_priors.R)
### Package `rriskDistributions`
The [rriskDistributions](http://cran.r-project.org/web/packages/rriskDistributions/index.html) useful for estimating priors ...
as well as individual functions for each distribution such as `get.<somedist>.par` that provide nice diagnostic plots (e.g. compare chosen points to cdf) for example, to compute the parameters of a Gamma distribution that fits a median of 2.5 and has a 95%CI of [1, 10]:
```{r eval=FALSE}
library(rriskDistributions)
get.gamma.par(p = c(0.025, 0.50, 0.975), q = c(1, 2.5, 10), tol = 0.1)
```
The function `fit.pecr` provides a GUI to explore the fits of a range of distributions, e.g.:
```{r eval=FALSE}
fit.perc(p = c(0.1, 0.5, 0.9), q = c(30, 60, 90), tolConv = 0.1)
```
produces this:
