/
particle_filter_problem_set.qmd
196 lines (129 loc) · 5.85 KB
/
particle_filter_problem_set.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
---
title: "Particle filter"
format:
html:
embed-resources: true
editor: visual
---
## Problem set
You will need the following packages
```{r}
#| message: false
library(tidyverse)
library(lubridate)
```
### Overview
This exercise involves the following objectives
1. Modify the particle filter examples to apply to a new model. The model is provided in this assignment
2. Use out put from prior model run to initialize the particle filter
3. Run the particle filter without observations to forecast
4. Run the particle filter to assimilate new observations and forecast
5. Evaluate how the forecast depends on data assimilation
#### Data
The data for this exercise is aboveground biomass vs. age for a single forest plot. The data has two columns: age (in years) and biomass (in gC/m2)
#### Model
We are predicting the aboveground biomass using the following model
biomass = previous biomass + constant growth - mortality rate \* previous biomass
The constant growth is the parameter `g` below (in units of gC/m2/yr) and mortality rate is the parameter `u`below (proportion of biomass per year).
### Part 1: Fit model to historical data (Already done for you!)
This step is already done for you.
Here is the data for ages 1 through 50 for the plot (starting in 1950-01-01). It was measured every 5 years.
```{r}
plot_data <- read_csv("https://raw.githubusercontent.com/frec-5174/eco4cast-in-R-book/main/data/PF_data1.csv", show_col_types = FALSE)
```
```{r}
#| warning: false
#| fig-cap: Time-series of forest biomass data
#| label: fig-biomass-timeseries
ggplot(plot_data, aes(x = datetime, y = biomass)) +
geom_point() +
labs(x = "age", y = "aboveground biomass (gC/m2)") +
theme_bw()
```
The following model was used to estimate the posterior distributions of the parameters using a Bayesian framework (i.e., "batch" parameter and state estimation)
```
sim_dates <- seq(as_date("1950-01-01"), length.out = 50, by = "1 year")
num_particles <- ????
biomass <- array(NA, dim = c(length(sim_dates), num_particles))
biomass[1, ] <- WHAT ARE YOU INITIAL VALUES FOR THE FIRST TIME STEP?
for(t in 2:length(sim_dates)){
for(i in 1:num_particles){
biomass_predicted <- biomass[t-1, i] + param$g - param$u * biomass[t-1, i]
biomass[t, i] <- biomass_predicted + rnorm(1, mean = 0 , sd = param$sd_add)
}
}
```
The MCMC chain has posterior distributions for the parameters (`g`, `u`, `sd_add`) and biomass at age 50 (`age50_biomass`)
```{r}
df <- read_csv("https://github.com/frec-5174/eco4cast-in-R-book/raw/main/data/PF_MCMC.csv", show_col_types = FALSE)
```
```{r}
df |>
pivot_longer(-c(".draw"), names_to = "parameter_or_state", values_to = "value") |>
ggplot() +
geom_histogram(aes(x = value)) +
facet_wrap(~parameter_or_state, scales = "free") +
theme_bw()
```
### Part 2: Forecast using PF
Now you will use the MCMC chain to determine the mean parameter values and the initial condition at age 50 for the particle filter.
Using the lecture material create a particle filter that uses the forest growth model to simulate the aboveground biomass of the forest for age 50 through 70.
#### Step 1: Set up PF
**Question 1**
Following the code in the PF lecture set up the particle filter.
Be sure to:
- use the mean values for `g`, `u`, and `sd_add` from the MCMC chain as the parameter values
- use the distribution of the biomass at age 50 in the MCMC chain as your initial state for the PF (e.g., generate `num_particles` samples from the `age50_biomass` column in the csv provided above).
- The standard deviation for the observations (`sd_data`) is 200.
```{r}
#ADD CODE TO SET UP PF HERE
```
#### Step 2: Run particle filter without assimilating data
Write the code and run the particle filter based on the examples from the lecture. Do not assimilate any data in this run (e.g, use `y <- rep(NA, num_particles)` for your observations)
```{r}
#ADD PF CODE HERE
```
#### Step 3: Visualize particle filter output
Generate a plot that visualizes the output of the PF (see examples from the lecture). Your plot must have age on the x-axis and biomass on the y-axis with different lines for the particles.
```{r}
# ADD VISUALIZATION CODE HERE
```
#### Step 4: Save PF output
use this code to save your PF output as the object `initial_forecast`
```{r}
```
### Part 3:
Now we have new data!
```{r}
#| warning: false
new_data <- read_csv("https://github.com/frec-5174/eco4cast-in-R-book/raw/main/data/PF_data2.csv", show_col_types = FALSE)
ggplot(new_data, aes(x = datetime, y = biomass)) +
geom_point() +
labs(x = "age", y = "aboveground biomass (gC/m2)") +
theme_bw()
```
#### Step 1: Repeat the PF setup using data assimilation
Using the new data, repeat the PF set up in Part 2 Step 1. You will be starting at age 50 just like above but assimilating the data.
```{r}
#ADD CODE TO SET UP PF HERE
```
#### Step 2: Run particle filter using the new data
Using the new data, run the the PF again. This will be the same code as in Part 2 Step 2 (just copy and paste), except that data will be assimilated.
```{r}
#COPY AND PASTE PF CODE FROM ABOVE
```
#### Step 3: Visualize PF output
Generate a plot that visualizes the output of the PF (see examples from the lecture). Your plot must have age on the x-axis and biomass on the y-axis with different lines for the particles. Your observations from the new data must be on the plot.
```{r}
#ADD VISUALIZATION CODE HERE
```
#### Step 4: Save output
```{r}
```
### Part 4:
Combine the two PF run and evaluate how data assimilation influence the forecast of the last 10 years (age 60 to 70). Produce a plot with the mean and 90% CI for the initial_forecast and assimilated_forecast on the same plot. Include the observations in the plot.
```{r}
#ADD CODE TO COMPARE THE TWO PF OUTPUTS
```
### Part 5:
How did assimilating data influence your forecast for ages 60 to 70? Consider both the mean and uncertainty in your answer.