Closed
Description
I am trying to get a prediction interval for a simple drift model. I observe some small differences between the numbers produced by fable
and the numbers that I would get by following the procedure from the reference that the docs point to:
https://otexts.com/fpp3/prediction-intervals.html
Suppose I generate a random series like this:
set.seed(123)
s0 <- rgamma(1,15,1)
slope <- rgamma(1,2,1)
slen <- 20
line <- s0 + slope * seq(1, slen)
s <- line * runif(slen, min=.7, max=1.3)
print(s)
[1] 11.45998 19.38955 27.64921 26.48457 28.24855 41.17304 34.62969 43.08775 44.09223
[10] 34.70767 60.59940 44.23600 40.24937 52.73816 79.06283 80.72730 76.69836 78.13231
[19] 97.72110 86.03193
According to the reference, I should be able to generate a forecast following a random walk with drift model as follows:
est_slope <- (s[slen] - s[1]) / (slen - 1)
seq_fc <- seq(1, 10)
forecasted <- s[slen] + est_slope * seq_fc
forecasted
[1] 89.95677 93.88161 97.80645 101.73129 105.65613 109.58097 113.50581 117.43064
[9] 121.35548 125.28032
So far, so good - I get the same result from fable:
tbl <- data.frame(
series = s,
day = as.Date("2020-01-01") + seq(0, slen-1)
)
tbl <- as_tsibble(tbl, index="day")
tbl %>%
model(pred = RW(series ~ drift())) %>%
forecast(h = 10) %$%
.mean
[1] 89.95677 93.88161 97.80645 101.73129 105.65613 109.58097 113.50581 117.43064
[9] 121.35548 125.28032
Now, according to the book, I should be able to calculate an 80% prediction interval with the following formula:
fitted <- s[1] + est_slope * seq(0, slen-1)
res_std <- sqrt(sum((s - fitted)^2) / (slen - 1))
alpha <- qnorm(0.9)
sd_multiplier <- sqrt(seq_fc * (1 + seq_fc / slen))
upper_pi_textbook <- forecasted + alpha * res_std * sd_multiplier
upper_pi_textbook
[1] 101.6944 110.8717 119.0827 126.8274 134.2930 141.5723 148.7186 155.7655 162.7355
[10] 169.6443
But this doesn't match with fable's:
upper_pi_fable <- tbl %>%
model(pred = RW(series ~ drift())) %>%
forecast(h = 10) %>%
hilo(80) %>%
mutate(upper = `80%`$upper) %$%
upper
upper_pi_fable
[1] 105.2335 116.0474 125.6243 134.6085 143.2401 151.6377 159.8690 167.9765 175.9887
[10] 183.9256
Full code:
library(magrittr)
library(dplyr)
library(fable)
#### random data
set.seed(123)
s0 <- rgamma(1,15,1)
slope <- rgamma(1,2,1)
slen <- 20
line <- s0 + slope * seq(1, slen)
s <- line * runif(slen, min=.7, max=1.3)
#### textbook's formula
est_slope <- (s[slen] - s[1]) / (slen - 1)
seq_fc <- seq(1, 10)
forecasted <- s[slen] + est_slope * seq_fc
fitted <- s[1] + est_slope * seq(0, slen-1)
res_std <- sqrt(sum((s - fitted)^2) / (slen - 1))
alpha <- qnorm(0.9)
sd_multiplier <- sqrt(seq_fc * (1 + seq_fc / slen))
upper_pi_textbook <- forecasted + alpha * res_std * sd_multiplier
### fable
tbl <- data.frame(
series = s,
day = as.Date("2020-01-01") + seq(0, slen-1)
)
tbl <- as_tsibble(tbl, index="day")
upper_pi_fable <- tbl %>%
model(pred = RW(series ~ drift())) %>%
forecast(h = 10) %>%
hilo(80) %>%
mutate(upper = `80%`$upper) %$%
upper
Metadata
Metadata
Assignees
Labels
No labels