Skip to content

RW + drift prediction interval differs from textbook #368

Closed
@david-cortes

Description

@david-cortes

I am trying to get a prediction interval for a simple drift model. I observe some small differences between the numbers produced by fable and the numbers that I would get by following the procedure from the reference that the docs point to:
https://otexts.com/fpp3/prediction-intervals.html

Suppose I generate a random series like this:

set.seed(123)
s0 <- rgamma(1,15,1)
slope <- rgamma(1,2,1)
slen <- 20
line <- s0 + slope * seq(1, slen)
s <- line * runif(slen, min=.7, max=1.3)
print(s)
 [1] 11.45998 19.38955 27.64921 26.48457 28.24855 41.17304 34.62969 43.08775 44.09223
[10] 34.70767 60.59940 44.23600 40.24937 52.73816 79.06283 80.72730 76.69836 78.13231
[19] 97.72110 86.03193

According to the reference, I should be able to generate a forecast following a random walk with drift model as follows:

est_slope <- (s[slen] - s[1]) / (slen - 1)
seq_fc <- seq(1, 10)
forecasted <- s[slen] + est_slope * seq_fc
forecasted
 [1]  89.95677  93.88161  97.80645 101.73129 105.65613 109.58097 113.50581 117.43064
 [9] 121.35548 125.28032

So far, so good - I get the same result from fable:

tbl <- data.frame(
    series = s,
    day = as.Date("2020-01-01") + seq(0, slen-1)
)
tbl <- as_tsibble(tbl, index="day")
tbl %>%
    model(pred = RW(series ~ drift())) %>%
    forecast(h = 10) %$%
    .mean
 [1]  89.95677  93.88161  97.80645 101.73129 105.65613 109.58097 113.50581 117.43064
 [9] 121.35548 125.28032

Now, according to the book, I should be able to calculate an 80% prediction interval with the following formula:

fitted <- s[1] + est_slope * seq(0, slen-1)
res_std <- sqrt(sum((s - fitted)^2) / (slen - 1))
alpha <- qnorm(0.9)
sd_multiplier <- sqrt(seq_fc * (1 + seq_fc / slen))
upper_pi_textbook <- forecasted + alpha * res_std * sd_multiplier
upper_pi_textbook
 [1] 101.6944 110.8717 119.0827 126.8274 134.2930 141.5723 148.7186 155.7655 162.7355
[10] 169.6443

But this doesn't match with fable's:

upper_pi_fable <- tbl %>%
    model(pred = RW(series ~ drift())) %>%
    forecast(h = 10) %>%
    hilo(80) %>%
    mutate(upper = `80%`$upper) %$%
    upper
upper_pi_fable
 [1] 105.2335 116.0474 125.6243 134.6085 143.2401 151.6377 159.8690 167.9765 175.9887
[10] 183.9256

Full code:

library(magrittr)
library(dplyr)
library(fable)

#### random data
set.seed(123)
s0 <- rgamma(1,15,1)
slope <- rgamma(1,2,1)
slen <- 20
line <- s0 + slope * seq(1, slen)
s <- line * runif(slen, min=.7, max=1.3)

#### textbook's formula
est_slope <- (s[slen] - s[1]) / (slen - 1)
seq_fc <- seq(1, 10)
forecasted <- s[slen] + est_slope * seq_fc

fitted <- s[1] + est_slope * seq(0, slen-1)
res_std <- sqrt(sum((s - fitted)^2) / (slen - 1))
alpha <- qnorm(0.9)
sd_multiplier <- sqrt(seq_fc * (1 + seq_fc / slen))
upper_pi_textbook <- forecasted + alpha * res_std * sd_multiplier


### fable
tbl <- data.frame(
    series = s,
    day = as.Date("2020-01-01") + seq(0, slen-1)
)
tbl <- as_tsibble(tbl, index="day")

upper_pi_fable <- tbl %>%
    model(pred = RW(series ~ drift())) %>%
    forecast(h = 10) %>%
    hilo(80) %>%
    mutate(upper = `80%`$upper) %$%
    upper

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions