New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
forecast.TSLM does not use t distribution for prediction intervals #343
Comments
Because we assume normality when creating combination forecasts and in reconciliation, it would be problematic to use t distributions in fables by default. So it is probably best to add some comment in the documentation for forecast() to say we use a normal approximation. |
Thanks Rob, that sounds good to me. |
I've updated the docs and added a library(fpp3)
#> ── Attaching packages ──────────────────────────────────────────── fpp3 0.4.0 ──
#> ✓ tibble 3.1.3 ✓ tsibble 1.0.1
#> ✓ dplyr 1.0.7 ✓ tsibbledata 0.3.0
#> ✓ tidyr 1.1.3 ✓ feasts 0.2.2
#> ✓ lubridate 1.7.10 ✓ fable 0.3.1.9000
#> ✓ ggplot2 3.3.5.9000
#> ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
#> x lubridate::date() masks base::date()
#> x dplyr::filter() masks stats::filter()
#> x tsibble::intersect() masks base::intersect()
#> x tsibble::interval() masks lubridate::interval()
#> x dplyr::lag() masks stats::lag()
#> x tsibble::setdiff() masks base::setdiff()
#> x tsibble::union() masks base::union()
as_tsibble(USAccDeaths) %>%
model(lm = TSLM(log(value) ~ trend() + season())) %>%
forecast(approx_normal = FALSE) %>%
hilo()
#> # A tsibble: 24 x 6 [1M]
#> # Key: .model [1]
#> .model index value .mean `80%`
#> <chr> <mth> <dist> <dbl> <hilo>
#> 1 lm 1979 Jan t(t(59, 8.9, 0.055)) 7620. [7083.734, 8171.471]80
#> 2 lm 1979 Feb t(t(59, 8.8, 0.055)) 6899. [6413.748, 7398.606]80
#> 3 lm 1979 Mar t(t(59, 8.9, 0.055)) 7639. [7101.707, 8192.203]80
#> 4 lm 1979 Apr t(t(59, 9, 0.055)) 7841. [7289.238, 8408.530]80
#> 5 lm 1979 May t(t(59, 9.1, 0.055)) 8645. [8036.810, 9270.895]80
#> 6 lm 1979 Jun t(t(59, 9.1, 0.055)) 9087. [8447.703, 9744.882]80
#> 7 lm 1979 Jul t(t(59, 9.2, 0.055)) 9909. [9211.088, 10625.489]80
#> 8 lm 1979 Aug t(t(59, 9.1, 0.055)) 9237. [8587.107, 9905.692]80
#> 9 lm 1979 Sep t(t(59, 9, 0.055)) 8237. [7657.235, 8833.035]80
#> 10 lm 1979 Oct t(t(59, 9, 0.055)) 8517. [7917.291, 9133.023]80
#> # … with 14 more rows, and 1 more variable: 95% <hilo>
as_tsibble(USAccDeaths) %>%
model(lm = TSLM(log(value) ~ trend() + season())) %>%
forecast(approx_normal = TRUE) %>%
hilo()
#> # A tsibble: 24 x 6 [1M]
#> # Key: .model [1]
#> .model index value .mean `80%`
#> <chr> <mth> <dist> <dbl> <hilo>
#> 1 lm 1979 Jan t(N(8.9, 0.003)) 7620. [7089.402, 8164.937]80
#> 2 lm 1979 Feb t(N(8.8, 0.003)) 6899. [6418.880, 7392.690]80
#> 3 lm 1979 Mar t(N(8.9, 0.003)) 7639. [7107.390, 8185.653]80
#> 4 lm 1979 Apr t(N(9, 0.003)) 7841. [7295.070, 8401.807]80
#> 5 lm 1979 May t(N(9.1, 0.003)) 8645. [8043.240, 9263.482]80
#> 6 lm 1979 Jun t(N(9.1, 0.003)) 9087. [8454.462, 9737.091]80
#> 7 lm 1979 Jul t(N(9.2, 0.003)) 9908. [9218.459, 10616.993]80
#> 8 lm 1979 Aug t(N(9.1, 0.003)) 9237. [8593.978, 9897.772]80
#> 9 lm 1979 Sep t(N(9, 0.003)) 8237. [7663.362, 8825.973]80
#> 10 lm 1979 Oct t(N(9, 0.003)) 8516. [7923.626, 9125.721]80
#> # … with 14 more rows, and 1 more variable: 95% <hilo> Created on 2021-08-26 by the reprex package (v2.0.0) |
On a related issue @robjhyndman, when we Line 347 in 9b46cd8
|
|
I also noticed this. I was thinking of using
👍 |
When I compared the output of predict.lm to forecast.TSLM the prediction intervals are wider. Looking at the code, it appears predict.lm uses the t distribution (as I'd expect) while forecast.TSLM uses the normal distribution.
Perhaps this is a convention in the time series world but if so could it be documented?
Happy to provide a MWE if that helps.
The text was updated successfully, but these errors were encountered: