forecast() doesn't back transform box_cox transformations if the lambda value isn't the same for all key groups #103

Closed
markfairbanks opened this issue Aug 2, 2019 · 4 comments

Comments

@markfairbanks

markfairbanks commented Aug 2, 2019

The following code doesn't automatically back transform a box_cox transformation:

UKLungDeaths <- as_tsibble(cbind(mdeaths, fdeaths), pivot_longer = TRUE)

UKLungDeaths %>%
  left_join(UKLungDeaths %>% features(value, features = guerrero)) %>%
  model(ETS(box_cox(value, lambda_guerrero))) %>%
  forecast(h = 12)

Is there a different way to specify different lambda vals for different key groups?

@mitchelloharawild
Member

You're actually touching on a really powerful feature of the transformations in fable: the ability to use other data and expressions when transforming the response.

library(fable)
library(dplyr)
library(feasts)
library(tsibble)
UKLungDeaths <- as_tsibble(cbind(mdeaths, fdeaths), pivot_longer = TRUE)

bc_lambda <- UKLungDeaths %>% 
  features(value, features = guerrero)

train_data <- UKLungDeaths %>%
  left_join(bc_lambda)
#> Joining, by = "key"

test_data <- new_data(UKLungDeaths, 12) %>% 
  left_join(bc_lambda)
#> Joining, by = "key"

train_data %>%
  model(
    ets = ETS(box_cox(value, first(lambda_guerrero)))
  ) %>%
  forecast(test_data) %>% 
  autoplot(UKLungDeaths)

Created on 2019-08-05 by the reprex package (v0.3.0)

The code you have written allows the lambda value in the Box-Cox transformation to change over time, although in practice it doesn't, because you've kept the value constant within each key. To simplify matters, I've set the transformation to use only the first value of lambda (within a key they are all the same anyway). This is important because otherwise fable can't tell whether the response is value or lambda_guerrero, so it models the transformed data without back-transformation. By using only the first value (now length 1), the response can be determined automatically. You can also specify the response explicitly using resp(), which is useful in other scenarios, for example resp(GDP)/Population.

When forecasting ahead, fable also needs to know what the future values of lambda will be (as the original transformation was sourced from the modelling data). These can be provided in the new_data object. The horizon interface (h = 12) is a convenient way to set up a future dataset; however, it cannot include additional required information (such as exogenous regressors, or data for transformations like lambda in this case).
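
To make the point about needing lambda at forecast time concrete, here is a minimal base R sketch of a per-key Box-Cox round trip. The helper names box_cox_sketch() and inv_box_cox_sketch() and the lambda values are made up for illustration; they use the textbook formula for positive data and are not fable's internals, which may differ in detail.

```r
# Hypothetical helpers for illustration only (textbook Box-Cox, positive y).
box_cox_sketch <- function(y, lambda) {
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}

inv_box_cox_sketch <- function(w, lambda) {
  if (lambda == 0) exp(w) else (lambda * w + 1)^(1 / lambda)
}

# Made-up lambdas, one per key (not the Guerrero estimates).
lambdas <- c(mdeaths = 0.2, fdeaths = -0.1)

# mdeaths and fdeaths ship with base R's datasets package.
series <- list(mdeaths = as.numeric(mdeaths), fdeaths = as.numeric(fdeaths))

# Transform each series with its own lambda, then invert with the same lambda.
round_trip <- lapply(names(series), function(key) {
  w <- box_cox_sketch(series[[key]], lambdas[[key]])
  inv_box_cox_sketch(w, lambdas[[key]])
})
names(round_trip) <- names(series)

# The original scale is recovered only when the matching lambda is used.
stopifnot(isTRUE(all.equal(round_trip$mdeaths, series$mdeaths)))
```

Inverting with the wrong key's lambda would not recover the original values, which is why the lambda column has to be present in the data supplied for the forecast period.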

@markfairbanks
Author

That all makes perfect sense. Thank you for the in-depth response.

@Tim-TU

Tim-TU commented Jun 22, 2020

Hi,

I'm trying to get the accuracy table for the training set of the example above.

For this I'm using the function fabletools::accuracy().

I get an error that the object 'lambda_guerrero' cannot be found.

Example code is attached:

## accuracy table for box_cox transformed response on training set: 

library(fable)
#> Loading required package: fabletools
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(feasts)
library(tsibble)

UKLungDeaths <- as_tsibble(cbind(mdeaths, fdeaths), pivot_longer = TRUE)

# not box_cox_transformed: 
mod_train <- UKLungDeaths %>%
  model(
    ets = ETS(value), 
    arima = ARIMA(value)
  )

accuracy(mod_train)
#> # A tibble: 4 x 10
#>   key     .model .type        ME  RMSE   MAE    MPE  MAPE  MASE     ACF1
#>   <chr>   <chr>  <chr>     <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl>    <dbl>
#> 1 fdeaths ets    Training  -7.96  65.7  45.3 -2.25   7.85 0.666  0.0463 
#> 2 fdeaths arima  Training -16.8   69.8  42.2 -4.01   7.50 0.620  0.0803 
#> 3 mdeaths ets    Training -13.4  155.  109.  -1.32   6.80 0.627  0.137  
#> 4 mdeaths arima  Training   9.71 147.   92.8  0.126  6.16 0.533 -0.00284



# with box_cox(): 
bc_lambda <- UKLungDeaths %>% 
  features(value, features = guerrero)

train_data_bc <- UKLungDeaths %>%
  left_join(bc_lambda, by = "key")

mod_train_bc <- train_data_bc %>%
  model(
    ets = ETS(box_cox(value, first(lambda_guerrero))), 
    arima = ARIMA(box_cox(value, first(lambda_guerrero)))
  )

accuracy(mod_train_bc)
#> Error: Problem with `mutate()` input `fit`.
#> x object 'lambda_guerrero' not found
#> i Input `fit` is `map(fit, accuracy, measures = measures, ...)`.

# Session Info: 
sessionInfo()
#> R version 4.0.1 (2020-06-06)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 10240)
#> 
#> Matrix products: default
#> 
#> locale:
#> [1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252   
#> [3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C                   
#> [5] LC_TIME=German_Germany.1252    
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] tsibble_0.9.1    feasts_0.1.4     dplyr_1.0.0      fable_0.2.1     
#> [5] fabletools_0.2.0
#> 
#> loaded via a namespace (and not attached):
#>  [1] Rcpp_1.0.4.6         urca_1.3-0           progressr_0.6.0     
#>  [4] pillar_1.4.4         compiler_4.0.1       highr_0.8           
#>  [7] tools_4.0.1          digest_0.6.25        lattice_0.20-41     
#> [10] nlme_3.1-148         lubridate_1.7.9      evaluate_0.14       
#> [13] lifecycle_0.2.0      tibble_3.0.1.9000    gtable_0.3.0        
#> [16] anytime_0.3.7        pkgconfig_2.0.3      rlang_0.4.6         
#> [19] cli_2.0.2            yaml_2.2.1           xfun_0.15           
#> [22] stringr_1.4.0        knitr_1.28           generics_0.0.2      
#> [25] vctrs_0.3.1          grid_4.0.1           tidyselect_1.1.0    
#> [28] glue_1.4.1           R6_2.4.1             fansi_0.4.1         
#> [31] distributional_0.1.0 rmarkdown_2.3        purrr_0.3.4         
#> [34] farver_2.0.3         ggplot2_3.3.2        tidyr_1.1.0         
#> [37] magrittr_1.5         scales_1.1.1         ellipsis_0.3.1      
#> [40] htmltools_0.5.0      assertthat_0.2.1     colorspace_1.4-1    
#> [43] utf8_1.1.4           stringi_1.4.6        munsell_0.5.0       
#> [46] crayon_1.3.4

Created on 2020-06-22 by the reprex package (v0.3.0)

Thanks in advance,
Tim

@mitchelloharawild
Member

Closing as this is an unrelated issue. A new issue with an MRE has been opened for it here: #301


3 participants