Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

quantile() with multiple probabilities doesn't work for named distributions #79

Closed
Bisaloo opened this issue May 25, 2022 · 2 comments
Closed
Assignees

Comments

@Bisaloo
Copy link

Bisaloo commented May 25, 2022

quantile() fails when trying to get quantiles from a fbl_ts column with distributions. I was initially thinking it was affecting all tibbles but it seems restricted to fbl_ts, as shown in the reprex below where it's working for tibble and tsibble

library(distributional)

# Works with plain tibble
df <- tibble::tibble(
  name = paste0("dist", 1:5), 
  date = seq(as.Date("2022-05-20"), as.Date("2022-05-24"), by = 1),
  dist = dist_normal(1:5,1)
)

df |> 
  dplyr::mutate(q = quantile(dist, c(0.1, 0.9)))
#> # A tibble: 5 × 4
#>   name  date          dist q        
#>   <chr> <date>      <dist> <list>   
#> 1 dist1 2022-05-20 N(1, 1) <dbl [2]>
#> 2 dist2 2022-05-21 N(2, 1) <dbl [2]>
#> 3 dist3 2022-05-22 N(3, 1) <dbl [2]>
#> 4 dist4 2022-05-23 N(4, 1) <dbl [2]>
#> 5 dist5 2022-05-24 N(5, 1) <dbl [2]>

# Works with tsibble
df |> 
  tsibble::as_tsibble(key = name, index = date) |> 
  dplyr::mutate(q = quantile(dist, c(0.1, 0.9)))
#> # A tsibble: 5 x 4 [1D]
#> # Key:       name [5]
#>   name  date          dist q        
#>   <chr> <date>      <dist> <list>   
#> 1 dist1 2022-05-20 N(1, 1) <dbl [2]>
#> 2 dist2 2022-05-21 N(2, 1) <dbl [2]>
#> 3 dist3 2022-05-22 N(3, 1) <dbl [2]>
#> 4 dist4 2022-05-23 N(4, 1) <dbl [2]>
#> 5 dist5 2022-05-24 N(5, 1) <dbl [2]>

# Doesn't work with fbl_ts
library(fable)
#> Loading required package: fabletools
library(tsibbledata)

aus_production |>
  dplyr::select(Quarter, Cement) |>
  model(stl = NAIVE(Cement)) |>
  forecast(h = "5 years") |>
  dplyr::mutate(quantiles = quantile(Cement, c(0.1, 0.9)))
#> Error in `dplyr::mutate()`:
#> ! Problem while computing `quantiles = quantile(Cement, c(0.1, 0.9))`.
#> Caused by error in `FUN()`:
#> ! attempt to set 'colnames' on an object with less than two dimensions

#> Backtrace:
#>      ▆
#>   1. ├─dplyr::mutate(...)
#>   2. ├─dplyr:::mutate.data.frame(...)
#>   3. │ └─dplyr:::mutate_cols(.data, dplyr_quosures(...), caller_env = caller_env())
#>   4. │   ├─base::withCallingHandlers(...)
#>   5. │   └─mask$eval_all_mutate(quo)
#>   6. ├─stats::quantile(Cement, c(0.1, 0.9))
#>   7. ├─distributional:::quantile.distribution(Cement, c(0.1, 0.9))
#>   8. │ └─distributional:::dist_apply(x, quantile, p = p, ...)
#>   9. │   └─base::lapply(out, `colnames<-`, dn)
#>  10. │     └─base FUN(X[[i]], ...)
#>  11. │       └─base::stop("attempt to set 'colnames' on an object with less than two dimensions")
#>  12. └─base::.handleSimpleError(...)
#>  13.   └─dplyr h(simpleError(msg, call))
#>  14.     └─rlang::abort(...)

Created on 2022-05-25 by the reprex package (v2.0.1.9000)

The issue comes from

dn <- dimnames(x)

As shown in the 2nd reprex below, fbl_ts keep a dimname even after being pulled, which throws dist_apply() off:

library(fable)
#> Loading required package: fabletools
library(tsibbledata)

aus_production |>
  dplyr::select(Quarter, Cement) |>
  model(stl = NAIVE(Cement)) |>
  forecast(h = "5 years") |> 
  dplyr::pull(Cement) |> 
  dimnames()
#> [1] "Cement"

dplyr::starwars |> 
  dplyr::pull(height) |> 
  dimnames()
#> NULL

Created on 2022-05-25 by the reprex package (v2.0.1.9000)

Therefore, commenting

dn <- dimnames(x)

solves this specific issue but it probably causes problems elsewhere.

@mitchelloharawild
Copy link
Owner

MRE:

library(distributional)
x <- y <- dist_normal()
quantile(x, c(0.05, 0.95))
#> [[1]]
#> [1] -1.644854  1.644854
dimnames(y) <- "response"
quantile(y, c(0.05, 0.95))
#> Error in FUN(X[[i]], ...): attempt to set 'colnames' on an object with less than two dimensions

Created on 2022-09-02 by the reprex package (v2.0.1)

@mitchelloharawild mitchelloharawild self-assigned this Sep 1, 2022
@mitchelloharawild mitchelloharawild changed the title quantile() doesn't work on fbl_ts columns quantile() with multiple probabilities doesn't work for named distributions Sep 1, 2022
@mitchelloharawild
Copy link
Owner

Fixed, thanks!

library(distributional)
x <- y <- dist_normal()
quantile(x, c(0.05, 0.95))
#> [[1]]
#> [1] -1.644854  1.644854
dimnames(y) <- "response"
quantile(y, c(0.05, 0.95))
#> [[1]]
#> [1] -1.644854  1.644854

x <- y <- dist_multivariate_normal(mu = list(c(1,2)), 
                                   sigma = list(matrix(c(4,2,2,3), ncol=2)))
quantile(x, c(0.05, 0.95))
#> [[1]]
#>           [,1]       [,2]
#> [1,] -2.289707 -0.8489701
#> [2,]  4.289707  4.8489701
dimnames(y) <- c("a", "b")
quantile(y, c(0.05, 0.95))
#> [[1]]
#>              a          b
#> [1,] -2.289707 -0.8489701
#> [2,]  4.289707  4.8489701

Created on 2022-09-02 by the reprex package (v2.0.1)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants