Quantile variant that returns a tibble #24

Open
hadley opened this issue Feb 8, 2019 · 2 comments

Comments

@hadley
Member

hadley commented Feb 8, 2019

tibble::as_tibble(as.list(quantile(1:5)))
#> # A tibble: 1 x 5
#>    `0%` `25%` `50%` `75%` `100%`
#>   <dbl> <dbl> <dbl> <dbl>  <dbl>
#> 1     1     2     3     4      5

Created on 2019-02-08 by the reprex package (v0.2.1.9000)

Will need to think carefully about how the columns should be named.
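
To make the naming question concrete, here is a small sketch comparing the default names with one possible alternative. The `q`-prefixed scheme is only an illustration and an assumption on my part, not something proposed in this thread:

```r
library(tibble)

probs <- c(0, 0.25, 0.5, 0.75, 1)

# Default names from quantile() are non-syntactic ("0%", "25%", ...),
# so the resulting columns need backticks, e.g. df$`25%`.
default_names <- names(quantile(1:5, probs = probs))

# Hypothetical alternative: syntactic names built from the probabilities.
q <- quantile(1:5, probs = probs, names = FALSE)
names(q) <- paste0("q", probs * 100)
syntactic <- as_tibble(as.list(q))

default_names
syntactic
```

Syntactic names are easier to use in later code, while the default percent labels match what `quantile()` already prints.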

@DavisVaughan
Member

A two-column tidy tibble may make sense:

# devtools::install_github("tidyverse/dplyr", ref = "across_simpler")
library(dplyr, warn.conflicts = FALSE)
library(tibble)
library(tidyr)

tidy_quantile <- function(x) {
  enframe(quantile(x, probs = c(0, .5, 1)), name = "percentile")
}

# ---

# One column
iris %>%
  group_by(Species) %>%
  summarise(tidy_quantile(Sepal.Length))
#> # A tibble: 9 x 3
#>   Species    percentile value
#>   <fct>      <chr>      <dbl>
#> 1 setosa     0%           4.3
#> 2 setosa     50%          5  
#> 3 setosa     100%         5.8
#> 4 versicolor 0%           4.9
#> 5 versicolor 50%          5.9
#> 6 versicolor 100%         7  
#> 7 virginica  0%           4.9
#> 8 virginica  50%          6.5
#> 9 virginica  100%         7.9

# Multiple columns with a return value in the most useful format
iris %>%
  pivot_longer(-Species, names_to = "measure") %>%
  group_by(Species, measure) %>%
  summarise(tidy_quantile(value))
#> # A tibble: 36 x 4
#> # Groups:   Species [3]
#>    Species measure      percentile value
#>    <fct>   <chr>        <chr>      <dbl>
#>  1 setosa  Petal.Length 0%           1  
#>  2 setosa  Petal.Length 50%          1.5
#>  3 setosa  Petal.Length 100%         1.9
#>  4 setosa  Petal.Width  0%           0.1
#>  5 setosa  Petal.Width  50%          0.2
#>  6 setosa  Petal.Width  100%         0.6
#>  7 setosa  Sepal.Length 0%           4.3
#>  8 setosa  Sepal.Length 50%          5  
#>  9 setosa  Sepal.Length 100%         5.8
#> 10 setosa  Sepal.Width  0%           2.3
#> # … with 26 more rows

Created on 2019-11-09 by the reprex package (v0.3.0.9000)

@hadley
Member Author

hadley commented Nov 29, 2019

Could possibly have quantile_wide() and quantile_long()? Or have some special data frame subclass so that you could t() it?
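
A minimal sketch of what the two variants might look like, combining the wide form from the first comment with the long form from the second. Only the function names come from the comment above. The signatures, the `probs` default, and the `quantile`/`value` column names are assumptions, not an agreed API:

```r
library(tibble)

# Hypothetical wide variant: one row, one column per probability.
quantile_wide <- function(x, probs = c(0, 0.25, 0.5, 0.75, 1), ...) {
  q <- stats::quantile(x, probs = probs, names = TRUE, ...)
  as_tibble(as.list(q))
}

# Hypothetical long variant: one row per probability.
# The probability is kept as a number rather than a "50%"-style label.
quantile_long <- function(x, probs = c(0, 0.25, 0.5, 0.75, 1), ...) {
  q <- stats::quantile(x, probs = probs, names = FALSE, ...)
  tibble(quantile = probs, value = q)
}

quantile_wide(1:5)
quantile_long(1:5)
```

Keeping the probability numeric in the long form avoids parsing percent strings later, though whether that is preferable to the character labels shown earlier is an open question.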
