Feature tidyselect #188

Merged
merged 25 commits into from
May 22, 2019

Conversation

jgabry
Member

@jgabry jgabry commented May 7, 2019

Closes #161
Closes #183

This PR allows the pars argument of the MCMC plotting functions to be a list of quosures, as returned by dplyr::vars(). Internally, bayesplot calls tidyselect::vars_select() when pars is a list of quosures. If pars is a character vector, the behavior is unchanged from previous versions of bayesplot, which preserves backwards compatibility.

Example usage

x <- example_mcmc_draws()
mcmc_hist(x, pars = vars(-contains("beta")))
mcmc_hist(x, pars = vars(alpha, sigma))
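To see what a vars() specification resolves to without drawing a plot, the same selections can be tried directly on a vector of names with tidyselect::vars_select(), the function the description above says bayesplot calls internally. This is only a rough sketch: the parameter names below are assumed for illustration and are not taken from example_mcmc_draws().

```r
library(tidyselect)

# Hypothetical parameter names (an assumption for this sketch)
par_names <- c("alpha", "sigma", "beta[1]", "beta[2]")

# Drop every name containing "beta", as in vars(-contains("beta"))
kept <- unname(vars_select(par_names, -contains("beta")))
kept

# Pick parameters by bare name, as in vars(alpha, sigma)
picked <- unname(vars_select(par_names, alpha, sigma))
picked
```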

Important changes for review

The important changes for review are in helpers-mcmc.R

if (rlang::is_quosures(pars)) {
  pars <- tidyselect_parameters(complete_pars = parameter_names(x),
                                pars_list = pars)
} else {
  pars <- select_parameters(complete_pars = parameter_names(x),
                            explicit = pars,
                            patterns = regex_pars)
}

and the new file tidy-params.R, which is mostly for examples of tidy parameter selection but also contains this new internal function:

tidyselect_parameters <- function(complete_pars, pars_list) {
  # Make the tidyselect helpers (starts_with(), contains(), ...) available
  # to each quosure when it is evaluated
  helpers <- tidyselect::vars_select_helpers
  pars_list <- lapply(pars_list, rlang::env_bury, !!! helpers)
  selected <- tidyselect::vars_select(.vars = complete_pars, !!! pars_list)
  # Fail loudly if the selection matched nothing
  if (!length(selected)) {
    stop("No parameters were found matching those names.", call. = FALSE)
  }
  return(unname(selected))
}
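
The branch in helpers-mcmc.R hinges on rlang::is_quosures(). Here is a minimal sketch of what that check distinguishes. It uses rlang::quos(), which builds the same kind of quosure list that dplyr::vars() returns. The object names tidy_pars and chr_pars are illustrative only.

```r
library(rlang)

# A list of quosures, like the one dplyr::vars() produces.
# The expressions are captured, not evaluated, so starts_with()
# does not need to be defined here.
tidy_pars <- quos(alpha, starts_with("beta"))

# A plain character vector, the pre-existing way to pass `pars`
chr_pars <- c("alpha", "beta[1]")

is_quosures(tidy_pars)  # TRUE: would take the tidyselect_parameters() branch
is_quosures(chr_pars)   # FALSE: would take the select_parameters() branch
```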

@jgabry jgabry added the feature label May 7, 2019
@jgabry jgabry requested a review from tjmahr May 7, 2019 18:12
@codecov-io

codecov-io commented May 7, 2019

Codecov Report

Merging #188 into master will increase coverage by 0.02%.
The diff coverage is 100%.


@@            Coverage Diff             @@
##           master     #188      +/-   ##
==========================================
+ Coverage   99.33%   99.35%   +0.02%     
==========================================
  Files          30       31       +1     
  Lines        4212     4502     +290     
==========================================
+ Hits         4184     4473     +289     
- Misses         28       29       +1
Impacted Files             Coverage Δ
R/helpers-shared.R         100% <ø> (ø) ⬆️
R/mcmc-scatterplots.R      99.24% <ø> (-0.21%) ⬇️
R/helpers-mcmc.R           99.08% <100%> (+0.1%) ⬆️
R/tidy-params.R            100% <100%> (ø)
R/ppc-test-statistics.R    100% <0%> (ø) ⬆️
R/mcmc-traces.R            100% <0%> (ø) ⬆️
R/ppc-loo.R                100% <0%> (ø) ⬆️
R/mcmc-distributions.R     100% <0%> (ø) ⬆️
R/mcmc-recover.R           100% <0%> (ø) ⬆️
... and 3 more

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a947141...a1c1eec. Read the comment docs.

@jgabry
Member Author

jgabry commented May 19, 2019

Given that this PR makes it possible to use the tidyselect helpers (e.g., starts_with(), num_range(), etc.) to select parameters, does it make sense to provide some additional helpers that are convenient?

For example, if parameters are named "beta1", "beta2", ..., "beta10" and we want to select just 2 through 7 then the num_range() tidyselect helper works great:

mcmc_hist(..., pars = vars(num_range("beta", 2:7)))

but num_range() doesn't work if the parameters have brackets in their names like "beta[1]", "beta[2]", ..., "beta[10]", which is far more common. But we could conceivably export a helper function like this:

param_range <- function(prefix, range) {
  nms <- paste0(prefix, "[", range, "]")
  param_matches <- match(nms, table = tidyselect::peek_vars())
  param_matches[!is.na(param_matches)]
}

which would enable

mcmc_hist(..., pars = vars(param_range("beta", 2:7)))
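
The matching step inside param_range() can be checked on its own in base R. In the sketch below, the hypothetical function param_range_positions() takes the available names as an explicit vars argument in place of tidyselect::peek_vars(), and the parameter names are assumed for illustration.

```r
# Stand-in for param_range() with the names passed explicitly
# (hypothetical helper, not part of the PR)
param_range_positions <- function(prefix, range, vars) {
  nms <- paste0(prefix, "[", range, "]")
  param_matches <- match(nms, table = vars)
  param_matches[!is.na(param_matches)]
}

# Assumed parameter names: two scalars followed by beta[1], ..., beta[10]
vars <- c("alpha", "sigma", paste0("beta[", 1:10, "]"))

# Positions of beta[2], ..., beta[7] within `vars`
bracket_hits <- param_range_positions("beta", 2:7, vars)
bracket_hits

# Names built without brackets ("beta2", ..., "beta7") match nothing here
plain_hits <- match(paste0("beta", 2:7), vars)
plain_hits <- plain_hits[!is.na(plain_hits)]
plain_hits
```

Indices outside the available range are silently dropped, because unmatched names come back as NA and are filtered out.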

@tjmahr
Collaborator

tjmahr commented May 20, 2019

> num_range() doesn't work if the parameters have brackets in their names like "beta[1]", "beta[2]", ..., "beta[10]", which is far more common. But we could conceivably export a helper function like this:

This is great.

@jgabry
Member Author

jgabry commented May 20, 2019

Ok cool I’ll add that to the PR.

@tjmahr
Collaborator

tjmahr commented May 20, 2019

I was worried about mixed models where you get a lot going on in the brackets, so I tried to make a more general one. This would require glue, but glue has no dependencies of its own and is already required by most of the tidyverse packages that we depend on.

library(tidyverse)
#> Registered S3 methods overwritten by 'ggplot2':
#>   method         from 
#>   [.quosures     rlang
#>   c.quosures     rlang
#>   print.quosures rlang
d <- structure(list(b_Intercept = c(0.25, 0.13, 0.77, 0.67, 0.44, 
0.33), sd_condition__Intercept = c(1.01, 1.01, 0.78, 0.77, 0.89, 
0.76), sigma = c(0.63, 0.54, 0.62, 0.59, 0.47, 0.44), `r_condition[A,Intercept]` = c(-0.2, 
0.16, -0.82, -0.28, -0.53, -0.25), `r_condition[B,Intercept]` = c(0.98, 
0.53, 0.44, 0.15, 0.74, 0.41), `r_condition[C,Intercept]` = c(1.33, 
1.79, 0.83, 1.5, 1.15, 1.54), `r_condition[A,Slope]` = c(0.86, 
0.78, 0.19, -0.14, 0.87, 0.49), `r_condition[B,Slope]` = c(-1.18, 
-0.96, -1.66, -1.62, -1.53, -1.1), lp__ = c(-51.66, -51.15, -53.32, 
-56.79, -56.48, -54.96)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -6L))
  
param_glue <- function(pattern, ...) {
  dots <- as.list(expand.grid(...))
  nms <- as.character(glue::glue_data(dots, pattern))

  param_matches <- match(nms, tidyselect::peek_vars())
  param_matches[!is.na(param_matches)]
}

d
#> # A tibble: 6 x 9
#>   b_Intercept sd_condition__I~ sigma `r_condition[A,~ `r_condition[B,~
#>         <dbl>            <dbl> <dbl>            <dbl>            <dbl>
#> 1        0.25             1.01  0.63            -0.2              0.98
#> 2        0.13             1.01  0.54             0.16             0.53
#> 3        0.77             0.78  0.62            -0.82             0.44
#> 4        0.67             0.77  0.59            -0.28             0.15
#> 5        0.44             0.89  0.47            -0.53             0.74
#> 6        0.33             0.76  0.44            -0.25             0.41
#> # ... with 4 more variables: `r_condition[C,Intercept]` <dbl>,
#> #   `r_condition[A,Slope]` <dbl>, `r_condition[B,Slope]` <dbl>, lp__ <dbl>

d %>% 
  select(
    param_glue(
      "r_condition[{level},Intercept]", 
      level = c("A", "B"))
  )
#> # A tibble: 6 x 2
#>   `r_condition[A,Intercept]` `r_condition[B,Intercept]`
#>                        <dbl>                      <dbl>
#> 1                      -0.2                        0.98
#> 2                       0.16                       0.53
#> 3                      -0.82                       0.44
#> 4                      -0.28                       0.15
#> 5                      -0.53                       0.74
#> 6                      -0.25                       0.41

d %>% 
  select(
    param_glue(
      "r_condition[{level},{type}]", 
      level = c("A", "B"), 
      type = c("Intercept", "Slope"))
    )
#> # A tibble: 6 x 4
#>   `r_condition[A,In~ `r_condition[B,In~ `r_condition[A,S~ `r_condition[B,S~
#>                <dbl>              <dbl>             <dbl>             <dbl>
#> 1              -0.2                0.98              0.86             -1.18
#> 2               0.16               0.53              0.78             -0.96
#> 3              -0.82               0.44              0.19             -1.66
#> 4              -0.28               0.15             -0.14             -1.62
#> 5              -0.53               0.74              0.87             -1.53
#> 6              -0.25               0.41              0.49             -1.1

Created on 2019-05-20 by the reprex package (v0.3.0)
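
The column order in the second select() call follows from how expand.grid() enumerates combinations: the first argument varies fastest. The base R sketch below reproduces the name-building and matching steps. It substitutes paste0() for glue::glue_data() to stay dependency-free, and uses an explicit col_names vector (copied from the tibble above) in place of tidyselect::peek_vars().

```r
# Column names of `d` from the reprex above
col_names <- c(
  "b_Intercept", "sd_condition__Intercept", "sigma",
  "r_condition[A,Intercept]", "r_condition[B,Intercept]",
  "r_condition[C,Intercept]", "r_condition[A,Slope]",
  "r_condition[B,Slope]", "lp__"
)

# All combinations of the placeholders; `level` varies fastest
grid <- expand.grid(level = c("A", "B"), type = c("Intercept", "Slope"))

# paste0() stands in for glue here (a substitution made for this sketch)
nms <- paste0("r_condition[", grid$level, ",", grid$type, "]")
nms

# Positions of the generated names among the columns, unmatched names dropped
positions <- match(nms, col_names)
positions <- positions[!is.na(positions)]
positions
```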

@jgabry
Member Author

jgabry commented May 20, 2019

Good idea!

@jgabry
Member Author

jgabry commented May 20, 2019

I’ll incorporate that.

@jgabry
Member Author

jgabry commented May 21, 2019

I think this is ready to go.

@tjmahr
Collaborator

tjmahr commented May 22, 2019

I expanded the documentation for tidy-params to include all the select() gotchas I could think of. Everything else seems fine.

@jgabry
Member Author

jgabry commented May 22, 2019

Awesome, thanks!

@jgabry jgabry merged commit 98da77c into master May 22, 2019
@jgabry jgabry deleted the feature-tidyselect branch May 22, 2019 16:32

Successfully merging this pull request may close these issues:

'tidy' parameter selection
Exclude parameters (eg: "lp__")
3 participants