Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vars() seems required for rows_*() but not other functions #416

Closed
6 tasks done
Aariq opened this issue Jun 16, 2022 · 6 comments
Closed
6 tasks done

vars() seems required for rows_*() but not other functions #416

Aariq opened this issue Jun 16, 2022 · 6 comments

Comments

@Aariq
Copy link

Aariq commented Jun 16, 2022

Prework

  • Read and agree to the code of conduct and contributing guidelines.
  • If there is already a relevant issue, whether open or closed, comment on the existing thread instead of posting a new issue.
  • Post a minimal reproducible example so the maintainer can troubleshoot the problems you identify. A reproducible example is:
    • Runnable: post enough R code and data so any onlooker can create the error on their own computer.
    • Minimal: reduce runtime wherever possible and remove complicated details that are irrelevant to the issue at hand.
    • Readable: format your code according to the tidyverse style guide.

Description

Not entirely sure this is a bug, but it is inconsistent. The examples for pointblank use vars() to select columns, but it seems like c() works identically for most functions except rows_distinct() and rows_complete().

Reproducible example

library(pointblank)
agent <- 
  create_agent(
    tbl = small_table
  )

x <- agent %>%
  col_vals_lt(vars(a), value = 10)
x <- agent %>%
  col_vals_lt(c(a), value = 10)

x <- agent %>%
  rows_distinct(vars(a))
x <- agent %>%
  rows_distinct(c(a))
#> Error in rlang::eval_tidy(columns): object 'a' not found

x <- agent %>%
  rows_complete(vars(a))
x <- agent %>%
  rows_complete(c(a))
#> Error in rlang::eval_tidy(columns): object 'a' not found

Created on 2022-06-16 by the reprex package (v2.0.1)

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.2.0 (2022-04-22)
#>  os       macOS Big Sur/Monterey 10.16
#>  system   x86_64, darwin17.0
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       America/New_York
#>  date     2022-06-16
#>  pandoc   2.17.1.1 @ /Applications/RStudio.app/Contents/MacOS/quarto/bin/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version     date (UTC) lib source
#>  assertthat    0.2.1       2019-03-21 [1] CRAN (R 4.2.0)
#>  blastula      0.3.2       2020-05-19 [1] CRAN (R 4.2.0)
#>  cli           3.3.0       2022-04-25 [1] CRAN (R 4.2.0)
#>  crayon        1.5.1       2022-03-26 [1] CRAN (R 4.2.0)
#>  DBI           1.1.2       2021-12-20 [1] CRAN (R 4.2.0)
#>  digest        0.6.29      2021-12-01 [1] CRAN (R 4.2.0)
#>  dplyr         1.0.9       2022-04-28 [1] CRAN (R 4.2.0)
#>  ellipsis      0.3.2       2021-04-29 [1] CRAN (R 4.2.0)
#>  evaluate      0.15        2022-02-18 [1] CRAN (R 4.2.0)
#>  fansi         1.0.3       2022-03-24 [1] CRAN (R 4.2.0)
#>  fastmap       1.1.0       2021-01-25 [1] CRAN (R 4.2.0)
#>  fs            1.5.2       2021-12-08 [1] CRAN (R 4.2.0)
#>  generics      0.1.2       2022-01-31 [1] CRAN (R 4.2.0)
#>  glue          1.6.2       2022-02-24 [1] CRAN (R 4.2.0)
#>  highr         0.9         2021-04-16 [1] CRAN (R 4.2.0)
#>  htmltools     0.5.2       2021-08-25 [1] CRAN (R 4.2.0)
#>  knitr         1.39        2022-04-26 [1] CRAN (R 4.2.0)
#>  lifecycle     1.0.1       2021-09-24 [1] CRAN (R 4.2.0)
#>  magrittr      2.0.3       2022-03-30 [1] CRAN (R 4.2.0)
#>  pillar        1.7.0       2022-02-01 [1] CRAN (R 4.2.0)
#>  pkgconfig     2.0.3       2019-09-22 [1] CRAN (R 4.2.0)
#>  pointblank  * 0.10.0.9000 2022-06-09 [1] Github (rich-iannone/pointblank@2cf8431)
#>  purrr         0.3.4       2020-04-17 [1] CRAN (R 4.2.0)
#>  R6            2.5.1       2021-08-19 [1] CRAN (R 4.2.0)
#>  reprex        2.0.1       2021-08-05 [1] CRAN (R 4.2.0)
#>  rlang         1.0.2       2022-03-04 [1] CRAN (R 4.2.0)
#>  rmarkdown     2.14        2022-04-25 [1] CRAN (R 4.2.0)
#>  rstudioapi    0.13        2020-11-12 [1] CRAN (R 4.2.0)
#>  sessioninfo   1.2.2       2021-12-06 [1] CRAN (R 4.2.0)
#>  stringi       1.7.6       2021-11-29 [1] CRAN (R 4.2.0)
#>  stringr       1.4.0       2019-02-10 [1] CRAN (R 4.2.0)
#>  tibble        3.1.7       2022-05-03 [1] CRAN (R 4.2.0)
#>  tidyselect    1.1.2       2022-02-21 [1] CRAN (R 4.2.0)
#>  utf8          1.2.2       2021-07-24 [1] CRAN (R 4.2.0)
#>  vctrs         0.4.1       2022-04-13 [1] CRAN (R 4.2.0)
#>  withr         2.5.0       2022-03-03 [1] CRAN (R 4.2.0)
#>  xfun          0.31        2022-05-10 [1] CRAN (R 4.2.0)
#>  yaml          2.3.5       2022-02-21 [1] CRAN (R 4.2.0)
#> 
#>  [1] /Library/Frameworks/R.framework/Versions/4.2/Resources/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────

Expected result

I would have expected either vars() or c() to work in the columns argument for rows_distinct() as either one works for many other functions in pointblank.

@Aariq
Copy link
Author

Aariq commented Jun 16, 2022

maybe related to #414

@Aariq Aariq changed the title vars() seems required for rows_distinct() but not other functions vars() seems required for rows_*() but not other functions Jun 16, 2022
@rich-iannone rich-iannone added this to the v0.11.0 milestone Jun 18, 2022
@rich-iannone
Copy link
Member

Thanks for submitting this issue and the previous one! Agree that it’s inconsistent and I should be able to address this soon.

@rich-iannone rich-iannone modified the milestones: v0.11.0, v0.12.0 Jun 23, 2022
@Aariq
Copy link
Author

Aariq commented Jul 14, 2022

I think I noticed that the segments argument also requires vars() and doesn't work with c()

@rich-iannone
Copy link
Member

Thanks @Aariq , I made a new issue to track this. Eventually I do want to get all these column/value inconsistencies fixed up.

Maybe off topic a bit but the place where things are non-ideal is in the value argument in many of the validation functions. It can take either literal values or even columns (where vars() might be best). I have to think a bit more on the UX implications of this doing further work on that, but perhaps there is nothing else that could be done with that particular arg without breaking a lot.

@marianschmidt
Copy link

Since vars() is flagged as superseeded in the linked dplyr documentation, maybe better break it now than later.

For all expectations culumn only works with tidyselection helpers or column names, whereas value gives an error when using tidyselect ("must be used within a selecting function") and only works with vars().

@yjunechoe
Copy link
Collaborator

Thanks for the patience! This is now supported in devel. columns = vars(a) and columns = c(a) are identical sans the captured user expression:

agent <- 
  create_agent(
    tbl = small_table
  )

x1 <- agent %>%
  rows_distinct(vars(a))
x2 <- agent %>%
  rows_distinct(c(a))

waldo::compare(x1, x2)
#> `old$validation_set$columns_expr`: "vars(a)"
#> `new$validation_set$columns_expr`: "c(a)"

x3 <- agent %>%
  rows_complete(vars(a))
x4 <- agent %>%
  rows_complete(c(a))

waldo::compare(x3, x4)
#> `old$validation_set$columns_expr`: "vars(a)"
#> `new$validation_set$columns_expr`: "c(a)"

We now encourage using c() instead of vars() for column selection (like dplyr::select()). Please file an issue if any vars()->c() conversion breaks your existing code!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants