Documentation and general cleanup for #493, #495 #496

Merged on Oct 31, 2023 (46 commits)
Commits
f25e659
bug fix tidyselect for validation steps inside serially()
yjunechoe Oct 29, 2023
7f91c61
first pass of vars()-stripping in examples
yjunechoe Oct 29, 2023
10bfbae
fix example typo
yjunechoe Oct 29, 2023
2f1046d
document()
yjunechoe Oct 29, 2023
2f603fe
fix serially() example in yaml section
yjunechoe Oct 29, 2023
7494a42
allow passing dots down into eval_select()
yjunechoe Oct 29, 2023
908a6ae
fix bug in col_exists() allowing any arbitrary errors in column selec…
yjunechoe Oct 29, 2023
050f89d
test various column selection failure behaviors of col_exists()
yjunechoe Oct 29, 2023
6a59cec
tidyselect 0-column selection in col_exists() should fail gracefully
yjunechoe Oct 29, 2023
9dacc71
bring back old behavior of error when no `columns` provided to `col_e…
yjunechoe Oct 29, 2023
e3f9675
resolve_columns() passes down validation call context
yjunechoe Oct 29, 2023
2e889dc
fix typo in test
yjunechoe Oct 29, 2023
72a030e
lintr
yjunechoe Oct 29, 2023
ee61368
clean up yaml_agent_string() mutually exclusive arg logic
yjunechoe Oct 29, 2023
bde99bb
default to c()-expr when writing columns to yaml
yjunechoe Oct 30, 2023
644c948
read c()-expr from yaml as language not character
yjunechoe Oct 30, 2023
751d06a
change some yaml tests to expect writing to c()
yjunechoe Oct 30, 2023
44d510e
test columns c()-expr roundtrip
yjunechoe Oct 30, 2023
f42385a
test defaulting to c() for wrapping columns in yaml
yjunechoe Oct 30, 2023
895eee9
remove vars() from yaml section in docs
yjunechoe Oct 30, 2023
a0d696d
more cleanup of vars() in docs
yjunechoe Oct 30, 2023
98d2718
document()
yjunechoe Oct 30, 2023
07c6636
enquo() columns only once
yjunechoe Oct 30, 2023
22e5b23
document generic glue and multi-length vector support in label
yjunechoe Oct 30, 2023
883078c
point to Label section for more info
yjunechoe Oct 30, 2023
12002d0
give tidy-select to columns argument signature and reference dplyr se…
yjunechoe Oct 30, 2023
56d82ff
update Column Names section
yjunechoe Oct 30, 2023
dbd424d
add Labels section
yjunechoe Oct 30, 2023
2748a5c
wording
yjunechoe Oct 30, 2023
701ce84
keep e.g. style
yjunechoe Oct 30, 2023
045aa70
explicit everything() default for columns arg in formals
yjunechoe Oct 30, 2023
aefc633
repeat for expect and test
yjunechoe Oct 30, 2023
f787627
separate out yaml tests for columns
yjunechoe Oct 30, 2023
7bfa3e7
prune NULL=everything() code
yjunechoe Oct 30, 2023
a3db90a
rows*() functions write column exprs to yaml
yjunechoe Oct 30, 2023
da379f9
edit tests to expect column exprs from rows* functions
yjunechoe Oct 30, 2023
559be0c
test everything() round-tripping
yjunechoe Oct 30, 2023
8ae7d5a
document()
yjunechoe Oct 30, 2023
dfad342
document everything() default for columns
yjunechoe Oct 30, 2023
7b6029f
update column names section
yjunechoe Oct 30, 2023
de6ec55
document()
yjunechoe Oct 30, 2023
732ef04
col_exists inherits tidyselect column signature
yjunechoe Oct 30, 2023
7bb6050
add Labels section to individual functions
yjunechoe Oct 30, 2023
43356ec
document()
yjunechoe Oct 30, 2023
f5a6225
add NEWS item for tidyselect in columns
yjunechoe Oct 30, 2023
51099b9
remove reference to vars()
yjunechoe Oct 30, 2023
4 changes: 4 additions & 0 deletions NEWS.md
Expand Up @@ -2,6 +2,10 @@

## New features

* Full `{tidyselect}` support for the `columns` argument of all validation functions. You can now use the full range of familiar column-selection expressions that you could also use in `dplyr::select()`. This also begins a process of deprecation:
- `vars()` for selecting columns will continue to work, but `c()` now supersedes `vars()`.
- If passing an *external vector* of column names, it should be wrapped in `all_of()`.

* The `label` argument of validation functions now exposes the following string variables via `{glue}` syntax:

- `"{.step}"`: The validation step name
Expand Down
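The NEWS entry above lists the selection forms that `columns` now accepts. As a rough illustration of the underlying `{tidyselect}` semantics (not of pointblank's internals), the sketch below evaluates the same kinds of expressions directly with `tidyselect::eval_select()`. The table `tbl` and the vector `cols` are hypothetical and exist only for this example.

```r
library(tidyselect)

# Hypothetical table, not part of this PR
tbl <- data.frame(
  a = 1:3,
  b = c("x", "y", "z"),
  date_start = Sys.Date() + 0:2
)

# Bare names in c(), the form that supersedes vars()
sel_c <- names(eval_select(quote(c(a, b)), tbl))

# tidyselect helpers such as contains() and where()
sel_contains <- names(eval_select(quote(contains("date")), tbl))
sel_where <- names(eval_select(quote(where(is.numeric)), tbl))

# An external character vector should be wrapped in all_of()
cols <- c("a", "date_start")
sel_all_of <- names(eval_select(quote(all_of(cols)), tbl))

print(list(sel_c, sel_contains, sel_where, sel_all_of))
```

In a validation function the same expressions would be passed straight to `columns`, for example `columns = all_of(cols)`.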
14 changes: 7 additions & 7 deletions R/action_levels.R
Expand Up @@ -154,10 +154,10 @@
#' actions = al
#' ) %>%
#' col_vals_gt(
-#' columns = vars(a), value = 2
+#' columns = a, value = 2
#' ) %>%
#' col_vals_lt(
-#' columns = vars(d), value = 20000
+#' columns = d, value = 20000
#' ) %>%
#' interrogate()
#' ```
Expand Down Expand Up @@ -188,11 +188,11 @@
#' actions = al
#' ) %>%
#' col_vals_gt(
-#' columns = vars(a), value = 2,
+#' columns = a, value = 2,
#' actions = warn_on_fail(warn_at = 0.5)
#' ) %>%
#' col_vals_lt(
-#' columns = vars(d), value = 20000
+#' columns = d, value = 20000
#' ) %>%
#' interrogate()
#' ```
Expand Down Expand Up @@ -220,7 +220,7 @@
#' ```r
#' small_table %>%
#' col_vals_gt(
-#' columns = vars(a), value = 2,
+#' columns = a, value = 2,
#' actions = warn_on_fail(warn_at = 2)
#' )
#' ```
Expand Down Expand Up @@ -256,7 +256,7 @@
#'
#' ```r
#' small_table %>%
-#' col_vals_gt(columns = vars(a), value = 2)
+#' col_vals_gt(columns = a, value = 2)
#' ```
#'
#' ```
Expand All @@ -273,7 +273,7 @@
#' ```r
#' small_table %>%
#' col_vals_gt(
-#' columns = vars(a), value = 2,
+#' columns = a, value = 2,
#' actions = stop_on_fail(stop_at = 1)
#' )
#' ```
Expand Down
6 changes: 3 additions & 3 deletions R/all_passed.R
Expand Up @@ -77,9 +77,9 @@
#' ```r
#' agent <-
#' create_agent(tbl = tbl) %>%
-#' col_vals_gt(columns = vars(a), value = 3) %>%
-#' col_vals_lte(columns = vars(a), value = 10) %>%
-#' col_vals_increasing(columns = vars(a)) %>%
+#' col_vals_gt(columns = a, value = 3) %>%
+#' col_vals_lte(columns = a, value = 10) %>%
+#' col_vals_increasing(columns = a) %>%
#' interrogate()
#' ```
#'
Expand Down
11 changes: 11 additions & 0 deletions R/col_count_match.R
Expand Up @@ -103,6 +103,17 @@
#' depending on the situation (the first produces a warning, the other
#' `stop()`s).
#'
+#' @section Labels:
+#'
+#' `label` may be a single string or a character vector that matches the number
+#' of expanded steps. `label` also supports `{glue}` syntax and exposes the
+#' following dynamic variables contextualized to the current step:
+#'
+#' - `"{.step}"`: The validation step name
+#'
+#' The glue context also supports ordinary expressions for further flexibility
+#' (e.g., `"{toupper(.step)}"`) as long as they return a length-1 string.
+#'
#' @section Briefs:
#'
#' Want to describe this validation step in some detail? Keep in mind that this
Expand Down
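The Labels section added above describes `{glue}` templates evaluated against per-step variables. The sketch below only mimics that idea with `glue::glue_data()` and a hand-built context; how pointblank actually constructs the context is not shown in this diff, so `step_context` is a hypothetical stand-in. The `.col` variable is the one documented for column-based functions such as `col_exists()` later in this PR.

```r
library(glue)

# Hypothetical per-step context; pointblank builds its own internally
step_context <- list(.step = "col_exists", .col = "a")

# Plain variable interpolation
label_plain <- glue_data(step_context, "[{.step}] checking column {.col}")

# Ordinary expressions are allowed as long as they return a length-1 string
label_upper <- glue_data(step_context, "{toupper(.step)}")

print(label_plain)
print(label_upper)
```

A character vector of labels would be handled the same way, with one template per expanded step.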
75 changes: 48 additions & 27 deletions R/col_exists.R
Expand Up @@ -34,14 +34,6 @@
#'
#' @inheritParams col_vals_gt
#'
-#' @param columns *The target columns*
-#'
-#' `vector<character>|vars(<columns>)`` // **required**
-#'
-#' One or more columns from the table in focus. This can be
-#' provided as a vector of column names using `c()` or bare column names
-#' enclosed in [vars()].
-#'
#' @return For the validation function, the return value is either a
#' `ptblank_agent` object or a table object (depending on whether an agent
#' object or a table was passed to `x`). The expectation function invisibly
Expand Down Expand Up @@ -69,12 +61,17 @@
#'
#' @section Column Names:
#'
-#' If providing multiple column names, the result will be an expansion of
-#' validation steps to that number of column names (e.g., `vars(col_a, col_b)`
-#' will result in the entry of two validation steps). Aside from column names in
-#' quotes and in `vars()`, **tidyselect** helper functions are available for
-#' specifying columns. They are: `starts_with()`, `ends_with()`, `contains()`,
-#' `matches()`, and `everything()`.
+#' `columns` may be a single column (as symbol `a` or string `"a"`) or a vector
+#' of columns (`c(a, b, c)` or `c("a", "b", "c")`). `{tidyselect}` helpers
+#' are also supported, such as `contains("date")` and `where(is.double)`. If
+#' passing an *external vector* of columns, it should be wrapped in `all_of()`.
+#'
+#' When multiple columns are selected by `columns`, the result will be an
+#' expansion of validation steps to that number of columns (e.g.,
+#' `c(col_a, col_b)` will result in the entry of two validation steps).
+#'
+#' Previously, columns could be specified in `vars()`. This continues to work,
+#' but `c()` offers the same capability and supersedes `vars()` in `columns`.
#'
#' @section Actions:
#'
Expand All @@ -89,6 +86,18 @@
#' depending on the situation (the first produces a warning, the other
#' `stop()`s).
#'
+#' @section Labels:
+#'
+#' `label` may be a single string or a character vector that matches the number
+#' of expanded steps. `label` also supports `{glue}` syntax and exposes the
+#' following dynamic variables contextualized to the current step:
+#'
+#' - `"{.step}"`: The validation step name
+#' - `"{.col}"`: The current column name
+#'
+#' The glue context also supports ordinary expressions for further flexibility
+#' (e.g., `"{toupper(.step)}"`) as long as they return a length-1 string.
+#'
#' @section Briefs:
#'
#' Want to describe this validation step in some detail? Keep in mind that this
Expand All @@ -113,7 +122,7 @@
#' ```r
#' agent %>%
#' col_exists(
-#' columns = vars(a),
+#' columns = a,
#' actions = action_levels(warn_at = 0.1, stop_at = 0.2),
#' label = "The `col_exists()` step.",
#' active = FALSE
Expand All @@ -125,7 +134,7 @@
#' ```yaml
#' steps:
#' - col_exists:
-#' columns: vars(a)
+#' columns: c(a)
#' actions:
#' warn_fraction: 0.1
#' stop_fraction: 0.2
Expand Down Expand Up @@ -164,7 +173,7 @@
#' ```r
#' agent <-
#' create_agent(tbl = tbl) %>%
-#' col_exists(columns = vars(a)) %>%
+#' col_exists(columns = a) %>%
#' interrogate()
#' ```
#'
Expand All @@ -185,7 +194,7 @@
#' The behavior of side effects can be customized with the `actions` option.
#'
#' ```{r}
-#' tbl %>% col_exists(columns = vars(a))
+#' tbl %>% col_exists(columns = a)
#' ```
#'
#' ## C: Using the expectation function
Expand All @@ -194,7 +203,7 @@
#' time. This is primarily used in **testthat** tests.
#'
#' ```r
-#' expect_col_exists(tbl, columns = vars(a))
+#' expect_col_exists(tbl, columns = a)
#' ```
#'
#' ## D: Using the test function
Expand All @@ -203,7 +212,7 @@
#' us.
#'
#' ```{r}
-#' tbl %>% test_col_exists(columns = vars(a))
+#' tbl %>% test_col_exists(columns = a)
#' ```
#'
#' @family validation functions
Expand All @@ -229,23 +238,35 @@ col_exists <- function(
preconditions <- NULL
values <- NULL

-# Get `columns` as a label
-columns_expr <-
-rlang::as_label(rlang::quo(!!enquo(columns))) %>%
-gsub("^\"|\"$", "", .)
-
# Capture the `columns` expression
columns <- rlang::enquo(columns)
+# Get `columns` as a label
+columns_expr <- as_columns_expr(columns)
+# Require columns to be specified
+if (rlang::quo_is_missing(columns)) {
+stop('argument "columns" is missing, with no default')
+}
if (rlang::quo_is_null(columns)) {
columns <- rlang::quo(tidyselect::everything())
}

# Resolve the columns based on the expression
## Only for `col_exists()`: error gracefully if column not found
columns <- tryCatch(
-expr = resolve_columns(x = x, var_expr = columns, preconditions = NULL),
-error = function(cnd) cnd$i %||% NA_character_
+expr = resolve_columns(x = x, var_expr = columns, preconditions = NULL,
+allow_empty = FALSE),
+error = function(cnd) cnd$i %||% cnd
)
+if (rlang::is_error(columns)) {
+cnd <- columns
+# tidyselect 0-column selection should contextualize attempted column
+if (is.null(cnd$parent)) {
+columns <- columns_expr
+} else {
+# Evaluation errors should be chained and rethrown
+rlang::abort("Evaluation error in `columns`", parent = cnd$parent)
+}
+}

if (is_a_table_object(x)) {

Expand Down
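The error handling in the hunk above separates two cases: a selection error with no parent, where the attempted column label is kept so the step can fail gracefully, and an error that wraps an evaluation error, which is chained and rethrown. A minimal sketch of that pattern follows. `resolve_sketch()` and `columns_or_label()` are hypothetical stand-ins written for this example, the error messages are made up, and the `cnd$i` shortcut from the real code is left out.

```r
library(rlang)

# Hypothetical stand-in for the internal resolver: it succeeds, fails with a
# plain selection error, or fails with an error wrapping an evaluation error
resolve_sketch <- function(kind) {
  switch(
    kind,
    ok = "a",
    not_found = abort("Can't select columns that don't exist."),
    eval_error = tryCatch(
      stop("object 'x' not found"),
      error = function(e) abort("Problem while evaluating.", parent = e)
    )
  )
}

# Same shape as the handling in the diff: fall back to the attempted column
# label when the error has no parent, otherwise chain and rethrow
columns_or_label <- function(kind, label) {
  out <- tryCatch(resolve_sketch(kind), error = function(cnd) cnd)
  if (inherits(out, "error")) {
    if (is.null(out$parent)) {
      return(label)
    }
    abort("Evaluation error in `columns`", parent = out$parent)
  }
  out
}

resolved <- columns_or_label("ok", label = "z")
fallback <- columns_or_label("not_found", label = "z")
rethrown <- tryCatch(
  columns_or_label("eval_error", label = "z"),
  error = function(cnd) cnd
)
```

Keeping the original error as `parent` means the user still sees the underlying cause (here, the missing object) beneath the "Evaluation error in `columns`" message.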
48 changes: 31 additions & 17 deletions R/col_is_character.R
Expand Up @@ -63,12 +63,17 @@
#'
#' @section Column Names:
#'
-#' If providing multiple column names, the result will be an expansion of
-#' validation steps to that number of column names (e.g., `vars(col_a, col_b)`
-#' will result in the entry of two validation steps). Aside from column names in
-#' quotes and in `vars()`, **tidyselect** helper functions are available for
-#' specifying columns. They are: `starts_with()`, `ends_with()`, `contains()`,
-#' `matches()`, and `everything()`.
+#' `columns` may be a single column (as symbol `a` or string `"a"`) or a vector
+#' of columns (`c(a, b, c)` or `c("a", "b", "c")`). `{tidyselect}` helpers
+#' are also supported, such as `contains("date")` and `where(is.double)`. If
+#' passing an *external vector* of columns, it should be wrapped in `all_of()`.
+#'
+#' When multiple columns are selected by `columns`, the result will be an
+#' expansion of validation steps to that number of columns (e.g.,
+#' `c(col_a, col_b)` will result in the entry of two validation steps).
+#'
+#' Previously, columns could be specified in `vars()`. This continues to work,
+#' but `c()` offers the same capability and supersedes `vars()` in `columns`.
#'
#' @section Actions:
#'
Expand All @@ -84,6 +89,18 @@
#' 1)` or `action_levels(stop_at = 1)` are good choices depending on the
#' situation (the first produces a warning, the other will `stop()`).
#'
+#' @section Labels:
+#'
+#' `label` may be a single string or a character vector that matches the number
+#' of expanded steps. `label` also supports `{glue}` syntax and exposes the
+#' following dynamic variables contextualized to the current step:
+#'
+#' - `"{.step}"`: The validation step name
+#' - `"{.col}"`: The current column name
+#'
+#' The glue context also supports ordinary expressions for further flexibility
+#' (e.g., `"{toupper(.step)}"`) as long as they return a length-1 string.
+#'
#' @section Briefs:
#'
#' Want to describe this validation step in some detail? Keep in mind that this
Expand All @@ -108,7 +125,7 @@
#' ```r
#' agent %>%
#' col_is_character(
-#' columns = vars(a),
+#' columns = a,
#' actions = action_levels(warn_at = 0.1, stop_at = 0.2),
#' label = "The `col_is_character()` step.",
#' active = FALSE
Expand All @@ -120,7 +137,7 @@
#' ```yaml
#' steps:
#' - col_is_character:
-#' columns: vars(a)
+#' columns: c(a)
#' actions:
#' warn_fraction: 0.1
#' stop_fraction: 0.2
Expand Down Expand Up @@ -159,7 +176,7 @@
#' ```r
#' agent <-
#' create_agent(tbl = tbl) %>%
-#' col_is_character(columns = vars(b)) %>%
+#' col_is_character(columns = b) %>%
#' interrogate()
#' ```
#'
Expand All @@ -181,7 +198,7 @@
#'
#' ```{r}
#' tbl %>%
-#' col_is_character(columns = vars(b)) %>%
+#' col_is_character(columns = b) %>%
#' dplyr::slice(1:5)
#' ```
#'
Expand All @@ -191,7 +208,7 @@
#' time. This is primarily used in **testthat** tests.
#'
#' ```r
-#' expect_col_is_character(tbl, columns = vars(b))
+#' expect_col_is_character(tbl, columns = b)
#' ```
#'
#' ## D: Using the test function
Expand All @@ -200,7 +217,7 @@
#' us.
#'
#' ```{r}
-#' tbl %>% test_col_is_character(columns = vars(b))
+#' tbl %>% test_col_is_character(columns = b)
#' ```
#'
#' @family validation functions
Expand All @@ -226,13 +243,10 @@ col_is_character <- function(
preconditions <- NULL
values <- NULL

-# Get `columns` as a label
-columns_expr <-
-rlang::as_label(rlang::quo(!!enquo(columns))) %>%
-gsub("^\"|\"$", "", .)
-
# Capture the `columns` expression
columns <- rlang::enquo(columns)
+# Get `columns` as a label
+columns_expr <- as_columns_expr(columns)

# Resolve the columns based on the expression
columns <- resolve_columns(x = x, var_expr = columns, preconditions = NULL)
Expand Down