Bad column selections fail gracefully at interrogation #499
I'm feeling pretty good about the coverage of bad column selection behaviors so far! I borrowed an existing (now outdated) batch test setup to run various column selection scenarios for all validation functions, organized into three groups:
All column selection failures now show up as 💥 in the report, and skipping behavior has been completely decoupled from column selection.

```r
agent %>%
  col_vals_lt(columns = stop("Oh no!"), value = 5) %>%
  interrogate()
#> Error in `col_vals_lt()`:
#> ! Problem while evaluating `stop("Oh no!")`.
#> Caused by error:
#> ! Oh no!
```

The consequences of this PR are documented with the following setup:

```r
agent <- create_agent(tbl = small_table[, c("a", "b", "c")])
mixed_cols <- c("a", "z")
select_exprs <- rlang::quos(
  empty = ,
  null = NULL,
  exists = a,
  nonexistent = z,
  mixed = c(a, z),
  mixed_all = all_of(mixed_cols),
  mixed_any = any_of(mixed_cols),
  empty_tidyselect = starts_with("z")
)
```

Current behaviors summarized below:
| | empty/NULL | exists | nonexistent | mixed | mixed_all | mixed_any | empty_tidyselect |
|---|---|---|---|---|---|---|---|
| n_steps | 1 | 1 | 1 | 2 | 2 | 1 | 1 |
| column | NA | a | z | c("a", "z") | c("a", "z") | a | NA |
| eval_error | TRUE | FALSE | TRUE | TRUE | TRUE | FALSE | TRUE |

`rows_*()` functions

| | empty/NULL | exists | nonexistent | mixed | mixed_all | mixed_any | empty_tidyselect |
|---|---|---|---|---|---|---|---|
| n_steps | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| column | a, b, c | a | z | a, z | a, z | a | NA |
| eval_error | FALSE | FALSE | TRUE | TRUE | TRUE | FALSE | TRUE |

`col_exists()` function

| | empty/NULL | exists | nonexistent | mixed | mixed_all | mixed_any | empty_tidyselect |
|---|---|---|---|---|---|---|---|
| n_steps | 1 | 1 | 1 | 2 | 2 | 1 | 1 |
| column | NA | a | z | c("a", "z") | c("a", "z") | a | NA |
| eval_error | TRUE | FALSE | FALSE | FALSE | FALSE | FALSE | FALSE |
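
For comparison, here is a minimal sketch of how most of the same selection expressions behave under bare tidyselect, outside of pointblank. It is not the batch test itself: `tbl` is a made-up stand-in for `small_table[, c("a", "b", "c")]`, `try_select()` is a hypothetical helper, and the `empty`/`null` scenarios are left out to keep the sketch small.

```r
library(rlang)
library(tidyselect)

# Made-up stand-in for small_table[, c("a", "b", "c")]
tbl <- data.frame(a = 1:3, b = letters[1:3], c = c(TRUE, FALSE, TRUE))
mixed_cols <- c("a", "z")

select_exprs <- quos(
  exists = a,
  nonexistent = z,
  mixed = c(a, z),
  mixed_all = all_of(mixed_cols),
  mixed_any = any_of(mixed_cols),
  empty_tidyselect = starts_with("z")
)

# Hypothetical helper: evaluate one selection against `data` and record
# either the selected column names or the fact that tidyselect errored.
try_select <- function(quo, data) {
  tryCatch(
    list(columns = names(eval_select(quo, data)), error = FALSE),
    error = function(e) list(columns = NA_character_, error = TRUE)
  )
}

results <- lapply(select_exprs, try_select, data = tbl)

# Which scenarios raised a tidyselect error?
vapply(results, function(x) x$error, logical(1))
```

Note that tidyselect alone does not treat `starts_with("z")` as an error; it simply returns zero columns. Flagging that case as an evaluation failure, as in the tables above, is pointblank's own decision layered on top.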
The PR just tidies up expected behaviors and doesn't introduce any new features. As long as the above looks good, I think that should finally (let's hope!) wrap up the tidyselect integration in `columns`! (I'll need to check up on the YAML stuff again after this PR, but that should be a simple follow-up.)
This is fantastic! The use of I'm not sure what could/should be done about the
I agree! I actually went out of my way to make I just added a
I've edited the table in my prev comment to reflect this change.

One minor aesthetic thing I'd also like to tackle while I'm on this topic is the fact that if 0 columns are selected, the report doesn't tell the user how they got there (it just shows an empty cell).

Mini proposal

A step like this that fails to select a column from an expression

```r
create_agent(small_table) %>%
  col_is_logical(starts_with("z")) %>%
  interrogate()
```

currently renders 💥 with nothing to show for it. But it could be neat and informative if it instead showed something like this:
Thanks for making these changes. There is now a lot more consistency, and the lack thereof (before this PR) definitely tripped up a few users! Also, I really like the proposal for the reporting change. Having the report be more informative is definitely a good thing!
Done! Now if the user attempts a dynamic column selection but none are found, the report will display the expression instead (colored red if that's part of the eval error). Some examples at
Whoa!! Yes, this is very good. Very, very good. The use of red for the explodey state is inspired!
LFGTM!
Super good work here! As ever, feel free to merge whenever!
Oops, didn't realize I dismissed your review by pushing doc/news updates 😅 - thanks for catching that! I'll make new PRs for completeness of tidyselect coverage (for
Following up on #497, this PR ensures that:

- Empty/NULL `columns` cause the step to 💥, unless a default is present (ex: `everything()` in `rows_*()` functions)
- `small_table %>% ... starts_with("z")` also causes the step to 💥
- `small_table %>% ... all_of("z")` causes the step to 💥 and will also show the attempted column `z` in the report

Luckily, this required minimal changes to the code. The lifecycle of 0-column selection 💥 is now the following:

1. `resolve_columns()` returns `NA` for 0-column selections, both in the case of `columns=NULL/<empty>` and tidyselecting 0 columns. From this point onwards, the two are treated as the same (we can always later recover how 0 columns were selected by looking at `$validation_set$columns_expr`).
2. `create_validation_step()` writes `NA` for the `$column` column of the validation step.
3. In `interrogate()`, the `NA` is read in as-is at each step, and passed down to individual `interrogate_*()` functions. Previously, it would use this info to skip certain steps (I mistook this for a feature that should be applied to all failure-to-select cases, hence introducing the skipping bug!)
4. The `get_column_as_sym_at_idx()` internal called by the `interrogate_*()` functions allows `NA` to pass through as-is. Previously, it'd turn `NA` into `"NA"`.
5. `NA` is caught inside `column_validity_has_columns()`, called at the top of each `interrogate_*()` function, and the error thrown from there is signaled to the report with a "... yielded no columns" evaluation failure.

Current behavior (will write them into tests later):
NULL/missing booms
Selecting non-existent column booms
And shows the column attempted to select
And when tidyselect returns no matches, `columns` is empty, like in the case of `NULL`:
Selecting existing column succeeds
Selecting a mix booms selectively
(NEW) `any_of()` safely selects only the existing columns

I totally forgot about `any_of()` and I feel like it does some of the job we want `has_columns()` to do. Curious what your thoughts are on advertising this as one safe way to select columns in validation functions!
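
To make the `any_of()` point concrete, here is a rough sketch of the 0-column lifecycle described above, written against bare tidyselect. It is not pointblank's internal code: `resolve_columns_sketch()` and `check_has_columns_sketch()` are hypothetical stand-ins, `tbl` is made-up data, and the error message only approximates the "... yielded no columns" wording.

```r
library(rlang)
library(tidyselect)

# Made-up table with columns a, b, c
tbl <- data.frame(a = 1:3, b = 4:6, c = 7:9)

# Hypothetical stand-in for the resolution step:
# a selection that matches no columns is recorded as NA.
resolve_columns_sketch <- function(data, selection) {
  cols <- names(eval_select(selection, data))
  if (length(cols) == 0) NA_character_ else cols
}

# Hypothetical stand-in for the check at the top of each interrogation
# function: an NA column is turned into an evaluation failure.
check_has_columns_sketch <- function(columns) {
  if (length(columns) == 1 && is.na(columns)) {
    abort("Column selection yielded no columns.")
  }
  invisible(columns)
}

# any_of() keeps the existing column and silently drops the missing one
some_found <- resolve_columns_sketch(tbl, quo(any_of(c("a", "z"))))

# ...but if nothing matches, the selection is still a 0-column selection
none_found <- resolve_columns_sketch(tbl, quo(any_of("z")))

check_has_columns_sketch(some_found)   # passes
outcome <- tryCatch(
  check_has_columns_sketch(none_found),
  error = function(e) conditionMessage(e)
)
```

Under these assumptions, `any_of()` is only "safe" as long as at least one of the requested columns exists; when none do, the step would still 💥 like any other 0-column selection.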