Standardize problem types #51

gadenbuie · 2021-09-09T15:35:06Z

Since there's overlap between the problem type and the problem classes, we should standardize the problem names. Ideally we'd be able to identify the check function that creates a problem by inspecting the class name.

For example, the vec_check_levels() function now returns problems with type

n_levels
levels
level_order_diffs
level_order

but if all problem types started with levels_ we'd be able to identify the underlying check function more easily.

levels_n
levels
levels_order_diffs
levels_order

We should be careful to ensure that problem types from tbl_check_ and vec_check_ functions don't overlap unless it's unavoidable.

The text was updated successfully, but these errors were encountered:

rossellhayes · 2021-09-09T16:55:13Z

@gadenbuie This is great point. I'm wondering if it would be better to stack with prefixes rather than suffixes, e.g.

n_levels
levels
order_levels
diff_order_levels

Not as nice grammatically, but since general functions add prefixes (table_, column_, vector_), this would create a problem type structure of {general function}_{additional info}_{specific function}. That could make it easier to grep: ^{general} or {specific}$ to figure out the checking function.

gadenbuie · 2021-09-10T13:47:40Z

Another option would be to move away from the pattern prefixed problem type pattern and instead use additional classes to capture table, column and vector variants of lower-level checks. Dispatch would happen on the class that generated the check, but within that method we would test for the variant and choose alternate messages.

.result   <- tibble::tibble(x = factor(1:3, letters[1:3]))
.solution <- tibble::tibble(x = factor(1:3, letters[3:1]))
vec_grade_levels()
(problem <- tbl_check_table())
#> <tblcheck problem>
#> The order of the levels in your `x` column are the reverse of the expected order.
#> $ type  : chr "column_reverse_levels"
#> $ column: chr "x"

In this case dispatch happened on column_reverse_levels_problem, followed by reverse_levels_problem.

class(problem)
#> [1] "column_reverse_levels_problem" "vector_reverse_levels_problem"
#> [3] "reverse_levels_problem"        "tblcheck_problem"             
#> [5] "gradethis_problem"

If we adjusted tbl_message.reverse_levels_problem to consolidate all reverse_levels problems (which will probably be renamed levels_reversed, btw), we could use internal logic to decide which string to use.

tbl_message.reverse_levels_problem <- function(problem, ...) {
  problem$msg <- problem$msg %||%
    if (is_problem(problem, "column")) {
      gettext("The order of the levels in your `{column}` column are the reverse of the expected order.")
    } else {
      gettext("The order of the levels in your result are the reverse of the expected order.")
    }
  
  glue::glue_data(problem, problem$msg, problem$exp_msg)
}

Then the problem$type doesn’t need to change and the class list can still show the higher-level general check function that was called.

problem$type <- "reverse_levels"
class(problem)[1:2] <- c("column_problem", "vector_problem")
problem
#> <tblcheck problem>
#> The order of the levels in your `x` column are the reverse of the expected order.
#> $ type  : chr "reverse_levels"
#> $ column: chr "x"
class(problem)
#> [1] "column_problem"         "vector_problem"         "reverse_levels_problem"
#> [4] "tblcheck_problem"       "gradethis_problem"

This feels a little bit cleaner to me since the underlying problem isn't changing, it's still a reversed levels problem, just in a column context. I think we've structured the classes so that the table/column/vector class variants mostly construct the message templates, in which case it wouldn't be too bad to have all of the message variants collocated in the same function, especially since is_problem() makes it easy to test.

We might run into trouble, though, if there's a place where there are vastly different implementations between table/column/vector and standard result variants. Even that we could probably get around with helper functions.

If we go with the above, I think the argument for starting with the problem type is even stronger:

problem
#> <tblcheck problem>
#> The order of the levels in your `x` column are the reverse of the expected order.
#> $ type  : chr "levels_reversed"
#> $ column: chr "x"
class(problem)
#> [1] "column_problem"         "vector_problem"         "levels_reversed_problem"
#> [4] "tblcheck_problem"       "gradethis_problem"

* Standardize problem types (#51) * Do not use `seq_len(max_diffs)`, since this fails if `max_diffs` is Inf * Don't pass `max_diffs` to `vec_check_values()` * Add tests for `vec_grade_values()` * Use `if_problem()` for logic in `tbl_message()` methods * Fix tests with `max_diffs` * Add location element to problems to track whether problem came from `table`, `column`, or `vector`

This was referenced Sep 9, 2021

General checking functions should only rely on specific checking functions #52

Closed

Check for reversed levels and report both actual and expected levels #50

Merged

rossellhayes self-assigned this Sep 9, 2021

rossellhayes added the refactoring label Sep 9, 2021

rossellhayes added a commit that referenced this issue Sep 13, 2021

Standardize problem types (#51)

9dc2fee

rossellhayes mentioned this issue Sep 13, 2021

Standardize problem types #59

Merged

rossellhayes closed this as completed in #59 Sep 16, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Standardize problem types #51

Standardize problem types #51

gadenbuie commented Sep 9, 2021

rossellhayes commented Sep 9, 2021 •

edited

Loading

gadenbuie commented Sep 10, 2021 •

edited

Loading

Standardize problem types #51

Standardize problem types #51

Comments

gadenbuie commented Sep 9, 2021

rossellhayes commented Sep 9, 2021 • edited Loading

gadenbuie commented Sep 10, 2021 • edited Loading

rossellhayes commented Sep 9, 2021 •

edited

Loading

gadenbuie commented Sep 10, 2021 •

edited

Loading