Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Standardize problem types #51

Closed
gadenbuie opened this issue Sep 9, 2021 · 2 comments · Fixed by #59
Closed

Standardize problem types #51

gadenbuie opened this issue Sep 9, 2021 · 2 comments · Fixed by #59
Assignees

Comments

@gadenbuie
Copy link
Member

Since there's overlap between the problem type and the problem classes, we should standardize the problem names. Ideally we'd be able to identify the check function that creates a problem by inspecting the class name.

For example, the vec_check_levels() function now returns problems with type

  • n_levels
  • levels
  • level_order_diffs
  • level_order

but if all problem types started with levels_ we'd be able to identify the underlying check function more easily.

  • levels_n
  • levels
  • levels_order_diffs
  • levels_order

We should be careful to ensure that problem types from tbl_check_ and vec_check_ functions don't overlap unless it's unavoidable.

@rossellhayes
Copy link
Contributor

rossellhayes commented Sep 9, 2021

@gadenbuie This is great point. I'm wondering if it would be better to stack with prefixes rather than suffixes, e.g.

  • n_levels
  • levels
  • order_levels
  • diff_order_levels

Not as nice grammatically, but since general functions add prefixes (table_, column_, vector_), this would create a problem type structure of {general function}_{additional info}_{specific function}. That could make it easier to grep: ^{general} or {specific}$ to figure out the checking function.

@gadenbuie
Copy link
Member Author

gadenbuie commented Sep 10, 2021

Another option would be to move away from the pattern prefixed problem type pattern and instead use additional classes to capture table, column and vector variants of lower-level checks. Dispatch would happen on the class that generated the check, but within that method we would test for the variant and choose alternate messages.

.result   <- tibble::tibble(x = factor(1:3, letters[1:3]))
.solution <- tibble::tibble(x = factor(1:3, letters[3:1]))
vec_grade_levels()
(problem <- tbl_check_table())
#> <tblcheck problem>
#> The order of the levels in your `x` column are the reverse of the expected order.
#> $ type  : chr "column_reverse_levels"
#> $ column: chr "x"

In this case dispatch happened on column_reverse_levels_problem, followed by reverse_levels_problem.

class(problem)
#> [1] "column_reverse_levels_problem" "vector_reverse_levels_problem"
#> [3] "reverse_levels_problem"        "tblcheck_problem"             
#> [5] "gradethis_problem"

If we adjusted tbl_message.reverse_levels_problem to consolidate all reverse_levels problems (which will probably be renamed levels_reversed, btw), we could use internal logic to decide which string to use.

tbl_message.reverse_levels_problem <- function(problem, ...) {
  problem$msg <- problem$msg %||%
    if (is_problem(problem, "column")) {
      gettext("The order of the levels in your `{column}` column are the reverse of the expected order.")
    } else {
      gettext("The order of the levels in your result are the reverse of the expected order.")
    }
  
  glue::glue_data(problem, problem$msg, problem$exp_msg)
}

Then the problem$type doesn’t need to change and the class list can still show the higher-level general check function that was called.

problem$type <- "reverse_levels"
class(problem)[1:2] <- c("column_problem", "vector_problem")
problem
#> <tblcheck problem>
#> The order of the levels in your `x` column are the reverse of the expected order.
#> $ type  : chr "reverse_levels"
#> $ column: chr "x"
class(problem)
#> [1] "column_problem"         "vector_problem"         "reverse_levels_problem"
#> [4] "tblcheck_problem"       "gradethis_problem"

This feels a little bit cleaner to me since the underlying problem isn't changing, it's still a reversed levels problem, just in a column context. I think we've structured the classes so that the table/column/vector class variants mostly construct the message templates, in which case it wouldn't be too bad to have all of the message variants collocated in the same function, especially since is_problem() makes it easy to test.

We might run into trouble, though, if there's a place where there are vastly different implementations between table/column/vector and standard result variants. Even that we could probably get around with helper functions.

If we go with the above, I think the argument for starting with the problem type is even stronger:

problem
#> <tblcheck problem>
#> The order of the levels in your `x` column are the reverse of the expected order.
#> $ type  : chr "levels_reversed"
#> $ column: chr "x"
class(problem)
#> [1] "column_problem"         "vector_problem"         "levels_reversed_problem"
#> [4] "tblcheck_problem"       "gradethis_problem"

rossellhayes added a commit that referenced this issue Sep 13, 2021
rossellhayes added a commit that referenced this issue Sep 16, 2021
* Standardize problem types (#51)

* Do not use `seq_len(max_diffs)`, since this fails if `max_diffs` is Inf

* Don't pass `max_diffs` to `vec_check_values()`

* Add tests for `vec_grade_values()`

* Use `if_problem()` for logic in `tbl_message()` methods

* Fix tests with `max_diffs`

* Add location element to problems to track whether problem came from `table`, `column`, or `vector`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants