feat: add max_levels argument to is_fct_ish() #231

@jonthegeek

Summary

As a package developer, in order to reject vectors with too many distinct values for a factor, I would like is_fct_ish() to accept a max_levels argument that fails the entire vector when the number of unique levels exceeds the threshold.

Proposed signature

is_fct_ish(x, ..., levels = NULL, to_na = character(), max_levels = Inf)

Arguments

  • max_levels (numeric(1)): The maximum number of distinct levels (unique non-NA values, after applying to_na) allowed across the entire vector. Defaults to Inf (no limit). When finite, the limit is checked once against the vector as a whole, not element by element.

Return value: unchanged. is_fct_ish() still returns a length-1 logical.

Behavior

  • When max_levels = Inf (default), behavior is identical to the current implementation.
  • When max_levels is finite, count the total number of unique non-NA values across the entire vector, after first removing any values in to_na.
  • If that count exceeds max_levels, return FALSE.
  • to_na values must not be counted as levels: coerce them to NA or drop them before counting unique values, consistent with how to_na is handled elsewhere.
  • This must be fast, since is_fct_ish() is meant to be a quick predicate for if() calls and similar. Only do the counting when max_levels is finite, and keep the comparison itself cheap.
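
The counting rule above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the package's implementation: the helper name exceeds_max_levels() is hypothetical, and it only covers the level count, not the rest of what is_fct_ish() checks.

```r
# Hypothetical helper (not part of any package): TRUE when `x` has more
# distinct non-NA values than `max_levels`, ignoring anything in `to_na`.
exceeds_max_levels <- function(x, to_na = character(), max_levels = Inf) {
  # Fast path: the default (no limit) does no counting at all.
  if (max_levels == Inf) {
    return(FALSE)
  }
  # A vector can never have more distinct values than it has elements.
  if (length(x) <= max_levels) {
    return(FALSE)
  }
  # Deduplicate first so the remaining steps touch as few values as possible.
  values <- unique(x)
  values <- values[!is.na(values)]
  # Values listed in `to_na` are treated as missing, so they are not levels.
  values <- values[!(as.character(values) %in% to_na)]
  length(values) > max_levels
}

x <- c("a", "b", "c", "a", NA, "unknown")
exceeds_max_levels(x)                                      # no limit
exceeds_max_levels(x, max_levels = 3)                      # "unknown" counts
exceeds_max_levels(x, to_na = "unknown", max_levels = 3)   # "unknown" ignored
```

In this sketch, is_fct_ish() would return FALSE whenever the helper returns TRUE. The two early returns are one way to honor the "only run this when it matters" requirement; whether they are the right shortcuts for the real function is an open design question.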
