Summary
As a package developer, in order to reject vectors with too many distinct values for a factor, I would like is_fct_ish() to accept a max_levels argument that fails the entire vector when the number of unique levels exceeds the threshold.
Proposed signature
is_fct_ish(x, ..., levels = NULL, to_na = character(), max_levels = Inf)
Arguments
max_levels (numeric(1)) — The maximum number of distinct levels (unique non-NA values, after applying to_na) allowed across the entire vector. Defaults to Inf (no limit). When finite, the check applies to every non-NA value of the vector as a whole.
Returns: same as before. is_fct_ish() still returns a length-1 logical.
Behavior
- When max_levels = Inf (default), behavior is identical to the current implementation.
- When max_levels is finite, count the total number of unique non-NA values across the entire vector, after first removing any values in to_na.
- If that count exceeds max_levels, return FALSE.
- to_na values must not be counted as levels: coerce them to NA or remove them from the counted vector before counting unique values, consistent with how to_na is handled elsewhere.
- This must be fast, since it is meant to be a quick predicate check for if() calls, etc. Be sure to run this check only when it matters, and be smart about how things are compared.
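The counting rules above could be sketched roughly as follows. This is only an illustration in base R, not the package's implementation: exceeds_max_levels() is a hypothetical helper name, and it assumes x is an atomic vector or factor and that to_na is a character vector compared against the character form of the values.

```r
# Hypothetical helper (not part of the package's API): TRUE when x has more
# distinct non-NA values, after dropping to_na, than max_levels allows.
exceeds_max_levels <- function(x, to_na = character(), max_levels = Inf) {
  # Cheap exits: no limit, or the vector is too short to ever exceed it.
  if (is.infinite(max_levels) || length(x) <= max_levels) {
    return(FALSE)
  }
  # Deduplicate first so the remaining steps touch only a few values.
  vals <- as.character(unique(x))
  # Drop NA and anything that to_na would turn into NA.
  vals <- vals[!is.na(vals) & !(vals %in% to_na)]
  length(vals) > max_levels
}

exceeds_max_levels(c("a", "b", "c"), max_levels = 2)
exceeds_max_levels(c("a", "b", "N/A"), to_na = "N/A", max_levels = 2)
```

Under these assumptions, is_fct_ish() would return FALSE whenever such a helper returns TRUE. The two early exits keep the default path (max_levels = Inf) free of any extra work.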