Currently, only mode_is_trivial() and mode_count_range() have a max_unique parameter. I think all metadata functions should have one. The remaining ones are:
mode_count()
mode_frequency()
mode_frequency_range()
Why do they need the parameter? Consider x <- c(7, 7, 8, 8, NA). Assume that the NA is known to be either 7 or 8, i.e., max_unique = "known". The functions currently handle x too conservatively because they cannot tell whether NA represents one of the known values. If they knew that it does, they would work differently:
mode_count() would be able to conclude that NA breaks the tie so that the mode count is 1.
mode_frequency() would be able to conclude that the mode frequency is 3.
mode_frequency_range() would know that the maximal frequency is the actual one and return c(3, 3), just like mode_count_range(x, max_unique = "known") already returns c(1, 1).
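To make the reasoning about the frequency range concrete, here is a minimal base R sketch. It is not the package's implementation: `frequency_range_sketch()` and its `na_is_known` argument are hypothetical names that stand in for what `max_unique = "known"` would express, and the helper assumes that `x` has at least one known value.

```r
# Hypothetical helper, not part of the package: lower and upper bounds
# on the frequency of the mode. Assumes at least one non-NA value in `x`.
frequency_range_sketch <- function(x, na_is_known = FALSE) {
  n_na <- sum(is.na(x))
  freq <- as.integer(table(x[!is.na(x)]))
  # Upper bound: every NA stands for the most frequent known value.
  freq_max <- max(freq) + n_na
  if (na_is_known) {
    # Each NA must land on a known value. Spreading them as evenly as
    # possible still forces some value to reach at least the ceiling of
    # the average count per known value.
    freq_min <- max(max(freq), ceiling(length(x) / length(freq)))
  } else {
    # Each NA may be a new value of its own, so the known maximum
    # frequency is the lower bound.
    freq_min <- max(freq)
  }
  c(freq_min, freq_max)
}

x <- c(7, 7, 8, 8, NA)
frequency_range_sketch(x)                      # 2 3
frequency_range_sketch(x, na_is_known = TRUE)  # 3 3
```

With the extra assumption, the lower bound meets the upper bound, which is the c(3, 3) result described above.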
However, this is just a corner case: a single value is missing, and all known values are equally frequent. max_unique doesn't matter to these functions otherwise, except for mode_frequency_range() if all known values are equal.
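The corner case for the mode count can be checked by brute force, at least for a single NA. The sketch below is only an illustration under that assumption: `count_modes()` and `possible_mode_counts()` are hypothetical helpers, not package functions, and the value 9 is an arbitrary stand-in for "some value not among the known ones".

```r
# Hypothetical helper: number of modes in a vector without missing values.
count_modes <- function(x) {
  freq <- table(x)
  sum(freq == max(freq))
}

# Hypothetical helper: mode count for each value the single NA might take.
possible_mode_counts <- function(x, candidates) {
  stopifnot(sum(is.na(x)) == 1)
  vapply(candidates, function(v) {
    x[is.na(x)] <- v  # modifies a local copy only
    count_modes(x)
  }, integer(1))
}

# Tie among known values: restricting NA to 7 or 8 pins the count down.
x_tie <- c(7, 7, 8, 8, NA)
possible_mode_counts(x_tie, c(7, 8))     # 1 1
possible_mode_counts(x_tie, c(7, 8, 9))  # 1 1 2

# No tie: the set of possible counts is the same either way.
x_no_tie <- c(7, 7, 7, 8, 8, NA)
possible_mode_counts(x_no_tie, c(7, 8))     # 1 2
possible_mode_counts(x_no_tie, c(7, 8, 9))  # 1 2 1
```

In the tied vector, dropping the unknown candidate removes the possibility of two modes; in the untied one, the range of possible counts is 1 to 2 with or without it.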
I don't think that max_unique matters at all to mode_first(), mode_all(), and mode_single() – the functions that attempt to find actual modes. The metadata functions don't attempt this, which is why they are able to gain any information from vectors like x at all. In other words, mode_all() and friends would fail on x anyway, so max_unique wouldn't help them.
I have not thought about mode_possible_min() and mode_possible_max() in this context. Would they benefit from max_unique?