Should the grid functions retain NA values in a factor? #1275

@DavisVaughan

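The grid functions currently drop them because of the way sorted_unique() works: for a factor, it rebuilds the result from levels(x) alone, so an NA that is present in the data but is not a level never appears. (This would also affect forcats::fct_unique(), since the implementation came from there.) We get them back in complete() because it does a full_join() against the original data, which contains the NA.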

library(tidyr)

df <- tibble(a = factor(c("x", NA), levels = c("x", "y")))

# dropped `NA`
expand(df, a)
#> # A tibble: 2 × 1
#>   a    
#>   <fct>
#> 1 x    
#> 2 y

# we get it back from the full-join
complete(df, a)
#> # A tibble: 3 × 1
#>   a    
#>   <fct>
#> 1 x    
#> 2 y    
#> 3 <NA>

tidyr:::sorted_unique
#> function (x) 
#> {
#>     if (is.factor(x)) {
#>         factor(levels(x), levels(x), exclude = NULL, ordered = is.ordered(x))
#>     }
#>     else if (is_bare_list(x)) {
#>         vec_unique(x)
#>     }
#>     else {
#>         vec_sort(vec_unique(x))
#>     }
#> }
#> <bytecode: 0x7fa27f1ac8b0>
#> <environment: namespace:tidyr>

I am fairly certain we want to keep them, since expand() keeps NAs for non-factor types:

library(tidyr)

df <- tibble(a = c(NA, 1))

# keeps NA at the end
expand(df, a)
#> # A tibble: 2 × 1
#>       a
#>   <dbl>
#> 1     1
#> 2    NA
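
One possible way to retain the NA for factors is sketched below. This is only an illustration, not tidyr's implementation: sorted_unique_keep_na() is a hypothetical name, and base R's unique() and sort() stand in for vec_unique() and vec_sort() so the example runs without any packages. It also assumes the NA should come last, after all levels, to mirror the non-factor behavior shown above.

```r
# Hypothetical variant of sorted_unique(); not tidyr's actual implementation.
sorted_unique_keep_na <- function(x) {
  if (is.factor(x)) {
    # One integer code per level, in level order.
    codes <- seq_along(levels(x))
    # Append a single missing value only if the data contains one.
    if (anyNA(x)) {
      codes <- c(codes, NA_integer_)
    }
    # Rebuild the factor from its codes so the levels and the class
    # (including "ordered") carry over unchanged.
    structure(codes, levels = levels(x), class = class(x))
  } else if (is.list(x)) {
    unique(x)
  } else {
    # Base R stand-in for vec_sort(vec_unique(x)); NA sorts last.
    sort(unique(x), na.last = TRUE)
  }
}

x <- factor(c("x", NA), levels = c("x", "y"))
as.character(sorted_unique_keep_na(x))
#> [1] "x" "y" NA
```

Working on the integer codes rather than calling factor() again sidesteps the question of how an explicit NA level would be matched, though whether that case needs special handling is left open here.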

Labels: ask :bowtie:, bug (an unexpected problem or unintended behavior), grids #️⃣ (expanding, nesting, crossing, ...)
