Skip to content

fct_inseq() changes values without warning (but it has to order levels only) #221

@GegznaV

Description

@GegznaV

The actual behavior of function fct_inseq() is not documented well (see here) and the behavior of this function violates some principles of the Tidyverse:

  1. I expect the function to change the order of levels only. It must not modify the labels of the levels and must not modify the values of a variable (see Ex. A-1, C-1 and read the comments there).
  2. In some situations, the function fails, but it should not (see Ex. B-1).
  3. If something is changed in a variable, I expect at least a warning and not silent modifications.
  4. What I really expected, is the output similar to the result of gtools::mixedsort() (see Ex. A-2, B-2, C-2). Can this be implemented in forcats?
  5. I expect all peculiarities of the function to be well documented with a section of representative examples.
  6. Shouldn't there be a unit test that checks if all levels are included (despite the order of levels)?

The examples of fct_inseq() (A-1, B-1, C-1 shows actual and A-2, B-2, C-2 shows expected behavior of fct_inseq()):

my_fct_a <- factor(c("010", "001", "005"))

# (Ex. A-1): 
# 1. values are converted to NA;
# 2. leading zeros were dropped;
# 3. no warning was issued.
forcats::fct_inseq(my_fct_a)
#> [1] <NA> <NA> <NA>
#> Levels: 1 5 10

# (Ex. A-2): 
# The expected result for my_fct_a
forcats::fct_relabel(my_fct_a, gtools::mixedsort)
#> [1] 010 001 005
#> Levels: 001 005 010


my_fct_b <- factor(c("10", "001", "1", "23",  "05", "01"))

# (Ex. B-1):
# 1. Original values are different, but function fails with error
forcats::fct_inseq(my_fct_b)
#> Error in `levels<-`(`*tmp*`, value = as.character(levels)): factor level [2] is duplicated

# (Ex. B-2)
# The expected result for my_fct_b
forcats::fct_relabel(my_fct_b, gtools::mixedsort)
#> [1] 10  001 05  23  1   01 
#> Levels: 001 01 1 05 10 23


my_fct_c <- factor(c("A10", 9, "A1", "B1",  6, 5, "A05", "01"))

# (Ex. C-1)
# 1) The levels of the result are different than the original values
# 2) some values are SILENTLY converted to NAs
forcats::fct_inseq(my_fct_c)
#> [1] <NA> 9    <NA> <NA> 6    5    <NA> <NA>
#> Levels: 1 5 6 9

# (Ex. C-2)
# The expected result for my_fct_c 
forcats::fct_relabel(my_fct_c, gtools::mixedsort)
#> [1] A10 9   A05 B1  6   5   A1  01 
#> Levels: 01 5 6 9 A1 A05 A10 B1

# (Ex. C-3)
# 1) mixed sorting (C-2) gives different result than regular sorting (C-3)
forcats::fct_relabel(my_fct_c, sort)
#> [1] A10 9   A1  B1  6   5   A05 01 
#> Levels: 01 5 6 9 A05 A1 A10 B1

Created on 2019-11-19 by the reprex package (v0.3.0)

For unit testing:

x <- factor(c("A10", 9, "A1", "B1",  6, 5, "A05", "01"))
testthat::expect_silent({x_after  <- forcats::fct_inseq(x)})
testthat::expect_true(all(levels(x_after) %in% levels(x)))
testthat::expect_true(all(unique(x_after) %in% unique(x)))
Session info ```r > sessioninfo::session_info("forcats") - Session info ------------------------------------------------------------------------------------------------- setting value version R version 3.6.1 (2019-07-05) os Windows 10 x64 system x86_64, mingw32 ui RStudio language (EN) collate English_United States.1252 ctype English_United States.1252 tz Europe/Helsinki date 2019-11-19
  • Packages -----------------------------------------------------------------------------------------------------
    package * version date lib source
    assertthat 0.2.1 2019-03-21 [1] CRAN (R 3.6.0)
    backports 1.1.5 2019-10-02 [1] CRAN (R 3.6.1)
    cli 1.1.0 2019-03-19 [1] CRAN (R 3.6.1)
    crayon 1.3.4 2017-09-16 [1] CRAN (R 3.6.0)
    digest 0.6.22 2019-10-21 [1] CRAN (R 3.6.1)
    ellipsis 0.3.0 2019-09-20 [1] CRAN (R 3.6.1)
    fansi 0.4.0 2018-10-05 [1] CRAN (R 3.6.0)
    forcats 0.4.0 2019-02-17 [1] CRAN (R 3.6.0)
    glue 1.3.1 2019-03-12 [1] CRAN (R 3.6.1)
    magrittr 1.5 2014-11-22 [1] CRAN (R 3.6.0)
    pillar 1.4.2 2019-06-29 [1] CRAN (R 3.6.0)
    pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 3.6.1)
    rlang 0.4.1.9000 2019-11-17 [1] Github (r-lib/rlang@88deeb2)
    tibble 2.1.3 2019-06-06 [1] CRAN (R 3.6.0)
    utf8 1.1.4 2018-05-24 [1] CRAN (R 3.6.0)
    vctrs 0.2.0 2019-07-05 [1] CRAN (R 3.6.1)
    zeallot 0.1.0 2018-01-28 [1] CRAN (R 3.6.0)
</details>

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugan unexpected problem or unintended behaviortidy-dev-day 🤓Tidyverse Developer Day rstd.io/tidy-dev-day

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions