Add locale argument to step_date() #1000

Closed
juliasilge opened this issue Jun 7, 2022 · 3 comments · Fixed by #1001

Comments

@juliasilge
Member

Thanks to @cderv for highlighting this problem:

library(recipes)
#> Le chargement a nécessité le package : dplyr
#> 
#> Attachement du package : 'dplyr'
#> Les objets suivants sont masqués depuis 'package:stats':
#> 
#>     filter, lag
#> Les objets suivants sont masqués depuis 'package:base':
#> 
#>     intersect, setdiff, setequal, union
#> 
#> Attachement du package : 'recipes'
#> L'objet suivant est masqué depuis 'package:stats':
#> 
#>     step

examples <- tibble(someday = lubridate::ymd("2000-12-20") + lubridate::days(0:40))
recipe(~ someday, examples) %>%
  step_date(all_predictors()) %>%
  prep() %>%
  bake(new_data = examples)
#> # A tibble: 41 × 4
#>    someday    someday_dow someday_month someday_year
#>    <date>     <fct>       <fct>                <dbl>
#>  1 2000-12-20 "mer\\."    déc                   2000
#>  2 2000-12-21 "jeu\\."    déc                   2000
#>  3 2000-12-22 "ven\\."    déc                   2000
#>  4 2000-12-23 "sam\\."    déc                   2000
#>  5 2000-12-24 "dim\\."    déc                   2000
#>  6 2000-12-25 "lun\\."    déc                   2000
#>  7 2000-12-26 "mar\\."    déc                   2000
#>  8 2000-12-27 "mer\\."    déc                   2000
#>  9 2000-12-28 "jeu\\."    déc                   2000
#> 10 2000-12-29 "ven\\."    déc                   2000
#> # … with 31 more rows

Created on 2022-06-07 by the reprex package (v2.0.1)

I had deployed a workflow that used step_date(), and @cderv could not predict from it locally because his session's language settings differed from the ones the workflow was trained under.

We could:

  • at the least document this possible problem
  • save at prep() time the transformations used so they can be applied at bake() time
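
A minimal sketch of the second option, in base R only. The helpers `prep_date_labels()` and `bake_date_features()` are hypothetical names made up for illustration, not part of recipes: the idea is to record the weekday and month labels once, under whatever locale is active at prep time, and then index into those saved labels at bake time instead of asking the current session to format the dates again.

```r
# Hypothetical helpers (not recipes API): learn labels once, reuse them later.

prep_date_labels <- function() {
  # 2000-01-02 was a Sunday, so these seven dates run Sunday..Saturday.
  ref_days <- as.Date("2000-01-02") + 0:6
  ref_months <- as.Date(sprintf("2000-%02d-01", 1:12))
  list(
    dow = format(ref_days, "%a"),     # labels in the locale active right now
    month = format(ref_months, "%b")
  )
}

bake_date_features <- function(x, labels) {
  # POSIXlt fields are plain integers, so they do not depend on the locale:
  # wday is 0 (Sunday) to 6, mon is 0 (January) to 11.
  lt <- as.POSIXlt(x)
  data.frame(
    dow = factor(labels$dow[lt$wday + 1L], levels = labels$dow),
    month = factor(labels$month[lt$mon + 1L], levels = labels$month),
    year = lt$year + 1900
  )
}

labels <- prep_date_labels()
out <- bake_date_features(as.Date("2000-12-20") + 0:2, labels)
```

Because `bake_date_features()` never calls a locale-dependent formatter, changing `LC_TIME` between the two calls should not produce missing values; the labels simply stay in the prep-time language.
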
@juliasilge
Member Author

Issue 1000!!! 🎉 🎉 😆

@EmilHvitfeldt
Member

Full reprex

library(recipes)

Sys.setlocale("LC_TIME", 'fr_FR.UTF-8')
#> [1] "fr_FR.UTF-8"

examples <- tibble(someday = lubridate::ymd("2000-12-20") + lubridate::days(0:40))
rec <- recipe(~ someday, examples) %>%
  step_date(all_predictors()) %>%
  prep() 

rec %>%
  bake(new_data = examples)
#> # A tibble: 41 × 4
#>    someday    someday_dow someday_month someday_year
#>    <date>     <fct>       <fct>                <dbl>
#>  1 2000-12-20 Mer         déc                   2000
#>  2 2000-12-21 Jeu         déc                   2000
#>  3 2000-12-22 Ven         déc                   2000
#>  4 2000-12-23 Sam         déc                   2000
#>  5 2000-12-24 Dim         déc                   2000
#>  6 2000-12-25 Lun         déc                   2000
#>  7 2000-12-26 Mar         déc                   2000
#>  8 2000-12-27 Mer         déc                   2000
#>  9 2000-12-28 Jeu         déc                   2000
#> 10 2000-12-29 Ven         déc                   2000
#> # … with 31 more rows

Sys.setlocale("LC_TIME", 'en_GB.UTF-8')
#> [1] "en_GB.UTF-8"

rec %>%
  bake(new_data = examples)
#> # A tibble: 41 × 4
#>    someday    someday_dow someday_month someday_year
#>    <date>     <fct>       <fct>                <dbl>
#>  1 2000-12-20 <NA>        <NA>                  2000
#>  2 2000-12-21 <NA>        <NA>                  2000
#>  3 2000-12-22 <NA>        <NA>                  2000
#>  4 2000-12-23 <NA>        <NA>                  2000
#>  5 2000-12-24 <NA>        <NA>                  2000
#>  6 2000-12-25 <NA>        <NA>                  2000
#>  7 2000-12-26 <NA>        <NA>                  2000
#>  8 2000-12-27 <NA>        <NA>                  2000
#>  9 2000-12-28 <NA>        <NA>                  2000
#> 10 2000-12-29 <NA>        <NA>                  2000
#> # … with 31 more rows

Created on 2022-06-07 by the reprex package (v2.0.1)
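
One plausible reading of the `<NA>` values, sketched in base R. This assumes that bake() re-encodes the new factor column against the levels recorded at prep() time, which is an assumption about the internals and is not confirmed in this thread. The French labels are taken from the output above; the English ones are illustrative.

```r
# Levels as recorded when the recipe was prepped under the French locale.
prep_levels <- c("Dim", "Lun", "Mar", "Mer", "Jeu", "Ven", "Sam")

# Labels the same dates would get in an English session (illustrative values).
bake_labels <- c("Wed", "Thu", "Fri")

# None of the English labels is among the French levels,
# so every value becomes NA when coerced to those levels.
baked <- factor(bake_labels, levels = prep_levels)
print(baked)
```
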

@EmilHvitfeldt changed the title from "step_date() uses" to "Add locale argument to step_date()" on Jun 8, 2022
@github-actions

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex https://reprex.tidyverse.org) and link to this issue.

@github-actions bot locked and limited conversation to collaborators on Jun 28, 2022