parse_date_time fails for abbreviated month names in locales where abbreviation includes "." #893

Closed
EmilBode opened this issue May 20, 2020 · 1 comment


@EmilBode EmilBode commented May 20, 2020

I suspect this is a regression introduced by the fix for issue #781.

When parsing dates with parse_date_time, I noticed unexpected NAs when using abbreviated month names in French and Spanish.

Reprex

library(lubridate)
Sys.setlocale(locale='Spanish_Spain')
# Yes, locale specification is superfluous, just to be sure it's using the right one
lubridate::parse_date_time('02 ene. 2015 12:48', '%d %b %Y %H:%M', locale='Spanish_Spain')
# > [1] NA
# > Warning message:
# > 1 failed to parse. 
# Same result without dot
lubridate::parse_date_time('02 ene 2015 12:48', '%d %b %Y %H:%M', locale='Spanish_Spain')
# > [1] NA
# > Warning message:
# >  1 failed to parse. 
# Does work with strptime (and dot)
strptime('02 ene. 2015 12:48', '%d %b %Y %H:%M')
# > [1] "2015-01-02 12:48:00 CET"

Sys.setlocale(locale='French_France')
lubridate::parse_date_time('02 janv 2015 12:48', '%d %b %Y %H:%M', locale='French_France')
# > [1] NA
# > Warning message:
# >  1 failed to parse. 
# And same with a dot
# But note that this does work, probably because "mars" is not an abbreviation, thus always without dot:
lubridate::parse_date_time('02 mars 2015 12:48', '%d %b %Y %H:%M', locale='French_France')
# > [1] "2015-03-02 12:48:00 UTC"

Environment

  • R 3.6.2 under RStudio 1.2.5033
  • lubridate 1.7.8
  • Other loaded packages: just the basics: base, methods, datasets, utils, grDevices, graphics, stats
  • Windows 10, 64 bit, Dutch version
  • OS locale "LC_COLLATE=Dutch_Netherlands.1252;LC_CTYPE=Dutch_Netherlands.1252;LC_MONETARY=Dutch_Netherlands.1252;LC_NUMERIC=C;LC_TIME=Dutch_Netherlands.1252" (as returned by Sys.setlocale())

Diagnostic output: abbreviated month names as produced by strptime/format versus lubridate:

Sys.setlocale(locale='French_France')
# > [1] "LC_COLLATE=French_France.1252;LC_CTYPE=French_France.1252;LC_MONETARY=French_France.1252;LC_NUMERIC=C;LC_TIME=French_France.1252"
format(seq.Date(as.Date('2015-01-01'), by='month', length.out = 12), '%b')
# >  [1] "janv." "févr." "mars"  "avr."  "mai"   "juin"  "juil." "août"  "sept." "oct."  "nov."  "déc." 
month(seq.Date(as.Date('2015-01-01'), by='month', length.out=12),label = TRUE)
# >  [1] janv févr mars avr  mai  juin juil août sept oct  nov  déc 
# > Levels: janv < févr < mars < avr < mai < juin < juil < août < sept < oct < nov < déc
Sys.setlocale(locale='Spanish_Spain')
# > [1] "LC_COLLATE=Spanish_Spain.1252;LC_CTYPE=Spanish_Spain.1252;LC_MONETARY=Spanish_Spain.1252;LC_NUMERIC=C;LC_TIME=Spanish_Spain.1252"
format(seq.Date(as.Date('2015-01-01'), by='month', length.out = 12), '%b')
# >  [1] "ene." "feb." "mar." "abr." "may." "jun." "jul." "ago." "sep." "oct." "nov." "dic."
month(seq.Date(as.Date('2015-01-01'), by='month', length.out=12),label = TRUE)
# >  [1] ene feb mar abr may jun jul ago sep oct nov dic
# > Levels: ene < feb < mar < abr < may < jun < jul < ago < sep < oct < nov < dic
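
A possible workaround until this is fixed, as a minimal sketch rather than anything lubridate does internally: replace the month abbreviation (with or without its trailing dot) by the month number before parsing, so that the parse no longer depends on the locale's month names. The helper name `month_abbr_to_number` is hypothetical, and the abbreviations are copied from the Spanish `format(..., '%b')` output above. It assumes the date fields are separated by single spaces.

```r
# Abbreviations as printed by format(..., '%b') under Spanish_Spain (see above).
es_abbr <- c("ene.", "feb.", "mar.", "abr.", "may.", "jun.",
             "jul.", "ago.", "sep.", "oct.", "nov.", "dic.")

# Hypothetical helper, not part of lubridate: swap a month abbreviation,
# dotted or not, for its two-digit month number.
month_abbr_to_number <- function(x, abbr) {
  # Compare without the trailing dot and without regard to case.
  bare <- tolower(sub("\\.$", "", abbr))
  vapply(strsplit(x, " ", fixed = TRUE), function(tokens) {
    key <- tolower(sub("\\.$", "", tokens))
    idx <- match(key, bare)           # NA for tokens that are not month names
    hit <- !is.na(idx)
    tokens[hit] <- sprintf("%02d", idx[hit])
    paste(tokens, collapse = " ")
  }, character(1))
}

normalized <- month_abbr_to_number(c("02 ene. 2015 12:48", "02 ene 2015 12:48"),
                                   es_abbr)
normalized
# > [1] "02 01 2015 12:48" "02 01 2015 12:48"

# %m is numeric, so this parse does not depend on LC_TIME.
parsed <- as.POSIXct(normalized, format = "%d %m %Y %H:%M", tz = "UTC")
```

The normalized strings can also be passed to parse_date_time with the order `'%d %m %Y %H:%M'`; base `as.POSIXct` is used here only to keep the sketch free of dependencies.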
@vspinu vspinu closed this in 1dcb40e May 31, 2020
Member

@vspinu vspinu commented May 31, 2020

Thanks for the report. I have added a test for a non-English locale to avoid such embarrassing bugs in the future.
