Add century parameter when parsing "yy" #795

Open
msberends opened this issue Sep 23, 2019 · 3 comments
Labels
feature (a feature request or enhancement), parser 🥕

Comments

msberends commented Sep 23, 2019

I have patients who, according to my data, were born on "27-01-44". As in most Western European countries, we use dd-mm-yyyy or dd-mm-yy formats in the Netherlands. No problem for lubridate, just:

dmy("27-01-44")
#> [1] "2044-01-27"

Wait, what? Why is the year higher than that of Sys.Date()?

This has been a problem for years, as shown by this question from 2012 through to this answer from September 2019.

So what about base R?

as.Date("27-01-44", "%d-%m-%y")
#> [1] "2044-01-27"

So maybe this is not a problem specific to lubridate. But I think (and others on StackOverflow do too) that it would be great if lubridate could offer a solution.
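For reference, both results follow the pivot that R documents under ?strptime for %y: on input, values 00 to 68 are prefixed with 20 and values 69 to 99 with 19. A small base R check of where that boundary falls (the sample dates are made up for illustration):

```r
# Base R only: the documented %y pivot (00-68 -> 20xx, 69-99 -> 19xx).
x <- c("27-01-44", "27-01-68", "27-01-69", "27-01-99")
parsed <- as.Date(x, format = "%d-%m-%y")
format(parsed, "%Y")
#> [1] "2044" "2068" "1969" "1999"
```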

I would suggest adding a new argument to dmy() (and probably to other functions as well) that takes a reference date serving as the upper bound when the year is determined, for example Sys.Date().
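A minimal base R sketch of what such a reference date could do. The function name dmy_max and its argument max_date are hypothetical and not part of lubridate; the idea is simply to parse with %y and then move any date later than the reference back by one century:

```r
# Hypothetical helper, not part of lubridate: parse dd-mm-yy with base R and
# shift every date that falls after `max_date` back by 100 years.
dmy_max <- function(x, max_date = Sys.Date()) {
  d <- as.Date(x, format = "%d-%m-%y")
  too_late <- !is.na(d) & d > max_date
  yyyy <- as.integer(format(d, "%Y"))
  yyyy[too_late] <- yyyy[too_late] - 100L
  # Rebuild the date from the corrected year plus the original month and day.
  as.Date(sprintf("%04d-%s", yyyy, format(d, "%m-%d")), format = "%Y-%m-%d")
}

dmy_max(c("27-01-44", "27-01-05"), max_date = as.Date("2019-09-23"))
#> [1] "1944-01-27" "2005-01-27"
```

One edge case of this sketch: a date such as 29 February 2000 that is shifted to 1900 has no valid counterpart and comes back as NA.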

vspinu (Member) commented Sep 24, 2019

You can use parse_date_time2 with cutoff_2000=0.

> parse_date_time2("27-01-44", "dmy", cutoff_2000 = 0)
[1] "1944-01-27 UTC"
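As I read the lubridate documentation, two-digit years less than or equal to cutoff_2000 are read as 20xx and all others as 19xx, so the cutoff can also be set to the last two digits of a reference year. A short sketch under that assumption (the input values are made up for illustration, and lubridate must be installed):

```r
library(lubridate)

# Two-digit years up to and including 19 should become 20xx, the rest 19xx.
x <- c("27-01-00", "27-01-19", "27-01-44")
res <- parse_date_time2(x, "dmy", cutoff_2000 = 19L)
year(res)
```

Under the same rule, cutoff_2000 = 0 still maps "00" to 2000.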

Unfortunately, a similar argument cannot be incorporated into the other functions, because in some cases we rely on R's strptime, which has no setting for this parameter.

This part could be improved once the parser is extracted from lubridate (the lubridate rewrite).

msberends (Author) commented Nov 15, 2019

I think if so many people are running into this problem, there should be a more convenient way.

Unfortunately, a similar argument cannot be incorporated into the other functions, because in some cases we rely on R's strptime, which has no setting for this parameter.

Sounds like something an ifelse() could solve 😉 Perhaps use a backend other than strptime when people call, e.g.,

dmy("27-01-44", century_max_year = 19)

?

vspinu (Member) commented Nov 19, 2019

In principle, strptime is needed only for parsing non-numeric months and weekdays in non-English locales, so indeed, a half-baked solution to this might be reasonable, as it covers 99% of use cases.
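To illustrate the numeric-only case, here is a rough base R sketch and not lubridate code. The function name dmy_numeric is hypothetical, and its cutoff_2000 argument only mirrors the name used by parse_date_time2. The two-digit year is expanded by hand, so the %y rule never comes into play; the final as.Date() call only validates the fully numeric date:

```r
# Hypothetical sketch: parse purely numeric day-month-yy strings and expand
# the two-digit year with a user-supplied cutoff instead of relying on %y.
dmy_numeric <- function(x, cutoff_2000 = 68L) {
  pattern <- "^([0-9]{1,2})[^0-9]+([0-9]{1,2})[^0-9]+([0-9]{2})$"
  ok <- grepl(pattern, x)  # FALSE for NA and for anything non-numeric
  day   <- as.integer(sub(pattern, "\\1", x[ok]))
  month <- as.integer(sub(pattern, "\\2", x[ok]))
  yy    <- as.integer(sub(pattern, "\\3", x[ok]))
  year  <- ifelse(yy <= cutoff_2000, 2000L + yy, 1900L + yy)
  out <- rep(as.Date(NA), length(x))
  out[ok] <- as.Date(sprintf("%04d-%02d-%02d", year, month, day),
                     format = "%Y-%m-%d")
  out
}

dmy_numeric(c("27-01-44", "27/1/05", "January 27, 44"), cutoff_2000 = 19L)
#> [1] "1944-01-27" "2005-01-27" NA
```

Inputs with month names return NA in this sketch; in a real implementation those would presumably be the cases handed to the strptime-based path.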

hadley added the "feature" (a feature request or enhancement) and "parser 🥕" labels on Nov 19, 2019
hadley changed the title from "Please add century parameter for incorrect year determination." to "Add century parameter when parsing "yy"" on Nov 19, 2019