January is not parsed correctly #411

Closed
vspinu opened this Issue May 2, 2016 · 3 comments

Member

vspinu commented May 2, 2016

> mdy("January131973")
[1] NA

First reported on Stack Overflow.

@vspinu vspinu added the bug label May 2, 2016

Deleetdk commented May 19, 2016

Locale problem? I had a similar issue where October and May wouldn't parse. These turned out to be the only two months whose three-letter abbreviation differed between my locale and English.

#366

Contributor

cderv commented May 19, 2016

Here is some code I used to replicate the issue in another locale; after that I tried a few different things. I clearly do not understand the behaviour yet. It just seems that this is not a simple parsing problem.

  • The issue is the same in French.
  • It appears only when the character vector contains nothing but strings beginning with "January" or "janvier", for example a length-one vector.
  • When the vector also contains another date to parse, it works.
library(lubridate)
Sys.setlocale("LC_TIME", "French_France.1252")
#> [1] "French_France.1252"
vdate <- paste0(format(dmy("13/01/1973") + (0:11)*months(1), "%B"), "131973")
vdate
#>  [1] "janvier131973"   "février131973"   "mars131973"     
#>  [4] "avril131973"     "mai131973"       "juin131973"     
#>  [7] "juillet131973"   "août131973"      "septembre131973"
#> [10] "octobre131973"   "novembre131973"  "décembre131973"

mdy(vdate)
#>  [1] "1973-01-13" "1973-02-13" "1973-03-13" "1973-04-13" "1973-05-13"
#>  [6] "1973-06-13" "1973-07-13" "1973-08-13" "1973-09-13" "1973-10-13"
#> [11] "1973-11-13" "1973-12-13"
mdy(vdate[1])
#> Warning: All formats failed to parse. No formats found.
#> [1] NA
mdy(vdate[2])
#> [1] "1973-02-13"
mdy(vdate[1:2])
#> [1] "1973-01-13" "1973-02-13"
mdy(vdate[c(1,1)])
#> Warning: All formats failed to parse. No formats found.
#> [1] NA NA

Sys.setlocale("LC_TIME", "C")
#> [1] "C"

vdate <- paste0(format(dmy("13/01/1973") + (0:11)*months(1), "%B"), "131973")
vdate
#>  [1] "January131973"   "February131973"  "March131973"    
#>  [4] "April131973"     "May131973"       "June131973"     
#>  [7] "July131973"      "August131973"    "September131973"
#> [10] "October131973"   "November131973"  "December131973"

mdy(vdate)
#>  [1] "1973-01-13" "1973-02-13" "1973-03-13" "1973-04-13" "1973-05-13"
#>  [6] "1973-06-13" "1973-07-13" "1973-08-13" "1973-09-13" "1973-10-13"
#> [11] "1973-11-13" "1973-12-13"
mdy(vdate[1])
#> Warning: All formats failed to parse. No formats found.
#> [1] NA
mdy(vdate[2])
#> [1] "1973-02-13"
mdy(vdate[1:2])
#> [1] "1973-01-13" "1973-02-13"
mdy(vdate[c(1,1)])
#> Warning: All formats failed to parse. No formats found.
#> [1] NA NA

Will continue to investigate...
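One way to check whether the month name itself is the problem is to parse the same strings with an explicit format in base R, so that no format guessing is involved. This is only a sketch: it assumes the "C" locale for LC_TIME is available, and it uses as.Date() from base R rather than anything from lubridate. If it parses, that would suggest the failure sits in the format-guessing step rather than in the month name.

```r
# Parse with an explicit format, bypassing any format guessing.
old <- Sys.getlocale("LC_TIME")
Sys.setlocale("LC_TIME", "C")   # English month names, as in the second run above

x <- c("January131973", "February131973")

# %B = full month name, %d = day of month, %Y = four-digit year
parsed <- as.Date(x, format = "%B%d%Y")
print(parsed)

Sys.setlocale("LC_TIME", old)   # restore the previous locale
```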

Contributor

cderv commented May 19, 2016

Stepping through in debug mode, I found that guess_format does not work for "Janvier131973", and this function is called while evaluating mdy("Janvier131973").

I think I found a problem in the regular expression stored in the environment .locale_reg_cache, which is read at the beginning of guess_format.

The cached regex contains (((?<b_m_e>>janv\\.|févr\\. and (?<B_m_e>>janvier|février . I think there is an extra > after each group name.
If I delete it, these become (((?<b_m_e>janv\\.|févr\\. and (?<B_m_e>janvier|février).
When I replace the object in .locale_reg_cache with the corrected regex, as I tried, parsing works.
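A small base R sketch of why an extra > would affect only the first month: once the group name is closed, the second > is a literal character, and it belongs to the first alternative only. The patterns below are shortened, hypothetical stand-ins written with English month names; they are not the actual regexes built by lubridate.

```r
# Shortened stand-ins for the cached patterns (assumption: the real ones are
# much longer, but have the same shape around the group name).
bad  <- "(?<B_m_e>>January|February|March)"   # extra ">" after the group name
good <- "(?<B_m_e>January|February|March)"

x <- c("January131973", "February131973", "March131973")

# With the extra ">", the first alternative is the literal text ">January",
# so a string that starts with "January" no longer matches.
res_bad  <- grepl(bad,  x, perl = TRUE)
res_good <- grepl(good, x, perl = TRUE)

print(res_bad)   # FALSE  TRUE  TRUE
print(res_good)  # TRUE  TRUE  TRUE
```

This would be consistent with the observation that only vectors made up entirely of January dates fail, although how the guessed formats are then applied to the rest of a vector is not something this sketch shows.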

The example is a bit long to post here, so I will try to make a gist to show it.

I think the problem may be in the guess.r file, where the object stored in .locale_reg_cache is created, but I have not yet fully worked out how .get_locale_regs builds the regex.

I will try to find out later. Hope this helps!

cderv added a commit to cderv/lubridate that referenced this issue May 22, 2016

@vspinu vspinu closed this in #415 May 23, 2016
