Parsing single digit months #24

Closed
OmarGonD opened this issue Oct 24, 2016 · 9 comments

@OmarGonD
commented Oct 24, 2016

I wanted to answer a question on Stack Overflow using "anytime".

These are the dates they have; the format is "single-digit month, day, full year":

2/10/2016  
4/4/2016  
5/8/2016  
10/1/2016

However, anydate() only works on the first entry, not the rest. In the PDF at CRAN, we find:

Issues
The Boost Date_Time library cannot parse single digit months or days. So while ‘2016/09/02’ works (as expected), ‘2016/9/2’ will not. Other non-standard formats may also fail.
There is a known issue (discussed at length in issue ticket 5) where Australian times are off by an hour. This seems to affect only Windows, not Linux.

So, apparently this is a known bug/issue. Are there any workarounds?

Or should we just use as.POSIXct(df$final_date, format = "%d/%m/%Y")?

anydate("2/10/2016")
[1] "2016-02-10"
anydate("4/4/2016")
[1] NA
anydate("5/8/2016")
[1] NA
anydate("10/1/2016")
[1] NA

R version 3.3.1 (2016-06-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=Spanish_Peru.1252  LC_CTYPE=Spanish_Peru.1252   
[3] LC_MONETARY=Spanish_Peru.1252 LC_NUMERIC=C                 
[5] LC_TIME=Spanish_Peru.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods  
[7] base     

other attached packages:
[1] anytime_0.0.4 dplyr_0.5.0  

loaded via a namespace (and not attached):
[1] lazyeval_0.2.0  magrittr_1.5    R6_2.1.3       
[4] assertthat_0.1  rsconnect_0.4.3 DBI_0.5        
[7] tools_3.3.1     tibble_1.2      Rcpp_0.12.7
@eddelbuettel

Owner

commented Oct 24, 2016

It is a shortcoming in Boost Date_Time. It just doesn't do it:

R> anytime:::testFormat("%m/%d/%Y", "10/11/2016")
[1] "2016-10-11 CDT"
R> anytime:::testFormat("%m/%d/%Y", "2/3/2016")
[1] NA
R> 

Sadly, there is nothing we can do here. My wording is a little off on the help page but this is documented. So I'll close this, ok?

The "good news" is that base R does parse it:

R> as.Date("2/3/2016")
[1] "2-03-20"
R> 
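
Note that the output above is not the intended date: without a format argument, as.Date() tries "%Y-%m-%d" and then "%Y/%m/%d", so the leading "2" is read as the year. A minimal sketch of the base R route with an explicit format follows. It assumes the dates are month-first, as described in the opening post; the variable names are illustrative only.

```r
# Dates from the opening post, assumed to be month/day/year
inp <- c("2/10/2016", "4/4/2016", "5/8/2016", "10/1/2016")

# With an explicit format, base R accepts single-digit months and days
parsed <- as.Date(inp, format = "%m/%d/%Y")
print(parsed)
```

If the data were day-first instead, the format would be "%d/%m/%Y", as in the as.POSIXct() call suggested in the opening post.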
@OmarGonD

Author

commented Oct 24, 2016

Yes, sure. No problem.

@eddelbuettel

Owner

commented Oct 24, 2016

Trust me, I find it annoying as hell too. The hope for anytime was to parse all sane formats. But arguably single digit is not one. You could write a regexp which transform a single digit month or day to two and then parse that...
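
One possible sketch of that regexp idea in base R, offered as an illustration rather than anything the package does itself. The helper name pad_single_digits is hypothetical. It puts a zero in front of every digit that has no digit on either side, so the padded strings match the two-digit form that the Boost parser expects.

```r
# Hypothetical helper: zero-pad any digit that stands alone
# (no digit immediately before or after it)
pad_single_digits <- function(x) {
  gsub("(?<![0-9])([0-9])(?![0-9])", "0\\1", x, perl = TRUE)
}

inp <- c("2/10/2016", "4/4/2016", "5/8/2016", "10/1/2016")
padded <- pad_single_digits(inp)
print(padded)
```

The padded vector can then be handed to anydate() or anytime(). The same substitution also pads single-digit hours, but it would wrongly pad a one-digit fractional second such as the "5" in "7:24:30.5", so it is only suitable for inputs without fractional parts.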

@alexanu

commented Sep 20, 2018

Thank you for the explanation. I'm having the same problem, but with hours (it doesn't recognize e.g. 9:15, but does recognize 09:15).

@eddelbuettel

Owner

commented Sep 20, 2018

Try switching to the R parser, which should do it:

R> anytime("2018-9-20 7:24:30", useR=TRUE)
[1] "2018-09-20 07:24:30 CDT"
R> 

It's too bad the Boost one doesn't by default but such is life.

@alexanu

commented Sep 20, 2018

Thank you for the quick reply. Unfortunately it didn't work for me on the first attempt. I didn't want to dig into why, so I used lubridate::parse_date_time2("dmY HM"), which was also pretty fast for 2 million rows.

@eddelbuettel

Owner

commented Sep 20, 2018

lubridate, under competitive pressure, also switched to a C++-based backend. Sadly, it did so by copying the library code I already had in the RcppCCTZ package. Our main difference is ease of use, i.e. the absence of a format requirement. If you bother with a format, you can just use strptime() in base R...
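
As a rough illustration of the strptime() route, here is a sketch that assumes day-first input with single-digit hours, loosely matching the "dmY HM" order mentioned above. The input strings, the "/" separator, and the UTC time zone are made up for the example.

```r
# Made-up inputs: day/month/year followed by hour:minute
x <- c("20/9/2018 9:15", "3/10/2018 14:05")

# strptime() accepts single-digit days, months and hours once a format is given
parsed <- as.POSIXct(strptime(x, format = "%d/%m/%Y %H:%M", tz = "UTC"))
print(parsed)
```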

@alexanu

commented Sep 20, 2018

:) thank you! you are doing really cool stuff. Unfortunately, I still do not understand all your packages, but maybe one day :)

@eddelbuettel

Owner

commented Aug 8, 2019

Good news: this now works in master (and hence in the next release, probably at the end of the month):

 R> inp <- c("2/10/2016", "4/4/2016", "5/8/2016", "10/1/2016")
 R> library(anytime)
 R> anytime(inp)
 [1] "2016-02-10 CST" "2016-04-04 CDT" "2016-05-08 CDT"
 [4] "2016-10-01 CDT"
 R>