Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should undefined date parsing error? #258

Closed
TimTaylor opened this issue Oct 18, 2021 · 2 comments · Fixed by #259
Closed

Should undefined date parsing error? #258

TimTaylor opened this issue Oct 18, 2021 · 2 comments · Fixed by #259

Comments

@TimTaylor
Copy link

I've been using date_parse to parse character inputs, from a user, in to dates with something akin to

as_myclass.character <- function(x, format = NULL, locale = clock_locale(), ...) {
  check_dots_empty()
  x <- date_parse(x, format = format, locale = locale)
  if (all(is.na(x))) abort("Unable to parse any entries of x as Dates")
  vec_cast(x, new_myclass())
}

However I have just noticed the warning in the "Details" of the documentation:

Parsing strings with sub-daily components, such as hours, minutes, or seconds, should be done with date_time_parse(). If you only need the date components, round the result to day precision, and then use as_date(). Attempting to directly parse a sub-daily string into a Date is ambiguous and undefined, and is unlikely to work as you might expect.

I had wrongly been assuming this would work similar to strptime where trailing characters were ignored

strptime converts character vectors to class "POSIXlt": its input x is first converted by as.character. Each input string is processed as far as necessary for the format specified: any trailing characters are ignored.

This leads to the following question. If passing a sub-standard daily string into a Date is undefined in clock would it be better for the following to error?

clock::date_parse(c("2019-01-01 11", "2019-01-01 12"), format = "%Y-%m-%d")

Hope this makes sense. This potentially links to #255

@DavisVaughan
Copy link
Member

Oh, it actually does work like strptime in this way.

That warning is actually about truly trying to parse the time components when the result is only day precision. Here are a few related tests for the undocumented behavior

test_that("parsing into a date if you requested to parse time components rounds the time (#207) (#230) (undocumented)", {
expect_identical(
date_parse("2019-12-31 11:59:59", format = "%Y-%m-%d %H:%M:%S"),
as.Date("2019-12-31")
)
expect_identical(
date_parse("2019-12-31 12:00:00", format = "%Y-%m-%d %H:%M:%S"),
as.Date("2020-01-01")
)
})
test_that("parsing fails when undocumented rounding behavior would result in invalid 60 second component (#230) (undocumented)", {
expect_warning(
expect_identical(
date_parse("2019-01-01 01:01:59.550", format = "%Y-%m-%d %H:%M:%6S"),
new_date(NA_real_)
),
class = "clock_warning_parse_failures"
)
})

In your example it would be like doing this to try and parse the hour, which is then automatically rounded in some undefined (to us) way:

clock::date_parse(c("2019-01-01 11", "2019-01-01 12"), format = "%Y-%m-%d %H")

I can probably try and clarify this with an example in the docs

@TimTaylor
Copy link
Author

Cool - that makes sense. Yep - clarification in docs for this case would be great but agree you are right to leave the exact behaviour in presence of time components undefined.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants