Dealing with dates #78

Closed
1 of 3 tasks
cboettig opened this issue Dec 14, 2013 · 3 comments

Comments

@cboettig
Member

Date columns are somewhat more problematic than they should be. Several challenges exist here:

  • The R user may represent dates as almost any type: integer, numeric, character, factor, or Date. So detecting the format based on the column class isn't reliable. (There may of course be good reasons to treat a date as a factor for a given statistical test or plot, etc.)
  • R's Date type insists on a fully specified date (year, month, and day). So if a user has separate columns for year and day, we can't treat each as a date column independently.
  • EML's date format doesn't use the ISO C99 / POSIX standard for strftime; for instance, Julian days might be specified using the less explicit format string DDD rather than %j.
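
To make the three points above concrete, here is a minimal base R sketch. The example values and the helper `eml_to_strptime()` are hypothetical and are not part of the EML package; the token mapping covers only a few date tokens and is an assumption, not the full EML format vocabulary.

```r
# The same calendar day stored in three different column classes.
as_character <- "2013-12-14"
as_factor    <- factor("2013-12-14")
as_date      <- as.Date("2013-12-14")
class(as_character)  # "character"
class(as_factor)     # "factor"
class(as_date)       # "Date"

# Year and day-of-year in separate columns only become a Date once combined.
year <- 2013
doy  <- 348
combined <- as.Date(paste(year, doy), format = "%Y %j")

# Hypothetical helper: translate a few EML-style date tokens to strptime codes.
# DDD is replaced before DD so the day-of-year token is not split.
eml_to_strptime <- function(fmt) {
  fmt <- gsub("YYYY", "%Y", fmt, fixed = TRUE)
  fmt <- gsub("DDD",  "%j", fmt, fixed = TRUE)
  fmt <- gsub("MM",   "%m", fmt, fixed = TRUE)
  fmt <- gsub("DD",   "%d", fmt, fixed = TRUE)
  fmt
}

eml_to_strptime("YYYY-DDD")    # "%Y-%j"
eml_to_strptime("YYYY-MM-DD")  # "%Y-%m-%d"
```

Note that the class of the column says nothing about whether the values are dates, which is why class-based detection is fragile.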

The current automated handling of dates based on column types can result in incorrect EML (see Advanced writing example). One option is to type these columns as character and explicitly declare them as dates in the unit metadata.

@cboettig
Member Author

I've implemented a basic date parser. It's not completely satisfactory in the multiple-column case #17, but does detect a handful of common formats. See code: https://github.com/ropensci/EML/blob/6f052acfb64ece7f0577f7842a9c8ed28910a5e0/R/eml_get.R#L113-123
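
For readers who don't want to follow the link, the general idea of detecting "a handful of common formats" can be sketched roughly as below. This is not the code at that commit: `detect_date_format()` and its candidate list are hypothetical, and the approach (try each format, keep the first one that parses every value) is just one simple way to do it in base R.

```r
# Hypothetical sketch: return the first candidate strptime format that
# parses every non-missing value in x, or NA if none does.
detect_date_format <- function(x,
                               formats = c("%Y-%m-%d", "%m/%d/%Y", "%Y%m%d")) {
  x <- as.character(x)   # also handles factor columns
  x <- x[!is.na(x)]
  if (length(x) == 0) return(NA_character_)
  for (fmt in formats) {
    parsed <- as.Date(x, format = fmt)
    # as.Date() yields NA for values that do not match the format
    if (!anyNA(parsed)) return(fmt)
  }
  NA_character_
}

detect_date_format(c("2013-12-14", "2014-01-02"))
detect_date_format(factor("12/14/2013"))
detect_date_format("not a date")
```

One caveat: as.Date() ignores trailing characters after a successful match, so a stricter check would be needed before trusting the detected format for metadata.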

@ivanhanigan
Contributor

I think you deal with dates as well as can be expected given the tricksy nature of the beast. You can probably close this issue hey?

@cboettig
Member Author

Yup, right. Meanwhile there's a good ecosystem for dealing with dates in general; e.g. the readr package's read_csv() does a much better job of reading in dates, and lubridate handles manipulating them. But now that we're just focused on the metadata representation, we aren't assuming the actual data table is even read into R anyway, so like you say these issues are now beyond the scope here.
