-
Notifications
You must be signed in to change notification settings - Fork 3.5k
Support parsing of ordinal ISO-8601 dates #10687
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support parsing of ordinal ISO-8601 dates #10687
Conversation
Parses a naive datetime string. | ||
The date part of the string supports both ISO-8601 calendar dates and ordinal dates. | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function's examples are rapidly becoming a test suite, and probably too expansive to be helpful documentation. I'd like to break them out into an actual suite and leave behind mostly instructive examples, but am unsure which ones are most valuable to keep.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's a great idea regardless of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll do this in a separate PR, by describing what formats we expect/support a little better textually in the moduledoc (do moduledocs support doctests?), making sure all current examples are represented in the test suite, then tightening these up for all functions in the module to cover just happiest-path and examples of distinct errors. Then rebase this off it.
There's enough in-flight discussion on the mailing list about what we want to support here, though, that I'll wait for that to reach a conclusion.
end | ||
end | ||
|
||
defp calendar_day_of_year(year, ordinal_day) do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a better name for this function? Is it possibly a valuable addition to the calendar public interface protocol at large? Alternatively, could it be added to the top-level Calendar
public API since it should work for anything supporting the Calendar
protocol?
Notably, this is the logical inverse of day_of_year/3
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's referred to as the ordinal day in the spec (iirc), so probably ordinal_day
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The variable here is ordinal_day
, and the function is meant to turn that into what the spec calls a "calendar day" (year, month, day). It's a little awkward tho b/c nothing else in the module refers to this construct as a "calendar day".
end | ||
end | ||
|
||
defp days_in_year(year) do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this function possibly a valuable addition to the calendar public interface protocol at large? Alternatively, could it be added to the top-level Calendar
public API since it should work for anything supporting the Calendar
protocol?
Notably, this function comes up in the discussion here around the cldr_dates_times library, but I can't actually find an implementation in the source anymore.
{_offset, ""} <- parse_offset(rest) do | ||
{year, ordinal_day} = unquote(read_ordinal_date) | ||
|
||
if ordinal_day > days_in_year(year) do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth noting: the implementation of calendar_day_of_year
returns the year as well, and could trivially "roll over" the year if given an ordinal day beyond its range (my original impl did this). However, this is against the ISO-8601 spec, so we handle it here instead, and similarly in the other parsing functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just 2 small changes making code shorter and easier to read. I wrote them by hand, so please test those changes.
y1 >= ?0 and y1 <= ?9 and y2 >= ?0 and y2 <= ?9 and y3 >= ?0 and y3 <= ?9 and y4 >= ?0 and | ||
y4 <= ?9 and d1 >= ?0 and d1 <= ?9 and d2 >= ?0 and d2 <= ?9 and d3 >= ?0 and d3 <= ?9, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about using making it short with in
:
y1 >= ?0 and y1 <= ?9 and y2 >= ?0 and y2 <= ?9 and y3 >= ?0 and y3 <= ?9 and y4 >= ?0 and | |
y4 <= ?9 and d1 >= ?0 and d1 <= ?9 and d2 >= ?0 and d2 <= ?9 and d3 >= ?0 and d3 <= ?9, | |
y1 in ?0..?9 and y2 in ?0..?9 and y3 in ?0..?9 and y4 in ?0..?9 and | |
d1 in ?0..?9 and d2 in ?0..?9 and d3 in ?0..?9, |
or even:
Edit: my second suggestion was a mistake. Suggested x in '09'
, but should be x in '0123456789'
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interestingly, the second form — ?5 in '09'
— doesn't seem to work for me, but I do like the former.
There might be a performance implication here though--is list inclusion faster or slower than two equality checks? These are guards and compile-time lists so I assume either is awfully fast.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We originally wrote in this format because it skips an is_integer
check compared to ranges. But it is likely that the is_integer check is optimized away today. @michalmuskala would probably know. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@christhekeele Ah, I see … as said I wrote it by hand … Should be:
iex> ?5 in '0123456789'
true
In that case 0..9
range is shorter.
{ | ||
(y1 - ?0) * 1000 + (y2 - ?0) * 100 + (y3 - ?0) * 10 + (y4 - ?0), | ||
(d1 - ?0) * 100 + (d2 - ?0) * 10 + (d3 - ?0) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a function for that:
{ | |
(y1 - ?0) * 1000 + (y2 - ?0) * 100 + (y3 - ?0) * 10 + (y4 - ?0), | |
(d1 - ?0) * 100 + (d2 - ?0) * 10 + (d3 - ?0) | |
} | |
{List.to_integer([y1, y2, y3, y4]), List.to_integer([d1, d2, d3])} |
Closing per discussion on mailing list. Incorporating some of the notes here into the successor #10689, though! |
To supplement the discussion in the core mailing list, this adds support for the ISO-8601
YYYY-DDD
ordinal date format toCalendar.ISO
string parsing functions.