Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reduced dates #7

Closed
jpmckinney opened this issue Jan 26, 2013 · 7 comments
Closed

reduced dates #7

jpmckinney opened this issue Jan 26, 2013 · 7 comments

Comments

@jpmckinney
Copy link
Member

@jpmckinney jpmckinney commented Jan 26, 2013

PopIt uses MongoDB's native Date objects, so it cannot use ISO 8601:2004 reduced dates.

@evdb
Copy link

@evdb evdb commented Mar 1, 2013

The approach in PopIt is (was :) ) to treat all dates as a range. This works well in that it can represent a specific date like 2 Mar 2012 (by using a range from midnight to the following midnight). It can also represent arbitrary ranges like 1 Jan 2012 to 30 April 2012.

However it has the following problems:

  • It is clunky - for the cases where the date is known it feels silly to have two dates stored.
  • Timezones become relevant as there is a time component and this can lead to unexpected issues. This is what mysociety/popit#207 refers to and using strings to represent the date would be a good solution.

But the following advantages:

  • there is no arbitrary constraint that date must be either accurate to the day, the month or the year. Allows for greater granularity.
  • sorting in both directions is simpler as there is a high date and a low date to use as expected. See the extra code that was needed in the mzalendo codebase for sorting purposes when using a reduced date style.
  • the date is an object that further fields can be added to. For example the date could easily be given a name, or a comment (I'll create another issue regarding the name and the intended use).
@jpmckinney
Copy link
Member Author

@jpmckinney jpmckinney commented Mar 1, 2013

With respect to the advantages:

  1. Does that arbitrary constraint come up in practice? How big a disadvantage is that constraint?
  2. Can you provide an example where the ISO reduced dates do not give the sort order you would expect, when sorting their original values as strings? All I can tell is that you need to do some magic for your sentinel value "future" and null values to be properly sorted.
  3. Linking to #20
@evdb
Copy link

@evdb evdb commented Mar 1, 2013

Does that arbitrary constraint come up in practice? How big a disadvantage is that constraint?

It has not, but it is realistic to expect that it will. Without this it won't be possible to represent "Q2 2012" or "late 2012" more accurately than just "2012". It would also not be possible to represent a particular week of the year if that week spanned two months. And using the ISO 8601 format of 2012-W12 would not sort correctly either.

For the Kenyan elections we would have been able to give a very specific range that the election should have occurred in according to the legislation.

Can you provide an example where the ISO reduced dates do not give the sort order you would expect, when sorting their original values as strings?

Ascending sorts tend to be fine, the problem is that when you do descending sorts with partial dates the expected result is not just the reverse of the ascending. Take these dates as an example:

ascending: 2001, 2001-03, 2001-03-15, 2001-04
descending: 2001, 2001-04, 2001-03, 2001-03-15

This sorting order could be contentious, but it is what we found to be the expected order when displaying the dates to the user. Having each partial date contain a definite high and low date for the range made sorting easier. Notably the Extended Date/Time Format (EDTF) 1.0 spec dodges the question completely.

All I can tell is that you need to do some magic for your sentinel value "future" and null values to be properly sorted.

That is a little messy, but using 'future' to distinguish between not known and not happened yet was useful. It is not deal though - if future is the end of the range it is a good way to say something like '15 Mar 2001 onwards', if future is the start of the range then its meaning is not clear.

@jpmckinney
Copy link
Member Author

@jpmckinney jpmckinney commented Mar 1, 2013

  1. ISO 8601 already has support for time intervals, e.g. 1985-04-12T23:20:50/1985-06-25T10:30:00. This is quite obviously awful for querying against. Having use cases (along the lines of your "test suite" from an earlier email) to lend support to why we would want to do certain types of queries would be great.
  2. Can we come up with a list of use cases in which we must sort by date, and in which the date is likely to be a reduced date? (If all use cases are ones where we are always have full dates, then sorting reduced dates is a non-issue). For these use cases, it would also be relevant to include an idea of how many records would be sorted. If it turns out that the number of records is small, then that sorting can be done by the app and not the DB. I see the "correct" sort order as being contentious - preferably our use cases can leave it up to each app.

Re: future let's have that discussion in #17

Update: Just for clarity:

Default reverse sort order on strings:

2001-04, 2001-03-15, 2001-03, 2001

Desired reverse sort order on string:

2001, 2001-04, 2001-03, 2001-03-15

i.e. sort by precision in normal order, then reverse sort by string all the strings with the same precision

@jpmckinney
Copy link
Member Author

@jpmckinney jpmckinney commented Mar 8, 2013

Btw, here is a US gov draft specification which may be more suitable than ISO 8601: http://www.loc.gov/standards/datetime/pre-submission.html

@jpmckinney
Copy link
Member Author

@jpmckinney jpmckinney commented Nov 1, 2014

Closing, as this has not come up again as an issue. We can re-open as necessary.

@jpmckinney jpmckinney closed this Nov 1, 2014
@girogiro
Copy link
Contributor

@girogiro girogiro commented May 29, 2019

Regarding descending order of date strings.

Default reverse sort order on strings:

2001-04, 2001-03-15, 2001-03, 2001

Desired reverse sort order on string:

2001, 2001-04, 2001-03, 2001-03-15

i.e. sort by precision in normal order, then reverse sort by string all the strings with the same precision

This is acually not true. If 2000 is added to the values above it's not sorted last where it should be but right after 2001.

A working SQL solution for descending order is ORDER BY CONCAT(value, 'T25') DESC that works also for combined date and time values. The same is easy to implement in app code.

Furthermore, I suggest to use * and | as internal values for past and future, respectivelly. They are sorted correctly (* < any date < |) both for ascending and descending order and remind asterisk and dagger commonly used for birth date and death date.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
3 participants