Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reevaluate date change for 1.4.0 #665

Closed
jyutzler opened this issue Aug 30, 2023 · 4 comments · Fixed by #666
Closed

Reevaluate date change for 1.4.0 #665

jyutzler opened this issue Aug 30, 2023 · 4 comments · Fixed by #666
Milestone

Comments

@jyutzler
Copy link
Contributor

jyutzler commented Aug 30, 2023

During the Open Comment period for GeoPackage 1.4, we received some concern that making date formats more flexible (see #648) would require coding changes for some clients. Apparently some older clients would not be able to cast the string to date.

We need to evaluate the extent to which this is a concern and whether it is a significant enough concern to pull the change from the release.

@KRyden

@jyutzler jyutzler added this to the 1.4.0 milestone Aug 30, 2023
@jyutzler
Copy link
Contributor Author

My own opinion on this (subject to change!) is that this change does not break existing clients. The 1.4 GeoPackage would load, but the dates might not parse as a dates.

All we would need to do is make it clear in the documentation that dates that don't use the original format are not guaranteed to load as dates in clients that do not support this new capability. GeoPackage producers may wish to export with the original date format if querying capabilities are needed and the expectation is that non-1.4 clients will be in use.

However, I am just one voice here.

@KRyden
Copy link

KRyden commented Aug 31, 2023

Issue - Equality tests, including joins and group by and sorts, will not work without using the date/time functions to normalize the values, which is potentially expensive and won’t work if there are more than 3 decimal places for seconds (which is also part of the proposal).

Here are some examples:

sqlite> with cte(a) as (values ('1999-12-12 14:21'),('1999-12-12 14:21:00'),('1999-12-12 14:21:00.000'),('1999-12-12 14:21:00.000000'))
select a from cte group by a;
a

1999-12-12 14:21
1999-12-12 14:21:00
1999-12-12 14:21:00.000
1999-12-12 14:21:00.000000
sqlite>

Notice that all of those values, which are identical in value, don’t compare as identical.

You can use SQLite date functions to normalize the values to get the expected result:

sqlite> with cte(a) as (values ('1999-12-12 14:21'),('1999-12-12 14:21:00'),('1999-12-12 14:21:00.000'),('1999-12-12 14:21:00.000000'))
select strftime('%Y-%m-%d %H:%M:%f',a) from cte group by strftime('%Y-%m-%d %H:%M:%f',a);
strftime('%Y-%m-%d %H:%M:%f',a)

1999-12-12 14:21:00.000
sqlite>

At some extra cost (and making it more difficult to use an index on that field). But this only works correctly if you limit the timestamp to millisecond precision:

sqlite> with cte(a) as (values ('1999-12-12 14:21'),('1999-12-12 14:21:00'),('1999-12-12 14:21:00.000'),('1999-12-12 14:21:00.000099'))
select strftime('%Y-%m-%d %H:%M:%f',a) from cte group by strftime('%Y-%m-%d %H:%M:%f',a);
strftime('%Y-%m-%d %H:%M:%f',a)

1999-12-12 14:21:00.000
sqlite>

Notice the last date/time in the query ending in 14:21:00.000099

These issues will result in user errors and support calls resulting from the loosening of the date/time portion of the geopackage specification. These issues will be seen as implementation bugs and will require careful analysis and coding of solutions to avoid them - even then the millisecond precision issue will likely continue to be a recurring problem.

While the use cases seemed reasonable at the time this was proposed, the apparent fallout trying to implement in a generic way and ensure easily understood behavior makes continuing forward with this proposed change questionable.

@heidivanparys
Copy link
Contributor

The GeoPackage format does not fit our needs for an exchange format1 if the datetime data type is limited to exactly three decimals. Many of our datasets contain timestamps with a finer temporal resolution than milliseconds, some datasets contain data with a courser temporal resolution than milliseconds. (Note that for a given column, the values will all have the same temporal resolution.)

Leaving out the extra decimals in a datetime column is of course not an option. That would mean that we would change the data, and that the data we exchange with third parties would be different from the data we have in our own database. For timestamps with fewer decimals, adding one or more zeroes would give a false impression of the temporal resolution.

Putting our timestamps with all the extra decimals in a datetime column gives a GeoPackage file that is not compliant to the current specification. That is not a good option for an organisation that wants to promote the use of and compliance to standards.

Putting our timestamps in text columns is a third, theoretical option, but it is not very user-friendly for those that use GIS applications.

Note again that the SQLite documentation does not mandate exactly three decimals, that is the interpretation in the GeoPackage specification / by the GeoPackage SWG.

1 This means that often, the one receiving the data will not query the GeoPackage file directly, but will load the data in their own database, and query it there, using the datetime functionality of that database.

@jyutzler
Copy link
Contributor Author

jyutzler commented Sep 6, 2023

The SWG moved to remove the date changes for 1.4.0. This will be revisited in the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants