Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #1390 LocalDateTime string serialization #1391

Closed
wants to merge 2 commits into from

Conversation

kdubb
Copy link
Contributor

@kdubb kdubb commented Jan 18, 2019

This is a proposed fix for #1390.

All Submissions:

  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same update/change?

New Feature Submissions:

  1. Does your submission pass tests?
  2. Does mvn checkstyle:check pass ?

Changes to Existing Features:

  • Does this break existing behaviour? If so please explain.

Yes, LocalDateTime is now inserted and retrieved without regard for any current time zone. Which, in my interpretation, is the correct thing to do.

  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • Have you written new tests for your core changes, as applicable?
  • Have you successfully run tests with your changes locally?

This removes the tethering of passed in `LocalDateTime` objects to the current time zone. Instead it tethers it to the UTC offset only for purposes of code reuse.

Tethering it to UTC at least guarantees the value won’t be changed during serialization as it sometimes is when tething to the current time zone.
@davecramer
Copy link
Member

@kdubb don't get me wrong I have no clue what is the "correct" interpretation, but do we have any corroborating evidence this is the correct thing to do ?

@kdubb
Copy link
Contributor Author

kdubb commented Jan 20, 2019

Depends on what your definition of this is...

If you're asking for corroborating evidence that my fix is the correct one, no. If you read my actual commit message I tether it to UTC only to re-use the string formatting routine already in place; because it will not ever alter the value. Alternatively a specific LocalDateTime formatter could be written; but it seems unnecessary.

If you're asking the bigger question then the answer is "sort of" read the docs for LocalDateTime the atOffset and atZone methods and then the JDBC 4.2 spec.

The spec is most interesting in that it only specifies that timestamp with no timezone be convertible to LocalDateTime no there JSR310 objects. This is pretty damning evidence that they wanted timestamp with no timezone to be unaltered by a timezone such that they won't even allow the driver to convert it to an OffsetDateTime or ZonedDateTime using a "default" timezone (how previous time & calendar objects worked).

They also specify that timestamp with timezone only be convertible to/from OffsetDateTime not in any fashion to a LocalDateTime.

It seems pretty clear, to me at least, they're looking to ensure they don't make the mistakes of the past and they're making the API user be explicit about how they are using timezones.

The previous code flies directly in the face of that explicit-ness. It converts a LocalDateTime to an OffsetDateTime (even just for string serialization); which the JDBC spec has said should not happen.

@davecramer
Copy link
Member

Reading the javadocs now I'd have to agree that LocalDateTime was never supposed to have any timezone information.

In which case I agree with this change

@vlsi
Copy link
Member

vlsi commented Jan 20, 2019

Just in case, the test is locale-dependent. That is not good.

до н.э. is a Russian for BC

org.junit.ComparisonFailure: LocalDateTime=-1996-06-30T23:59:59.999999, with TimeZone.default=Z, setObject(int, Object) 
Expected :1997-06-30 23:59:59.999999 до н.э.
Actual   :1997-06-30 23:59:59.999999 BC

	at org.junit.Assert.assertEquals(Assert.java:115)
	at org.postgresql.test.jdbc42.SetObject310Test.localTimestamps(SetObject310Test.java:282)
	at org.postgresql.test.jdbc42.SetObject310Test.testSetLocalDateTimeBc(SetObject310Test.java:356)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
	at org.junit.runners.Suite.runChild(Suite.java:128)
	at org.junit.runners.Suite.runChild(Suite.java:27)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
	at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
	at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
	at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
	at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
	at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)

@kdubb
Copy link
Contributor Author

kdubb commented Jan 20, 2019

@vlsi Just to be clear... that isn't the test I altered. This existed before my change and is related to other tests.

@vlsi
Copy link
Member

vlsi commented Jan 20, 2019

@kdubb , technically speaking, the test was disturbed in https://github.com/pgjdbc/pgjdbc/pull/1388/files#diff-8b723c347b2146e27df7b1aa5f76e800L317

As far as I understand, when using ISO, PostgreSQL never prints locale-dependent BC.

@kdubb
Copy link
Contributor Author

kdubb commented Jan 20, 2019

@vlsi The code the formatter replaced appended the literal value "BC". Again, existed before my change.

String expected = bcDate.toString().substring(1).replace('T', ' ') + " BC";

I'm all for it being fixed but maybe this should be taken up in a different issue/PR as not to conflate the reasoning of this one.

@vlsi
Copy link
Member

vlsi commented Jan 20, 2019

Re this one:

https://travis-ci.org/pgjdbc/pgjdbc/jobs/482125785#L1090-L1094

testSetLocalDateTime[binary = FORCE](org.postgresql.test.jdbc42.SetObject310Test)  Time elapsed: 0.546 sec  <<< FAILURE!
org.junit.ComparisonFailure: LocalDateTime=2000-03-26T02:00, with TimeZone.default=Europe/Moscow, setObject(int, Object) expected:<2000-03-26 0[2]:00:00> but was:<2000-03-26 0[3]:00:00>
	at org.junit.Assert.assertEquals(Assert.java:115)
	at org.postgresql.test.jdbc42.SetObject310Test.localTimestamps(SetObject310Test.java:282)
	at org.postgresql.test.jdbc42.SetObject310Test.testSetLocalDateTime(SetObject310Test.java:220)

@kdubb
Copy link
Contributor Author

kdubb commented Jan 20, 2019

@vlsi I am fully aware my "fix" may not handle all cases you are adding. It doesn't change the fact that the original test was incorrect.

It seems I fixed the text version but not the binary; if so, the binary now needs fixing.

ZonedDateTime zonedDateTime = localDateTime.atZone(getDefaultTz().toZoneId());
return toString(zonedDateTime.toOffsetDateTime());
OffsetDateTime offsetDateTime = localDateTime.atOffset(ZoneOffset.UTC);
return toString(offsetDateTime);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This renders something like 2019-07-30 09:17:25.676+00
Why can't this be 2019-07-30 09:17:25.676, i.e. without zone offset?

Consider use-cases

  • inserting ? bound to LocalDateTime to a timestamp column. Then zone-less literal would be OK.
  • inserting ? bound to LocalDateTime to a timestamp with time zone column (which represents point in time in PostgreSQL). Then zone-less literal should probably be coerced on server side to session zone, so zone-less literal should be OK. UTC-zoned literal will likely be not OK as it will represent a different point in time.
  • (....)

If this needs to stay this way, please add some explanation.

// LocalDateTime is always passed with time zone so backend can decide between timestamp and timestamptz
ZonedDateTime zonedDateTime = localDateTime.atZone(getDefaultTz().toZoneId());
return toString(zonedDateTime.toOffsetDateTime());
OffsetDateTime offsetDateTime = localDateTime.atOffset(ZoneOffset.UTC);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should do the opposite there. LocalDateTime has no time zone, we should not make up one. I would simply not send a time zone if we don't have one.

@@ -199,7 +214,7 @@ public void testSetLocalDateTime() throws SQLException {
ZoneId zone = ZoneId.of(zoneId);
for (String date : datesToTest) {
LocalDateTime localDateTime = LocalDateTime.parse(date);
String expected = localDateTime.atZone(zone)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO it is correct this was changed, asserting that 2000-03-26T02:00:00 is stored or retrieved as 2000-03-26T03:00:00 when the system time zone is a specific one is wrong IMHO. 2000-03-26T02:00:00 is 2000-03-26T02:00:00. As it has no time zone the behaviour should be the same and independent of any time zone.

I think the tests should be changed though to not call #getString as it currently calls PgResultSet#internalGetObject which ends up calling java.sql.Timestamp#toString() which is bound to the JVM time zone and can therefore not display timestamps that don't exist in that time zone. As LocalDateTime has no time zone this is wrong IMHO.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense.. care to provide a PR ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm looking at this and wondering why we are using timezones at all on LocalDateTime?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense.. care to provide a PR ?

I can have a look. I'll have to check if it breaks any tests. It may take a few days.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not worth your time until we decide how to deal with this.

@bokken
Copy link
Member

bokken commented Jan 5, 2022

My understanding of the timestamp and timestamptz data types is that both are effectively some fractional second offset since epoch, but the difference is that for timestamp epoch is Jan 1, 1970 in some user defined timezone and for timestamptz it is always UTC.

Neither of those map very cleanly to the concept of LocalDateTime, which is timezone-less and has no concept of epoch.
/An/ approach is to always use UTC for LocalDateTime (whether timestamp or timestamptz) to get from LocalDateTime to an offset from an epoch. This is an approach which will not be affected by any daylight savings issues and will produce consistent results regardless of the database or client timezone.

@davecramer
Copy link
Member

My understanding of the timestamp and timestamptz data types is that both are effectively some fractional second offset since epoch, but the difference is that for timestamp epoch is Jan 1, 1970 in some user defined timezone and for timestamptz it is always UTC.

Neither of those map very cleanly to the concept of LocalDateTime, which is timezone-less and has no concept of epoch. /An/ approach is to always use UTC for LocalDateTime (whether timestamp or timestamptz) to get from LocalDateTime to an offset from an epoch. This is an approach which will not be affected by any daylight savings issues and will produce consistent results regardless of the database or client timezone.

That's an interesting question. What value is actually stored in a timestamp. I assume it would be epoh from UTC

@vlsi
Copy link
Member

vlsi commented Jan 6, 2022

Unfortunately, I do not understand the nature of the issue yet. I would review this sometime later.

The old behavior was there for quite some time, so users might already have implemented workarounds to "invalid pgjdbc behaviour". So we can't easily change the default without breaking people's data.

If we really want to flip the default (e.g. if the current behaviour is buggy), I would suggest the following:

  • We implement a feature flag so the new behaviour could be activated on demand: "localdatetime.mode= old | warn | error | new"
  • We start printing warnings whenever users hit the problematic code (e.g. when they really use the problematic lines)
  • Sometime later (e.g. after 6-12 months) we could flip the default
  • We might add an API so the users can request "what is the current behaviour of the driver". For instance something like is_LocalDateTime_sending_client_timezone. That might help writing client code in such a way so it automatically skips the workaround when the driver behaviour changes (e.g. default flips).

@bokken
Copy link
Member

bokken commented Jan 6, 2022

@davecramer, the binary parsing[1] of timestamp shows how we adjust the offset from epoch from some timezone. Note the comments about dragons.

[1] - https://github.com/pgjdbc/pgjdbc/blob/master/pgjdbc/src/main/java/org/postgresql/jdbc/TimestampUtils.java#L1211-L1215

@davecramer
Copy link
Member

@bokken thanks for that... more data for the indecision ... I guess my concern is what to do with timestamps without TZ. If the object that they request requires TZ then we can attempt to do something although it's apparent that it may be wrong.
At this point I'd like to do some research to see what others are doing

@marschall
Copy link
Contributor

My understanding of the timestamp and timestamptz data types is that both are effectively some fractional second offset since epoch, but the difference is that for timestamp epoch is Jan 1, 1970 in some user defined timezone and for timestamptz it is always UTC.

Neither of those map very cleanly to the concept of LocalDateTime, which is timezone-less and has no concept of epoch. /An/ approach is to always use UTC for LocalDateTime (whether timestamp or timestamptz) to get from LocalDateTime to an offset from an epoch. This is an approach which will not be affected by any daylight savings issues and will produce consistent results regardless of the database or client timezone.

Short answer: no, these are unfortunate side effects of implementation details of old JDK classes (java.util.Date, java.sql.*), the whole point of the java.time API is to get id of these

Long answer:

In SQL TIMESTAMP is just yyyy-MM-dd HH:mm:ss.SSSSSS, no time zone. Therefore also no daylight savings time or any need to convert anything between client and server time zones. This matches exactly to LocalDateTime. IMHO we should just send these values as is without any conversion and decode them without any conversion. How the server internally stores them should not concern us, we just keep to the protocol. In binary mode (which isn't implemented yet) a TIMESTAMP is implemented as microseconds since an epoch (2000-01-01 00:00:00). However since no time zones are involved we do not need to introduce any time zones and can just let the java.time API compute the microseconds without invoking any time zones. This is possible since SQL TIMESTAMP and LocalDateTime are not points in time or instants since no time zones are involved. Notice how everything is very simple, no time zones, no daylight savings time, no millis since the epoch are involved just yyyy-MM-dd HH:mm:ss.SSSSSS.

Where it gets complicated is once we involve java.sql.Timestamp. Keep in mind that the point of java.time was to solve these problems, so we should not introduce these issues in the java.time code path. java.sql.Timestamp is (unfortunately) implemented as a subclass (but not a subtype) of java.util.Date so we first need to understand java.util.Date. java.util.Date identifies a point in time or instant by means of milliseconds since the UNIX epoch (1970-01-01T00:00:00Z). It has no year, month, day, hour, minute, second, ... because it has no timezone. See also this Jon Skeet article. We can see this because all the year, month, day, hour, minute, second methods on java.util.Date are deprecated and tell us to use a java.util.Calendar and a time zone. In addition java.util.Date offers conversion only to and fromInstant, not LocalDateTime, ZonedDateTime or OffsetDateTime.
java.sql.Timestamp, besides adding nanoseconds, defines the year, month, day, hour, minute, second, ... to be in JVM time zone. Suddenly everything gets very complicated as two time zones are involved. What previously was, and semantically still is, yyyy-MM-dd HH:mm:ss.SSSSSS is now the local values in JVM time zone of a point in time / instant identified by the number of milliseconds since the epoch in UTC. Besides making everything unnecessarily complicated it also introduces new issues that come with time zones, namely what you do with yyyy-MM-dd HH:mm:ss.SSSSSS that don't exist as local values in JVM time zone because of daylight savings time changes.
java.sql.Timestamp is the equivalent of LocalDateTime which we can see from the additional conversion methods it offers.
For a meaningful #toString implementation java.sql.Timestamp references the JVM default time zone at object creation time. Unfortunately this is currently used for RestultSet#getString on TIMESTAMP columns.

Now for TIMESTAMP WITH TIME ZONE. What SQL calls a time zone is actually just a UTC offset in HH:mm. This is both good and bad. It's bad for the application as it doesn't allow you to do time calculation across daylight savings time changes. It's good for both Postgres and the driver because it keeps things things really simple, just subtract the offset and you have UTC. No need to deal with daylight savings time
On the Java side with java.time it's really simple again, OffsetDateTime is literally just a LocalDateTime plus a ZoneOffset. String serialisation is really simple, we use what we have for LocalDateTime and additionally append the ZoneOffset. For binary mode we can either compute the micros between 2000-01-01 00:00:00Z or convert to LocalDateTime in UTC. We can not send the zone offset in binary mode, everything is assumed to be UTC. I assume Postgres stores timestamptz internally the same as timestamp. Therefore if we send a zone offset over the text protocol Postgres converts it to UTC. Notice how only it is only now when our values have "time zones" that time zones have entered the discussion.
For java.sql.Timestamp it gets even more complicated, you're supposed to additionally pass a java.util.Calendar and take the time zone from there.

@bokken
Copy link
Member

bokken commented Jan 6, 2022

@davecramer, LocalDateTime maps really well to the Oracle DATE or TIMESTAMP data types. Neither of those are natively offset constructs. Rather they have year, month, day, hour, minute, seconds values.
It even maps ok to the Oracle TIMESTAMP WITH TIMEZONE, because the timezone is actually stored as a distinct attribute. It is reasonably easy to get each of the discrete field values in that timezone to populate a LocalDateTime.

@marschall
Copy link
Contributor

  • We start printing warnings whenever users hit the problematic code (e.g. when they really use the problematic lines)

I'm struggling to come up with a good way to detect this:

  • We could display a warning once for anybody that uses the LocalDateTime path and runs a time zone that observes daylight saving time. This could potentially be a lot of people that in practice never observe any issues.
  • We could look at each value that we store and the time zone and figure out whether it potentially falls into a daylight savings time transition. This is a potentially expensive calculation that has to be done for every value.
  • In TimestampUtils#toTimestamp(Calendar, String) we could check whether Calendar#get(int) returns the same values for the fields we just set. This affects people reading that through #getTimestamp or #getString. This would only be fixed if users migrate to LocalDateTime.

@bokken
Copy link
Member

bokken commented Jan 6, 2022

@marschall, are you suggesting to use the plus[1] and until[2] methods to convert from/to binary representation for timestamp?
Assuming that actually aligns with how postgresql does the calculations, that would seem to be a pretty decent improvement over current handling for that type.

[1] - https://docs.oracle.com/javase/8/docs/api/java/time/LocalDateTime.html#plus-long-java.time.temporal.TemporalUnit-
[2] - https://docs.oracle.com/javase/8/docs/api/java/time/LocalDateTime.html#until-java.time.temporal.Temporal-java.time.temporal.TemporalUnit-

@bokken
Copy link
Member

bokken commented Jan 6, 2022

Postgresql does not actually store any timezone or offset information as part of timestamptz. It is effectively identical to Instant.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants