Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TimeTest unit test appears to only pass in UTC #1048

Open
bokken opened this issue Dec 29, 2017 · 39 comments
Open

TimeTest unit test appears to only pass in UTC #1048

bokken opened this issue Dec 29, 2017 · 39 comments

Comments

@bokken
Copy link
Member

bokken commented Dec 29, 2017

The test creates a java.sql.Time object using the deprecated constructor setting hours, minutes and seconds to 0. This constructor interprets all of these values in the local timezone (using constructor from java.util.Date). This value is set to variable midnight.
The value "00:00:00" is directly inserted to the table. This value is then read from the table. PGResultSet has specific logic for the string "00:00:00", always returning new Time(0);.
The test then compares this value to midnight for equality. These values can only be equal if the local time zone is UTC so that the original midnight has no offset.

The behavior in PGResultSet seems wrong to me in that it ignores the Calendar for the 2 special values of 00:00:00 and 24:00:00. But that may be overly simplistic.

@davecramer
Copy link
Member

we've always had problems with the tests passing in different timezones when daylight savings changes. If you can fix this that would be great

@bokken
Copy link
Member Author

bokken commented Dec 29, 2017

But what is the fix? It /seems/ like it might actually be a code defect, not a test defect, but I am not certain.

@bokken
Copy link
Member Author

bokken commented Dec 29, 2017

Indeed, changing PGResultSet to remove the special handling for 00:00:00 and 24:00:00 seems like the fix to me.

@davecramer
Copy link
Member

be nice if whomever put that in there would comment why( hopefully it wasn't me)

@davecramer
Copy link
Member

davecramer commented Dec 29, 2017 via email

@bokken
Copy link
Member Author

bokken commented Dec 29, 2017

So if I follow that correctly, the problem is when using LocalTime.MIN and LocalTime.MAX?
As an aside, based on java doc for LocalTime.MAX, it seems that should be 23:59:59, not 24:00:00.
Regardless, the changes to getTime seem wrong. There seems to be no need for special handling of 00:00:00 (i.e. LocalTime.MIN). If LocalTime.MAX is to set 24:00:00, the current code appears to be invalid according to java doc of java.sql.Time: hour - 0 to 23

@davecramer
Copy link
Member

This is really an artifact of not being able to map PostgreSQL times to java times. Do you have a specific issue that this is not working for ?

@bokken
Copy link
Member Author

bokken commented Dec 29, 2017 via email

@davecramer
Copy link
Member

davecramer commented Dec 29, 2017 via email

@bokken
Copy link
Member Author

bokken commented Dec 29, 2017 via email

@bokken
Copy link
Member Author

bokken commented Dec 29, 2017 via email

@davecramer
Copy link
Member

@vlsi comments?

@vlsi
Copy link
Member

vlsi commented Dec 29, 2017

@bokken , what is the issue?
Would you please state it as in "expected outcome / actual outcome"? Would you please add a PR with failing test?

The test you mentioned does call cal.setTimeZone(TimeZone.getTimeZone("GMT")); (

cal.setTimeZone(TimeZone.getTimeZone("GMT"));
), so it is supposed to be GMT-only test.

There are getTime(int, Calendar) tests as well:

public void testGetTime() throws Exception {

There's

public void localTimestamps(String timeZone) throws Exception {
test as well

@bokken
Copy link
Member Author

bokken commented Dec 29, 2017

The existing test fails when run in any timezone other than UTC/GMT (i.e. offset of 0).

The explicit setting of time zone to GMT for that calendar is to calculate the offset for the local time zone. Note that Calendar instance is never used to interact with any of the pgjdbc methods.

The existing TimeZone tests do not cover the special values for getTime of "00:00:00" or "24:00:00". The current behavior for both of those values is to ignore the calendar provided. I have added tests for both of those cases in PR #1049

In doing this, I have found an additional problem when getTime is called with a Calendar for a column data type which includes TZ information. But I will bring that up a different issue.

@bokken
Copy link
Member Author

bokken commented Dec 29, 2017

I think the tests the behavior I am expecting for 00:00:00 is pretty straight foward.

The handling of 24:00:00, however, is far trickier. If it is not massaged to 23:59:59.999999999, then the point in time represented is actually midnight the next day. Which is to say that 1970-01-01 24:00:00 is the same as 1970-01-02 00:00:00. Indeed, the java.sql.Time object toString method will return 00:00:00 after parsing 24:00:00.
It seems like the goal is to treat 24:00:00 in the database as a sentinel value for "max time", similar to LocalTime.MAX. In this vein, I added tests to the PR that compare getTime and getObject(LocalTime.class) to assert consistent results.

@bokken
Copy link
Member Author

bokken commented Dec 31, 2017

The discussion for this is now over at #992.

To summarize, java.sql.Time and getTime methods have always treated 24:00:00 to be the same as 00:00:00.
I would propose that LocalTime do the same.

The PR #1049 has tests covering 00:00:00, 24:00:00 and various time zone options along with proposed fixes for the issues identified.

@davecramer
Copy link
Member

this artifact seems to be a mistake IMO. As noted on the hackers list https://www.postgresql.org/message-id/CADK3HHJgZ2dxdTOTYr4bD70ifzbT5k5xsRM%2Br%3DQAQfT8m4vj6Q%40mail.gmail.com 24:00:00 is exactly 1 day so treating it like 00:00:00 seems wrong.

@bokken
Copy link
Member Author

bokken commented Dec 31, 2017

@davecramer That seems to be a difference between duration and time. 1 day is a duration.
1 day after 00:00:00, the time is 00:00:00.

LocalTime actually provides pretty good semantics around this. One of the changes I also had to make in my PR was to handle the binary representation of 24:00:00 to LocalTime. The binary representation is 86,400,000,000 (exactly 1 day/24 hours in microseconds). Trying to create a LocalTime with that results in an IllegalArgumentException stating that the max allowed value is 85,399,999,999,999 (nano seconds). This is consistent with LocalTime not allowing a concept of 24:00:00; treating it the same as 00:00:00.

@davecramer
Copy link
Member

@bokken ya, there is definitely an impedance mismatch between time and duration here.
I'm open to suggestions

@bokken
Copy link
Member Author

bokken commented Dec 31, 2017

I think that the least wrong solution is to treat postgres 24:00:00 (string) and 86,400,000,000 (binary) as 00:00:00 and 0 respectively. It is not clear to me why postgres allows the representation of 2 midnight values, but since java only allows 1 (documented in java.sql.Time and actually enforced in LocalTime), massaging the second midnight into the first midnight seems reasonable.

This is how java.sql.Time has been treated historically and I propose that it is the cleanest mapping for LocalTime a well.

@bokken
Copy link
Member Author

bokken commented Dec 31, 2017

Also, this discussion has wandered a bit from what I originally reported (sorry).

The problem I was originally reporting is how getTime is handling string values for time of 00:00:00 and 24:00:00. It appears that the changes in f2d8ec5 added specific handling, and that handling appears wrong to me. Specifically, for 24:00:00 it always uses the default time zone (ignoring the provided Calendar) and for 00:00:00 it always uses UTC (ignoring the provided Calendar and the local time zone when no Calendar).
This behavior is also inconsistent with the binary representation of time for 0 and 86,400,000,000.
The one case where all of this could work correctly is if the default time zone is UTC.

@davecramer
Copy link
Member

I'm beginning to think that the current behaviour is not that wrong.

The reason this occurs is due to integer precision see https://www.postgresql.org/message-id/28026.1514742632%40sss.pgh.pa.us

and note the following:

anything beyond 6 9999995 produces 24:00:00

select '23:59:59.9999999'::time;
time

24:00:00
(1 row)

below that produces:

test=# select '23:59:59.999999'::time;
time

23:59:59.999999
(1 row)

So if we get 24:00:00 from the backend and then set it to 23:59:59.999_999_999 internally in the java code it should work fine. Writing it back to the server will end up with 24:00:00

as far as UTC goes. If there is no timezone information then it has to use UTC. If it is timetz then we should use the calendar provided.

@bokken
Copy link
Member Author

bokken commented Dec 31, 2017

That feels... odd.
My initial reaction is that this reflects wrong handling of fractional seconds on the server side. It seems it would be better to truncate fractional seconds at whatever precision is supported rather than round to the next hour.
However, it seems that ship has sailed since that change appears to have been made in 2005.

So do we have consensus that 24:00:00 only exists to represent rounding error from 23:59:59.999999?
@vlsi, do you concur?

{quote}
as far as UTC goes. If there is no timezone information then it has to use UTC. If it is timetz then we should use the calendar provided.
{quote}

I do not believe that is accurate. For timetz, the offset in the data type itself should be used and any provided Calendar should be ignored. For time (no time zone), the provided Calendar should be used, if no Calendar provided, the default TimeZone of the system should be used.

@vlsi
Copy link
Member

vlsi commented Dec 31, 2017

TL;DR: I would refrain from committing any time-related changes unless those are proven to be blocker/critical and/or trivial ones.

Frankly speaking, 42.2.0 release is a long overdue, and

  1. Current issue does not sound like a blocker/critical one
  2. Current issue does not sound like a trivial change
  3. I still do not see "expected vs actual" comparison that would stress inaccurate behavior of a current code
  4. PR test: test behavior of time columns for special values #1049 changes lots of things at once (both implementation and test code), and it is hard to trace what fixes what

bokken: I do not believe that is accurate. For timetz, the offset in the data type itself should be used and any provided Calendar should be ignored.

Why do you think so? Do you know server sends binary timestamps without time zones? Don't you say the query might produce different time values depending on the binary/text wire format?

@bokken
Copy link
Member Author

bokken commented Dec 31, 2017

Frankly speaking, 42.2.0 release is a long overdue

I agree.

Current issue does not sound like a blocker/critical one

The one area I might disagree is the if/else introduced here which handles both string values incorrectly in most time zones. It is also inconsistent with binary representation in most time zones. I cannot think of a single use case where this fixes any behavior which was previously wrong. While there are a couple of cases where it is not wrong, that is kind of like a broken clock being right twice a day :)

Current issue does not sound like a trivial change

Certainly not trivial. Timezones regularly make my head hurt.

I still do not see "expected vs actual" comparison that would stress inaccurate behavior of a current code

The TimeTest test0000Time provides a number of cases of expected/actual which are all currently broken in string representation /except/ when the time zone is GMT.
Here is an example from an earlier test from before adding in testing with more default time zones.

bokken: I do not believe that is accurate. For timetz, the offset in the data type itself should be used and any provided Calendar should be ignored.

Why do you think so? Do you know server sends binary timestamps without time zones? Don't you say the query might produce different time values depending on the binary/text wire format?

Because that is how the jdbc doc says it should behave.

Part of my proposed set of changes is that TimestampTZ should not be supported in binary format for the very reason that offset data is lost in the binary format.
This was required to get consistent results for getTime for timestamptz columns in both binary and string formats.

@vlsi
Copy link
Member

vlsi commented Dec 31, 2017

Because that is how the jdbc doc says it should behave.

Technically speaking, PostgreSQL DB does not store timezone information.

@vlsi
Copy link
Member

vlsi commented Dec 31, 2017

Part of my proposed set of changes is that TimestampTZ should not be supported in binary format for the very reason that offset data is lost in the binary format.

Now I see why, however please add explicit comments for that kind of changes. Otherwise the change is cryptic.

In any way, the offset data is not lost as it was not there in the first place. DB just uses "current timezone" to fake the offset. It might be related to "connection timezone", however it does not matter much, as the offset is never stored in the data files. That is why I do not think "text" format contains more information than binary one.

@vlsi
Copy link
Member

vlsi commented Dec 31, 2017

@bokken
Copy link
Member Author

bokken commented Dec 31, 2017

The TimeZoneTest seems to show that the string representation returns the offset that was stored. The test has set the default timezone to GMT+1. I am running this on a system where the timezone is America/Chicago (GMT-6), yet the string returned reflects GMT+3 that was inserted.

@bokken
Copy link
Member Author

bokken commented Dec 31, 2017

Now I see why, however please add explicit comments for that kind of changes. Otherwise the change is cryptic.

Agreed. I commented on it at the end of the block adding binary OIDs. It did not seem to make sense to have that in the middle of that block, but I can certainly move it if that would help.

@davecramer
Copy link
Member

I'm not convinced that we need to remove this. As @vlsi there is no timezone data in the server data. If the user wants binary timestamps and is aware of the issues then we should let them use them

@bokken
Copy link
Member Author

bokken commented Dec 31, 2017

So somewhere either in the unit tests or some config I have not noticed in my local postrgresql instance, the "session" timezone is being set to GMT+3.

That offset being returned led me to believe it was stored in the db.
Knowing that it is not stored, I would agree there is no need/value in removing the binary support.

However, it introduces a different set of potential issues on how getTime should behave for timestamp columns. My initial thought is that the timestamp should be read to determine absolute point in time, then time converted to the provide calendar.

My earlier comment about Calendar:

For timetz, the offset in the data type itself should be used and any provided Calendar should be ignored. For time (no time zone), the provided Calendar should be used, if no Calendar provided, the default TimeZone of the system should be used.

I was talking specifically about time and timez columns, not timestamps. As best I can tell, the timez column truly does store offset information in the database. In that case, the provided Calendar should not be used. For a simple time column, the provided Calendar (or determined default time zone), should always be used.

@davecramer
Copy link
Member

davecramer commented Jan 1, 2018 via email

@vlsi
Copy link
Member

vlsi commented Jan 1, 2018

As best I can tell, the timez column truly does store offset information in the database

Well, PG's timez does have an offset field. java.sql.Time does not have one.

What do you mean by "using offset information" when converting something like "15:40+04:00" into java.sql.Time? Which value do you expect to end up with?

@bokken
Copy link
Member Author

bokken commented Jan 1, 2018

What do you mean by "using offset information" when converting something like "15:40+04:00" into java.sql.Time? Which value do you expect to end up with?

As I read the jdbc javdoc, that should result in a java.sql.Time object representing 1970-01-01 15:40:00 +04:00
Regardless of what the default time zone is or any optional Calendar provided to getTime.

And this is mostly how pgjdbc has behaved historically. The existing TimezoneTest mostly asserts this behavior. I have found 1 issue where, if a Calendar with TimeZone that is something like GMT+12 or more, the TimestampUtils.convertToTime method butchers the data. This is why the existing TimezoneTest shows getTime with different Calendars always returns the same value, until given a timezone of GMT+13, then it returns something different.
This also affects when getTimestamp is called on a timetz column. This is also shown in the existing TimezoneTest where there is a "CHECK ME" comment seemingly indicating the author was unconvinced that the assertion was actually desired.

The tests I have added cover this behavior in a more exhaustive way, testing many more timezones and covering both string and binary representations. Though given my surprise on behavior of timestamptz, I have been thinking about adding more tests with different explicit offsets for timestamptz columns.

The only tests I have logically changed are the 2 Timezone tests referenced above which were asserting wrong behavior.

The changes to "real" code I have proposed cover basically 3 things:

  1. Remove the special handling of strings 00:00:00 and 24:00:00 in pg result set for getTime
  2. Massage the reading of time 24:00:00 to 23:59:59.999999 (both string and binary, both java.sql.Time and LocalTime)
  3. For timetz columns only use offset from DB for constructing java.sql.Time object (both string and binary).

To muddy the waters even more, java 8 introduced new methods to move between java.sql.Time and LocalTime. I started a different issue to discuss that: #1050
It is a radical departure to how pgjdbc has historically handled java.sql.Time.

@vlsi
Copy link
Member

vlsi commented Jan 1, 2018

As I read the jdbc javdoc, that should result in a java.sql.Time object representing 1970-01-01 15:40:00 +04:00

Your move. Please, go ahead and represent 1970-01-01 15:40:00 +04:00 via java.sql.Time. I'm all ears.

@bokken
Copy link
Member Author

bokken commented Jan 2, 2018

    final TimeZone timeZone = TimeZone.getTimeZone("GMT+4:00");
    Calendar cal = Calendar.getInstance(timeZone);
    cal.set(Calendar.YEAR, 1970);
    cal.set(Calendar.MONTH, 0);
    cal.set(Calendar.DAY_OF_MONTH, 1);
    cal.set(Calendar.HOUR_OF_DAY, 15);
    cal.set(Calendar.MINUTE, 40);
    cal.set(Calendar.SECOND, 0);
    cal.set(Calendar.MILLISECOND, 0);
    Time time = new Time(cal.getTimeInMillis());
    SimpleDateFormat sdf = new SimpleDateFormat("HH:mm:ss Z");
    sdf.setTimeZone(timeZone);
    System.out.println(sdf.format(time));

This will print out: 15:40:00 +0400
This is also basically what pgjdbc does to parse the timetz string 15:40:00+4 into a java.sql.Time.

@vlsi
Copy link
Member

vlsi commented Jan 8, 2018

@bokken , thanks for bringing the issue up.
tz issues are extremely hard to get, and I hope I was not too resistant.

On the other hand, by

vlsi: Would you please state it as in "expected outcome / actual outcome"? Would you please add a PR with failing test?

I meant something like a3982b4 (a couple of lines changed in .travis.yml)

As you can see, it just adds time zone to the equation, and it does result in CI test failures: https://travis-ci.org/pgjdbc/pgjdbc/builds/326318944

TL;DR: it is preferred to have as short reproducers as it is possible, so I hope that suggests why I was not eager reading through https://github.com/pgjdbc/pgjdbc/pull/1049/files.

There might be good stuff in #1049, however I would really like to see the problem first, only then discuss solutions.

PS. I think MIN/MAX/24:00:00 is not tested properly, however we do have multiple issues on the subject. I'm inclined to close as much of them as it is possible, otherwise they will pile up with no good result.

@bokken
Copy link
Member Author

bokken commented Jan 8, 2018

@vlsi, unfortunately, I do not know enough about travis to have even considered that sort of proposal.

I attempted (though clearly inarticulately and certainly not early enough in conversation) to indicate that simply running the existing tests in a different default timezone caused failures.

There might be good stuff in #1049, however I would really like to see the problem first, only then discuss solutions.

Your changes in #1053 actually fairly closely match significant portions of #1049. The specific pieces which are different are the handling of 24:00:00 (whose meaning still does not seem clear to me and further discussion is absolutely warranted) and timestamptz (where I think the provided calendar should be used).

PS. I think MIN/MAX/24:00:00 is not tested properly, however we do have multiple issues on the subject. I'm inclined to close as much of them as it is possible, otherwise they will pile up with no good result.

+1
I do think there is value in taking a closer look specifically at the unit tests in #1049. There are areas where testing 24:00:00 and timestamptz where the expected value might need to be changed, but the tests cover a number of use cases which are not currently covered. For example, reading 24:00:00 as binary in LocalDateTime currently fails because the value 86,400,000,000 is larger than the allowed max.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants