Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add JDBC's setTimestamp method #2888

Merged
merged 11 commits into from
Jan 20, 2022
Merged

Conversation

Jens-H
Copy link
Contributor

@Jens-H Jens-H commented Jan 8, 2022

This PR refactors the conversions to/from Java's sql.Timestamp to enable the setTimestamp method.

The first step is to change the transformation between a DuckDB timestamp_t and sql.Timestamp. Instead of a conversion to string and back, a new class representing the DuckDB timestamp in the Java world is introduced. This class implements the conversions in both directions and therefore also enables the usage of setTimestamp and setObject with Timestamp. The second commit adds the necessary changes for the set... enhancement.

Also the TestDuckDBJDBC class got some new tests.

Copy link
Contributor

@hawkfish hawkfish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! My main concerns are the difference between local time and GMT and testing the new TIMESTAMP WITH TIME ZONE type. Also, your editor seems to be using a different indentation setting from the rest of the file (we have an .editorconfig file but it might not be set correctly for Java?)

@@ -325,9 +330,11 @@ JNIEXPORT jobjectArray JNICALL Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1fetch(
case LogicalTypeId::DOUBLE:
constlen_data = env->NewDirectByteBuffer(FlatVector::GetData(vec), row_count * sizeof(double));
break;
case LogicalTypeId::TIMESTAMP:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget TIMESTAMPTZ!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pls. see below.

@@ -164,6 +164,10 @@ public void setObject(int parameterIndex, Object x) throws SQLException {
if (params.length == 0) {
params = new Object[getParameterMetaData().getParameterCount()];
}
// Change sql.Timestamp to DuckDBTimestamp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does Java's sql library have TIMESTAMPTZ too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you know what type of timestamp you will get back, it is possible to use sql.Timestamp in both cases and then make a cast. In my understanding this is legacy since Java 8 and the much better java.time package. Via the get/setObject method an OffsetDateTime can be used for TIMESTAMPZ, but as this is also new in DuckDB I haven't tried it.

I would defer the whole timestamp with timezone to a later PR as it is a new feature.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIMESTAMPTZ internally is identical to TIMESTAMP so I was thinking just an extra case label was all that was needed. It sounds like Java (v8?) actually treats all timestamps as TIMESTAMPTZ and does the part binning in the users' local time zone?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right. E.g. a Timestamp.valueOf("1970-01-01 00:00:00").getTime() has the result -3600000 in my timezone. This is why it is necessary to normalize the long representation of the TIMESTAMP to UTC inside the database and then back to what ever timezone the client is in.

Implementing the TIMESTAMPTZ should be straight forward, but there is some work left to implement the get/setObject methods.

}

public DuckDBTimestamp(Timestamp sqlTimestamp) {
this.timeMicros = DuckDBTimestamp.RefLocalDateTime.until(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DuckDB (and RDBMSes in general) store GMT instants, so is local time what is wanted here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Java's sql.Timestamp and LocalDateTime (and util.Date) do not have a timezone, but internally they do have it in the difference to epoch. For storage, this has to be normalized to GMT and translated to the same value in the client timezone while reading.

JDBC has only get and setTimestamp, but get and setObject allow to ask for a LocalDateTime. For this reason the transformations are already implemented, though the get/setObject calls for LocalDateTime are not supported (yet).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, so older Java uses naïve timestamps?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you mean "just store the diff to epoch as int" with naïve timestamps, then I believe the answer is no.

public static void test_duckdb_timestamp() throws Exception {
Connection conn = DriverManager.getConnection("jdbc:duckdb:");
Statement stmt = conn.createStatement();
stmt.execute("CREATE TABLE a (ts TIMESTAMP)");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also test TIMESTAMPTZ

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pls. see above


// Insert and read a timestamp
stmt.execute("INSERT INTO a (ts) VALUES ('2005-11-02 07:59:58')");
ResultSet rs = stmt.executeQuery(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation looks odd?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot that VIM (with my config) changes tabs to spaces and created a mess here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries, we've all been there! Maybe install the VIM Editor Config plugin?

ps = conn.prepareStatement(
"SELECT COUNT(ts) FROM a WHERE ts = ?");
ps.setObject(1, Timestamp.valueOf("2005-11-02 07:59:58"), Types.TIMESTAMP);
ResultSet rs4 = ps.executeQuery();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is where you could check the local/GMT time values. The EPOCH function will tell you what the count of seconds from the GMT epoch is for each value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I missed something, but comparing the seconds from epoch between DuckDB's timestamp and Java's timestamp will not work if the client has not GMT. Only the result of the toString() functions can be compared and must be equal in the end.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What exactly does Java store in these objects? I was under the impression it was an instant that got binned in the local time zone, but it sounds like they are actually storing naïve timestamps?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To sum it up:
The older Java objects (Timestamp, Date) may store the diff to epoch and the timezone. The complete replacement with the new java.time package indicate that there were problems with the implementation and approach.

The newer LocalDateTime has a LocalDate and LocalTime, which keep the year, months, (...) as ints. This means they are unaware of the timezone.
The newer OffsetDateTime stores a LocalDateTime and a timezone offset.


Timestamp ts = Timestamp.valueOf("1970-01-01 01:01:01");

for (long i = 134234533L; i < 13423453300L; i=i+73512) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The EPOCH test would fit well here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pls. see above

@hawkfish
Copy link
Contributor

Now that I've had a chance to look at the Java docs (sorry it took so long!) I'm thinking that theLocalDateTime is a bit of a distraction. Java has an Instant type, which is exactly what DuckDB uses internally -and the correct type to use for temporal modeling. Does it make sense to always use this type to translate from TIMESTAMP? Would that help with future versions of Java temporal libraries?

@Jens-H
Copy link
Contributor Author

Jens-H commented Jan 15, 2022

I am not sure if you want to return Instant from the JDBC driver or rather use it as an internal, intermediate class?

@Jens-H
Copy link
Contributor Author

Jens-H commented Jan 15, 2022

As an intermediate class Instant could be a good fit for TIMESTAMP WITH TIMEZONE.

Unfortunately the conversion to OffsetDateTime is not very straight forward:
Instant.ofEpochSecond(0).atZone(ZoneOffset.systemDefault()).toOffsetDateTime()

For timestamps without a timezone I am not sure. Look at this behavior:
Timestamp.from(Instant.ofEpochSecond(0)) has this result: 1970-01-01 01:00:00.0

From my point of view the Instant would be no help for normal TIMESTAMPs.

@hawkfish
Copy link
Contributor

This stuff is such a mess. I feel like these libraries used to be assigned to junior devs because the manager thought it was simple and too boring for the rest of the team, and the junior dev didn't know what they were getting into.

I think the main thing now is just to make sure it works as expected, and that test results are not test location dependent. For the latter, value checking should be converted to UTC and formatted with offsets to make sure. I'd also suggest that since Java tries to be so "helpful" that the test case strings include TZ specifiers (e.g., Timestamp.valueOf("1970-01-01 01:01:01 +00:00")) just to be safe.

@Jens-H
Copy link
Contributor Author

Jens-H commented Jan 15, 2022

Timestamp.valueOf() does no accept a timezone, but the test could run several times in different local time zones using TimeZone.setDefault(). Would that be an option?

Jens-H and others added 2 commits January 16, 2022 22:18
For better test coverage the default timezone is changed at least 2 times
Copy link
Contributor

@hawkfish hawkfish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want fun time zones, there are always Asia/Katmandu and America/St_Johns (east and west partial hours). But thanks for adding this.

@Jens-H
Copy link
Contributor Author

Jens-H commented Jan 17, 2022

So is there anything missing?

@hawkfish
Copy link
Contributor

Nope - I gave it my stamp of approval! 🐠

@Mytherin
Copy link
Collaborator

Thanks!

@Jens-H Jens-H deleted the set-timestamp-parameter branch February 24, 2022 22:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants