
PreparedStatement.setTimestamp(index, timestamp, calendar) has datatype conversion issues. #443

Closed
gs-rezaem opened this issue Aug 14, 2017 · 40 comments

@gs-rezaem

Driver version or jar name

The test works correctly with the old version of driver (2.0.1803), but fails with 4.x and 6.x, including 6.2.1 and 6.3 preview.

SQL Server version

Microsoft SQL Server 2008 (SP4) - 10.0.6241.0 (X64)

Client operating system

Windows 7

Java/JVM version

Oracle JDK 1.8.0_xxx

Table schema

See the test code Stand alone test class

Problem description

See the test code here: Stand alone test class

Expected behavior and actual behavior

When the query is executed with a prepared statement, it should return the correct result.

Repro code

Stand alone test class
This test code works correctly with MSSQL/2.0 driver, MSSQL/jtds driver, 4 other commercial and 3 open source databases. It fails with 4.x and 6.x series drivers.

@gordthompson
Contributor

This appears to be a result of sending the Timestamp parameter value as datetime2:

exec sp_executesql N'select count(*) from #TEST_TIMESTAMP_CAL where TIMESTAMP_COL_NONE = @P0        ',N'@P0 datetime2','2007-01-01 01:01:01.9990000'

We can see the same behaviour with a stored procedure that tries to use a datetime2 parameter to look up values in a datetime column.

The following procedure returns a count of 0 ...

CREATE PROCEDURE issue_443 
    @p1 DATETIME2 = '2007-01-01 01:01:01.999'
AS
BEGIN
    SET NOCOUNT ON;
    CREATE TABLE #TEST_TIMESTAMP_CAL (ID INTEGER NOT NULL, TIMESTAMP_COL_NONE DATETIME NOT NULL);
    INSERT INTO #TEST_TIMESTAMP_CAL (ID, TIMESTAMP_COL_NONE) VALUES (1, '2007-01-01 01:01:01.999');
    SELECT COUNT(*) AS n FROM #TEST_TIMESTAMP_CAL WHERE TIMESTAMP_COL_NONE = @p1;
END

... but if we simply change the parameter type from DATETIME2 to DATETIME ...

ALTER PROCEDURE issue_443 
    @p1 DATETIME = '2007-01-01 01:01:01.999'
AS
...

... then it returns a count of 1.

@gs-rezaem gs-rezaem changed the title PreparedStatement.setTimestamp(index, timestamp, calendar) has timezone conversion issues. PreparedStatement.setTimestamp(index, timestamp, calendar) has datatype conversion issues. Aug 15, 2017
@gs-rezaem
Author

So I've been thinking this through and I'm not sure if this is fixable. It appears to be a very unfortunate combination of rounding, driver parameter type selection and default casting/conversion rules.

The drivers that work send the data as datetime. Modifying the current driver to send datetime also works, but I'm reasonably certain that would break datetime2 columns.

Here is my current understanding of how this works:
On insert, the value stored in the database is '2007-01-01 01:01:02.000' (not ...01.999) because of the inherent precision of datetime.

  1. TIMESTAMP_COL_NONE = '2007-01-01 01:01:01.999' works because the database sees this:
    <type datetime> = <type varchar literal>
    default casting rule: convert varchar literal to datetime, which results in '2007-01-01 01:01:02.000', and matches the value in the database.

  2. TIMESTAMP_COL_NONE = ? , with ? sent as datetime has no conversion issues and just works.

  3. TIMESTAMP_COL_NONE = ?, with ? sent as datetime2. the database sees:
    <type datetime> = <type datetime2>
    default casting rule: upcast the left side to datetime2. This leaves the table value as '2007-01-01 01:01:02.000' (already rounded at insert), but the parameter is '2007-01-01 01:01:01.999', so they do not match.

So is this a driver bug? Can the driver intelligently choose between datetime and datetime2?

Is the default conversion rule for column = parameter or column = variable badly chosen on the server side? Under what conditions does it make sense to upcast a datetime to datetime2, if the datetime2 is a parameter or variable?
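The mismatch described above can be sketched in plain Java. This is a hypothetical model of SQL Server's datetime storage (time kept in 1/300-second ticks), not driver code:

```java
public class DatetimeRoundingDemo {
    // datetime stores time-of-day in 1/300-second ticks, so after rounding the
    // millisecond component always lands on a .xx0 / .xx3 / .xx7 boundary.
    static int roundMillisToDatetime(int millis) {
        long ticks = Math.round(millis * 300.0 / 1000.0); // nearest tick
        return (int) Math.round(ticks * 1000.0 / 300.0);  // back to millis
    }

    public static void main(String[] args) {
        int stored = roundMillisToDatetime(999); // rounded at INSERT time
        int param = 999;                         // datetime2 parameter keeps .999
        System.out.println(stored);              // 1000, i.e. the next full second
        System.out.println(stored == param);     // false -> the WHERE clause misses
    }
}
```

Running this shows why case 3 returns a count of 0: the stored side was rounded up to the next second, while the datetime2 parameter was not.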

@gordthompson
Contributor

gordthompson commented Aug 15, 2017

@gs-rezaem - I think you've nailed it. I had forgotten about datetime rounding values so that the third millisecond digit is 0, 3, or 7.

The good news is that it is fixable if the driver responds to prepareStatement by actually sending the statement (command text) to the server to be prepared, retrieving the results of the prepare to determine the data types that the server expects, and then sending the parameter values in the appropriate format.

The bad news is that this extra step (and server round-trip) is something that the mssql-jdbc team has been working hard on trying to optimize out of the process, primarily for the case when PreparedStatement objects are used to send lots of "one-off" queries.

A user-side workaround would be to specify the parameters for datetime fields using setString instead of setTimestamp. However, that does not address the general case, e.g., for ORMs or other libraries that create the JDBC calls on our behalf.
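A minimal sketch of that setString workaround, assuming a table and query like those in the repro; the helper name toDatetimeLiteral is mine, not part of any API:

```java
import java.sql.Timestamp;
import java.text.SimpleDateFormat;

public class SetStringWorkaround {
    // Format the Timestamp as a millisecond-precision literal; the server then
    // applies its own varchar -> datetime conversion, which rounds the value
    // the same way the stored column value was rounded at INSERT time.
    static String toDatetimeLiteral(Timestamp ts) {
        return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS").format(ts);
    }

    // Hypothetical usage against the repro table (conn is an open Connection):
    //
    //   PreparedStatement ps = conn.prepareStatement(
    //       "select count(*) from #TEST_TIMESTAMP_CAL where TIMESTAMP_COL_NONE = ?");
    //   ps.setString(1, toDatetimeLiteral(ts)); // instead of ps.setTimestamp(1, ts)
}
```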

I seem to recall someone ( @TobiasSQL ?) mentioning an option to turn off the optimization(s) and let the driver work with PreparedStatement objects the old-fashioned way. I did a bit of searching around but unfortunately I could not find it.

@gs-rezaem
Author

gs-rezaem commented Aug 15, 2017

@gordthompson it'd be great for the driver to fix this, because it's much harder to fix at the app/ORM layer. One of the core promises of JDBC is taking care of DB specific types. An analogous example is the mapping to String: set/getString ought to work regardless of whether the column is defined as char, varchar, nvarchar, unichar, clob, text or any other DB specific type.

I didn't really want to go into workarounds, because they all cause other problems. For example, an ORM could issue the SQL as TIMESTAMP_COL_NONE = CAST(? as datetime) to work around the driver's insistence on datetime2, but that would break for datetime2 columns.

@gs-rezaem
Author

gs-rezaem commented Aug 15, 2017

I tried the driver parameter enablePrepareOnFirstPreparedStatementCall=true as well as executing the same preparedStatement twice without any luck (not that I would want to execute every statement twice 😄 ). @gordthompson I guess even if the statement is sent to the server, the driver currently doesn't do anything with the returned types.

@gordthompson
Contributor

I guess even if the statement is sent to the server, the driver currently doesn't do anything with the returned types.

Perhaps. I've been following some similar discussions regarding pyodbc (e.g., here and here) so I may have assumed JDBC behaviour based on what I understand about ODBC behaviour.

Now that I think about it, when they say "enablePrepareOnFirstPreparedStatementCall" they may actually mean that the first JDBC execute... call will result in an sp_prepexec instead of an sp_executesql. I honestly don't remember if (in recent history) the JDBC driver actually did a SQL Server sp_prepare in response to a JDBC prepareStatement call. I think that sp_prepare is what triggers the response from SQL Server that ODBC could process using the ODBC SQLDescribeParam function.

@AfsanehR-zz
Contributor

That is correct @gordthompson . Setting enablePrepareOnFirstPreparedStatementCall results in executing the first call to be sp_prepexec rather than sp_executesql. However, as @gs-rezaem mentioned, even enabling that would still have no effect since the driver is still setting typeDefinition of the parameter as datetime2.
https://github.com/Microsoft/mssql-jdbc/blob/dev/src/main/java/com/microsoft/sqlserver/jdbc/Parameter.java#L640

@TobiasSQL
Contributor

TobiasSQL commented Aug 18, 2017

@gordthompson , I am probably missing something but isn't the analog of ODBC's SQLDescribeParam JDBC's PreparedStatement.getParameterMetaData?

If you use this API to determine the type that should be a reasonable solution, no?

@gordthompson
Contributor

@TobiasSQL - Good suggestion. getParameterMetaData will in fact send

exec sp_executesql N'exec sp_describe_undeclared_parameters @P0        ',N'@P0 nvarchar(4000)',N'select count(*) from TEST_TIMESTAMP_CAL where TIMESTAMP_COL_NONE = @P0        '

to the server, and if we interrogate the ParameterMetaData object we can determine that the server expects a datetime parameter, but we already knew that. :-)

What seems to be missing is the way for the JDBC driver to know that too, so if we use setTimestamp the driver can send the parameter value as datetime instead of datetime2.

@gs-rezaem
Author

gs-rezaem commented Sep 6, 2017

@peterbae thanks for having a look at this. I don't believe this is trivial to fix, but thanks to the conversation here, I'd like to propose a fix:

Calling getParameterMetaData is a possible solution here.

  1. adding caching to getParameterMetaData so it only ever hits the DB once (seems safe because the userSQL is final)
  2. if ps.setTimestamp is called, call getParameterMetaData and pass a boolean flag to value setters that represents whether the type is datetime.
  3. clear the parameter data cache on execute. This is to guard against the extremely unlikely event of the types changing between calls, as might happen with drop/recreate of a (possibly temp) table.

Pros: Fixes the issue. No extra database hops if ps.setTimestamp is never called. Possible benefits from caching getParameterMetaData.
Cons: Extra db hop if ps.setTimestamp is called.

What do you think of this? If the above has a good chance being merged, I'll see if I can code it up. Can you point me to an existing test case that I can use as a template for this?
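Steps 1 and 3 of that proposal could look roughly like the sketch below. The class and method names are hypothetical, not the driver's internals:

```java
import java.util.function.Supplier;

// Caches the result of an expensive metadata fetch (e.g. getParameterMetaData)
// so repeated set* calls hit the database at most once, and invalidates the
// cache on execute in case the schema changed between calls.
class ParamMetaDataCache<T> {
    private final Supplier<T> fetch;
    private T cached;
    private int fetchCount;

    ParamMetaDataCache(Supplier<T> fetch) {
        this.fetch = fetch;
    }

    T get() {
        if (cached == null) {
            cached = fetch.get(); // one round-trip, then served from cache
            fetchCount++;
        }
        return cached;
    }

    void onExecute() {
        cached = null; // step 3: clear on execute to guard against DDL changes
    }

    int fetchCount() {
        return fetchCount;
    }
}
```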

@peterbae
Contributor

peterbae commented Sep 6, 2017

Hi @gs-rezaem, thanks for the contribution.

I had thought of a very similar solution to yours for this problem, but I couldn't get to the implementation as I was busy with focusing on other priorities. Please feel free to implement this and create a PR - I have a couple of comments as well:

  • I believe ps.setObject and ps.setDateTime should be checked as well as ps.setTimestamp for setting the boolean flag since those methods can be used to insert datetime as well.
  • As for clearing the parameter data cache on execute, it's an excellent point but since the user could be using a different Statement object to drop/recreate a table (and this different Statement object would not be able to clear the caching in the pstmt), I'm not sure how effective this would be.

As for the existing test case, the tests under TVP and BulkCopy test all datatypes, so you can run those to verify your fixes. For example: https://github.com/Microsoft/mssql-jdbc/blob/master/src/test/java/com/microsoft/sqlserver/jdbc/tvp/TVPAllTypes.java#L144

@ajlam
Member

ajlam commented Sep 12, 2017

What do you think of this? If the above has a good chance being merged, I'll see if I can code it up. Can you point me to an existing test case that I can use as a template for this?

@gs-rezaem, wondering if you're still planning on submitting a fix?

@ajlam ajlam added the Waiting for Response Waiting for a reply from the original poster, or affiliated party label Sep 12, 2017
@gs-rezaem
Author

Sorry, I was out with a cold and just getting back into this. I expect this will take me another week to deliver.

@ajlam
Member

ajlam commented Sep 13, 2017

@gs-rezaem, glad you're feeling better 😄 let us know if you need any pointers or run into any issues. Thanks again for your contributions.

@gs-rezaem
Author

I had a little trouble with the test cases. I can't seem to find a test case that does a select statement with a parameterized prepared statement. Let me know if I missed something. So here is the plan:

  1. Submit a pull request for a passing test case. Just to make sure we're on the same page here, formatting is good, cla is signed, etc. Test case for a select statement with prepared statement with parameters #492
  2. Modify the test to fail with the appropriate data and fix the code.

@gs-rezaem
Author

As I'm digging into this, I've realized that this bug has a larger surface area than I had originally thought. Specifically, the setObject(... int jdbcType ...) is troublesome. In designing the solution, I'm prioritizing backward compatibility over code readability, which is not a great choice. This comes down to one thing, in Types.java:

    /**
     * The constant in the Java programming language, sometimes referred to as a type code, that identifies the Microsoft SQL type DATETIME.
     * This type is considered legacy and DATETIME2 is the preferred new type.
     */
    public static final int LEGACY_DATETIME = -152;

    /**
     * The constant in the Java programming language, sometimes referred to as a type code, that identifies the Microsoft SQL type DATETIME2.
     */
    public static final int DATETIME = -151;

I'm clarifying Types.DATETIME to be DATETIME2 (because that's how the driver is behaving now) and introducing LEGACY_DATETIME, which maps to DATETIME. (The backward-incompatible, more readable choice would be to introduce DATETIME2).

Shout now if that's not consistent with how you want this done 😄.

@peterbae
Contributor

peterbae commented Sep 14, 2017

I don't think we need to modify the Types.java file - we have DATETIME and DATETIME2 under SSType, so we can use those instead.

I think we can solve this problem simply by modifying the Parameter::setTypeDefinition method for TIMESTAMP and DATETIME switch cases, where we would use a cached flag (that gets set whenever the user calls setTimeStamp, setDateTime or setObject), and set param.typeDefinition to DATETIME2 only if we're using Katmai servers and the target column on the SQL Server's end is DATETIME2.

Please let me know if you need more help or clarification. 😄
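The suggested switch could look roughly like this; a sketch only, since the real Parameter::setTypeDefinition handles many more cases:

```java
public class TypeDefinitionSketch {
    // Sketch of the suggested logic: declare the parameter as datetime2 only
    // when the server is Katmai (SQL Server 2008) or later AND the target
    // column is datetime2; otherwise fall back to legacy datetime.
    static String typeDefinitionForTimestamp(boolean isKatmaiOrLater, boolean targetIsDatetime2) {
        if (isKatmaiOrLater && targetIsDatetime2) {
            return "datetime2";
        }
        return "datetime";
    }
}
```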

@gs-rezaem
Author

gs-rezaem commented Sep 14, 2017

The flag solution was what I had originally thought about. But as I'm looking at the code, I think it's both more practical and consistent to go with the type solution.

Practical:

  • don't need an extra flag. Just pass in the correct jdbcType (already part of Parameter). This makes the wire serialization, which is also in Parameter, trivial.
  • don't need to change 70+ call sites to setValue

Consistency:

  • The documentation in Types.DATETIME is out of date; the mapping was changed to DATETIME2 sometime in the 4.x line.
  • The point of having microsoft.sql.Types extending java.sql.Types appears to be allowing microsoft-specific type information to be passed into methods like setObject. There is no way to pick DATETIME currently.

Edit: clarification on that last point. I'm left wondering:
When a developer calls setObject(... jdbcType= java.sql.Types.TIMESTAMP ...), what is their expectation?
When a developer calls setObject(... jdbcType= microsoft.sql.Types.DATETIME ...), what is their expectation?
I imagine, in the first case (TIMESTAMP), they want the driver to behave per the JDBC spec and map appropriately to whatever the DB type is. In the second case, I imagine the developer is being prescriptive about the exact implementation type. If that is the case, the mapping documentation (javadoc) should be corrected to DATETIME2, and a choice for plain DATETIME should be provided.

At the end of the day, it's your choice. Just let me know what you like.

@gs-rezaem
Author

Another way I look at the issue is this: how did this happen? What was the root cause?
If I'm reading the code correctly, the root cause was the mapping of java.sql.Types.TIMESTAMP and microsoft.sql.Types.DATETIME to DATETIME2 when DATETIME2 was introduced. If, at that time, the question "do we need another type in microsoft.sql.Types?" had been asked and a DATETIME2 constant introduced, keeping DATETIME for DATETIME, this could have been avoided at some level (the mapping of TIMESTAMP would at least have become questionable).

@peterbae
Contributor

I'm not sure about the reasoning behind the approach we took (to take advantage of the increased precision with DATETIME2?), but I like your approach as well. Please go ahead with it 👍

@gs-rezaem
Author

So I've run into a blocker, possibly another bug #493. I have no idea what to do there. Once that's fixed I'll continue here.

@gs-rezaem
Author

gs-rezaem commented Sep 22, 2017

@peterbae any thoughts as to how to proceed here or on #493?
One way forward would be to add a connection parameter "timestampEncoding" with these possible values:

  1. DATETIME2 -- always use DATETIME2 (current behavior). Good if the user knows they only have DATETIME2 columns.
  2. DATETIME -- always use DATETIME (old behavior). Good if the user is dealing with a legacy schema and is not willing to upgrade.
  3. QUERYDB -- the new code, but not compatible with encrypted connections. Good for mixed schemas.

(there is a 4th possibility, CHAR: by always encoding as CHAR, the server can do the parsing/casting correctly without the need for a second metadata call).

(aside, @ajlam, "waiting for response" is probably the wrong label on this)
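As a sketch, the proposed property might dispatch like this. The property name and values are the proposal above, not an existing driver option, and the varchar length is an assumption:

```java
public class TimestampEncodingSketch {
    enum TimestampEncoding { DATETIME2, DATETIME, QUERYDB, CHAR }

    // Map the proposed connection-property value to the type definition the
    // driver would declare for a Timestamp parameter. QUERYDB returns null to
    // signal that a metadata round-trip is needed to resolve the real type.
    static String typeDefinitionFor(TimestampEncoding encoding) {
        switch (encoding) {
            case DATETIME2: return "datetime2";
            case DATETIME:  return "datetime";
            case CHAR:      return "varchar(27)"; // let the server parse/cast
            default:        return null;          // QUERYDB: ask the server
        }
    }
}
```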

@AfsanehR-zz AfsanehR-zz removed the Waiting for Response Waiting for a reply from the original poster, or affiliated party label Sep 25, 2017
@peterbae
Contributor

Hi @gs-rezaem, I think it would be best for us to wait until #493 is resolved, then continue with the implementation of this issue. We will keep you posted.

@peterbae
Contributor

Hi @gs-rezaem, it seems like #493 might take some time to be implemented - do you think it'll be possible for you to work on a fix for this issue without the encrypted connections case?

@gs-rezaem
Author

@peterbae The only solution I can think of is outlined in my last comment. This has been percolating in the back of my mind for a while, and I'm actually warming up to that path forward, for one thing because users who don't have this issue can opt out of the extra DB hit.

@AfsanehR-zz AfsanehR-zz added this to the Next Tasks milestone Jan 23, 2018
@kimzeevaarders

kimzeevaarders commented Feb 12, 2018

Hi guys, any news on this issue? This problem is killing us in production ever since we upgraded to SQLServer 2016 and switched from jTDS to this driver. Is there something I can do to solve the problem without having to wait for the official fix?

Just for reference: I posted a detailed description of the problem on stackoverflow and linked back to this ticket from there: https://stackoverflow.com/questions/48422032/hibernate-jtds-mssql-jdbc-driver-problems-with-datetime-column-in-sql-server-2

@gs-rezaem
Author

@kimzeevaarders I don't believe there is a perfect way to fix this. If I were in your shoes, I would do this:

Option 1: If you own the schema, change all your datetime columns to datetime2(3). This is a good way to move your application forward.
Option 2: Not for production! Check out my tsfix branch. Create a Hibernate user type that calls the new set/getDatetimeLegacy methods. Test it out, and if this works for you, we could commit these new methods by themselves without the rest of the work.

@peterbae peterbae assigned ulvii and unassigned ulvii Feb 13, 2018
@kimzeevaarders

kimzeevaarders commented Feb 14, 2018

@gs-rezaem thanks for the feedback. I am indeed the owner of the schema and updated the column datatypes to DATETIME2. The problem is fixed now. However, could you please shed some light on potential problems such a modification could cause that I may not be aware of, or that need further testing?
Thanks!

@gs-rezaem
Author

gs-rezaem commented Feb 14, 2018

@kimzeevaarders Just a clarification: DATETIME2 defaults to (7) which might cause other problems when mapping to Java's millisecond precision. Use (3) to avoid mapping issues and minimize the column size.

@kimzeevaarders

@gs-rezaem I am aware of this thanks!

@gordthompson
Contributor

@gs-rezaem - DATETIME2 defaults to DATETIME2(7), doesn't it?

@gs-rezaem
Author

@gordthompson Yes, you're right. Fixed my comment. I emphasized using (3) because it maps well to Java time and uses the least amount of storage.

@pacham7

pacham7 commented Jun 26, 2019

Given that #493 and #927 are closed, can this be looked at again? Pretty sure it's breaking some functionality via jOOQ for us.

@ulvii
Contributor

ulvii commented Aug 30, 2019

Hi @pacham7 ,

We will look into this in one of the upcoming releases. Please stay tuned.

@dswitzer
Contributor

@peterbae any thoughts as to how to proceed here or on #493?
One way forward would be to add a connection parameter "timestampEncoding" with these possible values:

1. DATETIME2 -- always use DATETIME2 (current behavior). Good if the user knows they only have DATETIME2 columns.

2. DATETIME -- always use DATETIME (old behavior). Good if the user is dealing with a legacy schema and is not willing to upgrade.

3. QUERYDB -- the new code, but not compatible with encrypted connections. Good for mixed schemas.

(there is a 4th possibility CHAR. by always encoding as CHAR, the server can do the parsing/casting correctly without the need for a second metadata call).

(aside, @ajlam, "waiting for response" is probably the wrong label on this)

I really wish something like this would be implemented. There's a sendTimeAsDatetime connection string property that can be used to specify how Time values should be treated, so I don't know why we can't specify in the connection string how to deal with Timestamp values as well.

Our problem is that there are performance reasons why it makes sense for our application to stick with datetime over datetime2. This means we're either stuck having to use the database at a lower compatibility level, or we need to migrate our schema to use datetime2 and incur the performance penalties.

We could have the best of both worlds, if we could just specify how we want Timestamp values translated in SQL.

@lukaseder

For those running into problems via jOOQ, @pacham7 has documented a jOOQ specific workaround here: jOOQ/jOOQ#11916 (comment)

@dswitzer
Contributor

dswitzer commented Jun 2, 2021

@peterbae any thoughts as to how to proceed here or on #493?
One way forward would be to add a connection parameter "timestampEncoding" with these possible values:

1. DATETIME2 -- always use DATETIME2 (current behavior). Good if the user knows they only have DATETIME2 columns.

2. DATETIME -- always use DATETIME (old behavior). Good if the user is dealing with a legacy schema and is not willing to upgrade.

3. QUERYDB -- the new code, but not compatible with encrypted connections. Good for mixed schemas.

(there is a 4th possibility CHAR. by always encoding as CHAR, the server can do the parsing/casting correctly without the need for a second metadata call).
(aside, @ajlam, "waiting for response" is probably the wrong label on this)

I really wish something like this would be implemented. There's a sendTimeAsDatetime connection string property that can be used to specify how Time values should be treated, so I don't know why we can't specify in the connection string how to deal with Timestamp values as well.

Our problem is that there are performance reasons why it makes sense for our application to stick with datetime over datetime2. This means we're either stuck having to use the database at a lower compatibility level, or we need to migrate our schema to use datetime2 and incur the performance penalties.

We could have the best of both worlds, if we could just specify how we want Timestamp values translated in SQL.

I filed #1590 to add this specific feature.

@httpmurilo

Any proposed solution?

@lilgreenbird
Contributor

hi @httpmurilo

This, along with the enhancement request filed for this issue, is in our backlog; they will be triaged when we do planning for the next semester. Unfortunately, we do not have the resources to deal with anything other than the highest-priority issues right now.

@Jeffery-Wasty
Contributor

#1687 resolves this by allowing users to specify the datatype to use for date/timestamp parameters. @gs-rezaem , please let us know if this fix (included in 12.2+) works for you. Closing issue, but can reopen if needed.
