New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CALCITE-6265] Type coercion is failing for numeric values in prepared statements #3687
Conversation
@@ -1374,11 +1374,35 @@ private Result toInnerStorageType(Result result, Type storageType) { | |||
} | |||
final Type storageType = currentStorageType != null | |||
? currentStorageType : typeFactory.getJavaClass(dynamicParam.getType()); | |||
final Expression valueExpression = | |||
|
|||
// For numeric types, get the value using the following functions on the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may work for numeric values, but does not seem to be a general solution for other types. Is there a design that would handle that? It looks like any source/destination type combination is possible. So perhaps this could be handled like a cast expression? Although I don't see the destination type in this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed. The PR only fixes the issue for numeric values, but for other types the user would still encounter a ClassCastException
. Ideally, we would have a function that handles all implicit conversions and fails with a user-friendly error for unsupported conversions. I am sure this functionality already exists in the Calcite in the form of CAST
expressions. Possibly it could be leveraged here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As @mihaibudiu suggests, this is basically a cast, which is already available via the Types.castIfNecessary
method, see below for how it can be used:
final boolean isNumeric = SqlTypeFamily.NUMERIC.contains(dynamicParam.getType());
final Expression valueExpression = Types.castIfNecessary(storageType,
EnumUtils.convert(Expressions.call(root, BuiltInMethod.DATA_CONTEXT_GET.method,
Expressions.constant("?" + dynamicParam.getIndex())),
isNumeric ? java.lang.Number.class : storageType));
If you replace your code with this, it passes all the tests you added, and it might be working for non-numeric cases too (maybe with some adaptations).
Could you give it a try and add more tests non-numeric types to check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion! It greatly simplifies the fix.
I made the requested change, but kept the null check as the cast operation may fail. For instance, o.intValue()
requires o
to be non-null. Otherwise, testPreparedStatement
would fail.
As for additional tests, do you have in mind assigning a non-numeric value to a numeric value? These would still raise an exception:
@Test void bindStringParameter() {
for (SqlTypeName tpe : SqlTypeName.INT_TYPES) {
final String sql =
"with cte as (select cast(100 as " + tpe.getName() + ") as empid)"
+ "select * from cte where empid = ?";
CalciteAssert.hr()
.query(sql)
.consumesPreparedStatement(p -> {
p.setString(1, "100");
})
.returnsUnordered("EMPID=100");
}
}
This throws a ClassCastException
which we could rewrite into a more user-friendly message. Alternatively, we could prevent the conversion happening in the first place by introducing a validation check.
java.lang.ClassCastException: class java.lang.String cannot be cast to class java.lang.Number
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally would file a new Jira issue for the casts not covered and merge this PR as is.
You have to decide which casts can appear in this context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. It seems unrelated to this PR. I created a ticket for it: https://issues.apache.org/jira/browse/CALCITE-6284
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I haven't been very precise, I didn't mean to cast Strings to numbers, what I had in mind was values from the string family to be used in parameters of the same family, possibly with (e.g., varchar(2)
value for varchar(3)
parameter), but I don't know if it is even expressible with the APIs we use to pass literals in, so feel free to leave this outside the scope of the present PR in case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made the requested change, but kept the null check as the cast operation may fail. For instance,
o.intValue()
requireso
to be non-null. Otherwise,testPreparedStatement
would fail.
I might have made a mistake but I tried all your tests with the suggested piece of code and all passed.
Did you add a new test specifically for this maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that makes sense. I added a similar test based on Mihai's request below. Since this PR is mostly about numeric types, VARCHAR
conversions could be tackled in a separate ticket.
Indeed. All the tests I added were passing with your change, but testPreparedStatement
existed before and started failing. Explicitly checking whether the value is null
, fixed it.
@@ -1374,11 +1374,35 @@ private Result toInnerStorageType(Result result, Type storageType) { | |||
} | |||
final Type storageType = currentStorageType != null | |||
? currentStorageType : typeFactory.getJavaClass(dynamicParam.getType()); | |||
final Expression valueExpression = | |||
|
|||
// For numeric types, get the value using the following functions on the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As @mihaibudiu suggests, this is basically a cast, which is already available via the Types.castIfNecessary
method, see below for how it can be used:
final boolean isNumeric = SqlTypeFamily.NUMERIC.contains(dynamicParam.getType());
final Expression valueExpression = Types.castIfNecessary(storageType,
EnumUtils.convert(Expressions.call(root, BuiltInMethod.DATA_CONTEXT_GET.method,
Expressions.constant("?" + dynamicParam.getIndex())),
isNumeric ? java.lang.Number.class : storageType));
If you replace your code with this, it passes all the tests you added, and it might be working for non-numeric cases too (maybe with some adaptations).
Could you give it a try and add more tests non-numeric types to check?
I have added the "discussion-in-jira" label as Julian raised an interesting point, we might be looking at the symptoms but not at the cause. @tindzk, would you mind checking in Avatica if there is a way to prevent this problem from happening by coercing/casting there? |
5809e02
to
1718aa0
Compare
} | ||
} | ||
|
||
@Test void bindLongParameter() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also add a test where the value overflows the type, e.g., TINYINT with a value of 300?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TINYINT(300)
is not valid SQL, but while working on a test for it, I noticed that NUMERIC(<digits>)
conversions are currently not supported either. An exception is thrown because Primitive.ofBox(returnType))
returns null
for BigDecimal
. This is fixed in the last commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CAST(300 as TINYINT)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for clarifying. CAST(300 as TINYINT)
is correctly fine, but CAST(? as TINYINT)
does not trigger an error when the value is out of bounds. I added a test and fixed it.
1718aa0
to
5506fd9
Compare
storageType); | ||
isNumeric ? java.lang.Number.class : storageType); | ||
|
||
// The cast operation will fail for null values |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see why this comment may be confusing: what you mean is that casts have to be handled specially, because otherwise they would cause a failure. But the comment as phrased can be interpreted that a cast null will cause a failure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I clarified the comment.
Type type) { | ||
throw Extensions.todo(); | ||
if (type == Byte.class |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's funny, I implemented this myself in a separate PR: #3589
There I am relying on an existing implementation in Primitive.numberValue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. Your approach seems to cover more scenarios.
I will keep convertChecked
as-is for now. The inlined validations should be sufficient within the current PR's scope, but feel free to replace it once #3589 has been merged.
/** | ||
* Creates an expression that represents the throwing of an exception. | ||
*/ | ||
public static Expression throwExpr(Expression expression) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a ThrowStatement.
Maybe having an expression is useful, since it can be used in an expression context - you don't need to break the current expression to create a new statement.
I would document this feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
throwExpr()
is perhaps overly specific. I generalised the function such that it can be used with any statement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR is getting to a mergeable status, but we need tests covering the new/modified expressions in ExpressionTest.
Could you also check Sonar issues and fix the ones which are legit?
/** | ||
* Represents an expression cast to the specified type. | ||
*/ | ||
public class CastExpression extends Expression { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought I left a comment in my previous review that the functionality of Cast already seems to exist: it is a UnaryExpression with type Convert. I don't think we should have two ways to do the same thing unless you can explain clearly how this is different; this will be very hard to maintain, since people won't know where to look for the functionality of casts. Why can't you reuse the Convert?
For some reason I don't see this comment, maybe I forgot to type it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, this was also suggested in #3687 (comment) and adopted in a following commit, so the change was intentional I guess, but if possible let's reuse convert
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing it out. I was not aware that Convert
maps onto a cast operation. I removed CastExpression
and changed the code to use Expressions.convert_()
instead.
Quality Gate passedIssues Measures |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, nice work @tindzk!
the next step is to rebase on main and squash all the commits into a single one in preparation for merging |
@tindzk please make sure the commit message is well formatted and it matches the title of the corresponding Jira ticket when you rebase, thanks! |
…d statements Given a column of type `INT`. When providing a `short` value as a placeholder in a prepared statement, a `ClassCastException` is thrown. Test case: ``` final String sql = "select \"empid\" from \"hr\".\"emps\" where \"empid\" in (?, ?)"; CalciteAssert.hr() .query(sql) .consumesPreparedStatement(p -> { p.setShort(1, (short) 100); p.setShort(2, (short) 110); }) .returnsUnordered("empid=100", "empid=110"); ``` Stack trace: ``` java.lang.ClassCastException: class java.lang.Short cannot be cast to class java.lang.Integer (java.lang.Short and java.lang.Integer are in module java.base of loader 'bootstrap') at Baz$1$1.moveNext(Unknown Source) at org.apache.calcite.linq4j.Linq4j$EnumeratorIterator.<init>(Linq4j.java:679) ```
@asolimando I squashed the commits and updated the title. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tindzk by checking the Jira ticket I just remembered we didn't follow up on Julian's suggestions to look into the Avatica side before trying to fix it into the Calcite side.
Not surprisingly he seems to be right: if you check AvaticaSite.java#L93, you have setXYZ
methods, where we have all the information we need to check overflows and possibly do type coercion.
It seems the right thing to do for two reasons:
- it's early on, the earliest we fix this kind of issues, the better it is
- kind of a consequence of 1), but the fix in
Linq
realms seems overly complicated compared to what we can do on the Avatica side, this suggests that we should do it before landing there
For these reasons I am removing my approval and I am asking you to check the aforementioned AvaticaSide
and see if we can get a simpler fix there.
If after this due diligence we are still convinced that Calcite (Linq) is the better place to have this handled, let's make our point in Jira and try to have consensus there.
Apologies for forgetting to follow-up on Julian's message, I have also put the "discussion-in-jira" label as a reminder but it has slipped through the cracks nonetheless!
After checking the referenced code I am not sure whether the issue can be fixed in Avatica. In order to perform type coercion, the target type needs to be known which does not seem to be the available there. Without this information, we also cannot check for overflows. Note that type conversions are already happening in The remaining changes are unrelated to the reported issue and fix tests that were requested during the PR review, notably passing values to |
From my understanding of the codebase this is not an Avatica issue. |
It would be nice to get this in for the next release, so hopefully we can merge this. |
Maybe I am wrong. I guess this is a bug in the transfer of data between JDBC and the evaluator, and JDBC is the one that should enforce the constraints given by the type. I haven't looked at the flow of data enough to judge this. |
I agree, that's why I have set the fixVersion to
Indeed after a closer look I agree that we don't seem to have enough context, neither in the Avatica setters I suggested earlier, nor in Calcite at an earlier stage, or at least I could not figure it out with the time I could devote. So yeah, not sure what Julian had in mind in his Jira comment, could be worth replying there with our findings and at least give him a chance to clarify, if there are no objections or replies for a while, then we merge what we have, we don't need to lose the release train. WDYT? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(removing the request for changes as my questions have been answered)
Given a column of type
INT
. When providing ashort
value as a placeholder in a prepared statement, aClassCastException
is thrown.Test case
Stack trace