New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-8993][SQL] More comprehensive type checking in expressions. #7348
Conversation
cc @cloud-fan I had this change on my laptop I didn't commit, which I believe actually encompasses your change. #7338 |
override def checkInputDataTypes(): TypeCheckResult = { | ||
// First call the checker for ExpectsInputTypes, and then check whether left and right have | ||
// the same type. | ||
super.checkInputDataTypes() match { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note that this will use expects input types to do the type check, if available.
Test build #37063 has finished for PR 7348 at commit
|
} else { | ||
TypeCheckResult.TypeCheckSuccess | ||
} | ||
case TypeCheckResult.TypeCheckFailure(msg) => TypeCheckResult.TypeCheckFailure(msg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we simplify this to case fail => fail
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's a good idea
I've started writing a very basic fuzz-testing suite to check that expression code generation does not fail if an expression is initialized with NullType-d expressions as children and passes type checks after type coercion is applied: master...JoshRosen:fuzz-test Running that suite against this patch shows that it fixes 9 of my test's failures. I wouldn't read too much into this for now, since my suite needs some work to be more useful, but just thought I'd share since it's a neat testing technique and uncovered a few bugs that I had previously found via other methods (such as |
It's worth noting that a bunch of the errors in that fuzzing suite are caused by the fact that we skip type checks for certain internal expressions, such as UnscaledValue or AddItemToSet. Is there a good reason why we shouldn't add type checking to these expressions, if only to act as an internal sanity check? |
Those two expressions should / will just go away. |
* | ||
* 1. The string representation is "x symbol y", rather than "funcName(x, y)". | ||
* 2. Two inputs are expected to the be same type. If the two inputs have different types, | ||
* the analyzer will find the tightest common type and do the proper type casting. | ||
*/ | ||
abstract class BinaryOperator extends BinaryExpression { | ||
self: Product => | ||
|
||
def symbol: String |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we remove this function if we removed the function toString
. Let the subclass define the symbol
as needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's a mistake actually. i'm adding tostirng back
@@ -48,6 +49,9 @@ case class UnaryMinus(child: Expression) extends UnaryArithmetic { | |||
case class UnaryPositive(child: Expression) extends UnaryArithmetic { | |||
override def prettyName: String = "positive" | |||
|
|||
override def checkInputDataTypes(): TypeCheckResult = | |||
TypeUtils.checkForNumericExpr(child.dataType, "operator -") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be "operator +"?
Test build #37202 has finished for PR 7348 at commit
|
* Types that can be used in bitwise operations. | ||
*/ | ||
val Bitwise = TypeCollection( | ||
BooleanType, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does boolean support bitwise?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does boolean support bitwise operations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
Test build #37210 has finished for PR 7348 at commit
|
Jenkins, retest this please. |
Test build #37248 has finished for PR 7348 at commit
|
Test build #37257 has finished for PR 7348 at commit
|
Test build #37273 has finished for PR 7348 at commit
|
Test build #37280 has finished for PR 7348 at commit
|
Test build #37294 has finished for PR 7348 at commit
|
Test build #37296 has finished for PR 7348 at commit
|
Test build #37302 has finished for PR 7348 at commit
|
Jenkins, retest this please. |
The test failure looks unrelated. I'm going to merge this and fix any problems later, since it changes a lot of code. |
@cloud-fan can you review this and help me fix the ignored test cases? Thanks. |
val children: Seq[Expression] = e.children.zip(e.inputTypes).map { case (in, expected) => | ||
if (in.dataType == NullType && !expected.acceptsType(NullType)) { | ||
Cast(in, expected.defaultConcreteType) | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if expected.acceptsType(in.dateType) == false
, probably we'd better to raise a TypeChecking exception?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that happens during CheckAnalysis when we report errors.
Test build #37320 has finished for PR 7348 at commit
|
This patch makes the following changes:
TODOs needed: fix unit tests for error reporting.
I'm intentionally not changing anything in aggregate expressions because @yhuai is doing a big refactoring on that right now.