-
Notifications
You must be signed in to change notification settings - Fork 13.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FLINK-4565] [table] Support for SQL IN operator #4404
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @twalthr, the change itself looks very good. I had just one minor comment about TableReference
.
I would change the tests from runtime tests to plan tests though. The change does not add a new operator, just expressions (which are separately tested) and translation to regular (tested) joins. I think we also need tests to validate the equivalence of expression and string Table API.
It would be good to extend the documentation as well.
What do you think?
Fabian
@@ -137,6 +133,72 @@ class ScalarOperatorsTest extends ScalarOperatorsTestBase { | |||
} | |||
|
|||
@Test | |||
def testIn(): Unit = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also add tests for expressions that result in false
?
@@ -138,6 +138,25 @@ case class WindowReference(name: String, tpe: Option[TypeInformation[_]] = None) | |||
override def toString: String = s"'$name" | |||
} | |||
|
|||
case class TableReference(name: String, table: Table) extends Attribute { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TableReference
shouldn't extend Attribute
IMO.
We can make it extend LeafExpression with NamedExpression
.
@@ -225,4 +227,62 @@ class SetOperatorsITCase( | |||
val expected = "2,1,Hi\n" + "3,2,Hello\n" + "4,2,Hello world\n" | |||
TestBaseUtils.compareResultAsText(results.asJava, expected) | |||
} | |||
|
|||
@Test | |||
def testInWithFilter(): Unit = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add stringexpr
tests to validate equivalence of Scala expression and Java String Table API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These runtime tests could also be replaced by plan tests, IMO.
@@ -260,4 +260,108 @@ class SetOperatorsITCase( | |||
val results = result.toDataSet[Row].collect() | |||
TestBaseUtils.compareResultAsText(results.asJava, expected) | |||
} | |||
|
|||
@Test | |||
def testInWithFilter(): Unit = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we validate the feature with plan tests (TableTestBase
) rather than runtime tests?
The feature does not require new operators, just expressions which are tested by dedicated tests and joins which are also tested.
@@ -17,9 +17,9 @@ | |||
*/ | |||
package org.apache.flink.api.scala.util | |||
|
|||
import java.sql.{Date, Time, Timestamp} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove unused imports
import scala.collection.mutable | ||
import scala.util.Random | ||
|
||
object BatchTestData { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed if we switch to plan tests.
Thanks for reviewing @fhueske. I addressed your feedback and added documentation. I will merge this now... |
<span class="label label-primary">Batch</span> | ||
</td> | ||
<td> | ||
Returns true if an expression exists in a given table sub-query. The sub-query table must consist of one column. This column must have the same data type as the expression. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the syntax be enhanced so that user can specify one column in the table with multiple columns ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean with "enhanced"? We cannot change the SQL standard. The user can nest another query and reduce the table to one column.
} | ||
(expression.resultType, tableOutput.head.resultType) match { | ||
case (lType, rType) if isNumeric(lType) && isNumeric(rType) => ValidationSuccess | ||
case (lType, rType) if lType == rType => ValidationSuccess |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be moved ahead of the isNumeric() check since this check is cheaper ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, yes we could do it. Feel free to submit a patch for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should I open a new JIRA ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a hotfix is enough. This part is not performance critical anyway, because it happens before runtime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created #4493
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@twalthr:
Can you take a look at my PR ?
What is the purpose of the change
This PR adds support for the IN operator in both SQL and Table API. Both for testing in a set of elements
f0.in(1, 2, 3)
as well as in sub-queryf0.in(table)
Brief change log
Various changes in plan translation, code generation, and API.
Verifying this change
This change added tests and can be verified as follows:
Does this pull request potentially affect one of the following parts:
@Public(Evolving)
: noDocumentation