Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-8075][SQL] apply type check interface to more expressions #6723

Closed
wants to merge 3 commits into from

Conversation

cloud-fan
Copy link
Contributor

a follow up of #6405.
Note: It's not a big change, a lot of changing is due to I swap some code in aggregates.scala to make aggregate functions right below its corresponding aggregate expressions.

@SparkQA
Copy link

SparkQA commented Jun 9, 2015

Test build #34506 has finished for PR 6723 at commit 64d78a8.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class CountFunction(expr: Expression, base: AggregateExpression) extends AggregateFunction
    • case class CountDistinctFunction(
    • case class ApproxCountDistinctPartitionFunction(
    • case class ApproxCountDistinctMergeFunction(
    • case class Sum(child: Expression) extends PartialAggregate with trees.UnaryNode[Expression]
    • case class CombineSum(child: Expression) extends AggregateExpression
    • case class SumDistinct(child: Expression)
    • case class CombineSetsAndSum(inputSet: Expression, base: Expression) extends AggregateExpression
    • case class CombineSetsAndSumFunction(
    • case class First(child: Expression) extends PartialAggregate with trees.UnaryNode[Expression]
    • case class Last(child: Expression) extends PartialAggregate with trees.UnaryNode[Expression]

@cloud-fan cloud-fan force-pushed the type-check branch 2 times, most recently from 6f09d05 to 305be74 Compare June 9, 2015 15:32
*/
def checkInputDataTypes(): TypeCheckResult = TypeCheckResult.TypeCheckSuccess
def checkInputDataTypes(): TypeCheckResult
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really want this? There are a lot of expressions that don't need type check...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok i agree with you maybe we should implement this by default to be always "success", so we don't need to define this for every expression.

@cloud-fan
Copy link
Contributor Author

cc @rxin , according to here and here, do we require BinaryExpression to have 2 equal-type children? If so, I think we should add java doc of BinaryExpression to say that, or we should create a subclass of BinaryExpression which has equal-type children.

@SparkQA
Copy link

SparkQA commented Jun 9, 2015

Test build #34509 has finished for PR 6723 at commit 305be74.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class CountFunction(expr: Expression, base: AggregateExpression) extends AggregateFunction
    • case class CountDistinctFunction(
    • case class ApproxCountDistinctPartitionFunction(
    • case class ApproxCountDistinctMergeFunction(
    • case class Sum(child: Expression) extends PartialAggregate with trees.UnaryNode[Expression]
    • case class CombineSum(child: Expression) extends AggregateExpression
    • case class SumDistinct(child: Expression)
    • case class CombineSetsAndSum(inputSet: Expression, base: Expression) extends AggregateExpression
    • case class CombineSetsAndSumFunction(
    • case class First(child: Expression) extends PartialAggregate with trees.UnaryNode[Expression]
    • case class Last(child: Expression) extends PartialAggregate with trees.UnaryNode[Expression]

@SparkQA
Copy link

SparkQA commented Jun 9, 2015

Test build #34508 has finished for PR 6723 at commit 6f09d05.

  • This patch passes all tests.
  • This patch does not merge cleanly.
  • This patch adds the following public classes (experimental):
    • case class CountFunction(expr: Expression, base: AggregateExpression) extends AggregateFunction
    • case class CountDistinctFunction(
    • case class ApproxCountDistinctPartitionFunction(
    • case class ApproxCountDistinctMergeFunction(
    • case class Sum(child: Expression) extends PartialAggregate with trees.UnaryNode[Expression]
    • case class CombineSum(child: Expression) extends AggregateExpression
    • case class SumDistinct(child: Expression)
    • case class CombineSetsAndSum(inputSet: Expression, base: Expression) extends AggregateExpression
    • case class CombineSetsAndSumFunction(
    • case class First(child: Expression) extends PartialAggregate with trees.UnaryNode[Expression]
    • case class Last(child: Expression) extends PartialAggregate with trees.UnaryNode[Expression]

@@ -89,6 +89,8 @@ case class UnresolvedFunction(name: String, children: Seq[Expression]) extends E
throw new TreeNodeException(this, s"No function to evaluate expression. type: ${this.nodeName}")

override def toString: String = s"'$name(${children.mkString(",")})"

override def checkInputDataTypes(): TypeCheckResult = TypeCheckResult.TypeCheckSuccess
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do they need to be here?

@cloud-fan cloud-fan force-pushed the type-check branch 2 times, most recently from 8423bc7 to 3c3c87b Compare June 11, 2015 09:43
@@ -69,7 +69,7 @@ case class WindowSpecDefinition(
override def children: Seq[Expression] = partitionSpec ++ orderSpec

override lazy val resolved: Boolean =
childrenResolved && frameSpecification.isInstanceOf[SpecifiedWindowFrame]
childrenResolved && checkInputDataTypes().isSuccess && frameSpecification.isInstanceOf[SpecifiedWindowFrame]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking of making resolved final, and add a overridable method like extraResolveLogic, so that people won't forget to check types when they override resolved.

@SparkQA
Copy link

SparkQA commented Jun 11, 2015

Test build #34684 has finished for PR 6723 at commit 3c3c87b.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class CountFunction(expr: Expression, base: AggregateExpression) extends AggregateFunction
    • case class CountDistinctFunction(
    • case class ApproxCountDistinctPartitionFunction(
    • case class ApproxCountDistinctMergeFunction(
    • case class Sum(child: Expression) extends PartialAggregate with trees.UnaryNode[Expression]
    • case class CombineSum(child: Expression) extends AggregateExpression
    • case class SumDistinct(child: Expression)
    • case class CombineSetsAndSum(inputSet: Expression, base: Expression) extends AggregateExpression
    • case class CombineSetsAndSumFunction(
    • case class First(child: Expression) extends PartialAggregate with trees.UnaryNode[Expression]
    • case class Last(child: Expression) extends PartialAggregate with trees.UnaryNode[Expression]

@cloud-fan cloud-fan force-pushed the type-check branch 2 times, most recently from 772866d to 0faa15a Compare June 19, 2015 07:14
@SparkQA
Copy link

SparkQA commented Jun 19, 2015

Test build #35244 has finished for PR 6723 at commit 0faa15a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

findTightestCommonTypeAndPromoteToString(types) match {
case Some(finalDataType) => CreateArray(children.map(Cast(_, finalDataType)))
case None => a
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked with hive, CreateArray should cast input to string, but with some rules like Coalesce did. For example we should not cast boolean to string for CreateArray.

@SparkQA
Copy link

SparkQA commented Jun 21, 2015

Test build #35387 has finished for PR 6723 at commit 343074b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class CountFunction(expr: Expression, base: AggregateExpression) extends AggregateFunction
    • case class CountDistinctFunction(
    • case class ApproxCountDistinctPartitionFunction(
    • case class ApproxCountDistinctMergeFunction(
    • case class Sum(child: Expression) extends PartialAggregate with trees.UnaryNode[Expression]
    • case class CombineSum(child: Expression) extends AggregateExpression
    • case class SumDistinct(child: Expression)
    • case class CombineSetsAndSum(inputSet: Expression, base: Expression) extends AggregateExpression
    • case class CombineSetsAndSumFunction(
    • case class First(child: Expression) extends PartialAggregate with trees.UnaryNode[Expression]
    • case class Last(child: Expression) extends PartialAggregate with trees.UnaryNode[Expression]

@SparkQA
Copy link

SparkQA commented Jun 21, 2015

Test build #35388 has finished for PR 6723 at commit 2811da0.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class Module(object):
    • case class CountFunction(expr: Expression, base: AggregateExpression) extends AggregateFunction
    • case class CountDistinctFunction(
    • case class ApproxCountDistinctPartitionFunction(
    • case class ApproxCountDistinctMergeFunction(
    • case class Sum(child: Expression) extends PartialAggregate with trees.UnaryNode[Expression]
    • case class CombineSum(child: Expression) extends AggregateExpression
    • case class SumDistinct(child: Expression)
    • case class CombineSetsAndSum(inputSet: Expression, base: Expression) extends AggregateExpression
    • case class CombineSetsAndSumFunction(
    • case class First(child: Expression) extends PartialAggregate with trees.UnaryNode[Expression]
    • case class Last(child: Expression) extends PartialAggregate with trees.UnaryNode[Expression]

@SparkQA
Copy link

SparkQA commented Jun 21, 2015

Test build #35392 has finished for PR 6723 at commit bb9948a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class Module(object):
    • case class CountFunction(expr: Expression, base: AggregateExpression) extends AggregateFunction
    • case class CountDistinctFunction(
    • case class ApproxCountDistinctPartitionFunction(
    • case class ApproxCountDistinctMergeFunction(
    • case class Sum(child: Expression) extends PartialAggregate with trees.UnaryNode[Expression]
    • case class CombineSum(child: Expression) extends AggregateExpression
    • case class SumDistinct(child: Expression)
    • case class CombineSetsAndSum(inputSet: Expression, base: Expression) extends AggregateExpression
    • case class CombineSetsAndSumFunction(
    • case class First(child: Expression) extends PartialAggregate with trees.UnaryNode[Expression]
    • case class Last(child: Expression) extends PartialAggregate with trees.UnaryNode[Expression]

@SparkQA
Copy link

SparkQA commented Jun 24, 2015

Test build #35637 has finished for PR 6723 at commit 2124301.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class CountFunction(expr: Expression, base: AggregateExpression) extends AggregateFunction
    • case class CountDistinctFunction(
    • case class ApproxCountDistinctPartitionFunction(
    • case class ApproxCountDistinctMergeFunction(
    • case class Sum(child: Expression) extends PartialAggregate with trees.UnaryNode[Expression]
    • case class CombineSum(child: Expression) extends AggregateExpression
    • case class SumDistinct(child: Expression)
    • case class CombineSetsAndSum(inputSet: Expression, base: Expression) extends AggregateExpression
    • case class CombineSetsAndSumFunction(
    • case class First(child: Expression) extends PartialAggregate with trees.UnaryNode[Expression]
    • case class Last(child: Expression) extends PartialAggregate with trees.UnaryNode[Expression]

@cloud-fan
Copy link
Contributor Author

cc @rxin

@marmbrus
Copy link
Contributor

Thanks! Merging to master.

@asfgit asfgit closed this in b71d325 Jun 24, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants