Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-20329][SQL] Make timezone aware expression without timezone unresolved #17641

Closed
wants to merge 6 commits into from

Conversation

hvanhovell
Copy link
Contributor

@hvanhovell hvanhovell commented Apr 14, 2017

What changes were proposed in this pull request?

A cast expression with a resolved time zone is not equal to a cast expression without a resolved time zone. The ResolveAggregateFunction assumed that these expression were the same, and would fail to resolve HAVING clauses which contain a Cast expression.

This is in essence caused by the fact that a TimeZoneAwareExpression can be resolved without a set time zone. This PR fixes this, and makes a TimeZoneAwareExpression unresolved as long as it has no TimeZone set.

How was this patch tested?

Added a regression test to the SQLQueryTestSuite.having file.

@hvanhovell hvanhovell changed the title [SPARK-20329][SQL] Make timezone aware expression without timezone unresolved. [SPARK-20329][SQL] Make timezone aware expression without timezone unresolved [WIP] Apr 14, 2017
@SparkQA
Copy link

SparkQA commented Apr 14, 2017

Test build #75816 has finished for PR 17641 at commit 0654409.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class ResolveTimeZone(conf: SQLConf) extends Rule[LogicalPlan]

@@ -34,6 +33,9 @@ import org.apache.spark.unsafe.types.{CalendarInterval, UTF8String}
* Common base class for time zone aware expressions.
*/
trait TimeZoneAwareExpression extends Expression {
/** The expression is only resolved when the time zone has been set. */
override lazy val resolved: Boolean =
childrenResolved && checkInputDataTypes().isSuccess && timeZoneId.isDefined
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation may have some custom resolution logic, how about super.resolved && timeZoneId.isDefined?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried doing this. Scala does not allow you to call super.resolved in resolved :(....

case e: TimeZoneAwareExpression if e.timeZoneId.isEmpty =>
e.withTimeZone(conf.sessionLocalTimeZone)
}.eval()
castedExpr.eval()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are nested expressions which are timezone aware, I think we still need to attach time zone to them?

Copy link
Member

@ueshin ueshin Apr 17, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess now that TimeZoneAwareExpression is resolved if it has timeZoneId, so we don't need to transform children.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, right. I saw the changes to TimeZoneAwareExpression. :-)

@hvanhovell hvanhovell changed the title [SPARK-20329][SQL] Make timezone aware expression without timezone unresolved [WIP] [SPARK-20329][SQL] Make timezone aware expression without timezone unresolved Apr 18, 2017
@SparkQA
Copy link

SparkQA commented Apr 18, 2017

Test build #75898 has finished for PR 17641 at commit ceec2b0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class ResolveTimeZone(conf: SQLConf) extends Rule[LogicalPlan]
  • trait CastSupport
  • case class AliasViewChild(conf: SQLConf) extends Rule[LogicalPlan] with CastSupport
  • case class DataSourceAnalysis(conf: SQLConf) extends Rule[LogicalPlan] with CastSupport
  • case class DataSourceStrategy(conf: SQLConf) extends Strategy with Logging with CastSupport
  • case class PreprocessTableInsertion(conf: SQLConf) extends Rule[LogicalPlan] with CastSupport

@SparkQA
Copy link

SparkQA commented Apr 18, 2017

Test build #75901 has finished for PR 17641 at commit b5848ca.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -99,12 +99,9 @@ case class ResolveInlineTables(conf: SQLConf) extends Rule[LogicalPlan] {
val castedExpr = if (e.dataType.sameType(targetType)) {
e
} else {
Cast(e, targetType)
Cast(e, targetType, Some(conf.sessionLocalTimeZone))
Copy link
Member

@ueshin ueshin Apr 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ResolveInlineTables can be CastSupport and we should use cast of it here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -99,12 +99,9 @@ case class ResolveInlineTables(conf: SQLConf) extends Rule[LogicalPlan] {
val castedExpr = if (e.dataType.sameType(targetType)) {
e
} else {
Cast(e, targetType)
Cast(e, targetType, Some(conf.sessionLocalTimeZone))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also make ResolveInlineTables mix-in CastSupport?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -41,13 +41,17 @@ class CastSuite extends SparkFunSuite with ExpressionEvalHelper {
}
}

private def castWithDefaultTZ(v: Any, targetType: DataType): Cast = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we just change the default value of timeZoneId parameter in def cast to GMT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

None
} else if (potentialSpecs.size == 1) {
val partValue = potentialSpecs.head._2
Some(Alias(Cast(Literal(partValue), field.dataType), field.name)())
Some(Alias(cast(Literal(partValue), field.dataType),
field.name)())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this can be put in the previous line

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -156,6 +156,7 @@ abstract class BaseSessionStateBuilder(
override val postHocResolutionRules: Seq[Rule[LogicalPlan]] =
PreprocessTableCreation(session) +:
PreprocessTableInsertion(conf) +:
ResolveTimeZone(conf) +:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why run this rule again? it's already in the resolution batch

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@SparkQA
Copy link

SparkQA commented Apr 20, 2017

Test build #75979 has finished for PR 17641 at commit cc8de3d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Apr 20, 2017

Test build #75981 has finished for PR 17641 at commit d7bc1e8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor

LGTM, merging to master/2.2

asfgit pushed a commit that referenced this pull request Apr 21, 2017
…resolved

## What changes were proposed in this pull request?
A cast expression with a resolved time zone is not equal to a cast expression without a resolved time zone. The `ResolveAggregateFunction` assumed that these expression were the same, and would fail to resolve `HAVING` clauses which contain a `Cast` expression.

This is in essence caused by the fact that a `TimeZoneAwareExpression` can be resolved without a set time zone. This PR fixes this, and makes a `TimeZoneAwareExpression` unresolved as long as it has no TimeZone set.

## How was this patch tested?
Added a regression test to the `SQLQueryTestSuite.having` file.

Author: Herman van Hovell <hvanhovell@databricks.com>

Closes #17641 from hvanhovell/SPARK-20329.

(cherry picked from commit 760c8d0)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
@asfgit asfgit closed this in 760c8d0 Apr 21, 2017
peter-toth pushed a commit to peter-toth/spark that referenced this pull request Oct 6, 2018
…resolved

## What changes were proposed in this pull request?
A cast expression with a resolved time zone is not equal to a cast expression without a resolved time zone. The `ResolveAggregateFunction` assumed that these expression were the same, and would fail to resolve `HAVING` clauses which contain a `Cast` expression.

This is in essence caused by the fact that a `TimeZoneAwareExpression` can be resolved without a set time zone. This PR fixes this, and makes a `TimeZoneAwareExpression` unresolved as long as it has no TimeZone set.

## How was this patch tested?
Added a regression test to the `SQLQueryTestSuite.having` file.

Author: Herman van Hovell <hvanhovell@databricks.com>

Closes apache#17641 from hvanhovell/SPARK-20329.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants