[SPARK-35170][SQL] Extend BinaryOperator by SubtractDates and SubtractTimestamps #32267

Closed
MaxGekk wants to merge 3 commits into apache:master from MaxGekk:refactor-binary-operator

MaxGekk commented Apr 21, 2021

What changes were proposed in this pull request?

In this PR, I propose modifying the SubtractDates and SubtractTimestamps expressions to extend BinaryOperator instead of BinaryExpression.

Why are the changes needed?

To improve code maintenance.

Does this PR introduce any user-facing change?

No

How was this patch tested?

By existing test suites.

MaxGekk commented Apr 21, 2021

@gengliangwang @cloud-fan FYI, this PR slightly changes the error messages (I guess it is not critical):

[info] - typeCoercion/native/promoteStrings.sql *** FAILED *** (5 seconds, 764 milliseconds)
[info]   typeCoercion/native/promoteStrings.sql
[info]   Expected "...data type mismatch: [argument 1 requires timestamp type, however, ''1'' is of string type].; line 1 pos 7", but got "...data type mismatch: [differing types in '('1' - CAST('2017-12-11 09:30:00.0' AS TIMESTAMP))' (string and timestamp)].; line 1 pos 7" Result did not match for query #23
[info]   typeCoercion/native/decimalPrecision.sql
[info]   Expected "...data type mismatch: [argument 2 requires timestamp type, however, 'CAST(1 AS DECIMAL(3,0))' is of decimal(3,0) type].; line 1 pos 7", but got "...data type mismatch: [differing types in '(CAST('2017-12-11 09:30:00.0' AS TIMESTAMP) - CAST(1 AS DECIMAL(3,0)))' (timestamp and decimal(3,0))].; line 1 pos 7" Result did not match for query #121
[info]   SELECT cast('2017-12-11 09:30:00.0' as timestamp) - cast(1 as decimal(3, 0)) FROM t (SQLQueryTestSuite.scala:459)
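
The two message shapes in the log above are consistent with two different styles of input check: one that validates each argument against an expected type, and one that first requires both operands to share a type. The following is a minimal Python sketch of that difference. It is a toy model and not Spark's implementation; `check_per_argument`, `check_same_type`, and the `(sql_text, type_name)` operand representation are hypothetical names chosen for illustration.

```python
from typing import Optional, Sequence, Tuple

# Toy model only: an operand is a (sql_text, type_name) pair.
Operand = Tuple[str, str]


def check_per_argument(args: Sequence[Operand],
                       expected: Sequence[str]) -> Optional[str]:
    """Validate each argument on its own and report the first bad position."""
    for idx, ((sql, actual), want) in enumerate(zip(args, expected), start=1):
        if actual != want:
            return (f"argument {idx} requires {want} type, however, "
                    f"'{sql}' is of {actual} type.")
    return None


def check_same_type(left: Operand, right: Operand, op: str = "-") -> Optional[str]:
    """Require both operands to share a type before anything else is checked."""
    if left[1] != right[1]:
        expr = f"({left[0]} {op} {right[0]})"
        return f"differing types in '{expr}' ({left[1]} and {right[1]})."
    return None


# Operands taken from the failing decimalPrecision.sql query in the log.
ts = ("CAST('2017-12-11 09:30:00.0' AS TIMESTAMP)", "timestamp")
dec = ("CAST(1 AS DECIMAL(3,0))", "decimal(3,0)")

print(check_per_argument([ts, dec], ["timestamp", "timestamp"]))
print(check_same_type(ts, dec))
```

In this model the per-argument check names the offending position and the type it expected, while the same-type check only reports that the two sides differ.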

MaxGekk commented Apr 21, 2021

Actually, no. It does change the behavior: an exception is thrown instead of NULL being returned:

[info] - typeCoercion/native/promoteStrings.sql *** FAILED *** (13 seconds, 407 milliseconds)
[info]   "NULL" did not contain "Exception" Exception did not match for query #24
[info]   SELECT '1' - cast('2017-12-11 09:30:00' as date)        FROM t, expected: NULL, but got: java.sql.SQLException
[info]   org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.sql.AnalysisException: cannot resolve '('1' - CAST('2017-12-11 09:30:00' AS DATE))' due to data type mismatch: differing types in '('1' - CAST('2017-12-11 09:30:00' AS DATE))' (string and date).; line 1 pos 7;

SparkQA commented Apr 21, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42251/

SparkQA commented Apr 21, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42251/

struct<>
-- !query output
NULL
org.apache.spark.sql.AnalysisException
Member Author

@cloud-fan @gengliangwang Are we ok to change the behavior?

Member

This doesn't look good... Why does that happen?

Contributor

Probably because we have special type coercion logic for BinaryOperators?
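
One way to picture the hypothesis above: if the arguments of a generic expression are implicitly cast to the expected input type, an unparseable string becomes NULL and the query still runs, whereas an operator-style rule that refuses operands of differing types fails at analysis time. The sketch below is a toy Python model under that assumption. It does not reproduce Spark's analyzer, and every name in it (`AnalysisError`, `cast_to_date`, `subtract_with_implicit_cast`, `subtract_as_operator`) is hypothetical.

```python
from datetime import date
from typing import Optional, Union

Value = Union[str, date]


class AnalysisError(Exception):
    """Stand-in for an analysis-time type mismatch (hypothetical)."""


def cast_to_date(value: Value) -> Optional[date]:
    """Implicit cast: return None (SQL NULL) when a string is not a valid date."""
    if isinstance(value, date):
        return value
    try:
        return date.fromisoformat(value)
    except ValueError:
        return None


def subtract_with_implicit_cast(left: Value, right: Value) -> Optional[int]:
    """Cast each argument to date first; a NULL operand yields a NULL result."""
    l, r = cast_to_date(left), cast_to_date(right)
    if l is None or r is None:
        return None
    return (l - r).days


def subtract_as_operator(left: Value, right: Value) -> int:
    """Reject operands of differing types instead of casting them."""
    if type(left) is not type(right):
        raise AnalysisError(
            f"differing types ({type(left).__name__} and {type(right).__name__})")
    if not isinstance(left, date) or not isinstance(right, date):
        raise AnalysisError("both operands must be dates")
    return (left - right).days


# Mirrors the shape of: SELECT '1' - cast('2017-12-11 09:30:00' as date)
print(subtract_with_implicit_cast("1", date(2017, 12, 11)))  # None, i.e. NULL
try:
    subtract_as_operator("1", date(2017, 12, 11))
except AnalysisError as err:
    print("analysis error:", err)
```

Under these assumptions the same query moves from a NULL result to an analysis error, which matches the kind of change reported for query #24.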

 -- !query output
 org.apache.spark.sql.AnalysisException
-cannot resolve '(CAST('2017-12-11 09:30:00.0' AS TIMESTAMP) - CAST(1 AS DECIMAL(3,0)))' due to data type mismatch: argument 2 requires timestamp type, however, 'CAST(1 AS DECIMAL(3,0))' is of decimal(3,0) type.; line 1 pos 7
+cannot resolve '(CAST('2017-12-11 09:30:00.0' AS TIMESTAMP) - CAST(1 AS DECIMAL(3,0)))' due to data type mismatch: differing types in '(CAST('2017-12-11 09:30:00.0' AS TIMESTAMP) - CAST(1 AS DECIMAL(3,0)))' (timestamp and decimal(3,0)).; line 1 pos 7
Member Author

From my point of view, the old error message looks better.

Member

+1

SparkQA commented Apr 21, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42260/

SparkQA commented Apr 21, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42260/

SparkQA commented Apr 21, 2021

Test build #137724 has finished for PR 32267 at commit 02e2fbb.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Apr 21, 2021

Test build #137733 has finished for PR 32267 at commit f0abdbf.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

MaxGekk commented May 18, 2021

I am closing this because:

  1. It changes behavior.
  2. The new error message is worse than the old one.

MaxGekk closed this May 18, 2021