[SPARK-36921][SQL] Support ANSI intervals by DIV #34257

Closed
wants to merge 1 commit into from

Peng-Lei
Contributor

What changes were proposed in this pull request?

  1. Support div(YearMonthIntervalType, YearMonthIntervalType), returning a long result.
  2. Support div(DayTimeIntervalType, DayTimeIntervalType), returning a long result.
  3. If either input is NULL or the second input is 0, return NULL.
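
The three rules above can be sketched in plain Scala. This is a minimal illustration and not the code from this PR: it assumes year-month intervals are compared by total months and day-time intervals by total microseconds, and it models SQL NULL as `None`. The names `IntervalDivSketch`, `divYearMonth`, `divDayTime`, and `toMicros` are hypothetical.

```scala
import java.time.{Duration, Period}

// Hypothetical sketch, not Spark's implementation. None stands in for SQL NULL.
object IntervalDivSketch {
  // Assumption: a day-time interval is measured in whole microseconds.
  private def toMicros(d: Duration): Long =
    Math.addExact(Math.multiplyExact(d.getSeconds, 1000000L), d.getNano / 1000L)

  // Rules 1 and 3: year-month DIV year-month; NULL on a NULL input or zero divisor.
  def divYearMonth(a: Option[Period], b: Option[Period]): Option[Long] =
    for {
      x <- a
      y <- b
      if y.toTotalMonths != 0L
    } yield x.toTotalMonths / y.toTotalMonths

  // Rules 2 and 3: day-time DIV day-time; NULL on a NULL input or zero divisor.
  def divDayTime(a: Option[Duration], b: Option[Duration]): Option[Long] =
    for {
      x <- a
      y <- b
      if toMicros(y) != 0L
    } yield toMicros(x) / toMicros(y)
}
```

With these definitions, `IntervalDivSketch.divYearMonth(Some(Period.ofYears(1)), Some(Period.ofMonths(3)))` yields `Some(4L)`, in line with the `4L` expected by the unit test quoted later in this review, and a zero divisor yields `None`.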

Why are the changes needed?

Extended the div function to support ANSI intervals. The operation should produce the quotient of the division.
SPARK-36921

Does this PR introduce any user-facing change?

Yes, users can use YearMonthIntervalType and DayTimeIntervalType as inputs to the div function.

How was this patch tested?

Added unit test cases.

@github-actions github-actions bot added the SQL label Oct 12, 2021
@Peng-Lei
Contributor Author

@MaxGekk Could you take a look? Thank you. I'm not quite sure whether this PR resolves SPARK-36921.

@MaxGekk
Member

MaxGekk commented Oct 12, 2021

ok to test

Member

@MaxGekk MaxGekk left a comment

@Peng-Lei Could you add a couple of end-to-end tests to intervals.sql including a negative test:

SELECT DIV(INTERVAL '1' MONTH, INTERVAL '1' DAY);

@SparkQA

SparkQA commented Oct 12, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48638/

@SparkQA

SparkQA commented Oct 12, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48638/

Literal(Period.ofYears(1))), 2L)
checkEvaluation(IntegralDivide(Literal(Period.ofYears(1)),
Literal(Period.ofMonths(3))), 4L)
checkEvaluation(IntegralDivide(Literal(Period.ofYears(1)),
Member

@MaxGekk MaxGekk Oct 12, 2021

Could you check negative intervals, and also some corner cases such as division by max/min and zero intervals (ArithmeticExpressionSuite)?
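
The corner cases named above behave as follows under plain JVM integer division. This sketch assumes an interval reduces to a month count that fits in an Int (year-month) or a microsecond count that fits in a Long (day-time). It only illustrates the arithmetic and is not the test code added to ArithmeticExpressionSuite; the names `DivCornerCases` and `quot` are hypothetical.

```scala
import scala.util.Try

// Hypothetical illustration of the arithmetic only; not Spark code.
object DivCornerCases {
  // Plain JVM long division, used for every case below.
  def quot(a: Long, b: Long): Long = a / b

  // Negative operands: the quotient is truncated toward zero.
  val negative: Long = quot(-7L, 2L)

  // Assumed year-month range (Int months): once widened to Long,
  // the minimum value divided by -1 is still representable.
  val yearMonthMin: Long = quot(Int.MinValue.toLong, -1L)

  // Assumed day-time range (Long microseconds): the minimum value
  // divided by -1 wraps around silently instead of raising an error.
  val dayTimeMin: Long = quot(Long.MinValue, -1L)

  // A zero divisor throws ArithmeticException, so it needs explicit handling.
  val zeroDivisor: Try[Long] = Try(quot(1L, 0L))
}
```

The minimum day-time value is the case worth an explicit test, since the wrapped result looks like a valid quotient.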

Member

@MaxGekk MaxGekk left a comment

LGTM in general. Could you add an example to the expression description?

@SparkQA

SparkQA commented Oct 12, 2021

Test build #144160 has finished for PR 34257 at commit a593010.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 13, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48670/

@SparkQA

SparkQA commented Oct 13, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48670/

@MaxGekk MaxGekk changed the title from "[SPARK-36921][SQL] The DIV function should support ANSI intervals" to "[SPARK-36921][SQL] Support ANSI intervals by DIV" Oct 13, 2021
@MaxGekk
Member

MaxGekk commented Oct 13, 2021

+1, LGTM. Merging to master.
Thank you, @Peng-Lei .

@MaxGekk MaxGekk closed this in 1ccc4dc Oct 13, 2021
@SparkQA

SparkQA commented Oct 13, 2021

Test build #144191 has finished for PR 34257 at commit 1f3459b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor

  1. why do we pick long as the result type? any reference?
  2. why we return null for divide by 0, instead of failing?

@MaxGekk
Member

MaxGekk commented Oct 15, 2021

@cloud-fan The ANSI SQL standard doesn't define the semantics of DIV, so we can implement it as we see fit.

why do we pick long as the result type?

For consistency with Spark's existing DIV implementation for other input types. Long is the widest integer type, and it doesn't overflow on DIV of day-time intervals.

@MaxGekk
Member

MaxGekk commented Oct 15, 2021

Why DIV returns long is a question for you, @cloud-fan ;-)
#22395 (comment)

@MaxGekk
Member

MaxGekk commented Oct 15, 2021

why we return null for divide by 0, instead of failing?

Following our current approach to ANSI intervals in other places, we should be stricter here. @Peng-Lei Could you address this, please?
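
One possible reading of this request is that a zero divisor should raise an error rather than yield NULL. A minimal sketch of that behaviour for day-time intervals, assuming microsecond granularity; the names `StrictIntervalDiv`, `divDayTimeStrict`, and `toMicros`, the exception type, and the message are hypothetical and do not describe the follow-up change itself.

```scala
import java.time.Duration

// Hypothetical sketch of a failing variant; not Spark's implementation.
object StrictIntervalDiv {
  // Assumption: a day-time interval is measured in whole microseconds.
  private def toMicros(d: Duration): Long =
    Math.addExact(Math.multiplyExact(d.getSeconds, 1000000L), d.getNano / 1000L)

  // Fails on a zero divisor instead of returning NULL.
  def divDayTimeStrict(a: Duration, b: Duration): Long = {
    val divisor = toMicros(b)
    if (divisor == 0L) throw new ArithmeticException("Division by zero")
    toMicros(a) / divisor
  }
}
```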
