Skip to content

Conversation

@sarutak
Copy link
Member

@sarutak sarutak commented Jun 19, 2021

What changes were proposed in this pull request?

This PR fixes an issue that IntervalUtils.toYearMonthIntervalString doesn't consider the case that year-month interval type is casted as month interval type.
If a year-month interval data is casted as month interval, the value of the year is multiplied by 12 and added to the value of month. For example, INTERVAL '1-2' YEAR TO MONTH will be INTERVAL '14' MONTH if it's casted.
If this behavior is intended, it's stringified to be 'INTERVAL 14' MONTH but currently, it will be INTERVAL '2' MONTH

Why are the changes needed?

It's a bug if the behavior of cast is intended.

Does this PR introduce any user-facing change?

No, because this feature is not released yet.

How was this patch tested?

Modified the tests added in SPARK-35771 (#32924).

@github-actions github-actions bot added the SQL label Jun 19, 2021
@SparkQA
Copy link

SparkQA commented Jun 19, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44556/

@SparkQA
Copy link

SparkQA commented Jun 19, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44556/

@SparkQA
Copy link

SparkQA commented Jun 20, 2021

Test build #140029 has finished for PR 32982 at commit 003dc84.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

@MaxGekk MaxGekk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM. Merging to master.
Thank you @sarutak , and @HyukjinKwon for your review.

@MaxGekk MaxGekk closed this in 4758dc7 Jun 20, 2021
@sarutak
Copy link
Member Author

sarutak commented Jun 20, 2021

@MaxGekk BTW, what do you think day-time should be?
If interval '1 2:3' day to minute is casted as interval hour to minute, interval '25:3' hour to minute doesn't conform to ANSI.

One option would be to compromise the ANSI compliance only for the string representation of such types like RDBMSs does, and represent like interval '25' hour '3' minute.
Just FYI, PostgreSQL seems always show <day> <hour>:<minute>:<second> regardless of its type.

@MaxGekk
Copy link
Member

MaxGekk commented Jun 20, 2021

If interval '1 2:3' day to minute is casted as interval hour to minute, interval '25:3' hour to minute doesn't conform to ANSI.

I think the restriction of 0..23 for hours is applicable when it is in middle. But if the field is the start field, its range is restricted either by precision, see SQL standard rules:

<start field> ::=
<non-second primary datetime field>
      [ <left paren> <interval leading field precision> <right paren> ]
<interval leading field precision> ::=
  <unsigned integer>

We can a confirmation of this at the section 4.6.3:

Within a value of type interval, the first field is constrained only by the <interval leading field precision> of the associated <interval qualifier>. Table 6, “Valid values for fields in INTERVAL values”, specifies the con- straints on subsequent field values.

In our case interval leading field precision has a default value limited by internal type Long.

@sarutak
Copy link
Member Author

sarutak commented Jun 20, 2021

I confirmed the syntax rule but the precision here means the number of digit and doesn't means the valid range of each field doesn't it?
And the range of HOUR seems limited within 0-23.

interval-rule

@MaxGekk
Copy link
Member

MaxGekk commented Jun 20, 2021

doesn't means the valid range of each field doesn't it?

I think it does. For example, INTERVAL HOUR(5) TO SECOND means the HOUR field can have 5 digits. And as the consequence of that, its maximum values is 99999. So, the range for hours 0..99999. And of course, the intervals might have a sign, and the range for such type of intervals is INTERVAL '-99999:59:59.999999' HOUR(5) TO SECOND ... INTERVAL '99999:59:59.999999' HOUR(5) TO SECOND ( in our case the precision of seconds is microseconds).

@sarutak
Copy link
Member Author

sarutak commented Jun 20, 2021

O.K, the standard says as follows.

Within a value of type interval, the first field is constrained only by the <interval leading fieldprecision> of the associated <interval qualifier>.  Table 8, ‘‘Valid values for fields in INTERVALvalues’’, specifies the constraints on subsequent field values.

So, the first field seems to be constrained only by the number of digit. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants