Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CALCITE-6269] Fix missing/broken BigQuery date-time format elements #3761

Merged
merged 1 commit into from Apr 22, 2024

Conversation

Anthrino
Copy link
Contributor

No description provided.

@Anthrino Anthrino force-pushed the CALCITE_2980_cast_op_format_clause branch from 6e4ec7d to 760f6d6 Compare April 15, 2024 16:51
final Calendar calendar = Work.get().calendar;
calendar.setTime(date);
String formattedYear = String.format(Locale.ROOT, "%d", calendar.get(Calendar.YEAR));
sb.append(formattedYear.substring(formattedYear.length() - 3));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens for a year like 50?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, the earlier method was reusing what was being done in YYYY and cropping that down to three digits. But the YYYY implementation was incorrect as it was reading an integer where the Calendar object stored the year, and that would return an int value and not a formatted year with 4 digits (showing the century number).
Corrected the YYYY and YYY functions to return a proper formatted value in terms of digits, and added a method pctY to return values consistent to what BQ returns.

!if (false) {
### disabled until Bug.CALCITE_6269_FIXED ###
### unsupported format model/elements, test disabled ###
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a way in Quidem tests to refer to a Bug, why don't we use that one?
The good thing about the previous version of this string is that you can grep for it, here it is not clear which bug tracks the problem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mihaibudiu good point, I was also unsure of this comment and ideally wanted to link JIRA issues which would describe the exact causes. But there are quite a number of problems with the current implementation which is causing these many tests to fail and would take time to document them all, what do you recommend be done in this scenario?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's best to have an issue to keep track of all of them, even if they are different, and to have an easy way to find it in the code.

The syntax in .iq files is: !if (fixed.calcite1048)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the comments and disable condition to point to the calcite issue, thanks for suggesting this change!

core/src/test/resources/sql/cast-with-format.iq Outdated Show resolved Hide resolved
@@ -979,7 +939,7 @@ EXPR$0
select cast(date'2019-01-07' as varchar
FORMAT 'WW');
EXPR$0
01
02
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is happening here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The WW operator returns the week number of the year (00-53), and that depends on what is being set as the start day of a week (Sun v/s Mon). The variances you see here are because we set the week start on Sunday, but setting to Monday also fails a bunch of other tests, which is strange as you would expect all of them to be consistent. So we have to use either of them and update the tests accordingly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is BigQuery, they better give the same results.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, a small number of the tests return the same answer as recorded here as they produced the same week number irrespective of start of week, the others had to be changed wrt to what Calcite/BQ is returning.

Another point to consider with these tests is BQ does not support use of elements like WW, DDD to parse a datetime object, its only available for formatting output strings. I left these tests as is because we do not have any restrictions in calcite's FormatModel for the use of elements in parsing or formatting scenarios.

core/src/test/resources/sql/cast-with-format.iq Outdated Show resolved Hide resolved
@Anthrino Anthrino force-pushed the CALCITE_2980_cast_op_format_clause branch from 760f6d6 to 585b1fc Compare April 15, 2024 21:31
Copy link
Contributor

@tanclary tanclary left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall think this looks good ! I agree with @mihaibudiu 's comments and left a couple questions of my own.

@Anthrino Anthrino force-pushed the CALCITE_2980_cast_op_format_clause branch from 585b1fc to 4916779 Compare April 18, 2024 16:56
@Anthrino
Copy link
Contributor Author

@mihaibudiu @tanclary addressed your comments, would appreciate another review on this, thanks!

Copy link
Contributor

@tanclary tanclary left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM if @mihaibudiu doesn't have any comments I think this is good to go

@mihaibudiu
Copy link
Contributor

This PR seems to implement a few flags which don't even work in BigQuery, but other than that seems fine.

@tanclary tanclary merged commit 4a977fe into apache:main Apr 22, 2024
15 of 16 checks passed
@Anthrino Anthrino deleted the CALCITE_2980_cast_op_format_clause branch April 22, 2024 17:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants