Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CALCITE-5180] Implement BigQuery Date/Time Type Aliases and Constructors #3023

Closed
wants to merge 11 commits into from

Conversation

wnob
Copy link
Contributor

@wnob wnob commented Jan 11, 2023

This incorporates the type alias work drafted by Julian in https://github.com/julianhyde/calcite/tree/5424-custom-literals and builds on it by incorporating my own first draft to address CALCITE-5180 (Implement BigQuery date/time constructors), and then adding in the DATETIME() function making use of the type aliases (Add DATETIME() function and switch to the core parser), and also switches the parsing logic from the Babel parser to the core parser.

julianhyde and others added 11 commits January 5, 2023 10:59
If the output of a parser test case is the same as its
input, it now must call same().
Share interval tests between parser, validator
Add DATETIME, TIMESTAMP WITH LOCAL TIME;
add class SqlUnknownLiteral;
defer validation of character string
In Quidem, modify scott-big-query to use BigQuery's type system

Rework big-query.iq to use BigQuery type names.

Return type of TIMESTAMP_ADD applied to TIMESTAMP WITH LOCAL
TIME ZONE should have been TIMESTAMP WITH LOCAL TIME ZONE,
not TIMESTAMP; fixed it.
return SqlParserUtil.parseTimestampLiteral(getValue(), pos);
case TIMESTAMP_WITH_LOCAL_TIME_ZONE:
return SqlParserUtil.parseTimestampWithLocalTimeZoneLiteral(getValue(), pos);
default:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nvm discussed offline, I think that location should be return SqlLiteral.createUnknown("TIMESTAMP") instead

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See in SqlValidatorImpl.resolveLiteral (which calls this function):

      final SqlIdentifier identifier =
          new SqlIdentifier(unknownLiteral.tag, SqlParserPos.ZERO);
      final @Nullable RelDataType type = catalogReader.getNamedType(identifier);
      final SqlTypeName typeName;
      if (type != null) {
        typeName = type.getSqlTypeName();
      } else {
        typeName = SqlTypeName.lookup(unknownLiteral.tag);
      }
      return unknownLiteral.resolve(typeName);

So, it makes sense to call SqlLiteral.createUnknown("DATETIME"), passing "DATETIME" as the tag. The tag just says "this is what the type called itself", and in order to resolve it, we look up in catalogReader to see if it's mapped to anything. In the case of BQ, it will map to TIMESTAMP. In the case of everything else, it would be null, so validation would fail when it invokes SqlTypeName.lookup(unknownLiteral.tag).

}
args = FunctionParameterList(ExprContext.ACCEPT_SUB_QUERY) {
quantifier = (SqlLiteral) args.get(0);
args.remove(0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is this line doing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you look at the production for FunctionParameterList() you'll see it start like this:

{
    <LPAREN>
    (
        qualifier = AllOrDistinct() { list.add(qualifier); }
    |
        { list.add(null); }
    )
    ...

Here, list is the return value (args here in DateFunctionCall()). So the first argument is always either null, or either ALL or DISTINCT, since some functions can take those qualifiers. Here, it should always be null, although I suppose this means the parser would silently accept something like DATE(DISTINCT 2023, 01, 18), which it probably shouldn't.

@julianhyde
Copy link
Contributor

There's too much going on here for me to review.

Can you separate it out into pieces that can be understood by an end user as adding some useful functionality? If there refactorings that don't change functionality, keep those separate.

It's confusing that there are type aliases, and there are also functions that are named after types. Basically you need to craft commit messages that are suitable for the release notes. (And each such commit should be a jira case that explains the new functionality.)

!ok

#####################################################################
# DATETIME
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's a set of DATETIME tests currently disabled around line 650 btw

# DATETIME(year, month, day, hour, minute, second)

julianhyde added a commit to julianhyde/calcite that referenced this pull request Jan 29, 2023
…tors

Restore "DATE(string)" function (desugared to CAST)

Combine parsing of DATE/TIME/TIMESTAMP/DATETIME constructor
functions.

Use SqlBasicFunction for all overloads of DATE.
(Add OperandTypes.typeName to support.)

In SQL reference, use Calcite rather than BigQuery type
names: switch DATETIME to TIMESTAMP, TIMESTAMP to TIMESTAMP
WITH LOCAL TIME ZONE.

Close apache#3023
@wnob wnob marked this pull request as draft January 31, 2023 00:57
julianhyde added a commit to julianhyde/calcite that referenced this pull request Feb 9, 2023
…tors

Restore "DATE(string)" function (desugared to CAST)

Combine parsing of DATE/TIME/TIMESTAMP/DATETIME constructor
functions.

Use SqlBasicFunction for all overloads of DATE.
(Add OperandTypes.typeName to support.)

In SQL reference, use Calcite rather than BigQuery type
names: switch DATETIME to TIMESTAMP, TIMESTAMP to TIMESTAMP
WITH LOCAL TIME ZONE.

Close apache#3023
julianhyde added a commit to julianhyde/calcite that referenced this pull request Feb 10, 2023
…ATETIME (enabled in BigQuery library)

New DATE functions such as DATE(int, int, int) needs to work
alongside the existing DATE(string) function; all are
validated usig the type operand checker, but after
validation DATE(string) is desugared to CAST(string AS DATE).

Following
  [CALCITE-5424] Customize handling of literals based on type system
which deferred validation of DATE, TIME, TIMESTAMP literals,
we now also allow DATETIME literal. (In BigQuery mode it is
mapped to a literal in Calcite's TIMESTAMP type.)

Also includes a fix for
  [CALCITE-5498] BigQuery TIMESTAMP() function short notation for
  timezone offsets isn’t supported in Java 8

In a few places I had to change test output. I removed "UTC"
from test output and added TODO comments; these changes
should reverted when
  [CALCITE-5446] Support TIMESTAMP WITH LOCAL TIME ZONE type in JDBC driver
is fixed. And I manually added the offset to some TIMESTAMP
WITH LOCAL TIME ZONE LITERALS (which were written as TIMESTAMP
because big-query.iq uses BigQuery's type aliasing); these
changes can be reverted when
  [CALCITE-5496] Support time zones when parsing TIMESTAMP WITH LOCAL TIME ZONE
is fixed.

Close apache#3023
@asfgit asfgit closed this in 4fbcbf4 Feb 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants