Skip to content

Conversation

@twalthr
Copy link
Contributor

@twalthr twalthr commented Jul 10, 2019

What is the purpose of the change

This adds a parser for all logical types defined in FLIP-37. This parser is both useful for the SQL Client as well as type annotations for UDFs.

Brief change log

See commit messages.

Verifying this change

This change added tests and can be verified as follows: LogicalTypeParserTest

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): yes
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? yes
  • If yes, how is the feature documented? JavaDocs

@flinkbot
Copy link
Collaborator

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into to Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.

Details
The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@flinkbot
Copy link
Collaborator

CI report for commit 256fd8a: SUCCESS Build

@aljoscha aljoscha self-requested a review July 12, 2019 09:07
@dawidwys dawidwys self-requested a review July 17, 2019 07:01
@flinkbot
Copy link
Collaborator

flinkbot commented Jul 17, 2019

CI report:

Copy link
Contributor

@dawidwys dawidwys left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@twalthr Thank you for the PR! I put some rather minor comments.

* evaluation of types.
*
* <p>The enumeration is very close to the SQL standard in terms of naming and completeness. However,
* it reflects just a subset of the evolving standard and contains some extensions (such as {@code NULL},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove the opening ( or revert the closing one.


private final @Nullable String database;

private final String typeIdentifier;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about using ObjectIdentifier here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the first version ObjectIdentifier allowed also partially defined paths. But we changed it to full identifiers. I would be up for relaxing this constraint to be honest as I see already a couple of String[] List<String> objects travelling through the stack.

/**
* Parses a type string. All types will be fully resolved except for {@link UnresolvedUserDefinedType}s.
*
* <p>Throws {@link ValidationException} in case of parsing errors.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use @throws annotation for this.

IDENTIFIER,

// e.g. "myCatalog.myDatabase."
DOT
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, (also not sure if it would be better), but maybe IDENTIFIER_SEPARATOR? Similar to LIST_SEPARATOR.

}

private static List<Token> tokenize(String typeString) {
final char[] chars = typeString.toCharArray();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just access the String directly via #charAt(). This performs unnecessary copy (but I know it is not crucial here)


private LogicalType parseTimeType() {
int precision = TimeType.DEFAULT_PRECISION;
if (hasNextToken(TokenType.BEGIN_PARAMETER)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: use parseOptionalPrecision


private LogicalType parseTimestampType() {
int precision = TimestampType.DEFAULT_PRECISION;
if (hasNextToken(TokenType.BEGIN_PARAMETER)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: use parseOptionalPrecision

}
}

private LogicalType parseDayTimeIntervalType() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about splitting this method by the 3 cases.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moreover I think this method could be improved, if we first parse lower/upper resolution and then reuse DataTypes.Resolution#resolveInterval. In a pseudo code it could look sth like:

public Resolution parseAsResolution(IntervalUnit type) {
    if (type == DAY) {
	dayPrecision = parseOptionalPrecision(dayPrecision);
        return new Resolution(type, dayPrecision);
    } else if (lower == SECOND) {
	fractionalPrecision = parseOptionalPrecision(fractionalPrecision);
        return new Resolution(type, fractionalPrecision);
    }
}

public LogicalType parseDayTimeIntervalType() {
    IntervalUnit upper = tokenAsIntervalUnit();
    Optional<IntervalUnit> lower = tokenAsIntervalUnit();

    DataTypes.Resolution.resolveInterval(parseAsResolution(upper), lower.map(parseAsResolution).orNull)
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH I think it is not worth the effort. Usually, such classes are generated and look much worse. Unless, the standard changes we will also not need to make changes in this method. So maintainability is also not an issue. Exposing DataTypes.Resolution.resolveInterval is mixing API with internal utils.

@Parameters(name = "{index}: [From: {0}, To: {1}]")
public static List<Object[]> testData() {
return Arrays.asList(
new Object[][]{
Copy link
Contributor

@dawidwys dawidwys Jul 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think it would make sense to have a TestSpec object in this case, right now we have a magic null value for most of the test cases. I think it would be easier to read if it was:

    TestSpec
         .forString("CHAR")
         .parseAsType(new CharType());

    TestSpec
         .forString("ROW<")
         .throwsExceptionWithMessage("...")

Copy link
Contributor

@dawidwys dawidwys left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than one minor comment, +1 from my side.

Edit: Hah, you fixed the comment even before I posted it.

@asfgit asfgit closed this in 479fb07 Jul 18, 2019
asfgit pushed a commit that referenced this pull request Jul 18, 2019
This adds a parser for all logical types defined in FLIP-37.

This closes #9061.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants