Skip to content

Conversation

@snuyanzin
Copy link
Contributor

@snuyanzin snuyanzin commented Aug 26, 2018

What is the purpose of the change

The PR provides implementation of extract function for timeunits: decade, epoch, isodow, isoyear, millisecond,microsecond.

Brief change log

  • support timeunits extract
  • documentation updates (mention all possible timeunits for extract)
  • tests

Verifying this change

  • Added tests validates the result of extract call
  • Added tests validates that extract of isodow, isodoy, epoch, decade for time is incorrect

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (no)
  • The serializers: (no)
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? (yes)
  • If yes, how is the feature documented? (docs)
    functions.md updated

Some more comments
There were the similar changes (only related to this task) in flink-libraries/flink-table/src/main/java/org/apache/calcite/avatica/util/DateTimeUtils.java as it was done in calcite-avatica apache/calcite-avatica#50. At the same time I found a comment in Flink's copy of the class

 * THIS FILE HAS BEEN COPIED FROM THE APACHE CALCITE PROJECT UNTIL CALCITE-1884 IS FIXED.
 */

The mentioned ticket is fixed in Calcite-Avatica 1.12.0 however the test org.apache.flink.table.expressions.ScalarFunctionsTest#testTimestampAdd will fail in case of class removal. Also I mentioned the issue in mail
https://lists.apache.org/thread.html/9ff42d73ed8c1e72bd9d990ba4d423fce38fafc5b31401a5a5411a91@%3Cdev.flink.apache.org%3E

@twalthr twalthr changed the title [FLINK-9432] Support extract timeunit: DECADE, EPOCH, ISODOW, ISOYEAR… [FLINK-9432] [table] Support extract timeunit: DECADE, EPOCH, ISODOW, ISOYEAR… Aug 27, 2018
@pnowojski pnowojski self-assigned this Sep 21, 2018
Copy link
Contributor

@pnowojski pnowojski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please provide on which DB were you basing this functionality? Have you checked the compatibility with other DBs? If you didn't, please do so. Handling time is very tricky and there can be tons of smaller or bigger discrepancies. Before accepting such contribution I would like to see how does this implementation compares to at least Oracle/MySQL/PostgreSQL/MS SQL.

}

@Test(expected = classOf[ValidationException])
def testISODOWWithTimeWhichIsUnsupported(): Unit = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. please deduplicate the code in those EXTRACT(XYZ FROM TIME t) tests. Either use parametrized tests from junit4 ( https://stackoverflow.com/questions/4399881/parameterized-unit-tests-in-scala-with-junit4 ), or create a for loop over units.
  2. Add expected exception message pattern like unit .* can not be applied to time variable?

// should be 0 but not
//testSqlApi(
// "EXTRACT(MILLISECOND FROM f16)",
// "0")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

? what's the problem here? If it returns incorrect result, we should change it into throwing an exception or not providing EXTRACT(MILLISECOND feature at all.

// TODO looks like a Calcite issue for cases like extract(microsecond from date)
// should be 0 but not
//testSqlApi(
// "EXTRACT(MICROSECOND FROM f16)",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

"EXTRACT(DOY FROM f16)",
"315")

testSqlApi(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why didn't you add support and tests to table API as well? Why do we support only subset of ranges in table api?

return year / 10;
case CENTURY:
return year > 0
? (year + 99) / 100
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why default throws AssertionError? Shouldn't it be ValidationException or UnsupportedOperationException? If it should never happen then it should be IllegalStateException.

}
}

private static int julianExtract(TimeUnitRange range, int julian) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little bit worried about the julian thing in this method name... I have a bad feeling that it supposed to be Gregorian calendar here...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class should not be changed. It is copied from Calcite until a certain issue is fixed. If we apply changes here, then just to update it to the latest Calcite code base.

fmofw = firstMondayOfFirstWeek(year - 1);
}
return (int) (julian - fmofw) / 7 + 1;
return getIso8601WeekNumber(julian, year, month, day);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you change the old WEEK code to getIso8601WeekNumber? If they return the same thing, why didn't you implement getIso8601WeekNumber as (julian - fmofw) / 7 + 1;?

switch (range) {
case YEAR:
return year;
case ISOYEAR:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Please extract this code into some method
  2. add some comments what is going on here
  3. Please add more tests for ISOYEAR covering those edge cases (when ISOYEAR differs from YEAR in either of the directions).

case _ => // do nothing
}

case TimeUnit.EPOCH =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this code block doing?

unit match {
case TimeUnit.YEAR |
TimeUnit.MONTH |
TimeUnit.DAY |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Side question (maybe stupid one). Why do we need ExtractCallGen class at all?

@snuyanzin
Copy link
Contributor Author

I will close this PR and create a new one as there are from one side to many merge conflicts and from another calcite was updated to 1.2x since the time the PR was created.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants