Skip to content

Conversation

@hequn8128
Copy link
Contributor

What is the purpose of the change

This pull request ports window classes into flink-api-java module.

Brief change log

  • Convert Tumble/Slide/Session/Over related classes to interface and rename the original class from XXX to XXXImpl. For example, add a TumbleWithSize interface and rename the original TumbleWithSize to TumbleWithSizeImpl. We can't port TumbleWithSize directly into the api-java module because ExpressionParser is used in it.
  • Deprecate def window(window: Window): WindowedTable and add def window(window: GroupWindow): GroupWindowedTable
  • Deprecate class WindowedTable and add GroupWindowedTable
  • Add test case, such as WindowValidationTest, to test failures for the creation of window.

Verifying this change

This change is already covered by existing Window IT/UT test cases.
Furthermore, WindowValidationTest has been add to test failures for the creation of window.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (yes)
  • The serializers: (no)
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? (no)

@flinkbot
Copy link
Collaborator

flinkbot commented Mar 13, 2019

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Review Progress

  • ✅ 1. The [description] looks good.
  • ✅ 2. There is [consensus] that the contribution should go into to Flink.
  • ❓ 3. Needs [attention] from.
  • ✅ 4. The change fits into the overall [architecture].
  • ✅ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.

Details
The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

* <p>For finite batch tables, group windows provide shortcuts for time-based groupBy.
*
* <p>Note: {@link Window} is temporally used as the father class of {@link GroupWindow} for the
* sake of compatibility. It will be removed later.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here just a suggestion, how about we open a new PR for Deprecated the Window in release-1.8, since we should create a new RC2 for release 1.8. If we do not do that the Window will keep existing for almost half a year. And I'll create the Jira FLINK-11918, and link to release-1.8 vote mail thread, ask RM's options. If all of you do not agree, I'll close the JIRA, otherwise, we can open the new PR for Deprecated the window.

Copy link
Member

@sunjincheng121 sunjincheng121 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. @hequn8128

The PR looks good to me. I only left a few suggestions.
And one suggestion about deprecated Window in a new PR, I think @twalthr may give us some suggestions.

Best,
Jincheng

* aggregate for each input row over a range of its neighboring rows.
*
* <p>Java users should use the methods with parameters of String type, while Scala users should use
* the methods with parameters of Expression type.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't need mention Java users should not using Expression. Frankly speaking, we expect Java users to use Expression easily after FLINK-11890. What do you think?

* elements in 5 minutes intervals.
*
* <p>Java users should use the methods with parameters of String type, while Scala users should use
* the methods with parameters of Expression type.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

@Deprecated
@deprecated(
"This method will be removed. Use table.window(window: GroupWindow) instead",
"1.9.0")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It depends on FLINK-11918

*/
def every(slide: Expression): SlideWithSizeAndSlide = new SlideWithSizeAndSlide(size, slide)
def every(slide: Expression): SlideWithSizeAndSlide =
new SlideWithSizeAndSlideImpl(size, slide)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary change?

@sunjincheng121
Copy link
Member

@flinkbot approve description
@flinkbot approve consensus

@hequn8128 hequn8128 changed the title [FLINK-11908] Port window classes into flink-api-java [FLINK-11908][table] Port window classes into flink-api-java Mar 14, 2019
@hequn8128
Copy link
Contributor Author

@sunjincheng121 Thanks a lot for your review. I have addressed all your comments.

@twalthr @sunjincheng121 I have also rebased to the deprecating window pr and delete the deprecated methods and classes in this one. Would be great if you can take a look.

Best, Hequn

@hequn8128
Copy link
Contributor Author

Update, rebase to the latest deprecating window PR to unblock review. The updates contain two commits. One is the PR of deprecating window, the other one is the PR of the current issue.

@hequn8128
Copy link
Contributor Author

Rebase to the latest flink/master

Copy link
Contributor

@twalthr twalthr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for rebasing the PR @hequn8128. Did you see my comment in the JIRA issue? I think instead of doing a lot of reflection magic for getting the implementation for every window. We should treat the cause of the problem which is the missing ExpressionParser. I would suggest to rename ExpressionParser to ExpressionParserImpl and introduce ExpressionParser.parseExpression in flink-table-api-java that performs a reflection step at just one location. This allows to finalize the API porting with interfaces and implementation for windows. And also solves related issues that require an expression parser.

@hequn8128
Copy link
Contributor Author

hequn8128 commented Mar 15, 2019

@twalthr Sorry. I missed the important messages from you. Converting the ExpressionParser into an interface is a good solution. So we can port TumbleWithSize, TumbleWithSizeOnTime, etc. directly into api-java. This also performs a reflection step at just one location.

Good idea!
I will try to update the PR asap.

Best, Hequn

@hequn8128
Copy link
Contributor Author

@twalthr @sunjincheng121 I have updated the PR. Would be great if you can take another look.

Some changes are explained by the followings:

  • To make ExpressionParser.parseExpression() available, the ExpressionParser contains two static methods: parseExpression() and parseExpressionList().
  • As static methods cannot be overridden, I added a PlannerExpressionParser interface. The PlannerExpressionParser contains two non-static expression parser methods and are overridden by PlannerExpressionParserImpl. So once we get a PlannerExpressionParserImpl instance through refection, we can call parseExpression() in PlannerExpressionParserImpl.
  • Furthermore, I add a companion class PlannerExpressionParserImpl. This is because we can't new object directly.
  • Both ExpressionParser and PlannerExpressionParser in api-java are marked as Internal class.

Best,
Hequn

Copy link
Contributor

@twalthr twalthr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @hequn8128. The code looks very good. I had just minor comments and will address them while merging.

public static PlannerExpressionParser getExpressionParser() {

if (expressionParser == null) {
synchronized (SingletonPlannerExpressionParser.class) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The API is single-threaded we don't need to synchronized code blocks.

alias,
partitionBy,
orderBy,
ExpressionParser.parseExpression("UNBOUNDED_RANGE"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't introduce string constants at an arbitrary code location. Instead, either move it to the top of a class as a static final field or in this case use an already existing constant in BuiltInFunctionDefinitions.

/** Defines a partitioning of the input on one or more attributes. */
private final List<Expression> partitionBy;

public OverWindowPartitioned(List<Expression> partitionBy) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use a default visibility for intermediate API classes. Such that users can not accidentally use the wrong methods.

@twalthr
Copy link
Contributor

twalthr commented Mar 18, 2019

@flinkbot approve all

@asfgit asfgit closed this in 7a27357 Mar 18, 2019
@hequn8128
Copy link
Contributor Author

@twalthr Thanks for your valuable suggestions. I will move on the next PR(convert Table to the interface).

HuangZhenQiu pushed a commit to HuangZhenQiu/flink that referenced this pull request Mar 20, 2019
HuangZhenQiu pushed a commit to HuangZhenQiu/flink that referenced this pull request Apr 22, 2019
sunhaibotb pushed a commit to sunhaibotb/flink that referenced this pull request May 8, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants