[FLINK-27903][runtime] Introduce and support HYBRID resultPartitionType #19927

Closed
wants to merge 5 commits into from

reswqa
Member

@reswqa reswqa commented Jun 10, 2022

What is the purpose of the change

Introduce and support HYBRID resultPartitionType

Brief change log

  • Introduce HYBRID resultPartitionType.
  • Make streamGraph and jobGraph support HYBRID type edge.
  • Test that the pipelinedRegionSchedulingStrategy can be adapted to hybrid type edges.

Verifying this change

This change added unit tests.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): yes
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? yes
  • If yes, how is the feature documented? Docs and JavaDocs

@flinkbot
Collaborator

flinkbot commented Jun 10, 2022

CI report:

Bot commands

The @flinkbot bot supports the following commands:
  • @flinkbot run azure: re-run the last Azure build

@reswqa
Member Author

reswqa commented Jun 10, 2022

@flinkbot run azure

@reswqa
Member Author

reswqa commented Jun 10, 2022

@xintongsong This PR has been updated according to the comments. Please take a look when you have time.

@reswqa
Member Author

reswqa commented Jun 11, 2022

@flinkbot run azure

*
* <p>Intermediate data can be consumed directly from memory as much as possible.
*/
HYBRID(true, false, ConsumingConstraint.CAN_BE_PIPELINED, ReleaseBy.SCHEDULER);
Contributor

Will this type be included in isBlockingOrBlockingPersistentResultPartition or isPipelinedOrPipelinedBoundedResultPartition?

After the last change, the purpose of these two new methods is not very clear to me. They seem to work as hard-coded checks for the Pipelined and Blocking types, right?

Contributor

This will not be included in isBlockingOrBlockingPersistentResultPartition or isPipelinedOrPipelinedBoundedResultPartition.

The two methods here are, exactly as you said, checking for the specific types of the interface implementations. This is obviously not a good design, because callers should not assume they are working with a specific implementation of an interface. However, this was not newly introduced. FLINK-27902 only explicitly separates the use cases that rely on specific implementations from those that properly rely on the interfaces. Fixing them probably requires a more careful redesign of the problematic use cases, and we do not want to block this feature on that.

Contributor

Thanks @xintongsong for your explanation. IMO, the previous version mainly exposed isBlocking, isPipelined, and isReconnectable, which I think are characteristics and are not bound to a specific implementation of the type. Please correct me if I'm wrong.
But I think we can redesign this part after this feature is finished.

Contributor

Yes, the interfaces were intended to expose a characteristic rather than to be bound to a specific implementation. The problem was on the caller side. Instead of relying on the intended characteristic, some callers were relying on assumptions such as "a result partition type that returns true for isPipelined must be PIPELINED or PIPELINED_BOUNDED".

E.g., in AdaptiveScheduler#assertPreconditions, the intention is to make sure only PIPELINED and PIPELINED_BOUNDED are involved. You may take a look at where isBlockingOrBlockingPersistentResultPartition and isPipelinedOrPipelinedBoundedResultPartition are called for more details.
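
The distinction can be sketched in a few lines of Java. This is a simplified, self-contained illustration and not the Flink source: the enum PartitionType, its single boolean flag, and the method names canBePipelinedConsumed and isPipelinedOrPipelinedBounded are hypothetical stand-ins. Treating HYBRID as consumable in a pipelined fashion is an assumption based on ConsumingConstraint.CAN_BE_PIPELINED in the diff above.

```java
// Simplified stand-in for illustration only; NOT the actual Flink enum.
enum PartitionType {
    BLOCKING(false),
    BLOCKING_PERSISTENT(false),
    PIPELINED(true),
    PIPELINED_BOUNDED(true),
    HYBRID(true); // assumption: HYBRID may be consumed while it is being produced

    private final boolean canBePipelined;

    PartitionType(boolean canBePipelined) {
        this.canBePipelined = canBePipelined;
    }

    /** Characteristic-based check (hypothetical name): true for every type with the flag set. */
    boolean canBePipelinedConsumed() {
        return canBePipelined;
    }

    /** Type-based check (hypothetical name): true only for the two explicitly named types. */
    boolean isPipelinedOrPipelinedBounded() {
        return this == PIPELINED || this == PIPELINED_BOUNDED;
    }
}

class PartitionTypeDemo {
    public static void main(String[] args) {
        // Print both answers side by side for every type.
        for (PartitionType type : PartitionType.values()) {
            System.out.println(
                    type
                            + ": characteristic=" + type.canBePipelinedConsumed()
                            + ", exactType=" + type.isPipelinedOrPipelinedBounded());
        }
    }
}
```

In this sketch, a caller that asks for the characteristic accepts HYBRID, while a caller that needs exactly the two pipelined types rejects it. A precondition like the one described for AdaptiveScheduler would therefore use the type-based check.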

Contributor

Got it, thanks.

Contributor

@xintongsong xintongsong left a comment

@reswqa, thanks for addressing the comments. LGTM.
I just made a few minor doc changes in the fixup commit. Please check it out.

@reswqa
Member Author

reswqa commented Jun 13, 2022

@xintongsong Thanks for your review and the good doc changes. Should I rebase the fixup commit, or will you handle it when you merge?

@xintongsong
Contributor

I'll handle the fixup commit during merging.

huangxiaofeng10047 pushed a commit to huangxiaofeng10047/flink that referenced this pull request Jun 27, 2022
…o support testing for hybrid resultPartitionType.

This closes apache#19927
zstraw pushed a commit to zstraw/flink that referenced this pull request Jul 4, 2022
…o support testing for hybrid resultPartitionType.

This closes apache#19927
liujiawinds pushed a commit to liujiawinds/flink that referenced this pull request Jul 22, 2022
…o support testing for hybrid resultPartitionType.

This closes apache#19927