Skip to content

Conversation

@reswqa
Copy link
Member

@reswqa reswqa commented Aug 10, 2022

What is the purpose of the change

Fix the bug that downstream task may never be notified of data available in hybrid shuffle when number of credits is zero.

Brief change log

  • HsSubpartitionView should be initialized to a notifiable state.
  • Reset needNotify to true when get a zero backlog.

Verifying this change

This change added test in HsSubpartitionViewTest.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no

…e state.

HsSubpartitionView should be initialized to a notifiable state, there may be a problem of never consuming otherwise. Imagine the following situation: Downstream task has no initial credit(i.e. exclusive buffers is configured to zero), if there is no data output in the upstream, it will feedback a zero backlog to downstream input channel. All subsequent data available notifications will be intercepted as needNotify is false.
@flinkbot
Copy link
Collaborator

flinkbot commented Aug 10, 2022

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

HsSubpartitionView should be notifiable when downstream get a zero backlog. Generally speaking, if the backlog is zero, when data become available, even if there is no credit, the backlog information will be notified also. However, in the hybrid shuffle, the notification will be ignored. This behavior is incorrect.
@reswqa
Copy link
Member Author

reswqa commented Aug 10, 2022

@flinkbot run azure

Copy link
Contributor

@xintongsong xintongsong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

huangxiaofeng10047 pushed a commit to huangxiaofeng10047/flink that referenced this pull request Nov 3, 2022
HsSubpartitionView should be notifiable when downstream get a zero backlog. Generally speaking, if the backlog is zero, when data become available, even if there is no credit, the backlog information will be notified also. However, in the hybrid shuffle, the notification will be ignored. This behavior is incorrect.

This closes apache#20529
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants