Skip to content

Conversation

@reswqa
Copy link
Member

@reswqa reswqa commented Feb 6, 2023

What is the purpose of the change

Partition reuse only take effect for re-consumable edge, but hybrid selective result partition is not re-consumable. This optimization is very important to reduce the cost of the shuffle write phase. In the previous implementation, we will only force the broadcast edge to be of hybrid full(re-consumable) in the 'ResultPartitionTypeFactory'. As a result, for ALL_ EXCHANGE_HYBRID_SELECTIVE job, partition reuse cannot take effect for non-broadcast edges.
In fact, we expected to replace all the edges that can be reused with hybrid full result partition.

Brief change log

  • Change broadcast hybrid selective result partition type to hybrid full in compile phase instead of partition create phase.
  • Change reusable hybrid result partition to hybrid full to enable partition reuse optimization.

Verifying this change

This change added tests in StreamingJobGraphGeneratorTest.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no

…tion type to hybrid full in compile phase instead of partition create phase.

In the previous implementation, we will force the broadcast hybrid selective edge to be of hybrid full(re-consumable) in the ResultPartitionTypeFactory. But this is not the correct phase to do this replacement, because for scheduling topology, the result is not changed still. This inconsistency will cause many problems, such as increasing the cost of fail-over.
@flinkbot
Copy link
Collaborator

flinkbot commented Feb 6, 2023

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

…rid full to enable partition reuse optimization.
Copy link
Contributor

@xintongsong xintongsong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Merging.

xintongsong pushed a commit that referenced this pull request Feb 9, 2023
…rid full to enable partition reuse optimization.

This closes #21856
akkinenivijay pushed a commit to krisnaru/flink that referenced this pull request Feb 11, 2023
…rid full to enable partition reuse optimization.

This closes apache#21856
mohsenrezaeithe pushed a commit to mohsenrezaeithe/flink that referenced this pull request Feb 21, 2023
…rid full to enable partition reuse optimization.

This closes apache#21856
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants