[BP-1.19][FLINK-35522][runtime] Fix the issue that the source task may get stuck in speculative execution mode.#24899

Merged: zhuzhurk merged 1 commit into apache:release-1.18 from SinBex:FLINK-35522-1.18 on Jun 7, 2024

Conversation

SinBex (Contributor) commented Jun 6, 2024

What is the purpose of the change

If a source task is not assigned a split because the SplitEnumerator has no more splits, and a failover occurs while the task is closing, the SourceCoordinatorContext does not resend the NoMoreSplitsEvent to the newly started source task, so the source vertex remains stuck indefinitely.
This can only occur in batch jobs with speculative execution enabled.
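
The failure mode can be illustrated with a small toy model. This is only a sketch under stated assumptions: `NoMoreSplitsSketch`, `ToyCoordinator`, and the `resendToNewAttempts` switch are hypothetical names invented for illustration, not Flink classes, and the "resend on registration" behaviour is one plausible way to avoid the hang, not necessarily what this patch does.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model only. None of these names are Flink classes; they are hypothetical
// stand-ins used to illustrate the failure mode described above.
class NoMoreSplitsSketch {

    static class ToyCoordinator {
        // Assumed switch: whether a late-registering attempt is told that no splits remain.
        private final boolean resendToNewAttempts;
        private boolean noMoreSplitsSignalled = false;
        private final Set<Integer> registered = new HashSet<>();
        private final Set<Integer> notified = new HashSet<>();

        ToyCoordinator(boolean resendToNewAttempts) {
            this.resendToNewAttempts = resendToNewAttempts;
        }

        // A task attempt (original, speculative, or restarted) registers with the coordinator.
        void register(int attempt) {
            registered.add(attempt);
            if (resendToNewAttempts && noMoreSplitsSignalled) {
                notified.add(attempt);
            }
        }

        // The enumerator has run out of splits: notify every currently registered attempt.
        void signalNoMoreSplits() {
            noMoreSplitsSignalled = true;
            notified.addAll(registered);
        }

        // The attempt fails and is forgotten by the coordinator.
        void fail(int attempt) {
            registered.remove(attempt);
            notified.remove(attempt);
        }

        // An attempt can only finish once it has been told that no more splits will arrive.
        boolean canFinish(int attempt) {
            return notified.contains(attempt);
        }
    }

    // Replays the scenario: notify, fail while closing, start a replacement attempt.
    static boolean stuckAfterFailover(boolean resendToNewAttempts) {
        ToyCoordinator coordinator = new ToyCoordinator(resendToNewAttempts);
        coordinator.register(0);
        coordinator.signalNoMoreSplits(); // attempt 0 is notified and starts closing
        coordinator.fail(0);              // failover during the closing process
        coordinator.register(1);          // newly started attempt
        return !coordinator.canFinish(1);
    }

    public static void main(String[] args) {
        System.out.println("stuck without resend: " + stuckAfterFailover(false));
        System.out.println("stuck with resend: " + stuckAfterFailover(true));
    }
}
```

In this model the replacement attempt can only finish if the coordinator remembers that the notification was already sent and repeats it for attempts that register later.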

Brief change log

  • Fix the issue that the source task may get stuck in speculative execution mode.

Verifying this change

  • Added an IT case to verify the issue.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (no)
  • The serializers: (no)
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? (no)

flinkbot (Collaborator) commented Jun 6, 2024

CI report:

Bot commands

The @flinkbot bot supports the following commands:

  • @flinkbot run azure: re-run the last Azure build

zhuzhurk (Contributor) commented Jun 6, 2024

@flinkbot run azure

zhuzhurk (Contributor) commented Jun 6, 2024

@flinkbot run azure

SinBex (Contributor, Author) commented Jun 6, 2024

@flinkbot run azure

zhuzhurk (Contributor) left a comment

LGTM.

zhuzhurk merged commit 8ae7986 into apache:release-1.18 on Jun 7, 2024
zhuzhurk changed the title from "[FLINK-35522][runtime] Fix the issue that the source task may get stuck in speculative execution mode." to "[BP-1.19][FLINK-35522][runtime] Fix the issue that the source task may get stuck in speculative execution mode." on Jun 7, 2024