Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-34274][runtime] Implicitly disable resource wait timeout for A… #24238

Merged
merged 2 commits into from
Feb 28, 2024

Conversation

dmvk
Copy link
Member

@dmvk dmvk commented Jan 30, 2024

@dmvk dmvk requested a review from XComp January 30, 2024 21:32
@flinkbot
Copy link
Collaborator

flinkbot commented Jan 30, 2024

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

Copy link
Contributor

@XComp XComp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change makes sense. I have a minor proposal that would allow not touching production code. PTAL

The CI failure is most likely due to FLINK-34200

@XComp
Copy link
Contributor

XComp commented Feb 16, 2024

Could we finalize this one? 🤔

@XComp
Copy link
Contributor

XComp commented Feb 27, 2024

I added the changes I proposed. That should be good enough from my end, if you're ok with those changes (just to bring the PR closer to being merged and resolving the test instability in master).

@XComp
Copy link
Contributor

XComp commented Feb 28, 2024

I added the changes I proposed. That should be good enough from my end, if you're ok with those changes (just to bring the PR closer to being merged and resolving the test instability in master).

I reverted those changes. See my responses above. The only thing I kept/added was removing the comment. I'm gonna rebase the branch to get a green CI (the e2e tests are failing due to FLINK-34420).

dmvk and others added 2 commits February 28, 2024 10:49
…urceWaitTimeout anymore) and just adds confusion because we don't need to wait in this test:

0. The mentioned test method #testRequirementLowerBoundIncreaseBeyondCurrentParallelismAttemptsImmediateRescale does not exist (#testRequirementLowerBoundIncreaseBeyondCurrentParallelismKeepsJobRunning is most likely meant)
1. the startJobWithSlotsMatchingParallelism will be executed in the main thread and, therefore, finishes before updateJobResourceRequirements is triggered (which also runs in the main thread)
2. Running both calls in the main thread ensures sequential execution of the commands with the scheduler being in WaitingForResources state when calling updateJobResourceRequirements
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants