[tune] Deflake PBT Async test #19135

Merged 1 commit on Oct 6, 2021

@krfricke krfricke commented Oct 6, 2021

Why are these changes needed?

The PopulationBasedTrainingSynchTest:testAsynchFail test is currently flaky.

The test ensures that in asynchronous training, at least some trials do not exploit the best-performing trial, because results are evaluated asynchronously. However, for certain result orderings, every trial may still end up exploiting it by chance, e.g. if the worst trial exploits the best trial early and the second trial then does the same afterwards.

By introducing different sleep times, we make sure that poorly performing trials are evaluated first, continue training earlier, and finish earlier without exploiting well-performing trials. In synchronous mode, this should not happen, as PBT waits for all results to arrive first, which is exactly what we want to test here.
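
To illustrate the ordering argument above, here is a minimal toy simulation. It is not the code from this PR and it does not use Ray Tune: the trial names, scores, sleep times, and the helpers `exploit_target`, `run_async`, and `run_sync` are all hypothetical, and the exploit rule is reduced to "copy the best trial seen so far" rather than PBT's quantile logic.

```python
from typing import Dict, Optional, Tuple

# Hypothetical setup: trial name -> (score, sleep time in seconds).
# Worse trials sleep less, so their results arrive first.
TRIALS: Dict[str, Tuple[float, float]] = {
    "worst": (1.0, 0.1),
    "middle": (2.0, 0.2),
    "best": (3.0, 0.3),
}


def exploit_target(seen: Dict[str, float], trial: str) -> Optional[str]:
    """Return the best trial seen so far, or None if `trial` is that trial."""
    leader = max(seen, key=seen.get)
    return None if leader == trial else leader


def run_async(trials: Dict[str, Tuple[float, float]]) -> Dict[str, Optional[str]]:
    """Decide for each trial as soon as its own result arrives."""
    seen: Dict[str, float] = {}
    decisions: Dict[str, Optional[str]] = {}
    # Results arrive in order of increasing sleep time.
    for name, (score, _sleep) in sorted(trials.items(), key=lambda kv: kv[1][1]):
        seen[name] = score
        decisions[name] = exploit_target(seen, name)
    return decisions


def run_sync(trials: Dict[str, Tuple[float, float]]) -> Dict[str, Optional[str]]:
    """Wait for all results to arrive, then decide for every trial."""
    seen = {name: score for name, (score, _sleep) in trials.items()}
    return {name: exploit_target(seen, name) for name in trials}


if __name__ == "__main__":
    print("async:", run_async(TRIALS))
    print("synch:", run_sync(TRIALS))
```

In this sketch the asynchronous run never lets the worst trial copy the best one, because the best trial has not reported yet when the decision is made, while the synchronous run always does. Fixing the arrival order through sleep times is what makes that difference deterministic.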

Related issue number

Closes #15730

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@amogkam amogkam left a comment

Awesome, thanks!

@amogkam amogkam merged commit 9f77cd8 into ray-project:master Oct 6, 2021
@krfricke krfricke deleted the tune/deflake-pbt-synch branch October 7, 2021 10:01
Development

Successfully merging this pull request may close these issues.

[Tune] test_trial_scheduler_pbt has become flaky recently!