[data] Fix slow block evaluation when splitting is enabled #20693

Merged (14 commits) on Nov 24, 2021

@ericl ericl commented Nov 24, 2021

Why are these changes needed?

This fixes slow lazy block evaluation by adding an explicit get_blocks() bulk method and using it whenever lazy iteration is not needed.

The root cause of the slowdown is that block splitting requires a ray.get() call during iteration over block refs to materialize split blocks, and this blocking call interferes with exponential rampup.
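
To illustrate the difference between the two access patterns, here is a minimal toy model. It does not use Ray and is not the code in this PR: `ToyLazyBlockList`, `iter_blocks`, `make_tasks`, and the `in_flight_at_wait` counter are hypothetical names, a thread pool stands in for Ray tasks, and `Future.result()` stands in for `ray.get()`. Only the `get_blocks()` name comes from the description above. The sketch assumes that a blocking wait inside the iterator keeps later tasks from being launched, and it does not model the rampup schedule itself.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List

Block = List[int]
ReadTask = Callable[[], List[Block]]  # one task may return several split blocks


class ToyLazyBlockList:
    """Hypothetical toy model; not Ray's actual LazyBlockList."""

    def __init__(self, tasks: List[ReadTask]) -> None:
        self._tasks = tasks
        self._pool = ThreadPoolExecutor(max_workers=4)
        self._submitted = 0
        self._waited = 0
        # Tasks submitted but not yet waited on, sampled at each blocking wait.
        self.in_flight_at_wait: List[int] = []

    def _submit(self, task: ReadTask):
        self._submitted += 1
        return self._pool.submit(task)

    def _wait(self, future) -> List[Block]:
        # Stands in for ray.get(): block until the split blocks are materialized.
        self.in_flight_at_wait.append(self._submitted - self._waited)
        self._waited += 1
        return future.result()

    def iter_blocks(self) -> Iterator[Block]:
        # Lazy path: the number of split blocks is unknown until the task
        # finishes, so each step submits one task and immediately waits on it.
        for task in self._tasks:
            yield from self._wait(self._submit(task))

    def get_blocks(self) -> List[Block]:
        # Bulk path: submit every task up front, then collect the results.
        futures = [self._submit(task) for task in self._tasks]
        blocks: List[Block] = []
        for future in futures:
            blocks.extend(self._wait(future))
        return blocks


def make_tasks(n: int) -> List[ReadTask]:
    # Each hypothetical task yields two split blocks.
    return [lambda i=i: [[i], [i + 100]] for i in range(n)]


lazy = ToyLazyBlockList(make_tasks(4))
lazy_blocks = list(lazy.iter_blocks())

bulk = ToyLazyBlockList(make_tasks(4))
bulk_blocks = bulk.get_blocks()
```

Both paths return the same blocks in the same order. In this toy model the lazy path never has more than one task outstanding when it waits, while the bulk path has all of them outstanding at its first wait, which is the kind of parallelism a per-step blocking call gives up.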

Related issue number

Closes #20625

@ericl ericl requested a review from scv119 as a code owner November 24, 2021 03:45
@ericl ericl changed the title from "[WIP] [data] Fix slow block evaluation when splitting is enabled" to "[data] Fix slow block evaluation when splitting is enabled" Nov 24, 2021
@ericl ericl added the tests-ok The tagger certifies test failures are unrelated and assumes personal liability. label Nov 24, 2021
@ericl ericl merged commit 75bd1fb into ray-project:master Nov 24, 2021

Successfully merging this pull request may close these issues.

[Bug][Dataset] Inference nightly test download regression