-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Data] Partial fix for Dataset.context not being sealed after creation #41569
Conversation
Signed-off-by: Hao Chen <chenh1024@gmail.com>
Signed-off-by: Hao Chen <chenh1024@gmail.com>
@@ -545,32 +545,6 @@ def f(x): | |||
ds2.take_all() | |||
|
|||
|
|||
def test_streaming_split_with_custom_data_context( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving this test to test_context_propagation
, not changing the test code.
What about the places where we use DataContext.get_current() during planning, e.g., here? Don't we need to propagate the DataContext through to those? |
Good point. So this PR can fix the case for training jobs, where be different datasets will be proposed to different processes (the SplitCoordinator actors) for execution. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, but can you update the PR description to make it clear what cases this does and does not cover?
Why are these changes needed?
Dataset.context
should be sealed the first time the Dataset is created. But if a new operator is applied to the dataset, the new global DataContext will be saved again to the Dataset.This bug prevents using different DataContexts for training and validation datasets in a training job.
Note this PR only fixes the issue when multiple datasets are created in the process but will be running in different processes. If they run in the same process, it's still a bug, see #41573.
Related issue number
#41573
Checks
git commit -s
) in this PR.scripts/format.sh
to lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/
under thecorresponding
.rst
file.