[core][tune] fix RayTaskError (de)serialization logic #54396
Summary
Fixes a (de)serialization issue in `RayTaskError` that happens when the original cause defines its own `__reduce__` method. Closes #54379.
Details
The existing implementation relies on the definition of `BaseException.__reduce__`. However, since it also subclasses `cause_cls`, a `__reduce__` defined there overrides the original definition from `BaseException`.
This surfaced in #54379 because Ray Train V2 implements exception classes with custom `__reduce__` definitions (e.g. `TrainingFailedError`).
The fix is to explicitly define `__reduce__`. Also removed the comment, since it's no longer an assumption that's needed to understand the logic.
Testing
Added a `test_errors()` test case in `ray/tune/tests/test_train_v2_integration.py`.
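For reviewers who want to see the override described under Details in isolation, here is a minimal sketch that does not use Ray. All class names (`CauseError`, `WrapperError`, `FixedWrapperError`, `BrokenDual`, `FixedDual`) and the `rebuild` helper are hypothetical stand-ins, not Ray's actual implementation. The sketch only illustrates how a `__reduce__` on the cause class can take precedence in the method resolution order, and how defining `__reduce__` explicitly on the wrapper avoids that.

```python
class CauseError(Exception):
    """Hypothetical user exception with a custom __reduce__."""

    def __init__(self, message):
        super().__init__(message)

    def __reduce__(self):
        # Rebuild as a plain CauseError from the first positional argument.
        return (CauseError, (self.args[0],))


class WrapperError(Exception):
    """Hypothetical wrapper that relies on the inherited __reduce__."""

    def __init__(self, cause):
        super().__init__(cause)
        self.cause = cause


class FixedWrapperError(WrapperError):
    """Same wrapper, but with __reduce__ defined explicitly."""

    def __reduce__(self):
        return (type(self), (self.cause,))


# "Dual" classes that subclass both the wrapper and the cause class.
class BrokenDual(WrapperError, CauseError):
    pass


class FixedDual(FixedWrapperError, CauseError):
    pass


def rebuild(exc):
    """Apply the (callable, args) pair from __reduce__, as unpickling would."""
    func, args = exc.__reduce__()[:2]
    return func(*args)


cause = CauseError("boom")

# WrapperError defines no __reduce__, so the MRO finds CauseError's version
# before BaseException's.
print(BrokenDual.__reduce__ is CauseError.__reduce__)  # True

# The round trip silently drops the wrapper type.
print(type(rebuild(BrokenDual(cause))).__name__)  # CauseError

# With an explicit __reduce__ on the wrapper, the dual type survives.
print(type(rebuild(FixedDual(cause))).__name__)  # FixedDual
```

In this sketch the broken case fails silently by returning the wrong type. With a cause whose `__reduce__` reads attributes that the wrapper never sets, the same override could instead raise during pickling.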