[Data] Simplify execution callback lifecycle fix Diff #1#60480
Conversation
|
@bveeramani review please |
There was a problem hiding this comment.
Code Review
This pull request refactors the execution callback system in Ray Data, moving from a dynamic, per-job injection model to a static, eager initialization approach. The DataContext now serves as a registry for callback factories, and the StreamingExecutor loads all callbacks at startup. This simplifies the callback lifecycle and makes the system more declarative. The LoadCheckpointCallback is also updated to fit this new model, becoming a no-op when checkpointing is not configured. The changes are logical and well-implemented. I have one suggestion to improve the robustness of exception handling when loading callbacks from environment variables.
|
@owenowenisme would you mind doing a first pass on this? I think we should pay special attention to the checkpointing changes and ensuring there are no new correctness issues |
bveeramani
left a comment
There was a problem hiding this comment.
Some high-level feedback: two disadvantages of requiring all subclasses to share the same constructor are: (1) it makes it less explicit what the dependencies of that subclass are, and (2) it makes it harder to test the subclass because you can’t directly inject fake dependencies.
An alternative pattern would be to introduce a require factory class method like from_executor(cls, executor), and use that when constructing the callbacks. Then, subclasses can still define whatever dependencies they need in the constructor.
Also, make sure the other callback tests still pass? (If you haven’t already)
6f87e23 to
6ac4e42
Compare
9b0e0db to
7b8c9dc
Compare
|
@bveeramani @owenowenisme another review please |
Signed-off-by: Limark Dcunha <limarkdcunha@gmail.com>
824de72 to
bb435d8
Compare
Signed-off-by: Limark Dcunha <limarkdcunha@gmail.com>
Signed-off-by: Balaji Veeramani <bveeramani@berkeley.edu>
…d of instances (#61293) Larger Issue Refactored the execution callback system from dynamic, per-job injection to static, eager initialization. DataContext now serves as a stateless registry of callback factories, enabling the StreamingExecutor to load all callbacks at startup. Each callback then self-configures or no-ops based on the runtime environment. ### Context This is Part 2 of a four-part change set intended to resolve the issue described above. Part 1 diff: #60480 Closes #60279. --------- Signed-off-by: Limark Dcunha <limarkdcunha@gmail.com>
Description
Larger Issue
Refactored the execution callback system from dynamic, per-job injection to static, eager initialization. DataContext now serves as a stateless registry of callback factories, enabling the StreamingExecutor to load all callbacks at startup. Each callback then self-configures or no-ops based on the runtime environment.
Context
This is Part 1 of a four-part change set intended to resolve the issue described above.
Closes #60279.