[train] expose training input/output in callbacks #53869

matthewdeng · 2025-06-17T00:25:25Z

Changes

- Made `TrainRunContext` frozen and added all training inputs (configs, datasets, etc.)
- Changed `TrainContext` to contain `TrainRunContext` instead of inheriting from it
- Added `after_controller_finish` callback to expose final training results
- Modified `after_controller_start` to receive `TrainRunContext` parameter
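
For readers skimming the description, here is a minimal, self-contained sketch of the shape these changes suggest. It does not import Ray and is not the actual implementation: only `train_loop_config` is taken from the diff in this PR, while `run_name`, `datasets`, `world_rank`, `RecordingCallback`, `run_controller`, and the exact hook signatures (in particular the `result` argument) are assumptions made for illustration.

```python
from dataclasses import dataclass, field, FrozenInstanceError
from typing import Any, Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class TrainRunContext:
    """Stand-in for the frozen, run-level context holding training inputs."""

    run_name: str  # hypothetical field
    train_loop_config: Optional[Dict[str, Any]] = None  # field shown in the PR diff
    datasets: Dict[str, Any] = field(default_factory=dict)  # hypothetical field


@dataclass
class TrainContext:
    """Stand-in worker context that contains the run context (no inheritance)."""

    train_run_context: TrainRunContext
    world_rank: int = 0  # hypothetical field


class RecordingCallback:
    """Hypothetical callback that records what each hook receives."""

    def __init__(self) -> None:
        self.events: List[Tuple[str, Any]] = []

    def after_controller_start(self, train_run_context: TrainRunContext) -> None:
        # Training inputs are available here through the run context.
        self.events.append(("start", train_run_context.run_name))

    def after_controller_finish(self, result: Any) -> None:
        # Final training output; the argument name and type are assumed.
        self.events.append(("finish", result))


def run_controller(
    run_context: TrainRunContext,
    callbacks: List[RecordingCallback],
    train_fn: Callable[[Dict[str, Any]], Any],
) -> Any:
    """Toy controller loop: start hooks, training function, finish hooks."""
    for callback in callbacks:
        callback.after_controller_start(run_context)
    result = train_fn(run_context.train_loop_config or {})
    for callback in callbacks:
        callback.after_controller_finish(result)
    return result
```

Because the stand-in `TrainRunContext` is declared with `frozen=True`, assigning to one of its fields after construction raises `dataclasses.FrozenInstanceError`, which is the property that makes it safe to hand the same object to every callback.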

TODO

- Remove redundant `TrainRunContext` from `Callback` initializations
- Refactor logging to handle both `TrainContext` and `TrainRunContext`
- Fix/add tests

Signed-off-by: Matthew Deng <matt@anyscale.com>

justinvyu

looks good to me overall!

justinvyu · 2025-06-18T00:30:16Z

python/ray/train/v2/_internal/execution/context.py

+    # The configuration passed to the training function.
+    train_loop_config: Optional[Dict[str, Any]]


Should we also add some metadata, like the Ray version/commit and the backend version (e.g. the torch version)?


- Made `TrainRunContext` frozen and added all training inputs (configs, datasets, etc.)
- Changed `TrainContext` to contain `TrainRunContext` instead of inheriting from it
- Added `after_controller_finish` callback to expose final training results
- Modified `after_controller_start` to receive `TrainRunContext` parameter

---------

Signed-off-by: Matthew Deng <matt@anyscale.com>
Signed-off-by: Scott Lee <scott.lee@rebellions.ai>


matthewdeng added 2 commits June 16, 2025 17:18

[train] expose input and output in callbacks

91e0bbf

Signed-off-by: Matthew Deng <matt@anyscale.com>

add context

9163653

Signed-off-by: Matthew Deng <matt@anyscale.com>

justinvyu approved these changes Jun 18, 2025


matthewdeng added 6 commits June 17, 2025 17:43

callback init

36c5a68

Signed-off-by: Matthew Deng <matt@anyscale.com>

logging

06dafa6

Signed-off-by: Matthew Deng <matt@anyscale.com>

fix tests

bfbff5f

Signed-off-by: Matthew Deng <matt@anyscale.com>

fix controller test

b28cd52

Signed-off-by: Matthew Deng <matt@anyscale.com>

fix tests

58c0395

Signed-off-by: Matthew Deng <matt@anyscale.com>

worker group test

f6b5ae9

Signed-off-by: Matthew Deng <matt@anyscale.com>

matthewdeng added the "go" label (add ONLY when ready to merge, run all tests) Jun 18, 2025

matthewdeng added 2 commits June 18, 2025 14:02

move to util

0b81b5d

Signed-off-by: Matthew Deng <matt@anyscale.com>

fix logging tests

1be35d3

Signed-off-by: Matthew Deng <matt@anyscale.com>

matthewdeng marked this pull request as ready for review June 18, 2025 22:17

matthewdeng requested a review from a team as a code owner June 18, 2025 22:17

matthewdeng merged commit 2ccc18d into ray-project:master Jun 18, 2025
5 checks passed
