[air] New train.Checkpoint API: Update lightning, transformers, and tf integrations #38491
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
1e66990 to f265479
LGTM.
nit: Do we have any warning message when users try to call legacy checkpoint APIs on the new checkpoint objects?
@woshiyyya Good point. I'll add that as a todo on the issue. I think we can add these legacy methods on the base |
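To illustrate the idea raised in this thread, here is a minimal sketch of a base checkpoint class that warns when a legacy method is called on it. This is not Ray's implementation: `SketchCheckpoint`, `_legacy_api`, and the choice of `to_dict`/`to_bytes` as the legacy method names are assumptions made for illustration, and whether to warn, raise, or both is a design choice the thread does not settle.

```python
import warnings


class SketchCheckpoint:
    """Hypothetical directory-based checkpoint; a stand-in, not Ray's class."""

    def __init__(self, path: str):
        self.path = path

    def _legacy_api(self, name: str):
        # Hypothetical helper: emit a warning that points at the caller of the
        # legacy method, then fail with an explicit error.
        warnings.warn(
            f"`{name}` is a legacy checkpoint API and is not supported "
            "on the new checkpoint objects.",
            DeprecationWarning,
            stacklevel=3,
        )
        raise NotImplementedError(f"`{name}` is not available on {type(self).__name__}.")

    # Legacy-style methods kept on the base class only to give a clear message.
    def to_dict(self):
        self._legacy_api("to_dict")

    def to_bytes(self):
        self._legacy_api("to_bytes")
```

Keeping the stubs on the base class means every framework-specific subclass inherits the same message without extra code.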
… tf integrations (ray-project#38491) Signed-off-by: Justin Yu <justinvyu@anyscale.com> Signed-off-by: harborn <gangsheng.wu@intel.com>
… tf integrations (ray-project#38491) Signed-off-by: Justin Yu <justinvyu@anyscale.com>
… tf integrations (ray-project#38491) Signed-off-by: Justin Yu <justinvyu@anyscale.com> Signed-off-by: e428265 <arvind.chandramouli@lmco.com>
… tf integrations (ray-project#38491) Signed-off-by: Justin Yu <justinvyu@anyscale.com> Signed-off-by: Victor <vctr.y.m@example.com>
Why are these changes needed?
This PR updates LightningTrainer, TransformersTrainer, and the TF/Keras ReportCheckpointCallback to report a new FrameworkCheckpoint (from #38452) when the feature flag is enabled. The callbacks for the new torch integration APIs are also updated to return a new Checkpoint rather than the old air.Checkpoint.
This PR also makes Backend._encode_data a no-op for all frameworks. This is not needed with the new checkpoints.
Notes
Checkpoint because the lightning/transformers trainers are being deprecated soon, and it's OK to just keep the legacy behavior for them. (Otherwise, there'd be a bunch of unnecessary updates to make.)
TensorflowCheckpoint in ReportCheckpointCallback. This is because it's also a checkpoint that we create for the user, so the user may need some accessors to get stuff out of it. Another option is to add a getter to ReportCheckpointCallback (similar to XGBoostTrainer.get_model) and just return a generic Checkpoint.
Related issue number
#38292
Checks
git commit -s) in this PR.
scripts/format.sh to lint the changes in this PR.
method in Tune, I've added it in doc/source/tune/api/ under the corresponding .rst file.
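The behavior described under "Why are these changes needed?" can be sketched roughly as follows: a callback builds a directory-based checkpoint when a feature flag is on and falls back to the legacy object otherwise, while the backend's encode step passes checkpoints through unchanged. All names here (`SKETCH_NEW_CHECKPOINT`, `LegacyCheckpoint`, `DirectoryCheckpoint`, `build_checkpoint`, `SketchBackend`) are hypothetical stand-ins, not Ray's actual flag or classes.

```python
import json
import os
import tempfile
from dataclasses import dataclass

# Hypothetical environment variable standing in for the real feature flag.
FLAG_ENV_VAR = "SKETCH_NEW_CHECKPOINT"


@dataclass
class LegacyCheckpoint:
    """Stand-in for the old in-memory checkpoint."""
    data: dict


@dataclass
class DirectoryCheckpoint:
    """Stand-in for the new directory-based checkpoint."""
    path: str


def new_checkpoint_enabled() -> bool:
    return os.environ.get(FLAG_ENV_VAR, "0") == "1"


def build_checkpoint(state: dict):
    """Return the new checkpoint type only when the flag is enabled."""
    if new_checkpoint_enabled():
        ckpt_dir = tempfile.mkdtemp()
        with open(os.path.join(ckpt_dir, "state.json"), "w") as f:
            json.dump(state, f)
        return DirectoryCheckpoint(ckpt_dir)
    return LegacyCheckpoint(dict(state))


class SketchBackend:
    def _encode_data(self, checkpoint):
        # No-op: a directory-based checkpoint needs no framework-specific encoding.
        return checkpoint
```

Gating on a flag lets both code paths coexist while the legacy trainers are still supported.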