Overview
CheckpointIO takes care of the Booster.save and Booster.load logic to allow for model saving/resuming/loading. Note that CheckpointIO is often used together with the Plugin, as a Plugin may require a specific saving/loading strategy. However, we should provide general strategies for a normal PyTorch model and a DTensor-based model. As DTensor is still under development, we should focus on the native PyTorch implementation first.

Want to track the development progress? Take a look at:
proposal: #3046
project kanban: https://github.com/orgs/hpcaitech/projects/19
Goal
The CheckpointIO should allow the user to save/load the native PyTorch model/optimizer/lr scheduler.
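
To make the goal concrete, here is a rough sketch of what a general checkpoint interface could look like. It is not the proposed implementation: the names CheckpointIOSketch, GeneralCheckpointIOSketch, Stateful and ToyScheduler are hypothetical, and pickle is used only as a stand-in for torch.save / torch.load so the snippet runs without PyTorch installed. The sketch relies only on the state_dict() / load_state_dict() pair that native PyTorch modules, optimizers and lr schedulers all expose.

```python
import os
import pickle
import tempfile
from abc import ABC, abstractmethod
from typing import Any, Dict, Protocol


class Stateful(Protocol):
    """Anything exposing the state_dict protocol (model, optimizer, lr scheduler)."""

    def state_dict(self) -> Dict[str, Any]: ...

    def load_state_dict(self, state_dict: Dict[str, Any]) -> Any: ...


class CheckpointIOSketch(ABC):
    """Hypothetical interface; a Plugin could subclass it with its own strategy."""

    @abstractmethod
    def save(self, obj: Stateful, path: str) -> None: ...

    @abstractmethod
    def load(self, obj: Stateful, path: str) -> None: ...


class GeneralCheckpointIOSketch(CheckpointIOSketch):
    """General strategy: serialize the state dict to a single file."""

    def save(self, obj: Stateful, path: str) -> None:
        # Assumption: pickle stands in for torch.save here.
        with open(path, "wb") as f:
            pickle.dump(obj.state_dict(), f)

    def load(self, obj: Stateful, path: str) -> None:
        # Assumption: pickle stands in for torch.load here.
        with open(path, "rb") as f:
            obj.load_state_dict(pickle.load(f))


class ToyScheduler:
    """Toy stand-in for an lr scheduler, used only to exercise the sketch."""

    def __init__(self, last_epoch: int = 0, lr: float = 0.1) -> None:
        self.last_epoch = last_epoch
        self.lr = lr

    def state_dict(self) -> Dict[str, Any]:
        return {"last_epoch": self.last_epoch, "lr": self.lr}

    def load_state_dict(self, state_dict: Dict[str, Any]) -> None:
        self.last_epoch = state_dict["last_epoch"]
        self.lr = state_dict["lr"]


def roundtrip_demo() -> Dict[str, Any]:
    """Save one scheduler's state and restore it into a fresh instance."""
    ckpt_io = GeneralCheckpointIOSketch()
    source = ToyScheduler(last_epoch=5, lr=0.01)
    target = ToyScheduler()
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "lr_scheduler.ckpt")
        ckpt_io.save(source, path)
        ckpt_io.load(target, path)
    return target.state_dict()
```

Because the sketch depends only on the state dict protocol, the same save/load pair would cover the model, the optimizer and the lr scheduler; a Plugin needing a different strategy (for example sharded files) could override the two methods.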