When fine-tuning with `StateDictType.FULL_STATE_DICT`, the program crashes when saving a checkpoint. The error is caused here:
https://github.com/facebookresearch/llama-recipes/blob/74bde65a62667a38ee0411676cf058c53f85771c/model_checkpointing/checkpoint_handler.py#L145

I know this can be easily solved by assigning `cfg.checkpoint_folder` some value, but I'm curious why another config was added instead of using `cfg.dist_checkpoint_root_folder` and `cfg.dist_checkpoint_folder`.

Besides, using two `dist_` configs is also strange. Isn't one such config enough?
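
To make the question concrete, here is a minimal sketch of the fallback I have in mind. It assumes the crash comes from `cfg.checkpoint_folder` being unset when the full state dict is saved. `FsdpConfig`, `resolve_save_dir`, and the default values are hypothetical stand-ins for illustration, not the repository's actual code; only the three field names come from the config discussed above.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class FsdpConfig:
    # Hypothetical stand-in for the recipe's config; the defaults are made up.
    checkpoint_folder: Optional[str] = None
    dist_checkpoint_root_folder: str = "model_checkpoints"
    dist_checkpoint_folder: str = "fine-tuned"


def resolve_save_dir(cfg: FsdpConfig, base: Path) -> Path:
    """Choose a directory for a FULL_STATE_DICT checkpoint.

    Uses cfg.checkpoint_folder when it is set; otherwise falls back to the
    existing dist_ folders instead of failing on a missing value.
    """
    if cfg.checkpoint_folder:
        return base / cfg.checkpoint_folder
    return base / cfg.dist_checkpoint_root_folder / cfg.dist_checkpoint_folder


if __name__ == "__main__":
    # With no checkpoint_folder set, the dist_ folders are reused.
    print(resolve_save_dir(FsdpConfig(), Path.cwd()))
    # The current workaround: give checkpoint_folder an explicit value.
    print(resolve_save_dir(FsdpConfig(checkpoint_folder="full_ckpt"), Path.cwd()))
```

With a fallback like this, setting `cfg.checkpoint_folder` would become optional rather than required.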