-
Notifications
You must be signed in to change notification settings - Fork 264
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[fix] better handling non-flatten in FSDP (#1072)
* [fix] better handling non-flatten in FSDP - see the detailed comment about that backward firing case - also minor debugging help in FSDP - also minor fix in FPW's state dict * [feat] disallow reset_parameters by default * [feat] adding fsdp_instances API - useful in check wrapping by user code * [fix] one line fix but more than a day of debugging * fixed the case of loading combined check with empty fsdp instances * fixed another bug around state loading the root/nonroot module full param caching due to not resharding after forward * [feat] support .half and .float better * fixed a bug in gather optim state losses extra keys from the original state_dict * fixed a test failure in mixed precision * fixed another bug affecting no_sync grad acc * fixed a bug and a test in fsdp optim state * fixed another corner case * added a comment * skip ssd offload tests * skip fsdp one for ssd overload Co-authored-by: Min Xu <min.xu.public@gmail.com>
- Loading branch information
Showing
13 changed files
with
332 additions
and
134 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -12,6 +12,7 @@ | |
OffloadConfig, | ||
TrainingState, | ||
auto_wrap_bn, | ||
get_fsdp_instances, | ||
no_pre_load_state_dict_hook, | ||
) | ||
|
||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.