Stack from ghstack:
* #88432 [FSDP] Default to `limit_all_gathers=True`
* #88431 [FSDP][Docs] Reword `sharding_strategy` docs and other minor doc changes
* #88429 [FSDP][Docs] Simplify `mixed_precision` ctor docs
* #88428 [FSDP] Default to `BACKWARD_PRE`
* #88260 [FSDP()][Easy] Make `fully_shard()` only `FULL_SHARD`
* #88235 [FSDP()] Have `fully_shard()` abide by `@contract`!
* #88234 [FSDP()][Easy] Rename `_State` to `_FSDPState`
* #88233 [FSDP()] Rename to `fully_shard()` and move to `_composable/`
* #88232 [FSDP][Easy] Remove unneeded `TrainingState` transition
* #88123 [FSDP] Rename `unflat_param_name` -> `fqn` for consistency
* #88122 [FSDP] Simplify `_get_buffer_names()`
* #88121 [FSDP] Remove unneeded `torch.no_grad()` context when offloading to CPU
* #87941 [FSDP()][26/N] Move `_lazy_init()` into `_fsdp_root_pre_forward()`
* #87940 [FSDP()][25/N] Add `_post_forward_reshard()`
* #87939 [FSDP()][24/N] Refactor `_lazy_init()`
* #87935 [FSDP()][21/N] Refactor and fix `_cast_buffers()`
* #87934 [FSDP] Rename `dtype` to `buffer_name_to_dtype`
* #87933 [FSDP] Remove `device` arg from `_cast_buffers()`
* #87931 [FSDP()][18/N] Refactor `pre_forward_unshard()`
* #87930 [FSDP()][17/N] Refactor `_fsdp_root_pre_forward()`
* #87928 [FSDP()][15/N] Refactor `_init_streams()`