[FSDP][2/N] Remove params_with_grad (pytorch#87480)
This PR removes the property `params_with_grad` from `FullyShardedDataParallel`. It was introduced when implementing `clip_grad_norm_()` but was not consistently used. Personally, I do not think it makes sense for `FullyShardedDataParallel` to expose this helper because it is not a common paradigm.

This PR is technically BC-breaking. However, I checked that no one internally is using this API.
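Any caller that relied on the removed property can recover the same list with a one-line comprehension over `Module.parameters()`. A minimal sketch of the replacement, assuming an FSDP-wrapped module bound to an illustrative variable `model`:

```python
# Equivalent to the removed `FullyShardedDataParallel.params_with_grad`
# property: collect every parameter whose gradient is currently populated.
# `model` is a hypothetical FSDP-wrapped nn.Module.
params_with_grad = [p for p in model.parameters() if p.grad is not None]
```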

cc @ezyang @gchanan
Pull Request resolved: pytorch#87480
Approved by: https://github.com/rohan-varma
awgu authored and kulinseth committed Dec 9, 2022
1 parent d84cd1f commit 27dc00b
Showing 1 changed file with 0 additions and 9 deletions.
torch/distributed/fsdp/fully_sharded_data_parallel.py
```diff
@@ -52,8 +52,6 @@
     _sync_params_and_buffers,
     _to_kwargs,
 )
-from torch.nn.parameter import Parameter
-
 from ._optim_utils import (
     _broadcast_pos_dim_tensor_states,
     _broadcast_processed_optim_state_dict,
@@ -3913,13 +3911,6 @@ def no_sync(self) -> Generator:
                 )
                 m._sync_gradients = old_flag
 
-    @property
-    def params_with_grad(self) -> List[Parameter]:
-        """
-        Recursively returns a list of all module parameters that have a gradient.
-        """
-        return [p for p in self.parameters() if p.grad is not None]
-
     @torch.no_grad()
     def clip_grad_norm_(
         self, max_norm: Union[float, int], norm_type: Union[float, int] = 2.0
```
