[FSDP] Clarify CPU offload implicitly in reshard_doc (#98666)
Per title

Differential Revision: [D44812344](https://our.internmc.facebook.com/intern/diff/D44812344/)
Pull Request resolved: #98666
Approved by: https://github.com/awgu
rohan-varma authored and pytorchmergebot committed Apr 11, 2023
1 parent c00fd71 commit 85e1d74
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion torch/distributed/fsdp/flat_param.py
@@ -1487,7 +1487,9 @@ def reshard(self, free_unsharded_flat_param: bool):
         """
         Runs the reshard logic. This includes freeing the unsharded flat
         parameter if ``free_unsharded_flat_param`` and switching to using the
-        sharded flat parameter.
+        sharded flat parameter. Note that this also implicitly offloads
+        the sharded flat parameter (if CPU offload is enabled) by pointing
+        it to the ``_local_shard`` attribute which resides on CPU.
         """
         # Switch to the sharded `FlatParameter` before freeing to prevent
         # "use-after-free"-type bugs with external profiling tools, where for
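For context, a minimal sketch (not part of this commit) of how CPU offload is enabled on an FSDP-wrapped module, which is the configuration under which ``reshard`` ends up pointing the sharded flat parameter at its CPU-resident ``_local_shard``. The module shape and the process-group setup below are illustrative assumptions.

import torch
import torch.nn as nn
from torch.distributed.fsdp import CPUOffload
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# Assumes a default process group has already been initialized, e.g.
# torch.distributed.init_process_group("nccl") under torchrun.
model = nn.Linear(1024, 1024).cuda()

# With offload_params=True, each rank's sharded flat parameter lives on CPU.
# Forward/backward all-gather the unsharded flat parameter onto GPU, and
# reshard() then frees it and switches back to the CPU-resident shard.
fsdp_model = FSDP(model, cpu_offload=CPUOffload(offload_params=True))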
